Device Timeline View — Design Spec

Date: 2026-05-22
Status: Draft (awaiting user review)
Authors: Brainstormed with user; recorded by Claude Code

Motivation

Operators and admins currently view metadata one session at a time (MetadataSessionDetailPage). When diagnosing a device they need to flip through every session manually to see how a sensor behaved over hours or days. There is no place to ask the question "show me this device's temperature, accel, and battery_voltage across the last 24 hours regardless of session boundary."

This spec adds a Grafana-style multi-panel timeline that joins all sessions for a single device into one continuous wall-clock view. Gaps in data (between sessions, or when the device was offline mid-session) render as line breaks, never as interpolated straight lines across missing data, and session boundaries are visually marked.

Non-goals

The following are explicitly out of scope for this spec. Each may become its own future spec.

  • Cross-device comparison (more than one device on the same chart).
  • Anomaly detection / alert wiring on timeline data.
  • Export (CSV / JSON / WAV) — the existing per-session export at /api/v1/metadata/sessions/{id}/export already covers the underlying need; a timeline-scoped export can be added later.
  • Operator app (frontend-app/). This spec implements the admin app (frontend/) only. Phase 2 reuses the same backend endpoint to surface a stripped-down version in the operator app under DeviceDetailPage. A separate spec captures phase 2.

High-level architecture

  • Route (admin): GET /devices/:id/timeline → new frontend/src/pages/DeviceTimelinePage.tsx.
  • Entry point: DevicesPage row gains a small "Timeline →" icon link in the per-row action cluster, next to the existing edit button. Row click behavior is unchanged.
  • URL state: ?from=<ISO>&to=<ISO>&streams=<comma-list>. No parameters → defaults to last 24 h plus every numeric/array stream.
  • API endpoint: GET /api/v1/devices/{device_id}/timeseries.
  • New backend module: crates/xylolabs-server/src/routes/device_timeline.rs (kept separate from devices.rs, which is already large).
  • RBAC: reuses the metadata_query.rs pattern — require_role(User) plus require_facility_access(device.facility_id). Super-admins see any device; facility-scoped users see only their facility's devices.
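The URL-state defaults above could be parsed along the following lines. This is a minimal sketch, not the page's actual implementation: parseTimelineUrlState and TimelineUrlState are hypothetical names, and treating a missing from as "24 h before to" is an assumption. It is written as a plain function so that a useSearchParams adapter could wrap it.

```typescript
// Hypothetical parsed shape of ?from=<ISO>&to=<ISO>&streams=<comma-list>.
interface TimelineUrlState {
  from: Date;
  to: Date;
  streams: string[] | null; // null means "every numeric/array stream"
}

const DAY_MS = 24 * 60 * 60 * 1000;

function parseTimelineUrlState(search: string, now: Date = new Date()): TimelineUrlState {
  const params = new URLSearchParams(search);

  // Returns null for a missing or unparseable ISO timestamp.
  const parseIso = (raw: string | null): Date | null => {
    if (!raw) return null;
    const d = new Date(raw);
    return Number.isNaN(d.getTime()) ? null : d;
  };

  const to = parseIso(params.get("to")) ?? now;
  let from = parseIso(params.get("from")) ?? new Date(to.getTime() - DAY_MS);

  // Assumption: an inverted or empty range falls back to the 24 h window ending at `to`.
  if (to.getTime() <= from.getTime()) {
    from = new Date(to.getTime() - DAY_MS);
  }

  // Split the comma list, trim entries, and drop blanks and duplicates.
  const rawStreams = params.get("streams");
  const names = rawStreams
    ? Array.from(new Set(rawStreams.split(",").map((s) => s.trim()).filter((s) => s.length > 0)))
    : [];

  return { from, to, streams: names.length > 0 ? names : null };
}
```

With no parameters the function yields the last-24-hours window and streams: null, which the page can interpret as "every numeric/array stream".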

API contract

Request

GET /api/v1/devices/{device_id}/timeseries
  ?start_us=<i64>           # default: now − 24 h
  &end_us=<i64>             # default: now
  &streams=<comma list>     # optional; default: every non-bytes stream
  &downsample=<i32>         # default: 1000 points per stream (LTTB target)
  • All timestamps are microseconds since UNIX epoch, UTC wall-clock.
  • The endpoint accepts but ignores duplicate stream names.
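As an illustration of the request contract, a client could assemble the query string roughly as below. This is a sketch under assumptions: buildTimeseriesPath is a hypothetical helper, and the camelCase field names on TimelineQueryParams are guesses, since the spec only names the type. The defaults are filled in explicitly on the client here, although the server applies the same ones when a parameter is omitted.

```typescript
// Assumed field names; the spec names the type but not its members.
interface TimelineQueryParams {
  startUs?: number;
  endUs?: number;
  streams?: string[];
  downsample?: number;
}

const US_PER_MS = 1_000;
const DAY_US = 24 * 60 * 60 * 1_000_000;

function buildTimeseriesPath(
  deviceId: string,
  params: TimelineQueryParams = {},
  nowMs: number = Date.now(),
): string {
  // Timestamps are microseconds since the UNIX epoch; they stay below
  // Number.MAX_SAFE_INTEGER for any realistic date.
  const nowUs = nowMs * US_PER_MS;
  const endUs = params.endUs ?? nowUs;
  const startUs = params.startUs ?? nowUs - DAY_US;

  const query = new URLSearchParams();
  query.set("start_us", String(startUs));
  query.set("end_us", String(endUs));
  if (params.streams && params.streams.length > 0) {
    // The endpoint tolerates duplicates, but there is no reason to send them.
    query.set("streams", Array.from(new Set(params.streams)).join(","));
  }
  query.set("downsample", String(params.downsample ?? 1000));

  return `/api/v1/devices/${encodeURIComponent(deviceId)}/timeseries?${query.toString()}`;
}
```

Note that URLSearchParams percent-encodes the commas in streams; a server that parses the query string with a standard decoder sees the original comma list.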

Response

{
  "device_id": "uuid",
  "device_label": "Living-room-01",       // alias if present, else name
  "facility_id": "uuid",
  "start_us": 1716250000000000,
  "end_us":   1716336400000000,
  "session_count": 12,                    // sessions that contributed
                                          // ≥1 point to a returned
                                          // stream (after the streams=
                                          // filter is applied). A
                                          // session that overlaps the
                                          // range but has no matching
                                          // stream is NOT counted here.
  "session_boundaries": [
    {
      "session_id": "uuid",
      "start_us": 1716250500000000,
      "end_us": 1716252300000000,
      "status": "closed"                  // "active" or "closed"
    }
  ],
  "streams": [
    {
      "name": "temperature",
      "value_type": "f32",
      "unit": "°C",
      "sample_rate_hz": 0.00166,          // device-reported; may be inaccurate
      "points": [
        { "t_us": 1716250500000000, "v": 22.5, "s": "<session-id>" }
      ]
    },
    {
      "name": "audio_left",
      "value_type": "bytes",
      "unit": null,
      "sample_rate_hz": null,
      "points": []                        // bytes streams have no scalar
                                          // points; events live in
                                          // recording_events below
    }
  ],
  "recording_events": [
    {
      "session_id": "uuid",
      "stream_name": "audio_left",
      "start_us": 1716250500000000,
      "end_us": 1716252300000000,
      "sample_count": 12
    }
  ]
}
  • points[].s is the session UUID. The frontend uses it to (a) attribute hover tooltips to a specific session and (b) detect session-boundary gaps without relying on session_boundaries.
  • recording_events only contains entries for bytes-typed streams. The frontend renders these as dots on the dedicated "Recording events" track instead of attempting a stitched waveform.
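The response shape above might be typed on the frontend roughly as follows. The interface names other than DeviceTimeseriesResponse are illustrative, and typing v as number | number[] is an assumption made to cover array streams. The small helper shows use (b) of points[].s: finding the indices where the session changes without consulting session_boundaries.

```typescript
type SessionStatus = "active" | "closed";

interface SessionBoundary {
  session_id: string;
  start_us: number;
  end_us: number;
  status: SessionStatus;
}

interface TimelinePoint {
  t_us: number;          // microseconds since UNIX epoch
  v: number | number[];  // assumption: array streams carry number[]
  s: string;             // session UUID
}

interface TimelineStream {
  name: string;
  value_type: string;
  unit: string | null;
  sample_rate_hz: number | null;
  points: TimelinePoint[];
}

interface RecordingEvent {
  session_id: string;
  stream_name: string;
  start_us: number;
  end_us: number;
  sample_count: number;
}

interface DeviceTimeseriesResponse {
  device_id: string;
  device_label: string;
  facility_id: string;
  start_us: number;
  end_us: number;
  session_count: number;
  session_boundaries: SessionBoundary[];
  streams: TimelineStream[];
  recording_events: RecordingEvent[];
}

// Indices i where points[i] belongs to a different session than points[i - 1].
function sessionChangeIndices(points: TimelinePoint[]): number[] {
  const indices: number[] = [];
  for (let i = 1; i < points.length; i++) {
    if (points[i].s !== points[i - 1].s) indices.push(i);
  }
  return indices;
}
```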

Backend chunk fetch strategy

  1. Load the device row; verify facility access.
  2. Query sessions for the device overlapping [start_us, end_us], capped at MAX_TIMELINE_SESSIONS = 200. Over-cap → BadRequest "timeline window matched more than 200 sessions; narrow the time range".
  3. Load all streams for those sessions in a single batched query (metadata_stream::list_by_sessions). Group by stream name.
  4. Filter by the streams query parameter if supplied.
  5. For each stream-group, query chunks in the time range with the paginated repo helper, accumulating across sessions and capped at MAX_TIMELINE_CHUNKS = 500 total. Over-cap → BadRequest "timeline query matched more than 500 chunks; narrow the time range".
  6. Download + decode chunks with bounded concurrency (MAX_CONCURRENT_CHUNK_DOWNLOADS = 8, matching existing handlers). Failed chunks log a warning and are skipped.
  7. Per stream-group: filter samples to range, sort by timestamp, dedupe exact (timestamp_us, session_id) collisions.
  8. Per stream-group: cap accumulated samples at MAX_TIMELINE_SAMPLES = 2_000_000. Over-cap → BadRequest "timeline samples exceed 2,000,000; narrow the time range or request stronger downsampling".
  9. Run LTTB downsampling on each numeric/array stream, reducing it to the downsample target point count. Bytes streams skip downsampling (they only emit recording_events).
  10. Apply device-clock anchor normalization on the server (a move from the existing frontend-only behavior in StreamChart). For every point, shift the timestamp so the first sample of its session aligns with that session's started_at. The previous behavior in StreamChart (cap the tail at min(now, sessionStart + 1 h) to neutralize device RTC drift) is also performed here per session, so the joined timeline never lands samples in the future. Centralizing this in the backend means a future DeviceTimelineChart cannot drift apart from StreamChart on the same data.
  11. Convert each stream's samples into the JSON shape above.
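Step 9 relies on LTTB (Largest-Triangle-Three-Buckets). The sketch below shows the standard algorithm for scalar samples. It is written in TypeScript purely for illustration, since the real handler lives in the Rust crate, and lttbDownsample and Sample are hypothetical names. Clamping the target to a minimum of 3 points is an assumption made here to keep the bucket arithmetic well defined.

```typescript
interface Sample {
  t_us: number;
  v: number;
}

// Reduce `data` (sorted by t_us) to about `target` points while keeping visual shape.
function lttbDownsample(data: Sample[], target: number): Sample[] {
  const n = data.length;
  const threshold = Math.max(3, Math.floor(target)); // assumption: at least 3 points
  if (threshold >= n) return data.slice();

  const sampled: Sample[] = [data[0]]; // the first point is always kept
  const bucketSize = (n - 2) / (threshold - 2);
  let a = 0; // index of the most recently selected point

  for (let i = 0; i < threshold - 2; i++) {
    // The average of the NEXT bucket serves as the third triangle vertex.
    const nextStart = Math.floor((i + 1) * bucketSize) + 1;
    const nextEnd = Math.min(Math.floor((i + 2) * bucketSize) + 1, n);
    let avgT = 0;
    let avgV = 0;
    for (let j = nextStart; j < nextEnd; j++) {
      avgT += data[j].t_us;
      avgV += data[j].v;
    }
    const count = nextEnd - nextStart;
    avgT /= count;
    avgV /= count;

    // Pick the point in the CURRENT bucket that forms the largest triangle
    // with the previously selected point and the next-bucket average.
    const start = Math.floor(i * bucketSize) + 1;
    const end = Math.floor((i + 1) * bucketSize) + 1;
    let maxArea = -1;
    let chosen = start;
    for (let j = start; j < end; j++) {
      const area =
        Math.abs(
          (data[a].t_us - avgT) * (data[j].v - data[a].v) -
            (data[a].t_us - data[j].t_us) * (avgV - data[a].v),
        ) / 2;
      if (area > maxArea) {
        maxArea = area;
        chosen = j;
      }
    }
    sampled.push(data[chosen]);
    a = chosen;
  }

  sampled.push(data[n - 1]); // the last point is always kept
  return sampled;
}
```

Because each bucket contributes the point with the largest triangle area, isolated spikes tend to survive downsampling, which matters when the timeline is used for diagnosis.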

Error mapping

Every new error literal is added to resolveErrorKey in both frontend/src/lib/errors.ts and frontend-app/src/lib/errors.ts, plus EN/KO i18n entries:

Backend literal                                   i18n key
timeline window matched more than \d+ sessions    errors.tooManyTimelineSessions
timeline query matched more than \d+ chunks       errors.tooManyTimelineChunks
timeline samples exceed \d+                       errors.tooManyTimelineSamples

Other failures (device not found, forbidden) reuse existing 404 / 403 handling and surface verbatim through the new raw-message fallback in friendlyErrorMessage (cycle 5 change).
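The literal-to-key mapping could be expressed as a small pattern table, sketched below. The existing resolveErrorKey signature is not shown in this spec, so resolveTimelineErrorKey is a hypothetical stand-in that returns null when nothing matches, leaving the caller free to fall back to the raw message.

```typescript
// Patterns copied from the mapping above; keys are the i18n entries to add.
const TIMELINE_ERROR_PATTERNS: ReadonlyArray<readonly [RegExp, string]> = [
  [/timeline window matched more than \d+ sessions/, "errors.tooManyTimelineSessions"],
  [/timeline query matched more than \d+ chunks/, "errors.tooManyTimelineChunks"],
  [/timeline samples exceed \d+/, "errors.tooManyTimelineSamples"],
];

// Hypothetical helper: returns the i18n key for a backend message, or null.
function resolveTimelineErrorKey(message: string): string | null {
  for (const [pattern, key] of TIMELINE_ERROR_PATTERNS) {
    if (pattern.test(message)) return key;
  }
  return null;
}
```

Matching on \d+ rather than the literal cap value means the frontend mapping keeps working if a cap constant is later tuned on the backend.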

Frontend components

DeviceTimelinePage.tsx

  • Loads getDevice(id) plus the timeline endpoint with react-query. Loading skeleton on first fetch; in-place data swap on subsequent live polls.
  • Time range chip row: 1h, 6h, 24h (default), 7d, 30d, Custom. The custom picker is a lightweight two-input date-time modal (no new heavy date-picker dependency).
  • Stream selector dropdown in the page header; checkbox per stream name; "All" / "None" toggle. URL streams= reflects the selection.
  • Live pill in the header. While the pill is green the page polls every 15 s with refetchInterval and staleTime: 10s. A click on the pill (or a hover anywhere on a chart panel) sets refetchInterval to false until released; the pill then shows "Paused". A <button> next to the pill performs a manual refresh.
  • The polling-mode fetch sends start_us = lastSeenMaxTs and merges the response into the existing cache (append + dedup on (t_us, session_id)). The first fetch sends the full start_us/end_us window.
  • Each stream gets its own panel rendered with the existing LineChart (Recharts) component pattern, sharing a syncId so hovering one panel highlights the same wall-clock instant on every other panel.
  • Bytes streams render as a single 24 px row at the top of the panel stack — colored dots positioned at each recording's start_us. Dot radius scales with sample_count. Click → opens the underlying MetadataSessionDetailPage in a new tab.
  • Accelerometer streams (accel_x, accel_y, accel_z) reuse the existing AccelChart grouping pattern.
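The polling-mode merge described in the list above (append, then dedup on (t_us, session_id)) can be sketched as a pure function. mergePoints and nextPollStartUs are hypothetical names, and the sketch assumes scalar values and the points[].s field from the API response as the session id.

```typescript
interface TimelinePoint {
  t_us: number;
  v: number;
  s: string; // session UUID
}

// Append `delta` to `existing`, dropping exact (t_us, session) duplicates.
function mergePoints(existing: TimelinePoint[], delta: TimelinePoint[]): TimelinePoint[] {
  const seen = new Set<string>();
  const merged: TimelinePoint[] = [];
  for (const p of [...existing, ...delta]) {
    const key = `${p.s}:${p.t_us}`;
    if (seen.has(key)) continue; // the first occurrence wins
    seen.add(key);
    merged.push(p);
  }
  // Keep the cache sorted so gap detection can walk it linearly.
  merged.sort((x, y) => x.t_us - y.t_us);
  return merged;
}

// start_us for the next delta fetch: the newest timestamp seen so far,
// or the original window start when the cache is still empty.
function nextPollStartUs(points: TimelinePoint[], windowStartUs: number): number {
  return points.reduce((max, p) => Math.max(max, p.t_us), windowStartUs);
}
```

Since the next poll starts at lastSeenMaxTs inclusive, the newest cached point is normally returned again; the dedup step is what keeps it from appearing twice.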

lib/timeline/ (new helper directory)

  • gap-detection.ts: pure, unit-testable function that walks a sorted point array and inserts { t_us, v: null, s: null } breakpoints whenever (a) the gap to the previous point exceeds 3 × (1_000_000 / sample_rate_hz) µs, with a 30 s floor when sample_rate_hz is null or below 1 / 30 Hz, or (b) the session id differs from the previous point's session id. Returns a new array and leaves the input untouched.
  • session-boundary.ts: pure function that converts session_boundaries into a list of ReferenceLine x={t_us} props for the Recharts overlay (one faint slate-300 vertical line per session start).
  • url-state.ts: useSearchParams adapter that parses and serializes from/to/streams with sensible defaults and clamps invalid values (e.g. to < from).
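The gap-detection rule could look roughly like the sketch below. Two details are assumptions rather than decisions recorded in this spec: the 30 s floor is read as "use 30 s when the rate is null, and never go below 30 s when the rate is under 1/30 Hz", and each breakpoint is placed at the midpoint between the two surrounding points. insertGapBreaks and gapThresholdUs are hypothetical names.

```typescript
interface TimelinePoint {
  t_us: number;
  v: number | null;
  s: string | null;
}

const FLOOR_US = 30_000_000; // 30 s in microseconds

// Gap threshold in microseconds for a given device-reported sample rate.
function gapThresholdUs(sampleRateHz: number | null): number {
  if (sampleRateHz === null || !(sampleRateHz > 0)) return FLOOR_US;
  const threshold = 3 * (1_000_000 / sampleRateHz);
  // Assumed reading of the "30 s floor": applies only to rates below 1/30 Hz.
  return sampleRateHz < 1 / 30 ? Math.max(threshold, FLOOR_US) : threshold;
}

// Returns a new array with null-valued breakpoints inserted at gaps
// and at session changes; the input array is not modified.
function insertGapBreaks(points: TimelinePoint[], sampleRateHz: number | null): TimelinePoint[] {
  const thresholdUs = gapThresholdUs(sampleRateHz);
  const out: TimelinePoint[] = [];
  for (let i = 0; i < points.length; i++) {
    const cur = points[i];
    if (i > 0) {
      const prev = points[i - 1];
      const timeGap = cur.t_us - prev.t_us > thresholdUs;
      const sessionChange = cur.s !== prev.s;
      if (timeGap || sessionChange) {
        // Assumption: place the breakpoint midway between the two points.
        out.push({ t_us: Math.floor((prev.t_us + cur.t_us) / 2), v: null, s: null });
      }
    }
    out.push(cur);
  }
  return out;
}
```

A Recharts Line leaves a visible break at null values unless connectNulls is set, so feeding it the returned array should produce the line breaks the spec asks for.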

API client

  • frontend/src/api/devices.ts gains getDeviceTimeseries(deviceId: string, params: TimelineQueryParams): Promise<DeviceTimeseriesResponse>.
  • The response type lives next to the function. Both EN and KO paths preserve the raw error text via the cycle-5 friendlyErrorMessage rewrite already shipped.

Performance budget

  • Default request: a 24 h window with 5 streams at ~50 chunks each ≈ 250 chunks, under the 500 cap. Each chunk averages ~50–250 kB compressed. The response payload after LTTB downsampling (1000 pts × 5 streams = 5000 points, roughly 0.4 MB of JSON at ~80 bytes per point, since every point carries its session UUID) is well under the existing 100 MB MAX_AUDIO_DATA_BYTES bound used by sister handlers.
  • Recharts safely renders 1000 SVG points per panel × 5 panels = 5000 nodes. The existing MetadataSessionDetailPage already pushes similar counts.
  • Live polling delta-fetch limits the steady-state request size to whatever arrived in the last 15 s — typically a few dozen samples.

Testing

Backend

  • tests/api_device_timeline.rs (new integration suite). Fixtures build a device with two sessions, each containing temperature, humidity, and audio_left streams; chunks are uploaded to the test S3. Tests:
      • Default range returns all numeric streams plus a non-empty recording_events array for the bytes stream.
      • streams= filter limits the response.
      • Time range exclusion (samples outside the window are dropped).
      • Each cap (MAX_TIMELINE_SESSIONS, MAX_TIMELINE_CHUNKS, MAX_TIMELINE_SAMPLES) returns the expected literal so the frontend i18n mapping matches.
      • Facility access enforcement (user from a different facility → 403).
  • Source-grep tripwire: handler uses list_by_stream_in_range_paginated, defines all three caps as constants, and uses bounded MAX_CONCURRENT_CHUNK_DOWNLOADS = 8.

Frontend

  • DeviceTimelinePage.test.tsx with mocked API responses:
      • Gaps between two clusters render as breaks in the Recharts <Line> (no straight line across).
      • Session boundaries appear as ReferenceLine overlays at the correct x values.
      • Bytes streams render as dots in the events track, not as a Recharts panel.
      • Stream picker toggles update the URL streams= parameter and cause refetch.
      • Live pill click toggles polling.
      • Hover on a chart panel pauses polling.
  • Unit tests for lib/timeline/gap-detection.ts covering: empty input, single point, points spanning two sessions, sample-rate-based gaps above and below the threshold, and the 30 s floor when sample_rate_hz is null.

End-to-end (manual + Playwright)

  • Mobile (375 × 812), tablet (768 × 1024), desktop (1280 × 800) screenshots. Assertions: no pageerror console events on any viewport; Live pill visible; time range chips reachable.
  • Hard-refresh after deploy to confirm new bundle is served.

Deploy verification

  • cargo check + cargo clippy + cargo test -p xylolabs-server.
  • pnpm test and npx tsc -b --noEmit in both frontend/ and frontend-app/ (frontend-app build must stay green even though it is not yet wired to the new endpoint).
  • npx vite build in both apps.
  • bash scripts/deploy.sh to api.xylolabs.com; confirm new JS bundle hash and Up (healthy) container.
  • Spot-check admin.api.xylolabs.com/devices/<id>/timeline for a real device. Verify that a deliberately wide time range hits a cap error and renders the localized message.

Phase 2 placeholder (out of scope for this spec)

  • Operator app (frontend-app/) consumes the same backend endpoint with two changes:
      • Tighter facility scoping (RBAC already enforces this server-side).
      • Live polling cadence relaxed to 30 s on mobile to save battery.
  • Tracked separately; this spec does not change frontend-app/ files other than mirroring the new i18n keys and resolveErrorKey patterns that the cycle-5 fix established.