Xylolabs Knowledge Base

Indexed reference for the Xylolabs IoT audio and sensor monitoring platform. XAP and XMBP are patent-pending proprietary technologies of Xylolabs Inc.


Core Protocols (Patent Pending)

These two protocols form the foundation of the Xylolabs data pipeline. All device firmware, SDK code, and server ingest logic are built around them.

XAP — Xylolabs Audio Protocol

XAP is Xylolabs' proprietary MDCT-based spectral audio codec for real-time multi-channel audio compression on resource-constrained IoT and industrial monitoring hardware. Codec ID 0x03 in XMBP.

Property | Value
Transform | MDCT
Sample rates | 8, 16, 24, 32, 48, 96 kHz
Channels | 1–4
Frame durations | 7.5 ms, 10 ms
Compression ratio | 8:1–10:1
Bitrate range | 16–320 kbps per channel
CPU requirement | ~10 MIPS/channel (with DSP)
RAM per channel | ~8 KB encoder state

XMBP — Xylolabs Metadata Binary Protocol

XMBP is the compact binary framing protocol for IoT sensor and motor telemetry. It is the on-wire format between all Xylolabs-SDK-equipped devices and the ingest server. Magic bytes: 0x58 0x4D 0x42 0x50 ("XMBP").

Property | Value
Byte order | Big-endian (network order)
Timestamps | u64 microseconds
Min batch size | 10 bytes (no device ID, no streams)
Allocation | Zero (writes directly into caller-supplied buffer)
Storage format | XMCH (on-server)
  • XMBP Specification — Wire format, batch envelope, stream block layout, sample layout, value type registry, audio codec identifiers, encoding/decoding API, feature flags, transport, XMCH storage format, wire format examples
  • XMBP Specification (한국어)
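
A minimal sketch of three properties from the table above: the magic bytes, big-endian byte order, and allocation-free writes into a caller-supplied buffer. The function names and the placement of a timestamp directly after the magic are illustrative assumptions; the real batch envelope is defined in the XMBP Specification.

```rust
/// Magic bytes quoted above: ASCII "XMBP".
const XMBP_MAGIC: [u8; 4] = [0x58, 0x4D, 0x42, 0x50];

/// Returns true when `buf` starts with the XMBP magic bytes.
fn has_xmbp_magic(buf: &[u8]) -> bool {
    buf.starts_with(&XMBP_MAGIC)
}

/// Writes the magic followed by a big-endian u64 microsecond timestamp into
/// `out` and returns the number of bytes written, or None if `out` is too
/// small. Hypothetical layout for illustration only; nothing is heap-allocated.
fn write_magic_and_timestamp(out: &mut [u8], timestamp_us: u64) -> Option<usize> {
    const LEN: usize = 4 + 8;
    if out.len() < LEN {
        return None;
    }
    out[..4].copy_from_slice(&XMBP_MAGIC);
    out[4..LEN].copy_from_slice(&timestamp_us.to_be_bytes());
    Some(LEN)
}

/// Reads a big-endian u64 from the first 8 bytes of `buf`.
fn read_u64_be(buf: &[u8]) -> Option<u64> {
    let mut bytes = [0u8; 8];
    bytes.copy_from_slice(buf.get(..8)?);
    Some(u64::from_be_bytes(bytes))
}
```

Returning `None` instead of panicking keeps the sketch usable on targets where the caller owns a fixed-size transmit buffer.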

Codec Analysis & Performance

Analysis and benchmark data for XAP and all evaluated competing codecs across MCU targets.

Architecture Diagrams

Visual references located in diagrams/:

Diagram | File
System architecture overview | diagrams/architecture.svg
Ingest pipeline | diagrams/ingest-pipeline.svg
Codec comparison chart | diagrams/codec-comparison.svg
SDK platform map | diagrams/sdk-platforms.svg
Feasibility option A | diagrams/feasibility-option-a.svg
Feasibility option B | diagrams/feasibility-option-b.svg
Feasibility option C | diagrams/feasibility-option-c.svg
Time sync diagram | diagrams/feasibility-time-sync.svg
Window alignment diagram | diagrams/feasibility-window-align.svg

Platform Guides

Per-MCU integration guides covering hardware setup, pin assignments, codec capability, and SDK wiring.

RP2350 / Pico 2 W (Primary Target)

The primary reference target. Dual Cortex-M33 @ 150 MHz, 520 KB SRAM, PIO-based I2S, CYW43 WiFi/BT, ARMv8-M DSP extensions. Requires external I2S ADC for 96 kHz/24-bit audio.

ESP32-S3 / ESP32-C3

Native WiFi — no external LTE modem required. ESP32-S3 supports 4ch XAP @ 96 kHz via 128-bit Xtensa SIMD. ESP32-C3 (RISC-V) is a sensor-only node.

STM32 (F103, F411, WB55, WBA55)

Cortex-M3/M4F/M33 targets. F411, WB55, and WBA55 support XAP (FPU + DSP). WB55 adds BLE with 4ch ADC at 48 kHz; WBA55 at 96 kHz. F103 is sensor-only (no FPU).

nRF52840 / nRF9160

Nordic Cortex-M4F / Cortex-M33 targets. nRF52840 transports via BLE 5.0; nRF9160 via LTE-M / NB-IoT.

RP2040 / Pico (ADPCM-only)

Dual Cortex-M0+ @ 133 MHz, 264 KB SRAM. No FPU or DSP — XAP is not feasible. ADPCM only. Cost-effective sensor node.


Hardware

Reference hardware documentation for the Xylolabs RP2350 full sensor node.

  • Hardware BOM — Complete bill of materials: MCU (RP2350), audio ADC (PCM1860QDBTRQ1), microphones (WM-61A), environment sensor (SEN0385/CHT832X), accelerometer (ADXL345), LTE modem (BG770A), passives, connectors, purchase links
  • Hardware BOM (한국어)

Deployment & Operations

Production is a single EC2 host (api.xylolabs.com) serving four subdomains behind nginx + Let's Encrypt. Deploys are triggered by scripts/deploy.sh, which builds the Docker image on the remote host, starts Docker Compose (app, postgres, minio), and reloads nginx.

Subdomain | Purpose | Backing
api.xylolabs.com | REST + WebSocket ingestion | Axum app container, port 3000
admin.api.xylolabs.com | Legacy admin dashboard (React SPA) | Axum app serves SPA + API
app.xylolabs.com | Operator dashboard (frontend-app) | Axum app serves static-app + API
docs.api.xylolabs.com | Static documentation bundle | Nginx-served static site generated from docs/

SQLx migrations (59 files in crates/xylolabs-db/migrations/) run on every app startup. Never reuse a migration version prefix, and never author a migration that depends on a column added by a later-prefixed migration — a misordered pair on 2026-04-18 (20260418000002 vs 20260418000003) caused a full prod crash-loop on 2026-04-24 that was resolved by renumbering. See the facility_id incident and full migration rules in Deployment Guide › Database Migrations.
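
The first rule above (never reuse a version prefix) lends itself to a pre-deploy check. The sketch below is an assumption-laden illustration, not part of scripts/deploy.sh: it takes a list of migration file names, treats everything before the first underscore as the version prefix, and reports prefixes that occur more than once. The file-name suffixes in the usage example are made up. It does not detect the second failure mode (a migration depending on a column added by a later-prefixed one).

```rust
use std::collections::HashSet;

/// Extracts the numeric version prefix from a migration file name,
/// e.g. "20260418000002" from "20260418000002_example.sql".
fn version_prefix(file_name: &str) -> Option<&str> {
    let prefix = file_name.split('_').next()?;
    if !prefix.is_empty() && prefix.bytes().all(|b| b.is_ascii_digit()) {
        Some(prefix)
    } else {
        None
    }
}

/// Returns each version prefix that appears more than once, in first-seen order.
fn find_duplicate_prefixes<'a>(file_names: &[&'a str]) -> Vec<&'a str> {
    let mut seen = HashSet::new();
    let mut duplicates = Vec::new();
    for &name in file_names {
        if let Some(prefix) = version_prefix(name) {
            // `insert` returns false when the prefix was already present.
            if !seen.insert(prefix) && !duplicates.contains(&prefix) {
                duplicates.push(prefix);
            }
        }
    }
    duplicates
}
```

For example, `find_duplicate_prefixes(&["20260418000002_a.sql", "20260418000002_b.sql"])` reports the shared prefix, so a CI step could fail before the image is built.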

  • Deployment Guide — environment variables, RBAC, migrations, nginx/certbot flow, deploy script walkthrough.

Firmware & OTA

  • Firmware releases are uploaded to S3 and tracked in the firmware_releases table
  • Deployments target specific devices via the firmware_deployments table with status tracking (pending → downloading → verified → applied)
  • Devices poll /api/v1/ota/check with their hardware_target to discover updates
  • Progress is reported back via /api/v1/ota/deployments/{id}/progress
  • Admin manages releases and deployments via the Firmware page in the dashboard
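
The deployment status flow above can be modeled as a small state machine. This is a sketch only: the enum and method names are hypothetical, and it assumes the flow is strictly linear with no failure or rollback states, since only the four states listed above are documented here.

```rust
/// Deployment states from the status-tracking flow described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DeploymentStatus {
    Pending,
    Downloading,
    Verified,
    Applied,
}

impl DeploymentStatus {
    /// Next state in the linear flow, or None once the update is applied.
    fn next(self) -> Option<DeploymentStatus> {
        match self {
            DeploymentStatus::Pending => Some(DeploymentStatus::Downloading),
            DeploymentStatus::Downloading => Some(DeploymentStatus::Verified),
            DeploymentStatus::Verified => Some(DeploymentStatus::Applied),
            DeploymentStatus::Applied => None,
        }
    }

    /// Assumption: only single forward steps are legal (no skipping, no rollback).
    fn can_transition_to(self, target: DeploymentStatus) -> bool {
        self.next() == Some(target)
    }

    /// Lowercase label matching the names used in the text above.
    fn as_str(self) -> &'static str {
        match self {
            DeploymentStatus::Pending => "pending",
            DeploymentStatus::Downloading => "downloading",
            DeploymentStatus::Verified => "verified",
            DeploymentStatus::Applied => "applied",
        }
    }
}
```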

Server Features

XAP Server-Side Decoder

The xylolabs-transcode crate includes a full XAP decoder (xap_decode.rs) that reconstructs PCM from XAP-encoded audio. The transcode pipeline automatically detects .xap uploads and decodes them to WAV before FFmpeg transcoding.

  • Inverse MDCT transform: x[n] = (2/N) * Σ X[k] * cos(π/N * (n+0.5+N/4) * (k+0.5))
  • Adaptive dequantization: coeff = quantized_i8 * step_size
  • Supports all XAP sample rates (8–96 kHz), 1–4 channels
  • Sample rate inferred from frame header frame_samples field
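
The two formulas above can be evaluated directly. The sketch below is not the code in xap_decode.rs: it is a naive O(N·K) evaluation of the inverse transform exactly as written, under the assumptions that the sum runs over every supplied coefficient and that N output samples are produced per call. Windowing and overlap-add between frames are omitted.

```rust
use std::f32::consts::PI;

/// Adaptive dequantization as stated above: coeff = quantized_i8 * step_size.
fn dequantize(quantized: &[i8], step_size: f32) -> Vec<f32> {
    quantized.iter().map(|&q| f32::from(q) * step_size).collect()
}

/// Naive evaluation of
/// x[n] = (2/N) * sum_k X[k] * cos(pi/N * (n + 0.5 + N/4) * (k + 0.5)).
/// Assumption: `n_len` is N and the sum covers all of `coeffs`.
fn imdct_naive(coeffs: &[f32], n_len: usize) -> Vec<f32> {
    let nf = n_len as f32;
    (0..n_len)
        .map(|i| {
            let sum: f32 = coeffs
                .iter()
                .enumerate()
                .map(|(k, &x)| {
                    x * (PI / nf * (i as f32 + 0.5 + nf / 4.0) * (k as f32 + 0.5)).cos()
                })
                .sum();
            2.0 / nf * sum
        })
        .collect()
}
```

A production decoder would normally replace the double loop with an FFT-based fast transform; the direct form is shown because it maps one-to-one onto the formula.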

Transcode Queue & Stale Job Reaper

Transcoding uses a Postgres-backed job queue with FOR UPDATE SKIP LOCKED for atomic claiming. The worker includes:

  • Bounded concurrency via Semaphore (configurable, 1–32)
  • Event-driven via pg_notify('transcode_queue') with 10s polling fallback
  • Stale job reaper: on startup and every 60s, requeues jobs stuck in 'running' past TRANSCODE_STALE_TIMEOUT_SECS (default 7200 = 2 hours)
  • Retry with backoff: failed jobs auto-requeue if under max_attempts
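
The reaper and retry rules above reduce to two small decisions. The sketch below covers only that decision logic with hypothetical names; the queue claim itself (FOR UPDATE SKIP LOCKED) and the backoff delay, whose values are not given here, are left out. It assumes "past" means strictly longer than the timeout.

```rust
use std::time::Duration;

/// Default from the text above: TRANSCODE_STALE_TIMEOUT_SECS = 7200 (2 hours).
const DEFAULT_STALE_TIMEOUT: Duration = Duration::from_secs(7200);

#[derive(Debug, PartialEq)]
enum JobAction {
    /// Put the job back in the queue.
    Requeue,
    /// Mark the job as permanently failed.
    Fail,
    /// Leave the job untouched.
    Leave,
}

/// Reaper decision for a job currently in the 'running' state.
fn reap_decision(running_for: Duration, stale_timeout: Duration) -> JobAction {
    if running_for > stale_timeout {
        JobAction::Requeue
    } else {
        JobAction::Leave
    }
}

/// Retry decision after a failed attempt: requeue while under max_attempts.
fn after_failure(attempts: u32, max_attempts: u32) -> JobAction {
    if attempts < max_attempts {
        JobAction::Requeue
    } else {
        JobAction::Fail
    }
}
```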

Live Audio Streaming

Coexists with the session-based batch ingest pipeline. Devices push continuous audio to wss://api.xylolabs.com/api/v1/live/streams/{stream_key}/ingest (API key + live:ingest scope); listeners subscribe at /api/v1/live/streams/{id}/listen.ws?format=lc3 (JWT or API key + live:listen scope). Browsers obtain a short-lived listener JWT via POST /api/v1/auth/live-token (5-minute expiry).

The pipeline lives in crates/xylolabs-server/src/live/manager.rs (LiveAudioManager):

  • Fan-out bus: per-stream tokio::sync::broadcast::Sender<Bytes> with 1024-slot capacity. Listeners that fall behind get dropped frames counted but no backpressure on the producer.
  • Stream-meta cache: mini_moka::sync::Cache with 5-minute TTL eliminates the per-flush DB roundtrip for the (facility_id, retention_days, codec) tuple.
  • Archive flush: every 60 s, raw LC3 frames are zstd-compressed inside tokio::task::spawn_blocking (off the runtime), uploaded to s3://.../live/{facility}/{stream}/lc3/{start}_{end}.bin.zst, and indexed in live_archive_segments. Two AtomicU64 counters (LIVE_ARCHIVE_FLUSH_SUCCESSES, LIVE_ARCHIVE_FLUSH_FAILURES) plus an AtomicI64 LIVE_ARCHIVE_FLUSH_LAST_US distinguish "silent" from "wedged".
  • Retention worker: every 30 min, prunes live_archive_segments rows older than retention_days (default 30) and deletes S3 objects with 8-way buffer_unordered parallelism. Counter LIVE_RETENTION_ROWS_PRUNED tracks total pruned rows.
  • Cleanup task: every 30 min, removes broadcast senders that have no subscribers and no recent producer activity.
  • Orphan reaper: on server startup, crates/xylolabs-server/src/main.rs closes all open live_stream_connections rows from a prior process (mirrors the ingest_sessions cleanup pattern). Prevents leaked connection records from skewing total_subscribers after a crash.

Validation at the REST/WS gate: channels 1–8, sample_rate_hz ∈ {8/16/24/32/48/96 kHz}, bitrate_per_channel_bps 16k–320k, display_name ≤ 200 B, channel_names length-must-match-channels and each ≤ 32 B, no control characters anywhere, transcode_profile ∈ {default, low, high}, retention_days 1–3650.
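
The validation rules above can be sketched as a single function. The struct, field types, and error strings are hypothetical; only the limits themselves come from the list above. Lengths are measured in bytes, and the control-character check is applied to the free-text fields.

```rust
#[derive(Debug, Clone, Copy)]
struct LiveStreamConfig<'a> {
    channels: u8,
    sample_rate_hz: u32,
    bitrate_per_channel_bps: u32,
    display_name: &'a str,
    channel_names: &'a [&'a str],
    transcode_profile: &'a str,
    retention_days: u32,
}

const SAMPLE_RATES_HZ: [u32; 6] = [8_000, 16_000, 24_000, 32_000, 48_000, 96_000];
const PROFILES: [&str; 3] = ["default", "low", "high"];

fn has_control_chars(s: &str) -> bool {
    s.chars().any(|c| c.is_control())
}

/// Applies the documented limits in order and returns the first violation.
fn validate(cfg: &LiveStreamConfig) -> Result<(), &'static str> {
    if cfg.channels < 1 || cfg.channels > 8 {
        return Err("channels must be 1-8");
    }
    if !SAMPLE_RATES_HZ.contains(&cfg.sample_rate_hz) {
        return Err("unsupported sample_rate_hz");
    }
    if cfg.bitrate_per_channel_bps < 16_000 || cfg.bitrate_per_channel_bps > 320_000 {
        return Err("bitrate_per_channel_bps must be 16k-320k");
    }
    if cfg.display_name.len() > 200 {
        return Err("display_name exceeds 200 bytes");
    }
    if cfg.channel_names.len() != cfg.channels as usize {
        return Err("channel_names length must match channels");
    }
    if cfg.channel_names.iter().any(|n| n.len() > 32) {
        return Err("channel name exceeds 32 bytes");
    }
    if has_control_chars(cfg.display_name)
        || cfg.channel_names.iter().any(|n| has_control_chars(n))
    {
        return Err("control characters are not allowed");
    }
    if !PROFILES.iter().any(|p| *p == cfg.transcode_profile) {
        return Err("unknown transcode_profile");
    }
    if cfg.retention_days < 1 || cfg.retention_days > 3650 {
        return Err("retention_days must be 1-3650");
    }
    Ok(())
}
```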

Audit-logged on every POST/PATCH/DELETE to live_streams (actor user or API key id, before/after state, IP/UA when available).

SuperAdmin observability via GET /api/v1/live/metrics (also rendered as a panel in the admin dashboard):

{
  "archive_flush_successes":    <u64>,
  "archive_flush_failures":     <u64>,
  "active_broadcast_streams":   <u64>,
  "total_subscribers":          <u64>,
  "last_archive_flush_at_us":   <i64>,  // 0 = no flush since boot
  "retention_rows_pruned_total":<u64>
}

Nginx redacts the listener JWT query param from access logs via a dedicated log_format live_scrubbed applied to /api/v1/live/streams/.../listen.ws (path-only; no $query_string). The ingest path uses the same format defensively even though it is API-key authenticated.

Database tables (migrations 20260522130000*): live_streams, live_stream_connections, live_archive_segments.

Full wire contract: API Documentation (English) — Section 25 | 한국어 — §25.

API Request Logging

A Tower middleware layer (api_request_log) captures HTTP request/response metadata for every API call and writes to the api_request_logs table asynchronously. Key features:

  • Privacy-safe by default: sensitive body fields (password, token, secret, api_key, refresh_token, access_token, authorization, cookie) are redacted recursively. Query string parameters matching sensitive keys are redacted. Authorization, Cookie, X-Api-Key, and Proxy-Authorization headers are stripped.
  • Body format tracking: classifies request/response bodies as json, text, base64, or omitted (multipart and oversized payloads). Stores content type, size, and a truncated preview.
  • Facility scoping: every log entry carries a facility_id for multi-tenant filtering. The GET /api/api-logs endpoint is SuperAdmin-only and supports filtering by method, path, status, user, facility, date range, and sort.
  • Health-check exclusion: /api/health and /api/health/ready are excluded from logging to avoid DB spam.
  • Backpressure: DB writes are bounded by a tokio::sync::Semaphore (100 concurrent) to prevent unbounded task accumulation under load.
  • Composite index: facility_id + created_at for fast filtered queries.
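
Query-string redaction, as described in the first bullet above, might look like the following sketch. Assumptions: the query-string key list is the same as the body-field list, matching is exact and case-insensitive, the placeholder text is "[REDACTED]", and no percent-decoding is performed. The actual api_request_log middleware may differ on all four points.

```rust
/// Sensitive keys listed above for body fields; assumed to apply to query strings too.
const SENSITIVE_KEYS: [&str; 8] = [
    "password",
    "token",
    "secret",
    "api_key",
    "refresh_token",
    "access_token",
    "authorization",
    "cookie",
];

fn is_sensitive(key: &str) -> bool {
    SENSITIVE_KEYS.iter().any(|k| key.eq_ignore_ascii_case(k))
}

/// Replaces the value of every sensitive `key=value` pair in a raw query string.
fn redact_query(query: &str) -> String {
    query
        .split('&')
        .map(|pair| {
            // Everything before the first '=' is the key; a bare key has no value.
            let key = pair.split('=').next().unwrap_or("");
            if is_sensitive(key) {
                format!("{}=[REDACTED]", key)
            } else {
                pair.to_string()
            }
        })
        .collect::<Vec<_>>()
        .join("&")
}
```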

Ingest Session Modes

Sessions operate in one of two modes, configured at creation time:

Mode | Timeout | Use Case
continuous (default) | session_timeout_secs (5 min) | Nonstop streaming (vibration monitors, audio)
sampling | max(sampling_timeout_secs, interval×3) (1h+) | Periodic measurement with idle gaps (battery sensors)

Sampling-mode sessions carry additional fields:

  • sampling_interval_secs: expected seconds between measurement starts (e.g., 300 = 5 min)
  • sampling_duration_secs: expected seconds per measurement burst (e.g., 10)

The IngestManager uses per-session timeout instead of a global value. This prevents session explosion for devices that sleep between measurement cycles. The frontend detects gaps in time-series data and breaks chart lines at measurement boundaries.
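
The per-session timeout rule from the table above can be written out as a short function. The type and function names are hypothetical; the formula max(sampling_timeout_secs, interval×3) and the 5-minute continuous default come from the table, and the 3600-second sampling floor used in the example is read from its "1h+" note.

```rust
/// Session mode chosen at creation time.
enum SessionMode {
    Continuous,
    Sampling { sampling_interval_secs: u64 },
}

/// Server-side timeout settings (hypothetical container for the two values).
struct TimeoutConfig {
    session_timeout_secs: u64,
    sampling_timeout_secs: u64,
}

/// Effective idle timeout for one session.
fn effective_timeout_secs(mode: &SessionMode, cfg: &TimeoutConfig) -> u64 {
    match *mode {
        SessionMode::Continuous => cfg.session_timeout_secs,
        SessionMode::Sampling { sampling_interval_secs } => cfg
            .sampling_timeout_secs
            .max(sampling_interval_secs.saturating_mul(3)),
    }
}
```

With a 300-second interval the floor dominates (3 × 300 = 900 < 3600), while a 1800-second interval stretches the timeout to 5400 seconds, so a device that wakes every 30 minutes keeps one session instead of opening a new one per burst.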

Metadata Stream Visualization

The admin dashboard (admin.api.xylolabs.com) includes a metadata visualization frontend:

  • Uploads list (/metadata): paginated table with device filter, status filter, mode filter, stream counts, sample totals, data sizes
  • Session detail (/metadata/:id): per-stream time-series charts (Recharts LineChart), time range selector, sampling info panel, gap-aware rendering for sampling-mode sessions, non-numeric table fallback
  • Audio waveform (/uploads/:id): wavesurfer.js interactive waveform player with play/pause and time scrubbing
  • Device fleet (/): dashboard with device health status bar, active sessions panel, recent uploads panel
  • API request logs (/api-logs): sortable paginated table of captured HTTP logs with method/path/status/date filters, expandable detail rows showing request/response headers and body previews (JSON, base64 hex dump, or "omitted"), debounced text inputs, and facility-scoped access

Operator Dashboard (frontend-app)

The operator dashboard (app.xylolabs.com) is the primary day-to-day interface for facility operators. Built with React 19 + Vite + TailwindCSS 4, it replaces the legacy admin dashboard for routine monitoring tasks:

  • Home (/): facility overview with KPI cards (device count, active sessions, recent alerts), real-time SSE connection status, time-ago formatted timestamps, and skeleton loading states
  • Devices (/devices): paginated device fleet table with health indicators, detail drill-down, and facility-scoped access
  • Sessions (/sessions): ingest session history with metadata summary and detail view. Super Admin sees sessions across every facility by default (no facility filter); other roles are auto-scoped to their own facility by the backend. Pagination is stable: id DESC tiebreaker on the SQL ORDER BY plus a REPEATABLE READ snapshot for total + rows so pages don't shuffle under concurrent writes.
  • Alerts (/alerts): real-time anomaly feed with rule-based action guide cards, severity indicators, and historical alert browser. Super Admin sees alerts across every facility by default; other roles are auto-scoped to their own facility.
  • Trends (/trends): time-series analytics for facility-level metrics
  • Facility Map (/facility-map): spatial device and sensor visualization
  • Settings (/settings): per-user locale (EN/KO), display preferences, and theme (light/dark/auto)

Features: command palette (Ctrl+K), toast notifications, keyboard shortcuts, responsive mobile layout with bottom-tab navigation, auto night mode, and skeleton loading throughout.

Grafana-style Dashboard Primitives

Both frontends ship a shared dashboard primitive set modeled after Grafana's panel + global time range + auto-refresh pattern. Source lives in frontend/src/components/dashboard/ and a mirrored copy in frontend-app/src/components/dashboard/ (no monorepo package — adapted per-side for differing auth / routing surfaces).

Component | Responsibility
DashboardProvider | Exposes { timeRange, refreshIntervalMs } to nested panels; syncs to URL (?from=&to=&refresh=).
DashboardGrid | 12-col responsive CSS grid. Tablet: 2-col. Mobile (≤ 768 px): single column.
Panel | Canonical card primitive: header (title + optional info-tooltip + action menu), body, shared loading / empty / error slots. Named exports PanelSkeleton, PanelEmpty, PanelError.
StatPanel | Single big-number panel with optional Recharts sparkline, threshold-color band, font-variant-numeric: tabular-nums, aria-label on the numeric value. Renders explicit empty state when value is null.
TimeRangePicker | Relative ranges (Last 5 m / 1 h / 24 h / 7 d / 30 d) + custom datetime-local; aria-haspopup="listbox"; iOS font-size: 16px floor.
RefreshPicker | Off / 10 s / 30 s / 1 m / 5 m / 15 m. Off maps to refetchInterval: false.
PanelSkeleton | Pulse-animated rows; motion-safe:animate-pulse honors prefers-reduced-motion.

The admin Operations Dashboard (frontend/src/pages/DashboardPage.tsx) is composed of ≥ 10 panels including LiveMetricsCard (live audio pipeline), system health (/api/health), transcode jobs (4-up status strip), recent uploads, active sessions, device fleet, and facility map. Every panel reads time range + refresh from useDashboardContext(). SuperAdmin-only panels (e.g. live audio metrics) fail silently on 403 so the layout stays clean for non-admins.

The facility-user app dashboard (frontend-app/src/pages/DashboardPage.tsx at /dashboard, keyboard shortcut g b) renders 7 facility-scoped panels including My Active Live Streams, Recent Recordings, Recent Alerts, and the SVG DeviceLastSeenHeatmap (device × hour grid, color-banded by recency). All queries are scoped to the user's session facility_id; no cross-facility leakage by construction.

Accessibility / responsive contract: heading semantics on every Panel header, aria-haspopup/aria-expanded on every picker, visible focus rings on action links, motion-safe: on every pulse animation, single-column collapse ≤ 768 px, dark/light mode parity verified.

Internal API

The Internal API (/api/internal/*) provides machine-to-machine endpoints for the GPU inference fleet and anomaly reporting. Authentication is by API key with the internal scope, and every key is scoped to a single facility_id; responses never cross facilities.

  • GPU Server Management (crates/xylolabs-server/src/routes/gpu_servers.rs) -- Per-facility CRUD plus utilization reporting and snapshot endpoints. Each row carries an operational status (online / offline / draining / error) that gates job scheduling and a separate observability health_status (healthy / degraded / unknown) driven by the health checker. SSRF-prone inputs are rejected at create time (link-local, cloud metadata 169.254.169.254, loopback outside XYLOLABS_ENV=development/test). Direct status overrides via PATCH bypass the health-checker state machine and are recorded in the audit log.
  • Inference Models (crates/xylolabs-server/src/routes/inference.rs) -- Per-facility CRUD. Each model points to an S3 artifact and declares a framework (onnx / tensorrt / pytorch / custom) and an input_type (audio / sensor_fusion / image / text). is_active=false removes the model from job/proxy lookups without deleting it.
  • Inference Jobs (crates/xylolabs-server/src/routes/inference.rs) -- Submit, list, get, and cancel async jobs. The background inference_worker (default 4 tokio tasks) atomically claims rows with SELECT ... FOR UPDATE SKIP LOCKED and processes them on the assigned GPU. Submission validates that any pinned gpu_server_id belongs to the caller's facility and is online; payloads are bounded to 64 KB.
  • Inference Proxy (crates/xylolabs-server/src/routes/inference_proxy.rs) -- Synchronous low-latency path. The handler resolves (model_name, model_version) to an active model row, picks a GPU via gpu_server::find_first_available, and forwards POST /v1/inference with a 30 s timeout. Upstream responses are read with a streaming 1 MB bound; oversize bodies are rejected with 400. Persistent upstream errors mark the GPU server with last_error so the health checker can rotate it out; transient HTTP 429/503 are surfaced as 409 Conflict without taking the server offline.
  • Anomaly Detection (crates/xylolabs-server/src/routes/anomaly.rs) -- Reports come from three sources: realtime (inline detection in IngestManager::process_batch), batch (the placeholder POST /anomaly/batch endpoint, which currently records an info-severity marker report and broadcasts an event for downstream batch jobs to consume), and manual. Severities are info / warning / critical. The GET /anomaly/live SSE endpoint forwards events filtered by the subscriber's facility; an event: lag notice is emitted when the broadcast channel overflows.

Background services in crates/xylolabs-server/src/services/:

  • inference_worker.rs -- INFERENCE_WORKER_CONCURRENCY tokio tasks (default 4). Exponential 2 → 30 s backoff when the queue is empty. Cancellation is honoured between claim and mark_running. Response bodies are streamed through a 1 MB bounded reader, error bodies through a 10 KB bound, all failure messages truncated to 256 chars before persistence.
  • gpu_health_checker.rs -- runs every GPU_HEALTH_CHECK_INTERVAL_SECS (default 60, minimum 10). Probes every online server with buffer_unordered(10) parallelism. HTTP 429/503 are tolerated; other failures call mark_server_error.
  • alert_common.rs, alert_text.rs, alert_llm.rs, alert_trigger.rs -- bridge anomalies and configured alert rules to the user-facing alert pipeline (email, SMS via sms.rs, web push via push.rs, webhooks via webhook_dispatch.rs).
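
The bounded reads and message truncation mentioned for the inference worker and proxy can be illustrated with the standard library alone. This is a synchronous sketch with hypothetical function names; the server itself runs on tokio, so its implementation will look different. The caller supplies the byte limit (for example the 1 MB and 10 KB bounds quoted above).

```rust
use std::io::{self, Read};

/// Reads the whole body but fails once it exceeds `limit` bytes.
/// At most `limit + 1` bytes are buffered, so an oversize body is detected
/// without reading it to the end.
fn read_bounded<R: Read>(reader: R, limit: u64) -> io::Result<Vec<u8>> {
    let mut buf = Vec::new();
    reader.take(limit.saturating_add(1)).read_to_end(&mut buf)?;
    if buf.len() as u64 > limit {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "body exceeds limit",
        ));
    }
    Ok(buf)
}

/// Truncates a failure message to at most `max_chars` characters,
/// cutting on character boundaries so multi-byte text stays valid UTF-8.
fn truncate_chars(msg: &str, max_chars: usize) -> String {
    msg.chars().take(max_chars).collect()
}
```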

Concurrency / capacity knobs: INFERENCE_WORKER_CONCURRENCY, INFERENCE_JOB_STALE_TIMEOUT_SECS, GPU_HEALTH_CHECK_INTERVAL_SECS, ANOMALY_BROADCAST_CAPACITY.

Database tables: gpu_servers, inference_models, inference_jobs, anomaly_reports.

Full endpoint reference: API Documentation (English) -- Section 24 | API Documentation (한국어) -- Section 24


Inference Pipeline

External ML clients fetch session data, run local models, and post results back via the inference pipeline. Anomaly reports are broadcast via SSE to the operator dashboard in real time.

Full endpoint reference: API Documentation (English) -- Section 26 | API Documentation (한국어) -- Section 26

Inference Pipeline Architecture

Device (audio + sensors)
        │
        ▼
  Ingest endpoint  ──▶  PostgreSQL + S3
        │
        ▼
  Inference client polls for closed sessions
  (GET /api/v1/metadata/sessions, /inference-bundle)
        │
        ▼
  Client runs local ML model
  (anomaly detection, event classification, audio analysis)
        │
        ▼
  POST /api/internal/inference/results  (or /results/batch)
        │
        ▼
  API creates anomaly_reports row + broadcasts AnomalyEvent via SSE
        │
        ▼
  Operator app (app.xylolabs.com) renders real-time alert

The pipeline is facility-scoped end-to-end. An inference client authenticated with an internal-scope API key can only read sessions and write reports within its own facility.

GPU Server Management

GPU servers are the inference compute nodes registered per-facility. Each server row carries two orthogonal state flags:

  • status (online / offline / draining / error) — gates whether the server receives jobs and proxy calls. Only online servers are selected by the job scheduler and proxy handler.
  • health_status (healthy / degraded / unknown) — driven exclusively by the gpu_health_checker background task; operators read this as an observability signal, not a scheduling gate.

Direct overrides via PATCH /api/internal/gpu-servers/{id} (setting status manually) bypass the health-checker state machine and are recorded in the audit log. Use them for maintenance windows and emergency rotation.

Health checker (crates/xylolabs-server/src/services/gpu_health_checker.rs): probes every online server at GPU_HEALTH_CHECK_INTERVAL_SECS (default 60, minimum 10) via GET http://<ip>:<port>/health. Persistent failures (non-429/503) call mark_server_error. Transient 429/503 are tolerated without degrading status.

SSRF protection: link-local addresses, 169.254.169.254 (cloud metadata), and loopback are rejected at registration time (except in development/test environments).
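
A registration-time address check along these lines could be sketched as follows. The function name and the `allow_loopback` flag (standing in for the development/test exemption) are assumptions. The sketch covers only the three categories named above and does not handle hostnames, DNS rebinding, or IPv4-mapped IPv6 addresses.

```rust
use std::net::IpAddr;

/// Returns true when a GPU server address passes the documented checks.
fn is_allowed_gpu_address(addr: IpAddr, allow_loopback: bool) -> bool {
    let link_local = match addr {
        // 169.254.0.0/16, which includes the cloud metadata address 169.254.169.254.
        IpAddr::V4(v4) => v4.is_link_local(),
        // fe80::/10, checked by hand to avoid depending on newer std helpers.
        IpAddr::V6(v6) => (v6.segments()[0] & 0xffc0) == 0xfe80,
    };
    if link_local {
        return false;
    }
    if addr.is_loopback() && !allow_loopback {
        return false;
    }
    true
}
```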

Anomaly Detection Workflow

Anomaly reports are created from three sources:

Source | How created | Typical use
realtime | Inline in IngestManager::process_batch when a sensor value crosses a configured threshold | Immediate alerts during live ingest
batch | POST /api/internal/anomaly/batch creates a placeholder info report and broadcasts a trigger event | Kick off downstream batch analysis jobs
manual | POST /api/internal/inference/results with results from an external ML model | ML-driven anomaly detection after session close

Severity levels: info (informational, no immediate action), warning (investigate soon), critical (requires immediate attention).

Threshold-based detection runs synchronously inside the ingest path and must be O(1). It checks each incoming sensor sample against per-stream thresholds configured in the facility settings.

ML-driven detection runs asynchronously after session close. The inference client fetches the session bundle, runs the model, and calls POST /api/internal/inference/results. Reports appear on GET /api/internal/anomaly/live within milliseconds of submission.

Resolution: any report can be marked resolved via POST /api/internal/anomaly/reports/{id}/resolve. Resolved reports are retained for audit purposes; is_resolved: true and resolved_at are set.

Reclassification: PATCH /api/internal/anomaly/reports/{id} lets an operator or automated reviewer update anomaly_type, severity, confidence, or description without creating a new report.

Event and Label Management

anomaly_type is a free-form string (≤128 chars) defined by the application. Use it as a hierarchical classifier, for example:

  • bearing_fault / bearing_wear / bearing_spall
  • overtemperature / thermal_runaway
  • impact_event / resonance / imbalance

Consistency across the facility enables filtering, trend analysis, and alert rule matching. The PATCH /api/internal/anomaly/reports/{id} endpoint is the reclassification path when a label needs correction after human review.

The confidence field (float, 0–1) is set by the ML model and carried through to the alert pipeline. Alert rules can filter on confidence_min to suppress low-confidence noise.

Operator Notification Flow

When an anomaly report is created (from any source), the platform:

  1. Writes the anomaly_reports row to PostgreSQL.
  2. Broadcasts an AnomalyEvent on the tokio::sync::broadcast channel (capacity ANOMALY_BROADCAST_CAPACITY, default 10000).
  3. GET /api/internal/anomaly/live SSE subscribers receive the event, filtered by facility_id.
  4. The alert bridge (services/alert_trigger.rs) evaluates whether any configured alert rule matches the report. If a rule matches, it creates a user-facing alert and dispatches notifications (email, SMS, web push, webhook) per the facility's alert configuration.
  5. The operator app at app.xylolabs.com receives the SSE event and displays the alert card in real time on the /alerts page.

SSE backpressure: if the broadcast channel fills faster than a subscriber can drain it, the subscriber receives event: lag with a skipped count and jumps to the current tail. The ingest path is never back-pressured.
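
The lag behavior can be modeled without any async runtime. The sketch below is a simplified stand-in for the tokio broadcast channel the platform uses, with hypothetical type names: the producer overwrites the oldest slot when the buffer is full and never waits, and a slow reader gets one lag notice with the skipped count before resuming. It assumes the reader resumes at the oldest event still retained, which is how tokio's broadcast receiver behaves after a lag.

```rust
use std::collections::VecDeque;

#[derive(Debug, PartialEq)]
enum Delivery<T> {
    Event(T),
    /// The reader fell behind and `skipped` events were overwritten.
    Lag { skipped: u64 },
    Empty,
}

struct Channel<T> {
    capacity: usize,
    buf: VecDeque<T>,
    /// Sequence number that the next sent event will receive.
    next_seq: u64,
}

impl<T: Clone> Channel<T> {
    fn new(capacity: usize) -> Channel<T> {
        assert!(capacity > 0);
        Channel { capacity, buf: VecDeque::with_capacity(capacity), next_seq: 0 }
    }

    /// Never blocks: when full, the oldest event is dropped.
    fn send(&mut self, event: T) {
        if self.buf.len() == self.capacity {
            self.buf.pop_front();
        }
        self.buf.push_back(event);
        self.next_seq += 1;
    }

    /// `cursor` is the reader's position (sequence number of its next event).
    fn recv(&self, cursor: &mut u64) -> Delivery<T> {
        let oldest = self.next_seq - self.buf.len() as u64;
        if *cursor < oldest {
            let skipped = oldest - *cursor;
            *cursor = oldest;
            return Delivery::Lag { skipped };
        }
        if *cursor == self.next_seq {
            return Delivery::Empty;
        }
        let index = (*cursor - oldest) as usize;
        *cursor += 1;
        Delivery::Event(self.buf[index].clone())
    }
}
```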

Keep-alive: a ping comment is sent every 30 seconds so HTTP intermediaries (proxies, load balancers) do not close the idle connection.

Inference Worker (Async Job Path)

crates/xylolabs-server/src/services/inference_worker.rs. Runs INFERENCE_WORKER_CONCURRENCY (default 4) independent tokio tasks. Each task:

  1. Polls inference_jobs for facilities with queued rows using SELECT ... FOR UPDATE SKIP LOCKED (prevents double-claim).
  2. Resolves the target GPU: pinned gpu_server_id if set, otherwise gpu_server::find_first_available.
  3. Transitions to running only if mark_running returns Some — if the job was cancelled between claim and transition, the worker drops it without contacting the GPU.
  4. POST /v1/inference to the GPU URL with a 1 MB streaming response bound. Stores the parsed JSON in inference_jobs.result.
  5. Error handling: HTTP 429/503 mark the job failed but do not take the GPU offline. Any other failure marks both the job and the GPU server with last_error.
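
The error-handling rule in step 5 can be sketched as a classification function. The enum names are hypothetical, and treating any 2xx status as success is an assumption; the 429/503 special case and the "any other failure" rule come from the step above.

```rust
/// What the worker observed when calling the GPU server.
enum UpstreamResult {
    /// An HTTP response arrived with this status code.
    Status(u16),
    /// No usable response (timeout, connection refused, and so on).
    TransportError,
}

#[derive(Debug, PartialEq)]
enum Outcome {
    Success,
    /// Job marked failed; GPU server left untouched.
    JobFailedOnly,
    /// Job marked failed and last_error recorded on the GPU server.
    JobAndGpuFailed,
}

fn classify(result: &UpstreamResult) -> Outcome {
    match *result {
        UpstreamResult::Status(code) if code >= 200 && code < 300 => Outcome::Success,
        // Overload signals: the GPU is busy, not broken.
        UpstreamResult::Status(429) | UpstreamResult::Status(503) => Outcome::JobFailedOnly,
        _ => Outcome::JobAndGpuFailed,
    }
}
```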

Empty-queue backoff is exponential (2 → 30 s), resetting to 2 s on any successful claim.
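
A minimal sketch of that backoff policy, with a hypothetical type name. The 2 s floor, 30 s ceiling, and reset on claim come from the sentence above; the doubling factor is an assumption, since the text only says "exponential".

```rust
use std::cmp;
use std::time::Duration;

const MIN_DELAY: Duration = Duration::from_secs(2);
const MAX_DELAY: Duration = Duration::from_secs(30);

struct EmptyQueueBackoff {
    current: Duration,
}

impl EmptyQueueBackoff {
    fn new() -> EmptyQueueBackoff {
        EmptyQueueBackoff { current: MIN_DELAY }
    }

    /// Call when a poll finds no queued job; returns how long to sleep
    /// and doubles the next delay up to the 30 s cap.
    fn on_empty(&mut self) -> Duration {
        let delay = self.current;
        self.current = cmp::min(self.current * 2, MAX_DELAY);
        delay
    }

    /// Call after a successful claim; the next empty poll waits 2 s again.
    fn on_claim(&mut self) {
        self.current = MIN_DELAY;
    }
}
```

Under the doubling assumption, consecutive empty polls wait 2, 4, 8, 16, 30, 30 seconds.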

Stale jobs (stuck in running past INFERENCE_JOB_STALE_TIMEOUT_SECS, default 3600) are requeued on worker startup and every 60 seconds thereafter.

Inference Proxy (Sync Path)

POST /api/internal/proxy/inference is the synchronous low-latency alternative to the job queue. It resolves (model_name, model_version) → active model → first available GPU and forwards POST /v1/inference with a 30 s timeout. Use this path for interactive or real-time inference where sub-second latency matters; use the job queue for long-running batch workloads.


API Reference

REST API documentation for the Xylolabs server.

  • API Documentation (English) — Full REST API reference: authentication, facilities, users, API keys, devices, audio upload, audio streaming, transcode jobs, tags, metadata ingest, metadata query, system configuration, dashboard stats, health, XMBP protocol reference, RBAC, error responses, data models, pagination, example workflows
  • API Documentation (한국어)

SDK — Rust Crates

The Rust SDK is the recommended path for all new firmware development. All crates are no_std-compatible.

Core SDK

HAL Crates (Platform Implementations)

Crate | Target | Transport | Codec | Embassy chip pin
xylolabs-hal-rp | RP2350 (Pico 2) | LTE-M1 modem via UART | XAP | embassy-rp 0.10
xylolabs-hal-esp | ESP32-S3, ESP32-C3 | Native WiFi (esp-wifi) | XAP / ADPCM | esp-hal =1.0.0-beta.0 + esp-wifi 0.13 (pinned for board1-v1 firmware)
xylolabs-hal-stm32 | STM32F103, F411, U585, WB55, WBA55 | LTE-M1 modem via UART / BLE GATT | XAP / ADPCM | embassy-stm32 0.6
xylolabs-hal-nrf | nRF52840, nRF9160 | BLE GATT / LTE-M | XAP / ADPCM | embassy-nrf 0.10

Shared HAL deps: embassy-time 0.5.1, embassy-sync 0.8, defmt 1.1. The xylolabs-hal-esp pin is intentional — xylolabs-platform/firmware/board1-v1 locks to the same beta set; revisit when upstream stabilises esp-hal-embassy 0.10+.

Korean versions: hal-rp · hal-esp · hal-stm32 · hal-nrf

SDK examples — workspace layout

sdk/examples/ ships 12 #![no_std] Embassy examples but cargo cannot build them all from a single workspace, because feature unification across members collides with embassy's chip pins (embassy-stm32 asserts a single chip feature), tick-rate exports (embassy-time exports one TICK_HZ per build), and links = "embassy-time-queue" uniqueness. Each chip / tick-rate / esp-hal series therefore declares its own [workspace] table. The remaining root workspace at sdk/examples/Cargo.toml holds only the two RP2350 examples; standalone workspaces re-share the same [patch.crates-io] block pinning the embassy crates to embassy main commit e9c32931b906 so embassy-stm32-wpan (publish = false on crates.io) can resolve alongside the version-pinned embassy crates.

Build status:

  • rp2350-sensor, rp2350-audio, stm32f103-sensor, stm32f411-audio, stm32u5-lowpower, stm32wb55-ble, stm32wba55-ble, nrf52840-ble, nrf9160-lte: cargo check passes against their respective targets.
  • rp2350-full-hardware: excluded; PIO/I2C/SPI all moved in embassy-rp 0.10 and the example needs a focused rewrite.
  • esp32s3-wifi, esp32c3-wifi: kept on the same esp-hal beta pin as xylolabs-hal-esp.


Code Examples

Legacy C reference examples. For new development, use the Rust SDK.

Platform | Example | Description
RP2350 | docs/examples/pico/ | C examples: continuous sensor streaming, periodic sampling, audio upload via I2S + chunked HTTP
RP2350 Full Hardware | docs/examples/rp2350-full-hardware/ | Field-deployable node: PCM1860 + WM-61A + CHT832X + ADXL345 + BSS84 + BG770A wired to RP2350
ESP32 | docs/examples/esp32/ | C examples: ESP32-S3 full audio + sensors (XAP, WiFi), ESP32-C3 lightweight sensor-only (XMBP over WiFi)
STM32 | docs/examples/stm32/ | C examples: F411 audio + sensors (XAP, LTE-M1), F103 sensor-only (ADPCM, LTE-M1), WB55 BLE sensor node
nRF | docs/examples/nrf/ | C example: nRF52840 BLE sensor beacon with XMBP, ultra-low-power sleep

Korean versions: pico · rp2350-full-hardware · esp32 · stm32 · nrf


Quick Reference

Codec Selection by Platform

Platform | XAP | ADPCM | Notes
RP2350 (Pico 2) | Yes (4ch @ 96 kHz) | Yes | Primary target
ESP32-S3 | Yes (4ch @ 96 kHz) | Yes | Native WiFi
ESP32-C3 | No (no FPU) | Yes | Sensor node only
STM32F411 | Yes (4ch @ 48 kHz) | Yes |
STM32WB55 | Yes (2ch @ 48 kHz) | Yes | BLE offload
STM32F103 | No (no FPU) | Optional | Sensor node only
nRF52840 | Yes (2ch @ 48 kHz) | Yes | Via BLE gateway
nRF9160 | No | Yes | Sensor node only
RP2040 (Pico) | No (no FPU) | Yes | Sensor node only

XMBP Value Type Registry (quick lookup)

See XMBP Specification §5 for the full registry. Audio codec identifiers (including XAP 0x03) are defined in §6.

I16 and I8 Types — Compact Sensor Encoding

XMBP supports two compact integer types for bandwidth-sensitive sensor streams:

Type | Wire Tag | Value Size | Total Sample Size | Bandwidth vs F32
i16 | 0x0B | 2 bytes | 10 bytes | −17% vs F32 (12 bytes)
i8 | 0x0C | 1 byte | 9 bytes | −25% vs F32 (12 bytes)

Use cases:

  • ADXL345 accelerometer: Raw ADC output is 10–13 bits, fitting naturally in i16. Using i16 instead of f32 saves 17% bandwidth per axis per sample on Cat-M1 links.
  • CHT832X / SHT31 temperature and humidity: 14-bit temperature and 11-bit humidity readings fit in i16, either as raw counts or as scaled fixed-point values (e.g., hundredths of a degree), avoiding floating-point conversion overhead on no-FPU targets.
  • Any sensor producing small integer ADC counts that would lose no precision in 8–16 bit representation.

SDK methods: meta_feed_i16(stream_index, value: i16) and meta_feed_i8(stream_index, value: i8). These mark the stream type automatically; no separate type declaration is needed per sample. The SDK flushes the stream using write_stream_i16_bulk / write_stream_i8_bulk for efficient batch encoding.
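
The sample sizes in the table are consistent with an 8-byte u64 microsecond timestamp followed by one value. The arithmetic can be checked with a minimal Python sketch. The helper names are hypothetical, and the field order used in `pack_i16_sample` is an assumption for illustration; the authoritative sample layout is in the XMBP Specification.

```python
import struct

TIMESTAMP_BYTES = 8  # u64 microseconds, per the XMBP property table
VALUE_BYTES = {"f32": 4, "i16": 2, "i8": 1}


def sample_size(value_type: str) -> int:
    """Bytes per sample: one timestamp plus one value."""
    return TIMESTAMP_BYTES + VALUE_BYTES[value_type]


def saving_vs_f32(value_type: str) -> float:
    """Fractional bandwidth saving relative to an f32 sample."""
    return 1 - sample_size(value_type) / sample_size("f32")


def pack_i16_sample(timestamp_us: int, value: int) -> bytes:
    """ASSUMED layout: big-endian u64 timestamp, then big-endian i16 value."""
    return struct.pack(">Qh", timestamp_us, value)


for t in ("i16", "i8"):
    print(t, sample_size(t), f"{saving_vs_f32(t):.0%}")
```

The savings apply per sample only; batch and stream-block headers are not included in this calculation.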


Session Changes — 2026-05-24

Timeline Visualization (Grafana-style)

The device timeline page (/timeline) received a major visualization overhaul modeled on Grafana panel conventions.

Chart upgrades (frontend/src/components/devices/DeviceTimelineChart.tsx):

  • LineChart replaced with AreaChart with gradient fill and a colored left border per stream
  • Colors drawn from the arrayColors Grafana-inspired palette (frontend/src/lib/colors.ts)
  • Shared X-axis: only the bottom chart in a multi-stream stack renders time labels, eliminating label repetition
  • Inline stats header per stream: min / max / avg / last values displayed above each panel
  • Compact panel height (120–150 px) instead of tall standalone cards
  • LTTB downsampling applied before Recharts render in all chart components (DeviceTimelineChart, VectorTripletChart, StreamChart)
  • Compact pill-style range selector (h-7 buttons) replaces the previous full-height control strip
  • Recording events track and live segments track rendered as flat bands, with no Panel card wrapper
  • Unified container with a rounded border and divide-y separators instead of individual card stacks

VectorTripletChart (frontend/src/components/metadata/VectorTripletChart.tsx):

  • LTTB downsampling applied before render
  • Redundant sort skipped when data arrives pre-sorted from the API

StreamChart (frontend/src/components/metadata/StreamChart.tsx):

  • Upgraded to AreaChart with gradient fill and a stats header (min/max/avg/last)
  • Multi-channel audio detection: when audio/info reports > 1 channel, renders separate WaveformPlayer instances per channel
  • Adaptive tooltip precision

Chart grid theming (frontend/src/lib/chartTheme.ts):

  • Grid stroke lightened to slate-200 (light mode) / slate-800 (dark mode), matching Grafana's subtle grid style
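
For readers unfamiliar with the LTTB (Largest-Triangle-Three-Buckets) downsampling used by the chart components, the following Python sketch shows the basic algorithm. It is a generic illustration, not the code in frontend/src/lib/downsampling.ts, and it does not include the gap-preserving variant. The function name `lttb` and the `(x, y)` tuple representation are assumptions.

```python
def lttb(points, threshold):
    """Largest-Triangle-Three-Buckets downsampling.

    points: list of (x, y) tuples sorted by x.
    Returns `threshold` points when 3 <= threshold < len(points);
    otherwise returns the input unchanged.
    """
    n = len(points)
    if threshold >= n or threshold < 3:
        return list(points)

    bucket = (n - 2) / (threshold - 2)  # first and last points are always kept
    sampled = [points[0]]
    a = 0  # index of the most recently selected point

    for i in range(threshold - 2):
        # The average of the NEXT bucket is the third triangle vertex.
        nxt_start = int((i + 1) * bucket) + 1
        nxt_end = min(int((i + 2) * bucket) + 1, n)
        span = points[nxt_start:nxt_end]
        avg_x = sum(p[0] for p in span) / len(span)
        avg_y = sum(p[1] for p in span) / len(span)

        # Keep the point in the CURRENT bucket that forms the largest triangle.
        cur_start = int(i * bucket) + 1
        cur_end = int((i + 1) * bucket) + 1
        ax, ay = points[a]
        best, best_area = cur_start, -1.0
        for j in range(cur_start, cur_end):
            area = abs((ax - avg_x) * (points[j][1] - ay)
                       - (ax - points[j][0]) * (avg_y - ay))
            if area > best_area:
                best, best_area = j, area
        sampled.append(points[best])
        a = best

    sampled.append(points[-1])
    return sampled
```

Because each bucket keeps the point that deviates most from its neighbours, isolated spikes tend to survive downsampling, which is why LTTB is preferred over simple decimation for sensor charts.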

Performance Improvements

Timeline API: 5.5 s → < 0.5 s

| Before | After |
| --- | --- |
| N+1 per-session DB queries (1 008 roundtrips for ~112 sessions) | list_by_streams(stream_ids[]) batch query (~9 roundtrips) |
| Per-session sequential S3 chunk downloads | All chunks downloaded in one buffer_unordered(32) parallel pass |
| RMS decode attempted on every stream type | Bytes-type streams (recording events) skip decode entirely and use DB metadata only |
| MAX_CONCURRENT_CHUNK_DOWNLOADS = 8 | Raised to 32 in device_timeline.rs |

Implemented in crates/xylolabs-server/src/routes/device_timeline.rs. The batch DB function list_by_streams lives in crates/xylolabs-db/src/repo/metadata_chunk.rs.
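
The bounded parallel download pass can be illustrated with a Python analogue. The server itself is Rust and uses a stream combinator with a limit of 32; the sketch below only mirrors the idea of capping in-flight requests. `fetch_all` and the `fetch` callback are hypothetical names, not part of the codebase.

```python
import asyncio


async def fetch_all(keys, fetch, limit=32):
    """Fetch every key concurrently with at most `limit` requests in flight.

    `fetch` is any coroutine function taking a key and returning its body.
    Results are collected in completion order, not input order.
    """
    sem = asyncio.Semaphore(limit)

    async def one(key):
        async with sem:  # blocks while `limit` fetches are already running
            return key, await fetch(key)

    results = {}
    for finished in asyncio.as_completed([one(k) for k in keys]):
        key, body = await finished
        results[key] = body
    return results
```

Collecting in completion order means one slow chunk does not hold up chunks that have already arrived; the caller re-sorts by timestamp afterwards if order matters.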

Other performance fixes

  • f32_array mean aggregation: arrays are now mean-aggregated (one representative value per array sample) instead of fully unpacked before downsampling — reduces 2.3 M points to ~144 per axis in typical sessions
  • XAP decode (crates/xylolabs-transcode/src/xap_decode.rs): per-frame Vec allocations hoisted outside decode loops
  • Gap detection (frontend/src/lib/timeline/gap-detection.ts): early-return when no gaps exist in a series
  • VectorTripletChart: skip redundant sort when input data is already sorted
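
As a minimal illustration of the f32_array mean aggregation described in the list above, the sketch below collapses each array sample to one value. The `(timestamp_us, values)` tuple shape and the handling of empty arrays are assumptions, not the server's actual types.

```python
def mean_aggregate(samples):
    """Collapse (timestamp_us, [values...]) samples to (timestamp_us, mean)."""
    out = []
    for timestamp_us, values in samples:
        if not values:
            continue  # ASSUMPTION: empty arrays are dropped rather than plotted
        out.append((timestamp_us, sum(values) / len(values)))
    return out


# Three array samples of 4 elements each become 3 points instead of 12.
print(mean_aggregate([(0, [1, 2, 3, 4]), (10, [2, 2, 2, 2]), (20, [0, 4, 0, 4])]))
```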

Per-channel Audio Playback and Download

Backend (crates/xylolabs-server/src/routes/metadata_query.rs):

  • GET /api/v1/metadata/sessions/{id}/streams/{stream_id}/audio now accepts ?channel=N (0-indexed). When supplied, the server extracts the requested channel from the decoded XAP PCM and returns a mono WAV. Without the param, the full multi-channel WAV is returned as before.
  • New endpoint GET /api/v1/metadata/sessions/{id}/streams/{stream_id}/audio/info reads only the first XAP chunk header and returns { channels, sample_rate_hz, total_samples_per_channel, frame_duration_us } without decoding the entire stream.
  • Both endpoints are also reachable via API key with the media:read scope.

Frontend (frontend/src/components/metadata/StreamChart.tsx):

  • StreamChart calls /audio/info on mount for Bytes-type streams. If channels > 1, it renders one WaveformPlayer per channel, each bound to ?channel=N. A single-channel stream falls back to the original behavior.
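
The channel extraction behind ?channel=N can be sketched as follows. This assumes the decoded PCM is interleaved 16-bit little-endian, which is not stated in this document; the function names are hypothetical and the real handler is Rust.

```python
import io
import struct
import wave


def extract_channel(pcm: bytes, channels: int, channel: int) -> bytes:
    """Pick one channel out of interleaved 16-bit little-endian PCM."""
    if not 0 <= channel < channels:
        raise ValueError("channel out of range")
    frame_count = len(pcm) // (2 * channels)  # trailing partial frame is ignored
    usable = pcm[: frame_count * channels * 2]
    samples = struct.unpack(f"<{frame_count * channels}h", usable)
    return struct.pack(f"<{frame_count}h", *samples[channel::channels])


def mono_wav(pcm_mono: bytes, sample_rate_hz: int) -> bytes:
    """Wrap mono 16-bit PCM in a WAV container."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(sample_rate_hz)
        w.writeframes(pcm_mono)
    return buf.getvalue()
```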

Clock Drift Correction

  • Non-time-filtered chunk queries: device clocks can be days ahead of server time, so chunk queries for the timeline no longer apply a server-side time window at the DB layer. The filter is now applied after clock-anchor correction.
  • Clock-anchor correction is now applied to recording events as well as numeric stream samples.
  • Time-range display filter moved to after clock-anchor correction in device_timeline.rs.
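
The reason for the ordering (correct first, filter second) is easier to see in a small sketch. It assumes a clock anchor is a pair of matching device and server timestamps for the session and that correction is a constant offset; the actual anchor logic in device_timeline.rs may differ.

```python
def correct_and_filter(samples, anchor_device_us, anchor_server_us, start_us, end_us):
    """Shift device timestamps onto the server clock, then apply the time range.

    samples: list of (device_timestamp_us, value).
    ASSUMPTION: one constant offset per session, derived from a single anchor.
    """
    offset = anchor_server_us - anchor_device_us
    corrected = [(ts + offset, value) for ts, value in samples]
    return [(ts, value) for ts, value in corrected if start_us <= ts < end_us]


DAY_US = 86_400_000_000
server_now = 1_000 * DAY_US
# Device clock runs two days ahead of the server.
raw = [(server_now + 2 * DAY_US + i, float(i)) for i in range(3)]
print(correct_and_filter(raw, server_now + 2 * DAY_US, server_now,
                         server_now, server_now + 10))
```

Filtering `raw` by the same window before correction would return nothing, which is the failure mode the DB-layer time window used to cause.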

UI Fixes

| Issue | Fix / Location |
| --- | --- |
| Download icon was rendering an upload arrow | frontend/src/ (icon usage fixed) |
| Heatmap cells before last_seen all showed red | Amber shown for cells between first and last seen; only cells after last_seen + threshold show red |
| Nav: "Dashboard" appeared twice (sidebar + Dashboard page) | Renamed to "Home" / "홈" in i18n |
| Admin sidebar logo and title were not clickable | Now navigate to / on click (frontend/src/components/layout/Sidebar.tsx) |
| Chart grid too prominent in both themes | chartTheme.ts updated to slate-200 / slate-800 |

New Frontend Source Files

The following files were added to frontend/src/:

| File | Purpose |
| --- | --- |
| lib/timeline/gap-detection.ts | Gap detection for time-series data (early-return optimized) |
| lib/timeline/gap-detection.test.ts | Unit tests for gap detection |
| lib/timeline/session-boundary.ts | Session boundary detection for sampling-mode sessions |
| lib/timeline/session-boundary.test.ts | Unit tests for session boundary detection |
| lib/timeline/url-state.ts | URL state sync helpers for the timeline page |
| lib/timeline/url-state.test.ts | Unit tests for URL state |
| lib/downsampling.ts | LTTB downsampling implementation (gap-preserving variant included) |
| lib/downsampling.test.ts | Unit tests for LTTB downsampling |
| components/devices/DeviceTimelineChart.tsx | Grafana-style AreaChart timeline panel |
| components/devices/DeviceTimelineRecordingEvents.tsx | Flat recording events track |
| components/devices/DeviceTimelineLiveSegments.tsx | Flat live segments track |
| pages/DeviceTimelinePage.tsx | Device timeline page |
| pages/DeviceTimelinePage.test.tsx | Page-level smoke test |
| pages/TimelineIndexPage.tsx | Timeline index / device picker landing |
| components/metadata/VectorTripletChart.tsx | 3-axis vector chart (accel/gyro/mag) with LTTB |
| components/metadata/GyroChart.tsx | Gyroscope panel |
| components/metadata/MagChart.tsx | Magnetometer panel |

Session Changes — 2026-05-29

1. Charts: Recharts (SVG) → uPlot (Canvas)

All time-series chart components in both frontends were migrated from Recharts (SVG-based) to uPlot (canvas-based), which renders orders of magnitude more points without frame drops.

Migrated components:

| Component | Frontend | Notes |
| --- | --- | --- |
| DeviceTimelineChart.tsx | frontend/ (admin) | Per-stream colors, synced crosshair, shared X-axis, gradient fill, inline stats strip |
| StreamChart.tsx (numeric and array paths) | frontend/ (admin) | Mean-aggregated array values, LTTB downsampling before render |
| VectorTripletChart.tsx | frontend/ (admin) | Accel / gyro / mag triplet panels |
| DeviceTimelineChart.tsx | frontend-app/ (operator) | Same panel conventions as admin |
| StreamChart.tsx | frontend-app/ (operator) | Operator-facing sensor charts |

Not migrated (intentional): WaveformPlayer (already canvas-based via WaveSurfer), FFT spectrogram, boolean bar charts, and data tables. These do not benefit from uPlot's time-series optimizations.

Per-stream colors are drawn from the arrayColors Grafana-inspired palette (frontend/src/lib/colors.ts). Crosshairs are synchronized across panels sharing the same X axis. LTTB downsampling is applied server-side for the timeline API and client-side before uPlot render for locally-held data.

2. Daily AI Report (Operator Dashboard)

A new endpoint GET /api/v1/facility/daily-report generates a Korean-language daily facility report using Gemini. The report is cached per-facility for 24 hours in a mini_moka in-memory cache (max 512 entries).

  • Model: configured via GEMINI_MODEL env var (default gemini-3.5-flash). The previous gemini-3.1-flash-lite-preview was retired and returned 404.
  • Handler: crates/xylolabs-server/src/routes/daily_report.rs
  • Auth: JWT, minimum Role::User
  • Query param: facility_id (UUID, optional — inferred from user's facility if omitted)
  • Response shape: { generated_at, facility_name, summary, metrics, sections[], recommendations[] } — see API §27 for full schema.
  • LLM behavior: the system prompt forbids IT jargon (데이터베이스 "database", 프로토콜 "protocol", 세션 "session", 샘플 "sample", etc.) and enforces the friendly, polite ~요/~예요 sentence endings. Sections always include facility status, device status, and measurement summary. Annotations carry label, value, and trend (up/down/stable).
  • Frontend: rendered as DailyReportCard on the operator HomePage.
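
The per-facility caching behaviour (24-hour expiry, 512-entry cap) can be sketched in Python as below. This is not mini_moka and does not reproduce its eviction policy; the oldest-first eviction here is a simplification chosen for the example, and `TtlCache` is a hypothetical name.

```python
import time
from collections import OrderedDict


class TtlCache:
    """Tiny TTL cache with a size cap. Evicts the oldest insertion when full."""

    def __init__(self, ttl_secs=24 * 3600, max_entries=512, clock=time.monotonic):
        self._ttl = ttl_secs
        self._max = max_entries
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._items = OrderedDict()

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._items[key]  # expired: drop and report a miss
            return None
        return value

    def put(self, key, value):
        self._items.pop(key, None)  # re-inserting refreshes position and expiry
        self._items[key] = (self._clock() + self._ttl, value)
        while len(self._items) > self._max:
            self._items.popitem(last=False)
```

A handler using such a cache would look up the facility ID first and only call the model on a miss, so each facility triggers at most one generation per TTL window while its entry stays cached.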

3. Operator UX Enhancements

  • FacilityHealthHero: traffic-light at-a-glance health status widget on the operator Home page. Aggregates device online/offline counts and open alert count into a single good/warning/critical signal.
  • Plain-language chart summaries: each sensor chart panel displays a one-line human-readable summary (e.g., "평균 23.4°C, 최고 26.1°C", meaning "average 23.4°C, high 26.1°C") beneath the title.
  • One-tap alert actions: alert preview cards on Home and the Alerts page support 확인 (acknowledge) and 해결 (resolve) directly in the list — no detail-page navigation required.
  • 44 px tap targets: all interactive controls (buttons, selectors, nav items) raised to 44 px minimum height throughout the operator frontend to meet mobile accessibility guidelines.
  • Dashboard nav icon: added a dedicated icon to the dashboard nav entry.
  • Dashboard logo → home link: clicking the Xylolabs logo in the operator sidebar navigates to / (Home).
  • Engineer /dashboard removed: the /dashboard route was removed from the operator nav; facility managers do not need raw engineering telemetry in their primary navigation.

4. Timeline API Performance (5.5 s → ~66 ms warm)

The device timeseries endpoint (GET /api/v1/devices/{id}/timeseries) was completely re-pipelined to eliminate sequential per-stream DB and S3 round trips.

| Before | After |
| --- | --- |
| N+1 sequential DB queries (one list_by_stream per stream name) | One batch list_by_streams(stream_ids[]) query covering all streams at once |
| Sequential per-stream S3 download passes | Single parallel buffered(128) S3 download pass for all chunks across all stream names |
| Synchronous chunk decode in async task | spawn_blocking decode: CPU-bound work offloaded to the blocking thread pool |
| No caching | timeline_chunk_cache in-memory decoded-chunk cache (24 h TTL, keyed by S3 object key) with Arc<Vec<MetadataSample>> values for O(1) arc-clone cache hits |
| MAX_TIMELINE_CHUNKS = 500 | Raised to 10 000 (non-time-filtered queries); MAX_TIMELINE_SESSIONS raised to 5 000 |

Clock-drift fix (ARCH-C5-24-01/02): chunk queries no longer apply a server-side time window at the DB layer. Device clocks can be ahead of server time by days; the time filter is now applied after the per-session clock-anchor correction. The same anchor logic is applied to recording events and numeric stream samples.

f32_array mean aggregation: F32Array, F64Array, and I32Array samples are now mean-aggregated (one representative F64 value per array) instead of unpacked element-by-element. This avoids materializing ~2 M points for typical accelerometer sessions and keeps the point count manageable for LTTB and uPlot.

Code: crates/xylolabs-server/src/routes/device_timeline.rs; batch DB function in crates/xylolabs-db/src/repo/metadata_chunk.rs (list_by_streams); cache lives on AppState as timeline_chunk_cache.
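
The combination of a decoded-chunk cache and offloaded decode can be sketched in Python. This only shows the shape of the pattern: the real implementation is Rust (spawn_blocking plus Arc-shared values), the 24 h TTL is omitted here, and in Python the GIL limits how much a worker thread helps with CPU-bound work. `ChunkCache` and its `decode` callback are hypothetical.

```python
import asyncio


class ChunkCache:
    """Cache decoded chunks by object key; decode misses off the event loop."""

    def __init__(self, decode):
        self._decode = decode  # plain (blocking) function: bytes -> list of samples
        self._decoded = {}

    async def get(self, object_key: str, raw: bytes):
        hit = self._decoded.get(object_key)
        if hit is not None:
            return hit  # the same list object is shared, nothing is copied
        # Run the blocking decode in a worker thread so other requests keep flowing.
        samples = await asyncio.to_thread(self._decode, raw)
        self._decoded[object_key] = samples
        return samples
```

Returning the cached object by reference is the Python counterpart of cloning an Arc: a hit costs a dictionary lookup regardless of how many samples the chunk holds.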

5. Per-Channel Audio Playback

Backend (crates/xylolabs-server/src/routes/metadata_query.rs):

  • GET /api/v1/metadata/sessions/{id}/streams/{stream_id}/audio?channel=N extracts a single channel (0-indexed) from the decoded XAP PCM and returns a mono WAV. Without ?channel, the full multi-channel WAV is returned as before.
  • GET /api/v1/metadata/sessions/{id}/streams/{stream_id}/audio/info reads only the first XAP chunk header and returns { channels, sample_rate_hz, total_samples_per_channel, frame_duration_us } — lightweight channel-count detection without a full decode.
  • Both endpoints accept JWT (Role::User) or API key (media:read scope).

Frontend (frontend/src/components/metadata/StreamChart.tsx):

  • StreamChart calls /audio/info on mount for Bytes-type streams. If channels > 1, it renders one WaveformPlayer per channel bound to ?channel=N. Single-channel streams fall back to the original behavior.
  • normalize: false on WaveformPlayer (absolute amplitude, not normalized to peak).
  • Max zoom raised to 96 000.

These endpoints were already present in the codebase before 2026-05-29; this session confirmed and documented them.
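
A client can use the /audio/info response to decide how many players to create and how long the stream is. The sketch below is illustrative Python, not the StreamChart code; the helper names are hypothetical and the `info` dict mirrors the response fields listed above.

```python
def player_urls(session_id: str, stream_id: str, info: dict) -> list:
    """One URL per channel for multi-channel audio, otherwise the plain endpoint."""
    base = f"/api/v1/metadata/sessions/{session_id}/streams/{stream_id}/audio"
    if info["channels"] > 1:
        return [f"{base}?channel={n}" for n in range(info["channels"])]
    return [base]


def duration_secs(info: dict) -> float:
    """Stream length derived from the per-channel sample count."""
    return info["total_samples_per_channel"] / info["sample_rate_hz"]


info = {"channels": 2, "sample_rate_hz": 48000,
        "total_samples_per_channel": 96000, "frame_duration_us": 10000}
print(player_urls("s1", "a1", info), duration_secs(info))
```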

6. Infrastructure / Policy

  • nofile ulimit raised 1024 → 65536 in docker-compose.yml (both soft and hard). Root cause: under WiFi-flap / mass-power-cycle device reconnection storms, each live-stream WebSocket connection, listener fan-out socket, and S3 client connection consumes a file descriptor. The 1024 default was exhausted, surfacing as EMFILE: Too many open files bursts that self-healed only after the storm subsided. Fix: AGG-C1-D5.
  • Session TTL locked: refresh token TTL = 1 year (JWT_REFRESH_TTL_SECS=31536000), access token TTL = 1 day (JWT_ACCESS_TTL_SECS=86400). Defaults are hardcoded in config.rs and .env.example. The LOCKED policy is enforced in AGENTS.md and CLAUDE.md. Do not reduce these values.
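
The locked TTL values are plain second counts: 365 days and 1 day. A short Python sketch of reading them with those defaults follows; `load_ttls` is a hypothetical helper, and the real defaults live in config.rs.

```python
import os

DEFAULT_REFRESH_TTL_SECS = 365 * 24 * 3600  # 31 536 000 s = 1 year
DEFAULT_ACCESS_TTL_SECS = 24 * 3600         # 86 400 s = 1 day


def load_ttls(env=os.environ):
    """Return (refresh_ttl_secs, access_ttl_secs), falling back to the locked defaults."""
    refresh = int(env.get("JWT_REFRESH_TTL_SECS", DEFAULT_REFRESH_TTL_SECS))
    access = int(env.get("JWT_ACCESS_TTL_SECS", DEFAULT_ACCESS_TTL_SECS))
    return refresh, access
```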

7. Copy Quality — Jargon Purge

Operator-facing engineering jargon was systematically replaced with plain Korean throughout frontend-app/:

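| Old (jargon) | New (plain) |
| --- | --- |
| 스트림 (stream) | 측정 기록 (measurement record) |
| 세션 (session) | 측정 기간 (measurement period) |
| 샘플 (sample) | 측정값 (measured value) |
| 청크 (chunk) | 기간별 측정값 (measurements by period) |
| 타임라인 (timeline) | 시간 흐름 (flow of time) |
| 다운샘플링 (downsampling) | 요약 (summary) |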

The KO copy was also naturalized to remove AI-generated phrasing. A jargon-lint test (frontend-app/src/i18n/__tests__/jargon-lint.test.ts) guards against regression — it scans all KO translation strings for the banned terms at test time.
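
The idea behind the jargon-lint guard can be shown with a short Python sketch. The actual test is TypeScript and its structure is not documented here; the flat `translations` dict and the `find_jargon` name are assumptions, and the banned list is taken from the table above.

```python
BANNED_TERMS = ["스트림", "세션", "샘플", "청크", "타임라인", "다운샘플링"]


def find_jargon(translations: dict) -> list:
    """Return (key, term) for every banned term found in a translation string."""
    hits = []
    for key, text in translations.items():
        for term in BANNED_TERMS:
            if term in text:
                hits.append((key, term))
    return hits


ko = {
    "home.title": "오늘의 측정 기록",       # plain wording: passes
    "chart.empty": "세션 데이터가 없습니다",  # contains banned 세션: flagged
}
print(find_jargon(ko))
```

A test built on this would assert that the returned list is empty, so any reintroduced term fails the build with the offending key named.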

8. Known Limitations

See .context/plans/KNOWN-LIMITATIONS.md for the current 17-item ledger of known limitations and the locked-policy table. Key items relevant to this session's work:

  • Daily report requires GEMINI_API_KEY to be set; missing key returns 500 Internal Server Error (not a graceful degradation).
  • Timeline API cache (timeline_chunk_cache) is in-memory and does not survive server restart; first warm-up after deploy will see full S3 latency.
  • Per-channel audio extraction is limited to XAP-encoded Bytes-type streams; non-XAP byte streams return 400 Bad Request.
  • f32_array mean aggregation loses per-element detail — the timeline shows one representative point per array sample, not the full vector.

Revision: 2026-05-29