XAP — Xylolabs Audio Protocol Specification
PATENT PENDING — XAP (Xylolabs Audio Protocol) is a proprietary technology of Xylolabs Inc. Patent applications have been filed. Unauthorized use, reproduction, or distribution is prohibited.
Revision: 2026-03-27
Table of Contents
- Overview
- Encoder Architecture
- Frame Wire Format
- Configuration
- Platform Requirements
- Platform Compatibility Matrix
- Audio Quality Characteristics
- SDK Integration
- Comparison with IMA-ADPCM
- Related Documents
- Legal
1. Overview
XAP (Xylolabs Audio Protocol) is Xylolabs' proprietary MDCT-based spectral audio codec designed for real-time compression of multi-channel audio on resource-constrained IoT and industrial monitoring hardware.
Key Properties
| Property | Value |
|---|---|
| Transform | Modified Discrete Cosine Transform (MDCT) |
| Sample rates | 8, 16, 24, 32, 48, 96 kHz |
| Channels | 1–4 (encoded independently) |
| Frame durations | 7.5 ms, 10 ms |
| Compression ratio | 8:1–10:1 (bitrate-dependent) |
| Bitrate range | 16–320 kbps per channel |
| Typical bitrate | 64–80 kbps per channel |
| Algorithmic delay | 7.5–10 ms (one frame) |
| CPU requirement | ~10 MIPS per channel (with DSP acceleration) |
| RAM per channel | ~8 KB encoder state |
| Codec ID (XMBP) | 0x03 |
Design Goals
XAP is designed for industrial audio monitoring applications with the following constraints:
- MCU encode, server decode: The encoder runs on bare-metal or RTOS-based MCUs without a general-purpose operating system. The decoder runs on the server, where resources are not a constraint.
- LTE-M1 bandwidth budget: Four channels at 96 kHz must fit within approximately 30–40 KB/s sustained uplink, achieved at 64–80 kbps per channel (32–40 KB/s total).
- No dynamic memory allocation: The encoder uses only statically allocated buffers. All state fits within a single XapEncoder struct.
- FPU/DSP acceleration: XAP requires a hardware floating-point unit and benefits significantly from ARM DSP extensions or SIMD instruction sets. Platforms lacking an FPU must use IMA-ADPCM instead.
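The LTE-M1 bandwidth figure in the list above follows from simple arithmetic on the per-channel bitrate. A minimal sketch, where the function name is illustrative and not part of the SDK, and framing and transport overhead are ignored:

```rust
/// Total uplink payload rate in bytes per second for `channels` streams,
/// each encoded at `bitrate_bps` bits per second.
/// Illustrative helper only; header and transport overhead are not included.
fn uplink_bytes_per_sec(channels: u32, bitrate_bps: u32) -> u32 {
    channels * bitrate_bps / 8
}
```

Four channels at 64 kbps give 32,000 bytes/s and four channels at 80 kbps give 40,000 bytes/s, matching the 32–40 KB/s range quoted above.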
2. Encoder Architecture
2.1 Signal Flow
Interleaved PCM input (i16[])
|
v
De-interleave
(per-channel extraction)
|
v
MDCT Forward Transform
N samples --> N/2 spectral coefficients
|
v
Adaptive Quantization
(step size derived from mean absolute coefficient)
|
v
Coefficient Packing
(2-byte quant header + 8-bit quantized coefficients)
|
v
XAP Frame Output
(5-byte frame header + per-channel payloads)
Each channel is encoded independently. The encoder accepts interleaved multi-channel PCM input, extracts each channel's samples into a contiguous mono buffer, applies the MDCT, quantizes the resulting spectral coefficients, and packs them into the output frame.
2.2 MDCT Forward Transform
The Modified Discrete Cosine Transform converts N time-domain PCM samples into N/2 frequency-domain spectral coefficients. The basis function is:
X[k] = sum_{n=0}^{N-1} x[n] * cos(pi/N * (n + 0.5 + N/4) * (k + 0.5))
where:
N = frame_samples (number of PCM samples per channel per frame)
k = coefficient index, 0 <= k < N/2
n = sample index, 0 <= n < N
The transform produces N/2 real-valued coefficients that represent the spectral energy of the frame.
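As a reference for the formula above, the transform can be evaluated directly in O(N²). This is a host-side sketch that follows the basis function exactly as written; it is not the SDK's implementation, and the function name is an assumption. It writes into a caller-supplied slice so no allocation is needed.

```rust
use std::f32::consts::PI;

/// Direct evaluation of the forward transform as written above:
/// `x` holds N samples, `out` receives N/2 coefficients.
/// Reference sketch only; a production encoder would use a table or FFT path.
fn mdct_forward(x: &[f32], out: &mut [f32]) {
    let n = x.len();
    assert_eq!(out.len(), n / 2, "output must hold N/2 coefficients");
    let nf = n as f32;
    for (k, coeff) in out.iter_mut().enumerate() {
        let mut acc = 0.0f32;
        for (i, &sample) in x.iter().enumerate() {
            // Phase term: pi/N * (n + 0.5 + N/4) * (k + 0.5)
            let phase = PI / nf * (i as f32 + 0.5 + nf / 4.0) * (k as f32 + 0.5);
            acc += sample * phase.cos();
        }
        *coeff = acc;
    }
}
```

For a 10 ms frame at 16 kHz, `x` has 160 samples and `out` has 80 coefficients.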
Precomputed Cosine Table
For frame sizes where N <= 320 (sample rates up to 32 kHz at 10 ms, or up to 24 kHz at 7.5 ms), the encoder precomputes a fixed-point cosine table at initialization time:
Table dimensions: (N/2) x N entries of i32. Maximum size: 320 samples → 51,200 entries → 200 KB.
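The table construction and the multiply-accumulate loop it enables can be sketched as follows. The Q15 scaling, the row-major layout, and both function names are assumptions made for illustration; the document states only that the table holds (N/2) x N entries of i32.

```rust
use std::f64::consts::PI;

/// Fractional bits of each table entry. Q15 is an assumption; the actual
/// fixed-point format is not stated in this document.
const Q: u32 = 15;

/// Fill `table` with (N/2) x N cosine entries, row-major
/// (row = coefficient index k, column = sample index n).
fn build_cos_table(n: usize, table: &mut [i32]) {
    assert_eq!(table.len(), (n / 2) * n);
    let nf = n as f64;
    for k in 0..n / 2 {
        for i in 0..n {
            let phase = PI / nf * (i as f64 + 0.5 + nf / 4.0) * (k as f64 + 0.5);
            table[k * n + i] = (phase.cos() * (1i64 << Q) as f64).round() as i32;
        }
    }
}

/// Table-driven forward transform: one integer multiply-accumulate per
/// (k, n) pair and no trigonometric calls in the loop.
fn mdct_forward_table(pcm: &[i16], table: &[i32], out: &mut [i32]) {
    let n = pcm.len();
    assert_eq!(table.len(), (n / 2) * n);
    assert_eq!(out.len(), n / 2);
    for (k, coeff) in out.iter_mut().enumerate() {
        let row = &table[k * n..(k + 1) * n];
        // Accumulate in i64 so 320 products of 16-bit by Q15 values cannot overflow.
        let acc: i64 = pcm.iter().zip(row).map(|(&s, &c)| s as i64 * c as i64).sum();
        *coeff = (acc >> Q) as i32;
    }
}
```

At N = 320 the table holds 160 x 320 = 51,200 entries of 4 bytes each, which is the 200 KB maximum quoted above.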
Using the precomputed table, the MDCT inner loop reduces to fixed-point multiply-accumulate with no trigonometric function calls. The work is still O(N²) multiplies, but with very low constant overhead. Measured encode times on an M-series host:
| Sample Rate | Frame Samples | Table Used | Encode Time (1ch, 10ms) |
|---|---|---|---|
| 8 kHz | 80 | Yes | 0.5 µs |
| 16 kHz | 160 | Yes | 1.1 µs |
| 24 kHz | 240 | Yes | 1.9 µs |
| 32 kHz | 320 | Yes (limit) | 3.0 µs |
| 48 kHz | 480 | No | 512.3 µs |
| 96 kHz | 960 | No | 1954.0 µs |
The 170x discontinuity at 48 kHz (software fallback path) is eliminated on MCU targets by DSP-accelerated MDCT paths (see Section 5).
Stack Allocation Note
XapEncoder contains a 200 KB cosine table. On embedded targets, allocate the encoder in a static variable rather than on the stack. The table is only populated when frame_samples <= 320.
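One way to keep a structure of this size off the stack is to place it in a static. The sketch below uses a hypothetical stand-in type (EncoderState) rather than the real XapEncoder, and std::sync::Mutex for safe access on a host build. On a no_std MCU target that type is unavailable, and a critical-section mutex or similar would take its place.

```rust
use std::sync::Mutex;

/// Maximum cosine-table entries: (320 / 2) * 320.
const COS_TABLE_ENTRIES: usize = 160 * 320;

/// Hypothetical stand-in for XapEncoder; only its size matters here.
struct EncoderState {
    cos_table: [i32; COS_TABLE_ENTRIES],
    frame_samples: usize,
}

impl EncoderState {
    /// `const fn` so the value can be built at compile time for a static.
    const fn new() -> Self {
        Self { cos_table: [0; COS_TABLE_ENTRIES], frame_samples: 0 }
    }
}

/// Lives in static memory, never on a task or interrupt stack.
static ENCODER: Mutex<EncoderState> = Mutex::new(EncoderState::new());

/// Example access: record the frame size. The table would only be
/// populated when frame_samples <= 320.
fn configure(frame_samples: usize) {
    let mut enc = ENCODER.lock().unwrap();
    enc.frame_samples = frame_samples;
}
```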
2.3 Adaptive Quantization
After the MDCT, each coefficient X[k] is quantized to a signed 8-bit integer using an adaptive step size:
Step size derivation:
The step size adapts to the signal level of each frame, distributing the available quantization range across the actual coefficient magnitudes. A larger step size is used for louder frames; a smaller step size for quieter frames.
Quantization:
Coefficients that exceed the representable range after quantization are clipped to [-128, 127].
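The quantization behaviour described above can be illustrated with a short sketch. The exact step-size formula is not reproduced in this section, so the HEADROOM constant and the rounding choices below are assumptions. Only the mean-absolute-coefficient basis, the u16 step size, and the [-128, 127] clipping come from the text.

```rust
/// Hypothetical headroom factor: coefficients up to HEADROOM times the mean
/// absolute value land inside the i8 range without clipping.
const HEADROOM: f32 = 4.0;

/// Derive a per-frame step size from the mean absolute coefficient.
/// Clamped to 1..=65535 so it fits the u16 field in the channel block.
fn step_size(coeffs: &[f32]) -> u16 {
    if coeffs.is_empty() {
        return 1;
    }
    let mean_abs = coeffs.iter().map(|c| c.abs()).sum::<f32>() / coeffs.len() as f32;
    (mean_abs * HEADROOM / 127.0).ceil().clamp(1.0, 65_535.0) as u16
}

/// Quantize one coefficient to a signed 8-bit value, clipping to [-128, 127].
fn quantize(coeff: f32, step: u16) -> i8 {
    (coeff / step as f32).round().clamp(-128.0, 127.0) as i8
}
```

With these assumptions a louder frame yields a larger step: a frame whose mean absolute coefficient is 1270 gets a step of 40, while a silent frame gets the minimum step of 1.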
2.4 Per-Channel Budget Allocation
The total frame payload is divided equally among channels after subtracting the 5-byte frame header:
Each channel's packed output is zero-padded to exactly per_ch_budget bytes, producing fixed-size frames suitable for streaming without framing overhead.
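The equal split described above amounts to subtracting the header and dividing by the channel count. A minimal sketch, assuming integer division (how any remainder bytes are handled is not stated here) and using an illustrative function name:

```rust
/// Size of the fixed XAP frame header in bytes.
const FRAME_HEADER_BYTES: usize = 5;

/// Bytes available to each channel block, or None when the frame cannot
/// hold the header plus at least the 2-byte step-size field per channel.
fn per_ch_budget(frame_bytes: usize, channels: usize) -> Option<usize> {
    if channels == 0 || frame_bytes < FRAME_HEADER_BYTES {
        return None;
    }
    let budget = (frame_bytes - FRAME_HEADER_BYTES) / channels;
    (budget >= 2).then_some(budget)
}
```

For example, a 325-byte frame with four channels gives (325 - 5) / 4 = 80 bytes per channel: 2 bytes of step size and 78 coefficient bytes.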
3. Frame Wire Format
3.1 Frame Header (5 bytes)
Every XAP frame begins with a 5-byte fixed header in big-endian byte order:
Offset Size Field Type Description
------ ---- ----------- ------ ------------------------------------------
0 2 frame_samples u16 BE PCM samples per channel in this frame
2 1 channels u8 Number of channels (1–4)
3 2 frame_bytes u16 BE Total frame size including this header
 Byte 0          Byte 1          Byte 2          Byte 3          Byte 4
 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    frame_samples (u16 BE)     |   channels    |     frame_bytes (u16 BE)      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
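The header layout above maps directly onto big-endian byte conversions. The sketch below is illustrative: the FrameHeader type and its method names are assumptions, not SDK definitions.

```rust
/// The 5-byte XAP frame header, field for field as in the table above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct FrameHeader {
    frame_samples: u16,
    channels: u8,
    frame_bytes: u16,
}

impl FrameHeader {
    const SIZE: usize = 5;

    /// Serialize in big-endian order: samples (2), channels (1), bytes (2).
    fn to_bytes(self) -> [u8; 5] {
        let s = self.frame_samples.to_be_bytes();
        let b = self.frame_bytes.to_be_bytes();
        [s[0], s[1], self.channels, b[0], b[1]]
    }

    /// Parse from the start of `buf`; rejects short input and channel
    /// counts outside 1..=4.
    fn parse(buf: &[u8]) -> Option<Self> {
        if buf.len() < Self::SIZE {
            return None;
        }
        let channels = buf[2];
        if !(1..=4).contains(&channels) {
            return None;
        }
        Some(Self {
            frame_samples: u16::from_be_bytes([buf[0], buf[1]]),
            channels,
            frame_bytes: u16::from_be_bytes([buf[3], buf[4]]),
        })
    }
}
```

A 4-channel, 160-sample frame of 325 bytes serializes to 00 A0 04 01 45.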
3.2 Per-Channel Payload
Immediately following the frame header, the payload contains channels sequential channel blocks, each of exactly per_ch_budget bytes. Within each channel block:
Offset Size Field Type Description
------ ---- ----------- ------ ------------------------------------------
0 2 step_size u16 BE Quantization step size for this channel
2 M coeffs[] i8[] Quantized MDCT coefficients, zero-padded (M = per_ch_budget - 2)
The number of coefficient bytes packed is min(N/2, per_ch_budget - 2). If the packed coefficient count is less than per_ch_budget - 2, the remaining bytes are zero-padded to maintain the fixed frame size.
3.3 Complete Frame Structure
[frame_samples: 2B][channels: 1B][frame_bytes: 2B] <- 5-byte header
[step_ch0: 2B][q0[0], q0[1], ..., q0[M-1]] <- channel 0 block (per_ch_budget bytes)
[step_ch1: 2B][q1[0], q1[1], ..., q1[M-1]] <- channel 1 block (per_ch_budget bytes)
...
[step_ch(C-1): 2B][q(C-1)[0], q(C-1)[1], ..., q(C-1)[M-1]] <- channel C-1 block (per_ch_budget bytes)
where C = channels and M = per_ch_budget - 2 (coefficient bytes per channel)
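Given the structure above, a receiver can split a frame into its channel blocks using only the header fields. The sketch below assumes the equal-split rule from Section 2.4 with integer division. The ChannelBlock type and split_frame function are illustrative names, not part of the SDK or the server decoder.

```rust
/// One channel block: its step size and the coefficient bytes that follow,
/// including any trailing zero padding. Each byte is a quantized i8.
struct ChannelBlock<'a> {
    step_size: u16,
    coeffs: &'a [u8],
}

/// Split a complete XAP frame into per-channel blocks.
/// Returns None if the header is malformed or the buffer is too short.
fn split_frame(frame: &[u8]) -> Option<Vec<ChannelBlock<'_>>> {
    if frame.len() < 5 {
        return None;
    }
    let channels = frame[2] as usize;
    let frame_bytes = u16::from_be_bytes([frame[3], frame[4]]) as usize;
    if !(1..=4).contains(&channels) || frame_bytes < 5 || frame.len() < frame_bytes {
        return None;
    }
    let budget = (frame_bytes - 5) / channels;
    if budget < 2 {
        return None;
    }
    let blocks = frame[5..5 + budget * channels]
        .chunks_exact(budget)
        .map(|b| ChannelBlock {
            step_size: u16::from_be_bytes([b[0], b[1]]),
            coeffs: &b[2..],
        })
        .collect();
    Some(blocks)
}
```

Coefficient bytes are returned raw; casting each with `as i8` recovers the signed quantized value.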
3.4 Codec ID in XMBP
When carried inside an XMBP (Xylolabs Metadata Batch Protocol) envelope, XAP audio streams use codec identifier 0x03. The XMBP stream header identifies the payload as XAP-encoded audio, and the receiver invokes the XAP decoder for each frame. See XMBP-SPECIFICATION.md for the full wire format of XMBP envelopes.
4. Configuration
4.1 XapConfig Fields
The XapConfig struct controls all encoder behavior:
pub struct XapConfig {
/// Sample rate in Hz.
/// Supported: 8000, 16000, 24000, 32000, 48000, 96000
pub sample_rate: u32,
/// Frame duration in microseconds.
/// Supported: 7500 (7.5 ms) or 10000 (10 ms)
pub frame_duration: u32,
/// Target bitrate in bits/sec per channel.
/// 0 = auto (~8:1 compression ratio).
/// Valid range when non-zero: 16000–320000 bps
pub bitrate: u32,
/// Number of audio channels. Range: 1–4.
pub channels: u8,
}
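The documented field ranges can be checked before an encoder is constructed. The sketch below mirrors XapConfig locally so that it is self-contained; the validate function and its error strings are hypothetical and do not describe how the SDK reports invalid configurations.

```rust
/// Local mirror of the XapConfig fields documented above.
#[derive(Clone, Copy)]
struct XapConfig {
    sample_rate: u32,
    frame_duration: u32,
    bitrate: u32,
    channels: u8,
}

/// Check each field against the ranges stated in the field comments.
/// Hypothetical helper: the SDK's own validation may differ.
fn validate(cfg: &XapConfig) -> Result<(), &'static str> {
    const RATES: [u32; 6] = [8_000, 16_000, 24_000, 32_000, 48_000, 96_000];
    if !RATES.contains(&cfg.sample_rate) {
        return Err("unsupported sample rate");
    }
    if cfg.frame_duration != 7_500 && cfg.frame_duration != 10_000 {
        return Err("frame duration must be 7500 or 10000 microseconds");
    }
    // 0 selects automatic bitrate; otherwise 16-320 kbps per channel.
    if cfg.bitrate != 0 && !(16_000..=320_000).contains(&cfg.bitrate) {
        return Err("bitrate out of range");
    }
    if !(1..=4).contains(&cfg.channels) {
        return Err("channel count must be 1 to 4");
    }
    Ok(())
}
```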
4.2 Supported Sample Rates
| Sample Rate | Frame Samples (7.5 ms) | Frame Samples (10 ms) | Cosine Table Used |
|---|---|---|---|
| 8,000 Hz | 60 | 80 | Yes |
| 16,000 Hz | 120 | 160 | Yes |
| 24,000 Hz | 180 | 240 | Yes |
| 32,000 Hz | 240 | 320 | Yes (limit) |
| 44,100 Hz | — | — | Not supported |
| 48,000 Hz | 360 | 480 | No (runtime cosf) |
| 96,000 Hz | 720 | 960 | No (runtime cosf) |
44.1 kHz is not supported. The supported rates are exactly the six listed above: 8, 16, 24, 32, 48, and 96 kHz.
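The frame sizes in the table follow from the sample rate and the frame duration in microseconds, and the cosine-table column follows from the N <= 320 limit given in Section 2.2. A minimal sketch with illustrative function names:

```rust
/// PCM samples per channel per frame: sample_rate * duration.
/// Uses u64 internally so the intermediate product cannot overflow.
fn frame_samples(sample_rate_hz: u32, frame_duration_us: u32) -> u32 {
    (sample_rate_hz as u64 * frame_duration_us as u64 / 1_000_000) as u32
}

/// The precomputed cosine table is used only for frames of at most 320 samples.
fn uses_cosine_table(frame_samples: u32) -> bool {
    frame_samples <= 320
}
```

For example, 32 kHz at 10 ms gives 320 samples (the table limit), while 48 kHz at 7.5 ms gives 360 samples and falls outside it.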
4.3 Bitrate and Compression
| Bitrate (per channel) | Compression Ratio | Typical Use |
|---|---|---|
| 16–32 kbps | ~12:1–24:1 | Highly constrained bandwidth |
| 64 kbps | ~10:1 | Standard industrial monitoring (recommended) |
| 80 kbps | ~8:1 | Broadcast-grade quality |
| 128 kbps | ~5:1 | High-fidelity monitoring |
| 320 kbps | ~2:1 | Near-lossless |
When bitrate = 0, the encoder targets approximately 8:1 compression based on the PCM frame size.
Frame byte calculation from bitrate:
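The calculation itself is not reproduced in this section, so the following is a sketch under stated assumptions: the per-channel byte budget is bitrate x frame duration / 8, the 5-byte header is added on top of the channel budgets, and auto mode (bitrate = 0) takes one eighth of the 16-bit PCM frame size. The function name and the treatment of the header are assumptions, not confirmed SDK behaviour.

```rust
/// Estimated total frame size in bytes, including the 5-byte header.
/// Assumption: the header is not counted against the per-channel bitrate.
fn frame_bytes(sample_rate: u32, frame_duration_us: u32, bitrate: u32, channels: u32) -> u32 {
    let per_channel = if bitrate == 0 {
        // Auto mode: about 8:1 relative to 16-bit PCM (2 bytes per sample).
        let samples = sample_rate as u64 * frame_duration_us as u64 / 1_000_000;
        samples * 2 / 8
    } else {
        // bits/s * microseconds / 8 bits per byte / 1_000_000 us per second
        bitrate as u64 * frame_duration_us as u64 / 8_000_000
    };
    5 + channels * per_channel as u32
}
```

Under these assumptions, four channels at 64 kbps with 10 ms frames produce 80 bytes per channel and a 325-byte frame.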
4.4 Default Configuration
The SDK default for the XapEncoder::new_default(sample_rate) constructor:
XapConfig {
sample_rate: <caller-supplied>,
frame_duration: 10_000, // 10 ms
bitrate: 64_000, // 64 kbps per channel
channels: 1,
}
The SDK-level Config default uses audio_sample_rate = 16_000, audio_channels = 4, audio_batch_ms = 500.
5. Platform Requirements
5.1 Mandatory: Hardware FPU
XAP requires a hardware floating-point unit. The MDCT computation uses single-precision floating-point arithmetic (f32). Platforms without an FPU must use IMA-ADPCM.
Minimum supported core: ARM Cortex-M4F or equivalent (single-precision FPU + DSP extensions).
Platforms without FPU (Cortex-M0+, Cortex-M3, RISC-V without F extension) cannot run XAP. Use features = ["adpcm"] on these targets.
5.2 Recommended: DSP Extensions
DSP extensions (ARM SMLAD/SMLAL multiply-accumulate with saturating arithmetic, or Xtensa PIE SIMD) substantially reduce MDCT compute cost:
| DSP Architecture | Platforms | Mechanism | XAP Speedup |
|---|---|---|---|
| ARMv8-M DSP (Cortex-M33) | RP2350, nRF9160 | Dual 16x16 MAC (SMLAD), saturating arithmetic | ~30% |
| Cortex-M4F FPU+DSP | STM32F411, STM32WB55, nRF52840 | Hardware float MAC + arm_rfft_fast_f32 (3–5x for MDCT) | ~35–40% |
| Xtensa PIE SIMD (ESP32-S3) | ESP32-S3 | 128-bit SIMD: 4x f32 or 8x i16 per instruction | ~60% |
| None | RP2040, ESP32-C3, STM32F103 | Software multiply only | 0% (XAP not feasible) |
5.3 Resource Table
Encoder state and buffer requirements for common configurations (per encoder instance):
| Configuration | XAP Encoder State | Cosine Table | Ring Buffer | XMBP Frame | Total |
|---|---|---|---|---|---|
| 1ch @16kHz, 10ms | ~1 KB | 10 KB | 4 KB | 2 KB | ~17 KB |
| 4ch @16kHz, 10ms | ~4 KB | 10 KB | 16 KB | 4 KB | ~34 KB |
| 4ch @48kHz, 10ms | ~4 KB | — | 32 KB | 8 KB | ~44 KB |
| 4ch @96kHz, 10ms | ~4 KB | — | 64 KB | 8 KB | ~76 KB |
The cosine table (200 KB maximum for N <= 320) is embedded within the XapEncoder struct. For 48 kHz and above (N > 320), no cosine table is allocated; the encoder falls back to runtime computation on the host, or uses DSP-accelerated FFT on MCU targets.
5.4 DSP Acceleration Details
ARM Cortex-M33/M4F — CMSIS-DSP
Enable with Cargo feature cmsis-dsp or C define XYLOLABS_USE_CMSIS_DSP=1:
- The MDCT forward path dispatches to a dual-MAC inner loop using SMLAD (Cortex-M33 fixed-point) or arm_rfft_fast_f32 (Cortex-M4F floating-point).
- Processes two samples per iteration using dual 16x16 multiply-accumulate.
- Saturating arithmetic (QADD, SSAT) eliminates branch-based coefficient clipping.
- Speedup: 30–40% reduction in MDCT compute time (30–60 MIPS savings at 4ch @96kHz).
Xtensa LX7 — ESP32-S3 PIE SIMD
Enable with Cargo feature esp32-simd or C define XYLOLABS_USE_ESP32S3_SIMD=1:
- The MDCT inner loop processes 4 samples per iteration using 128-bit vector registers.
- Maps to ESP32-S3 PIE (Processor Instruction Extensions) vector operations on real hardware.
- Hardware AES/SHA offloads TLS from the main CPU, freeing additional cycles for codec.
- Speedup: up to 60% reduction in total XAP CPU usage (50 → 20 MIPS for MDCT at 4ch @96kHz).
6. Platform Compatibility Matrix
Evaluated using measured MIPS profiles from burn-in testing and platform datasheets. CPU% represents total system utilization (codec + I/O + transport + sensors + housekeeping) for the maximum supported audio configuration.
| Target | Core | Clock | SRAM | DSP/FPU | Max Audio Config | CPU% | RAM% | Verdict |
|---|---|---|---|---|---|---|---|---|
| RP2350 (Pico 2) | Cortex-M33 | 150 MHz | 520 KB | M33 DSP+FPU | 4ch @96kHz XAP | 46.0% | 16.9% | COMFORTABLE |
| ESP32-S3 | Xtensa LX7 | 240 MHz | 512 KB + 8 MB PSRAM | PIE SIMD+FPU | 4ch @96kHz XAP | 17.7% | 24.2% | COMFORTABLE |
| STM32F411 | Cortex-M4F | 100 MHz | 128 KB | M4F DSP+FPU | 4ch @48kHz XAP | 40.0% | 34.4% | FEASIBLE |
| nRF52840 | Cortex-M4F | 64 MHz | 256 KB | M4F DSP+FPU | 2ch @48kHz XAP | 42.2% | 21.9% | FEASIBLE |
| nRF9160 | Cortex-M33 | 64 MHz | 256 KB | M33 DSP+FPU | 2ch @48kHz XAP | 44.5% | 21.9% | FEASIBLE |
| STM32WB55 | Cortex-M4F | 64 MHz | 256 KB | M4F DSP+FPU | 2ch @48kHz XAP | 42.2% | 17.2% | FEASIBLE |
| RP2040 (Pico) | Cortex-M0+ | 133 MHz | 264 KB | None | ADPCM 4ch @96kHz | 3.0% | 12.1% | ADPCM ONLY |
| ESP32-C3 | RISC-V | 160 MHz | 400 KB | M ext only | ADPCM 4ch @96kHz | 2.5% | 8.0% | ADPCM ONLY |
| STM32F103 | Cortex-M3 | 72 MHz | 20 KB | None | ADPCM 2ch @24kHz | 1.4% | 80.0% | SENSOR ONLY |
Verdict definitions:
| Verdict | CPU Utilization | Meaning |
|---|---|---|
| COMFORTABLE | < 50% | Ample headroom for OTA updates, additional processing, or future features. |
| FEASIBLE | 50–70% | Sufficient for stable operation with careful task scheduling. |
| TIGHT | 70–85% | Operational but may exhibit jitter under worst-case interrupt latency. |
| MARGINAL | 85–100% | Risk of frame drops under load. Not recommended for production. |
| ADPCM ONLY | N/A | No DSP/FPU; XAP encoder cannot run. IMA-ADPCM at 4:1 compression only. |
| SENSOR ONLY | N/A | Extreme SRAM constraint. Minimal ADPCM (1–2ch) plus sensor telemetry only. |
6.1 Detailed CPU Budget — RP2350 (4ch XAP @96kHz)
Dual-core Cortex-M33 at 150 MHz. Core 0: I2S DMA + XAP encode. Core 1: XMBP + HTTP + sensors.
| Component | Baseline MIPS | With DSP MIPS | % of 150 MHz |
|---|---|---|---|
| I2S DMA handling | 2 | 2 | 1.3% |
| XAP MDCT forward | 50 | 35 | 23.3% |
| XAP quantize+pack | 15 | 10 | 6.7% |
| XMBP batch encode | 5 | 5 | 3.3% |
| HTTP transport | 10 | 10 | 6.7% |
| Sensor sampling (26ch) | 5 | 5 | 3.3% |
| Watchdog + housekeeping | 2 | 2 | 1.3% |
| Total | 89 | 69 | 46.0% |
| Available headroom | 61 | 81 | 54.0% |
6.2 Detailed CPU Budget — ESP32-S3 (4ch XAP @96kHz)
Dual-core Xtensa LX7 at 240 MHz (480 MIPS total).
| Component | Baseline MIPS | With PIE MIPS | % of 480 MIPS |
|---|---|---|---|
| I2S DMA handling | 2 | 2 | 0.4% |
| XAP MDCT forward | 50 | 20 | 4.2% |
| XAP quantize+pack | 15 | 6 | 1.3% |
| WiFi stack (FreeRTOS) | 30 | 30 | 6.3% |
| XMBP batch encode | 5 | 5 | 1.0% |
| HTTP/TLS transport | 20 | 12 | 2.5% |
| Sensor sampling (26ch) | 5 | 5 | 1.0% |
| PSRAM DMA management | 3 | 3 | 0.6% |
| Watchdog + housekeeping | 2 | 2 | 0.4% |
| Total | 132 | 85 | 17.7% |
| Available headroom | 348 | 395 | 82.3% |
7. Audio Quality Characteristics
7.1 Perceptual Quality by Bitrate
| Bitrate (per channel) | Quality Level | Description |
|---|---|---|
| 16–32 kbps | Basic | Recognizable audio; significant spectral artifacts at high frequencies. Suitable for voice monitoring. |
| 48 kbps | Good | Acceptable for broadband monitoring; minor artifacts above 12 kHz. |
| 64 kbps | Near-transparent | Perceptually transparent for most industrial monitoring applications. Recommended default. |
| 80 kbps | Broadcast-grade | Indistinguishable from original in blind tests. Full-spectrum fidelity to 48 kHz. |
| 128+ kbps | High-fidelity | Reference quality. Full spectral content preserved. |
7.2 Latency
XAP introduces exactly one frame of algorithmic delay:
| Frame Duration | Algorithmic Delay |
|---|---|
| 7.5 ms | 7.5 ms |
| 10 ms | 10 ms |
There is no look-ahead or overlap buffer in the encoder. End-to-end latency is bounded by frame duration plus network transit time. For real-time monitoring dashboards, the 10 ms frame duration is preferred as the longer encode window provides more headroom for MCU interrupt jitter.
7.3 Channel Scaling
XAP scales sub-linearly with channel count due to amortized per-frame overhead:
| Channels @16kHz | Encode Time (host) | Per-Channel | Scaling Factor |
|---|---|---|---|
| 1 | 1.0 µs | 1.0 µs | 1.00x |
| 2 | 1.9 µs | 1.0 µs | 1.90x |
| 3 | 2.8 µs | 0.9 µs | 2.80x |
| 4 | 3.7 µs | 0.9 µs | 3.70x |
At four channels the total encode time is 3.70x the single-channel time rather than 4.00x, which is about 8% less per channel. The saving comes from amortized frame header writes and de-interleave overhead; per-channel MDCT cost still dominates.
8. SDK Integration
8.1 Rust SDK
Add the xap feature to the xylolabs-sdk dependency in Cargo.toml. For DSP-accelerated targets, also enable the appropriate DSP feature:
# RP2350 / nRF9160 (Cortex-M33) — CMSIS-DSP fixed-point path
xylolabs-sdk = { path = "../../crates/xylolabs-sdk", features = ["xap", "cmsis-dsp"] }
# STM32F411 / nRF52840 / STM32WB55 (Cortex-M4F) — CMSIS-DSP floating-point path
xylolabs-sdk = { path = "../../crates/xylolabs-sdk", features = ["xap", "cmsis-dsp"] }
# ESP32-S3 (Xtensa LX7 with PIE) — PIE SIMD floating-point path
xylolabs-sdk = { path = "../../crates/xylolabs-sdk", features = ["xap", "esp32-simd"] }
# Platforms without DSP/FPU (RP2040, ESP32-C3, STM32F103) — ADPCM only
xylolabs-sdk = { path = "../../crates/xylolabs-sdk", default-features = false, features = ["adpcm"] }
Encoder usage:
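use xylolabs_sdk::codec::xap::{XapConfig, XapEncoder};

// `mut` assumes encode_frame updates encoder state; drop it if the method takes &self.
let mut encoder = XapEncoder::new(XapConfig {
    sample_rate: 16_000,
    frame_duration: 10_000, // 10 ms
    bitrate: 64_000,        // 64 kbps per channel
    channels: 4,
})
.expect("invalid XAP configuration");

// One 10 ms frame of interleaved PCM: 160 samples x 4 channels (silence as placeholder input).
let pcm_samples = [0i16; 160 * 4];

// Encode one frame into a buffer sized from the encoder's fixed frame length.
let mut out = vec![0u8; encoder.frame_bytes() as usize];
let written = encoder.encode_frame(&pcm_samples, &mut out);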
8.2 C SDK
Use the compile-time defines in config.h to select the codec and DSP path. DSP acceleration is auto-detected from compiler flags:
#define XYLOLABS_CODEC_XAP 1 /* enable XAP encoder */
#define XYLOLABS_USE_CMSIS_DSP 1 /* Cortex-M4F / M33 targets */
/* or */
#define XYLOLABS_USE_ESP32S3_SIMD 1 /* ESP32-S3 targets */
Override explicitly via CMake when auto-detection is insufficient:
8.3 Server-Side Decoder
The Xylolabs server includes an XAP decoder library (crates/xylolabs-transcode/src/xap_decode.rs) that reconstructs PCM from XAP-encoded audio. The decoder performs the inverse of the encoder:
- Parse frame header (5 bytes): extract frame_samples, channels, frame_bytes
- Per-channel dequantization: coeff_f32 = quantized_i8 * step_size
- Inverse MDCT: x[n] = (2/N) * Σ X[k] * cos(π/N * (n + 0.5 + N/4) * (k + 0.5))
- Clamp and convert to i16, interleave channels
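The decoder reads frame parameters directly from the XAP frame header, making XAP self-describing at the frame level. Sample rate is inferred from frame_samples (e.g., 160 samples → 16 kHz at 10 ms). One value is ambiguous: 240 samples corresponds to both 24 kHz at 10 ms and 32 kHz at 7.5 ms (see Section 4.2), so the frame duration must be known to resolve it.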
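The per-channel decode steps listed above (dequantize, inverse MDCT, clamp to i16) can be sketched directly from the stated formulas. This is an illustration of those steps, not the code in xap_decode.rs. The function name is an assumption, and coefficients that were not packed into the frame are treated as zero.

```rust
use std::f32::consts::PI;

/// Reconstruct `n` PCM samples for one channel from its quantized
/// coefficients and step size, following the steps listed above.
fn decode_channel(quantized: &[i8], step_size: u16, n: usize, pcm: &mut [i16]) {
    assert_eq!(pcm.len(), n);
    let nf = n as f32;
    for (i, sample) in pcm.iter_mut().enumerate() {
        let mut acc = 0.0f32;
        // At most N/2 coefficients exist; missing ones contribute nothing.
        for (k, &q) in quantized.iter().take(n / 2).enumerate() {
            let coeff = q as f32 * step_size as f32; // dequantize
            let phase = PI / nf * (i as f32 + 0.5 + nf / 4.0) * (k as f32 + 0.5);
            acc += coeff * phase.cos();
        }
        // Scale by 2/N, then clamp into the i16 range before converting.
        let x = 2.0 / nf * acc;
        *sample = x.round().clamp(i16::MIN as f32, i16::MAX as f32) as i16;
    }
}
```

Interleaving the decoded channels back into a single PCM buffer is the remaining step and is omitted here.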
Transcode pipeline integration: When a .xap file is uploaded via /api/v1/uploads, the transcode pipeline automatically detects the format, decodes the concatenated XAP frames to a temporary PCM WAV file, then passes it to FFmpeg for transcoding to the target format (Opus, FLAC, MP3, etc.).
9. Comparison with IMA-ADPCM
XAP and IMA-ADPCM are the two codecs supported by the Xylolabs SDK. This table summarizes the trade-offs:
| Property | XAP | IMA-ADPCM |
|---|---|---|
| Algorithm | MDCT spectral transform | Sample-by-sample delta quantization |
| Compression ratio | 8:1–10:1 | 4:1 (fixed) |
| Bitrate range | 16–320 kbps/ch | Fixed at sample_rate × 0.5 bytes/s |
| Perceptual quality @64kbps | Near-transparent | Fair — audible quantization noise |
| Spectral fidelity | Full spectrum preserved | High-frequency rolloff under load |
| Algorithmic delay | 7.5–10 ms (1 frame) | Sample-level (< 0.1 ms effective) |
| CPU (with DSP) | ~10 MIPS/ch | < 1 MIPS/ch |
| CPU (without DSP) | Not feasible | < 1 MIPS/ch |
| RAM per channel | ~8 KB | < 1 KB |
| FPU required | Yes | No |
| DSP recommended | Yes | No |
| Platforms supported | M4F, M33, Xtensa LX7 | All platforms |
| Implementation complexity | Higher (MDCT + quant) | Low (lookup tables only) |
| Typical use case | Industrial monitoring, full-spectrum audio | Legacy sensors, CPU-constrained MCUs |
| Integer-only arithmetic | No (requires FPU) | Yes (pure integer) |
Selection guidance:
- Use XAP when the MCU has an FPU (Cortex-M4F or better) and bandwidth efficiency matters. At 8:1–10:1 compression, XAP needs roughly half the bandwidth of IMA-ADPCM's fixed 4:1 while matching or exceeding its audio quality.
- Use IMA-ADPCM on platforms without FPU (RP2040, ESP32-C3, STM32F103), or when CPU budget for audio is less than 2 MIPS per channel, or when per-channel RAM must be below 2 KB.
- For mixed deployments, the XMBP protocol supports both codecs on the same stream endpoint. The server decoder auto-selects based on the codec identifier in the XMBP stream header.
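The selection guidance above can be expressed as a small decision function. The Target struct, its field names, and select_codec are hypothetical; the thresholds (FPU present, at least 2 MIPS and 2 KB of RAM per channel) are taken from the bullets above and say nothing about budgets that fall between those thresholds and XAP's typical needs.

```rust
#[derive(Debug, PartialEq, Eq)]
enum Codec {
    Xap,
    ImaAdpcm,
}

/// Per-channel resources available for audio on a target (illustrative).
struct Target {
    has_fpu: bool,
    cpu_mips_per_channel: u32,
    ram_bytes_per_channel: u32,
}

/// Apply the stated rules: fall back to IMA-ADPCM without an FPU, below
/// 2 MIPS per channel, or below 2 KB of RAM per channel.
fn select_codec(t: &Target) -> Codec {
    if !t.has_fpu || t.cpu_mips_per_channel < 2 || t.ram_bytes_per_channel < 2 * 1024 {
        Codec::ImaAdpcm
    } else {
        Codec::Xap
    }
}
```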
10. Related Documents
| Document | Description |
|---|---|
| XMBP-SPECIFICATION.md | Xylolabs Metadata Batch Protocol wire format; XAP frames are carried as XMBP audio stream payloads |
| CODEC-ANALYSIS.md | Comparative analysis of 16+ audio codecs across 5 MCU platforms; detailed justification for XAP selection |
| PERFORMANCE-EVALUATION.md | Benchmark data: encode times per frame, channel scaling, DSP impact, MCU feasibility matrix |
| PERFORMANCE-PROFILE.md | DSP acceleration matrix, per-target CPU/memory budgets, CMSIS-DSP integration guide |
| PLATFORM-PICO.md | RP2350 (Pico 2) hardware setup, build configuration, and XAP integration |
| PLATFORM-STM32.md | STM32F103/F411/WB55/WBA55 configuration and CMSIS-DSP setup |
| PLATFORM-ESP32.md | ESP32-S3/C3 WiFi, ESP-IDF integration, PIE SIMD configuration |
| PLATFORM-NRF.md | nRF52840/nRF9160 BLE/LTE-M setup and CMSIS-DSP integration |
| FEASIBILITY-RP2350.md | Detailed feasibility analysis: 4ch @96kHz XAP on RP2350, CPU/memory budget breakdown |
11. Legal
XAP (Xylolabs Audio Protocol) is a proprietary technology of Xylolabs Inc.
Patent applications covering the XAP encoding algorithm, frame format, adaptive quantization scheme, and platform-specific DSP acceleration paths have been filed. All rights reserved.
Unauthorized use, reproduction, reverse engineering, distribution, or incorporation of XAP or any portion thereof into third-party products or services is strictly prohibited without a written license from Xylolabs Inc.
For licensing inquiries, contact: legal@xylolabs.com
Trademarks: "Xylolabs", "XAP", and "XMBP" are trademarks of Xylolabs Inc.
Xylolabs Inc. — XAP Specification — Revision 2026-03-27