
November 12, 2025 • NordVarg Team

Native Market Data Protocols: From ITCH/OUCH to Binary Feed Design

Trading · market-data · latency · itch · ouch · feed-handlers
6 min read

Exchanges expose market data through native, binary protocols that prioritize compactness and low latency. Unlike text-based protocols (e.g., FIX), native feeds are often fixed-layout, binary messages transmitted over multicast or TCP. This article explains how native market-data protocols work, how to build robust feed handlers, and the operational trade-offs you must consider when designing low-latency market-data ingestion.

Native vs standardized feeds

Native feeds (e.g., NASDAQ ITCH/OUCH, CME MDP, LSE Millennium) are purpose-built by exchanges. They trade human readability for efficiency:

  • Binary, compact encodings to minimize bytes on the wire.
  • Well-defined, tight message layouts to enable simple parsing.
  • Often delivered over multicast (market data) to scale distribution and reduce infrastructure cost.

Standardized feeds (e.g., via FIX) are easier to integrate but carry encoding overhead and often higher latency. When microseconds matter, firms prefer native binary feeds.

Common protocols and their characteristics

  • NASDAQ ITCH (market data) + OUCH (order entry): ITCH is a feed of order book events; OUCH is an order entry protocol used by some matching engines.
  • CME MDP (market data platform): binary message formats with snapshot + incremental updates.
  • LSE Millennium, BATS, and other exchanges: similar binary protocols with small differences in message sets and encoding.

Characteristics to watch:

  • Message framing (fixed-length vs length-prefixed)
  • Timestamps (epoch vs 64-bit nanoseconds)
  • Endianness and alignment
  • Snapshot vs incremental update patterns
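These characteristics map directly onto parsing code. A minimal sketch (the field sizes and byte order below are illustrative, not taken from any real exchange spec) of decoding a big-endian header carrying a 64-bit nanosecond timestamp, and what goes wrong if the byte order is ignored:

```python
import struct

# Hypothetical header layout (illustrative, not from any real spec):
# 2-byte length, 1-byte message type, 8-byte nanosecond timestamp.
# '!' selects network (big-endian) byte order with standard sizes.
HEADER = struct.Struct('!HBQ')

def decode_header(buf: bytes, offset: int = 0):
    """Decode one fixed-layout header; raises struct.error if buf is short."""
    return HEADER.unpack_from(buf, offset)

raw = struct.pack('!HBQ', 16, 1, 1_700_000_000_000_000_000)
length, msg_type, ts_ns = decode_header(raw)

# Reading the same bytes with the wrong endianness ('<' = little-endian)
# silently yields different values, so follow the spec's declared order.
wrong_length = struct.unpack_from('<HBQ', raw)[0]  # 0x0010 read as 0x1000
```

Keeping the whole layout in one `struct.Struct` makes the framing explicit and avoids per-field offset arithmetic.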

Multicast vs TCP

  • Multicast
    • Pros: low per-subscriber cost; hardware/middleware offloads distribution.
    • Cons: packet loss is possible; recovery requires snapshots or replay streams.
  • TCP
    • Pros: reliable by default; ordered delivery.
    • Cons: more connections to manage, slightly higher latency in some stacks.

Many venues provide multicast for live data and TCP replay endpoints for gaps and snapshots.
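Joining a multicast group for live data amounts to a few socket options. A minimal sketch with the standard library (the group address and port are placeholders, not a real venue's):

```python
import socket
import struct

GROUP, PORT = '239.0.0.1', 5000  # placeholder multicast group/port

def open_multicast_socket(group: str, port: int) -> socket.socket:
    """Open a UDP socket and join a multicast group (POSIX-style options)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(('', port))
    # IP_ADD_MEMBERSHIP takes the group address plus the local
    # interface address (0.0.0.0 lets the kernel choose).
    mreq = struct.pack('4s4s', socket.inet_aton(group),
                       socket.inet_aton('0.0.0.0'))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    return sock

# Receive loop sketch: call sock.recvfrom(65535) per datagram; on a
# detected sequence gap, fall back to the venue's TCP replay endpoint.
```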

Feed-handler architecture

A robust feed handler transforms raw messages into a canonical internal representation and publishes updates to downstream consumers (OMS, SOR, risk engines, analytics).

High-level components:

  • Network layer: NIC tuning (RSS, CPU affinity), hugepages, socket options, kernel bypass (optional).
  • Parser: binary deserialization into typed messages.
  • Snapshot manager: apply a full snapshot, then apply incremental messages in sequence.
  • Order book builder: maintain L1/L2 state using fast in-memory structures (lock-free where possible).
  • Recovery manager: detect sequence gaps and fetch snapshots or replays as required.
  • Publisher: efficient message bus to downstream consumers (shared memory, ring buffers, Kafka depending on latency requirements).
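The components above can be wired together behind a few narrow hooks. A conceptual sketch (all names here are invented for illustration; real code would plug in concrete parser, book, and transport implementations):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class FeedHandler:
    parse: Callable[[bytes], list]     # parser: raw bytes -> messages
    apply: Callable[[object], None]    # order book builder
    publish: Callable[[object], None]  # downstream publisher
    expected_seq: int = 1
    gaps: int = 0

    def on_packet(self, seq: int, payload: bytes) -> None:
        if seq != self.expected_seq:
            self.gaps += 1             # recovery manager hook: real code
                                       # would request snapshot/replay here
        self.expected_seq = seq + 1
        for msg in self.parse(payload):
            self.apply(msg)
            self.publish(msg)

# Usage with trivial stand-ins:
out = []
h = FeedHandler(parse=lambda b: [b], apply=lambda m: None,
                publish=out.append)
h.on_packet(1, b'add')
h.on_packet(3, b'trade')   # seq 2 missing -> counted as a gap
```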

NIC and OS tuning (practical tips)

  • Use dedicated NICs for market data; disable energy-saving features.
  • Enable RSS/Receive-side scaling and set CPU affinity for NIC queues.
  • Reduce interrupt overhead for ultra-low latency: prefer busy polling (e.g., Linux NAPI busy poll) over interrupt-driven receive, and tune poll intervals.
  • Use hugepages and pre-allocated memory to avoid page faults on hot paths.
  • Consider kernel-bypass (DPDK, Netmap) when microsecond-level latency is required, but be aware of complexity and portability trade-offs.
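Much of this tuning (busy polling, kernel bypass) happens outside the application via sysctls and NIC tools, but socket buffer sizing can be done in code. A small sketch; note the kernel may clamp the request (on Linux, to net.core.rmem_max, and the reported value is roughly double what was asked for):

```python
import socket

# Request a large receive buffer so multicast bursts are not dropped
# at the socket before the handler drains them.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 8 * 1024 * 1024)
effective = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)  # what we got
sock.close()
```

Always read back the effective value and alert if it is far below the request, rather than assuming the setsockopt succeeded in full.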

Parsing patterns — correctness and performance

Parsing binary messages is conceptually simple, but implementations must be careful about alignment and bounds checking. Two common approaches:

  1. Zero-copy parsing
    • Map the received packet buffer and interpret fields in-place using struct casts or pointer arithmetic.
    • Fast, but care required to avoid endianness/unaligned access issues.
  2. Safe deserialization
    • Read bytes into typed fields using explicit unpacking (e.g., struct.unpack_from in Python, read_* helpers in C/C++/Rust).
    • Slightly slower, but easier to reason about and safer across platforms.

Example: simplified Python-style parser for an ITCH-like fixed message

```python
# Simplified conceptual example, not production-ready
import struct

# Example: message header has a 1-byte type and a 2-byte body length
HEADER_FMT = '!BH'  # network byte order: type (1 byte), length (2 bytes)
HEADER_SIZE = struct.calcsize(HEADER_FMT)

def parse_packet(buf: bytes):
    offset = 0
    while offset + HEADER_SIZE <= len(buf):
        msg_type, msg_len = struct.unpack_from(HEADER_FMT, buf, offset)
        offset += HEADER_SIZE
        if offset + msg_len > len(buf):
            raise ValueError('truncated message')  # bounds check
        body = buf[offset:offset + msg_len]
        offset += msg_len
        if msg_type == 1:    # Add order
            handle_add_order(body)
        elif msg_type == 2:  # Reduce order
            handle_reduce_order(body)
        # ... other message types
```

For C/C++/Rust, prefer reading from a byte slice with safe parsing helpers (or a parser-combinator library such as nom in Rust) to avoid undefined behavior.

Snapshot and incremental update handling

Most exchange feeds provide a snapshot (full state) mechanism and then a stream of incremental updates. The typical flow:

  1. On startup, request and apply a snapshot to build the base book.
  2. Start applying incremental messages with strictly increasing sequence numbers.
  3. If a sequence gap is detected (missing message), fetch a replay or resync with a fresh snapshot.

Implementation detail: sequence numbers and timestamps are crucial — keep them in 64-bit types to avoid wrap issues.
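A common implementation of steps 1 and 2 buffers incrementals that arrive while the snapshot is loading, then replays the ones newer than the snapshot's sequence number. A conceptual sketch (class and message names are invented; only the sequencing logic is shown, not a real book):

```python
from collections import deque

class BookState:
    """Illustrative snapshot-then-incremental sequencing only."""
    def __init__(self):
        self.seq = 0              # last applied sequence number (64-bit)
        self.live = False         # True once a snapshot has been applied
        self.pending = deque()    # incrementals buffered during snapshot
        self.applied = []         # stand-in for real book mutations

    def on_incremental(self, seq, msg):
        if not self.live:
            self.pending.append((seq, msg))  # buffer until snapshot lands
        elif seq <= self.seq:
            pass                             # stale or duplicate: drop
        elif seq != self.seq + 1:
            raise RuntimeError('sequence gap: resync required')
        else:
            self.seq = seq
            self.applied.append(msg)

    def on_snapshot(self, snapshot_seq):
        self.seq = snapshot_seq
        self.live = True
        while self.pending:                  # replay buffered messages
            seq, msg = self.pending.popleft()
            self.on_incremental(seq, msg)

book = BookState()
book.on_incremental(101, 'add')    # arrives while snapshot is loading
book.on_incremental(102, 'trade')
book.on_snapshot(101)              # snapshot already covers seq 101
```

Note how the buffered message at seq 101 is discarded as already covered by the snapshot, while seq 102 is applied on top.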

Recovery strategies

  • Packet loss: if loss is detected on the multicast feed, the feed handler should automatically request a TCP replay or a fresh snapshot.
  • Missed message: detect using sequence numbers and trigger a resync path.
  • Partial snapshot: validate checksum / snapshot digest where provided.

Always treat the incremental stream as unreliable and have a fast path to re-acquire a consistent book.

Example: building a tiny L1 book update in Rust (conceptual)

```rust
// Pseudocode, conceptual only
struct Book {
    bid: Option<(u64, f64)>, // (size, price)
    ask: Option<(u64, f64)>,
}

fn apply_message(book: &mut Book, msg: &Message) {
    match msg {
        Message::Add { side, price, size } => {
            if *side == Side::Bid {
                book.bid = Some((*size, *price));
            } else {
                book.ask = Some((*size, *price));
            }
        }
        Message::Trade { price, size } => {
            // adjust resting quantities against the trade
        }
        _ => {}
    }
}
```

Publishing to downstream consumers

Low-latency systems avoid allocations and copies on the hot path. Common patterns:

  • Ring buffers / LMAX Disruptor style queues to publish updates to multiple consumers.
  • Shared memory with small control blocks for consumer offsets.
  • Lightweight serialization (binary blobs + message descriptors) to reduce CPU.

If guaranteed delivery and durability are required, integrate a persistent pipeline (e.g., Kafka) but be aware of added latency.
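A minimal single-producer/single-consumer ring illustrates the first pattern; the sizes here are illustrative, and a real implementation would live in shared memory with cache-line-padded atomic cursors rather than Python objects:

```python
class RingBuffer:
    """Minimal SPSC ring; conceptual only."""
    def __init__(self, capacity: int = 1024):
        assert capacity & (capacity - 1) == 0, 'capacity must be power of 2'
        self.mask = capacity - 1
        self.slots = [None] * capacity
        self.head = 0   # next write position (producer cursor)
        self.tail = 0   # next read position (consumer cursor)

    def publish(self, item) -> bool:
        if self.head - self.tail > self.mask:
            return False                     # full: consumer is lagging
        self.slots[self.head & self.mask] = item
        self.head += 1
        return True

    def poll(self):
        if self.tail == self.head:
            return None                      # empty
        item = self.slots[self.tail & self.mask]
        self.tail += 1
        return item
```

The cursors never wrap (masking maps them onto slots), so `head - tail` directly gives the consumer lag metric discussed under monitoring below; a full ring is the signal to drop, block, or resync.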

Testing and CI

  • Create a test harness that replays recorded pcap or raw feed captures and exercises parsing, snapshot application, and recovery.
  • Unit test parsers against sample message captures from exchange spec docs.
  • Add integration tests for snapshot+incremental sequences and for simulated packet loss scenarios.
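Such a harness can start as small as hand-built packets fed through the parser, with dispatched messages recorded for assertions. A self-contained sketch mirroring the simplified length-prefixed framing used earlier:

```python
import struct

HEADER_FMT = '!BH'   # 1-byte type, 2-byte body length, network order
events = []          # recorded dispatches, asserted on by the test

def handle(msg_type, body):
    events.append((msg_type, body))

def parse_packet(buf: bytes):
    offset = 0
    while offset < len(buf):
        msg_type, msg_len = struct.unpack_from(HEADER_FMT, buf, offset)
        offset += struct.calcsize(HEADER_FMT)
        handle(msg_type, buf[offset:offset + msg_len])
        offset += msg_len

# Build a synthetic "capture": an add (type 1) then a reduce (type 2).
capture = (struct.pack('!BH', 1, 3) + b'abc' +
           struct.pack('!BH', 2, 2) + b'xy')
parse_packet(capture)
```

The same shape scales up to replaying full pcap captures: substitute recorded packets for the hand-built bytes and assert on book state instead of raw dispatches.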

Operational monitoring

Track these key metrics:

  • Packets per second and messages per second per multicast feed.
  • Sequence gap count and time to re-sync.
  • Snapshot fetch latency and success rate.
  • Consumer lag (how far downstream consumers are behind the live publisher).

Graph these with histograms (p99/p999) and alert on regressions.
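For offline analysis, tail percentiles can be computed with a nearest-rank rule over recorded samples; production systems typically use streaming histograms (HDR-histogram-style) instead. A sketch (the latency values are made up for illustration):

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile (0 < p <= 100) over a finite sample list."""
    ranked = sorted(samples)
    k = max(0, math.ceil(p / 100 * len(ranked)) - 1)
    return ranked[k]

# Made-up wire-to-book latencies in microseconds, for illustration only.
latencies_us = [5, 7, 6, 5, 8, 120, 6, 7, 5, 6]
p50 = percentile(latencies_us, 50)
p99 = percentile(latencies_us, 99)
```

Note how a single outlier dominates p99 while leaving the median untouched, which is exactly why averages hide the regressions that matter here.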

Tradeoffs: correctness vs microseconds

  • Correctness-first: implement snapshots and recovery, durable logging of incoming raw messages for post-mortem.
  • Performance-first: zero-copy parsing and kernel-bypass to shave microseconds, but add complexity and testing burden.

Start with a correct, well-tested parser + snapshot/resync flow. Optimize the hot path once you have reliability.

Conclusion

Native market-data feeds are the backbone of low-latency trading systems. Building robust feed handlers requires careful attention to parsing correctness, snapshot/recovery, NIC tuning, and efficient publication to downstream consumers. With a correct baseline and a clear testing and monitoring plan, you can safely optimize for latency while maintaining correctness.
