OTLP Logs on the Wire: AnyValue’s Oneof Tags, Attribute KVLists, and a Zero-Allocation Rust Protobuf Fast-Path
Logs are “just strings” right up until you ship structured logs over OTLP and discover:
- every attribute is another nested protobuf message,
- `AnyValue` is a `oneof` (branching on the wire),
- and the cost is dominated by length-delimited blobs (wire type 2) with varint lengths.
This deep dive is about the exact bytes involved when you send OTLP logs (over gRPC or HTTP):
- protobuf tags (`(field_number << 3) | wire_type`) and varints
- the on-the-wire layout of `ExportLogsServiceRequest`
- how `AnyValue` encodes its `oneof`
- why `KeyValueList` exists (and how it hurts)
- a pointer-based Rust decoder that can “skim” an OTLP payload without allocations
If you can read these bytes, you can explain why:
- your Collector CPU spikes when someone adds a 2KB JSON blob to `body`,
- Loki ingest via OTLP behaves differently from Loki’s native push API,
- and why “protobuf is fast” becomes untrue once you embed maps-of-maps.
Specs we rely on (manually verified)
These links were fetched and checked for the specific definitions cited.
- OTLP spec: OTLP uses a Protocol Buffers schema and is implemented over gRPC/HTTP.
- Protobuf encoding: varints; tag is `(field_number << 3) | wire_type`; wire types include LEN=2.
- OTLP logs service request: `ExportLogsServiceRequest { repeated ResourceLogs resource_logs = 1; }`
- OTLP logs data model: `LogsData`, `ResourceLogs`, `ScopeLogs`, and `LogRecord`; the `AnyValue`, `KeyValue`, and `KeyValueList` definitions.
- Loki HTTP API docs: Loki exposes both native push and an OTLP logs ingest endpoint (`POST /otlp/v1/logs`).
Layer 0 recap: protobuf records are TLV-ish (tag + payload)
From the protobuf wire format:
- each field record starts with a tag varint: `tag = (field_number << 3) | wire_type`
- for wire type LEN (2) the payload is: `len_varint` + `len` bytes
So most OTLP payloads look like an endless stream of:
```
[tag varint][len varint][len bytes]
[tag varint][len varint][len bytes]
...
```
Mermaid: the byte tape you’re actually parsing
```mermaid
flowchart LR
  subgraph REC["Protobuf record (wire type = LEN)"]
    T["tag varint\n(field<<3 | 2)"] --> L["len varint"] --> P["payload bytes (len)"]
  end
```
Operational implication: OTLP logs ingest is fundamentally scan + varint + bounds-check. If your implementation copies slices around, you lose.
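The tag and varint arithmetic above is easy to verify in a few lines. A minimal sketch (the `tag` and `encode_varint` helpers are mine, for illustration only — not from any OTLP library):

```rust
// Tag arithmetic from the protobuf spec: tag = (field_number << 3) | wire_type.
// The tag itself is then encoded as a varint.
fn tag(field: u32, wire: u8) -> u64 {
    ((field as u64) << 3) | (wire as u64)
}

// LEB128-style varint: 7 payload bits per byte, MSB set on all but the last byte.
fn encode_varint(mut v: u64) -> Vec<u8> {
    let mut out = Vec::new();
    loop {
        let byte = (v & 0x7f) as u8;
        v >>= 7;
        if v == 0 {
            out.push(byte);
            break;
        }
        out.push(byte | 0x80);
    }
    out
}

fn main() {
    // Field 1, LEN=2 -> 0x0A: the usual first byte of an OTLP logs request.
    assert_eq!(tag(1, 2), 0x0A);
    // Field numbers >= 16 push the tag varint past one byte: field 16, LEN -> 82 01.
    assert_eq!(encode_varint(tag(16, 2)), [0x82, 0x01]);
}
```

Note that any field number of 16 or above already costs a two-byte tag, which matters when a schema nests many small messages.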
ExportLogsServiceRequest: why the first byte is usually 0x0A
The OTLP logs service request is:
```protobuf
message ExportLogsServiceRequest {
  repeated opentelemetry.proto.logs.v1.ResourceLogs resource_logs = 1;
}
```

Field 1, wire type LEN=2 → tag = `(1 << 3) | 2 = 0x08 | 0x02 = 0x0A`.
So a request containing exactly one ResourceLogs typically begins as:
| bytes | meaning |
|---|---|
| `0A` | tag: field 1, LEN |
| `..` | varint length of the embedded ResourceLogs message |
| `..` | ResourceLogs bytes |
This is not trivia: if you’re sampling or applying ingress limits, you can detect “payload is OTLP logs” and count nested elements without a full decode.
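As a sketch of that idea, here is a safe, slice-based counter that tallies top-level field-1 LEN records without touching their payloads. The helper names are mine, not from any library:

```rust
/// Count top-level `resource_logs` entries (field 1, LEN) in an
/// ExportLogsServiceRequest without decoding the nested messages.
fn count_resource_logs(mut buf: &[u8]) -> Option<usize> {
    let mut count = 0;
    while !buf.is_empty() {
        let (tag, rest) = read_varint(buf)?;
        let (field, wire) = ((tag >> 3) as u32, (tag & 7) as u8);
        buf = rest;
        match wire {
            0 => buf = read_varint(buf)?.1, // VARINT: skip the value
            2 => {                          // LEN: bounds-check, then skip wholesale
                let (len, rest) = read_varint(buf)?;
                let len = len as usize;
                if rest.len() < len { return None; }
                if field == 1 { count += 1; }
                buf = &rest[len..];
            }
            1 => buf = buf.get(8..)?,       // I64: 4+4 raw bytes
            5 => buf = buf.get(4..)?,       // I32
            _ => return None,               // groups are deprecated; treat as invalid
        }
    }
    Some(count)
}

fn read_varint(buf: &[u8]) -> Option<(u64, &[u8])> {
    let mut x = 0u64;
    for (i, &b) in buf.iter().enumerate().take(10) {
        x |= ((b & 0x7f) as u64) << (7 * i as u32);
        if b & 0x80 == 0 { return Some((x, &buf[i + 1..])); }
    }
    None
}

fn main() {
    // Two empty ResourceLogs messages: 0A 00 0A 00.
    assert_eq!(count_resource_logs(&[0x0A, 0x00, 0x0A, 0x00]), Some(2));
    assert_eq!(count_resource_logs(&[]), Some(0));
}
```

The whole thing is scan + varint + bounds-check, exactly as claimed: the nested ResourceLogs bytes are never inspected.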
The real villain: AnyValue (a oneof that turns types into tags)
AnyValue is defined as a oneof:
```protobuf
message AnyValue {
  oneof value {
    string string_value = 1;
    bool bool_value = 2;
    int64 int_value = 3;
    double double_value = 4;
    ArrayValue array_value = 5;
    KeyValueList kvlist_value = 6;
    bytes bytes_value = 7;
  }
}
```

On the wire, the oneof is simply “which field number appears”.
Concrete: encoding AnyValue { string_value = "hi" }
- field = 1
- wire type = LEN (string)
- tag = `(1<<3)|2 = 0x0A`
Bytes:
| offset | byte(s) | meaning |
|---|---|---|
| 0 | 0A | tag: string_value (field 1, LEN) |
| 1 | 02 | length = 2 |
| 2..3 | 68 69 | ASCII h i |
That’s the cheap case.
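The byte table above can be reproduced in a few lines. A sketch, valid for strings under 128 bytes so the length varint fits in one byte (`encode_anyvalue_string` is an illustrative helper, not a library API):

```rust
/// Hand-encode AnyValue { string_value } for short strings (< 128 bytes,
/// so the length varint is a single byte).
fn encode_anyvalue_string(s: &str) -> Vec<u8> {
    assert!(s.len() < 128);
    let mut buf = vec![0x0A, s.len() as u8]; // tag: field 1, LEN; then the length
    buf.extend_from_slice(s.as_bytes());
    buf
}

fn main() {
    // Matches the byte table above: 0A 02 68 69.
    assert_eq!(encode_anyvalue_string("hi"), [0x0A, 0x02, 0x68, 0x69]);
}
```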
The expensive case: kvlist_value (maps become repeated messages)
kvlist_value is field 6, LEN type → tag = (6<<3)|2 = 0x32.
And inside KeyValueList:
```protobuf
message KeyValueList { repeated KeyValue values = 1; }
message KeyValue { string key = 1; AnyValue value = 2; }
```

So a single “map entry” becomes (at minimum):
- one LEN field for the `KeyValue` message
- inside it: one LEN field for `key`
- and one LEN field for the embedded `AnyValue`
In other words: maps inflate into nested TLV chains.
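To make the inflation concrete, here is the hand-encoding of a single entry `{"k": "v"}` (a sketch with hard-coded bytes; `encode_kv_entry` is my own helper name):

```rust
/// Wire bytes for KeyValue { key: "k", value: AnyValue { string_value: "v" } }.
/// Every nesting level adds its own tag byte and length byte.
fn encode_kv_entry() -> Vec<u8> {
    let mut kv = Vec::new();
    kv.extend_from_slice(&[0x0A, 0x01, b'k']); // KeyValue.key (field 1, LEN), "k"
    kv.extend_from_slice(&[0x12, 0x03]);       // KeyValue.value (field 2, LEN), 3 bytes follow
    kv.extend_from_slice(&[0x0A, 0x01, b'v']); // nested AnyValue.string_value = "v"
    kv
}

fn main() {
    let kv = encode_kv_entry();
    assert_eq!(kv.len(), 8);
    // As one KeyValueList.values entry (field 1, LEN): two more outer bytes.
    let mut entry = vec![0x0A, kv.len() as u8];
    entry.extend_from_slice(&kv);
    assert_eq!(entry.len(), 10); // 10 wire bytes to carry 2 bytes of key/value data
}
```

A 5x overhead on one-character keys and values is the worst case, but it shows why attribute-heavy logs decode slower than their payload size suggests.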
Mermaid: KeyValue as nested LEN fields
```mermaid
flowchart TB
  KV["KeyValue (embedded message)"] --> K["1: key (LEN string)"]
  KV --> V["2: value (LEN AnyValue)"]
  V --> ONEOF["AnyValue oneof field\n(1..7)"]
```
LogRecord.flags: bit fields you should actually use
The logs proto defines LogRecord.flags as fixed32 and also defines LogRecordFlags:
- “Bits 0-7 are used for trace flags.”
- `LOG_RECORD_FLAGS_TRACE_FLAGS_MASK = 0x000000FF`
That’s a direct invitation to avoid string parsing and build branch-free routing:
- if trace-flags bit 0 (“sampled”) is set, keep the log
- otherwise, drop or downsample
Bit layout (little-endian on the wire, but logically a 32-bit word)
| bits | meaning |
|---|---|
| 0..7 | trace flags |
| 8..31 | reserved |
A practical gotcha: fixed32 is wire type 5 (I32) and encoded as 4 raw bytes little-endian.
So if you see a tag with wire type I32, your next 4 bytes are the value.
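Putting the two together, a minimal sketch of branch-light sampled-bit routing (the `is_sampled` helper is mine; the mask constant matches the proto definition quoted above):

```rust
const LOG_RECORD_FLAGS_TRACE_FLAGS_MASK: u32 = 0x0000_00FF;
const SAMPLED: u32 = 0x01; // W3C trace-flags bit 0

/// Decode a fixed32 `flags` field (wire type I32: 4 raw little-endian bytes)
/// and test the sampled bit without any string parsing.
fn is_sampled(le_bytes: [u8; 4]) -> bool {
    let flags = u32::from_le_bytes(le_bytes);
    (flags & LOG_RECORD_FLAGS_TRACE_FLAGS_MASK & SAMPLED) != 0
}

fn main() {
    assert!(is_sampled([0x01, 0x00, 0x00, 0x00]));
    assert!(!is_sampled([0x00, 0x00, 0x00, 0x00]));
    // Reserved high bits don't affect the decision.
    assert!(!is_sampled([0x00, 0x00, 0x00, 0xFF]));
}
```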
Rust: a “skim decoder” for OTLP logs (count records, sample, extract a few keys)
Sometimes you don’t want to fully decode OTLP logs; you want a fast path to:
- count `LogRecord`s
- pull `severity_number`, `time_unix_nano`
- optionally extract `body.string_value` if present
…and otherwise skip bytes.
Below is a deliberately low-level protobuf reader:
- pointer-based varint decode
- skip unknown fields by wire type
- no allocations
- no `prost` structs
Core: varint + tag split
```rust
use core::ptr;

#[inline(always)]
unsafe fn read_u64_varint(mut p: *const u8, end: *const u8) -> Option<(u64, *const u8)> {
    let mut x: u64 = 0;
    let mut shift = 0;
    // shift < 70 caps the loop at 10 bytes, the maximum for a u64 varint.
    while p < end && shift < 70 {
        let b = ptr::read(p);
        p = p.add(1);
        x |= ((b & 0x7f) as u64) << shift;
        if (b & 0x80) == 0 {
            return Some((x, p));
        }
        shift += 7;
    }
    None
}

#[inline(always)]
fn split_tag(tag: u64) -> (u32, u8) {
    let wire = (tag & 0x7) as u8;
    let field = (tag >> 3) as u32;
    (field, wire)
}
```

Skip logic: make unknown fields cheap
```rust
#[inline(always)]
unsafe fn skip_field(wire: u8, p: *const u8, end: *const u8) -> Option<*const u8> {
    match wire {
        0 => { // VARINT
            let (_v, p2) = read_u64_varint(p, end)?;
            Some(p2)
        }
        1 => { // I64
            if end.offset_from(p) < 8 { return None; }
            Some(p.add(8))
        }
        2 => { // LEN
            let (len, p2) = read_u64_varint(p, end)?;
            let len = len as isize;
            if end.offset_from(p2) < len { return None; }
            Some(p2.offset(len))
        }
        5 => { // I32
            if end.offset_from(p) < 4 { return None; }
            Some(p.add(4))
        }
        _ => None, // groups deprecated; treat as invalid
    }
}
```

Skim AnyValue for string_value only
This is the trick that keeps your hot path from exploding: if your backend only needs the string body most of the time, you don’t decode arrays/maps.
```rust
/// Returns the string bytes if AnyValue is a string_value, otherwise None.
unsafe fn anyvalue_string<'a>(mut p: *const u8, end: *const u8) -> Option<(&'a [u8], *const u8)> {
    while p < end {
        let (tag, p2) = read_u64_varint(p, end)?;
        p = p2;
        let (field, wire) = split_tag(tag);
        // AnyValue.string_value = 1 (LEN)
        if field == 1 && wire == 2 {
            let (len, p3) = read_u64_varint(p, end)?;
            let len = len as isize;
            if end.offset_from(p3) < len { return None; }
            let bytes = core::slice::from_raw_parts(p3, len as usize);
            return Some((bytes, p3.offset(len)));
        }
        // skip other oneof arms
        p = skip_field(wire, p, end)?;
    }
    // No string_value arm found: report that instead of an empty slice.
    None
}
```

This code is intentionally “unsafe and boring” because it matches the wire format precisely.
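For readers who want to try the skim without raw pointers, here is a safe, slice-based sketch of the same idea. It is not the pointer version above, just its logic on slices, and the helper names are mine:

```rust
/// Return the string_value bytes of an AnyValue if that oneof arm is
/// present; skip every other arm; otherwise None.
fn anyvalue_string(mut buf: &[u8]) -> Option<&[u8]> {
    while !buf.is_empty() {
        let (tag, rest) = read_varint(buf)?;
        let (field, wire) = ((tag >> 3) as u32, (tag & 7) as u8);
        buf = rest;
        match (field, wire) {
            (1, 2) => { // string_value: field 1, LEN
                let (len, rest) = read_varint(buf)?;
                return rest.get(..len as usize);
            }
            (_, 0) => buf = read_varint(buf)?.1, // VARINT arms (bool, int)
            (_, 2) => {                          // other LEN arms (bytes, array, kvlist)
                let (len, rest) = read_varint(buf)?;
                buf = rest.get(len as usize..)?;
            }
            (_, 1) => buf = buf.get(8..)?,       // I64 (double)
            (_, 5) => buf = buf.get(4..)?,       // I32
            _ => return None,
        }
    }
    None
}

fn read_varint(buf: &[u8]) -> Option<(u64, &[u8])> {
    let mut x = 0u64;
    for (i, &b) in buf.iter().enumerate().take(10) {
        x |= ((b & 0x7f) as u64) << (7 * i as u32);
        if b & 0x80 == 0 { return Some((x, &buf[i + 1..])); }
    }
    None
}

fn main() {
    // AnyValue { string_value: "hi" } -> 0A 02 68 69
    assert_eq!(anyvalue_string(&[0x0A, 0x02, 0x68, 0x69]), Some(&b"hi"[..]));
    // AnyValue { bool_value: true } -> 10 01 (field 2, VARINT): not a string.
    assert_eq!(anyvalue_string(&[0x10, 0x01]), None);
}
```

The safe version pays for bounds checks the pointer version elides, but the shape of the hot loop — tag, dispatch on wire type, skip or slice — is identical.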
Where SIMD helps (and where it doesn’t)
- SIMD can help with varint termination scanning (find the first byte where MSB=0), but:
  - OTLP log payloads often have many small varints (1 byte) and many LEN fields where the expensive part is hashing keys and UTF-8 validation, not varint math.
In other words: don’t write SIMD until you’ve proven your hot path is “varint-bound”. It usually isn’t.
Architectural trade-offs: OTLP logs vs “just ship JSON”
OTLP logs (protobuf) trade-offs
- Pros
  - typed values (`int64`, `double`, nested arrays)
  - consistent schema and semantic conventions
  - can be transported over OTLP/gRPC with backpressure
- Cons
  - maps become nested messages (`KeyValueList` → `KeyValue` → `AnyValue`)
  - decoding cost is dominated by length-delimited blobs + nested recursion
  - string keys repeat (unless you introduce dictionary-like schemes, which are not broadly used for logs)
Loki endpoints: native push vs OTLP ingest
Loki’s docs show both:
- `POST /loki/api/v1/push` (native Loki push)
- `POST /otlp/v1/logs` (OTLP logs ingest)
This matters operationally:
- Loki-native push has a well-understood “streams + entries” model and can be optimized for Loki’s internal chunking.
- OTLP logs ingest has to translate from `ResourceLogs`/`ScopeLogs`/`LogRecord` and decode `AnyValue` (including deep KVLists), which can be CPU-expensive depending on your attribute shape.
Go vs Rust (the uncomfortable truth)
Rust can win on:
- skipping unknown fields cheaply
- pointer-based parsing with minimal bounds checks
- avoiding allocations in the fast path
Go can win because:
- the protobuf + gRPC stacks are brutally production-hardened
- CPU profiles often show Go “wasting less time” on string handling due to runtime optimizations and mature libraries
- the integration costs (load shedding, queues, backpressure, retry) are easier to get correct
I’ve repeatedly seen “a slower decoder” beat “a faster decoder” because it sits inside a better-shaped pipeline.
Provocative conclusion: the structured-logs paradox
Structured logs promise “more query power” because you ship more structure.
But the moment you ship more structure, you pay for:
- repeated keys,
- nested TLV re-encoding,
- and deep `AnyValue` trees that have to be traversed somewhere.
Research Question:
Can we design an OTLP-compatible log transport that keeps the semantic model but adds dictionary + columnar encoding for attributes (à la Parquet), so “high-cardinality keys” stop dominating CPU?
If we can, why do some Go pipelines still outperform custom Rust decoders — is the limiting factor really parsing, or the emergent behavior of batching, queues, and backpressure under bursty log loads?