Skip to content

Discussion: Serialization strategy for TRAPI messages — TOM objects, split storage, and codec options #99

Description

@kennethmorton

Context

We're exploring ways to reduce the serialization/deserialization overhead when passing large TRAPI messages between shepherd workers. Each worker in the pipeline currently does a full decode (zstandard decompress + JSON/msgpack parse) of the entire message from Redis, processes it, then re-encodes and writes it back. For large messages this adds up across the pipeline.

We evaluated several options:

  • Codec swap (orjson → msgspec msgpack) — Phase 1 is on branch claude/jolly-bohr-KLfa9
  • Typed deserialization using TOM (translator_tom) Pydantic objects or msgspec Structs
  • Split Redis storage by message section so workers only decode what they need

This issue captures the analysis to date as a discussion point.


Performance Impact of Using TOM Objects in Shepherd

The Pipeline Cost Per Query

A typical Aragorn query passes through roughly 8 workers in sequence, each doing a get+save cycle against Redis. Here's the serialization budget for a 5,000-element message (a realistic large query with 5K nodes, 5K edges, 5K results):

Step Current (plain dicts) With TOM Objects
Decompress + decode to dicts ~83ms ~83ms
Pydantic model_validate ~250ms (est. ~3× decode time)
Worker logic varies same (but richer API)
Encode + compress ~6ms ~6ms
Per-worker overhead ~89ms ~339ms
8-worker pipeline total ~712ms ~2,712ms

The validation cost comes from constructing ~200,000 model instances per decode (one per node, edge, result, attribute, etc.) multiplied by 8 workers in the pipeline.

Message Size Breakdown

At the 5K scale, the knowledge graph dominates the message:

Section Size Share
knowledge_graph 2.96 MB 75%
results 0.81 MB 20%
auxiliary_graphs 0.14 MB 3%
query_graph <0.01 MB <1%

What Each Worker Actually Touches

Not every worker needs every section. Here's the access pattern across the pipeline:

Worker Reads Modifies Ser/deser cycles Notes
merge_message KG + results + aux (×3 messages) all 3 gets + 1 save Semantic merge, dedup, edge remapping — richest logic
aragorn_score full message results.analyses.score 1+1 Sets one field per analysis
aragorn_omnicorp KG + results node/edge attributes 1+1 Appends attributes
filter_results_top_n results only truncates list 1+1 results[:N]
filter_kgraph_orphans full message removes KG entries 1+1 Set operations on keys
sort_results_score results reorders list 1+1 Sort by score
filter_analyses_top_n results truncates analyses 1+1 Similar to filter_results
finish_query reads for callback none 1 get Read-only
aragorn/bte (entry) query_graph none 1 get Reads 2 fields
aragorn_lookup/bte_lookup query_graph + params params 1 get Small message portion

Where TOM Objects Add Value vs. Overhead

TOM provides a well-designed set of helpers for TRAPI operations — update() for merging knowledge graphs, normalize() for hash-based edge ID remapping, hash-based result deduplication, constraint checking, and more. These are genuinely useful for complex semantic operations.

However, in the shepherd pipeline, most workers perform straightforward dict operations (append to a list, sort by a key, truncate, set a field). For these workers, the Pydantic validation cost of constructing typed objects outweighs the benefit — there's no complex logic that benefits from the richer API.

The one worker where TOM's helpers would clearly pay for themselves is merge_message, which does real semantic operations (merge KGs, deduplicate results, remap edge IDs). The question is whether that one worker's benefit justifies the validation cost, or whether it's better to use TOM selectively.

Scaling Behavior

For reference, here's how serialization cost scales with message size:

Scale Raw Size Compressed Decode Time Encode Time Round-trip
Small (100 elements) 0.1 MB 5 KB 0.8ms 0.2ms 1.0ms
Medium (1K) 0.8 MB 32 KB 12ms 2ms 13ms
Large (5K) 3.9 MB 135 KB 103ms 12ms 115ms
XL (10K) 7.8 MB 250 KB 350ms 23ms 373ms
XXL (50K) 39.7 MB 1.1 MB 2,365ms 119ms 2,484ms

Decode dominates encode by ~10×. This is because decoding must construct hundreds of thousands of Python objects (dicts, lists, strings), while encoding just walks the existing object tree.


Possible Approaches (Not Mutually Exclusive)

1. Split Redis storage by message section

Store {id}:kg, {id}:results, {id}:aux, {id}:qg as separate keys. Workers fetch only the sections they need. Since the knowledge graph is 75% of the message and only 5 of 16 workers access it, this could reduce decode time by ~40–60% for most workers with no new dependencies.

2. Use TOM selectively

Use TOM objects in merge_message (and potentially other workers that benefit from semantic helpers) while keeping plain dicts for simple filter/sort workers. This gets the best of both worlds but means two code styles in the repo.

3. msgspec msgpack as the wire format (Phase 1 — done)

Branch claude/jolly-bohr-KLfa9 swaps the codec from orjson to msgspec.msgpack. Performance is roughly neutral for untyped dicts, but it establishes the binary wire format. Includes backward-compatible decoding for in-flight JSON messages during rolling deploys.

4. Typed deserialization (future)

If/when typed Structs or similar become viable with the constraints we need (extra field preservation, hash-based identity), typed decode could skip Python dict construction entirely. This remains the theoretical ceiling but has practical constraints today — see discussion in the analysis above.


Open Questions

  1. Is split storage worth the added complexity in Redis key management and TTL handling?
  2. For merge_message specifically — would it be cleaner to convert dicts → TOM objects only inside that worker, or to pass TOM objects through the pipeline?
  3. Are there other workers where TOM's helpers (constraint checking, normalization) would justify the validation overhead?
  4. What's the typical message size distribution in production? The cost/benefit shifts significantly between the 1K and 10K scales.

Feedback welcome from anyone familiar with the TRAPI pipeline or TOM internals.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Fields

    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions