Support the transactional outbox pattern with TiCDC #4325

@benmeadowcroft

Is your feature request related to a problem?

Applications using the transactional outbox pattern write domain events to an outbox table in the same database transaction as the business change, then rely on a CDC tool to relay those events to a message broker. TiCDC can replicate the tabular data today, but existing protocols (open-protocol, canal-json, etc.) wrap the payload in an internal envelope format that downstream consumers must understand. Consumers are forced to parse out TiCDC-specific metadata and extraneous content to extract the raw application payload. In addition to the envelope wrapping, TiCDC does not provide sufficient control over topic dispatch, Kafka header generation, or Kafka key generation. All of this adds complexity and processing latency that users want to avoid.

Describe the feature you'd like

A new outbox protocol designed specifically for the transactional outbox pattern, providing the control and flexibility that users who want to adopt this pattern need.

  • Raw payloads: The Kafka message key and value are taken directly from designated table columns (key-column, value-column), with no TiCDC envelope wrapping. Downstream consumers receive exactly the bytes stored in the database.
  • INSERT-only: Only INSERT events produce Kafka messages. UPDATEs and DELETEs (e.g., housekeeping deletes from the outbox table) are silently discarded. DDL events and checkpoint watermarks are not emitted.
  • Reserved Id header: Every message carries a Kafka record header Id whose value is taken from the id-column. This provides idempotency and deduplication keys to consumers without needing to parse the payload.
  • Extensible column-to-header mapping: Additional table columns can be promoted to Kafka record headers, enabling propagation of out-of-band metadata (e.g. W3C TraceContext for distributed tracing) without embedding it in the payload.
  • Per-row topic routing: Topic expressions already support {schema} and {table} placeholders; TiCDC should also support dispatching rows to different topics based on a column value.
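
To make the behaviours listed above concrete, here is a minimal Go sketch of how a single row change might be turned into a Kafka message under such a protocol. It is only an illustration under stated assumptions: RowEvent, OutboxConfig, Message, Header and Encode are hypothetical names invented for this example, not TiCDC's internal types and not the code in the proof-of-concept branch. The nil-key fallback, the default-topic fallback and the sorted header order are also assumptions rather than part of the proposal.

```go
package main

import (
	"errors"
	"sort"
)

// EventType is a simplified stand-in for a row change kind (hypothetical).
type EventType int

const (
	Insert EventType = iota
	Update
	Delete
)

// RowEvent is a hypothetical, simplified row change: column name -> raw bytes.
type RowEvent struct {
	Type    EventType
	Columns map[string][]byte
}

// OutboxConfig names the designated columns. IDColumn, KeyColumn and
// ValueColumn mirror id-column, key-column and value-column from the
// proposal; the remaining fields are illustrative assumptions.
type OutboxConfig struct {
	IDColumn      string
	KeyColumn     string
	ValueColumn   string
	TopicColumn   string            // optional: column whose value selects the topic
	DefaultTopic  string            // used when TopicColumn is unset or empty in the row
	HeaderColumns map[string]string // column name -> Kafka header name
}

// Header is one Kafka record header.
type Header struct {
	Key   string
	Value []byte
}

// Message is the record handed to the producer, with no envelope around Value.
type Message struct {
	Topic   string
	Key     []byte
	Value   []byte
	Headers []Header
}

// ErrMissingColumn is returned when the id or value column is absent.
var ErrMissingColumn = errors.New("outbox: required column missing from row")

// Encode maps an INSERT to a Message. UPDATE and DELETE events are
// discarded, which is signalled here by returning (nil, nil).
func Encode(cfg OutboxConfig, ev RowEvent) (*Message, error) {
	if ev.Type != Insert {
		return nil, nil
	}
	id, ok := ev.Columns[cfg.IDColumn]
	if !ok {
		return nil, ErrMissingColumn
	}
	value, ok := ev.Columns[cfg.ValueColumn]
	if !ok {
		return nil, ErrMissingColumn
	}
	msg := &Message{
		Topic:   cfg.DefaultTopic,
		Key:     ev.Columns[cfg.KeyColumn], // nil key if the column is absent (assumption)
		Value:   value,                     // raw bytes exactly as stored
		Headers: []Header{{Key: "Id", Value: id}},
	}
	// Per-row topic routing: a non-empty topic column overrides the default.
	if t := ev.Columns[cfg.TopicColumn]; len(t) > 0 {
		msg.Topic = string(t)
	}
	// Promote configured columns to headers, in sorted column order so the
	// output is deterministic.
	cols := make([]string, 0, len(cfg.HeaderColumns))
	for col := range cfg.HeaderColumns {
		cols = append(cols, col)
	}
	sort.Strings(cols)
	for _, col := range cols {
		if v, ok := ev.Columns[col]; ok {
			msg.Headers = append(msg.Headers, Header{Key: cfg.HeaderColumns[col], Value: v})
		}
	}
	return msg, nil
}
```

With a configuration such as OutboxConfig{IDColumn: "id", KeyColumn: "aggregate_id", ValueColumn: "payload", TopicColumn: "topic"}, an INSERT whose topic column holds "orders" would be sent to the orders topic with the payload bytes untouched, while a housekeeping DELETE on the same table would produce no message at all.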

Describe alternatives you've considered

The primary alternative is having the application publish to Kafka directly. However, without support for XA transactions, this loses the transactional benefits of the outbox pattern and adds complexity to the application.

Teachability, Documentation, Adoption, Migration Strategy

The primary users would be teams that are already implementing the transactional outbox pattern with their existing database and considering migrating to TiDB. For reference, you can see how Debezium approaches this pattern here - https://debezium.io/documentation/reference/stable/transformations/outbox-event-router.html

I have also been working on a proof of concept for this idea here - https://github.com/benmeadowcroft/ticdc/tree/outbox-support

Metadata

Labels

contribution: This PR is from a community contributor.
first-time-contributor: Indicates that the PR was contributed by an external member and is a first-time contributor.
