Replicon v1.0.2

Replicon is a high-throughput, lightweight, single-node per datacenter (DC) JSON ingest + replication engine over HTTP/HTTPS.

It accepts JSON payloads via POST /ingress/{channel}/add (a single object or an array of objects), buffers them durably (file queue or Redis/Valkey Streams), writes to local storage in the background (file / PostgreSQL / MySQL), and asynchronously replicates events between DC nodes without cross-DC quorum, so local writes never block on network partitions.

This repository is binary-only and contains deployable runtime assets (binaries, configs, install/uninstall scripts).

At a glance

  • HTTP and HTTPS ingest endpoints
  • Dynamic channels via channels.yml (per-channel API keys + validation, optional schema mapping)
  • Async pipeline: ingest -> durable queue -> writer workers -> storage
  • Queue backends: file, redis (Streams + consumer group + DLQ)
  • Storage backends: file, postgres, mysql, noop
  • Pull-based replication over HTTP or HTTPS (per-peer + per-channel cursors)
  • SQL schema auto-create on startup (tables + optional indexes from channels.yml)

Throughput note:

  • In lab tests with HTTP keep-alive and small JSON payloads, ingest acceptance has reached ~5000-8000 req/s on small 1-2 vCPU nodes. Actual throughput depends on payload size, validation, TLS, queue/storage backend, and writer batch settings.

Repository layout (published release repo)

.
├── README.md
├── INSTALL.md
├── CHANGELOG.md
├── install.sh
├── uninstall.sh
├── SHA256SUMS.txt
├── bin/
│   ├── replicon-linux-amd64
│   ├── replicon-linux-arm64
│   ├── replicon-linux-arm
│   ├── replicon-darwin-amd64
│   └── replicon-darwin-arm64
└── conf/
    ├── replicon.conf
    ├── channels.yml
    └── replicon.service

Quick install

Linux (default backend: PostgreSQL):

curl -fsSL https://raw.githubusercontent.com/jwillberg/replicon/main/install.sh -o install.sh
chmod +x install.sh
./install.sh --repo jwillberg/replicon --version v1.0.2 --provision

MySQL backend:

./install.sh --repo jwillberg/replicon --version v1.0.2 --db-backend mysql --provision

More install options are in INSTALL.md.

Installed paths (Linux installer defaults)

  • binary: /usr/local/bin/replicon
  • config: /etc/replicon/replicon.conf
  • channels: /etc/replicon/channels.yml
  • systemd unit: /etc/systemd/system/replicon.service
  • runtime data dir: /var/lib/replicon

The default systemd unit sets WorkingDirectory=/var/lib/replicon, so relative paths like data/queue/... and data/db/... in replicon.conf resolve under /var/lib/replicon. It also sets StartLimitIntervalSec=60 and StartLimitBurst=5 to avoid tight restart loops on startup errors.

Tested platforms (2026-02-19)

Smoke-tested end-to-end with:

  • install.sh --provision
  • file backend ingest test
  • redis + postgres ingest test
  • redis + mysql ingest test
  • uninstall.sh --purge

Tested operating systems:

  • Ubuntu 24.04
  • Ubuntu 22.04
  • Debian 13 (trixie)
  • Debian 12 (bookworm)
  • Fedora
  • CentOS
  • Rocky Linux 10
  • AlmaLinux 10
  • openSUSE Leap 16.0

Supported by installer package-manager logic:

  • apt-get (Ubuntu, Debian)
  • dnf / yum (Fedora, Rocky, Alma, CentOS)
  • zypper (openSUSE, SLES)

Key goals

  • Always writable per DC: local writes succeed even if inter-DC links are down.
  • Eventual consistency between DCs via pull-based replication.
  • Simple ops: no Kafka, no DB clustering required.
  • Idempotent replication: safe to retry, no duplicates on replication apply.

Architecture (per DC)

  • HTTP ingest API: POST /ingress/{channel}/add
  • Durable queue:
    • file backend (append-only log + ack cursor), or
    • redis backend (Redis/Valkey Streams + consumer group + DLQ)
  • Writer: consumes queue asynchronously and writes to configured storage backend
  • Replicator: pulls missing events from peers and applies locally

Any load balancer can route traffic to any healthy DC node.

Data model (minimal)

Replicon works best with a canonical append-only event log.

Canonical event identity

Each DC produces its own monotonic sequence:

  • origin_dc (string, e.g. dc1) is node_name
  • origin_seq (monotonic integer per origin_dc)

Primary key: (origin_dc, origin_seq)

This avoids global counters and makes replication trivial:

  • "Give me events after origin_seq=N for origin_dc=dc2"

Canonical raw log table (events_raw)

When canonical raw storage is needed (any raw_only channel, or any structured channel with store_raw=true), the SQL backends use a fixed canonical table name: events_raw.

Minimal schema (PostgreSQL):

CREATE TABLE IF NOT EXISTS events_raw (
  origin_dc    TEXT        NOT NULL,
  origin_seq   BIGINT      NOT NULL DEFAULT nextval('replicon_origin_seq'),
  node_name    TEXT        NOT NULL,
  channel      TEXT        NOT NULL,
  received_at  TIMESTAMPTZ NOT NULL DEFAULT now(),
  json_data    JSONB       NOT NULL,
  PRIMARY KEY (origin_dc, origin_seq)
);

Minimal schema (MySQL/MariaDB):

CREATE TABLE IF NOT EXISTS events_raw (
  origin_dc    VARCHAR(191) NOT NULL,
  origin_seq   BIGINT       NOT NULL,
  node_name    VARCHAR(191) NOT NULL,
  channel      VARCHAR(191) NOT NULL,
  received_at  DATETIME(6)  NOT NULL DEFAULT CURRENT_TIMESTAMP(6),
  json_data    JSON         NOT NULL,
  PRIMARY KEY (origin_dc, origin_seq)
) ENGINE=InnoDB;

Replication cursor table (replication_progress)

Replication progress is tracked per peer and per channel:

CREATE TABLE IF NOT EXISTS replication_progress (
  peer_dc                  TEXT/VARCHAR NOT NULL,
  channel                  TEXT/VARCHAR NOT NULL,
  last_origin_seq_received BIGINT       NOT NULL DEFAULT 0,
  updated_at               TIMESTAMPTZ/DATETIME NOT NULL,
  PRIMARY KEY (peer_dc, channel)
);

The schema is auto-created on startup when storage_backend=postgres or storage_backend=mysql.

Runtime flow

  1. Client sends JSON object (single) or JSON array of objects (batch) to POST /ingress/{channel}/add
  2. Replicon validates auth + channel rules
  3. Event is accepted to queue (202 Accepted)
  4. Writer workers flush queued events to storage in batches
  5. Replicator pulls missing events from peers and applies locally

HTTP API

Health

  • GET /healthz

Returns node and ingest/writer status.

It includes:

  • node (node_name)
  • gomaxprocs (effective CPU parallelism)
  • ingest stats: queue_backend, queue_depth, storage_backend, writer_workers, counters, last_error

Public ingest

  • POST /ingress/{channel}/add
  • Headers:
    • Authorization: Bearer <channel_api_key>
    • Content-Type: application/json (recommended)

Body:

  • may be:
    • one JSON object
    • one JSON array of JSON objects
  • for array input, item count is limited by max_batch_items
  • max size is enforced by max_body_bytes
  • structured channels enforce per-channel schema/validation rules from channels.yml (per item)

Response codes:

  • 202 accepted to queue (accepted_count included for batch requests)
  • 400 invalid JSON / schema validation error / invalid batch shape
  • 401 auth error (missing/invalid API key)
  • 404 unknown/disabled channel
  • 405 method not allowed
  • 413 payload too large
  • 429 queue is full (only with queue_backend=file when queue_buffer is reached)
  • 503 enqueue failed / timeout

Batch enqueue note:

  • if enqueue fails after some items were already accepted, response is JSON with:
    • status: "partial"
    • accepted_count
    • failed_index
    • error
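
The partial-status fields above suggest a simple client-side resume rule: everything from failed_index onward was not queued and should be resent. The helper below is an illustrative sketch, not part of Replicon; only the response field names come from this README.

```python
# Sketch: resuming a batch after a "partial" enqueue response.
# Field names (status, accepted_count, failed_index) are from the README;
# the helper itself is hypothetical client-side code.

def remaining_items(batch, response):
    """Return the slice of `batch` to retry after a partial enqueue.

    The first `accepted_count` items were durably queued; the item at
    `failed_index` (and everything after it) was not, so resend from there.
    """
    if response.get("status") != "partial":
        return []
    return batch[response["failed_index"]:]

batch = [{"event": "a"}, {"event": "b"}, {"event": "c"}, {"event": "d"}]
resp = {"status": "partial", "accepted_count": 2, "failed_index": 2,
        "error": "enqueue timeout"}
retry = remaining_items(batch, resp)  # the two items that were not queued
```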

Example:

curl -X POST "http://127.0.0.1:8080/ingress/events_raw/add" \
  -H "Authorization: Bearer key-abc123" \
  -H "Content-Type: application/json" \
  -d '{"source":"collector-1","event":"login_fail","ts":"2026-02-19T12:00:00Z"}'

Structured channel example (abuse_reports after enabling it in channels.yml):

curl -X POST "http://127.0.0.1:8080/ingress/abuse_reports/add" \
  -H "Authorization: Bearer key-abuse-001" \
  -H "Content-Type: application/json" \
  -d '{"report_time":"2026-02-19T12:00:00Z","attacker_ip":"1.2.3.4","reporter_ip":"5.6.7.8","service":"ssh","comment":"blocked by firewall"}'

Batch example (same endpoint):

curl -X POST "http://127.0.0.1:8080/ingress/events_raw/add" \
  -H "Authorization: Bearer key-abc123" \
  -H "Content-Type: application/json" \
  -d '[{"source":"collector-1","event":"login_fail","ts":"2026-02-19T12:00:00Z"},{"source":"collector-1","event":"portscan","ts":"2026-02-19T12:00:01Z"}]'

Expected client-visible failures:

# wrong API key -> 401
curl -i -X POST "http://127.0.0.1:8080/ingress/events_raw/add" \
  -H "Authorization: Bearer wrong-key" \
  -H "Content-Type: application/json" \
  -d '{"source":"collector-1"}'

# disabled example channel (default conf) -> 404
curl -i -X POST "http://127.0.0.1:8080/ingress/abuse_reports/add" \
  -H "Authorization: Bearer key-abuse-001" \
  -H "Content-Type: application/json" \
  -d '{"report_time":"2026-02-19T12:00:00Z","attacker_ip":"1.2.3.4","reporter_ip":"5.6.7.8","service":"ssh"}'

# invalid batch item -> 400
curl -i -X POST "http://127.0.0.1:8080/ingress/events_raw/add" \
  -H "Authorization: Bearer key-abc123" \
  -H "Content-Type: application/json" \
  -d '[{"source":"collector-1"}, 123]'

Client implementation checklist:

  • Reuse connections (HTTP keep-alive).
  • Keep batch item count <= max_batch_items when using array payloads.
  • Retry on 429 and 503 with exponential backoff + jitter.
  • Treat 202 as accepted-to-queue (asynchronous write).
  • Treat 4xx as client/config issue (fix payload/key/channel before retrying).

Client connection behavior (important)

  • Reuse HTTP connections (keep-alive on). Do not open/close a new TCP connection for every event.
  • Keep one long-lived HTTP client instance per process and use its connection pool.
  • For HTTPS, connection reuse matters even more because TLS handshakes are expensive.
  • For load tests, ab without -k usually underestimates throughput because of connect/TIME_WAIT overhead.
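
The retry guidance above (429/503 with exponential backoff plus jitter) can be sketched as pure client-side logic. The constants here (base delay, cap, attempt limit) are illustrative defaults, not Replicon settings.

```python
import random

# Sketch of the client retry rule described above: retry only 429/503,
# with full-jitter exponential backoff. All numbers are illustrative.

RETRYABLE = {429, 503}

def backoff_delay(attempt, base=0.2, cap=30.0, rng=random.random):
    """Full-jitter backoff: uniform in [0, min(cap, base * 2**attempt))."""
    return rng() * min(cap, base * (2 ** attempt))

def should_retry(status, attempt, max_attempts=8):
    # 202 is success (accepted-to-queue); other 2xx and all 4xx except 429
    # are treated as client/config issues and not retried.
    return status in RETRYABLE and attempt < max_attempts
```

Combined with a single long-lived HTTP client per process (keep-alive on), this avoids both retry storms and per-event TCP/TLS handshakes.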

Internal replication endpoint

  • GET /internal/events?channel=<name>&from_seq=<n>&limit=<k>
  • Header: X-Replicon-Token: <replication_token>

Used by Replicon peers only.

Notes:

  • limit is capped at 5000.
  • Only local-origin events are returned (where origin_dc == node_name).

Channel modes

Replicon supports two ingest/write behaviors per channel:

  1. raw_only
  • Payload accepts one JSON object or an array of JSON objects.
  • With SQL storage (Postgres/MySQL): payload is stored in canonical raw log (events_raw.json_data).
  • With file storage: payload is stored as-is in JSONL file (no SQL schema mapping).
  2. structured
  • Payload is validated against an allowlist of columns and types.
  • With SQL storage (Postgres/MySQL): payload is mapped into a structured table (created at startup if missing).
  • store_raw=true: structured row + canonical raw copy in events_raw
  • store_raw=false: structured-only write (no events_raw copy)
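
Structured-mode validation (column allowlist, required fields, and the on_unknown_field policy) can be sketched as follows. The function and error strings are hypothetical; only the rules themselves come from this README.

```python
# Illustrative sketch of structured-mode validation: allowlisted columns,
# required fields, and on_unknown_field = reject|drop. Not Replicon's code.

def validate_item(item, columns, required, on_unknown_field="reject"):
    """Return (cleaned_item, error); error is None when the item passes."""
    unknown = [k for k in item if k not in columns]
    if unknown and on_unknown_field == "reject":
        return None, f"unknown field: {unknown[0]}"
    missing = [c for c in required if c not in item]
    if missing:
        return None, f"missing required field: {missing[0]}"
    # drop mode: silently discard unknown keys, keep allowlisted ones
    cleaned = {k: v for k, v in item.items() if k in columns}
    return cleaned, None

cols = ["report_time", "attacker_ip", "service", "comment"]
req = ["report_time", "attacker_ip", "service"]
```

For array payloads this check runs per item, so a single bad item fails the whole request with 400 (see the batch example below the response-code list).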

channels.yml (dynamic channels)

Replicon channels are defined in channels.yml.

  • Top-level general: shared defaults (optional)
  • channels.<name>: per-channel overrides

Default template contains:

  • events_raw: baseline raw_only channel (enabled)
  • abuse_reports: structured example (disabled by default)

Recommended model

  • Put shared defaults in top-level general.
  • Add only channel-specific differences under each channel key.
  • Keep optional examples disabled with enabled: false.

channels.yml schema reference

Top-level:

  • channels is required (must contain at least one channel key)
  • general is optional (shared defaults for all channels)

general keys (all optional):

  • enabled (true|false, default true)
  • api_keys (list of bearer keys)
  • mode (raw_only|structured, default raw_only)
  • on_unknown_field (reject|drop, default reject, structured mode only)
  • store_raw (true|false, default true, structured mode only)

Per-channel keys:

  • enabled (true|false, default from general.enabled, otherwise true)
    • enables is accepted as a typo-friendly alias for enabled
  • api_keys (required for enabled channel unless inherited from general.api_keys)
  • mode (raw_only|structured, default from general.mode, otherwise raw_only)
  • on_unknown_field (reject|drop, default from general.on_unknown_field, otherwise reject, structured mode only)
  • store_raw (true|false, default from general.store_raw, otherwise true, structured mode only)
  • table (required only when mode=structured)
  • columns (required only when mode=structured)
  • required (optional; must be subset of columns)
  • field_types (optional map; default type is text for unspecified columns)
  • indexes (optional list; structured mode only)
  • indexes[].name (optional; auto-generated if omitted)
  • indexes[].unique (optional; default false)
  • indexes[].columns (required; list of column or column DESC)

Example channels.yml

general:
  enabled: true
  api_keys: [key-abc123]
  on_unknown_field: reject
  store_raw: true

channels:
  # baseline raw channel name (you can rename/remove this)
  events_raw:
    mode: raw_only

  # structured example (disabled by default)
  abuse_reports:
    enabled: false
    api_keys: [key-abuse-001]
    mode: structured
    store_raw: false
    table: abuse_reports
    columns: [report_time, attacker_ip, reporter_ip, service, comment]
    required: [report_time, attacker_ip, reporter_ip, service]
    field_types:
      report_time: timestamptz
      attacker_ip: inet
      reporter_ip: inet
      service: text
      comment: text

Structured field_types supported

  • text (varchar, character varying, char, character aliases)
  • timestamptz (timestamp with time zone)
  • timestamp (timestamp without time zone)
  • date
  • time (time without time zone)
  • timetz (time with time zone)
  • inet, cidr, macaddr, macaddr8
  • smallint, integer, bigint
  • boolean
  • real, double precision
  • numeric (decimal alias)
  • uuid
  • bytea
  • jsonb (json alias)
  • one-dimensional arrays with [] suffix (for supported base types)

Also accepted as aliases with modifiers:

  • varchar(n), character varying(n), char(n), character(n) -> text
  • numeric(p,s), decimal(p,s) -> numeric
  • timestamp(n) -> timestamp
  • timestamp(n) with time zone -> timestamptz
  • time(n) -> time
  • time(n) with time zone -> timetz

Also accepted as common SQL synonyms:

  • int2 -> smallint
  • int / int4 -> integer
  • int8 -> bigint
  • bool -> boolean
  • float4 -> real
  • float8 / float -> double precision
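
The alias and synonym tables above amount to a small normalization step: strip length/precision modifiers, fold the varchar family into text, and map common SQL synonyms to canonical names. This sketch mirrors the tables; the function itself is illustrative, not Replicon's implementation.

```python
import re

# Sketch of field_types normalization per the alias/synonym tables above.

SYNONYMS = {
    "int2": "smallint", "int": "integer", "int4": "integer", "int8": "bigint",
    "bool": "boolean", "float4": "real", "float8": "double precision",
    "float": "double precision", "decimal": "numeric", "json": "jsonb",
}

def normalize_type(t):
    """Strip (n)/(p,s) modifiers, fold varchar-family to text, apply synonyms."""
    t = t.strip().lower()
    t = re.sub(r"\(\s*\d+(\s*,\s*\d+)?\s*\)", "", t)  # drop (n) / (p,s)
    t = re.sub(r"\s+", " ", t).strip()
    if t in ("varchar", "character varying", "char", "character"):
        return "text"
    if t == "timestamp with time zone":
        return "timestamptz"
    if t == "time with time zone":
        return "timetz"
    return SYNONYMS.get(t, t)
```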

Dialect notes:

  • PostgreSQL uses native types
  • MySQL stores network/timezone-specific types as validated strings/UTC timestamps
  • MySQL stores jsonb and arrays as JSON-compatible storage

Index support (structured channels only)

Structured channel indexes are optional and created at startup.

Notes:

  • Index names must be unique across all enabled channels.
  • Index columns may include:
    • structured payload columns defined in columns
    • canonical structured columns: origin_dc, origin_seq, node_name, channel, received_at
  • DESC ordering is accepted (column DESC), but exact behavior depends on your MySQL/MariaDB version.

SQL schema management (auto)

When storage_backend=postgres or storage_backend=mysql, Replicon auto-creates:

  • per-DC origin sequence (replicon_origin_seq)
  • canonical raw table events_raw + indexes (only if needed by enabled channels)
  • structured channel tables from channels.yml (and optional indexes)
  • replication cursor table replication_progress (per peer + per channel)

replicon.conf essentials

Important keys:

  • Identity/runtime:
    • node_name
    • channels_file
    • cpu (-1, *, or auto = runtime default/all cores; N = fixed thread count)
    • max_body_bytes
    • max_batch_items (max JSON objects when ingest body is an array)
  • Security:
    • client_api_key (fallback if channels file missing)
    • replication_token
  • Listeners:
    • http_listen
    • https_enabled, https_listen, tls_cert_file, tls_key_file
  • Queue:
    • queue_backend=file|redis
    • Redis Streams settings (queue_redis_*)
  • Writer:
    • writer_enabled
    • writer_workers
    • writer_batch_size
    • writer_batch_flush_ms
  • Storage:
    • storage_backend=file|postgres|mysql|noop
    • storage_postgres_dsn / storage_mysql_dsn
  • Replication:
    • replication_enabled
    • replication_peer (repeatable or comma-separated on one line)
    • replication_poll_interval_ms
    • replication_batch_limit

Replication model

  • Pull-based (no cross-DC quorum)
  • Per-channel cursors tracked in replication_progress
  • Idempotent apply on canonical keys (origin_dc, origin_seq)
  • Works over HTTP or HTTPS depending on peer URL (http:// / https://)
  • Requires storage_backend=postgres or storage_backend=mysql (replication reads/writes via SQL)

Timeouts and recovery:

  • replication_http_timeout_ms is per request to a peer.
  • If a peer is down for minutes/hours, Replicon keeps polling and will catch up when connectivity returns.
  • Replication follows the enabled channel set from channels.yml (one cursor per peer + channel).

Example peers:

replication_enabled = true
replication_peer = dc2=http://10.0.1.2:8080
replication_peer = dc3=https://dc3.example.com

Replication loop (conceptually):

  1. read last seq for (peer, channel) from replication_progress
  2. call peer /internal/events?channel=<name>&from_seq=last+1&limit=batch_limit
  3. insert idempotently into local SQL storage
  4. update cursor to newest seq applied
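
The four steps above can be sketched with fetch and apply injected, so the control flow is visible. In Replicon the fetch is GET /internal/events and the apply is an idempotent SQL insert keyed on (origin_dc, origin_seq); the function names here are hypothetical.

```python
# Conceptual sketch of the per-(peer, channel) pull loop described above.
# fetch/apply are injected stand-ins for the HTTP call and SQL insert.

def replicate_channel(peer, channel, last_seq, fetch, apply, batch_limit=1000):
    """Advance one (peer, channel) cursor as far as the peer allows.

    Returns the newest origin_seq applied; the caller persists it to
    replication_progress. apply() must be idempotent on the primary key,
    so retries after a crash cannot create duplicates.
    """
    while True:
        events = fetch(peer, channel, from_seq=last_seq + 1, limit=batch_limit)
        if not events:
            return last_seq                  # caught up
        for ev in events:
            apply(ev)                        # idempotent on (origin_dc, origin_seq)
        last_seq = events[-1]["origin_seq"]  # newest seq applied
```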

Structured-only replication:

  • For structured channels with store_raw=false, Replicon replicates events by reading/writing the structured table directly (with the same (origin_dc, origin_seq) primary key).

Backend and queue recommendations

  • Small/simple node:
    • queue_backend=file, storage_backend=file
  • Production SQL ingest:
    • queue_backend=redis, storage_backend=postgres or mysql
  • Benchmark API overhead only:
    • storage_backend=noop

Native HTTPS

Enable in conf/replicon.conf (or /etc/replicon/replicon.conf):

https_enabled = true
https_listen = 0.0.0.0:8443
tls_cert_file = /etc/replicon/tls/server.crt
tls_key_file = /etc/replicon/tls/server.key

With HTTPS enabled, Replicon can serve both:

  • HTTP on http_listen (if set)
  • HTTPS on https_listen

To run HTTPS-only, leave http_listen empty and keep https_enabled=true.

CPU concurrency

Configure cpu in replicon.conf:

# Use runtime default (recommended)
cpu = -1

# Same as above:
# cpu = *
# cpu = auto

# Pin to 2 OS threads
# cpu = 2

Behavior:

  • cpu = -1, *, or auto: keep Go runtime default GOMAXPROCS
  • cpu = N (N > 0): force GOMAXPROCS=N

Check effective value:

curl -s http://127.0.0.1:8080/healthz

Queue + writer backends

Writer batching controls:

  • writer_batch_size: max events per storage write call.
  • writer_batch_flush_ms: max wait for filling a batch before flushing a smaller batch.

Start points for SQL backends:

  • writer_batch_size=200
  • writer_batch_flush_ms=20
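
The two controls interact as "flush on size or on age, whichever comes first". This is an illustrative sketch of that rule, not Replicon's writer; timestamps are passed in explicitly so the behavior is easy to follow.

```python
# Sketch of the writer batching rule: flush when the batch reaches
# writer_batch_size, or when writer_batch_flush_ms has elapsed since the
# first buffered event. Illustrative only.

class Batcher:
    def __init__(self, batch_size=200, flush_ms=20):
        self.batch_size = batch_size
        self.flush_ms = flush_ms
        self.buf = []
        self.first_at = None  # ms timestamp of the oldest buffered event

    def add(self, event, now_ms):
        """Buffer one event; return a full batch to write, or None."""
        if not self.buf:
            self.first_at = now_ms
        self.buf.append(event)
        if len(self.buf) >= self.batch_size:
            return self._flush()
        return None

    def poll(self, now_ms):
        """Timer tick: flush a partial batch once flush_ms has elapsed."""
        if self.buf and now_ms - self.first_at >= self.flush_ms:
            return self._flush()
        return None

    def _flush(self):
        batch, self.buf, self.first_at = self.buf, [], None
        return batch
```

Larger batch_size amortizes per-statement SQL overhead; smaller flush_ms bounds the extra latency a lone event waits before reaching storage.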

File queue (no Redis required)

queue_backend = file
queue_file_path = data/queue/ingest.log
queue_file_ack_path = data/queue/ingest.ack
queue_file_sync_ms = 200
queue_buffer = 10000

writer_enabled = true
writer_workers = 1
writer_batch_size = 200
writer_batch_flush_ms = 20

storage_backend = file
storage_file_path = data/db/events_raw.jsonl
storage_file_sync_ms = 200

Notes:

  • File queue is durable with periodic fsync (queue_file_sync_ms).
  • Backpressure is enforced via queue_buffer:
    • if backlog reaches queue_buffer, ingest returns 429.

Redis/Valkey queue (Streams)

queue_backend = redis

# Option A: TCP
# queue_redis_network = tcp
# queue_redis_addr = 127.0.0.1:6379

# Option B: Unix socket
# queue_redis_network = unix
# queue_redis_addr = /var/run/redis/redis.sock

queue_redis_network = tcp
queue_redis_addr = 127.0.0.1:6379
queue_redis_stream = replicon:ingest
queue_redis_group = writer
queue_redis_consumer = replicon
queue_redis_dlq_stream = replicon:ingest:dlq
queue_redis_maxlen = 5000000
queue_redis_max_retries = 5
queue_redis_timeout_ms = 1500

Notes:

  • Redis backend uses XADD + XREADGROUP + XACK (consumer group).
  • Failed events are re-queued with incremented retry counter up to queue_redis_max_retries, then moved to queue_redis_dlq_stream.
  • If you set queue_redis_maxlen, Redis will trim the stream approximately: if writers cannot keep up, old queue entries may be dropped (data loss).
  • For durability, enable AOF in Redis/Valkey (appendonly yes, appendfsync everysec).

Storage backends

storage_backend=file

  • Writes append-only JSONL to storage_file_path.
  • Intended for simple setups and debugging, not for fast querying.
  • Structured table projection is not available (structured mode still validates payloads).

storage_backend=postgres

storage_backend = postgres
storage_postgres_dsn = postgres://replicon:replicon-secret@127.0.0.1:5432/replicon?sslmode=disable
storage_postgres_connect_timeout_ms = 3000
storage_postgres_max_open_conns = 16
storage_postgres_max_idle_conns = 16
storage_postgres_conn_max_lifetime_min = 30

storage_backend=mysql

storage_backend = mysql
storage_mysql_dsn = replicon:replicon-secret@tcp(127.0.0.1:3306)/replicon?parseTime=true
storage_mysql_connect_timeout_ms = 3000
storage_mysql_max_open_conns = 16
storage_mysql_max_idle_conns = 16
storage_mysql_conn_max_lifetime_min = 30

storage_backend=noop

  • Discards writes (benchmark/testing).

Client best practices

  • Reuse connections (HTTP keep-alive)
  • Prefer persistent client pools (do not reconnect every event)
  • Retry 429/503 with exponential backoff + jitter
  • Treat 202 as accepted-to-queue (async write)

Verification commands

Service + health:

systemctl status replicon --no-pager
curl -s http://127.0.0.1:8080/healthz

Quick ingest test:

curl -X POST "http://127.0.0.1:8080/ingress/events_raw/add" \
  -H "Authorization: Bearer key-abc123" \
  -H "Content-Type: application/json" \
  -d '{"source":"smoke","ts":"2026-02-19T12:00:00Z"}'

SQL verification (optional):

PostgreSQL:

psql "postgres://replicon:replicon-secret@127.0.0.1:5432/replicon?sslmode=disable" -c "SELECT COUNT(*) FROM events_raw;"

MySQL/MariaDB:

mysql -u replicon -p -h 127.0.0.1 -P 3306 replicon -e "SELECT COUNT(*) FROM events_raw;"

Notes for Alma/RHEL-family

Installer v1.0.2 supports Redis-compatible package fallback on newer RHEL-family systems (for example valkey where redis package is not available).

Uninstall

./uninstall.sh
./uninstall.sh --purge

More docs

  • INSTALL.md for install/provision options
  • CHANGELOG.md for release history

About

Replicon is a high-throughput, multi-DC event ingest and replication service with dynamic channel schemas, PostgreSQL / MySQL storage, and optional Redis/file queue backends.
