A distributed, semantically sharded graph database for trillion-edge workloads.
rhizome is a horizontally scalable graph database designed for very large property graphs (billions of vertices, trillions of edges) that need to behave like a single logical database. It combines:
- External consistency: Spanner-style commit-wait over a TrueTime-bounded clock, layered on top of Multi-Raft for per-shard replication;
- Serializable Snapshot Isolation: optimistic MVCC with a Cahill-Fekete-Röhm conflict tracker and a Percolator-style two-phase commit for cross-shard writes;
- Semantic sharding: instead of hashing IDs, the orchestrator continuously partitions the graph along communities discovered by the Leiden algorithm, so vertices that are frequently traversed together physically live together;
- A native vector index: HNSW + Product Quantization sits inside the same storage engine that holds the graph, with one query language and one transaction model;
- A GraphQL query surface: every CRUD, traversal, aggregation, KNN, and transaction lifecycle operation is exposed through one schema, with a cost-based optimiser pushing predicates down to the right shard;
- Autonomous operation: a Kotlin control plane closes the loop over placement, schema evolution, deadlock detection, and load forecasting without asking an SRE to pick a shard count.
- Highlights
- Compared with the rest of the landscape
- Architecture
- Getting Started
- Querying rhizome
- Observability
- Testing
- Linting
- Roadmap
- Documentation
- Contributing
- Code of Conduct
- Get in touch!
- Security
- Citing rhizome
- Useful Links
- License
| Capability | What we provide |
|---|---|
| Storage | Custom LSM on top of RocksDB with graph-aware compaction, SuRF/Xor presence filters, an ARC cache with a neighbour prefetcher, and AES-GCM-SIV field encryption. |
| Consensus | Multi-Raft with pipeline batching, joint consensus, witness nodes for geo-replication, pre-vote, lease-based reads, and credit-based flow control. |
| Transactions | Optimistic MVCC + SSI for read-write conflicts, Percolator-style 2PC for cross-shard writes, hybrid global deadlock detector, epoch-based GC. |
| Time | TrueTime-style oracle with explicit uncertainty; commit-wait guarantees external consistency; PTP-mode drops |
| Vector search | HNSW + Product Quantization living inside the same SSTables as the graph; hot vectors in RAM, cold tiers on disk; one transaction covers both. |
| Query | GraphQL gateway with a cost-based optimiser, gRPC for low-overhead clients, BSP traversal engine with push/pull dispatch, native cursor pagination over MVCC scans. |
| Schema evolution | Git-like DAG of schema versions, expand-contract online migration, DBSCAN-driven type inference suggestions. |
| Autonomy | Leiden-based community detection drives semantic placement; an NSGA-II optimiser balances load vs. locality vs. migration cost; TFT load forecasting. |
| Observability | Prometheus metrics on every layer, pre-built Grafana dashboards, OpenTelemetry traces end-to-end (gateway → orchestrator → node → raft). |
| Security | Mandatory mTLS between every internal hop, attribute-based access control at the transaction layer, cryptographic shredding for GDPR right-to-be-forgotten. |
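The Autonomy row talks about balancing load against locality. As a rough illustration of what those two objectives measure, the sketch below scores a placement by its edge-cut fraction (locality) and its load imbalance. The function names, the toy graph, and the two placements are assumptions made for this example; they are not rhizome's actual objective functions.

```python
from collections import Counter


def edge_cut_fraction(edges, placement):
    """Share of edges whose endpoints live on different shards (lower is more local)."""
    if not edges:
        return 0.0
    cut = sum(1 for u, v in edges if placement[u] != placement[v])
    return cut / len(edges)


def load_imbalance(placement, shard_count):
    """Largest shard size divided by the mean shard size (1.0 means perfectly even)."""
    sizes = Counter(placement.values())
    mean = len(placement) / shard_count
    return max(sizes.values()) / mean


# Toy graph (illustrative only): two triangles joined by one bridge edge.
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]

# Placement that follows the two communities versus a plain modulo hash.
by_community = {1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1}
by_hash = {v: v % 2 for v in range(1, 7)}

community_cut = edge_cut_fraction(edges, by_community)
hash_cut = edge_cut_fraction(edges, by_hash)
```

Both placements are equally balanced on this graph, but the community-aligned one cuts only the bridge edge while the hash placement cuts most edges, which is the kind of gap semantic sharding aims to exploit.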
There is no shortage of graph databases, and we sit in good company. The tables below place rhizome next to the systems we get compared to most often. They are split into three thematic blocks so each fits on a screen.
Roster legend: LPG = property graph, RDF = triple/quad store, SQL = distributed SQL system included for the transaction-protocol comparison only.
| System | Type | License | Lang | Native graph store? | Embeddable? |
|---|---|---|---|---|---|
| Neo4j | LPG | GPLv3 (Community) + commercial | Java | yes | both |
| TigerGraph | LPG | commercial (free Developer Ed.) | C++ | yes (GSE) | standalone |
| Dgraph | LPG | Apache 2.0 | Go | yes (badger) | standalone |
| NebulaGraph | LPG | Apache 2.0 | C++ | yes (RocksDB) | standalone |
| JanusGraph | LPG | Apache 2.0 | Java | no (runs over Cassandra/HBase/BerkeleyDB) | both |
| Memgraph | LPG | BSL 1.1 / Apache 2.0 (MAGE) | C++ | yes (in-memory) | standalone |
| ArangoDB | LPG | Apache 2.0 / commercial | C++ | yes (RocksDB) | standalone |
| FalkorDB | LPG | SSPL (ex-RedisGraph) | C | yes (sparse matrix) | standalone (Redis module) |
| KuzuDB | LPG | MIT | C++ | yes | embedded |
| Stardog | RDF | commercial | Java | yes | standalone |
| Apache Jena | RDF | Apache 2.0 | Java | yes (TDB) | embeddable |
| Virtuoso | RDF | GPLv2 / commercial | C | yes | standalone |
| AllegroGraph | RDF | commercial | Common Lisp | yes | standalone |
| YugabyteDB | SQL | Apache 2.0 | C++ | n/a | standalone |
| CockroachDB | SQL | BSL 1.1 (CCL bits) | Go | n/a | standalone |
| Spanner | SQL | proprietary (Google Cloud) | C++ | n/a | standalone |
| 🌱 rhizome | LPG | MIT | Rust + Kotlin | yes (bedrock) | standalone |
| System | Distribution | Replication | Default consistency | Cross-shard tx | External consistency | Multi-region | Built-in mTLS |
|---|---|---|---|---|---|---|---|
| Neo4j | replicated (Causal Cluster); sharding via Fabric | raft | causal | no (read-only Fabric joins) | no | enterprise | yes |
| TigerGraph | sharded + replicated | proprietary sync | tunable | yes (ACID) | no | enterprise | yes |
| Dgraph | sharded + replicated | raft (per group) | linearizable reads | yes | no | yes | yes |
| NebulaGraph | sharded + replicated | raft | strong (leader reads) | yes (SI) | no | yes | yes |
| JanusGraph | inherits backend | backend-dependent | eventual (Cassandra default) | backend-dependent | no | backend-dependent | partial |
| Memgraph | single primary + replicas | sync / async | ACID single-node | n/a | no | no | yes |
| ArangoDB | sharded + replicated | raft (agency) + async | tunable | yes (community limited; cluster ACID is Enterprise) | no | enterprise | enterprise |
| FalkorDB | single primary | Redis async replication | ACID single-node | n/a | no | no | Redis TLS |
| KuzuDB | single process | none | ACID single-process | n/a | no | no | n/a |
| Stardog | replicated (cluster) | raft | ACID | yes (within cluster) | no | enterprise | yes |
| Apache Jena | single process (Fuseki) | none built-in | ACID single-node | n/a | no | no | configurable |
| Virtuoso | replicated / clustered (Enterprise) | proprietary | ACID | yes | no | enterprise | yes |
| AllegroGraph | replicated + FedShard | proprietary spread | ACID | yes (federated) | no | yes | yes |
| YugabyteDB | sharded + replicated | raft | snapshot (tunable → serializable) | yes | clock-skew bounded | yes | yes |
| CockroachDB | sharded + replicated | raft | serializable | yes | HLC with optional commit-wait | yes | yes |
| Spanner | sharded + replicated | paxos | external consistency | yes | TrueTime + commit-wait | yes (designed-in) | yes |
| 🌱 rhizome | sharded + replicated | Multi-Raft | SSI | yes (Percolator 2PC) | TrueTime + commit-wait | yes (witness voters) | yes (mandatory) |
| System | Query language(s) | Schema model | Online schema migration | Native vector index | Sharding strategy | Graph analytics in-DB |
|---|---|---|---|---|---|---|
| Neo4j | Cypher / GQL | schemaless + constraints | yes | yes (HNSW, since 5.x) | manual (Fabric) | yes (GDS: PageRank, Louvain, Leiden…) |
| TigerGraph | GSQL | typed strict | yes | limited add-on | hash + workload-aware | yes (GSQL is Turing-complete) |
| Dgraph | DQL + GraphQL | typed strict | yes | no | predicate-based | limited |
| NebulaGraph | nGQL / openCypher | typed strict | yes | external (ES) | hash | yes (Nebula Algorithm) |
| JanusGraph | Gremlin (TinkerPop) | typed (managed) | yes | external (ES/Solr) | backend-dependent | yes (TinkerPop OLAP) |
| Memgraph | Cypher / openCypher | schemaless | yes | yes (HNSW) | n/a | yes (MAGE) |
| ArangoDB | AQL | schemaless / JSON-schema | yes | experimental | hash (SmartGraphs = community-aware, Enterprise) | yes (Pregel) |
| FalkorDB | Cypher | schemaless | yes | no | n/a | limited |
| KuzuDB | Cypher | typed strict | limited | no | n/a | limited |
| Stardog | SPARQL + GraphQL | OWL ontology | yes | no | replication only | SPARQL only |
| Apache Jena | SPARQL | OWL ontology | n/a (schemaless triples) | no | n/a | SPARQL only |
| Virtuoso | SPARQL + SQL | OWL + relational | yes | no | hash | limited |
| AllegroGraph | SPARQL + Prolog | OWL ontology | yes | yes (since 8.x) | FedShard (manual) | SPARQL + reasoning |
| YugabyteDB | SQL (Postgres) + CQL | relational | yes | pgvector (HNSW) | hash + range | SQL only |
| CockroachDB | SQL (Postgres) | relational | yes | no | range (auto-split) | SQL only |
| Spanner | SQL | relational | yes | yes (recent GA) | range (auto-split) | SQL only |
| 🌱 rhizome | GraphQL | typed, versioned DAG | yes (expand-contract) | yes (HNSW + PQ, in-engine) | semantic (Leiden + NSGA-II) | yes (BSP, KNN, community detection) |
A few honest caveats:
- Spanner owns external consistency through TrueTime + commit-wait; rhizome adopts the same recipe but ships it on commodity NTP/PTP rather than purpose-built atomic clocks. CockroachDB approximates the property with HLC and optional commit-wait. We sit closer to Spanner in semantics but to CockroachDB in deployment cost.
- Neo4j remains the most polished graph product on the market and its GDS algorithm catalogue is unmatched. rhizome's differentiator is horizontal scale-out: Neo4j shards only through the Fabric federation layer, while rhizome partitions a single logical graph along community boundaries.
- Dgraph is structurally the closest to rhizome: distributed, raft, GraphQL-first. The split is in placement (predicate-based vs. semantic) and in transaction guarantees (linearizable reads vs. SSI + external consistency).
- RDF stores (Stardog, Jena, Virtuoso, AllegroGraph) live in an adjacent space: strong reasoning and SPARQL ergonomics, weaker multi-region transactions. We treat them as a different design point, not a head-to-head competitor.
- YugabyteDB, CockroachDB, and Spanner are not graph databases. They appear in the tables because they share the distribution and transaction recipe (sharded raft/paxos, MVCC, external/serializable consistency) and let readers calibrate where rhizome sits on that axis.
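The first caveat above leans on commit-wait over a clock with explicit uncertainty. A minimal sketch of that idea follows, assuming a clock that reports an interval of width 2ε around the local time. The class and function names are hypothetical and do not come from rhizome's code; the fake time source exists only to make the example deterministic.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    earliest: float
    latest: float


class UncertainClock:
    """Clock that admits it may be off by up to epsilon_s seconds (assumed API)."""

    def __init__(self, epsilon_s, now_fn=time.time):
        self.epsilon_s = epsilon_s
        self._now = now_fn

    def now(self):
        t = self._now()
        return Interval(t - self.epsilon_s, t + self.epsilon_s)

    def after(self, ts):
        """True once ts is guaranteed to be in the past on every node."""
        return self.now().earliest > ts


def commit_with_wait(clock, apply_fn, sleep_fn=time.sleep):
    """Pick a commit timestamp, wait out the uncertainty, then expose the write."""
    commit_ts = clock.now().latest
    while not clock.after(commit_ts):
        sleep_fn(max(clock.epsilon_s / 4, 0.0005))
    apply_fn(commit_ts)
    return commit_ts


class FakeTime:
    """Deterministic time source so the example does not really sleep."""

    def __init__(self, start):
        self.t = start

    def now(self):
        return self.t

    def sleep(self, seconds):
        self.t += seconds


fake = FakeTime(100.0)
clock = UncertainClock(epsilon_s=0.004, now_fn=fake.now)
applied = []
ts = commit_with_wait(clock, applied.append, sleep_fn=fake.sleep)
waited = fake.t - 100.0
```

The wait lasts roughly 2ε, which is why a tighter uncertainty bound (PTP rather than NTP) translates directly into lower commit latency.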
rhizome separates the data plane (Rust) from the control plane (Kotlin).
Cluster metadata (membership, shard map, leases, replicated wait-graph)
lives in a 3-voter raft group called the metastore, which is
co-located inside the same rhizome-node binary as the shard raft
groups. Clients talk to a stateless gateway, which compiles
GraphQL queries into gRPC calls against the leader of the right shard.
flowchart TB
    client["Application<br/>(GraphQL · gRPC)"]
    subgraph cp["Control plane · Kotlin"]
        gw["Gateway<br/><i>GraphQL · CBO · sticky tx router</i>"]
        orch["Orchestrator<br/><i>Placement · Schema · Deadlock slow path</i>"]
    end
    subgraph dp["Data plane · rhizome-node × N · Rust"]
        direction TB
        subgraph shards["Shard raft groups (per shard)"]
            direction LR
            n1["node 1<br/>shards {0..k}"]
            n2["node 2<br/>shards {k+1..2k}"]
            n3["node N<br/>shards {...}"]
            n1 <-->|Multi-Raft| n2
            n2 <--> n3
            n3 <--> n1
        end
        subgraph meta["Metastore raft group (× 3 voters, co-located in rhizome-node)"]
            direction LR
            m1[("on node 1")]
            m2[("on node 2")]
            m3[("on node 3")]
            m1 <-.-> m2
            m2 <-.-> m3
            m3 <-.-> m1
        end
    end
    obs[("Prometheus + Grafana")]
    client ==>|HTTPS · mTLS| gw
    gw ==>|gRPC · sticky route| shards
    gw -->|shard-map lookup| meta
    orch <-->|membership · leases · wait-graph| meta
    orch -->|migrate · abort · drain| shards
    dp -.->|metrics + traces| obs
    cp -.-> obs
The hot path is the thick line: client → gateway → data plane, with a side
lookup into the metastore for the shard map. The metastore is not a
separate process: its 3-voter raft group runs alongside the shard
raft groups inside the same rhizome-node binary, scheduled by a
shared MultiRaftManager. The orchestrator never sits in the request
path: it talks to the metastore raft group for membership / leases /
the replicated wait-graph, and to the data plane only for rare
management operations (shard migration, deadlock abort, graceful
drain).
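To make the hot path concrete, here is a small sketch of how a gateway-side router might resolve a vertex to its shard leader using a cached shard map and fall back to the metastore on a miss. Everything here is an assumption for illustration: the `ShardMap` fields, the `Router` class, the epoch counter, and the idea of an explicit vertex-to-shard directory are not taken from rhizome's source.

```python
from dataclasses import dataclass


@dataclass
class ShardMap:
    epoch: int             # bumped whenever placement or leadership changes (assumed)
    vertex_to_shard: dict  # vertex id -> shard id
    shard_leader: dict     # shard id -> leader address


class Router:
    """Caches the shard map and refreshes it from the metastore on a miss."""

    def __init__(self, fetch_shard_map):
        # fetch_shard_map stands in for the metastore lookup.
        self._fetch = fetch_shard_map
        self._cached = fetch_shard_map()

    def refresh(self):
        self._cached = self._fetch()

    def leader_for(self, vertex_id):
        shard = self._cached.vertex_to_shard.get(vertex_id)
        if shard is None:
            # The vertex may have been created or migrated after the last fetch.
            self.refresh()
            shard = self._cached.vertex_to_shard[vertex_id]  # KeyError if truly unknown
        return self._cached.shard_leader[shard]
```

A real router would also need to react to "not leader" responses from a node by refreshing and retrying; that path is omitted to keep the sketch short.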
A short tour of the modules:
| Module | Language | Purpose |
|---|---|---|
| bedrock | Rust | Graph-native storage engine over RocksDB. Adjacency-block SSTables, HNSW vectors, ARC cache, encryption, MVCC retention. |
| clockwork-ferryman | Rust | Consensus core. Wraps raft-rs, owns the log + state machine, talks Multi-Raft over a tonic transport with mTLS. |
| transaxxxtion | Rust | Transaction manager. MVCC, SSI conflict graph, TrueTime oracle, Percolator 2PC, local wait-for graph for deadlock probes. |
| pathfinder-pete | Rust | Graph compute. BSP traversal, weighted shortest path (Dijkstra and A* with heuristics), KNN over HNSW, cursor pagination, push/pull strategy switch. |
| chrysalis | Rust | Schema registry client + validator. Mirrors the Kotlin authority store and enforces shape on every write. |
| rhizome-metastore | Rust | Raft group for cluster metadata (membership, shard map, leases, replicated wait-graph). 3 voters by default, co-located inside rhizome-node and scheduled by the shared MultiRaftManager. |
| rhizome-node | Rust | The shipping binary. Hosts the gRPC services and wires every Rust crate above into one process. |
| control-plane/gateway | Kotlin | Stateless edge service: GraphQL endpoint, query planning, sticky transaction routing, mTLS termination. |
| control-plane/orchestrator | Kotlin | Placement, schema authority, deadlock slow path, load forecasting, autoscaling decisions. |
| control-plane/common | Kotlin | Shared DTOs, metastore client, gRPC stubs, retry policies, TLS material loading. |
See docs/architecture.md for the full breakdown and
docs/adr/ for the architectural decision records.
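The module table mentions a local wait-for graph used for deadlock probes. The sketch below shows the textbook version of that check, a depth-first search for a cycle in a graph where an edge `a -> b` means "transaction a waits for a lock held by b". The function name and the dictionary representation are assumptions; rhizome's detector and its victim-selection policy are not described here and may differ.

```python
def find_deadlock(wait_for):
    """Return one cycle of transaction ids from the wait-for graph, or None."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    colour = {}
    path = []

    def visit(tx):
        colour[tx] = GREY
        path.append(tx)
        for blocker in wait_for.get(tx, ()):
            state = colour.get(blocker, WHITE)
            if state == GREY:
                # Back edge: everything from blocker to the path's end is a cycle.
                return path[path.index(blocker):]
            if state == WHITE:
                cycle = visit(blocker)
                if cycle:
                    return cycle
        path.pop()
        colour[tx] = BLACK
        return None

    for tx in list(wait_for):
        if colour.get(tx, WHITE) == WHITE:
            cycle = visit(tx)
            if cycle:
                return cycle
    return None


# Illustrative graph: t1 -> t2 -> t3 -> t1 form a cycle, t4 merely waits on it.
waits = {"t1": ["t2"], "t2": ["t3"], "t3": ["t1"], "t4": ["t1"]}
cycle = find_deadlock(waits)
```

The recursion keeps the example short; a graph with very long wait chains would call for an explicit stack instead.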
- A POSIX host (Linux is the supported target; macOS works for development).
- git and docker with docker compose v2 for the recommended path.
- For building from source:
  - rustup: the pinned toolchain is in rust-toolchain.toml;
  - JDK 21 + Gradle (the wrapper in control-plane/ will fetch the right version);
  - Python 3.11+ with pip for the end-to-end suite.
The fastest way to see rhizome alive is the bundled stack: one etcd, three metastore voters, three storage nodes, two orchestrator replicas, one gateway, plus Prometheus and Grafana:
git clone https://github.com/maxbarsukov/rhizome.git
cd rhizome
# 1. Mint a development PKI for mTLS (CA + per-service leaf certs).
./deploy/scripts/gen-test-pki.sh
# 2. Build images and bring the cluster up.
docker compose -f deploy/docker-compose.yml up --build --wait
# 3. Verify everything is healthy.
curl -fsS http://localhost:8080/ready # orchestrator
curl -fsS http://localhost:7003/ready # node #1
curl -fsS http://localhost:8081/graphql # gateway (GraphQL playground)

Open Grafana at http://localhost:3000 (default
admin / admin); the rhizome dashboards are pre-provisioned in
deploy/grafana/.
Tear it down with docker compose -f deploy/docker-compose.yml down -v when
you're done.
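If you script the bring-up, the health checks from step 3 can be polled until every service answers. This is a small sketch using only the standard library; the two `/ready` URLs come from the commands above, while the function names, the timeouts, and the injectable `probe`, `sleep`, and `clock` parameters are assumptions made for the example.

```python
import time
import urllib.error
import urllib.request

# The /ready endpoints shown in step 3 of the quick start.
ENDPOINTS = {
    "orchestrator": "http://localhost:8080/ready",
    "node-1": "http://localhost:7003/ready",
}


def http_ok(url, timeout_s=2.0):
    """True if the URL answers with a 2xx status, False on any network error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        return False


def wait_until_ready(endpoints, probe=http_ok, deadline_s=60.0, interval_s=1.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll until every endpoint is ready; return the names still failing at the deadline."""
    pending = dict(endpoints)
    give_up_at = clock() + deadline_s
    while pending:
        pending = {name: url for name, url in pending.items() if not probe(url)}
        if not pending or clock() >= give_up_at:
            break
        sleep(interval_s)
    return sorted(pending)


# Example call against a running stack:
#   not_ready = wait_until_ready(ENDPOINTS)
```

An empty list means everything came up; otherwise the returned names say which services to inspect in the compose logs.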
# Rust workspace β engine, transactions, traversal, all of which link
# into the single `rhizome-node` binary.
cargo build --workspace --release
# Kotlin control plane β gateway + orchestrator + common.
( cd control-plane && ./gradlew build )
# Run a single-node rhizome-node for smoke tests:
./target/release/rhizome-node \
--id 1 \
--listen 0.0.0.0:7001 \
--data /tmp/rhizome-data \
--single-node
# Run the GraphQL gateway pointed at it:
( cd control-plane && \
./gradlew :gateway:run \
--args="--node-addr=127.0.0.1:7001 --listen=0.0.0.0:8081" )

The GraphQL endpoint at http://<gateway>:8081/graphql exposes vertices, edges,
traversals, aggregations, and the full transaction lifecycle:
mutation {
beginTx(timeoutMs: 5000) { id, nodeId }
}
mutation Insert($tx: TxHandleInput!) {
createVertex(tx: $tx, input: {
type: "Person"
properties: [
{ key: "name", value: "Ada" }
{ key: "country", value: "UK" }
]
}) { id }
}
mutation Commit($tx: TxHandleInput!) {
commitTx(tx: $tx) { ok, commitTs }
}
query Friends($tx: TxHandleInput) {
traverse(tx: $tx, from: 42, depth: 2, edgeTypes: ["KNOWS"]) {
vertex { id, type, properties }
}
}

Sticky-routing keeps a long-lived transaction pinned to the node that owns its
session; see docs/transactions.md for the routing
contract.
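The begin, write, and commit mutations above can be chained from a client. The sketch below does that over plain HTTP with the standard library. It assumes the gateway accepts the usual `{"query", "variables"}` JSON body and that `TxHandleInput` carries the `id` and `nodeId` returned by `beginTx`; neither assumption is confirmed by this README, so check docs/api.md before relying on it.

```python
import json
import urllib.request

BEGIN = "mutation { beginTx(timeoutMs: 5000) { id nodeId } }"
CREATE = """mutation Insert($tx: TxHandleInput!) {
  createVertex(tx: $tx, input: {
    type: "Person"
    properties: [{ key: "name", value: "Ada" }]
  }) { id }
}"""
COMMIT = "mutation Commit($tx: TxHandleInput!) { commitTx(tx: $tx) { ok commitTs } }"


def http_send(url):
    """Build a send(query, variables) callable that posts GraphQL over HTTP."""
    def send(query, variables=None):
        body = json.dumps({"query": query, "variables": variables or {}}).encode()
        req = urllib.request.Request(
            url, data=body, headers={"Content-Type": "application/json"})
        with urllib.request.urlopen(req, timeout=10) as resp:
            payload = json.load(resp)
        if payload.get("errors"):
            raise RuntimeError(payload["errors"])
        return payload["data"]
    return send


def insert_person(send):
    """Run begin -> createVertex -> commit and return (vertex id, commit timestamp)."""
    tx = send(BEGIN)["beginTx"]
    # Assumed shape of TxHandleInput: the two fields returned by beginTx.
    handle = {"id": tx["id"], "nodeId": tx["nodeId"]}
    vertex = send(CREATE, {"tx": handle})["createVertex"]
    result = send(COMMIT, {"tx": handle})["commitTx"]
    if not result["ok"]:
        raise RuntimeError("commit rejected")
    return vertex["id"], result["commitTs"]


# Example call against the local gateway:
#   vertex_id, commit_ts = insert_person(http_send("http://localhost:8081/graphql"))
```

Passing the same handle to every call is what lets the gateway keep the transaction on the node that owns it.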
Every gateway call is a thin wrapper around protobuf services defined in
proto/rhizome/v1/. You can talk to a node directly when
you want lower overhead or precise control:
grpcurl -plaintext -d '{"id": 42, "vertex_type": "Person"}' \
localhost:7001 rhizome.v1.Rhizome/CreateVertex

See docs/api.md for the full surface (CRUD, traversal, KNN,
aggregation, transactions, deadlock, metastore).
rhizome exposes Prometheus metrics on every component:
- Data plane: rhizome_commit_latency_seconds, rhizome_ssi_aborts_total, rhizome_raft_propose_total, rhizome_mvcc_retained_versions, lock-probe outcomes, raft snapshot age, RocksDB pebble metrics.
- Control plane: deadlock_slow_*, placement_migrations_total, schema_versions_active, gateway query-plan cost histograms.
- Metastore: rhizome_metastore_watch_lag_ms_max, raft membership changes, lease expirations.
Pre-built Grafana dashboards live in deploy/grafana/ and
load automatically when you bring the docker-compose stack up. A walkthrough
with screenshots is in docs/observability.md.
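For quick checks without Grafana, the Prometheus text format is easy to read by hand. The sketch below sums samples per metric name from a scrape body. The metric names are taken from the list above, but the sample values and the `shard` label are made up for the example, and the parser covers only simple counter and gauge lines, not the full exposition format.

```python
def sum_by_metric(text):
    """Sum sample values per metric name from Prometheus text exposition output."""
    totals = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks plus HELP/TYPE comments
        if "{" in line:
            name = line[:line.index("{")]
            rest = line[line.rindex("}") + 1:].split()
        else:
            parts = line.split()
            name, rest = parts[0], parts[1:]
        # rest is [value] or [value, timestamp]; only the value matters here.
        totals[name] = totals.get(name, 0.0) + float(rest[0])
    return totals


# Made-up scrape body; the "shard" label is an assumption, not a documented label.
SAMPLE = """\
# HELP rhizome_ssi_aborts_total Transactions aborted by the SSI tracker.
# TYPE rhizome_ssi_aborts_total counter
rhizome_ssi_aborts_total{shard="0"} 3
rhizome_ssi_aborts_total{shard="1"} 4
rhizome_raft_propose_total 120
"""

totals = sum_by_metric(SAMPLE)
```

For anything beyond a spot check, an existing Prometheus client library or PromQL is the better tool, since it handles histograms, escaping, and label matching properly.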
# Rust unit + integration tests.
cargo test --workspace
# Kotlin tests.
( cd control-plane && ./gradlew test )
# End-to-end Python suite (spins up the docker-compose stack).
( cd tests/e2e && pip install -r requirements.txt && pytest -v )
# Chaos sub-suite (requires Docker, takes ≈ 10 minutes).
RHIZOME_E2E_CHAOS=1 pytest tests/e2e/test_chaos.py -v

CI runs all three suites on every PR; see
.github/workflows/ci.yml.
# Rust β clippy with -D warnings + rustfmt.
cargo clippy --workspace --all-targets -- -D warnings
cargo fmt --all -- --check
# Kotlin β detekt + Kover coverage.
( cd control-plane && ./gradlew detekt koverHtmlReport )

Both sets of checks are required on every commit through GitHub Actions.
The next milestones beyond the current v0.3.0 release, in shipping order:
- Security & compliance: full surface. ABAC policy engine with a Rego dialect, field-level encryption with AES-GCM-SIV (deterministic AEAD for searchable PII), cryptographic shredding for GDPR right-to-be-forgotten, WORM audit log with Merkle-tree chaining, per-tenant rate limiting, multi-tenancy isolation enforced at storage and policy layers.
- Autonomy v2: predictive and adaptive. Load forecasting via Temporal Fusion Transformers; anomaly detection (Isolation Forest + VAE ensemble) over the live metric stream; reinforcement-learning placement (PPO over a Digital Twin simulator) replacing the heuristic NSGA-II loop; federated learning for embedding training without exporting raw data.
- Observability++, backup, and chaos. End-to-end OpenTelemetry tracing with span propagation through gRPC and raft, ClickHouse sink for high-cardinality telemetry, incremental backup with point-in-time restore from raft log + RocksDB checkpoint, an admin chaos-injection API for fault tolerance drills, and graceful drain on SIGTERM for zero-loss rolling deploys.
- Developer experience & ecosystem. Native client SDKs for Python, Go, Java, and TypeScript; a rhizome-cli admin tool; a Kubernetes operator with CRDs and a Helm chart; migration importers from Neo4j Bolt, JanusGraph Gremlin, and Dgraph DQL.

Longer-horizon work (multi-region native deployments, hardware-accelerated
analytics on CUDA / FPGA, federated cross-deployment queries, formal
verification of the consensus layer, and confidential-computing
integrations) lives in docs/roadmap.md.
| Document | What's inside |
|---|---|
| docs/architecture.md | The full architectural tour: every module, every protocol. |
| docs/transactions.md | SSI + 2PC + TrueTime + deadlock detection in one place. |
| docs/observability.md | Prometheus metrics catalogue + Grafana dashboard guide. |
| docs/benchmarks.md | Reproducible benchmark methodology and result tables. |
| docs/deployment.md | Production deployment: sizing, networking, PKI, backup/restore. |
| docs/api.md | GraphQL schema + gRPC service catalogue, with error codes. |
| docs/adr/ | Architectural Decision Records (ADR-001 through ADR-008). |
| CHANGELOG.md | Release-by-release changes. |
Hey! We're glad you're thinking about contributing to rhizome! Pick an
issue labeled good first issue, ask any question you need on the tracker, and
we'll guide you through.
Bug reports and pull requests are welcome on GitHub at https://github.com/maxbarsukov/rhizome.
Before opening a PR, please read CONTRIBUTING.md: it
covers the development workflow, the test matrix you're expected to run
locally, and the commit-message conventions we follow.
This project is intended to be a safe, welcoming space for collaboration. Everyone interacting in the rhizome project's codebases, issue trackers, chat rooms and mailing lists is expected to adhere to the Code of Conduct.
- Found a bug? Open an issue!
- Want to discuss design or ask questions? Start a thread in GitHub Discussions.
rhizome takes the security of distributed data infrastructure seriously. If you believe you have found a security vulnerability in this repository, please report it privately according to our security policy; do not open a public issue.
If you use rhizome in academic work, please cite it via the metadata in
CITATION.cff. A short form:
@software{rhizome,
author = {Barsukov, Max},
title = {rhizome: a distributed graph database with semantic sharding},
year = {2026},
url = {https://github.com/maxbarsukov/rhizome},
version = {0.3.0}
}

| Link | Description |
|---|---|
| raft.github.io/raft.pdf | The original Raft paper: In Search of an Understandable Consensus Algorithm. |
| research.google/pubs/spanner/ | Spanner: TrueTime and external consistency at planet scale. |
| research.google/pubs/percolator/ | Percolator: the 2PC variant we extend for cross-shard writes. |
| vldb.org/pvldb/vol10/p781-Wu.pdf | Wu et al. (VLDB 2017): An empirical evaluation of in-memory MVCC. |
| doi.org/10.1145/1376616.1376690 | Cahill, Fekete, Röhm (SIGMOD 2008): Serializable Isolation for Snapshot Databases. |
| github.com/CWTSLeiden/networkanalysis | Reference implementation of the Leiden community-detection algorithm. |
| arxiv.org/abs/1603.09320 | Malkov & Yashunin, HNSW: efficient and robust ANN search. |
| tikv.org/blog/ | TiKV engineering blog: our reference for production raft-rs operation. |
rhizome is available as open source under the terms of the MIT License.
Leave a star ⭐ if you find this project useful; it helps a lot.
An @maxbarsukov project · graduate work at ITMO University · 2026.
