A curated list of resources, libraries, runtimes, SDKs, and learning material for Apache Stateful Functions and stateful actor frameworks running on Apache Flink.
Stateful Functions (StateFun) is a programming model for distributed actor-style systems on Apache Flink — durable per-key state, exactly-once messaging, polyglot remote functions, and Kubernetes-native deployment.
- Runtimes
- SDKs
- Examples and playgrounds
- Production guides
- Related actor frameworks
- Articles and talks
- Apache Flink resources
- Kzmlabs StateFun Actors - Stateful actors on Apache Flink 2.x and Java 21. Durable per-key state, exactly-once messaging, Kafka and Kinesis I/O, Kubernetes-native deployment. Continues the Apache Stateful Functions programming model on the Flink 2.x line.
- Apache Flink Stateful Functions - The original Apache project. The last release, 3.4.0 (October 2024), targets Flink 1.16 and Java 11.
- statefun-sdk-java - Java SDK for remote functions, published to Maven Central.
- statefun-sdk-python - Python SDK, upstream Apache.
- statefun-sdk-go - Go SDK, upstream Apache.
- statefun-sdk-js - Node.js SDK, upstream Apache.
- Kzmlabs StateFun Actors — Fraud detection example - Per-card risk scoring on a Kafka payments stream with velocity, geo-impossibility, and amount-anomaly detection. Full Java code + module.yaml.
- Kzmlabs StateFun Actors — IoT fleet digital twins - Per-device twin actors for industrial telemetry on Kinesis with rolling stats, anomaly detection, and command/response loop.
- flink-statefun-playground - Apache's collection of polyglot examples: Java, Python, Go, and JavaScript samples that run with Docker Compose.
- StateFun Actors documentation - Quickstart, install, build, Kafka/Kinesis I/O, Kubernetes deployment, architecture, and migration guide.
- Kubernetes deployment guide - Production layout via the Flink Kubernetes Operator with RocksDB checkpoints to S3.
- Migration from Apache Stateful Functions - Coordinate change, Flink 2.x configuration keys, what stays the same.
For context on the broader stateful-actor / durable-execution space:
- Restate - Durable execution for distributed services. Different runtime, similar problems.
- Temporal - Durable workflows. Workflow-style API rather than actor-style.
- Akka - JVM actor framework. Different runtime model, no Flink dependency.
- Orleans - Microsoft's virtual-actor framework on .NET.
- Cloudflare Durable Objects - Edge-runtime stateful actors.
- Stateful Functions: Polyglot Event-Driven Functions for Stateful Distributed Applications - Ververica's introduction to the runtime internals.
- Apache StateFun release notes - Latest upstream documentation.
- Stateful Functions 2.0 — An Event-Driven Database on Apache Flink - Apache Flink blog post on the 2.0 release.
- Stateful Functions 3.0: Remote Functions Front-and-Center - Apache Flink blog post on the 3.0 release.
- Stateful Functions 3.2.0 Release Announcement - Apache Flink blog post on the 3.2 release.
- Lightweight Asynchronous Snapshots for Distributed Dataflows - Carbone et al. The paper behind Flink's checkpointing, the foundation StateFun's exactly-once semantics rest on.
- Stateful Functions: Building Event-Driven Applications - Flink Forward conference talks on Stateful Functions (search results, multiple years).
- Distributed Architecture and Concepts - Apache's deep dive into how the runtime, dispatcher, and remote functions interact.
- awesome-flink - The canonical curated list for Apache Flink projects, libraries, and tooling.
- Apache Flink documentation - The Flink runtime that StateFun runs on.
- Flink Kubernetes Operator - The operator that provisions StateFun (and Flink) jobs on Kubernetes.
- Flink Forward conference - The annual Apache Flink conference. Several editions feature Stateful Functions talks.
- Apache Flink blog - Release announcements, design notes, and ecosystem updates.
- Flink State Backends documentation - How RocksDB and HashMap state backends work, including checkpoint storage to S3.
Contributions are very welcome. Read the contribution guidelines first, then open a pull request.
This list aims for quality over quantity — entries should be actively useful (working code, clear documentation, demonstrated usage). Entries that haven't seen a commit or release in 24+ months may be moved to an archive section.