ChaosAI — Time-Series Foundation Model

Self-supervised world model for chaotic dynamical systems, evaluated on financial markets.

DISCLAIMER: Research only. Not a trading system. Not financial advice.

Overview

ChaosAI is a four-stratum pipeline for learning compressed, predictive representations of non-stationary time series. Each stratum is a recent self-supervised or RL component (FSQ tokenization, Mamba-2 JEPA, OT Conditional Flow Matching, TD-MPC2), composed so that each level of abstraction is learnable independently and then cascaded. The system is domain-agnostic by design; we use crypto and equity markets as a stress test because their non-stationarity, heavy tails, and weak signal-to-noise make them a worst-case benchmark for representation robustness.

What is novel. The contribution is not any single strate — each of them exists in the literature. The contribution is Multiverse Crossing, an evaluation method for self-supervised time-series representations. Instead of scoring a single deterministic forward pass, we generate M parallel universes by geodesic perturbation on the tangent plane of the JEPA representation hypersphere, re-normalize to the unit sphere, and measure the downstream agreement across universes. The resulting metrics — Lyapunov-style stability proxy, bifurcation index, cross-universe decision consensus — give a geometric test of whether a learned representation's downstream behavior is driven by signal or by noise. We find that representations with Lyapunov < 0 and bifurcation ≈ 0 produce decisions that are robust to 2% geodesic perturbations; this correlates with out-of-sample performance on the financial backtest (Sharpe 2.78 vs 1.22 for deterministic).

Why it matters beyond finance. Any time-series domain where the cost of a wrong decision is high (medical diagnosis, climate forecasting, industrial fault detection, neuroscience) benefits from a per-decision robustness test, not just an average metric. Multiverse Crossing is model-agnostic once a representation sits on a normalized manifold (any contrastive / JEPA / SimCLR-style embedding space).

Architecture

Raw OHLCV → [Strate I: FSQ Tokenizer] → discrete codes
          → [Strate II: Mamba-2 JEPA]  → latent embeddings
          → [Strate III: OT-CFM]       → N future trajectories
          → [Strate IV: TD-MPC2 Agent] → trading actions

Strate I — Dilated CNN + Finite Scalar Quantization (1024 codes, zero codebook collapse)
Strate II — Mamba-2 SSD + VICReg JEPA + cross-attention macro conditioning (FRED/COT)
Strate III — Optimal Transport Flow Matching, multimodal latent futures
Strate IV — TD-MPC2 + CVaR + Multiverse Crossing (M=30 geodesic perturbations)

Results

Model	Params	Dataset	Loss	Strate IV Sharpe
v6	36.1M	838M tokens	1,310	2.63
v6.1	36.6M	838M tokens	908	2.78

v6.1 adds: cross-attention macro injection, bifurcation-modulated CQL, risk-parity rewards, priority experience replay.

Multiverse Crossing (M=30, fresh data): Lyapunov −0.73 (stable), Sharpe 2.78, 0/372 contested assets. Full results in results/multiverse_crossing_30u.md.

Status

Component	Status
Data pipeline (838M tokens, 8,969 assets)	✅ Done
Strate I — FSQ tokenizer (JAX/Flax)	✅ Done
Strate II — Mamba-2 JEPA, auto-sharding, XLA flags	✅ Done
Strate III — OT-CFM stochastic predictor	✅ Done
Strate IV — TD-MPC2 + CVaR + Multiverse Crossing	✅ Done
v6.1 training (100 epochs, TPU v6e)	✅ Done
v6.2 — scale-invariant JEPA (scale_id embedding + cross-res VICReg)	🔄 Training
v6.3 — return prediction auxiliary loss	🔄 Implemented, retraining

Stack

JAX/Flax (TPU-native) + PyTorch (GPU validation). Training done on Google TPU Research Cloud (v6e-64, europe-west4-a), zero-idle-cost data lake via Drive ↔ GCS ↔ TPU.

Key TPU-side design choices:

Auto-sharder detects v6e / v5e / v5p meshes and routes to optimal (8,8) Tore 2D or (16,4) Tore 3D
MXU 128×128 alignment on every head_dim to saturate the matrix unit
bf16 activations, fp32 SSD accumulators — required for numerical stability past step ~2750

Quick Start

# Setup
uv venv && source .venv/bin/activate && uv sync
export PYTHONPATH=$PWD

# Train (TPU v6e)
export SCALE_CONFIG=configs/scaling/v6e_38m_v3.yaml
export GCS_BUCKET=gs://fin-ia-eu
nohup python3 -u scripts/run_training.py > logs/train.log 2>&1 &

# BTC regime prediction
PYTHONPATH=. python3 scripts/predict_btc_1week.py \
    --jepa_ckpt checkpoints/jax_v6e/38m_v3/92112 \
    --strate_i_ckpt checkpoints/strate_i_jax_combined/best_params.npz

See docs/REPRODUCIBILITY.md for GPU fallback setup, environment pinning, and a 5-minute smoke test that validates the full pipeline on a single batch.

References

The architecture is a composition of four recent self-supervised / RL components; our contribution is their integration and the Multiverse Crossing evaluation.

FSQ (Strate I) — Mentzer, F. et al. Finite Scalar Quantization: VQ-VAE Made Simple. ICLR 2024. arXiv:2309.15505
Mamba-2 (Strate II) — Dao, T. & Gu, A. Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality. ICML 2024. arXiv:2405.21060
JEPA — LeCun, Y. A Path Towards Autonomous Machine Intelligence. Meta AI, 2022. openreview
VICReg — Bardes, A., Ponce, J., LeCun, Y. VICReg: Variance-Invariance-Covariance Regularization for Self-Supervised Learning. ICLR 2022. arXiv:2105.04906
OT-CFM (Strate III) — Tong, A. et al. Improving and Generalizing Flow-Based Generative Models with Minibatch Optimal Transport. TMLR 2024. arXiv:2302.00482
TD-MPC2 (Strate IV) — Hansen, N., Su, H., Wang, X. TD-MPC2: Scalable, Robust World Models for Continuous Control. ICLR 2024. arXiv:2310.16828
CVaR RL — Chow, Y. et al. Risk-Sensitive and Robust Decision-Making: a CVaR Optimization Approach. NeurIPS 2015. arXiv:1506.02188
KAN — Liu, Z. et al. KAN: Kolmogorov-Arnold Networks. 2024. arXiv:2404.19756

Acknowledgments

Research supported with Cloud TPUs from Google's TPU Research Cloud (TRC) (v6e-64, europe-west4-a). TRC is a non-commercial research program; code is released under AGPL-3.0 in line with the program's expectation that TRC-supported research be shared openly. The project follows Google's AI Principles and is intended exclusively for academic research — not for trading, surveillance, or any revenue-generating activity.

License

AGPL v3

Name		Name	Last commit message	Last commit date
Latest commit History 93 Commits
configs/scaling		configs/scaling
docs		docs
paper		paper
results		results
scripts		scripts
src		src
tests		tests
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

ChaosAI — Time-Series Foundation Model

Overview

Architecture

Results

Status

Stack

Quick Start

References

Acknowledgments

License

About

Uh oh!

Releases 1

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

ChaosAI — Time-Series Foundation Model

Overview

Architecture

Results

Status

Stack

Quick Start

References

Acknowledgments

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages