Zer0pa/ZPE-FT

Search financial patterns on compressed delayed-feed archives with deterministic replay and bounded price-field fidelity.

One of 17 independent codec products in the Zer0pa portfolio. Retained public-benchmark, bounded replay, and blocker artifacts are real. The open-access enterprise benchmark is still blocked on missing inputs and auditable query truth.

Public benchmark (3 datasets, parquet+zstd+DuckDB baseline): 5.9–10.9× smaller than raw · 2.7–3.3× smaller than parquet+zstd (ranges across the 3 datasets) · up to 62.9× faster pattern query vs DuckDB (OHLCV datasets; tick at scale: latency parity) · exact tick fidelity (RMSE = 0.0) · proof artifact

License: SAL v7.1 · Python >=3.11 · Public benchmark: retained · Current gate: BLOCKED_MISSING_INPUTS

Legal boundaries · Architecture: runtime map · Packet spec: .zpfin

ZPE-FT

What This Is · Key Metrics · What We Prove · Competitive Benchmarks · What We Don't Claim · Readiness · Verification Status · Proof Anchors · Repo Shape · Quick Start · Upcoming Workstreams

What This Is

Financial tick-stream encoding. Bounded proof surface for trade-tape replay, missing-input blockers, and auditable FT-C004 truth. Install from PyPI: pip install zpe-ft

The wedge is narrow and specific: compressed delayed-feed archives with deterministic replay, retained public-benchmark evidence on open datasets, and bounded replay fidelity on the in-repo smoke bundle. Public-data rehearsal lanes are useful evidence; they are not authority enterprise inputs. The open-access enterprise benchmark is still blocked on missing inputs and auditable FT-C004 truth.

Codec Mechanics

ZPE-FT Codec Mechanics animation

| Field | Value |
| --- | --- |
| Architecture | FT |
| Encoding | FT_TICK_DELTA_V1 |
| Mechanics Asset | .github/assets/readme/lane-mechanics/FT.gif |
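
The encoding name FT_TICK_DELTA_V1 suggests that prices are stored as integer tick counts and then delta-encoded. The sketch below illustrates that general technique only. It is an assumption and not the ZPE-FT packet format; `TICK_SIZE`, `encode_ticks`, and `decode_ticks` are hypothetical names chosen for illustration.

```python
from decimal import Decimal

TICK_SIZE = Decimal("0.01")  # assumed tick size (hypothetical)


def encode_ticks(prices):
    """Quantize prices to integer ticks, then keep the first value plus deltas."""
    ticks = [int(Decimal(p) / TICK_SIZE) for p in prices]
    if not ticks:
        return []
    return [ticks[0]] + [b - a for a, b in zip(ticks, ticks[1:])]


def decode_ticks(deltas):
    """Rebuild prices with a running sum over the stored deltas."""
    prices, level = [], 0
    for d in deltas:
        level += d
        prices.append(level * TICK_SIZE)
    return prices


prices = ["450.10", "450.12", "450.11", "450.11"]
encoded = encode_ticks(prices)   # small integers after the first entry
decoded = decode_ticks(encoded)  # Decimal prices, equal to the inputs
```

Because the deltas are integers, a round trip of this kind reproduces every on-grid price exactly, which is the property an RMSE of 0.0 ticks describes.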

Key Metrics

| Metric | Value | Baseline |
| --- | --- | --- |
| SPY 10y daily compression ratio vs raw | 5.94× smaller | Raw OHLCV bytes |
| Price-field reconstruction fidelity | RMSE = 0.0 ticks (exact) | Parquet lossless round-trip |
| Pattern query latency vs DuckDB (OHLCV) | 62.9× faster (0.70 ms vs 43.9 ms p95) | parquet+zstd+DuckDB |
| 30-symbol 24-month corpus fidelity | 0.0 RMSE across all 30 series, 15,000 corpus points | Alpaca daily bars, in-repo proxy lane |

Source: proofs/artifacts/public_benchmarks/phase3_public_benchmarks.json (public benchmark rows); proofs/artifacts/real_market_benchmarks/daily_24m/artifacts/ft_reconstruction_fidelity.json (30-symbol corpus row). All public-dataset results are retained proof artifacts from real runs, not synthetic fixtures. 30-symbol row is a proxy lane, not the sovereign enterprise benchmark.
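
For readers unfamiliar with the fidelity metric, the sketch below shows one way an RMSE measured in ticks can be computed. It assumes prices have already been converted to integer tick counts; `rmse_ticks` is a hypothetical helper and is not taken from the `zpe_finance` package.

```python
import math


def rmse_ticks(original_ticks, reconstructed_ticks):
    """Root-mean-square error between two equal-length series of tick counts."""
    if len(original_ticks) != len(reconstructed_ticks):
        raise ValueError("series must have the same length")
    if not original_ticks:
        raise ValueError("series must not be empty")
    squared = sum((a - b) ** 2 for a, b in zip(original_ticks, reconstructed_ticks))
    return math.sqrt(squared / len(original_ticks))


exact = rmse_ticks([45010, 45012, 45011], [45010, 45012, 45011])
off_by_two = rmse_ticks([10, 12], [10, 10])
```

An RMSE of exactly 0.0 is only possible when every reconstructed value matches its original, so the metric doubles as an exactness check for the price field.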

Repo Identity

| Field | Value |
| --- | --- |
| Identifier | ZPE-FT |
| Repository | https://github.com/Zer0pa/ZPE-FT |
| Section | encoding |
| Visibility | PUBLIC |
| Architecture | FT |
| Encoding | FT_TICK_DELTA_V1 |
| Commit SHA | c8c6ea5e9dcc |
| License | SAL v7.1 |
| Authority Source | proofs/reruns/2026-03-21_phase06_contract_freeze_attempt_v3/missing_inputs_packet.json |

Readiness

| Field | Value |
| --- | --- |
| Verdict | BLOCKED |
| Checks | 4/6 |
| Anchors | 5 display anchors |
| Confidence | 98% |
| Commit | c8c6ea5e9dcc |
| Authority | proofs/reruns/2026-03-21_phase06_contract_freeze_attempt_v3/missing_inputs_packet.json |

Honest Blocker

No Phase 06 closure or public release readiness; no broad warehouse or incumbent displacement claim; no lossless volume reconstruction claim; no promoted public-data search-quality claim while FT-C004 remains unresolved; no claim that the bounded proxy lanes satisfy the sovereign enterprise benchmark.

What We Prove

  • Public benchmark artifacts are retained for Yahoo SPY, Binance BTCUSDT aggTrades, and Kaggle SPY.
  • Repo-bundled OHLCV roundtrip stays within the bounded price-field error threshold and compresses below raw bytes.
  • Freeze and refresh scripts execute on a declared corpus contract and emit benchmark, fidelity, latency, and roundtrip artifacts.
  • Missing authority inputs keep the sovereign Phase 06 gate blocked.
  • FT-C004 truth remains blocked until labels or audit refs exist.

What We Don't Claim

  • No Phase 06 closure or public release readiness.
  • No broad warehouse or incumbent displacement claim.
  • No lossless volume reconstruction claim.
  • No promoted public-data search-quality claim while FT-C004 remains unresolved.
  • No claim that the bounded proxy lanes satisfy the sovereign enterprise benchmark.

Verification Status

| Code | Check | Verdict |
| --- | --- | --- |
| V_01 | Public SPY 10y daily compression | PASS |
| V_02 | Public BTCUSDT aggTrades compression | PASS |
| V_03 | Public Kaggle SPY compression | PASS |
| V_04 | Bounded replay price-field fidelity | PASS |
| V_05 | Phase 06 contract freeze | FAIL |
| V_06 | Public proxy retrieval truth | INC |
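
The 4/6 figure reported under Readiness can be reproduced from the verdicts above. The sketch below tallies them; the rule that any non-PASS check keeps the gate BLOCKED is an assumption made for illustration, not the repo's documented gate logic.

```python
# Verdicts copied from the Verification Status table.
checks = {
    "V_01": "PASS",
    "V_02": "PASS",
    "V_03": "PASS",
    "V_04": "PASS",
    "V_05": "FAIL",
    "V_06": "INC",
}

passed = sum(1 for verdict in checks.values() if verdict == "PASS")
summary = f"{passed}/{len(checks)}"

# Assumed rule (hypothetical): the gate opens only when every check passes.
gate = "OPEN" if passed == len(checks) else "BLOCKED"
```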

Proof Anchors

| Path | State |
| --- | --- |
| proofs/artifacts/public_benchmarks/phase3_public_benchmarks.json | VERIFIED |
| proofs/reruns/2026-03-19_alpaca_demo_smoke/ft_reconstruction_fidelity.json | VERIFIED |
| proofs/artifacts/real_market_benchmarks/BOUNDARY.json | VERIFIED |
| proofs/reruns/2026-03-21_phase06_contract_freeze_attempt_v3/missing_inputs_packet.json | VERIFIED |
| proofs/phase06_inputs/series_gap_matrix.csv | VERIFIED |
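
A quick local sanity check on these anchors could look like the sketch below. It only confirms that each file exists in a checkout and parses as JSON or CSV; it makes no assumption about the artifact schemas, and `check_anchor` and `check_all` are hypothetical helpers that are not part of the repo.

```python
import csv
import json
from pathlib import Path

# Anchor paths copied from the Proof Anchors table.
ANCHORS = [
    "proofs/artifacts/public_benchmarks/phase3_public_benchmarks.json",
    "proofs/reruns/2026-03-19_alpaca_demo_smoke/ft_reconstruction_fidelity.json",
    "proofs/artifacts/real_market_benchmarks/BOUNDARY.json",
    "proofs/reruns/2026-03-21_phase06_contract_freeze_attempt_v3/missing_inputs_packet.json",
    "proofs/phase06_inputs/series_gap_matrix.csv",
]


def check_anchor(repo_root, rel_path):
    """Return PRESENT, MISSING, or UNPARSEABLE for one anchor file."""
    path = Path(repo_root) / rel_path
    if not path.is_file():
        return "MISSING"
    try:
        with path.open(newline="", encoding="utf-8") as fh:
            if path.suffix == ".json":
                json.load(fh)  # raises ValueError on malformed JSON
            elif path.suffix == ".csv":
                next(csv.reader(fh), None)  # read the first row only
    except (ValueError, csv.Error):
        return "UNPARSEABLE"
    return "PRESENT"


def check_all(repo_root="."):
    """Map every anchor path to its local status."""
    return {rel: check_anchor(repo_root, rel) for rel in ANCHORS}


if __name__ == "__main__":
    for rel, state in check_all().items():
        print(f"{state:12} {rel}")
```

Run from the repository root, every line should read PRESENT; this is a weaker statement than the VERIFIED state in the table, which the sketch does not attempt to reproduce.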

Repo Shape

| Field | Value |
| --- | --- |
| Proof Anchors | 5 display anchors |
| Modality Lanes | 1 |
| Architecture | FT |
| Encoding | Not assigned |
| Verification | 4/6 checks |
| Authority Source | proofs/reruns/2026-03-21_phase06_contract_freeze_attempt_v3/missing_inputs_packet.json |

Competitive Benchmarks

vs parquet+zstd+DuckDB

The baseline for all competitive benchmarks is Parquet (ZSTD compression) queried via DuckDB, a representative modern data-warehouse path for financial time-series pattern search. Each row below is a retained artifact from a real public dataset run, not a synthetic fixture.

| Dataset | Rows | ZPE size vs raw | ZPE size vs parquet+zstd | ZPE query p95 vs parquet p95 | Price-field RMSE (ticks) | Proof artifact |
| --- | --- | --- | --- | --- | --- | --- |
| Yahoo SPY 10y daily (OHLCV) | 2,513 | 5.94× | 2.69× | 62.9× faster (0.70 ms vs 43.9 ms) | 0.0 | proofs/artifacts/public_benchmarks/phase3_public_benchmarks.json |
| Kaggle SPY full history (OHLCV) | 3,201 | 7.31× | 3.32× | 13.2× faster (1.04 ms vs 13.7 ms) | 0.0 | proofs/artifacts/public_benchmarks/phase3_public_benchmarks.json |
| Binance BTCUSDT aggTrades 2017-09 (tick) | 198,880 | 10.90× | 2.81× | parity (63.4 ms vs 55.6 ms) | 0.0 | proofs/artifacts/public_benchmarks/phase3_public_benchmarks.json |
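
The two size columns can be combined to see how much of the reduction the parquet+zstd baseline already achieves on its own. The arithmetic below assumes both ratios in a row are measured against the same ZPE byte count, which the table does not state explicitly; the derived figures are rounded and are not retained proof values.

```python
# (ZPE vs raw, ZPE vs parquet+zstd), copied from the table above.
rows = {
    "Yahoo SPY 10y daily": (5.94, 2.69),
    "Kaggle SPY full history": (7.31, 3.32),
    "Binance BTCUSDT aggTrades 2017-09": (10.90, 2.81),
}

# raw / parquet = (raw / zpe) / (parquet / zpe)
implied_parquet_vs_raw = {
    name: round(vs_raw / vs_parquet, 2)
    for name, (vs_raw, vs_parquet) in rows.items()
}

for name, ratio in implied_parquet_vs_raw.items():
    print(f"{name}: parquet+zstd is about {ratio}x smaller than raw")
```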

Notes on scope: the Binance tick dataset is a trade-tape mapping (bid=ask=trade price) because Binance public aggTrades do not expose top-of-book quotes; query-latency parity at ~200k rows is expected for that workload shape. OHLCV latency advantage is largest on daily series; tick at scale has a different profile. These are delayed-feed public datasets, not authority enterprise inputs.
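
The trade-tape mapping described above (bid=ask=trade price) can be pictured with the sketch below. `QuoteTick` and `trade_to_quote_tick` are hypothetical names, and the field layout is an assumption for illustration rather than the `.zpfin` packet schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuoteTick:
    """Quote-shaped record (hypothetical layout)."""
    ts_ms: int
    bid: float
    ask: float
    last: float
    size: float


def trade_to_quote_tick(ts_ms, price, quantity):
    """Map one trade onto a quote-shaped tick with a zero spread."""
    # With no top-of-book data available, bid and ask both take the trade price.
    return QuoteTick(ts_ms=ts_ms, bid=price, ask=price, last=price, size=quantity)


tick = trade_to_quote_tick(ts_ms=1504224000000, price=4261.48, quantity=0.1)
spread = tick.ask - tick.bid  # always zero under this mapping
```

One consequence of this mapping is that any query depending on spread carries no information on the Binance lane, which is part of why that dataset is described as a trade-tape mapping rather than a quote feed.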

Quick Start

Install from PyPI:

```bash
pip install zpe-ft
```

Verify from source:

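```bash
# Clone the repository and enter it
git clone https://github.com/Zer0pa/ZPE-FT.git
cd ZPE-FT
# Create and activate an isolated virtual environment
python -m venv .venv
source .venv/bin/activate
# Upgrade pip, then install the package in editable mode
python -m pip install -U pip
python -m pip install -e .
# Smoke check: print the package exports and the value of rust_version()
python - <<'PY'
import zpe_finance
from zpe_finance.rust_bridge import rust_version
print("exports", sorted(zpe_finance.__all__))
print("rust_bridge", rust_version())
PY
```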

Start with docs/ARCHITECTURE.md, then read docs/LEGAL_BOUNDARIES.md and proofs/reruns/2026-03-21_phase06_contract_freeze_attempt_v3/missing_inputs_packet.json. LICENSE is the legal source of truth; the repo uses SAL v7.1.

Upcoming Workstreams

This section captures the active lane priorities — what the next agent or contributor picks up, and what investors should expect. Cadence is continuous, not milestoned.

  • FT-C004 truth resolution — Research-Deferred — Investigation Underway. Open question must be diagnosed and a falsifiable claim formulated before Phase 06 enterprise benchmark work commits direction.
  • Phase 06 enterprise authority inputs — Operations / External Dependency. Authority datasets pending acquisition; engineer-side scaffolding ready.

Retained Proxy-Lane Metrics

These results are from repo-bundled provider-max proxy lanes (not the sovereign Phase 06 authority pack). They are useful evidence of codec behaviour across 30 symbols and tick corpora; they are not the enterprise benchmark.

| Lane | Series | RMSE (ticks) | Query p95 | Encode p95 | Proof artifact |
| --- | --- | --- | --- | --- | --- |
| Alpaca demo smoke (SPY 1m bars) | 1 OHLCV | 0.0 (exact tick) | 0.01 ms | n/a | proofs/reruns/2026-03-19_alpaca_demo_smoke/ft_reconstruction_fidelity.json |
| Alpaca demo smoke (AAPL tick stream) | 1 tick | 0.0 (exact tick); 8.40× compression vs raw | n/a | n/a | proofs/reruns/2026-03-19_alpaca_demo_smoke/ft_tick_benchmark.json |
| 30-symbol daily 24-month corpus | 30 OHLCV | 0.0 across all 30 series | 0.077 ms (p95 across 15,000 corpus points) | 4.2 ms | proofs/artifacts/real_market_benchmarks/daily_24m/artifacts/ft_reconstruction_fidelity.json |
| Dukascopy tick 20-session corpus | 3 tick | 0.0 across all 3 series; 7.2–14.9× compression range (mean 11.1×) | 39.1 ms (p95) | 1,281 ms | proofs/artifacts/real_market_benchmarks/tick_20_sessions/artifacts/ft_reconstruction_fidelity.json |
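
For context on the p95 columns, the sketch below shows one common way to collect per-call timings and take a nearest-rank 95th percentile. The percentile method and harness the repo actually uses are not stated here, so `p95` and `time_calls_ms` are hypothetical helpers under that assumption.

```python
import time


def p95(samples):
    """Nearest-rank 95th percentile of a non-empty collection of numbers."""
    ordered = sorted(samples)
    if not ordered:
        raise ValueError("need at least one sample")
    # Integer ceiling of 0.95 * n, giving a 1-based rank.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def time_calls_ms(fn, repeats=200):
    """Call fn repeatedly and return each wall-clock duration in milliseconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append((time.perf_counter() - start) * 1000.0)
    return timings


# Stand-in workload (hypothetical); replace with the query or encode call under test.
latencies = time_calls_ms(lambda: sum(range(1000)), repeats=50)
query_p95_ms = p95(latencies)
```

A tail percentile is less sensitive to a single slow outlier than the maximum and less flattering than the mean, which makes it a reasonable headline figure for query latency.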

CI coverage for all proxy lanes: tests/test_real_market_corpus.py, tests/test_ohlcv_roundtrip.py, tests/test_packet_roundtrip.py.

About

Deterministic codec for delayed public market data. 5.94×–10.90× smaller than raw on declared public corpora; RMSE 0.0 ticks on price field. Phase 06 sovereign authority gated on FT-C004 truth.
