Local-first backtest framework for Binance USDT-M perpetual futures.
perplab is a research environment for intermediate algorithmic traders who want to know whether a strategy survives honest validation. It ships the statistical machinery — cost-aware simulation, CPCV, PBO, deflated Sharpe, robustness tests, a five-tier verdict — and ships zero edges. The framing is deliberate: the engine, not the edges.
- Downloads Binance USDT-M kline history into a local DuckDB cache. Your data stays on your machine.
- Runs vectorized, no-lookahead backtests against a cost model that accounts for fees, spread, slippage, and funding (with sign-corrected long/short cashflows).
- Validates every backtest through a six-gate funnel: sample size, PBO, OOS Sortino, family-grouped deflated Sharpe, max drawdown in R-multiples, and CPCV-fold stability.
- Stress-tests each candidate at 1.5×, 2×, and 3× costs and classifies it as COST_ROBUST, COST_MARGINAL, COST_FRAGILE, or COST_PARADOX.
- Runs four robustness checks: bootstrap, parameter jitter, monkey test, shuffled-bar test.
- Summarizes the result as a five-tier verdict (STRONG, PLAUSIBLE, MIXED, WEAK, REJECT) with strengths, weaknesses, and improvement hints.
- Sweeps parameter grids in parallel with a live gate-funnel die-off chart.
- Bundles a refined dark React UI with a glossary tooltip on every term.
- Includes an LLM Playground for chat, strategy drafting, and Pine→perplab translation. Multiple providers, redacted keys, on-demand strategy save with no server restart.
- It does not place orders. There is no live, paper, or shadow trading code path; read BEFORE_GOING_LIVE.md for what you still have to build.
- It does not ship tuned thresholds or curated strategy parameters. Defaults are intentionally relaxed; you calibrate for your universe.
- It does not phone home. No telemetry, no analytics, no remote logging. The binary is local. Your API keys, your data, your machine.
- It does not promise that a strategy "works." The verdict is a hint, not a recommendation.
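The sign-corrected funding cashflow mentioned in the cost-model bullet above can be illustrated with a minimal sketch. It assumes the Binance convention that a positive funding rate means longs pay shorts; the function name, the `side` encoding, and the arguments are hypothetical and are not perplab's API.

```python
def funding_cashflow(side: int, notional: float, funding_rate: float) -> float:
    """Cash received (+) or paid (-) by a position at one funding timestamp.

    side: +1 for long, -1 for short (hypothetical encoding).
    notional: absolute position value in USDT at the funding mark price.
    funding_rate: signed rate for the interval, e.g. 0.0001 for 0.01%.
    """
    if side not in (1, -1):
        raise ValueError("side must be +1 (long) or -1 (short)")
    # Positive rate: longs pay, shorts receive. Negative rate: the reverse.
    return -side * notional * funding_rate


long_flow = funding_cashflow(+1, 10_000.0, 0.0001)   # long pays funding
short_flow = funding_cashflow(-1, 10_000.0, 0.0001)  # short receives funding
```

A cost model that charges funding to both sides regardless of sign would make shorts look worse and longs look better than they are whenever the rate is positive, which is the kind of error the sign correction guards against.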
perplab is not yet on PyPI. Clone the repo, build the frontend bundle once, then run the server. Tested on macOS, Linux, and Windows under Python 3.11+ and Node 20+.
git clone https://github.com/analogdada/perplab.git
cd perplab
python3.11 -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -e .
bash scripts/build_frontend.sh
perplab serve

Open http://localhost:8723 in your browser.
The frontend bundle is built locally rather than committed so clones stay
small and the build matches your Node version. Re-run
bash scripts/build_frontend.sh after pulling frontend changes.
The starter pack ships with two parquet files (BTCUSDT and ETHUSDT, 4h, six months) so the first backtest works before you download anything. Add more data from the Data page once Binance keys are configured in Settings.
Seven textbook strategies, each in a single file under
src/perplab/strategy/library/. Use them as research starting points or as
templates for your own.
| Name | Mechanic |
|---|---|
| ema_crossover | Long when fast EMA > slow EMA, flip on inverse cross. |
| trend_follow | ADX-strength + EMA-bias regime filter. |
| bollinger_squeeze | Trades the first breakout after a Bollinger-inside-Keltner squeeze. |
| bollinger_mean_reversion | Fades extremes back to the band midline. |
| volatility_breakout | Long on close > prior-N high + ATR buffer. |
| pullback_retest | Stateful retest of a recent breakout level. |
| donchian_breakout | Classic N-bar high/low breakout. |
Defaults are textbook (round numbers). They are not edges. Every backtest must clear the gate funnel on its own.
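To make the first row of the table concrete, here is a minimal pure-Python sketch of an EMA crossover signal with a one-bar shift so that no bar trades on its own close. It does not use perplab's Strategy contract; the function names and the `fast`/`slow` defaults are assumptions chosen for illustration.

```python
def ema(values: list[float], span: int) -> list[float]:
    """Exponential moving average with alpha = 2 / (span + 1), seeded at the first value."""
    alpha = 2.0 / (span + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out


def ema_crossover_positions(closes: list[float], fast: int = 3, slow: int = 5) -> list[int]:
    """Return the target position per bar: +1 long, -1 short, 0 flat.

    The signal computed on bar t's close is applied on bar t+1, so the
    position held during a bar never depends on that bar's own close.
    """
    fast_ema, slow_ema = ema(closes, fast), ema(closes, slow)
    raw = [1 if f > s else -1 if f < s else 0 for f, s in zip(fast_ema, slow_ema)]
    return [0] + raw[:-1]  # shift by one bar to avoid lookahead


closes = [100, 101, 103, 106, 104, 99, 95, 96, 100, 105]
positions = ema_crossover_positions(closes)
```

A quick lookahead check is to change the final close and confirm that the position list is unchanged, since the last bar's signal would only be acted on in a bar that does not exist yet.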
The gate funnel. Six gates filter overfit and cost-fragile candidates before any verdict is reached. Sample size guards against statistical noise. PBO (Bailey & López de Prado 2014) estimates the probability that the parameter set ranked best in-sample underperforms the median out-of-sample; on single-parameter runs, where there is nothing to rank, it is skipped rather than faked with a fallback. OOS Sortino, family-grouped deflated Sharpe, R-multiple max drawdown, and CPCV-fold stability cover the rest.
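The PBO gate can be sketched with combinatorially symmetric cross-validation: split the history into blocks, treat every half of the blocks as in-sample, and count how often the in-sample winner ranks at or below the out-of-sample median. This is a simplified illustration, not perplab's implementation; it uses mean return as the metric where the original paper uses the Sharpe ratio, and the function name and arguments are hypothetical.

```python
import math
from itertools import combinations
from statistics import mean


def pbo_cscv(returns: list[list[float]], n_blocks: int = 6, metric=mean) -> float:
    """Estimate PBO from per-period returns, one inner list per parameter set."""
    if len(returns) < 2:
        raise ValueError("PBO needs at least two parameter sets to rank")
    # Trailing periods that do not fill a whole block are dropped.
    block_len = len(returns[0]) // n_blocks
    blocks = [range(b * block_len, (b + 1) * block_len) for b in range(n_blocks)]
    logits = []
    for is_ids in combinations(range(n_blocks), n_blocks // 2):
        is_idx = [t for b in is_ids for t in blocks[b]]
        oos_idx = [t for b in range(n_blocks) if b not in is_ids for t in blocks[b]]
        is_perf = [metric([r[t] for t in is_idx]) for r in returns]
        oos_perf = [metric([r[t] for t in oos_idx]) for r in returns]
        best = max(range(len(returns)), key=is_perf.__getitem__)
        # Out-of-sample rank of the in-sample winner: 1 = worst, N = best.
        rank = 1 + sum(p < oos_perf[best] for p in oos_perf)
        omega = rank / (len(returns) + 1)
        logits.append(math.log(omega / (1 - omega)))
    # Share of splits where the in-sample winner is at or below the OOS median.
    return sum(x <= 0 for x in logits) / len(logits)


# A set that dominates everywhere versus two sets that swap roles halfway.
dominant = [[0.01] * 12, [0.0] * 12, [-0.01] * 12]
flipping = [[1] * 6 + [-1] * 6, [-1] * 6 + [1] * 6]
pbo_dominant = pbo_cscv(dominant)
pbo_flipping = pbo_cscv(flipping)
```

The two toy inputs bracket the range: a parameter set that wins in every block gives a PBO of 0, while two sets whose performance flips between the first and second half give a PBO of 1.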
Five-tier verdict. Each backtest lands in one of five tiers — STRONG,
PLAUSIBLE, MIXED, WEAK, REJECT — computed from the share of
applicable gates passed, the cost-stress class, and a clamped Sortino score.
STRONG requires every applicable gate to pass, COST_ROBUST cost stress,
and a strong Sortino. The verdict is a hint to inspect the report, not a
deployment signal.
Cost stress. Backtests are rerun at 1.5×, 2×, and 3× costs. A strategy
that only works at the modeled cost level is fragile. COST_PARADOX (Sortino
climbs with costs) is a red flag for a sign error in the user's cost
model — the engine catches and reports it.
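The classification step described above can be sketched as a small function over the Sortino ratios from the baseline and stressed reruns. Only the multipliers, the four class names, and the meaning of COST_PARADOX come from this README; the cut-offs between ROBUST, MARGINAL, and FRAGILE below are illustrative assumptions, and the function name is hypothetical.

```python
COST_MULTIPLIERS = (1.5, 2.0, 3.0)  # stress levels named in the text


def classify_cost_stress(base_sortino: float, stressed: dict[float, float]) -> str:
    """Map Sortino at 1x costs and at each stress multiplier to a class.

    `stressed` maps each multiplier to the Sortino ratio of the rerun.
    The thresholds are assumptions for illustration, not perplab's rules.
    """
    s15, s2, s3 = (stressed[m] for m in COST_MULTIPLIERS)
    # Higher costs should never improve risk-adjusted returns; if they do,
    # suspect a sign error somewhere in the cost model.
    if max(s15, s2, s3) > base_sortino:
        return "COST_PARADOX"
    if s3 > 0:
        return "COST_ROBUST"    # assumed: still positive at 3x costs
    if s2 > 0:
        return "COST_MARGINAL"  # assumed: survives 2x but not 3x
    return "COST_FRAGILE"       # assumed: fails at or below 2x


label = classify_cost_stress(1.2, {1.5: 1.0, 2.0: 0.7, 3.0: 0.2})
```

Checking the paradox condition first matters: a run whose Sortino rises with costs would otherwise be labeled robust, hiding the modeling error.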
CPCV. Combinatorial Purged Cross-Validation (López de Prado) splits the timeline into N blocks, holds out every combination of k blocks as the test set, and purges a buffer around each test block to prevent leakage. perplab uses N=6, k=2 (C(6,2) = 15 OOS estimates per parameter set) with a 2× max-lookback embargo.
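The split generation described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not perplab's code: the function name is hypothetical, blocks are equal-width index ranges, and the purge is a symmetric buffer of `embargo` bars on both sides of each test block. The example uses an assumed maximum lookback of 10 bars, giving an embargo of 20.

```python
from itertools import combinations


def cpcv_splits(n_samples: int, n_blocks: int = 6, k_test: int = 2, embargo: int = 0):
    """Yield (train_idx, test_idx) for every choice of k_test test blocks.

    Training samples within `embargo` bars of any test block are removed
    so indicator warm-up windows cannot straddle the train/test boundary.
    """
    bounds = [i * n_samples // n_blocks for i in range(n_blocks + 1)]
    blocks = [(bounds[i], bounds[i + 1]) for i in range(n_blocks)]  # half-open
    for test_ids in combinations(range(n_blocks), k_test):
        test_idx = [t for b in test_ids for t in range(*blocks[b])]
        excluded = set()
        for b in test_ids:
            start, stop = blocks[b]
            lo = max(0, start - embargo)
            hi = min(n_samples, stop + embargo)
            excluded.update(range(lo, hi))  # test block plus its buffer
        train_idx = [t for t in range(n_samples) if t not in excluded]
        yield train_idx, test_idx


max_lookback = 10  # hypothetical longest indicator window, in bars
splits = list(cpcv_splits(n_samples=600, embargo=2 * max_lookback))
```

With six blocks and two held out per split, the generator produces 15 splits, matching the count quoted above.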
perplab is designed to work well with modern AI coding assistants. Three docs at the repo root brief them:
- CLAUDE.md: load-bearing pieces, common pitfalls, things not to refactor without understanding.
- AGENT_GUIDE.md: repo structure, the Strategy contract, where to find indicators, costs, metrics, and validation.
- SKILL.md: step-by-step workflows (add a strategy, run a sweep, translate a Pine script, diagnose a failing backtest, run pre-commit, build the frontend).
The most common request — "help me write a strategy" — is a single file
under src/perplab/strategy/library/ plus a one-line registration.
- METHODOLOGY.md: research-to-production playbook, abbreviated public version. Cites Bailey & López de Prado.
- BEFORE_GOING_LIVE.md: pre-deployment checklist for what perplab does not validate.
- AGENT_GUIDE.md, SKILL.md, CLAUDE.md: agent-facing.
- SPEC.md: full technical specification.
- CONTRIBUTING.md: contribution guidelines.
- docs/: in-app docs sources (glossary, gates, robustness tests, writing strategies).
AGPL-3.0-or-later. See LICENSE. In practice: you can fork,
modify, and self-host without restriction. If you run a modified version as
a network service, you must release the modifications under the same
license. Commercial use is permitted under the same terms.
perplab is research software. Trading futures with leverage is risky and most retail strategies lose money after costs. The verdict, gates, and robustness tests are statistical filters, not predictions. Nothing in this repository is financial advice. You are solely responsible for any decisions you make using the output of this software.