perplab

Local-first backtest framework for Binance USDT-M perpetual futures.

perplab is a research environment for intermediate algorithmic traders who want to know whether a strategy survives honest validation. It ships the statistical machinery — cost-aware simulation, CPCV, PBO, deflated Sharpe, robustness tests, a five-tier verdict — and ships zero edges. The framing is deliberate: the engine, not the edges.

What it does

Downloads Binance USDT-M kline history into a local DuckDB cache. Your data stays on your machine.
Runs vectorized, no-lookahead backtests against a cost model that accounts for fees, spread, slippage, and funding (with sign-corrected long/short cashflows).
Validates every backtest through a six-gate funnel: sample size, PBO, OOS Sortino, family-grouped deflated Sharpe, max drawdown in R-multiples, and CPCV-fold stability.
Stress-tests each candidate at 1.5×, 2×, and 3× costs and classifies it as COST_ROBUST, COST_MARGINAL, COST_FRAGILE, or COST_PARADOX.
Runs four robustness checks: bootstrap, parameter jitter, monkey test, shuffled-bar test.
Summarizes the result as a five-tier verdict — STRONG, PLAUSIBLE, MIXED, WEAK, REJECT — with strengths, weaknesses, and improvement hints.
Sweeps parameter grids in parallel with a live gate-funnel die-off chart.
Bundles a refined dark React UI with a glossary tooltip on every term.
Includes an LLM Playground for chat, strategy drafting, and Pine→perplab translation. It supports multiple providers, redacts API keys, and saves strategies on demand with no server restart.
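
The cost-model item above mentions no-lookahead execution and sign-corrected funding cashflows. The sketch below illustrates both ideas in plain Python. It is not perplab's implementation: the function name, the `fee_rate` default, and the simplification that a funding rate is supplied for every bar are assumptions made for illustration, and spread and slippage are left out. It relies on the perpetual-futures convention that a positive funding rate means longs pay shorts.

```python
def net_bar_returns(closes, signals, funding_rates, fee_rate=0.0004):
    """Per-bar net returns for a position in {-1, 0, +1}.

    signals[i] is decided on the close of bar i and is only held from
    bar i+1 onward, so no bar trades on information from its own close.
    All names and the fee_rate default are hypothetical.
    """
    n = len(closes)
    assert len(signals) == n and len(funding_rates) == n
    out = []
    prev_pos = 0
    for i in range(1, n):
        pos = signals[i - 1]                   # act on the previous bar's signal
        price_ret = closes[i] / closes[i - 1] - 1.0
        gross = pos * price_ret
        fee = abs(pos - prev_pos) * fee_rate   # fees scale with the position change
        # Positive funding: longs pay (negative cashflow), shorts receive.
        funding = -pos * funding_rates[i]
        out.append(gross - fee + funding)
        prev_pos = pos
    return out
```

With a positive funding rate, a long position (+1) gets a negative funding term and a short position (-1) a positive one, which is the sign correction the feature list refers to.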

What it doesn't do

It does not place orders. There is no live, paper, or shadow trading code path — read BEFORE_GOING_LIVE.md for what you still have to build.
It does not ship tuned thresholds or curated strategy parameters. Defaults are intentionally relaxed; you calibrate for your universe.
It does not phone home. No telemetry, no analytics, no remote logging. Everything runs locally: your API keys, your data, your machine.
It does not promise that a strategy "works." The verdict is a hint, not a recommendation.

Quickstart

perplab is not yet on PyPI. Clone the repo, build the frontend bundle once, then run the server. Tested on macOS, Linux, and Windows under Python 3.11+ and Node 20+.

git clone https://github.com/analogdada/perplab.git
cd perplab
python3.11 -m venv .venv
source .venv/bin/activate           # Windows: .venv\Scripts\activate
pip install -e .
bash scripts/build_frontend.sh
perplab serve

Open http://localhost:8723 in your browser.

The frontend bundle is built locally rather than committed so clones stay small and the build matches your Node version. Re-run bash scripts/build_frontend.sh after pulling frontend changes.

The starter pack ships with two Parquet files (BTCUSDT and ETHUSDT, 4h bars, six months of history) so the first backtest works before you download anything. Add more data from the Data page once Binance keys are configured in Settings.

Built-in strategies

Seven textbook strategies, each in a single file under src/perplab/strategy/library/. Use them as research starting points or as templates for your own.

Name	Mechanic
`ema_crossover`	Long when fast EMA > slow EMA, flip on inverse cross.
`trend_follow`	ADX-strength + EMA-bias regime filter.
`bollinger_squeeze`	Trades the first breakout after a Bollinger-inside-Keltner squeeze.
`bollinger_mean_reversion`	Fades extremes back to the band midline.
`volatility_breakout`	Long on close > prior-N high + ATR buffer.
`pullback_retest`	Stateful retest of a recent breakout level.
`donchian_breakout`	Classic N-bar high/low breakout.

Defaults are textbook (round numbers). They are not edges. Every backtest must clear the gate funnel on its own.
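
To make the first row of the table concrete, here is a standalone sketch of an EMA crossover signal. It does not use perplab's Strategy contract, which is documented in AGENT_GUIDE.md; the function names and the `fast`/`slow` defaults are hypothetical, and the input is assumed to be a non-empty list of closes.

```python
def ema(values, span):
    """Exponential moving average with smoothing alpha = 2 / (span + 1)."""
    alpha = 2.0 / (span + 1)
    out = [values[0]]                      # seed with the first observation
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out


def ema_crossover_signal(closes, fast=10, slow=30):
    """+1 while the fast EMA is above the slow EMA, -1 while below, 0 when equal."""
    f, s = ema(closes, fast), ema(closes, slow)
    return [(a > b) - (a < b) for a, b in zip(f, s)]
```

The signal at bar i uses closes up to and including bar i, so a backtest should act on it from the next bar onward.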

Key concepts

The gate funnel. Six gates filter overfit and cost-fragile candidates before any verdict is reached. The sample-size gate guards against statistical noise. PBO, the probability of backtest overfitting (Bailey & López de Prado 2014), estimates how often the configuration that ranks best in-sample falls below the median out of sample. It needs several parameter sets to rank, so it is skipped on single-parameter runs rather than faked with a fallback. OOS Sortino, family-grouped deflated Sharpe, R-multiple max drawdown, and CPCV-fold stability cover the rest.
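
The PBO idea can be sketched with combinatorially symmetric splits. This is an illustration, not perplab's code: the function name and input layout are hypothetical, and performance is aggregated by summing per-block scores, whereas a fuller treatment would typically recompute a Sharpe-like statistic on each half.

```python
from itertools import combinations
from math import log


def pbo(block_perf):
    """block_perf[b][c] is the performance of config c on time block b.

    Returns the share of in-sample/out-of-sample splits in which the
    in-sample best config ranks at or below the out-of-sample median.
    """
    n_blocks = len(block_perf)
    n_cfg = len(block_perf[0])
    # PBO needs an even number of blocks and at least two configs to rank.
    assert n_blocks % 2 == 0 and n_cfg >= 2
    logits = []
    for is_blocks in combinations(range(n_blocks), n_blocks // 2):
        oos_blocks = [b for b in range(n_blocks) if b not in is_blocks]
        is_score = [sum(block_perf[b][c] for b in is_blocks) for c in range(n_cfg)]
        oos_score = [sum(block_perf[b][c] for b in oos_blocks) for c in range(n_cfg)]
        best = max(range(n_cfg), key=is_score.__getitem__)
        # Rank 1 = worst out of sample, n_cfg = best out of sample.
        rank = 1 + sum(s < oos_score[best] for s in oos_score)
        omega = rank / (n_cfg + 1)
        logits.append(log(omega / (1 - omega)))
    return sum(x <= 0 for x in logits) / len(logits)
```

A config that wins on every block yields a PBO of 0, while configs whose ranking flips between halves push it toward 1. The assertion on `n_cfg` mirrors why a single-parameter run has nothing to rank.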

Five-tier verdict. Each backtest lands in one of five tiers — STRONG, PLAUSIBLE, MIXED, WEAK, REJECT — computed from the share of applicable gates passed, the cost-stress class, and a clamped Sortino score. STRONG requires every applicable gate to pass, COST_ROBUST cost stress, and a strong Sortino. The verdict is a hint to inspect the report, not a deployment signal.
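
The STRONG rule stated above can be written as a small predicate. This is a sketch under assumptions: the function name, the convention that a skipped gate is recorded as `None`, and the `strong_sortino` threshold are all hypothetical, since the source does not state what counts as a strong Sortino.

```python
def is_strong(gate_results, cost_class, sortino, strong_sortino=2.0):
    """gate_results maps gate name -> True/False, or None when the gate
    was not applicable (for example PBO on a single-parameter run).

    strong_sortino is a placeholder threshold, not a perplab default.
    """
    applicable = [v for v in gate_results.values() if v is not None]
    return (
        bool(applicable)                 # at least one gate actually ran
        and all(applicable)              # every applicable gate passed
        and cost_class == "COST_ROBUST"
        and sortino >= strong_sortino
    )
```

Treating skipped gates as not applicable, rather than as passes or failures, keeps the share of gates passed honest when a gate cannot be computed.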

Cost stress. Backtests are rerun at 1.5×, 2×, and 3× costs. A strategy that only works at the modeled cost level is fragile. COST_PARADOX (Sortino climbs with costs) is a red flag for a sign error in the user's cost model — the engine catches and reports it.
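
A minimal sketch of the cost-stress rerun follows, assuming per-bar gross returns and per-bar costs are already available as lists. The function names are hypothetical, and the boundaries between COST_ROBUST, COST_MARGINAL, and COST_FRAGILE are deliberately not reproduced because the source does not state them. Only the paradox check (Sortino higher at 3× costs than at 1×) is shown.

```python
from math import sqrt

COST_MULTIPLIERS = (1.0, 1.5, 2.0, 3.0)


def sortino(returns, target=0.0):
    """Mean excess return divided by downside deviation (not annualized)."""
    downside_sq = [min(r - target, 0.0) ** 2 for r in returns]
    dd = sqrt(sum(downside_sq) / len(returns))
    mean_excess = sum(returns) / len(returns) - target
    if dd == 0.0:
        return float("inf") if mean_excess > 0 else 0.0
    return mean_excess / dd


def stress_sortinos(gross, costs):
    """Sortino of net returns with per-bar costs scaled by each multiplier."""
    return {
        m: sortino([g - m * c for g, c in zip(gross, costs)])
        for m in COST_MULTIPLIERS
    }


def looks_paradoxical(sortinos):
    """True if Sortino at 3x costs exceeds Sortino at the modeled costs."""
    return sortinos[3.0] > sortinos[1.0]
```

For a profitable strategy, a cost series with the wrong sign (negative costs, which act as a rebate) makes returns improve as the multiplier grows, which is the situation the paradox flag is meant to surface.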

CPCV. Combinatorial Purged Cross-Validation (López de Prado) splits the timeline into N blocks, holds out every combination of k blocks as the test set while training on the rest, and purges a buffer around each test block to prevent leakage. perplab uses N=6, k=2, which gives C(6,2) = 15 train/test splits (15 OOS estimates per parameter set), with a 2× max-lookback embargo.
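
The split enumeration can be sketched with `itertools.combinations`. This is an illustration rather than perplab's implementation: the function name and the `embargo` parameter (a bar count applied symmetrically around each test block) are assumptions; under the description above, a caller would pass twice the strategy's maximum lookback.

```python
from itertools import combinations


def cpcv_splits(n_bars, n_blocks=6, k_test=2, embargo=0):
    """Yield (train_idx, test_idx) for every choice of k_test test blocks.

    Bars inside a test block, or within `embargo` bars of one, are
    dropped from the training set.
    """
    bounds = [round(i * n_bars / n_blocks) for i in range(n_blocks + 1)]
    blocks = [range(bounds[i], bounds[i + 1]) for i in range(n_blocks)]
    for test_ids in combinations(range(n_blocks), k_test):
        test_idx = [i for b in test_ids for i in blocks[b]]
        excluded = set()
        for b in test_ids:
            lo = max(blocks[b].start - embargo, 0)
            hi = min(blocks[b].stop + embargo, n_bars)
            excluded.update(range(lo, hi))
        train_idx = [i for i in range(n_bars) if i not in excluded]
        yield train_idx, test_idx
```

With the defaults, `len(list(cpcv_splits(600, embargo=60)))` is 15, one split per pair of held-out blocks.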

For AI coding agents

perplab is designed to work well with modern AI coding assistants. Three docs at the repo root brief them:

CLAUDE.md — load-bearing pieces, common pitfalls, things not to refactor without understanding.
AGENT_GUIDE.md — repo structure, the Strategy contract, where to find indicators, costs, metrics, and validation.
SKILL.md — step-by-step workflows: add a strategy, run a sweep, translate a Pine script, diagnose a failing backtest, run pre-commit, build the frontend.

The most common request, "help me write a strategy," comes down to a single file under src/perplab/strategy/library/ plus a one-line registration.

Documentation

METHODOLOGY.md — research-to-production playbook, abbreviated public version. Cites Bailey & López de Prado.
BEFORE_GOING_LIVE.md — pre-deployment checklist for what perplab does not validate.
AGENT_GUIDE.md, SKILL.md, CLAUDE.md — agent-facing.
SPEC.md — full technical specification.
CONTRIBUTING.md — contribution guidelines.
docs/ — in-app docs sources (glossary, gates, robustness tests, writing strategies).

License

AGPL-3.0-or-later. See LICENSE. In practice: you can fork, modify, and self-host for your own use. If you run a modified version as a network service, you must make the modified source available to its users under the same license. Commercial use is permitted under the same terms.

Disclaimer

perplab is research software. Trading futures with leverage is risky and most retail strategies lose money after costs. The verdict, gates, and robustness tests are statistical filters, not predictions. Nothing in this repository is financial advice. You are solely responsible for any decisions you make using the output of this software.
