Stabilize Polymarket live runner smoke path #11

Merged: Mathews-Tom merged 11 commits into main from fix/polymarket-live-runner on May 17, 2026

@Mathews-Tom

Owner

Summary

This PR turns the merged single-process runner into a validated Polymarket live-smoke path and records the operational evidence needed to continue the proof loop.

It covers the full branch scope:

  • seeds a bounded active Polymarket watchlist in config/markets.toml
  • fixes Polymarket live API usage across Gamma metadata, CLOB order books, and Data API trades
  • keeps configured market IDs stable while using CLOB token IDs only for order-book calls
  • persists computed feature vectors before detector dispatch
  • adds operator-visible one-shot and continuous progress summaries on stderr while preserving stdout for canonical SignalContext JSON
  • adds smoke controls for feature warmup and summary cadence
  • handles live runner interrupts cleanly
  • ignores local DuckDB and live-run capture artifacts
  • updates manual testing docs with one-shot, short warmup, and default-warmup evidence

Details

Polymarket ingestion and runner wiring

  • Added targeted Polymarket market polling through Gamma conditionId lookup.
  • Normalized Gamma payloads into the existing snapshot normalizer contract.
  • Switched order-book polling to use the primary CLOB token ID instead of the condition ID.
  • Switched trade polling to Polymarket Data API by condition ID.
  • Kept internal configured market IDs stable in snapshots, trades, features, and signals.
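The ID routing described above can be sketched as follows. This is a minimal illustration, not the code in this PR: `MarketIds` and `resolve_ids` are hypothetical names, and the payload shape (a `conditionId` field plus a `clobTokenIds` field that may arrive as a JSON-encoded list string) is an assumption about Gamma responses.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class MarketIds:
    """Identifiers for one watched market (hypothetical container)."""

    market_id: str      # stable configured ID, used in snapshots, trades, features, signals
    condition_id: str   # used for Gamma metadata lookup and Data API trades
    clob_token_id: str  # primary token ID, used only for CLOB order-book calls


def resolve_ids(market_id: str, gamma_payload: dict) -> MarketIds:
    """Build the ID triple from a Gamma-style market payload.

    Assumes `clobTokenIds` is either a list or a JSON-encoded list string.
    """
    raw = gamma_payload["clobTokenIds"]
    token_ids = json.loads(raw) if isinstance(raw, str) else list(raw)
    if not token_ids:
        raise ValueError(f"no CLOB token IDs for {market_id}")
    return MarketIds(
        market_id=market_id,
        condition_id=gamma_payload["conditionId"],
        clob_token_id=str(token_ids[0]),  # the first token is treated as primary
    )


# Illustrative payload; the values are made up.
payload = {"conditionId": "0xabc", "clobTokenIds": '["111", "222"]'}
ids = resolve_ids("polymarket_example_market", payload)
print(ids.market_id, ids.condition_id, ids.clob_token_id)
```

Keeping the three identifiers in one immutable record makes it harder to pass a condition ID where a token ID is expected, and the configured `market_id` never changes regardless of which API is being called.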

Watchlist and relationships

  • Replaced the inactive placeholder watchlist with 12 active Polymarket markets.
  • Added manual relationship edges for adjacent MicroStrategy BTC-sale markets, adjacent BTC/ETH ETF-flow markets, and one China-related geopolitical pair.
  • Kept the seed Polymarket-only so local live smoke does not require Kalshi credentials.

Runtime observability

  • --once now emits a concise run summary to stderr.
  • Continuous mode emits progress summaries every --summary-every-cycles cycles.
  • Added --feature-warmup-size for short smoke runs while leaving the default warmup at 50 observations per market.
  • Ctrl-C now exits cleanly with run_engine stopped: interrupted and exit code 130 instead of printing a traceback.
  • Stdout remains reserved for canonical signal JSON.
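A rough sketch of the stream separation and interrupt handling listed above, assuming a simplified loop. The `run` function, its parameters, and the shortened summary line are illustrative; only the `run_engine stopped: interrupted` message and exit code 130 come from this PR.

```python
import io
import json
import sys
from typing import Iterable


def run(cycles: Iterable[dict], summary_every: int = 1,
        out=sys.stdout, err=sys.stderr) -> int:
    """Write signals to `out` and summaries to `err`; return an exit code."""
    try:
        for n, cycle in enumerate(cycles, start=1):
            for signal in cycle.get("signals", []):
                # Only canonical JSON goes to stdout, one object per line.
                out.write(json.dumps(signal, sort_keys=True) + "\n")
            if n % summary_every == 0:
                # Simplified summary line; the real one carries more fields.
                err.write(f"augur run summary: status=ok cycle={n}\n")
    except KeyboardInterrupt:
        err.write("run_engine stopped: interrupted\n")
        return 130  # conventional exit status for termination by SIGINT
    return 0


def fake_cycles():
    """Two cycles, then a simulated Ctrl-C."""
    yield {"signals": []}
    yield {"signals": [{"market_id": "m1", "signal_type": "price_velocity"}]}
    raise KeyboardInterrupt


out, err = io.StringIO(), io.StringIO()
code = run(fake_cycles(), out=out, err=err)
print(code)
```

Because operator messages never touch stdout, redirecting stdout to a `.jsonl` file yields a capture that a line-by-line JSON parser can consume without filtering.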

Persistence

  • Engine.run_cycle() now persists computed FeatureVectors before detector dispatch.
  • The features table is now a real operational progress signal instead of remaining empty during warmed runs.
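The ordering change can be illustrated with a small sketch. `InMemoryStore`, `insert_features`, and the free function `run_cycle` below are hypothetical stand-ins, not the actual `Engine.run_cycle()` implementation or its storage layer; the point is only that persistence happens before any detector runs.

```python
from typing import Callable, List


class InMemoryStore:
    """Hypothetical stand-in for the DuckDB-backed storage layer."""

    def __init__(self) -> None:
        self.features: List[dict] = []

    def insert_features(self, vectors: List[dict]) -> None:
        self.features.extend(vectors)


def run_cycle(vectors: List[dict], store: InMemoryStore,
              detectors: List[Callable[[List[dict]], List[dict]]]) -> List[dict]:
    # Persist first, so stored features reflect progress even when
    # detectors emit nothing or raise.
    store.insert_features(vectors)
    signals: List[dict] = []
    for detect in detectors:
        signals.extend(detect(vectors))
    return signals


def failing_detector(vectors: List[dict]) -> List[dict]:
    raise RuntimeError("detector failed")


store = InMemoryStore()
vectors = [{"market_id": "m1", "price_velocity": 0.02}]
try:
    run_cycle(vectors, store, [failing_detector])
except RuntimeError:
    pass
print(len(store.features))  # 1: the vector was stored before the detector ran
```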

Live validation evidence

One-shot smoke

uv run python scripts/run_engine.py --once now reports live progress and persists snapshots.

Example shape:

augur run summary: status=ok mode=once cycle=1 storage=duckdb:data/augur.duckdb
  markets: active=12 platforms=polymarket:12 snapshots=12
  outputs: trades=<market-dependent> features=0 signals=0
  note: feature buffers are still warming; configured warmup is 50 observations per market, estimated remaining cycles=49, and --once starts a fresh in-memory buffer

Short warmup smoke

A short smoke with --feature-warmup-size 2 showed the expected transition:

cycle=1 features=0
cycle=2 features=12
cycle=3 features=12
cycle=4 features=12
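The transition above, and the "estimated remaining cycles=49" note in the one-shot summary, are consistent with a per-market buffer that yields a feature vector only once it holds the configured number of observations. The sketch below models that behavior under this assumption; `WarmupBuffer` and `simulate` are hypothetical names, not the engine's classes.

```python
from collections import deque
from typing import List


class WarmupBuffer:
    """Fixed-size observation buffer that is 'ready' once it is full."""

    def __init__(self, warmup_size: int) -> None:
        self.warmup_size = warmup_size
        self._obs: deque = deque(maxlen=warmup_size)

    def push(self, price: float) -> bool:
        """Add one observation; return True if a feature can be computed."""
        self._obs.append(price)
        return self.ready

    @property
    def ready(self) -> bool:
        return len(self._obs) >= self.warmup_size

    @property
    def remaining(self) -> int:
        """Cycles still needed before the first feature vector."""
        return max(self.warmup_size - len(self._obs), 0)


def simulate(markets: int, warmup_size: int, cycles: int) -> List[int]:
    """Return the number of feature vectors produced in each cycle."""
    buffers = [WarmupBuffer(warmup_size) for _ in range(markets)]
    return [sum(buf.push(0.5) for buf in buffers) for _ in range(cycles)]


print(simulate(markets=12, warmup_size=2, cycles=4))  # [0, 12, 12, 12]

default = WarmupBuffer(50)
default.push(0.5)
print(default.remaining)  # 49
```

Under the same model, a fresh process with the default warmup of 50 produces its first features on cycle 50, which matches the default-warmup capture.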

Default warmup capture

A longer default-warmup capture ran with:

uv run python scripts/run_engine.py --poll-seconds 60 --summary-every-cycles 1 \
  > data/run_engine.signals.jsonl \
  2> data/run_engine.progress.log

Observed results after stopping the runner:

| Metric | Value |
| --- | --- |
| progress summaries | 105 |
| first feature cycle | 50 |
| first signal cycle | 103 |
| snapshots | 1416 |
| features | 732 |
| signals | 1 |

The emitted signal was a price_velocity context for polymarket_btc_etf_flows_may_18_2026 with magnitude/confidence 0.873316 and manipulation flag thin_book_during_move. This is evidence that the live monolith path can emit canonical contexts from real market data. It is not a claim that confidence is production-calibrated.

Validation

Passed locally:

uv run ruff check .
uv run ruff format --check .
uv run mypy --strict src/
uv run python scripts/export_schemas.py --check
uv run pytest

Full test result:

376 passed

Known boundaries

  • scripts/backtest.py still raises NotImplementedError.
  • scripts/calibrate.py still raises NotImplementedError.
  • Confidence values still use identity calibration until the backtest, labels, and calibration loop exist.
  • The monolith is still daemon-shaped; repeated --once runs do not rehydrate feature buffers from historical snapshots.
  • The emitted live signal should be reviewed as a detector-quality candidate, especially because it came from a low-liquidity ETF-flow market and carried thin_book_during_move.

Next work after this PR

Per .docs/current-development-state.md, the next proof-loop item is scripts/backtest.py:

  1. implement uv run python scripts/backtest.py --help
  2. replay DuckDB or fixture snapshots through the same feature and detector path
  3. emit stable JSON and Markdown reports
  4. include snapshot counts, markets covered, detector signal counts, deduped counts, manipulation flag distribution, and unlabeled signal counts
  5. use the emitted ETF-flow signal as an early detector-review case
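As a starting point for items 3 and 4, the report aggregation might look like the sketch below. Everything here is an assumption for illustration: the `build_report` name, the record fields (`market_id`, `signal_type`, `timestamp`, `manipulation_flags`, `label`), and the dedup key are hypothetical and not taken from the existing schemas.

```python
import json
from collections import Counter
from typing import List


def build_report(snapshots: List[dict], signals: List[dict]) -> dict:
    """Aggregate replay results into a JSON-serializable summary."""
    # Assumed dedup key: one signal per market, detector, and timestamp.
    deduped = {(s["market_id"], s["signal_type"], s["timestamp"]) for s in signals}
    flags = Counter(f for s in signals for f in s.get("manipulation_flags", []))
    return {
        "snapshot_count": len(snapshots),
        "markets_covered": sorted({s["market_id"] for s in snapshots}),
        "signals_by_detector": dict(Counter(s["signal_type"] for s in signals)),
        "deduped_signal_count": len(deduped),
        "manipulation_flags": dict(flags),
        "unlabeled_signal_count": sum(1 for s in signals if s.get("label") is None),
    }


# Made-up replay data: two snapshots and one duplicated signal.
snapshots = [{"market_id": "m1"}, {"market_id": "m2"}]
signal = {
    "market_id": "m1",
    "signal_type": "price_velocity",
    "timestamp": "2026-05-17T00:00:00Z",
    "manipulation_flags": ["thin_book_during_move"],
}
report = build_report(snapshots, [signal, dict(signal)])

# sort_keys keeps the serialized report byte-stable across runs.
print(json.dumps(report, sort_keys=True, indent=2))
```

Serializing with `sort_keys=True` is one simple way to keep the JSON report stable enough to diff between backtest runs.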

@Mathews-Tom Mathews-Tom merged commit 38ce652 into main May 17, 2026
2 checks passed
@Mathews-Tom Mathews-Tom deleted the fix/polymarket-live-runner branch May 17, 2026 15:20