Stabilize Polymarket live runner smoke path by Mathews-Tom · Pull Request #11 · Mathews-Tom/augur

Mathews-Tom · 2026-05-17T15:15:46Z

Summary

This PR turns the merged single-process runner into a validated Polymarket live-smoke path and records the operational evidence needed to continue the proof loop.

It covers the full branch scope:

seeds a bounded active Polymarket watchlist in config/markets.toml
fixes Polymarket live API usage across Gamma metadata, CLOB order books, and Data API trades
keeps configured market IDs stable while using CLOB token IDs only for order-book calls
persists computed feature vectors before detector dispatch
adds operator-visible one-shot and continuous progress summaries on stderr while preserving stdout for canonical SignalContext JSON
adds smoke controls for feature warmup and summary cadence
handles live runner interrupts cleanly
ignores local DuckDB and live-run capture artifacts
updates manual testing docs with one-shot, short warmup, and default-warmup evidence

Details

Polymarket ingestion and runner wiring

Added targeted Polymarket market polling through Gamma conditionId lookup.
Normalized Gamma payloads into the existing snapshot normalizer contract.
Switched order-book polling to use the primary CLOB token ID instead of the condition ID.
Switched trade polling to Polymarket Data API by condition ID.
Kept internal configured market IDs stable in snapshots, trades, features, and signals.
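The identifier split described above can be sketched as follows. This is a minimal illustration, not the code in this PR: `PolymarketIds`, `upstream_id`, and `tag_record` are hypothetical names, and treating the first token in the list as the "primary" CLOB token is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolymarketIds:
    """Identifiers for one watched market (hypothetical field names)."""

    market_id: str                   # stable internal ID from the watchlist config
    condition_id: str                # used for Gamma metadata and Data API trades
    clob_token_ids: tuple[str, ...]  # outcome tokens; first is assumed primary

    def upstream_id(self, endpoint: str) -> str:
        """Return the identifier a given upstream call should use."""
        if endpoint == "order_book":
            if not self.clob_token_ids:
                raise ValueError(f"{self.market_id}: no CLOB token ID available")
            return self.clob_token_ids[0]
        if endpoint in ("metadata", "trades"):
            return self.condition_id
        raise ValueError(f"unknown endpoint: {endpoint}")


def tag_record(ids: PolymarketIds, payload: dict) -> dict:
    """Attach the internal market ID so stored rows never key on upstream IDs."""
    return {**payload, "market_id": ids.market_id}
```

Keeping the lookup in one place means snapshots, trades, features, and signals can all be keyed on `market_id`, while only the order-book call ever sees a token ID.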

Watchlist and relationships

Replaced the inactive placeholder watchlist with 12 active Polymarket markets.
Added manual relationship edges for adjacent MicroStrategy BTC-sale markets, adjacent BTC/ETH ETF-flow markets, and one China-related geopolitical pair.
Kept the seed Polymarket-only so local live smoke does not require Kalshi credentials.

Runtime observability

--once now emits a concise run summary to stderr.
Continuous mode emits progress summaries every --summary-every-cycles cycles.
Added --feature-warmup-size for short smoke runs while leaving the default warmup at 50 observations per market.
Ctrl-C now exits cleanly with run_engine stopped: interrupted and exit code 130 instead of printing a traceback.
Stdout remains reserved for canonical signal JSON.
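The interrupt and cadence behavior above could look roughly like this. It is a sketch under assumptions: `run_with_clean_interrupt` and `should_emit_summary` are hypothetical helpers, and the modulo-based cadence rule is a guess at how `--summary-every-cycles` is interpreted. Only the stderr message and exit code 130 come from this PR.

```python
import sys
from typing import Callable


def run_with_clean_interrupt(loop: Callable[[], None]) -> int:
    """Run the polling loop and map Ctrl-C to exit code 130 without a traceback."""
    try:
        loop()
    except KeyboardInterrupt:
        # Operator-facing text goes to stderr; stdout stays reserved for signal JSON.
        print("run_engine stopped: interrupted", file=sys.stderr)
        return 130
    return 0


def should_emit_summary(cycle: int, summary_every_cycles: int) -> bool:
    """Assumed cadence rule: emit on every Nth cycle, counting cycles from 1."""
    return summary_every_cycles > 0 and cycle % summary_every_cycles == 0
```

A script entry point would typically end with `raise SystemExit(run_with_clean_interrupt(main_loop))`, where `main_loop` stands in for the real runner loop.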

Persistence

Engine.run_cycle() now persists computed FeatureVectors before detector dispatch.
The features table is now a real operational progress signal instead of remaining empty during warmed runs.
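The ordering matters more than the storage details, so here is a minimal sketch of "persist, then dispatch". `FeatureStore`, `save_features`, and `run_cycle_step` are hypothetical stand-ins; the real `Engine.run_cycle()` signature is not shown in this PR description.

```python
from typing import Any, Callable, Iterable, Protocol


class FeatureStore(Protocol):
    """Hypothetical storage interface; the real one may differ."""

    def save_features(self, features: list[Any]) -> None: ...


Detector = Callable[[list[Any]], list[Any]]


def run_cycle_step(
    features: list[Any],
    store: FeatureStore,
    detectors: Iterable[Detector],
) -> list[Any]:
    """Persist computed features first, then hand them to each detector."""
    if not features:
        # Buffers are still warming: nothing to persist or detect this cycle.
        return []
    # Writing before dispatch keeps the features table populated even when
    # no detector fires or a detector raises.
    store.save_features(features)
    signals: list[Any] = []
    for detect in detectors:
        signals.extend(detect(features))
    return signals
```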

Live validation evidence

One-shot smoke

uv run python scripts/run_engine.py --once now reports live progress and persists snapshots.

Example shape:

augur run summary: status=ok mode=once cycle=1 storage=duckdb:data/augur.duckdb
  markets: active=12 platforms=polymarket:12 snapshots=12
  outputs: trades=<market-dependent> features=0 signals=0
  note: feature buffers are still warming; configured warmup is 50 observations per market, estimated remaining cycles=49, and --once starts a fresh in-memory buffer

Short warmup smoke

A short smoke with --feature-warmup-size 2 showed the expected transition:

cycle=1 features=0
cycle=2 features=12
cycle=3 features=12
cycle=4 features=12
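The transition above follows from a per-market buffer that only yields a feature once it holds the configured number of observations. The sketch below reproduces that arithmetic. `WarmupBuffers` and `features_per_cycle` are hypothetical names, and it assumes one observation per market per cycle.

```python
from collections import deque


class WarmupBuffers:
    """Per-market rolling buffers that gate feature output on a warmup size."""

    def __init__(self, warmup_size: int) -> None:
        if warmup_size < 1:
            raise ValueError("warmup_size must be at least 1")
        self.warmup_size = warmup_size
        self._buffers: dict[str, deque[float]] = {}

    def observe(self, market_id: str, value: float) -> bool:
        """Record one observation; return True once the market is warmed."""
        buf = self._buffers.setdefault(market_id, deque(maxlen=self.warmup_size))
        buf.append(value)
        return len(buf) >= self.warmup_size

    def remaining_cycles(self, market_id: str) -> int:
        """Observations still needed before this market produces features."""
        buf = self._buffers.get(market_id)
        return self.warmup_size - (len(buf) if buf is not None else 0)


def features_per_cycle(markets: list[str], warmup_size: int, cycles: int) -> list[int]:
    """Count how many markets would emit a feature vector on each cycle."""
    buffers = WarmupBuffers(warmup_size)
    counts = []
    for _ in range(cycles):
        counts.append(sum(buffers.observe(m, 0.5) for m in markets))
    return counts
```

With 12 markets and a warmup size of 2, this yields 0 features on the first cycle and 12 on each later cycle, matching the run above. With the default of 50, the first features appear on cycle 50, and a fresh buffer after one observation has 49 cycles remaining.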

Default warmup capture

A longer default-warmup capture ran with:

uv run python scripts/run_engine.py --poll-seconds 60 --summary-every-cycles 1 \
  > data/run_engine.signals.jsonl \
  2> data/run_engine.progress.log

Observed results after stopping the runner:

progress summaries: 105
first feature cycle: 50
first signal cycle: 103
snapshots: 1416
features: 732
signals: 1

The emitted signal was a price_velocity context for polymarket_btc_etf_flows_may_18_2026 with magnitude/confidence 0.873316 and manipulation flag thin_book_during_move. This is evidence that the live monolith path can emit canonical contexts from real market data. It is not a claim that confidence is production-calibrated.
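Figures such as "first feature cycle" can be recovered from the captured progress log. The sketch below assumes continuous-mode summaries use the same `key=value` layout as the one-shot example shape shown earlier, which this PR description does not confirm. `first_cycle_with_output` is a hypothetical helper.

```python
import re
from typing import Iterable, Optional

# Matches tokens such as cycle=3 or features=12 anywhere in a line.
_KEY_VALUE = re.compile(r"(\w+)=(\S+)")


def first_cycle_with_output(lines: Iterable[str], key: str) -> Optional[int]:
    """Return the first cycle whose summary reports a positive count for `key`.

    Assumes each summary starts with a line carrying cycle=<n> and that the
    counts for that cycle appear on the same or following lines.
    """
    cycle: Optional[int] = None
    for line in lines:
        fields = dict(_KEY_VALUE.findall(line))
        if fields.get("cycle", "").isdigit():
            cycle = int(fields["cycle"])
        value = fields.get(key, "")
        if cycle is not None and value.isdigit() and int(value) > 0:
            return cycle
    return None
```

For example, passing the lines of `data/run_engine.progress.log` with `key="features"` or `key="signals"` would report the first cycle at which each count became non-zero, provided the log follows the assumed layout.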

Validation

Passed locally:

uv run ruff check .
uv run ruff format --check .
uv run mypy --strict src/
uv run python scripts/export_schemas.py --check
uv run pytest

Full test result:

376 passed

Known boundaries

scripts/backtest.py still raises NotImplementedError.
scripts/calibrate.py still raises NotImplementedError.
Confidence values still use identity calibration until the backtest, labels, and calibration loop exist.
The monolith is still daemon-shaped; repeated --once runs do not rehydrate feature buffers from historical snapshots.
The emitted live signal should be reviewed as a detector-quality candidate, especially because it came from a low-liquidity ETF-flow market and carried thin_book_during_move.

Next work after this PR

Per .docs/current-development-state.md, the next proof-loop item is scripts/backtest.py:

implement uv run python scripts/backtest.py --help
replay DuckDB or fixture snapshots through the same feature and detector path
emit stable JSON and Markdown reports
include snapshot counts, markets covered, detector signal counts, deduped counts, manipulation flag distribution, and unlabeled signal counts
use the emitted ETF-flow signal as an early detector-review case

Mathews-Tom added 11 commits May 17, 2026 02:05

3f762ab fix(engine): target Polymarket live market polling
8bf7dbb feat(config): seed Polymarket live watchlist
d56707d docs(ops): record live runner smoke
d2de96b chore: ignore local DuckDB runtime files
8e835c1 feat(engine): summarize one-shot live runs
ca42d0c feat(engine): explain one-shot run status
8d49150 fix(engine): persist computed feature vectors
b6f12b8 feat(engine): report continuous warmup progress
e6156eb chore: ignore live runner capture outputs
74a70dd fix(engine): handle live runner interrupts cleanly
8703687 docs(ops): record default warmup capture

Mathews-Tom merged commit 38ce652 into main May 17, 2026
2 checks passed

Mathews-Tom deleted the fix/polymarket-live-runner branch May 17, 2026 15:20
