
boxsie/qtrader


QTrader: Deep Reinforcement Learning Trading Agent

Cryptocurrency trading research system using PyTorch Double DQN with LSTM, multi-timeframe analysis, and a real-time dashboard. This is a research project, not trading advice; see the warnings at the bottom before doing anything with real money.


🎯 What is QTrader?

QTrader is a deep reinforcement learning system that learns to trade cryptocurrencies by:

  • Analyzing multiple timeframes simultaneously (1m to 1d, depending on the profile)
  • Learning from historical price data and technical indicators
  • Optimizing for portfolio growth with intelligent risk management
  • Adapting strategy based on market conditions

Key Innovation: Multi-Timeframe HODL Strategy

Unlike traditional RL trading agents that try to time every move, QTrader implements a HODL-based approach where it:

  • Learns when to enter positions (buying the dip)
  • Learns position sizing based on market volatility (10%, 25%, 50%, 100%)
  • Learns when to exit (taking profits or cutting losses)
  • Uses trailing stops and take-profit levels automatically

This is intended to match real-world crypto trading more closely than high-frequency approaches do.

📊 Architecture

┌─────────────────┐         HTTP/WS          ┌─────────────────┐
│                 │ ◄────────────────────────┤                 │
│  Training Agent │  Real-time Events        │    Dashboard    │
│  (PyTorch DQN)  │                          │  (Web UI)       │
│                 │ ──────────────────────── │                 │
└────────┬────────┘                          └─────────────────┘
         │
         ├─ Double DQN + LSTM
         ├─ Prioritized Experience Replay
         ├─ Multi-Timeframe State (5m, 15m, 30m, 1h, 4h)
         ├─ Advanced Technical Indicators (RSI, MACD, BB, etc.)
         └─ Risk Management (Position Sizing, Stop Loss, Take Profit)

Fully Decoupled: Agent and Dashboard can run independently on different machines!
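
The diagram names Double DQN without showing the update rule. The following is a minimal sketch of the standard Double DQN target, in which the online network picks the next action and the target network scores it. It is written in plain Python so it runs without PyTorch; the function name, argument names, and the discount factor are assumptions for illustration, not code taken from agent/qlearn/.

```python
from typing import Sequence


def double_dqn_target(
    reward: float,
    done: bool,
    q_online_next: Sequence[float],
    q_target_next: Sequence[float],
    gamma: float = 0.99,  # assumed discount factor, not taken from the repo
) -> float:
    """Return the Double DQN bootstrap target for a single transition."""
    if done:
        # Terminal transitions have no future value to bootstrap from.
        return reward
    # Action selection uses the online network's Q-values ...
    best_action = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    # ... while action evaluation uses the target network's Q-values.
    return reward + gamma * q_target_next[best_action]


# The online net prefers action 2, so the target net's estimate for
# action 2 (0.4) is used even though its own maximum is action 1 (0.8).
target = double_dqn_target(
    reward=1.0,
    done=False,
    q_online_next=[0.1, 0.5, 0.9],
    q_target_next=[0.3, 0.8, 0.4],
)
```

Splitting selection from evaluation is what separates this from vanilla DQN, which would have bootstrapped from the target network's own maximum.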

✨ Features

Training Agent

  • 🧠 Double DQN with Dueling Architecture - Stable, efficient learning
  • 📈 LSTM Networks - Captures temporal patterns in price movements
  • 🎯 Prioritized Experience Replay - Learns from important transitions
  • 📊 Multi-Timeframe Analysis - Sees market from multiple perspectives
  • 🛡️ Advanced Risk Management - Dynamic position sizing, stop-loss, take-profit
  • ⚡ GPU Accelerated - CUDA support with mixed precision training
  • 💾 Automatic Checkpointing - Saves best models during training
  • 📝 Config-Driven - No code changes for different strategies
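
To make "learns from important transitions" concrete, here is a hedged sketch of proportional prioritized sampling: each transition is drawn with probability proportional to priority ** alpha, and importance-sampling weights correct for the skew. The README does not say which variant the agent uses, so the function name, the alpha and beta defaults, and the per-batch weight normalisation are all assumptions.

```python
import random
from typing import List, Optional, Sequence, Tuple


def sample_prioritized(
    priorities: Sequence[float],
    batch_size: int,
    alpha: float = 0.6,  # assumed value: 0 is uniform, 1 is fully proportional
    beta: float = 0.4,   # assumed value: strength of the bias correction
    rng: Optional[random.Random] = None,
) -> Tuple[List[int], List[float]]:
    """Sample transition indices with probability proportional to priority ** alpha."""
    rng = rng or random.Random()
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    probs = [s / total for s in scaled]
    indices = rng.choices(range(len(probs)), weights=probs, k=batch_size)
    # Importance-sampling weights offset the bias from non-uniform sampling.
    n = len(probs)
    raw = [(n * probs[i]) ** -beta for i in indices]
    # Normalise by the largest weight in the batch so weights only scale losses down.
    largest = max(raw)
    return indices, [w / largest for w in raw]


# A transition with a large priority (e.g. a large TD error) dominates the batch.
indices, weights = sample_prioritized([0.01, 10.0], batch_size=200, rng=random.Random(0))
```

A production buffer would normally use a sum-tree so sampling stays logarithmic in buffer size; the list-based version here is only meant to show the arithmetic.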

Real-Time Dashboard

  • 📊 Live Portfolio Tracking - Balance, P&L, ROI updated every second
  • 📈 Performance Metrics - Win rate, Sharpe ratio, max drawdown
  • 🎨 Trading Exchange UI - Professional Bootstrap 5 interface
  • 📉 Real-Time Charts - Price and reward visualization
  • 💸 Trade History Feed - Live buy/sell activity
  • 🔌 WebSocket Streaming - Sub-second latency
  • 🌐 Remote Monitoring - Deploy dashboard on separate machine/cloud

🚀 Quick Start

1. Install Training Agent

git clone https://github.com/boxsie/qtrader.git
cd qtrader

# GPU build: installs torch 2.5.1+cu121 from the PyTorch index.
pip install torch==2.5.1+cu121 torchvision==0.20.1+cu121 torchaudio==2.5.1+cu121 \
    --index-url https://download.pytorch.org/whl/cu121
pip install -r agent/requirements.txt

# CPU-only (works for tests + smoke runs):
# pip install torch --index-url https://download.pytorch.org/whl/cpu
# pip install -r agent/requirements.txt

2. Fetch market data

The BTC/USD 1-minute CSV is not committed. Download it from Binance Vision (free, no API key) into data/:

python scripts/fetch_btc_data.py --start 2017-08 --end 2025-12

See docs/DATA.md for CLI options and the output schema.

3. Start Training

cd agent

# Basic training (50 episodes, fast config)
python train.py --config fast --profile day_trader

# Full training with dashboard monitoring
python train.py --config full --profile day_trader --dashboard http://localhost:8765

# Custom episode count
python train.py --config fast --profile scalper --episodes 200

4. Launch Dashboard (Optional)

# In a new terminal
cd dashboard
pip install -r requirements.txt
python server.py

Open browser to http://localhost:8765 and watch your agent train in real-time!

📁 Project Structure

qtrader/
├── agent/                      # Training Agent (Self-Contained)
│   ├── config/                 # Training configurations
│   │   ├── fast.yaml          # Quick training (50 episodes)
│   │   └── full.yaml          # Full training (1000 episodes)
│   ├── profiles/               # Trading strategies
│   │   ├── scalper.yaml       # 1m-15m timeframes
│   │   ├── day_trader.yaml    # 15m-4h timeframes
│   │   └── swing_trader.yaml  # 1h-1d timeframes
│   ├── qlearn/                 # Deep RL models
│   ├── train.py               # Main training script
│   └── requirements.txt       # Agent dependencies
│
├── dashboard/                  # Dashboard Server (Standalone)
│   ├── server.py              # FastAPI WebSocket server
│   ├── index.html             # Web UI with playback controls
│   ├── history/               # Saved run recordings (auto-generated)
│   └── requirements.txt       # Dashboard dependencies
│
├── dataservice/                # Experimental: HTTP API for OHLCV + sentiment
│   └── README.md              # (not required for training)
│
├── shared/                     # Shared event contracts
│   └── events.py              # Event models
│
├── scripts/
│   └── fetch_btc_data.py      # Download BTCUSDT 1-min data from Binance Vision
│
├── tests/                      # pytest suite
├── docs/                       # DATA.md, REWARDS.md, DEPLOYMENT.md
├── examples/                   # Reward configuration examples
│
├── data/                       # Market data (gitignored, populated by fetch script)
│   └── btcusd_1-min_data.csv
│
├── runs/                       # Training outputs (gitignored, auto-generated)
│   └── [config]/[hash]/[profile]/[hash]/[timestamp]/
│       ├── best_model.pt
│       ├── run_log.jsonl
│       └── run_summary.txt
│
└── README.md                   # This file
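
Because checkpoints sit five directories deep under runs/, finding the newest one by hand is tedious. The helper below is a small sketch that assumes the layout shown in the tree and timestamp directory names of the form 20251005_123456 (as in the resume example later in this README); latest_checkpoint is a hypothetical name, not a function shipped with the project.

```python
from pathlib import Path
from typing import Optional


def latest_checkpoint(runs_dir: str = "runs") -> Optional[Path]:
    """Return the best_model.pt from the most recent timestamp directory, if any."""
    # Five levels below runs/: config / hash / profile / hash / timestamp.
    candidates = list(Path(runs_dir).glob("*/*/*/*/*/best_model.pt"))
    if not candidates:
        return None
    # Names such as 20251005_123456 sort chronologically as plain strings.
    return max(candidates, key=lambda path: path.parent.name)
```

The returned path could then be passed to the --resume option described under Advanced Usage.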

โš™๏ธ Configuration

Training Configs (agent/config/)

fast.yaml - Quick experimentation (50 episodes)

training:
  num_episodes: 50
  batch_size: 256
memory:
  capacity: 10000
model:
  hidden_size: 128
  lstm_layers: 2

full.yaml - Production training (1000 episodes)

training:
  num_episodes: 1000
  batch_size: 512
memory:
  capacity: 100000
model:
  hidden_size: 256
  lstm_layers: 3

Trading Profiles (agent/profiles/)

day_trader.yaml - Intraday swings (recommended)

timeframe_weights:
  5m: 0.15
  15m: 0.45   # Primary
  30m: 0.15
  1h: 0.20
  4h: 0.05

scalper.yaml - Short-term trades

timeframe_weights:
  1m: 0.30
  5m: 0.40   # Primary
  15m: 0.20
  30m: 0.10

swing_trader.yaml - Multi-day holds

timeframe_weights:
  1h: 0.20
  4h: 0.45   # Primary
  1d: 0.35
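
In all three profiles the weights sum to 1.0 and one timeframe is marked as primary. The sketch below checks those two properties and shows one plausible way a per-timeframe signal could be blended. How the agent actually consumes timeframe_weights is not described here, so primary_timeframe and blend are hypothetical helpers and the sum-to-one rule is an assumption inferred from the examples.

```python
import math
from typing import Dict, Mapping

# Weights copied from the day_trader.yaml excerpt above.
DAY_TRADER: Dict[str, float] = {
    "5m": 0.15,
    "15m": 0.45,
    "30m": 0.15,
    "1h": 0.20,
    "4h": 0.05,
}


def primary_timeframe(weights: Mapping[str, float]) -> str:
    """Validate a weight table and return its most heavily weighted timeframe."""
    if any(w < 0 for w in weights.values()):
        raise ValueError("timeframe weights must be non-negative")
    total = sum(weights.values())
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError(f"timeframe weights sum to {total}, expected 1.0")
    return max(weights, key=weights.__getitem__)


def blend(signals: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Weighted average of one scalar signal per timeframe."""
    return sum(weights[tf] * signals[tf] for tf in weights)


primary = primary_timeframe(DAY_TRADER)  # "15m" for the day trader profile
```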

🎮 Actions

The agent can take 9 discrete actions per step:

Action   Description
0        HOLD - Do nothing
1-4      BUY 10% / 25% / 50% / 100% of cash
5-8      SELL 10% / 25% / 50% / 100% of position

Position sizing is adjusted dynamically for volatility: higher volatility means smaller positions.
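
The action table translates directly into a lookup. The following sketch decodes an action index into a side and a fraction; decode_action is a hypothetical name, and the volatility adjustment mentioned above is deliberately left out because its formula is not given.

```python
from typing import Tuple

# Fractions shared by the BUY (1-4) and SELL (5-8) action groups.
FRACTIONS = (0.10, 0.25, 0.50, 1.00)


def decode_action(action: int) -> Tuple[str, float]:
    """Map an action index in 0..8 to (side, fraction of cash or position)."""
    if action == 0:
        return ("HOLD", 0.0)
    if 1 <= action <= 4:
        return ("BUY", FRACTIONS[action - 1])
    if 5 <= action <= 8:
        return ("SELL", FRACTIONS[action - 5])
    raise ValueError(f"action must be in 0..8, got {action}")
```

For example, decode_action(3) yields ("BUY", 0.5), meaning half of the available cash.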

📈 Reward Shaping

QTrader combines three reward components, weighted as follows:

  1. Immediate Rewards (30%)

    • Trading fees (small penalty)
    • Realized P&L on position closes
  2. Timing Rewards (20%)

    • Retrospective evaluation of past decisions
    • "Did buying 15 steps ago turn out good?"
  3. Portfolio Growth Rewards (50%)

    • New portfolio highs = BIG rewards
    • Encourages long-term growth over quick wins

This is intended to discourage overfitting to individual trades and keep the focus on total portfolio performance.
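
As a rough illustration of the 30/20/50 split, the sketch below combines the three components as a plain weighted sum and models the growth term as the relative gain over the previous portfolio peak. Both the linear combination and the shape of the growth term are assumptions; docs/REWARDS.md is the place to look for the actual definitions.

```python
# Component weights taken from the list above.
WEIGHTS = {"immediate": 0.30, "timing": 0.20, "growth": 0.50}


def growth_component(portfolio_value: float, previous_peak: float) -> float:
    """Assumed form: reward only new highs, by their relative size."""
    if portfolio_value <= previous_peak:
        return 0.0
    return (portfolio_value - previous_peak) / previous_peak


def combine_reward(immediate: float, timing: float, growth: float) -> float:
    """Weighted sum of the three reward components (assumed to be linear)."""
    return (
        WEIGHTS["immediate"] * immediate
        + WEIGHTS["timing"] * timing
        + WEIGHTS["growth"] * growth
    )


# A small fee penalty, no timing signal, and a 2% new portfolio high.
step_reward = combine_reward(
    immediate=-0.001,
    timing=0.0,
    growth=growth_component(portfolio_value=10_200.0, previous_peak=10_000.0),
)
```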

๐Ÿ›ก๏ธ Risk Management

Built-in risk controls:

  • Position Sizing: Volatility-adjusted (max 15% of portfolio)
  • Stop Loss: Dynamic 2% trailing stop
  • Take Profit: Automatic 4% profit taking
  • Max Drawdown: Training stops at 20% drawdown
  • Fee Modeling: 0.15% per trade (Coinbase standard)
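
The stop-loss and take-profit rules can be sketched as a small state machine over incoming prices. The 2% and 4% thresholds come from the list above; the class name, the choice to measure the trailing stop from the highest price seen since entry, and the order in which the two checks run are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class PositionGuard:
    """Tracks one long position and reports when an exit rule fires."""

    entry_price: float
    trail_pct: float = 0.02        # 2% trailing stop
    take_profit_pct: float = 0.04  # 4% take-profit
    peak_price: float = field(init=False)

    def __post_init__(self) -> None:
        # Before any price update, the highest price seen is the entry itself.
        self.peak_price = self.entry_price

    def update(self, price: float) -> str:
        """Return 'take_profit', 'stop', or 'hold' for the latest price."""
        self.peak_price = max(self.peak_price, price)
        if price >= self.entry_price * (1 + self.take_profit_pct):
            return "take_profit"
        if price <= self.peak_price * (1 - self.trail_pct):
            return "stop"
        return "hold"


guard = PositionGuard(entry_price=100.0)
signals = [guard.update(p) for p in (101.0, 103.0, 100.9)]
```

In this run the price peaks at 103.0, so the stop level trails up to roughly 100.94 and the drop to 100.9 triggers it even though the position is still above its entry price.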

📊 Dashboard Features

The web dashboard provides real-time monitoring with full playback capabilities:

Live Monitoring

Portfolio Panel:

  • Portfolio value with color-coded P&L
  • Cash vs crypto allocation
  • ROI percentage
  • Unrealized P&L on open positions

Performance Metrics:

  • Total trades (buys/sells breakdown)
  • Win rate (decision quality, not just profitable trades)
  • Average profit per trade
  • Total fees paid
  • Current market price

Training Metrics:

  • Episode progress bar
  • Epsilon decay (exploration → exploitation)
  • Neural network loss
  • Average reward per step
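
For readers unfamiliar with the epsilon value shown on the dashboard, this is a generic sketch of an epsilon-greedy policy with exponential decay. The start value, floor, and decay rate are placeholder assumptions; the schedule the agent really uses lives in its configuration and may differ.

```python
import random
from typing import Sequence


def epsilon_at(episode: int, start: float = 1.0, end: float = 0.05, decay: float = 0.99) -> float:
    """Exponentially decayed exploration rate, clipped at a floor (assumed schedule)."""
    return max(end, start * decay ** episode)


def select_action(q_values: Sequence[float], epsilon: float, rng: random.Random) -> int:
    """Explore with probability epsilon, otherwise take the greedy action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


rng = random.Random(0)
# With epsilon at zero the choice is always the highest Q-value.
greedy = select_action([0.1, 0.9, 0.2], epsilon=0.0, rng=rng)
```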

Live Charts:

  • Price chart with real-time updates
  • Reward chart showing agent performance
  • Smooth animations, auto-scaling

Trade Feed:

  • Live trade history (last 50 trades)
  • Color-coded BUY (green) / SELL (red)
  • Price, quantity, and total value

🎬 Playback System

Record, replay, and analyze training runs:

Auto-Recording:

  • Every training run automatically recorded with unique ID
  • All events saved to dashboard/history/ as JSON
  • Auto-saves when new run starts (run_id change detected)
  • User notified with toast message on save

History Browser:

  • Click History button to browse all saved runs
  • See metadata: timestamp, duration, episodes, ROI, event count
  • Load any run for instant replay

Playback Controls:

  • โ–ถ๏ธ Play/Pause - Control playback speed
  • ๐Ÿ›‘ Stop - Exit playback, return to live mode
  • ๐Ÿ“Š Timeline Slider - Seek to any event (rebuilds state)
  • โšก Speed Control - 0.5x to 10x playback speed
  • ๐Ÿ“ Position Display - Current event / total events

Mode Indicators:

  • LIVE mode - Green dot, WebSocket connected
  • PLAYBACK mode - Yellow indicator, disconnected from live
  • Clear visual distinction between modes

Smart Features:

  • Non-blocking recording (designed to add no measurable overhead to training)
  • State reconstruction on timeline seek (accurate historical replay)
  • Auto-save on run change (no manual saves needed)
  • Handles corrupted files and edge cases gracefully

Example Workflow:

# 1. Start training with dashboard
python train.py --dashboard http://localhost:8765

# 2. Training auto-records all events

# 3. Start new run → previous run auto-saves
python train.py --dashboard http://localhost:8765

# 4. Click History → Browse → Load → Play!
# Replay entire run at any speed, seek anywhere

Storage Format:

{
  "run_id": "20251005_143022",
  "metadata": {
    "start_time": "2025-10-05T14:30:22",
    "duration_seconds": 2725,
    "total_episodes": 100,
    "final_stats": {"roi": 15.5, ...}
  },
  "events": [...],  // All events with timestamps
  "saved_at": "2025-10-05T15:15:47"
}

Files saved to: dashboard/history/run_{run_id}_{timestamp}.json
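
Saved runs can also be inspected outside the dashboard. The sketch below reads one history file and pulls out the fields shown in the storage format; it assumes the file on disk is valid JSON (the comment and ellipses in the listing are illustrative), and summarize_run is a hypothetical helper rather than part of the dashboard code.

```python
import json
from typing import Any, Dict


def summarize_run(path: str) -> Dict[str, Any]:
    """Load one saved run and return a few headline fields."""
    with open(path, encoding="utf-8") as handle:
        run = json.load(handle)
    metadata = run.get("metadata", {})
    return {
        "run_id": run.get("run_id"),
        "episodes": metadata.get("total_episodes"),
        # Convert seconds to minutes for easier reading.
        "duration_minutes": round(metadata.get("duration_seconds", 0) / 60, 1),
        "roi": metadata.get("final_stats", {}).get("roi"),
        "event_count": len(run.get("events", [])),
    }
```

Missing keys fall back to None or zero, so a partially written file still produces a summary instead of raising.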

🔧 Advanced Usage

Training from Checkpoint

python train.py --resume runs/fast/.../20251005_123456/best_model.pt

Evaluation Mode

python train.py --mode eval --model runs/.../best_model.pt --episodes 10

Remote Dashboard

# On the monitoring machine
cd dashboard
python server.py --host 0.0.0.0 --port 8765

# On the GPU training machine
cd agent
python train.py --dashboard http://<dashboard-host>:8765

Cloud Dashboard Deployment

Deploy dashboard to Heroku/Railway, then:

python train.py --dashboard https://your-dashboard.railway.app

📚 Documentation

See docs/ for DATA.md, REWARDS.md, and DEPLOYMENT.md.

🧪 Development

pytest tests/ -v            # reward + profile sanity checks
python agent/test_data.py   # validate your CSV's columns

🎯 Current Performance

Latest Training Run (Day Trader, 41 episodes):

  • ROI: 11.45%
  • Win Rate: 65%
  • Trades: 19 total
  • Sharpe Ratio: 1.82
  • Max Drawdown: -4.2%

See runs/ directory for detailed logs and training metrics. Monitor training in real-time via the web dashboard at http://localhost:8765.
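
For reference, the two risk figures above are conventionally computed as follows. This sketch uses the textbook definitions; how QTrader annualises the Sharpe ratio or samples the equity curve is not stated, so the periods_per_year default and both function names are assumptions, and the numbers in the example are made up for illustration.

```python
import math
import statistics
from typing import Sequence


def max_drawdown(equity: Sequence[float]) -> float:
    """Largest peak-to-trough decline as a negative fraction (e.g. -0.25)."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = min(worst, (value - peak) / peak)
    return worst


def sharpe_ratio(
    returns: Sequence[float],
    periods_per_year: int = 365,  # assumed: daily returns in a 24/7 market
    risk_free: float = 0.0,
) -> float:
    """Annualised mean excess return divided by its standard deviation."""
    excess = [r - risk_free / periods_per_year for r in returns]
    spread = statistics.stdev(excess)  # needs at least two returns
    if spread == 0:
        return 0.0
    return statistics.mean(excess) / spread * math.sqrt(periods_per_year)


# Illustrative equity curve: the drop from 120 to 90 is a 25% drawdown.
drawdown = max_drawdown([100.0, 120.0, 90.0, 110.0])
```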

โš ๏ธ Important Notes

Before Live Trading

🚨 DO NOT TRADE LIVE WITHOUT:

  1. ✅ Extended Training - Run 200+ episodes minimum
  2. ✅ Out-of-Sample Testing - Test on 2023, 2024 data (different market regimes)
  3. ✅ Paper Trading - 30 days on real-time data
  4. ✅ Risk Metrics - Verify Sharpe ratio > 1.5, max drawdown < 10%
  5. ✅ Baseline Comparison - Must beat buy-and-hold
  6. ✅ Stress Testing - Flash crashes, high volatility periods
  7. ✅ Fee Reconciliation - Verify exchange fees match model (0.15%)
  8. ✅ Position Limits - Set max exposure per trade
  9. ✅ Circuit Breakers - Auto-stop on excessive drawdown
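
The baseline comparison in item 5 is simple to compute. This sketch measures buy-and-hold ROI over the same price series, charging the modelled 0.15% fee on both the entry and the exit; the function names are hypothetical, and treating the fee as a proportional cut on each side is an assumption about how the fee model works.

```python
from typing import Sequence


def buy_and_hold_roi(prices: Sequence[float], fee_rate: float = 0.0015) -> float:
    """ROI in percent from buying at the first price and selling at the last."""
    entry, exit_price = prices[0], prices[-1]
    # Spend one unit of cash, paying the fee on the way in.
    units = (1.0 - fee_rate) / entry
    # Sell everything at the end, paying the fee on the way out.
    final_cash = units * exit_price * (1.0 - fee_rate)
    return (final_cash - 1.0) * 100.0


def beats_baseline(agent_roi_pct: float, prices: Sequence[float], fee_rate: float = 0.0015) -> bool:
    """True when the agent's ROI exceeds buy-and-hold over the same prices."""
    return agent_roi_pct > buy_and_hold_roi(prices, fee_rate)
```

The comparison only means something when both figures cover the same evaluation window, ideally an out-of-sample one as item 2 requires.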

Known Limitations

  • Sample Size: Most experiments to date are small (need 200+ episodes for stability).
  • Slippage Modeling: Configurable but conservative defaults; doesn't capture book impact at scale.
  • No Liquidity Constraints: Assumes orders always fill at the modelled price.
  • Single Asset: BTC/USD only.
  • Training Data: As far back as the fetch script can pull from Binance Vision (BTCUSDT listed 2017-08).

Open an issue if you'd like to help with any of these.

🤝 Contributing

Contributions welcome! Areas needing work:

  • Multi-asset portfolio support
  • Additional technical indicators
  • Alternative reward functions
  • Hyperparameter optimization
  • Live trading connectors (Coinbase, Binance)
  • Ensemble methods

📄 License

MIT; see LICENSE.

🙏 Acknowledgments

Built with:


⚡ Start training: cd agent && python train.py --config fast --profile day_trader

📊 Monitor progress: Open http://localhost:8765 in your browser

🚀 Happy trading!
