A chess engine powered by deep reinforcement learning, implementing an AlphaZero-style architecture with Monte Carlo Tree Search (MCTS).
This project started as a personal learning journey into neural networks and deep learning. After studying the fundamentals, I became fascinated with the idea of creating an AI that could learn to play chess from scratch - not through handcrafted rules, but by learning patterns and strategies on its own.
The initial development was limited by computational constraints, but with a new machine (RTX 5070 Ti), I was finally able to complete the training pipeline and see the project come to life.
- AlphaZero-style architecture: Residual neural network with policy and value heads
- Monte Carlo Tree Search: Move selection by simulating continuations, guided by the network's policy and value outputs
- Two-phase training: Supervised learning from master games + self-play reinforcement
- Optimized data pipeline: Parallel PGN preprocessing for fast training iterations
- Visual interface: Pygame GUI with evaluation bar to play against the trained model
- Configurable: YAML-based configuration for easy experimentation
The model uses a residual convolutional neural network:
- Input: 12-channel tensor (6 piece types × 2 colors) over the 8×8 board
- Backbone: Configurable residual blocks (default: 5 blocks, 128 channels)
- Policy head: 4096-dimensional output (one logit per from-square/to-square pair, 64×64)
- Value head: Single scalar in [-1, 1] for position evaluation
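The shapes listed above can be sketched in PyTorch as follows. This is a minimal illustration, not the code in `src/models.py`: the class names `ResidualBlock` and `ChessNetSketch`, the 3×3 kernels, the batch normalization, and the head layouts (a 2-filter policy convolution, a 1-filter value convolution with a 64-unit hidden layer) are assumptions loosely following the general AlphaZero design. Only the input shape, the defaults of 5 blocks and 128 channels, and the two output sizes come from the description above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ResidualBlock(nn.Module):
    """Two 3x3 convolutions with a skip connection (assumed layout)."""

    def __init__(self, channels: int) -> None:
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return F.relu(out + x)  # add the skip connection before the final activation


class ChessNetSketch(nn.Module):
    """12x8x8 input -> (4096 policy logits, scalar value in [-1, 1])."""

    def __init__(self, num_residual_blocks: int = 5, num_channels: int = 128) -> None:
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(12, num_channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(num_channels),
            nn.ReLU(),
        )
        self.blocks = nn.Sequential(
            *[ResidualBlock(num_channels) for _ in range(num_residual_blocks)]
        )
        # Policy head: assumed 1x1 conv down to 2 planes, then a linear layer to 64*64 logits.
        self.policy_head = nn.Sequential(
            nn.Conv2d(num_channels, 2, kernel_size=1, bias=False),
            nn.BatchNorm2d(2),
            nn.ReLU(),
            nn.Flatten(),
            nn.Linear(2 * 8 * 8, 64 * 64),
        )
        # Value head: assumed 1x1 conv to 1 plane, small MLP, tanh to bound the output.
        self.value_head = nn.Sequential(
            nn.Conv2d(num_channels, 1, kernel_size=1, bias=False),
            nn.BatchNorm2d(1),
            nn.ReLU(),
            nn.Flatten(),
            nn.Linear(8 * 8, 64),
            nn.ReLU(),
            nn.Linear(64, 1),
            nn.Tanh(),
        )

    def forward(self, x: torch.Tensor):
        features = self.blocks(self.stem(x))
        return self.policy_head(features), self.value_head(features)
```

Calling `ChessNetSketch()(torch.zeros(2, 12, 8, 8))` returns a `(2, 4096)` tensor of policy logits and a `(2, 1)` tensor of values. The policy output is left as raw logits here so that illegal moves can be masked before a softmax is applied.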
Training proceeds in stages:
- Supervised Learning: Learn basic patterns from high-rated Lichess games
- Self-Play: Improve through games against itself using MCTS
- Continuous Improvement: Iterative self-play with updated models
# Clone the repository
git clone https://github.com/Matcraft94/chess_AI.git
cd chess_AI
# Create virtual environment
python -m venv venv
venv\Scripts\activate # Windows
# source venv/bin/activate # Linux/Mac
# Install dependencies
pip install -r requirements.txt
# Install PyTorch with CUDA (adjust for your CUDA version)
pip3 install torch torchvision --index-url https://download.pytorch.org/whl/cu130
Download PGN files from Lichess Database and preprocess them:
# Preprocess PGN to optimized format
python preprocess_pgn.py data/lichess_games.pgn -o data/processed/ -n 50000 --min-elo 1800
Edit config.yaml to adjust:
- Model architecture (residual blocks, channels)
- Training parameters (batch size, learning rate, epochs)
- Data paths
- Hardware settings
# Full training (supervised + self-play)
python train.py
# Only supervised learning
python train.py --supervised-only
# Only self-play (requires pretrained model)
python train.py --selfplay-only --pretrained saved_models/supervised.ckpt
Monitor training with TensorBoard:
tensorboard --logdir logs
# Launch GUI (loads best model automatically)
python play_chess_gui.py
# Play as black
python play_chess_gui.py --black
Controls:
- Click to select and move pieces
- R - Restart game
- F - Flip colors
- Q - Quit
chess_AI/
├── src/
│ ├── models.py # Neural network architecture
│ ├── environment.py # Chess game state management
│ ├── mcts.py # Monte Carlo Tree Search
│ ├── dataset.py # Data loading utilities
│ ├── lightning_module.py # PyTorch Lightning modules
│ └── utils.py # Helper functions
├── train.py # Main training script
├── preprocess_pgn.py # PGN to tensor conversion
├── play_chess_gui.py # Pygame interface
├── config.yaml # Training configuration
└── requirements.txt # Dependencies
Key parameters in config.yaml:
model:
  num_residual_blocks: 5    # Depth of the network
  num_channels: 128         # Width of the network
supervised:
  batch_size: 512
  learning_rate: 0.001
  max_epochs: 50
selfplay:
  num_iterations: 50
  num_games_per_iteration: 100
  mcts_simulations: 50
The board is encoded as a 12×8×8 tensor:
- 6 channels for white pieces (pawn, knight, bishop, rook, queen, king)
- 6 channels for black pieces
- Each channel is a binary 8×8 grid indicating piece positions
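The board encoding above can be illustrated with a small pure-Python sketch that builds the twelve binary planes from the piece-placement field of a FEN string. The function name `encode_board`, the use of nested lists instead of a tensor, and the row orientation (row 0 is rank 8, as in FEN) are assumptions for illustration; the project's own preprocessing may orient the board differently.

```python
# Channel order from the description: white pawn..king, then black pawn..king.
PIECE_CHANNELS = "PNBRQKpnbrqk"

START_FEN = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"


def encode_board(fen: str) -> list:
    """Return a 12x8x8 nested list of 0/1 planes for the position in `fen`.

    Assumed orientation: row 0 is rank 8 and column 0 is file a (FEN order).
    """
    planes = [[[0] * 8 for _ in range(8)] for _ in range(12)]
    placement = fen.split()[0]  # only the piece-placement field is used
    for row, rank in enumerate(placement.split("/")):
        col = 0
        for symbol in rank:
            if symbol.isdigit():
                col += int(symbol)  # a digit is a run of empty squares
            else:
                planes[PIECE_CHANNELS.index(symbol)][row][col] = 1
                col += 1
    return planes


planes = encode_board(START_FEN)
total_pieces = sum(sum(sum(row) for row in plane) for plane in planes)
print(total_pieces)  # 32 pieces in the starting position
```

Note that twelve piece planes alone carry no information about side to move, castling rights, or en passant squares.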
Moves are encoded as a 4096-dimensional vector:
- Index = from_square × 64 + to_square
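- Promotions default to queen (the most common case); the from-to index has no slot for the promotion piece, so underpromotions cannot be represented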
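The index formula above can be made concrete with a short sketch. The square numbering used here (a1 = 0, b1 = 1, ..., h8 = 63, which matches python-chess) and the helper names `square_index`, `square_name`, `encode_move`, and `decode_move` are assumptions for illustration; the project may number squares differently.

```python
FILES = "abcdefgh"


def square_index(name: str) -> int:
    """Map 'a1'..'h8' to 0..63, assuming a1 = 0 and h8 = 63."""
    return (int(name[1]) - 1) * 8 + FILES.index(name[0])


def square_name(index: int) -> str:
    """Inverse of square_index."""
    return FILES[index % 8] + str(index // 8 + 1)


def encode_move(uci: str) -> int:
    """Index = from_square * 64 + to_square; any promotion suffix is ignored."""
    return square_index(uci[0:2]) * 64 + square_index(uci[2:4])


def decode_move(index: int) -> str:
    """Recover the from-to squares; the promotion piece is not recoverable."""
    from_square, to_square = divmod(index, 64)
    return square_name(from_square) + square_name(to_square)


print(encode_move("e2e4"))   # 796  (12 * 64 + 28)
print(decode_move(796))      # e2e4
print(encode_move("e7e8q") == encode_move("e7e8n"))  # True: same index for any promotion piece
```

The last line shows why promotions have to default to one piece: every promotion on the same squares collapses to a single policy index.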
Each MCTS simulation runs four steps:
- Selection: Traverse tree using UCB score
- Expansion: Add new node using neural network policy
- Evaluation: Get position value from neural network
- Backpropagation: Update visit counts and values
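The four steps above can be sketched on a toy game to keep the example self-contained. Everything here is an assumption for illustration and is not taken from `src/mcts.py`: the `Node` class, the PUCT-style form of the UCB score with the constant `c_puct`, the stand-in `evaluate` function (uniform priors and a neutral value in place of the neural network), and the toy game itself (players alternately take 1 to 3 stones, and whoever takes the last stone wins).

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    prior: float                 # policy probability for the move leading here
    visit_count: int = 0
    value_sum: float = 0.0       # summed values, from the view of the player to move here
    children: dict = field(default_factory=dict)  # move -> Node

    def value(self) -> float:
        return self.value_sum / self.visit_count if self.visit_count else 0.0


def ucb_score(parent: Node, child: Node, c_puct: float = 1.5) -> float:
    """PUCT-style score (an assumed variant of the UCB rule)."""
    exploration = c_puct * child.prior * math.sqrt(parent.visit_count) / (1 + child.visit_count)
    # The child's value is from the opponent's view, so negate it for the parent.
    return -child.value() + exploration


def legal_moves(stones: int) -> list:
    return [m for m in (1, 2, 3) if m <= stones]


def evaluate(stones: int):
    """Hypothetical stand-in for the network: uniform priors, neutral value."""
    moves = legal_moves(stones)
    return {m: 1.0 / len(moves) for m in moves}, 0.0


def run_mcts(root_state: int, num_simulations: int = 50) -> dict:
    root = Node(prior=1.0)
    for _ in range(num_simulations):
        node, state, path = root, root_state, [root]
        # 1. Selection: follow the highest-scoring child until reaching a leaf.
        while node.children:
            parent = node
            move, node = max(parent.children.items(),
                             key=lambda item: ucb_score(parent, item[1]))
            state -= move
            path.append(node)
        # 2. Expansion and 3. Evaluation.
        if state == 0:
            value = -1.0  # the player to move has lost: the opponent took the last stone
        else:
            priors, value = evaluate(state)
            for move, prior in priors.items():
                node.children[move] = Node(prior=prior)
        # 4. Backpropagation: update the path, flipping the sign at each ply.
        for visited in reversed(path):
            visited.visit_count += 1
            visited.value_sum += value
            value = -value
    return {move: child.visit_count for move, child in root.children.items()}


visit_counts = run_mcts(root_state=5, num_simulations=200)
print(visit_counts)  # visit counts per root move; the move played is usually the most visited
```

In this toy game, taking one stone from five leaves the opponent in a losing position, so with enough simulations the visits should tend to concentrate on that move. In the chess setting, `evaluate` would be replaced by a forward pass of the network and `legal_moves` by the move generator.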
Hardware requirements:
- Minimum: 8GB RAM, modern CPU
- Recommended: 16GB RAM, NVIDIA GPU with 8GB+ VRAM
- Optimal: 32GB RAM, RTX 5070 Ti or better
Training time estimates (50 epochs supervised + 50 iterations self-play):
- RTX 5070 Ti: ~8-10 hours
The model learns progressively:
- After supervised learning: Basic opening principles, piece development, simple tactics
- After self-play: Improved positional understanding, deeper tactical calculations
Current limitations:
- Endgame play could be stronger (requires more training data)
- Opening book integration would improve early game
Planned improvements:
- Implement opening book for consistent openings
- Add endgame tablebases for perfect endgame play
- Experiment with transformer architectures
- Implement distributed training for faster iteration
- Add time controls and UCI protocol support
- Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm - DeepMind's AlphaZero paper
- Lichess Database - Training data source
- python-chess - Chess library
MIT License - feel free to use, modify, and distribute.
This project was a fantastic learning experience in deep learning, reinforcement learning, and game AI. Special thanks to the open-source community for the tools and resources that made this possible.
Built with passion for chess and neural networks