Production-ready DuoFormer for general medical image classification with enterprise-grade MLOps practices.
- 💡 Purpose: Refactored for general medical imaging use cases, including histopathology, radiology, dermatology, and more
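- 🔬 Original Repository: xiaoyatang/duoformer_TCGA
- 📄 Paper: Tang, X. et al. "Hierarchical Vision Transformer for Medical Image Segmentation" (MIDL 2025)
- 🚀 This Repository: https://github.com/AliSerwat/Refactored-DuoFormer
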
A professionally engineered and enhanced implementation with:
- ✅ Platform-Independent: Works on Windows, Linux, macOS
- ✅ Hardware-Agnostic: Auto-detects CUDA, MPS (Apple Silicon), or CPU
- ✅ Modern PyTorch: No deprecated APIs, latest best practices
- ✅ MLOps Ready: Configuration management, auto-checkpointing, TensorBoard
- ✅ Clean Code: No wildcard imports, explicit dependencies
- ✅ Professional Structure: Modular, testable, maintainable
- ✅ Comprehensive Testing: Unit tests and health checks
- ✅ Auto-Configuration: Optimal settings for your hardware
# 1. Clone repository
git clone https://github.com/AliSerwat/Refactored-DuoFormer.git
cd Refactored-DuoFormer
# One-command setup (handles everything)
python setup_environment.py

# Check system capabilities
python scripts/check_system.py
# Verify installation
python scripts/verify_installation.py

# Interactive notebook
jupyter notebook demo_duoformer.ipynb
# Or platform-agnostic demo
python examples/demo_robust.py

# Auto-configuration (detects your hardware)
python train.py --data_dir /path/to/data
# Or with specific configuration
python train.py --config src/duoformer/config/default_config.yaml

DuoFormer combines ResNet backbones with multi-scale transformers for medical image classification:
Input Image (224×224×3)
↓
ResNet Backbone (ResNet-50/18)
↓
Multi-Scale Features (2/3/4 scales)
↓
Multi-Scale Transformer
↓
Classification Head
↓
Predictions (num_classes)
Key Features:
- 🔄 Multi-scale attention at different resolutions
- 🏗️ Hybrid CNN-Transformer architecture
- 🎯 Optimized for medical imaging (histopathology, radiology)
- ⚡ Supports 2, 3, or 4 scale levels
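
To make the "2, 3, or 4 scale levels" concrete, the sketch below computes the feature-map shape each ResNet stage produces for a 224×224 input. The stage strides and channel counts are the standard ResNet-18/50 values; the choice of *which* stages feed the transformer (here, the deepest `num_scales` stages) is an assumption for illustration, and `multiscale_shapes` is a hypothetical helper, not a function from this repository.

```python
# Standard ResNet residual stages: (stage name, total stride, output channels).
RESNET_STAGES = {
    "resnet18": [("layer1", 4, 64), ("layer2", 8, 128),
                 ("layer3", 16, 256), ("layer4", 32, 512)],
    "resnet50": [("layer1", 4, 256), ("layer2", 8, 512),
                 ("layer3", 16, 1024), ("layer4", 32, 2048)],
}


def multiscale_shapes(backbone="resnet50", image_size=224, num_scales=2):
    """Return (stage, channels, height, width) for the selected stages.

    Assumption: the scales are the deepest `num_scales` stages.
    The integer division is exact for image sizes divisible by 32.
    """
    if num_scales not in (2, 3, 4):
        raise ValueError("num_scales must be 2, 3, or 4")
    stages = RESNET_STAGES[backbone][-num_scales:]
    return [(name, channels, image_size // stride, image_size // stride)
            for name, stride, channels in stages]


def tokens_per_scale(shapes):
    """Number of spatial positions (candidate tokens) at each scale."""
    return [height * width for _, _, height, width in shapes]


if __name__ == "__main__":
    for row in multiscale_shapes("resnet50", 224, 4):
        print(row)
```

Under these assumptions, a 4-scale ResNet-50 setup sees 56×56, 28×28, 14×14, and 7×7 feature maps, which is why adding scales increases the number of tokens the transformer has to attend over.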
duoformer/
│
├── 📄 README.md Main project documentation (you are here)
├── 📄 GETTING_STARTED.md Comprehensive beginner's guide
│
├── 📁 docs/ Documentation directory
│ ├── README.md Documentation overview
│ ├── INSTALLATION.md Platform-specific installation
│ ├── TROUBLESHOOTING.md Common issues & solutions
│ ├── CONTRIBUTING.md Development guidelines
│ ├── DOCUMENTATION_INDEX.md Complete documentation map
│ ├── CODE_REVIEW_REPORT.md Code quality analysis
│ ├── FIXES_APPLIED.md What's been fixed
│ └── QUICK_FIX_REFERENCE.md Developer quick reference
│
├── 📁 src/duoformer/config/ Configuration Management
│ ├── model_config.py Type-safe configs (Python dataclasses)
│ ├── default_config.yaml Balanced settings
│ ├── lightweight_config.yaml Fast experiments (ResNet-18, 2 scales)
│ └── performance_config.yaml Best accuracy (ResNet-50, 4 scales)
│
├── 📁 src/duoformer/models/ Model Architectures
│ ├── __init__.py Exports: build_model_no_extra_params(), count_parameters()
│ ├── model_wo_extra_params.py Main DuoFormer (recommended)
│ ├── model.py Original implementation
│ ├── multi_vision_transformer.py Multi-scale transformer
│ ├── multiscale_attn.py Multi-scale attention
│ ├── projection_head.py Projection layers
│ ├── resnet50ssl.py Self-supervised ResNet
│ └── scale_attention.py Scale attention mechanisms
│
├── 📁 utils/ Training Utilities
│ ├── trainer.py Professional trainer (checkpointing, TensorBoard)
│ ├── dataset.py MedicalImageDataset, augmentation
│ ├── device_utils.py Auto-detect CUDA/MPS/CPU
│ └── platform_utils.py Platform-specific optimizations
│
├── 📁 tests/ Testing (Resource-Efficient!)
│ ├── unit/ Fast tests (<30s, no GPU)
│ ├── integration/ Full tests (slower, GPU optional)
│ ├── fixtures/ Mock data generators
│ └── run_tests.py Central test runner
│
├── 📁 scripts/ Utility Scripts
│ ├── check_system.py System capabilities + recommendations
│ ├── health_check.py Code health validation
│ └── verify_installation.py Installation verification
│
├── 📁 examples/ Usage Examples
│ ├── demo_robust.py Platform-agnostic demo
│ └── example_usage.py Feature demonstrations
│
├── 🐍 train.py Main Training Script
├── 🔧 setup_environment.py One-Command Setup
├── 📓 demo_duoformer.ipynb Interactive Demo
│
├── 📦 requirements.in Direct Dependencies
└── 📦 requirements.txt Lockfile (compiled automatically from requirements.in)
# Basic training (auto-configures everything)
python train.py --data_dir /path/to/data
# With custom configuration
python train.py --config src/duoformer/config/performance_config.yaml
# Specific device
python train.py --device cuda --data_dir ./data
# Multi-GPU
python train.py --gpu_ids 0,1,2 --data_dir ./data
# CPU only
python train.py --device cpu --batch_size 8 --data_dir ./data
# With mixed precision
python train.py --amp --data_dir ./data
# Resume training
python train.py --resume checkpoints/best_checkpoint.pt

from duoformer.config import ModelConfig
# Load from YAML
config = ModelConfig.from_yaml('src/duoformer/config/my_experiment.yaml')
# Or use presets
from duoformer.config import DEFAULT_CONFIG, LIGHTWEIGHT_CONFIG, PERFORMANCE_CONFIG
# Modify and save
config.training.epochs = 200
config.to_yaml('src/duoformer/config/my_custom.yaml')

import torch
from duoformer.models import build_model_no_extra_params
# Load checkpoint (mapped to CPU so it also loads on machines without the original GPU)
checkpoint = torch.load('checkpoints/best_checkpoint.pt', map_location='cpu')
# Create model
model = build_model_no_extra_params(
    depth=12,
    embed_dim=768,
    num_heads=12,
    num_classes=10,
    num_layers=2,
    backbone='r50'
)
# Load weights
model.load_state_dict(checkpoint['model_state_dict'])
model.eval()
# Predict
with torch.no_grad():
    output = model(image_tensor)
    prediction = output.argmax(dim=1)

| Profile | Backbone | Scales | Params | Speed | Use Case |
|---|---|---|---|---|---|
| `lightweight_config.yaml` | ResNet-18 | 2 | ~30M | Fast | Quick experiments |
| `default_config.yaml` | ResNet-50 | 2 | ~50M | Medium | Standard training |
| `performance_config.yaml` | ResNet-50 | 4 | ~70M | Slow | Best accuracy |
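
For scripted experiments, the profile table can be mirrored in a small lookup that builds the matching `train.py` invocation. This is only a sketch: `Profile`, `PROFILES`, and `train_command` are hypothetical names introduced here, the values are copied from the table, and the only CLI flags used are `--config` and `--data_dir` as they appear elsewhere in this README.

```python
from dataclasses import dataclass

CONFIG_DIR = "src/duoformer/config"


@dataclass(frozen=True)
class Profile:
    """One row of the profile table (values copied from the README)."""
    config_file: str
    backbone: str
    num_scales: int
    approx_params: str
    use_case: str


PROFILES = {
    "lightweight": Profile("lightweight_config.yaml", "ResNet-18", 2, "~30M", "Quick experiments"),
    "default": Profile("default_config.yaml", "ResNet-50", 2, "~50M", "Standard training"),
    "performance": Profile("performance_config.yaml", "ResNet-50", 4, "~70M", "Best accuracy"),
}


def train_command(profile_name, data_dir):
    """Build the argument list for training with the chosen profile."""
    profile = PROFILES[profile_name]  # raises KeyError for unknown profiles
    return ["python", "train.py",
            "--config", f"{CONFIG_DIR}/{profile.config_file}",
            "--data_dir", data_dir]


if __name__ == "__main__":
    print(" ".join(train_command("lightweight", "./data")))
```

Returning a list rather than a string keeps the command safe to pass to `subprocess.run` without shell quoting issues.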
Create src/duoformer/config/my_experiment.yaml:
exp_name: my_tcga_experiment
device: auto  # Auto-detects CUDA/MPS/CPU
backbone:
  name: resnet50
  pretrained: true
  freeze: true
transformer:
  depth: 12
  embed_dim: 768
  num_heads: 12
multiscale:
  num_layers: 2
  proj_dim: 768
training:
  learning_rate: 0.0001
  epochs: 100
  batch_size: 32
  optimizer: adamw
  scheduler: cosine
data:
  data_dir: ./data
  num_classes: 10
  image_size: 224

Then train:
python train.py --config src/duoformer/config/my_experiment.yaml

The codebase automatically detects and optimizes for your platform:
Windows:
- Auto-detects CUDA or falls back to CPU
- Optimized `num_workers=4`, `pin_memory=False` (prevents crashes)

Linux:
- Multi-GPU support
- Optimized `num_workers=8`, `pin_memory=True` (faster transfer)

macOS:
- Apple Silicon (M1/M2/M3) MPS support
- Optimized `num_workers=4`
- Automatic device selection
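
The platform defaults and device fallback listed above can be sketched roughly as follows. This is not the code in `utils/device_utils.py` or `utils/platform_utils.py`: `pick_device` and `loader_defaults` are hypothetical names, the `pin_memory` value for macOS and the fallback for other systems are assumptions (the list does not state them), and PyTorch is imported lazily so the sketch still runs where it is not installed.

```python
import platform


def pick_device(preference="auto"):
    """Resolve 'auto' to 'cuda', 'mps', or 'cpu'; pass explicit choices through."""
    if preference != "auto":
        return preference
    try:
        import torch
    except ImportError:
        return "cpu"  # no PyTorch available, nothing to accelerate
    if torch.cuda.is_available():
        return "cuda"
    mps = getattr(torch.backends, "mps", None)  # absent on older PyTorch builds
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


def loader_defaults(system=None):
    """DataLoader settings per platform, following the list above."""
    system = system or platform.system()
    if system == "Windows":
        return {"num_workers": 4, "pin_memory": False}
    if system == "Linux":
        return {"num_workers": 8, "pin_memory": True}
    if system == "Darwin":  # macOS; pin_memory=False is an assumption
        return {"num_workers": 4, "pin_memory": False}
    return {"num_workers": 0, "pin_memory": False}  # conservative fallback (assumption)


if __name__ == "__main__":
    print(pick_device(), loader_defaults())
```

The returned dictionary can be unpacked straight into a `torch.utils.data.DataLoader(dataset, batch_size=..., **loader_defaults())` call.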
# Auto-detect (recommended)
python train.py --device auto
# Specific GPU
python train.py --device cuda:0
# Apple Silicon
python train.py --device mps
# CPU only
python train.py --device cpu
# Multi-GPU
python train.py --gpu_ids 0,1,2

from utils import Trainer
trainer = Trainer(
    model=model,
    criterion=criterion,
    optimizer=optimizer,
    device=device,
    use_amp=True,  # Mixed precision
    gradient_clip_val=1.0
)
trainer.fit(
    train_loader,
    val_loader,
    epochs=100,
    patience=20  # Early stopping
)

Features:
- ✅ Automatic checkpointing (best, latest, periodic)
- ✅ Early stopping
- ✅ Mixed precision training (AMP)
- ✅ Gradient clipping
- ✅ TensorBoard logging
- ✅ Learning rate scheduling
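
As an illustration of what the `patience` argument typically means, here is a minimal early-stopping sketch. It assumes the monitored quantity is validation loss (lower is better); the README does not say which metric the repository's `Trainer` tracks, and `EarlyStopping` and `epochs_run` are hypothetical names, not part of `utils`.

```python
class EarlyStopping:
    """Stop once the validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=20, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta  # minimum decrease that counts as improvement
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def epochs_run(val_losses, patience):
    """Return how many epochs would run before early stopping triggers."""
    stopper = EarlyStopping(patience=patience)
    for epoch, loss in enumerate(val_losses, start=1):
        if stopper.step(loss):
            return epoch
    return len(val_losses)


if __name__ == "__main__":
    losses = [1.0, 0.8, 0.9, 0.85, 0.81, 0.7]
    print(epochs_run(losses, patience=3))
```

A larger `patience` tolerates longer plateaus at the cost of extra epochs; in the example sequence, a patience of 3 stops before the late improvement at epoch 6 is ever seen.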
from utils import create_dataloaders
# From directory structure
train_loader, val_loader, test_loader = create_dataloaders(
    data_dir='./data',
    batch_size=32
)
# From CSV file
train_loader, val_loader, test_loader = create_dataloaders(
    csv_file='data.csv',
    batch_size=32
)

from utils import setup_device_environment, get_optimal_num_workers
# Auto-detect best device
device = setup_device_environment('auto')
# Get optimal workers for your platform
num_workers = get_optimal_num_workers()

# Fast unit tests (recommended, <30 seconds, no GPU)
python tests/run_tests.py --unit
# Full test suite (all tests)
python tests/run_tests.py
# Integration tests only (slower, may use GPU)
python tests/run_tests.py --integration
# Specific test file
python tests/unit/test_config.py
python tests/integration/test_full_models.py
# With pytest
pytest tests/unit/ -v # Fast tests only
pytest tests/integration/ -v # Integration tests
pytest tests/ -v # All tests
# Check code health
python scripts/health_check.py

- Python: 3.8+ (3.10+ recommended)
- RAM: 8GB+ (16GB+ recommended)
- GPU: Optional (NVIDIA CUDA, or Apple MPS)
- Storage: 5GB free space
Managed via pip-tools for reproducibility:
- PyTorch 2.0+
- torchvision
- timm (Vision Transformers)
- einops, numpy, matplotlib
- scikit-learn, pandas
- TensorBoard, Jupyter
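
To see which of the dependencies listed above are already importable in the current environment, a standard-library check along these lines can help. It is a sketch, not the logic of `scripts/verify_installation.py`: the import names (for example `sklearn` for scikit-learn) are the usual ones for these packages, Jupyter is left out because it is a tool rather than an import, and `missing_dependencies` and `python_ok` are hypothetical helpers.

```python
import importlib.util
import sys

# Display name -> importable module name.
REQUIRED_MODULES = {
    "PyTorch": "torch",
    "torchvision": "torchvision",
    "timm": "timm",
    "einops": "einops",
    "numpy": "numpy",
    "matplotlib": "matplotlib",
    "scikit-learn": "sklearn",
    "pandas": "pandas",
    "TensorBoard": "tensorboard",
}


def python_ok(minimum=(3, 8)):
    """True if the running interpreter meets the minimum version."""
    return sys.version_info >= minimum


def missing_dependencies(modules=REQUIRED_MODULES):
    """Return display names of modules that cannot be found, without importing them."""
    return [name for name, module in modules.items()
            if importlib.util.find_spec(module) is None]


if __name__ == "__main__":
    print("Python OK:", python_ok())
    print("Missing:", missing_dependencies() or "none")
```

Because `find_spec` only locates a module, this check is fast and does not trigger heavyweight imports; it does not verify versions such as the PyTorch 2.0+ requirement.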
Install all:
python setup_environment.py

- Quick Start: This README
- Getting Started Guide: See `GETTING_STARTED.md` for a detailed walkthrough
- Installation Guide: See `docs/INSTALLATION.md` for platform-specific setup
- Troubleshooting: See `docs/TROUBLESHOOTING.md` for common issues
- Contributing: See `docs/CONTRIBUTING.md` for development guidelines
- Full Documentation: See the `docs/` directory for complete guides
- System Check: `python scripts/check_system.py`
- Examples: See the `examples/` directory

This implementation works with various medical imaging modalities:
# Histopathology (e.g., TCGA, CAMELYON)
python train.py \
--data_dir ./data/histopathology \
--num_classes 10 \
--config src/duoformer/config/default_config.yaml
# Radiology (e.g., X-rays, CT, MRI)
python train.py \
--data_dir ./data/radiology \
--num_classes 5 \
--backbone r50 \
--num_layers 4
# Dermatology (e.g., skin lesions)
python train.py \
--data_dir ./data/dermatology \
--num_classes 7 \
--image_size 224
# Retinal imaging (e.g., diabetic retinopathy)
python train.py \
--data_dir ./data/retinal \
--num_classes 5 \
--config src/duoformer/config/performance_config.yaml

# Freeze backbone for faster convergence
python train.py \
--freeze_backbone \
--lr 1e-5 \
--epochs 50 \
--data_dir ./data

If you use this code, please cite the original paper:
@inproceedings{tang2025duoformer,
  title={DuoFormer: Hierarchical Vision Transformer for Medical Image Segmentation},
  author={Tang, Xiaoya and others},
  booktitle={Medical Imaging with Deep Learning (MIDL)},
  year={2025}
}

Contributions welcome! This enhanced version adds:
- Modern PyTorch APIs
- Platform independence
- MLOps best practices
- Professional structure
See examples/ for usage patterns.
Same license as original repository.
- Original Work: xiaoyatang/duoformer_TCGA - Original TCGA-focused implementation
- Paper: Tang, X. et al. "Hierarchical Vision Transformer for Medical Image Segmentation" (MIDL 2025)
- This Refactoring: Enhanced for general medical imaging with production-ready MLOps practices
- System Issues: `python scripts/check_system.py`
- Code Health: `python scripts/health_check.py`
- Examples: See the `examples/` directory

Note: For the original TCGA-specific implementation, see xiaoyatang/duoformer_TCGA
| Feature | Original | This Repository |
|---|---|---|
| Code Quality | Wildcards, deprecated | Explicit, modern ✅ |
| Platform | Linux-only paths | Cross-platform ✅ |
| Device | Hardcoded CUDA | Auto-detect ✅ |
| Configuration | Hardcoded | YAML/JSON ✅ |
| Training | Basic loop | Professional trainer ✅ |
| Monitoring | | TensorBoard ✅ |
| Testing | None | Unit tests ✅ |
| Dependencies | Manual | pip-tools lockfile ✅ |
Made with ❤️ for the medical AI community