coralr-1/DuoKinaseNet

DuoKinaseNet

Official implementation of "The structure-preserving spectral graph neural network for dual kinase inhibitors and synergy scoring in gastric cancer"

📄 Published in npj Digital Medicine, Volume 9, Article 1 (2026) | Paper


👥 Authors

Yang Zhang¹, Chunhong Yuan², Longgang Wang³, Yujia Chen¹, Yanpeng Xing¹, Yuanlin Sun³

¹ Department of Gastrocolorectal Surgery, The First Hospital of Jilin University, Changchun, China
² Faculty of Control Systems and Robotics, ITMO University, St. Petersburg, Russia
³ Department of Gastrointestinal Surgery, Shandong Cancer Hospital and Institute, Jinan, China


📌 Overview

DuoKinaseNet is a dual-task spectral graph neural network for predicting drug-target interactions (DTI) targeting HER2 and FGFR2b kinases in gastric cancer, with synergy scoring for rational combination design.

Key Achievements

  • HER2: AUC-ROC 0.903 (AUPR 0.891) on the unseen-protein benchmark
  • FGFR2b: AUC-ROC 0.895 (AUPR 0.883) on the unseen-protein benchmark
  • Outperforms state-of-the-art methods including 3D-aware models (SP-DTI)
  • Enables rational polypharmacology design to overcome therapeutic resistance

Core Innovation: SPSE Module

Structure-Preserving Spectral Expansion (SPSE) addresses the locality bias of conventional GNNs:

  1. Spectral Feature Expansion: Augments node features with Laplacian eigenvectors for global geometric awareness
  2. Diffusion-Distance Biased Attention: Guides message passing using graph diffusion distances
  3. Spectral Smoothness Regularization: Encourages embeddings to align with graph manifold structure
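
The three SPSE ingredients can be illustrated with a small NumPy sketch. This is not the repository's implementation: the function names, the use of the symmetric normalized Laplacian, the heat-kernel form of the diffusion distance, and the `beta` bias strength are all assumptions made for illustration.

```python
import numpy as np


def normalized_laplacian(adj):
    """Symmetric normalized Laplacian L = I - D^{-1/2} A D^{-1/2}."""
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    inv_sqrt[deg > 0] = 1.0 / np.sqrt(deg[deg > 0])
    return np.eye(len(adj)) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]


def spectral_expand(x, lap, num_eigen):
    """Append the num_eigen lowest-frequency eigenvectors to the node features."""
    eigvals, eigvecs = np.linalg.eigh(lap)  # eigenvalues in ascending order
    k = min(num_eigen, lap.shape[0])
    return np.concatenate([x, eigvecs[:, :k]], axis=1), eigvals, eigvecs


def diffusion_distance(eigvals, eigvecs, t=1.0):
    """L2 distance between rows of the heat kernel exp(-t L)."""
    emb = eigvecs * np.exp(-t * eigvals)[None, :]
    diff = emb[:, None, :] - emb[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def biased_attention(scores, dist, adj, beta=1.0):
    """Softmax over neighbors, with logits penalized by diffusion distance."""
    mask = (np.asarray(adj) > 0) | np.eye(len(adj), dtype=bool)  # keep self-loops
    logits = np.where(mask, scores - beta * dist, -np.inf)
    logits = logits - logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)


def smoothness_penalty(h, lap):
    """Dirichlet energy trace(H^T L H); small when embeddings vary smoothly."""
    return float(np.trace(h.T @ lap @ h))


# Toy example: a 4-node path graph with 2-dimensional node features.
adj = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]], dtype=float)
x = np.random.default_rng(0).normal(size=(4, 2))
lap = normalized_laplacian(adj)
x_expanded, eigvals, eigvecs = spectral_expand(x, lap, num_eigen=2)
dist = diffusion_distance(eigvals, eigvecs, t=1.0)
attn = biased_attention(x @ x.T, dist, adj)
penalty = smoothness_penalty(x_expanded, lap)
```

In this sketch the penalty would be added to the training loss with a small weight, and `t` plays the role of `diffusion_time` in the configuration file. A dense eigendecomposition is only practical for small graphs; a sparse solver would be needed at DrugBank scale.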

Architecture Components

  • MolProtEncoder: Dual-view Transformer encoders with cross-attention fusion
  • HetGraphAggregator: Relation-aware GAT with spectral and geodesic distance weighting
  • ContrastMetaLearner: Graph co-contrastive learning and few-shot meta-learning
  • RobustPredHead: Calibrated DTI prediction with safety-aware synergy scoring
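
To make the cross-attention fusion in MolProtEncoder concrete, here is a minimal NumPy sketch in which drug tokens attend to protein tokens and vice versa. It assumes single-head scaled dot-product attention with no learned projections, residual connections, and mean pooling; the function names are hypothetical and the repository's encoders may differ in all of these respects.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(queries, keys, values):
    """Scaled dot-product attention from one view (queries) onto the other."""
    d = queries.shape[-1]
    weights = softmax(queries @ keys.T / np.sqrt(d))
    return weights @ values, weights


def fuse(drug_tokens, protein_tokens):
    """Fuse both views into one pair embedding (assumed pooling scheme)."""
    drug_ctx, _ = cross_attention(drug_tokens, protein_tokens, protein_tokens)
    prot_ctx, _ = cross_attention(protein_tokens, drug_tokens, drug_tokens)
    # Residual connection, then mean-pool each view and concatenate.
    drug_vec = (drug_tokens + drug_ctx).mean(axis=0)
    prot_vec = (protein_tokens + prot_ctx).mean(axis=0)
    return np.concatenate([drug_vec, prot_vec])


# Toy example: 5 drug tokens and 12 protein tokens, both of width 8.
rng = np.random.default_rng(0)
drug_tokens = rng.normal(size=(5, 8))
protein_tokens = rng.normal(size=(12, 8))
pair_embedding = fuse(drug_tokens, protein_tokens)
```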

🛠 Installation

Prerequisites

  • Python >= 3.8 (3.10+ recommended)
  • PyTorch >= 2.0
  • CUDA >= 11.0 (optional, for GPU acceleration)
  • 16GB+ RAM

Setup

# Clone repository
git clone https://github.com/coralr-1/DuoKinaseNet.git
cd DuoKinaseNet

# Create conda environment
conda create -n duokinase python=3.10
conda activate duokinase

# Install PyTorch (example for CUDA 11.8)
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia

# Install DuoKinaseNet
pip install -e .

# Or install with ML dependencies
pip install -e ".[ml]"

📊 Dataset

We use DrugBank v5.1.13:

  • 18,491 drug entries
  • 23,934 drug-target associations
  • 5,138 unique protein targets
  • Targets: HER2 (UniProt P04626), FGFR2b (UniProt P21802)

Data Preparation

  1. Download DrugBank XML:

  2. Parse XML to JSONL:

# Parse DrugBank XML
duokinase-parse --max-records 1000  # Use --max-records for testing

# Or specify a custom configuration file
duokinase-parse --config config.yaml

Output: artifacts/drugbank.parsed.jsonl

  3. Build Heterogeneous Graph:

# Construct graph with Drugs, Proteins, Pathways nodes
duokinase-build-graph

Output: artifacts/graph_stats.json
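
As a rough illustration of what the graph-building step does, the sketch below turns parsed records into typed node sets and typed edge sets, then summarizes them. The record fields (`drug_id`, `targets`, `pathways`), the relation names, and the layout of the statistics are assumptions; the actual schema of `drugbank.parsed.jsonl` and `graph_stats.json` is not documented here.

```python
import json
from collections import defaultdict


def load_jsonl(path):
    """Read one JSON object per non-empty line."""
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


def build_hetero_graph(records):
    """Collect Drug, Protein and Pathway nodes plus typed edges between them."""
    nodes = {"drug": set(), "protein": set(), "pathway": set()}
    edges = defaultdict(set)
    for rec in records:
        drug = rec["drug_id"]  # hypothetical field name
        nodes["drug"].add(drug)
        for protein in rec.get("targets", []):  # hypothetical field name
            nodes["protein"].add(protein)
            edges[("drug", "targets", "protein")].add((drug, protein))
        for pathway in rec.get("pathways", []):  # hypothetical field name
            nodes["pathway"].add(pathway)
            edges[("drug", "in_pathway", "pathway")].add((drug, pathway))
    return nodes, dict(edges)


def graph_stats(nodes, edges):
    """Summarize node and edge counts per type in a JSON-serializable dict."""
    return {
        "num_nodes": {kind: len(ids) for kind, ids in nodes.items()},
        "num_edges": {"|".join(rel): len(pairs) for rel, pairs in edges.items()},
    }


# Toy records with placeholder drug and pathway IDs; the UniProt accessions
# are the HER2 and FGFR2b entries listed in the Dataset section.
records = [
    {"drug_id": "D1", "targets": ["P04626"], "pathways": ["PW1"]},
    {"drug_id": "D2", "targets": ["P04626", "P21802"], "pathways": []},
]
nodes, edges = build_hetero_graph(records)
stats = graph_stats(nodes, edges)
print(json.dumps(stats, indent=2))
```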


🚀 Usage

Training

Train DuoKinaseNet with default settings:

# Basic training
duokinase-train --config config.yaml

# Custom configuration
duokinase-train --config custom_config.yaml --gpu 0

Configuration Example (config.yaml):

training:
  batch_size: 64
  max_epochs: 100
  learning_rate: 1e-4
  device: "cuda"
  
spectral:
  num_eigen: 128
  diffusion_time: 1.0
  
model:
  hidden_dim: 512
  num_gnn_layers: 6
  num_heads: 8
  dropout: 0.1
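
A configuration like this is easier to work with once it is validated and typed. The sketch below takes the nested dictionary a YAML loader would return and casts each value to the type of its default. The class names, the `parse_config` helper, and the divisibility check are assumptions, not the package's `core/config.py`. The explicit cast matters because some loaders (PyYAML among them) read `1e-4` as a string when the mantissa has no decimal point.

```python
from dataclasses import dataclass, fields


@dataclass
class TrainingConfig:
    batch_size: int = 64
    max_epochs: int = 100
    learning_rate: float = 1e-4
    device: str = "cuda"


@dataclass
class SpectralConfig:
    num_eigen: int = 128
    diffusion_time: float = 1.0


@dataclass
class ModelConfig:
    hidden_dim: int = 512
    num_gnn_layers: int = 6
    num_heads: int = 8
    dropout: float = 0.1


SECTIONS = {"training": TrainingConfig, "spectral": SpectralConfig, "model": ModelConfig}


def parse_config(raw):
    """Turn a nested dict (e.g. from a YAML loader) into typed section objects."""
    unknown = set(raw) - set(SECTIONS)
    if unknown:
        raise KeyError(f"unknown config sections: {sorted(unknown)}")
    parsed = {}
    for name, cls in SECTIONS.items():
        section = raw.get(name) or {}
        # Use the type of each default value as the target type for casting.
        casts = {f.name: type(f.default) for f in fields(cls)}
        extra = set(section) - set(casts)
        if extra:
            raise KeyError(f"unknown keys in '{name}': {sorted(extra)}")
        parsed[name] = cls(**{key: casts[key](value) for key, value in section.items()})
    # Assumed constraint: multi-head attention splits hidden_dim evenly across heads.
    if parsed["model"].hidden_dim % parsed["model"].num_heads:
        raise ValueError("hidden_dim must be divisible by num_heads")
    return parsed


# The learning rate arrives as a string here, as it can from a YAML loader.
config = parse_config({"training": {"learning_rate": "1e-4", "batch_size": 64}})
```

Sections or keys missing from the input fall back to the defaults shown in the example configuration, while misspelled keys raise an error instead of being silently ignored.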

Evaluation

# Evaluate on unseen protein split
python scripts/evaluate.py \
    --checkpoint checkpoints/best_model.pth \
    --split unseen_protein

# Evaluate on custom test set
python scripts/evaluate.py \
    --checkpoint checkpoints/best_model.pth \
    --test_file data/custom_test.csv
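
For readers unfamiliar with the unseen-protein setting, the following sketch shows one common way to build such a split and to compute AUC-ROC on it. It is an illustration under assumptions: the helper names, the `(drug, protein, label)` tuple layout, and the 20% hold-out fraction are hypothetical, and the paper's exact protocol may differ.

```python
import random


def unseen_protein_split(pairs, test_fraction=0.2, seed=0):
    """Split (drug, protein, label) triples so test proteins never appear in training."""
    proteins = sorted({protein for _, protein, _ in pairs})
    random.Random(seed).shuffle(proteins)
    n_test = max(1, round(len(proteins) * test_fraction))
    held_out = set(proteins[:n_test])
    train = [p for p in pairs if p[1] not in held_out]
    test = [p for p in pairs if p[1] in held_out]
    return train, test


def roc_auc(labels, scores):
    """AUC-ROC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data with placeholder identifiers: 3 drugs crossed with 5 proteins.
pairs = [(f"drug{d}", f"prot{p}", (d + p) % 2) for d in range(3) for p in range(5)]
train, test = unseen_protein_split(pairs, test_fraction=0.2, seed=0)
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1])
```

Splitting by protein rather than by pair is what makes the benchmark hard: every test protein is new to the model, so memorizing per-protein patterns from training does not help. The pairwise AUC above is quadratic in the number of examples and is meant for clarity, not speed.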

Prediction

# Single drug-target prediction
python scripts/predict.py \
    --checkpoint checkpoints/best_model.pth \
    --drug_smiles "CC(C)Cc1ccc(cc1)[C@@H](C)C(=O)O" \
    --protein_seq "MKVLWAALLVTFLAGCQAKVE..."

# Batch prediction
python scripts/predict.py \
    --checkpoint checkpoints/best_model.pth \
    --input data/pairs.csv \
    --output results/predictions.csv

HER2/FGFR2b Synergy Screening

Run targeted screening for synergistic drug combinations:

# Screen top 50 combinations
duokinase-screen --top 50

# Screen with custom safety threshold
duokinase-screen --top 100 --safety-threshold 0.8

Output (artifacts/HER2_FGFR2b_pairs.tsv):

  • drug_H: HER2-targeting candidate
  • drug_F: FGFR2b-targeting candidate
  • score: Final synergy score
  • psi: Safety penalty (CYP conflicts, QT prolongation, etc.)
  • bliss: Bliss independence proxy
  • p_HER2, p_FGFR2b: Calibrated single-agent probabilities
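
The relationship between these columns can be sketched as follows. The Bliss independence formula is standard: for two independently acting agents the expected combined effect is p_A + p_B - p_A * p_B. How DuoKinaseNet combines the Bliss proxy with the safety penalty is not specified in this README, so the rule `score = bliss * (1 - psi)` and the helper names below are assumptions for illustration only.

```python
import csv
import io


def bliss_expected(p_a, p_b):
    """Bliss independence: expected combined effect of two independent agents."""
    return p_a + p_b - p_a * p_b


def rank_pairs(candidates, top=50):
    """Score each candidate pair and keep the highest-scoring ones."""
    rows = []
    for cand in candidates:
        bliss = bliss_expected(cand["p_HER2"], cand["p_FGFR2b"])
        # Assumed rule: discount the Bliss proxy by the safety penalty psi in [0, 1].
        rows.append({**cand, "bliss": bliss, "score": bliss * (1.0 - cand["psi"])})
    return sorted(rows, key=lambda row: row["score"], reverse=True)[:top]


def to_tsv(rows):
    """Serialize rows using the column names listed in the output description."""
    columns = ["drug_H", "drug_F", "score", "psi", "bliss", "p_HER2", "p_FGFR2b"]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Placeholder candidates with made-up probabilities, not model output.
candidates = [
    {"drug_H": "H1", "drug_F": "F1", "p_HER2": 0.5, "p_FGFR2b": 0.5, "psi": 0.5},
    {"drug_H": "H2", "drug_F": "F2", "p_HER2": 0.5, "p_FGFR2b": 0.5, "psi": 0.0},
]
ranked = rank_pairs(candidates, top=50)
print(to_tsv(ranked))
```

Under this assumed rule, two pairs with identical single-agent probabilities are ordered purely by their safety penalty, which is the intended effect of a safety-aware score.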

📂 Project Structure

DuoKinaseNet/
├── duokinase/
│   ├── models/
│   │   ├── encoders.py        # DrugEncoder, ProteinEncoder, CrossViewFusion
│   │   ├── aggregator.py      # HetGraphAggregator with SPSE
│   │   └── heads.py           # PredictionHead, ContrastiveHead
│   ├── graph/
│   │   ├── build.py           # Graph construction
│   │   ├── spectral.py        # Spectral operators
│   │   └── synergy.py         # Synergy scoring logic
│   ├── training/
│   │   ├── loop.py            # Trainer with meta-learning
│   │   └── losses.py          # Loss functions
│   ├── core/
│   │   ├── config.py          # Configuration management
│   │   └── safety.py          # Safety penalty calculation
│   └── data/
│       └── preprocess.py      # Data preprocessing
├── scripts/
│   ├── train.py               # Training script
│   ├── evaluate.py            # Evaluation script
│   └── predict.py             # Prediction script
├── configs/
│   └── default.yaml           # Default configuration
├── tests/
│   └── test_model.py          # Unit tests
├── config.yaml                # Main configuration file
├── requirements.txt           # Dependencies
├── setup.py                   # Package setup
└── README.md                  # This file

🧪 Testing

Smoke Test

Run a quick test on a subset of the data:

duokinase-smoke

Full Test Suite

pytest tests/ -v

📈 Results

Performance Comparison (Unseen Protein Benchmark)

| Model | HER2 AUC-ROC | FGFR2b AUC-ROC | Avg. AUC |
| --- | --- | --- | --- |
| SVM | 0.743 | 0.729 | 0.736 |
| Random Forest | 0.768 | 0.751 | 0.760 |
| XGBoost | 0.791 | 0.778 | 0.785 |
| DeepDTA | 0.812 | 0.798 | 0.805 |
| DeepConv-DTI | 0.829 | 0.814 | 0.822 |
| GraphDTA | 0.845 | 0.831 | 0.838 |
| MolTrans | 0.856 | 0.843 | 0.850 |
| DrugBAN | 0.864 | 0.849 | 0.857 |
| SP-DTI | 0.873 | 0.861 | 0.867 |
| DuoKinaseNet | 0.903 | 0.895 | 0.899 |

Ablation Study (% AUC Drop)

| Component Removed | HER2 | FGFR2b | Avg. |
| --- | --- | --- | --- |
| Complete SPSE | -7.65% | -7.65% | -7.65% |
| Diffusion Distance | -4.78% | -4.78% | -4.78% |
| Spectral Features | -2.17% | -2.17% | -2.17% |
| Cross-Attention Fusion | -3.45% | -3.45% | -3.45% |
| Contrastive Learning | -5.84% | -5.84% | -5.84% |

📖 Citation

If you use this code in your research, please cite our paper:

@article{zhang2026duokinasenet,
  title={The structure-preserving spectral graph neural network for dual kinase inhibitors and synergy scoring in gastric cancer},
  author={Zhang, Yang and Yuan, Chunhong and Wang, Longgang and Chen, Yujia and Xing, Yanpeng and Sun, Yuanlin},
  journal={npj Digital Medicine},
  volume={9},
  number={1},
  pages={1},
  year={2026},
  publisher={Nature Publishing Group},
  doi={10.1038/s41746-025-02240-7}
}

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


📧 Contact

For questions, collaborations, or issues:

GitHub Issues: Please report bugs or request features via GitHub Issues


🙏 Acknowledgments

This work was supported by:

  • Scientific and Technological Development Program of Jilin Province (Grant No. YDZJ202201ZYTS133)
  • Jilin Province Department of Education (Grant No. JJKH20221074KJ)
  • Natural Science Foundation of Shandong Province (Grant No. ZR2025QC1746)

Special thanks to the DrugBank team for maintaining the comprehensive drug database.


🔗 Related Resources


Last Updated: January 2026
