researchintegrity/panel-extractor

Panel Extractor

Extract panels from scientific images using YOLO object detection.

Features

  • Fast YOLOv5 inference for panel detection
  • Automatic panel classification (Blots, Graphs, Microscopy, Body Imagery, Flow Cytometry)
  • Batch processing of multiple images
  • Docker support (CPU and GPU)
  • Minimal dependencies (only runtime requirements)

Installation

From Source

git clone https://github.com/researchintegrity/panel-extractor.git
cd panel-extractor
pip install -e .

GPU Support (Recommended)

For NVIDIA GPU acceleration:

pip install -r requirements.txt --extra-index-url https://download.pytorch.org/whl/cu121

Quick Start

Command Line

# Single image
panel-extract \
  --input-path image.png \
  --output-path ./output \
  --weights model_4_class.pt

# Multiple images
panel-extract \
  --input-path fig1.png fig2.png fig3.png \
  --output-path ./output

Python API

from extract import run

run(
    input_path=['fig1.png', 'fig2.png'],
    output_path='./output',
    weights='model_4_class.pt',
    device='cpu',  # or '0' for GPU
    imgsz=640,
    conf_thres=0.4,
    iou_thres=0.4,
    save_img=True,
)
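To process a whole folder, the file list can be built before calling run. The following is a minimal sketch, assuming input_path accepts a list of path strings as in the example above. The names collect_images, extract_folder, and IMAGE_EXTENSIONS are hypothetical helpers for illustration and are not part of the package.

```python
from pathlib import Path

# Hypothetical set of extensions; adjust to the formats your figures use.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp"}


def collect_images(folder):
    """Return sorted paths (as strings) of image files directly inside folder."""
    return sorted(
        str(p)
        for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


def extract_folder(folder, output_path, weights="model_4_class.pt"):
    """Run panel extraction on every image in folder (illustrative wrapper)."""
    from extract import run  # same import as in the example above

    images = collect_images(folder)
    if images:  # skip the call when the folder holds no images
        run(input_path=images, output_path=output_path, weights=weights)
    return images
```

Calling extract_folder("./figures", "./output") would then pass every matching file in ./figures to a single run call; the folder name is a placeholder.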

Arguments

--input-path, -i      Path(s) to input image(s) [required]
--output-path, -o     Directory for output [required]
--weights             Path to model weights (default: model_4_class.pt)
--device              Device: 'cpu' or GPU index like '0' (default: cpu)
--imgsz               Inference image size (default: 640)
--conf-thres          Confidence threshold (default: 0.4)
--iou-thres           NMS IoU threshold (default: 0.4)
--save-img            Save extracted panel images (default: True)
--save-txt            Save bounding box coordinates (default: False)
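
The two thresholds act in sequence during post-processing: detections scoring below --conf-thres are dropped, then non-maximum suppression (NMS) discards any box whose IoU with a higher-scoring kept box exceeds --iou-thres. The sketch below illustrates the standard greedy procedure in plain Python. It is not the package's own code, it ignores class labels for simplicity, and iou and greedy_nms are illustrative names.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x0, y0, x1, y1)."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(detections, conf_thres=0.4, iou_thres=0.4):
    """detections: list of (box, score) pairs. Returns the kept pairs."""
    # Step 1: drop low-confidence detections, then sort best first.
    candidates = sorted(
        (d for d in detections if d[1] >= conf_thres),
        key=lambda d: d[1],
        reverse=True,
    )
    # Step 2: keep a box only if it does not overlap a kept box too much.
    kept = []
    for box, score in candidates:
        if all(iou(box, k[0]) <= iou_thres for k in kept):
            kept.append((box, score))
    return kept


detections = [
    ((100, 50, 450, 300), 0.90),   # strong detection
    ((110, 60, 460, 310), 0.75),   # near-duplicate of the first box
    ((500, 100, 950, 400), 0.80),  # separate panel
    ((0, 0, 40, 40), 0.20),        # below the confidence threshold
]
kept = greedy_nms(detections)
print(kept)
```

With the default thresholds, the near-duplicate and the low-confidence box are removed and two detections remain. A lower --iou-thres suppresses overlapping boxes more aggressively; a higher one lets more overlapping panels through.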

Docker Usage

CPU-only

docker build -f Dockerfile.minimal -t panel-extractor:cpu .
docker run --rm \
  -v $(pwd)/input:/app/input \
  -v $(pwd)/output:/app/output \
  panel-extractor:cpu \
  --input-path /app/input/image.png \
  --output-path /app/output

GPU-accelerated

docker build -f Dockerfile.cuda -t panel-extractor:cuda .
docker run --gpus all --rm \
  -v $(pwd)/input:/app/input \
  -v $(pwd)/output:/app/output \
  panel-extractor:cuda \
  --input-path /app/input/image.png \
  --output-path /app/output \
  --device 0

Model Weights

Download pre-trained model weights:

pip install gdown
gdown 1CuSUYUF0uTbcANFRffzoMUllCP8Du-HT
unzip panel_extraction_models.zip

Output

The script generates:

  1. Extracted panels: Individual image files for each detected panel
  2. CSV report: PANELS.csv with format:
    FIGNAME,ID,LABEL,X0,Y0,X1,Y1
    fig1,1,Blot,100,50,450,300
    fig1,2,Graph,500,100,950,400
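
The report can be read back with Python's csv module. This is a minimal sketch, assuming the header row shown above is present in the file and the coordinates are integer pixel values. load_panels is a hypothetical helper, and the path in the comment is an assumption about where the report is written.

```python
import csv
import io
from collections import defaultdict


def load_panels(csv_file):
    """Group rows of a PANELS.csv-style file by figure name."""
    panels = defaultdict(list)
    for row in csv.DictReader(csv_file):
        x0, y0, x1, y1 = (int(row[k]) for k in ("X0", "Y0", "X1", "Y1"))
        panels[row["FIGNAME"]].append(
            {
                "id": int(row["ID"]),
                "label": row["LABEL"],
                "box": (x0, y0, x1, y1),
                "width": x1 - x0,
                "height": y1 - y0,
            }
        )
    return dict(panels)


# Sample rows copied from the format above. For a real report, pass an open
# file instead, e.g. open("output/PANELS.csv", newline="") (assumed location).
sample = io.StringIO(
    "FIGNAME,ID,LABEL,X0,Y0,X1,Y1\n"
    "fig1,1,Blot,100,50,450,300\n"
    "fig1,2,Graph,500,100,950,400\n"
)
panels = load_panels(sample)
for p in panels["fig1"]:
    print(p["id"], p["label"], p["width"], p["height"])
```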
    

Requirements

  • Python ≥ 3.8
  • PyTorch ≥ 1.9
  • Minimal dependencies: numpy, opencv-python, torch, torchvision, Pillow, PyYAML, tqdm

See requirements.txt for the complete list.

Performance Tips

  • GPU Mode: 10-50x faster than CPU; select a GPU with --device 0
  • Batch Processing: Pass several images in one invocation for better throughput
  • Image Size: Lower --imgsz (e.g., 512) for faster inference
  • Confidence: Raise --conf-thres for higher precision, lower it for higher recall
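
As a rough guide to the image-size tip: for a convolutional detector, the work per image grows approximately with the number of input pixels, so the saving from a smaller --imgsz can be estimated from the square of the size ratio. This is back-of-the-envelope arithmetic under that assumption, not a benchmark of this package, and relative_pixel_cost is an illustrative name.

```python
def relative_pixel_cost(imgsz, baseline=640):
    """Pixel count of a square imgsz input relative to the baseline size."""
    return (imgsz / baseline) ** 2


for size in (640, 512, 416, 320):
    print(size, f"{relative_pixel_cost(size):.2f}")
```

For example, 512 has 0.64 times the pixels of the default 640, so about a third of the per-image work is saved under this estimate. Actual speedups depend on hardware and on the fixed overhead of loading images and the model.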

Troubleshooting

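CUDA out of memory

Lower the inference image size to reduce GPU memory use:

panel-extract --imgsz 512 --input-path image.png --output-path output

Import errors

Reinstall PyTorch and torchvision from the CUDA 12.1 wheel index:

pip uninstall torch torchvision -y
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121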

License

AGPL-3.0 - See LICENSE file

Issues & Support

GitHub Issues
