mathemusician/ocr_pytorch

 
 

lightning_text_detection

Demo

CTPN detects text regions (green boxes) and CRNN recognizes the text within each region:

OCR Demo Result
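
The two-stage flow described above can be sketched in a few lines. This is only an illustration of how detection and recognition compose, not the code in demo.py: `run_ocr`, `crop`, and the `detect` / `recognize` callables are hypothetical names, boxes are assumed to be `(x1, y1, x2, y2)` tuples, and the "image" is a plain list of rows so the sketch runs without any dependencies.

```python
def crop(image, box):
    """Cut the region (x1, y1, x2, y2) out of an image stored as a list of rows."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]


def run_ocr(image, detect, recognize):
    """Detect text boxes, then recognize the text inside each box.

    `detect` stands in for the CTPN stage and `recognize` for the CRNN stage;
    both are hypothetical callables supplied by the caller.
    """
    # Visit boxes top-to-bottom, then left-to-right (a simple reading order).
    boxes = sorted(detect(image), key=lambda b: (b[1], b[0]))
    return [(box, recognize(crop(image, box))) for box in boxes]


if __name__ == "__main__":
    # Toy "image" made of characters, with stub models, to show the data flow.
    image = [list("hi  yo"), list("      "), list("ok    ")]
    stub_detect = lambda img: [(0, 2, 2, 3), (4, 0, 6, 1), (0, 0, 2, 1)]
    stub_recognize = lambda patch: "".join("".join(row) for row in patch)
    for box, text in run_ocr(image, stub_detect, stub_recognize):
        print(box, text)
```

With real models, `detect` would wrap the CTPN forward pass plus box post-processing, and `recognize` would wrap the CRNN forward pass plus decoding.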

A full PyTorch Lightning rewrite of courao/ocr.pytorch

Text detection (CTPN) and text recognition (CRNN) — converted end-to-end from vanilla PyTorch to PyTorch Lightning.

What's different from the original?

The original repo uses hand-rolled training loops with raw PyTorch. This fork rewrites both models and data pipelines as PyTorch Lightning modules:

| Component | Original (courao) | This fork |
|---|---|---|
| CTPN model | ctpn_model.py | ctpn_model_PL.py (pl.LightningModule) |
| CTPN data | inline in train script | ctpn_data_PL.py (pl.LightningDataModule) |
| CTPN training | ctpn_train.py (manual loop) | ctpn_train_PL.py (pl.Trainer) |
| CRNN model | crnn.py | crnn_model_PL.py (pl.LightningModule) |
| CRNN data | mydataset.py | crnn_data_PL.py (pl.LightningDataModule) |
| CRNN training | train_pytorch_ctc.py (manual loop) | crnn_train_PL.py (pl.Trainer) |

This gives you automatic logging (TensorBoard), checkpointing, multi-GPU support, and cleaner separation of model/data/training logic — all out of the box.

Support for CRAFT and Transformer_STR is in progress.

Pull requests welcome!

Prerequisite

  • python-3.6+
  • pytorch-lightning-1.4.1
  • opencv-4.5.2.52
  • numpy-1.21.1
  • Pillow-8.2.0
  • pathed-1.1.00

Detection

Detection is based on CTPN; some of the code is borrowed from pytorch_ctpn.
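
CTPN predicts a sequence of narrow, fixed-width text proposals (16 pixels wide in the original paper) and then connects neighbouring proposals into text lines. The sketch below shows that connection step in a deliberately simplified, greedy form. It is not the text-line constructor used in this repo; `merge_proposals`, `vertical_overlap`, and the `max_gap` / `min_overlap` thresholds are hypothetical names and values chosen for illustration.

```python
def vertical_overlap(a, b):
    """Overlap of two boxes' y-ranges, as a fraction of the shorter box's height."""
    inter = min(a[3], b[3]) - max(a[1], b[1])
    shorter = min(a[3] - a[1], b[3] - b[1])
    return max(inter, 0) / shorter if shorter > 0 else 0.0


def merge_proposals(proposals, max_gap=16, min_overlap=0.7):
    """Greedily chain (x1, y1, x2, y2) proposals into text-line boxes.

    A proposal joins an existing line when it starts within `max_gap` pixels
    of the line's last proposal and the two overlap enough vertically.
    """
    lines = []  # each entry is the list of proposals forming one line
    for box in sorted(proposals):  # tuples sort by x1 first: left to right
        for line in lines:
            last = line[-1]
            close_enough = box[0] - last[2] <= max_gap
            if close_enough and vertical_overlap(last, box) >= min_overlap:
                line.append(box)
                break
        else:
            lines.append([box])  # no compatible line found: start a new one
    # Each text line is the bounding box of its proposals.
    return [
        (min(b[0] for b in line), min(b[1] for b in line),
         max(b[2] for b in line), max(b[3] for b in line))
        for line in lines
    ]


if __name__ == "__main__":
    proposals = [(0, 10, 16, 30), (16, 11, 32, 31), (32, 10, 48, 29),
                 (200, 100, 216, 120), (216, 101, 232, 121)]
    print(merge_proposals(proposals))
```

The five proposals in the example collapse into two line boxes: one for the three proposals near the top left and one for the pair further down and to the right.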

Recognition

Recognition is based on CRNN; some of the code is borrowed from crnn.pytorch.
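
CRNN is typically trained with CTC loss (the original training script is named train_pytorch_ctc.py), so its per-frame predictions have to be decoded by collapsing repeated labels and dropping the blank symbol. Below is a minimal sketch of greedy CTC decoding. It assumes the per-frame argmax has already been taken, that index 0 is the blank, and that label `i` maps to `alphabet[i - 1]`; the function name and the alphabet are hypothetical and may not match the repo's own decoder.

```python
from itertools import groupby


def ctc_greedy_decode(frame_labels, alphabet, blank=0):
    """Turn per-frame label indices into a string using greedy CTC rules."""
    # Step 1: collapse runs of identical labels (e.g. 8, 8 -> 8).
    collapsed = [label for label, _ in groupby(frame_labels)]
    # Step 2: drop blanks; a blank between repeats keeps double letters apart.
    return "".join(alphabet[label - 1] for label in collapsed if label != blank)


if __name__ == "__main__":
    alphabet = "abcdefghijklmnopqrstuvwxyz"
    frames = [0, 8, 8, 0, 5, 12, 12, 0, 12, 15, 15, 0]
    print(ctc_greedy_decode(frames, alphabet))  # prints "hello"
```

The blank between the two runs of label 12 is what lets the decoder emit "ll" instead of a single "l".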

Test

Download the pretrained models from Baidu Netdisk (extraction code: u2ff) or Google Drive and put the files into the checkpoints directory. Then run:

python3 demo.py

Text detection and recognition run on the image files in ./test_images, and the results are stored in ./test_result.

To test a single image, run:

python3 test_one.py [filename]

Train

The training code is in the train_code directory.

  • Train CTPN
  • Train CRNN

Licence

MIT License

About

A pytorch-lightning implementation of CTPN for text detection and recognition
