DeepECG_Docker

DeepECG_Docker is a repository for deploying deep learning models for ECG signal analysis and comparing their performance against a BERT classifier model or a specified ground truth. The pipeline can be run locally or in a Docker container. It offers three processing modes:

  • Preprocessing: Preprocess the ECG signals and save them in the preprocessing/ folder
  • Analysis: Analyze the preprocessed ECG signals and save the results in the outputs/ folder
  • Full run: Preprocess and then analyze the ECG signals, saving the preprocessed data in preprocessing/ and the results in outputs/
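
How the three modes relate to each other can be sketched in Python as follows. This is an illustration only: the `preprocess` and `analyze` functions and the `run` dispatcher are hypothetical stand-ins and are not taken from the repository's `main.py`. Only the mode names and folder names come from the list above.

```python
from typing import Callable, Dict, List


# Hypothetical stand-ins for the real pipeline stages; they only record
# which stage ran and where its results would be written.
def preprocess(log: List[str]) -> None:
    log.append("preprocess -> preprocessing/")


def analyze(log: List[str]) -> None:
    log.append("analyze -> outputs/")


# Each mode maps to the ordered list of stages it runs.
MODES: Dict[str, List[Callable[[List[str]], None]]] = {
    "preprocessing": [preprocess],
    "analysis": [analyze],
    "full_run": [preprocess, analyze],
}


def run(mode: str) -> List[str]:
    """Run the stages for `mode` and return a log of what was executed."""
    if mode not in MODES:
        raise ValueError(f"unknown mode {mode!r}; expected one of {sorted(MODES)}")
    log: List[str] = []
    for stage in MODES[mode]:
        stage(log)
    return log


if __name__ == "__main__":
    print(run("full_run"))
```

In this sketch, `full_run` is simply the preprocessing stage followed by the analysis stage, which is why the analysis mode on its own expects preprocessed data to exist already.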

🚀 Features

  • BERT-based multilabel classification model for ECG diagnosis (77 classes)
  • EfficientNet-based multilabel classification model for ECG signals (77 classes)
  • WCR-based multilabel classification model for ECG signals (77 classes)
  • WCR-based binary classification model for ECG signals (LVEF <= 40%)
  • WCR-based binary classification model for ECG signals (LVEF < 50%)
  • WCR-based binary classification model for ECG signals (Incident AFIB at 5 years)
  • EfficientNet-based binary classification model for ECG signals (LVEF <= 40%)
  • EfficientNet-based binary classification model for ECG signals (LVEF < 50%)
  • EfficientNet-based binary classification model for ECG signals (Incident AFIB at 5 years)
  • Dockerized deployment for easy setup and execution
  • Configurable pipeline for flexible usage
  • CPU & GPU support for accelerated processing

🛠️ Installation

  1. 📥 Clone the repository:

    git clone https://github.com/HeartWise-AI/DeepECG_Docker.git
    cd DeepECG_Docker
    
  2. 🔑 Set up your HuggingFace API key:

    • Create a HuggingFace account if you don't have one yet
    • Request access to the DeepECG models you need in the heartwise-ai/DeepECG repository
    • Create an API key on the HuggingFace website under User Settings -> API Keys -> Create API Key -> Read
    • Add your API key to the api_key.json file in the root directory, in the following format:
      {
        "huggingface_api_key": "your_api_key_here"
      }
  3. 📄 Populate a CSV file with the data to be processed, for example inputs/data_rows_template.csv (see Usage for more details)

    • If using DICOMs, update the root path in extract_metada_from_dicoms.py, then run the script to extract the metadata from the DICOMs
      python utils/extract_metada_from_dicoms.py
      
  4. 🐳 Build the docker image:

    docker build -t deepecg-docker .
    
  5. 🚀 Run the docker container: (see Docker for more details)

    docker run --gpus all --name deepecg -v $(pwd)/inputs:/app/inputs -v $(pwd)/outputs:/app/outputs -v $(pwd)/ecg_signals:/app/ecg_signals:ro -v $(pwd)/preprocessing:/app/preprocessing -i deepecg-docker
    
  6. Connect to the container

    docker exec -it deepecg bash
    
  7. Run pipeline

    bash run_pipeline.bash --mode full_run --csv_file_name data_rows_template.csv
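
Before building the image, it can help to confirm that api_key.json from step 2 is well formed. The following is a minimal pre-flight sketch, not the loader used by the pipeline itself: the function name `load_hf_api_key` is hypothetical, and only the file name and the `huggingface_api_key` field come from the steps above.

```python
import json
from pathlib import Path


def load_hf_api_key(path: str = "api_key.json") -> str:
    """Return the value stored under "huggingface_api_key", or raise a clear error."""
    file = Path(path)
    if not file.is_file():
        raise FileNotFoundError(f"{path} not found; create it as described in step 2")

    data = json.loads(file.read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        raise ValueError(f"{path} must contain a JSON object")

    key = data.get("huggingface_api_key")
    # Reject a missing key, an empty key, and the untouched placeholder value.
    if not isinstance(key, str) or not key or key == "your_api_key_here":
        raise ValueError(f'{path} must define a non-empty "huggingface_api_key"')
    return key
```

Calling `load_hf_api_key()` from the repository root either returns the key or fails with a message naming the problem, which is easier to diagnose than an authentication error during model download.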
    

Project Structure

DeepECG_Docker/
│
├── models/
│   ├── bert_classifier.py
│   ├── efficientnet_wrapper.py
│   ├── heartwise_models_factory.py
│   └── resnet_wrapper.py
│
├── inputs/
│   └── data_rows_template.csv
│
├── outputs/
│   └── (output files will be generated here)
│
├── preprocessing/
│   └── (preprocessed files will be saved here)
│
├── utils/
│   └── ...
│
├── dockerfile
├── heartwise.config
├── api_key.json
├── main.py
├── README.md
├── requirements.txt
└── run_pipeline.bash

Models

  1. BertClassifier:

    • Uses a fine-tuned BERT architecture to classify ECG diagnoses into 77 classes.
    • More information here
  2. EfficientV2_77_classes:

    • Uses the EfficientNetV2 architecture to classify ECG signals into 77 classes.
    • More information here
  3. EfficientV2_LVEF_Equal_Under_40:

    • Uses the EfficientNetV2 architecture for binary classification of ECG signals: LVEF <= 40%.
    • More information here
  4. EfficientV2_Under_50:

    • Uses the EfficientNetV2 architecture for binary classification of ECG signals: LVEF < 50%.
    • More information here
  5. EfficientV2_Incident_AFIB_At_5_Years:

    • Uses the EfficientNetV2 architecture for binary classification of ECG signals: incident AFIB at 5 years.
    • More information here
  6. WCR_77_classes:

    • Uses the WCR architecture to classify ECG signals into 77 classes.
    • More information here
  7. WCR_LVEF_Equal_Under_40:

    • Uses the WCR architecture for binary classification of ECG signals: LVEF <= 40%.
    • More information here
  8. WCR_LVEF_Under_50:

    • Uses the WCR architecture for binary classification of ECG signals: LVEF < 50%.
    • More information here
  9. WCR_Incident_AFIB_At_5_Years:

    • Uses the WCR architecture for binary classification of ECG signals: incident AFIB at 5 years.
    • More information here

📄 Usage

  1. Prepare your input data:

    • Create a CSV file based on the template in inputs/data_rows_template.csv
    • For each model, add two columns, a label column and its matching ECG file name column, paired as follows:
      'ecg_machine_diagnosis': '77_classes_ecg_file_name',
      'afib_5y': 'afib_ecg_file_name',
      'lvef_40': 'lvef_40_ecg_file_name',
      'lvef_50': 'lvef_50_ecg_file_name'
      
    • ecg_machine_diagnosis (string): Diagnosis from the ECG machine
    • 77_classes_ecg_file_name (string): ECG signal file name for the machine ECG diagnosis
    • afib_5y (int): Binary label for incident AFIB at 5 years
    • afib_ecg_file_name (string): ECG signal file name for incident AFIB at 5 years
    • lvef_40 (int): Binary label for LVEF <= 40%
    • lvef_40_ecg_file_name (string): ECG signal file name for LVEF <= 40%
    • lvef_50 (int): Binary label for LVEF < 50%
    • lvef_50_ecg_file_name (string): ECG signal file name for LVEF < 50%
    • Place your input CSV file in the inputs/ directory
    • Replace the data_rows_template.csv filename in the heartwise.config file with the name of your CSV file
  2. Pipeline configuration:

    • Edit the heartwise.config file to set the desired configuration. When using Docker, you only need to change the CSV filename:

      • diagnosis_classifier_device: Specifies the device to be used for the diagnosis classifier model. Example: cuda:0 for using the first GPU.
      • signal_processing_device: Specifies the device to be used for the signal processing model. Example: cuda:0 for using the first GPU.
      • batch_size: Defines the batch size for processing the data. Example: 32.
      • output_folder: The directory where the output files will be saved. Example: /app/outputs.
      • hugging_face_api_key_path: The path to the file containing the HuggingFace API key. Example: /app/api_key.json.
      • use_efficientnet: Boolean value to specify if the EfficientNet model should be used. Example: True.
      • use_wcr: Boolean value to specify if the WCR model should be used. Example: True.
      • data_path: The path to the input CSV file containing the data. Example: /app/inputs/data_rows_template.csv.
      • mode: The mode of the pipeline (overridden by the docker command line). Example: analysis | preprocessing | full_run.
      • ecg_signals_path: The path to the ECG signal files passed in the docker command line. Example: /app/ecg_signals.
      • preprocessing_folder: The path to the folder where the preprocessed files will be saved. Example: /app/preprocessing.
      • preprocessing_n_workers: The number of workers to be used for the preprocessing. Example: 16.
  3. Notes:

    • Single ECG processing: When running the pipeline with only one ECG file, metrics computation (AUC, F1, etc.) is automatically skipped since these metrics require multiple samples. Predictions are still generated and saved normally.
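
The column layout from step 1 can be sketched in Python as follows. The column names and their pairing come from the list above. The column order, the helper names `build_header` and `rows_to_csv`, and the values in the example row are assumptions made for illustration, so check them against inputs/data_rows_template.csv before relying on them.

```python
import csv
import io
from typing import Dict, List

# Label column -> ECG file name column, as paired in step 1.
COLUMN_PAIRS: Dict[str, str] = {
    "ecg_machine_diagnosis": "77_classes_ecg_file_name",
    "afib_5y": "afib_ecg_file_name",
    "lvef_40": "lvef_40_ecg_file_name",
    "lvef_50": "lvef_50_ecg_file_name",
}


def build_header() -> List[str]:
    """Return the eight column names, each label followed by its file name column."""
    header: List[str] = []
    for label_col, file_col in COLUMN_PAIRS.items():
        header.extend([label_col, file_col])
    return header


def rows_to_csv(rows: List[Dict[str, object]]) -> str:
    """Serialize rows to CSV text; columns missing from a row are left empty."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=build_header(), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical example row: the diagnosis text and file name are placeholders.
example_row = {
    "ecg_machine_diagnosis": "Sinus rhythm",
    "77_classes_ecg_file_name": "ecg_0001",
    "afib_5y": 0,
    "afib_ecg_file_name": "ecg_0001",
}

if __name__ == "__main__":
    print(rows_to_csv([example_row]))
```

Rows that only carry labels for some of the models leave the remaining columns empty, which keeps the header identical across files.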

Testing

Run the error-collector unit tests (no GPU or data required):

python tests/test_error_collector.py

To verify that errors are collected and printed at the end (no data or GPU needed):

python main.py \
  --mode analysis \
  --data_path /nonexistent.csv \
  --output_folder /tmp/out \
  --preprocessing_folder /tmp/pre \
  --hugging_face_api_key_path api_key.json \
  --use_wcr False \
  --use_efficientnet False \
  --ecg_signals_path /tmp

You should see Errors encountered: followed by a clear message (e.g. file not found) instead of a raw traceback.

Run the full pipeline from the project root (requires heartwise.config and data). From inside the container or after installing dependencies locally:

bash run_pipeline.bash --mode full_run --csv_file_name data_rows_template.csv

To run main.py directly with explicit arguments:

python main.py \
  --mode analysis \
  --data_path inputs/your_data.csv \
  --output_folder outputs \
  --preprocessing_folder preprocessing \
  --hugging_face_api_key_path api_key.json \
  --use_wcr True \
  --use_efficientnet True \
  --ecg_signals_path ecg_signals

If any step fails, the pipeline collects error messages and prints them at the end under Errors encountered:.
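
The collect-then-report behaviour described above can be sketched as follows. This is a generic illustration of the pattern and not the repository's error collector: the `ErrorCollector` class, its `run` and `report` methods, and the `load_csv` step are hypothetical. Only the "Errors encountered:" heading comes from the text above.

```python
from typing import Callable, List


class ErrorCollector:
    """Run steps one by one, remember failures, and report them together."""

    def __init__(self) -> None:
        self.errors: List[str] = []

    def run(self, step_name: str, step: Callable[[], None]) -> bool:
        """Execute `step`; on failure store a readable message instead of raising."""
        try:
            step()
            return True
        except Exception as exc:
            self.errors.append(f"{step_name}: {exc}")
            return False

    def report(self) -> str:
        """Return the summary text, or an empty string when nothing failed."""
        if not self.errors:
            return ""
        return "\n".join(["Errors encountered:"] + [f"  - {e}" for e in self.errors])


# Hypothetical failing step, mirroring the /nonexistent.csv check above.
def load_csv() -> None:
    raise FileNotFoundError("/nonexistent.csv not found")


collector = ErrorCollector()
collector.run("load data", load_csv)
collector.run("no-op step", lambda: None)
print(collector.report())
```

Deferring the report to the end lets later, independent steps still run, and the user sees every problem in one place rather than only the first traceback.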

🐳 Docker

Interactive shell (recommended for Cursor / IDE terminals)

If docker run -it ... hangs or shows a blank screen in Cursor’s terminal, start the container in the background and attach a shell with docker exec -it. The image keeps the container running by default.

1. Start the container (no -it):

docker run -d --gpus all --name deepecg \
  -v $(pwd)/inputs:/app/inputs \
  -v $(pwd)/outputs:/app/outputs \
  -v $(pwd)/ecg_signals:/app/ecg_signals:ro \
  -v $(pwd)/preprocessing:/app/preprocessing \
  deepecg-docker

2. Open an interactive shell:

docker exec -it deepecg bash

You’ll get a prompt inside the container. Run the pipeline manually when you’re ready, e.g.:

./run_pipeline.bash --mode full_run --csv_file_name data_rows_template.csv

When you’re done, exit the shell (exit) and stop the container: docker stop deepecg. Remove it before the next run if you reuse the name: docker rm deepecg (or use docker rm -f deepecg to remove a running container).
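
If you start the container from a script rather than by hand, the `docker run` command from step 1 can be assembled programmatically. The sketch below only builds the argument list and does not call Docker. The function name `docker_run_command` is hypothetical, while the flags, mount targets, container name, and image name are the ones shown above.

```python
import shlex
from pathlib import Path
from typing import List


def docker_run_command(
    project_root: str, name: str = "deepecg", image: str = "deepecg-docker"
) -> List[str]:
    """Build the detached `docker run` argument list with the four bind mounts."""
    root = Path(project_root).resolve()
    # (host folder, container path, read-only?) for each mount used above.
    mounts = [
        ("inputs", "/app/inputs", False),
        ("outputs", "/app/outputs", False),
        ("ecg_signals", "/app/ecg_signals", True),
        ("preprocessing", "/app/preprocessing", False),
    ]
    cmd = ["docker", "run", "-d", "--gpus", "all", "--name", name]
    for folder, target, read_only in mounts:
        spec = f"{root / folder}:{target}" + (":ro" if read_only else "")
        cmd += ["-v", spec]
    cmd.append(image)
    return cmd


# Print the command for inspection; nothing is executed here.
print(shlex.join(docker_run_command(".")))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once Docker and the image are available. Using absolute host paths avoids depending on the shell's `$(pwd)` expansion.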

🤝 Contributing

Contributions to the DeepECG_Docker repository are welcome! To contribute:

  1. Fork the repository
  2. Create a new branch for your feature or bug fix
  3. Make your changes and commit them with clear, descriptive messages
  4. Push your changes to your fork
  5. Submit a pull request to the main repository

📚 Citation

If you find this repository useful, please cite our work:

@article {Nolin-Lapalme2025.03.02.25322575,
	author = {Nolin-Lapalme, Alexis and Sowa, Achille and Delfrate, Jacques and Tastet, Olivier and Corbin, Denis and Kulbay, Merve and Ozdemir, Derman and No{\"e}l, Marie-Jeanne and Marois-Blanchet, Fran{\c c}ois-Christophe and Harvey, Fran{\c c}ois and Sharma, Surbhi and Ansari, Minhaj and Chiu, I-Min and Dsouza, Valentina and Friedman, Sam F. and Chass{\'e}, Micha{\"e}l and Potter, Brian J. and Afilalo, Jonathan and Elias, Pierre Adil and Jabbour, Gilbert and Bahani, Mourad and Dub{\'e}, Marie-Pierre and Boyle, Patrick M. and Chatterjee, Neal A. and Barrios, Joshua and Tison, Geoffrey H. and Ouyang, David and Maddah, Mahnaz and Khurshid, Shaan and Cadrin-Tourigny, Julia and Tadros, Rafik and Hussin, Julie and Avram, Robert},
	title = {Foundation models for generalizable electrocardiogram interpretation: comparison of supervised and self-supervised electrocardiogram foundation models},
	elocation-id = {2025.03.02.25322575},
	year = {2025},
	doi = {10.1101/2025.03.02.25322575},
	publisher = {Cold Spring Harbor Laboratory Press},
	URL = {https://www.medrxiv.org/content/early/2025/03/05/2025.03.02.25322575},
	eprint = {https://www.medrxiv.org/content/early/2025/03/05/2025.03.02.25322575.full.pdf},
	journal = {medRxiv}
}
