Skip to content

BensonRen/eye_nerve_segmentation

Repository files navigation

Eye segmentation project

A project collaboration with Med school to segment the eye nerve from eye CT scans

Dependencies

Package
Pytorch
Torchvision
pandas
scipy
scikit-learn

Data Pre-processing procedure

  1. Get the binary mask from the .csv file (image size [496 x 1536]
python utils/get_mask.py
  1. Split the training and testing sets manually by patient id (top 5480 images)
python utils/train_test_split.py
  1. Pre-process the each image and binary mask into 512 x 512 by cutting into 3 pieces and mirror symmetric padding
python utils/pre_processing.py

To-do list:

  • make the README file explaining the project
  • understand the type of data that I need in this project
  • make the helper function to read and load pictures into Pytorch dataset
  • make the helper function to read labels into Pytorch dataset
  • start finish the Unet model
  • Debug and run the training module
  • Work on GPU version of minimal training product
  • Add loss monitoring
  • Solve the environment issue
  • Add evaluation module
  • Split the dataset
  • Change the epoch based training into mini-batch based training
  • Debug the train test split and pre-processing
  • Try full datasets training
  • Change the time recorder from epoch base to batch-base
  • Add the plotting module during training and output plotting to the tensorboard
  • Debug the problem of output segmentation map is all 0 or 1
  • Debug confusion matrix plot generating
  • Change the target mask from [1,0] structure to [0, 1] structure so the confusion plot better reflects the IoU values
  • Time profiling to find the training bottleneck issue
  • Check RAM issue if that is causing the program to slow down
  • Debug the full dataset training bottleneck
  • Full set IoU evaluation instead of single image ones
  • Evaluation master function
  • Evaluation plot function
  • Evaluation module which loads from the trained model
  • Debug the evaluation function
  • Debug the IoU bigger than 1 issue
  • Add post processing to the output result
  • Add the name tags to the data reader so that while needed it has access to print
  • Revise plotting module for saving label name
  • Ablation study on Post-processing code set up
  • Ablation study on Dice Loss code set up
  • Ablation study on pretrained-18, pretrained-50, standard_Unet_from_scratch code set up
  • Ablation study on Batch_size code set up
  • Debug the Ablation study code
  • THE LABEL ISSUE: SHOULD USE RNFL AS LABEL INSTEAD OF EPR, DOING LABELING AGAIN
  • Add the evaluation bulk module to do multi-model evaluation
  • Add the evaluation summary writing to the evaluation function
  • Debug the evaluation new code
  • Run on DCC the above-mentioned ablation studies
  • Add the morphological closing and "hole chlosing" post processing to the code
  • Run the 5 Epoch run on 2 GPUs for total length
  • Write the weighted edge code
  • Debug the morphological code and run the post-processing ablation studies
  • Analysis the post-processing effect on 10k inference
  • Get the fixed random seed into data loader
  • Debug the weighted edge code using different kernel sizes
  • Run model on "low quality images" to test out the comparison

Time profiling

Update on 2020.07.02: The time performance issue is resolved by re-installing gpu version of pytorch

Time profile results

To address the training time consumption issue (takes about 30s to train on each image), a time profiling job is done both using the Cprofiler and the manual checkpoints added to the program. The Cprofiler outputs around 19M function calls, which is not very readable by human. Therefore the table of manual checkpoints are recorded below.

Time performance on training function level

Operation Approximate Time taken
enter epoch, set up metric holder 0.012 s
set model to train state 0.03s
take the grouped data point from train loader 0.85s
get the image and binary mask from the grouped data point 0.01s
put the img and mask on gpu 0.01s
zero the gradient 0.01 s
logit=model(img) 12.2s
make loss 0.1s
loss.backward() 20s
optm.step() 0.2s
adding training samples 0.02s
calculate IoU and adding losses to tensorboard 0.06s
change model to evaluation mode 0.03s

Time performance on forward model level

Operation(in channel, out channel, kernelsize, padding) Approximate Time taken Accumulate time
model preparation 0.47s 0.47s
conv(3,64,3,1)+relu 0.2s 0.68s
conv(64,64,3,1)+relu 1.5s 2.12s
First 3 layer of ResNet 18 0.16s 2.28s
4-5 layer of ResNet 18 0.4s 2.68s
6th layer of ResNet 18 0.4s 3.02s
7th layer of ResNet 18 0.16s 3.18s
8th layer of ResNet 18 0.15s 3.33s
conv(512,512,1,0)+relu 0.01s 3.34s
Upsample 2 times 0.01s 3.35s
conv(256,256,1,0)+relu 0.01s 3.36s
concatenate 0.01s 3.37s
conv(256+512,512,3,1)+relu 0.2s 3.57s
Upsample 2 times 0.05s 3.62s
conv(128,128,1,0)+relu 0.01s 3.63s
concatenate 0.01s 3.64s
conv(128+512,256,3,1)+relu 0.4s 4.02s
Upsample 2 times 0.08s 4.10s
conv(64,64,1,0)+relu 0.01s 4.11s
concatenate 0.02s 4.13s
conv(64+256,256,3,1)+relu 0.76s 4.89s
Upsample 2 times 0.65s 5.54s
conv(64,64,1,0)+relu 0.05s 5.59s
concatenate 0.05s 5.56s
conv(64+256,128,3,1)+relu 1.8s 7.53s
Upsample 2 times 0.7s 8.21s
concatenate 0.12s 8.33s
conv(64+128, 64, 3, 1) +relu 4.4s 12.76s
conv(64,2,1,0) 0.06s 12.82s

|

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages