rvrane/EmbryoLens

EmbryoLens

A Control-FREEC-based CNV calling pipeline for preimplantation genetic testing for aneuploidy (PGT-A) in embryos, using a panel-of-normals (PoN) approach.


Overview

EmbryoLens implements an end-to-end bioinformatics pipeline for aneuploidy and copy number variation (CNV) detection from NGS data of trophectoderm (TE) biopsies. The pipeline uses Control-FREEC for copy number profiling with a custom panel-of-normals (PoN) reference built from euploid embryo samples to improve specificity and reduce noise from low-input, amplified DNA.

Key features:

  • Panel-of-normals reference construction and management
  • Batch-mode Control-FREEC execution for high-throughput PGT-A
  • Autosome copy number uniformity verification from BAM files
  • Automated PDF report generation per sample
  • Chromosome-level CNV visualization plots

Repository Structure

EmbryoLens/
├── PoN_new/                      # Panel-of-normals sample directory
├── freec_PoN_m/                  # Control-FREEC PoN reference files
├── scripts/                      # Helper and utility scripts
├── src/                          # Source scripts
├── data/                         # Sample data (excluded from VCS)
├── tmp_pofn/                     # Temporary PoN working files
├── tmp_pofn_bed/                 # Temporary BED format PoN files
├── 562_panel_of_normals.bed      # Pre-built PoN BED reference (n=562)
├── autosome_cnv_check.sh         # Autosome copy number verification QC
├── chr_freec_graph_batch_*.sh    # Batch chromosome CNV graph generation
├── freec_bed_pon_batch.sh        # Batch FREEC run with BED-based PoN
├── freec_pon_batch.sh            # Batch FREEC run with PoN reference
├── freec_results_pdf.py          # PDF report generator (v1)
├── freec_results_pdf_v2.py       # PDF report generator (v2, improved)
├── freec_results_pdf_v3.py       # PDF report generator (v3, with QC metrics)
├── freec_results_pdf_v4.py       # PDF report generator (v4, final production)
└── make_chr_freec_graph.sh       # Per-sample chromosome-level CNV graph

Pipeline Workflow

BAM Files (TE biopsy, WGA amplified)
    │
    ▼
1. Autosome CNV Verification      (autosome_cnv_check.sh)
   └─ Per-chromosome (chr1-chr22) read depth profiling
   └─ Coefficient of variation (CV) QC flag: PASS / WARN
    │
    ▼
2. Control-FREEC CNV Calling      (freec_pon_batch.sh)
   └─ Runs FREEC against PoN reference (n=562)
   └─ Computes copy number and BAF profiles
    │
    ▼
3. Chromosome Graph               (chr_freec_graph_batch_*.sh)
   └─ Per-chromosome CNV visualization
    │
    ▼
4. Report Generation              (freec_results_pdf_v4.py)
   └─ Clinical PDF report with CNV calls per sample
    │
    ▼
PGT-A Result Report (.pdf)

Requirements

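| Tool          | Version | Purpose                            |
|---------------|---------|------------------------------------|
| Control-FREEC | ≥ 11.6  | CNV calling engine                 |
| SAMtools      | ≥ 1.15  | BAM processing and autosome CNV QC |
| Python        | ≥ 3.8   | PDF report generation              |
| R             | ≥ 4.0   | CNV graph generation               |
| Bash          | ≥ 4.0   | Pipeline orchestration             |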

Python dependencies:

pip install reportlab matplotlib pandas numpy

Installation

git clone https://github.com/rvrane/EmbryoLens.git
cd EmbryoLens
chmod +x *.sh

Ensure Control-FREEC is installed and available in $PATH:

# Install Control-FREEC
git clone https://github.com/BoevaLab/FREEC.git
cd FREEC/src && make
export PATH=$PATH:/path/to/FREEC/src

Usage

Step 1: Autosome Copy Number Verification

bash autosome_cnv_check.sh /path/to/sample.bam

This step computes per-chromosome read depth across chr1-chr22 and reports the coefficient of variation (CV) of those depths as a QC metric. Samples with CV > 25% are flagged as WARN and should be reviewed before proceeding to CNV calling; all other samples are flagged as PASS.
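As a rough illustration of the QC rule described above, the CV check could be sketched in Python as follows. This is not the logic of `autosome_cnv_check.sh` itself: the function names, the input format (a mapping from chromosome name to mean depth), and the use of the population standard deviation are all assumptions made for the example.

```python
from statistics import mean, pstdev

# chr1..chr22, as named in the QC description above
AUTOSOMES = [f"chr{i}" for i in range(1, 23)]
CV_WARN_THRESHOLD = 25.0  # percent; threshold taken from the README text


def autosome_cv(depth_by_chrom):
    """Return the coefficient of variation (in percent) of depth across chr1-chr22.

    depth_by_chrom is a hypothetical input: {chromosome name: mean read depth}.
    """
    depths = [depth_by_chrom[chrom] for chrom in AUTOSOMES]
    mu = mean(depths)
    if mu == 0:
        raise ValueError("mean autosomal depth is zero; cannot compute CV")
    # Assumption: population standard deviation; the real script may differ.
    return 100.0 * pstdev(depths) / mu


def qc_flag(cv_percent, threshold=CV_WARN_THRESHOLD):
    """Map a CV value to the PASS / WARN flag used in the workflow."""
    return "WARN" if cv_percent > threshold else "PASS"
```

For example, `qc_flag(autosome_cv(depths))` returns `"PASS"` when depth is uniform across the autosomes and `"WARN"` when a few chromosomes deviate strongly from the rest.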

Step 2: Run batch Control-FREEC with PoN reference

bash freec_pon_batch.sh \
  --bam_dir /path/to/bams/ \
  --pon_ref 562_panel_of_normals.bed \
  --genome hg38 \
  --threads 8

Step 3: Generate chromosome-level CNV graphs

bash chr_freec_graph_batch_20250805.sh --output_dir /path/to/output/

Step 4: Generate PDF reports

python freec_results_pdf_v4.py \
  --freec_dir /path/to/freec_output/ \
  --output_dir /path/to/reports/
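The four steps can also be chained from a small driver. The sketch below only assembles the command lines shown above, in order, and does not run them. The function name `build_commands` is hypothetical, and because the README does not state where `freec_pon_batch.sh` writes its results, the FREEC output directory is passed in explicitly as an assumption.

```python
from pathlib import Path


def build_commands(bam_dir, freec_dir, graph_dir, report_dir,
                   pon_ref="562_panel_of_normals.bed",
                   genome="hg38", threads=8,
                   graph_script="chr_freec_graph_batch_20250805.sh"):
    """Return the pipeline commands as argument lists, in execution order."""
    bam_dir = Path(bam_dir)
    commands = []
    # Step 1 takes one BAM at a time, so emit one QC command per sample.
    for bam in sorted(bam_dir.glob("*.bam")):
        commands.append(["bash", "autosome_cnv_check.sh", str(bam)])
    # Step 2: batch Control-FREEC run against the PoN reference.
    commands.append(["bash", "freec_pon_batch.sh",
                     "--bam_dir", str(bam_dir),
                     "--pon_ref", pon_ref,
                     "--genome", genome,
                     "--threads", str(threads)])
    # Step 3: chromosome-level CNV graphs.
    commands.append(["bash", graph_script, "--output_dir", str(graph_dir)])
    # Step 4: PDF reports. freec_dir is assumed to hold the step 2 output.
    commands.append(["python", "freec_results_pdf_v4.py",
                     "--freec_dir", str(freec_dir),
                     "--output_dir", str(report_dir)])
    return commands
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` so that a failing step stops the run before later steps consume incomplete output.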

Output

| File/Directory    | Description                         |
|-------------------|-------------------------------------|
| *.bam_CNVs        | Raw CNV calls per sample from FREEC |
| *.bam_ratio.txt   | Normalized copy number ratios       |
| chr_plots/        | Per-chromosome CNV visualization    |
| reports/*.pdf     | Clinical PGT-A result PDFs          |
| autosome_qc.log   | Autosome CNV verification QC log    |
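For downstream scripting, the `*.bam_CNVs` files can be read with a few lines of Python. The sketch below assumes the usual Control-FREEC layout of tab-separated lines without a header, whose first five columns are chromosome, start, end, predicted copy number, and alteration type (gain or loss); any further columns are ignored. `CnvCall` and `parse_cnv_lines` are illustrative names, not part of this repository.

```python
from typing import Iterable, List, NamedTuple


class CnvCall(NamedTuple):
    chrom: str
    start: int
    end: int
    copy_number: int
    status: str  # e.g. "gain" or "loss"


def parse_cnv_lines(lines: Iterable[str]) -> List[CnvCall]:
    """Parse lines of a *_CNVs file into CnvCall records."""
    calls = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Skip blank, truncated, or header-like lines rather than failing.
        if len(fields) < 5 or not fields[1].isdigit():
            continue
        calls.append(CnvCall(fields[0], int(fields[1]), int(fields[2]),
                             int(fields[3]), fields[4]))
    return calls
```

A typical use is `with open(path) as fh: calls = parse_cnv_lines(fh)`, followed by filtering on `status` or summing `end - start` per chromosome.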

Panel-of-Normals (PoN)

The included 562_panel_of_normals.bed file was constructed from n=562 euploid control embryos sequenced with a targeted low-pass NGS approach. The PoN is used to normalize sample CNV profiles against background noise from whole genome amplification (WGA) artifacts.
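The general idea of normalizing against a panel of normals can be illustrated with a minimal Python sketch. It is a generic example and not a description of how Control-FREEC or this pipeline consumes `562_panel_of_normals.bed`: the per-bin read counts, the library-size scaling, and the function names are all assumptions. Each bin of the sample is compared with the median of the same bin across the PoN samples.

```python
import math
from statistics import median


def to_fractions(counts):
    """Scale per-bin read counts so they sum to 1 (library-size normalization)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no reads")
    return [c / total for c in counts]


def log2_ratio_vs_pon(sample_counts, pon_counts):
    """Return per-bin log2(sample / PoN median); NaN where a value is zero.

    sample_counts: read counts per bin for one sample (hypothetical input).
    pon_counts: list of per-bin count lists, one per euploid PoN sample.
    """
    sample = to_fractions(sample_counts)
    pon = [to_fractions(counts) for counts in pon_counts]
    ratios = []
    for i, value in enumerate(sample):
        reference = median(p[i] for p in pon)
        if value == 0 or reference == 0:
            ratios.append(float("nan"))
        else:
            ratios.append(math.log2(value / reference))
    return ratios
```

Bins where the sample matches the panel give values near 0, while a bin with twice the expected share of reads gives a value of 1.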

Note: For clinical use, rebuild the PoN from your own lab's euploid controls so that it matches your library prep and sequencing platform.


References

  1. Boeva V, et al. Control-FREEC: a tool for assessing copy number and allelic content using next-generation sequencing data. Bioinformatics. 2012;28(3):423-5. PMID: 22155870
  2. Boeva V, et al. Control-free calling of copy number alterations in deep-sequencing data using GC-content normalization. Bioinformatics. 2011;27(2):268-9. PMID: 21081509

Author

Rugved Rane
Senior Bioinformatician | Reproductive Genetics & Clinical Genomics


License

MIT License — see LICENSE for details.
