Control-FREEC-based CNV calling pipeline for preimplantation genetic testing for aneuploidy (PGT-A), using a panel-of-normals (PoN) approach to detect embryo aneuploidy.
EmbryoLens implements an end-to-end bioinformatics pipeline for aneuploidy and copy number variation (CNV) detection from NGS data of trophectoderm (TE) biopsies. The pipeline uses Control-FREEC for copy number profiling with a custom panel-of-normals (PoN) reference built from euploid embryo samples to improve specificity and reduce noise from low-input, amplified DNA.
Key features:
- Panel-of-normals reference construction and management
- Batch-mode Control-FREEC execution for high-throughput PGT-A
- Autosome copy number uniformity verification from BAM files
- Automated PDF report generation per sample
- Chromosome-level CNV visualization plots
EmbryoLens/
├── PoN_new/ # Panel-of-normals sample directory
├── freec_PoN_m/ # Control-FREEC PoN reference files
├── scripts/ # Helper and utility scripts
├── src/ # Source scripts
├── data/ # Sample data (excluded from VCS)
├── tmp_pofn/ # Temporary PoN working files
├── tmp_pofn_bed/ # Temporary BED format PoN files
├── 562_panel_of_normals.bed # Pre-built PoN BED reference (n=562)
├── autosome_cnv_check.sh # Autosome copy number verification QC
├── chr_freec_graph_batch_*.sh # Batch chromosome CNV graph generation
├── freec_bed_pon_batch.sh # Batch FREEC run with BED-based PoN
├── freec_pon_batch.sh # Batch FREEC run with PoN reference
├── freec_results_pdf.py # PDF report generator (v1)
├── freec_results_pdf_v2.py # PDF report generator (v2, improved)
├── freec_results_pdf_v3.py # PDF report generator (v3, with QC metrics)
├── freec_results_pdf_v4.py # PDF report generator (v4, final production)
└── make_chr_freec_graph.sh # Per-sample chromosome-level CNV graph
BAM Files (TE biopsy, WGA amplified)
│
▼
1. Autosome CNV Verification (autosome_cnv_check.sh)
└─ Per-chromosome (chr1-chr22) read depth profiling
└─ Coefficient of variation (CV) QC flag: PASS / WARN
│
▼
2. Control-FREEC CNV Calling (freec_pon_batch.sh)
└─ Runs FREEC against PoN reference (n=562)
└─ Computes copy number and BAF profiles
│
▼
3. Chromosome Graph (chr_freec_graph_batch_*.sh)
└─ Per-chromosome CNV visualization
│
▼
4. Report Generation (freec_results_pdf_v4.py)
└─ Clinical PDF report with CNV calls per sample
│
▼
PGT-A Result Report (.pdf)
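The four stages above can be chained from a small driver. The following is only a sketch: it assembles the command lines documented in this README without running them. The function name `build_commands` and its directory parameters are hypothetical, and how the real scripts hand output directories to one another is an assumption.

```python
from pathlib import Path


def build_commands(bam_dir, pon_ref, freec_dir, plot_dir, report_dir,
                   genome="hg38", threads=8):
    """Return the pipeline stages as argv lists, in execution order.

    Nothing is executed here; each list can be passed to subprocess.run().
    """
    bams = sorted(Path(bam_dir).glob("*.bam"))
    # Stage 1: autosome QC, one call per BAM file.
    commands = [["bash", "autosome_cnv_check.sh", str(bam)] for bam in bams]
    # Stage 2: batch Control-FREEC run against the PoN reference.
    commands.append([
        "bash", "freec_pon_batch.sh",
        "--bam_dir", str(bam_dir),
        "--pon_ref", str(pon_ref),
        "--genome", genome,
        "--threads", str(threads),
    ])
    # Stage 3: per-chromosome CNV plots (dated script name as used in this README).
    commands.append(["bash", "chr_freec_graph_batch_20250805.sh",
                     "--output_dir", str(plot_dir)])
    # Stage 4: PDF reports. freec_dir is assumed to hold the stage 2 output.
    commands.append(["python", "freec_results_pdf_v4.py",
                     "--freec_dir", str(freec_dir),
                     "--output_dir", str(report_dir)])
    return commands
```

Each returned list could then be run with `subprocess.run(cmd, check=True)`, so that a failing stage stops the chain before later stages see incomplete input.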
| Tool | Version | Purpose |
|---|---|---|
| Control-FREEC | ≥ 11.6 | CNV calling engine |
| SAMtools | ≥ 1.15 | BAM processing and autosome CNV QC |
| Python | ≥ 3.8 | PDF report generation |
| R | ≥ 4.0 | CNV graph generation |
| Bash | ≥ 4.0 | Pipeline orchestration |
Python dependencies:

    pip install reportlab matplotlib pandas numpy

Clone the repository and make the scripts executable:

    git clone https://github.com/rvrane/EmbryoLens.git
    cd EmbryoLens
    chmod +x *.sh

Ensure Control-FREEC is installed and available in $PATH:

    # Install Control-FREEC
    git clone https://github.com/BoevaLab/FREEC.git
    cd FREEC/src && make
    export PATH=$PATH:/path/to/FREEC/src

Run the autosome QC check on each BAM file:

    bash autosome_cnv_check.sh /path/to/sample.bam

This step computes per-chromosome read depth across chr1–chr22 and calculates the coefficient of variation (CV) as a QC metric. Samples with CV > 25% are flagged as WARN and should be reviewed before proceeding.
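As an illustration of the CV check, the sketch below derives the metric from the text output of `samtools idxstats` (tab-separated: reference name, length, mapped reads, unmapped reads). It is not the code in `autosome_cnv_check.sh`. Normalizing read counts by chromosome length and using the population standard deviation are assumptions, and `autosome_cv` is a hypothetical name.

```python
import statistics

AUTOSOMES = [f"chr{i}" for i in range(1, 23)]


def autosome_cv(idxstats_text, warn_threshold=25.0):
    """Return (cv_percent, flag) for chr1-chr22 from `samtools idxstats` text."""
    density = {}
    for line in idxstats_text.strip().splitlines():
        name, length, mapped, _unmapped = line.split("\t")
        if name in AUTOSOMES and int(length) > 0:
            # Reads per megabase, so long and short chromosomes are comparable.
            density[name] = int(mapped) / (int(length) / 1e6)
    missing = [c for c in AUTOSOMES if c not in density]
    if missing:
        raise ValueError(f"autosomes missing from idxstats output: {missing}")
    values = [density[c] for c in AUTOSOMES]
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("no mapped reads on autosomes")
    # Assumption: population standard deviation; the real script may differ.
    cv = 100.0 * statistics.pstdev(values) / mean
    return cv, ("WARN" if cv > warn_threshold else "PASS")
```

The text for `idxstats_text` could come from `samtools idxstats sample.bam` on an indexed BAM file. The 25% default mirrors the threshold stated above.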
Run Control-FREEC against the PoN reference in batch mode:

    bash freec_pon_batch.sh \
      --bam_dir /path/to/bams/ \
      --pon_ref 562_panel_of_normals.bed \
      --genome hg38 \
      --threads 8

Generate the per-chromosome CNV plots:

    bash chr_freec_graph_batch_20250805.sh --output_dir /path/to/output/

Generate the PDF reports:

    python freec_results_pdf_v4.py \
      --freec_dir /path/to/freec_output/ \
      --output_dir /path/to/reports/

| File/Directory | Description |
|---|---|
| *.bam_CNVs | Raw CNV calls per sample from FREEC |
| *.bam_ratio.txt | Normalized copy number ratios |
| chr_plots/ | Per-chromosome CNV visualization |
| reports/*.pdf | Clinical PGT-A result PDFs |
| autosome_qc.log | Autosome CNV verification QC log |
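To show how the raw calls might be consumed downstream, here is a minimal sketch that totals gained and lost bases per chromosome from a `*.bam_CNVs` file. It assumes the headerless, tab-separated layout described in the Control-FREEC documentation (chromosome, start, end, copy number, status, then optional genotype columns); check this against your FREEC version. `summarize_cnvs` is a hypothetical helper, not part of this repository.

```python
import csv
from collections import defaultdict


def summarize_cnvs(path):
    """Sum gained and lost bases per chromosome from a FREEC *_CNVs file."""
    totals = defaultdict(lambda: {"gain": 0, "loss": 0})
    with open(path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if len(row) < 5:
                continue  # skip blank or truncated lines
            chrom, start, end, _copy_number, status = row[:5]
            # Assumption: status is one of "gain", "loss" or "neutral".
            if status in ("gain", "loss"):
                totals[chrom][status] += int(end) - int(start)
    return dict(totals)
```

Dividing each total by the chromosome length gives a rough fraction of the chromosome affected, which is one way to separate segmental from whole-chromosome events. Any cutoff used for that distinction would need validation in your own laboratory.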
The included 562_panel_of_normals.bed file was constructed from n=562 euploid control embryos sequenced with a targeted low-pass NGS approach. The PoN is used to normalize sample CNV profiles against background noise from whole genome amplification (WGA) artifacts.
Note: For clinical use, rebuild the PoN from your own laboratory's euploid controls so that it matches your library preparation and sequencing platform.
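The idea behind PoN normalization can be sketched as follows: scale each sample's bin counts to fractions of its total reads, take the per-bin median across the euploid normals, and express a test sample as a log2 ratio against that median. This is a generic illustration under those assumptions. It is not necessarily how `562_panel_of_normals.bed` was built, and the function names are hypothetical.

```python
import math
import statistics


def to_fractions(counts):
    """Scale raw bin counts so they sum to 1 (removes sequencing-depth differences)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no reads")
    return [c / total for c in counts]


def build_pon(normal_samples):
    """Per-bin median of depth-normalized counts across euploid normals."""
    if len({len(s) for s in normal_samples}) != 1:
        raise ValueError("all normals must use the same bins")
    fractions = [to_fractions(s) for s in normal_samples]
    return [statistics.median(bin_values) for bin_values in zip(*fractions)]


def log2_ratios(sample_counts, pon, pseudo=1e-9):
    """log2(sample / PoN) per bin; values near 0 indicate no copy number change."""
    if len(sample_counts) != len(pon):
        raise ValueError("sample and PoN must use the same bins")
    sample = to_fractions(sample_counts)
    # The small pseudocount avoids log2(0) and division by zero in empty bins.
    return [math.log2((s + pseudo) / (p + pseudo)) for s, p in zip(sample, pon)]
```

Because recurrent WGA bias appears in the normals as well, dividing by the PoN median cancels much of it, leaving deviations that are more likely to reflect true copy number changes.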
- Boeva V, et al. Control-FREEC: a tool for assessing copy number and allelic content using next-generation sequencing data. Bioinformatics. 2012;28(3):423-5. PMID: 22155870
- Boeva V, et al. Control-free calling of copy number alterations in deep-sequencing data using GC-content normalization. Bioinformatics. 2011;27(2):268-9. PMID: 21081509
Rugved Rane
Senior Bioinformatician | Reproductive Genetics & Clinical Genomics
GitHub
MIT License — see LICENSE for details.