CycSim - a context-based long-read simulator

Long-read sequencing data contain context-dependent errors: a base is more or less likely to be misread depending on the sequence around it. Most existing simulators introduce errors uniformly at random, which reproduces the overall error rate but ignores these context biases. CycSim instead models errors in a k-mer-dependent manner, which yields a more realistic error profile.
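To make the idea of k-mer-dependent errors concrete, here is a minimal sketch in Rust. It is not CycSim's implementation: the `ContextModel` struct, the `simulate_read` function, the small xorshift generator, and the restriction to substitution errors are all assumptions made for illustration. Each base is misread with a probability looked up from the k-mer centred on it, rather than with a single global rate.

```rust
use std::collections::HashMap;

/// Tiny deterministic xorshift64 generator so the sketch needs no external crates.
/// The seed must be non-zero.
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Uniform float in [0, 1).
    fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Hypothetical model: maps the k-mer centred on a base to the probability
/// that the base is misread. `k` is assumed to be odd.
struct ContextModel {
    k: usize,
    error_prob: HashMap<Vec<u8>, f64>,
    default_prob: f64,
}

impl ContextModel {
    /// Error probability for position `i`; falls back to `default_prob`
    /// near the sequence ends and for k-mers missing from the table.
    fn prob_at(&self, seq: &[u8], i: usize) -> f64 {
        let half = self.k / 2;
        if i < half || i + half >= seq.len() {
            return self.default_prob;
        }
        let kmer = &seq[i - half..=i + half];
        *self.error_prob.get(kmer).unwrap_or(&self.default_prob)
    }
}

/// Copies `seq`, substituting a base whenever a random draw falls below
/// the context-specific error probability (substitutions only).
fn simulate_read(seq: &[u8], model: &ContextModel, rng: &mut XorShift64) -> Vec<u8> {
    const BASES: [u8; 4] = [b'A', b'C', b'G', b'T'];
    let mut read = Vec::with_capacity(seq.len());
    for (i, &base) in seq.iter().enumerate() {
        if rng.next_f64() < model.prob_at(seq, i) {
            // Pick uniformly among the bases that differ from the true one.
            let alternatives: Vec<u8> = BASES.iter().copied().filter(|&b| b != base).collect();
            let pick = (rng.next_u64() % alternatives.len() as u64) as usize;
            read.push(alternatives[pick]);
        } else {
            read.push(base);
        }
    }
    read
}

fn main() {
    // Illustrative numbers only: the centre base of "ACG" is made far more
    // error-prone than any other context.
    let mut model = ContextModel { k: 3, error_prob: HashMap::new(), default_prob: 0.01 };
    model.error_prob.insert(b"ACG".to_vec(), 0.5);

    let mut rng = XorShift64(42);
    let read = simulate_read(b"TTACGTTACGTTACGTT", &model, &mut rng);
    println!("{}", String::from_utf8_lossy(&read));
}
```

A real long-read error model would also need to handle insertions and deletions, which this sketch leaves out.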

CycSim is easy to train and supports all types of long-read sequencing data. It currently provides pre-trained models for BGI CycloneSEQ, PacBio HiFi, and Oxford Nanopore Q20 data. Users can also train custom models from a BAM file of reads aligned to a reference genome.
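As a rough illustration of what training from aligned reads could involve, the sketch below tallies how often the centre base of each reference k-mer is misread. It is an assumption-laden toy, not CycSim's training procedure: `tally_pair`, `error_rates`, and the `Counts` type are hypothetical names, and the input is a pair of equal-length, gapless sequences rather than a BAM file. Real alignments contain insertions and deletions and would require parsing each record's CIGAR string.

```rust
use std::collections::HashMap;

/// Per-k-mer tallies: (times the centre base was observed, times it was misread).
type Counts = HashMap<Vec<u8>, (u64, u64)>;

/// Adds one gapless reference/read pair to the tallies.
/// Assumes `k` is odd so that every k-mer has a well-defined centre base.
fn tally_pair(reference: &[u8], read: &[u8], k: usize, counts: &mut Counts) {
    assert!(k % 2 == 1, "sketch assumes an odd k");
    assert_eq!(reference.len(), read.len(), "sketch assumes a gapless alignment");
    if reference.len() < k {
        return;
    }
    let half = k / 2;
    for i in half..reference.len() - half {
        let kmer = reference[i - half..=i + half].to_vec();
        let entry = counts.entry(kmer).or_insert((0, 0));
        entry.0 += 1;
        if read[i] != reference[i] {
            entry.1 += 1;
        }
    }
}

/// Converts raw tallies into a per-k-mer mismatch rate.
fn error_rates(counts: &Counts) -> HashMap<Vec<u8>, f64> {
    counts
        .iter()
        .map(|(kmer, &(seen, errors))| (kmer.clone(), errors as f64 / seen as f64))
        .collect()
}

fn main() {
    let mut counts = Counts::new();
    // Toy pair: the read differs from the reference at position 1 (C -> T).
    tally_pair(b"ACGACG", b"ATGACG", 3, &mut counts);

    for (kmer, rate) in error_rates(&counts) {
        println!("{}\t{:.2}", String::from_utf8_lossy(&kmer), rate);
    }
}
```

The resulting table of rates has the same shape as a context-to-error-probability lookup, which is why counts like these are a natural starting point for a context-based model.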

Table of Contents

Installation

Installing from source

Dependencies

Download and install

Test

Download pre-trained models

General usage

Simulation

Training

Getting help

Help

Contact

Limitations

Benchmarking

Star

Citation
