Methods documentation for how the gold standards are created? #18

Description

@RhysCAllen

Greetings CAMI team,

Thank you for creating and sharing this invaluable resource and initiative!

I am excited to use the rhizosphere dataset from the second CAMI challenge to benchmark a metagenomics pipeline.

I understand we can compare our pipeline results (such as MAGs) to the input reference genomes used to create the simulated datasets, such as https://frl.publisso.de/data/frl:6425521/plant_associated/rhimgCAMI2_genomes.tar.gz.

We can also compare our results to the gold standard, as described on the CAMI challenge website and in CAMI publications such as https://www.nature.com/articles/nmeth.4458.

I wanted to understand better how the gold standard was created, but I am unable to find that aspect of the Methods.

I found this information: “The gold standard includes all genomic regions covered by at least one read in the metagenome data set.”

Is there also a description of the parameters and software used to create the gold standard files (contigs.tar.gz, for example)? I imagine that the "genomic regions covered by at least one read" could vary quite a bit depending on the assembly software and parameters used to create the gold standard contigs.
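
To make the question concrete, here is a minimal sketch of how I currently read that sentence. It is only my interpretation, not the documented CAMI procedure: I assume the true origin of every simulated read is known as a (genome, start, end) placement, and that the gold standard contigs are simply the merged stretches of each reference genome touched by at least one read. The function name `covered_regions` and the example placements are hypothetical.

```python
from collections import defaultdict


def covered_regions(read_placements):
    """Merge read placements into maximal covered regions per genome.

    read_placements: iterable of (genome_id, start, end) tuples using
    0-based, half-open coordinates (an assumption of this sketch).
    Returns {genome_id: [(start, end), ...]} with merged intervals.
    """
    by_genome = defaultdict(list)
    for genome_id, start, end in read_placements:
        by_genome[genome_id].append((start, end))

    regions = {}
    for genome_id, intervals in by_genome.items():
        intervals.sort()
        merged = []
        for start, end in intervals:
            # Overlapping or directly adjacent reads extend the current region.
            if merged and start <= merged[-1][1]:
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        regions[genome_id] = [(s, e) for s, e in merged]
    return regions


# Hypothetical placements: two overlapping reads, one isolated read,
# and one read on a second genome.
example = [
    ("genomeA", 0, 150),
    ("genomeA", 100, 250),
    ("genomeA", 400, 550),
    ("genomeB", 10, 160),
]
print(covered_regions(example))
```

Under this reading, the result would depend only on the simulated read positions and not on any assembler, which is why I would like to confirm which approach was actually used.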

Thanks very much for any info you could provide!
