Insertion to Duplication post-process bug? #63

Description

@vkaz39

Hello there,

I'm writing to report two issues I've encountered while using Jasmine. I called SVs on the same sample (say, sample1) independently with six tools, then pre-processed the calls to convert duplications to insertions and corrected errors in the insertions with Iris.

After per-sample merging with Jasmine, running bcftools norm (with --check-ref e) on the merged VCF failed with the following error:

Reference allele mismatch at CM018180.1:14965491:
REF_SEQ: 'T'
VCF REF: 'N'

After post-processing with Jasmine (converting insertions back to duplications), the same bcftools norm command reported a different error:

Non-ACGTN reference allele at CM018180.1:3761701:
REF_SEQ: 'C'
VCF REF: '.'
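
To make the two checks concrete, here is a minimal sketch of how I understand them. It is not bcftools' actual code; the class and method names are hypothetical, and the rule that REF may only contain A, C, G, T or N (case-insensitive) is my reading of the VCF specification.

```java
// Hypothetical helper for illustration only; not part of Jasmine or bcftools.
final class RefAlleleCheck {
    // Assumption: a REF allele is valid only if it is a non-empty
    // string of A, C, G, T or N (upper or lower case).
    static boolean isValidRef(String ref) {
        return ref != null && ref.matches("[ACGTNacgtn]+");
    }

    // True when the VCF REF is a valid allele and equals the
    // reference sequence at that position, ignoring case.
    static boolean matchesReference(String vcfRef, String refSeq) {
        return isValidRef(vcfRef) && vcfRef.equalsIgnoreCase(refSeq);
    }

    public static void main(String[] args) {
        // Second error: REF '.' is not an ACGTN allele.
        System.out.println(isValidRef("."));            // prints false
        // First error: REF 'N' is a valid allele but differs from 'T'.
        System.out.println(matchesReference("N", "T")); // prints false
    }
}
```

Under this reading, 'N' fails only the mismatch check, while '.' is rejected before any comparison with the reference.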

I also checked the original VCF files and the VCF output by Iris; neither shows any discrepancy between the reference sequence and the VCF REF.

Looking at the Jasmine source (InsertionsToDuplications.java), I see that REF is set to '.' whenever ALT is set to "<DUP>". Why is that?

if(line.contains("OLDTYPE=DUP") && ve.getType().equals("INS"))
{
    countDup++;

    long start = ve.getPos();
    int length = ve.getLength();
    long nstart = start - length + 1, nend = nstart + length;
    String refinedAlt = ve.getAlt();
    ve.setPos(nstart);
    ve.setInfo("END", nend+"");
    ve.setType("DUP");
    ve.setInfo("REFINEDALT", refinedAlt);
    ve.setInfo("STRANDS", "-+");
    ve.setRef(".");
    ve.setAlt("<DUP>");
    out.println(ve);
}
else
{
    ve.setInfo("REFINEDALT", ".");
    out.println(ve);
}
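
For comparison, this is roughly what I would have expected instead of a fixed '.': the reference base at the new POS. It is only a sketch under a strong assumption, namely that the contig sequence is available in memory as a String at this point, which I have not verified for Jasmine. The class name, `refBaseAt`, and `dupInterval` are hypothetical; `dupInterval` just repeats the coordinate arithmetic from the snippet above.

```java
// Hypothetical sketch for illustration; not Jasmine's code.
final class DupRefSketch {
    // Returns the upper-cased reference base at a 1-based position,
    // or "N" if the sequence is missing, the position is out of range,
    // or the base is not one of A, C, G, T.
    static String refBaseAt(String contigSeq, long pos) {
        if (contigSeq == null || pos < 1 || pos > contigSeq.length()) {
            return "N";
        }
        char c = Character.toUpperCase(contigSeq.charAt((int) (pos - 1)));
        return "ACGT".indexOf(c) >= 0 ? String.valueOf(c) : "N";
    }

    // Same arithmetic as the quoted snippet: returns {nstart, nend}.
    static long[] dupInterval(long start, int length) {
        long nstart = start - length + 1;
        long nend = nstart + length;
        return new long[] { nstart, nend };
    }

    public static void main(String[] args) {
        long[] iv = dupInterval(100, 10);
        System.out.println(iv[0] + " " + iv[1]);        // prints 91 101
        System.out.println(refBaseAt("ACGTACGT", 3));   // prints G
    }
}
```

Note that the "N" fallback would only avoid the non-ACGTN error; judging by the first error above, bcftools norm with --check-ref e would still report a mismatch wherever the true base is not N.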

Could you please let me know whether these issues indicate a bug in Jasmine?

Thank you!
Niraj
