🤖 Written by Claude
Problem
Searching for or resolving an m. (mitochondrial) HGVS against a non-mitochondrial reference (e.g. a nuclear contig like chr1) is silently accepted and resolved as if it were a g. variant. The input m. round-trips back out as g. with no warning or error, so a nonsensical or mis-typed coordinate just "works".
Example: NC_000001.11:m.12345A>G resolves position 12345 on chr1 (exactly as g. would) and is echoed back with the kind g.
This is not cdot
cdot's clean_hgvs() does not rewrite m. to g. itself. It only adds g. when a kind is entirely missing, and dedupes double-kinds. The coercion happens in the pyhgvs library and VG's converter layer.
Mechanism
- pyhgvs treats m identically to g. In pyhgvs/models/hgvs_name.py, coordinate handling groups them: get_raw_coords (elif self.kind in ('g', 'm')), get_ref_coords, and format. So m. coordinates are resolved against whatever contig the reference accession names, with no check that m. is only valid on the mitochondrial reference.
- VG only re-stamps the kind as m for the MT contig. genes/hgvs/hgvs_converter.py:108-112:
    hgvs_variant = self._variant_coordinate_to_g_hgvs(vc)
    if hgvs_variant.contig_accession == self.genome_build.mitochondria_accession:
        hgvs_variant.kind = 'm'
    return hgvs_variant
For a nuclear contig the kind stays g, so the input m. comes back with the kind g.
Net effect: m. on a non-MT reference is silently parsed as genomic and reformatted as g., with no validation rejecting it.
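The round trip described above can be reproduced with a toy model. This is only an illustrative sketch, not pyhgvs or VG code: ToyHGVS, to_coordinate, from_coordinate and round_trip are hypothetical names, and NC_012920.1 (rCRS) is assumed as the mitochondrial accession.

```python
from dataclasses import dataclass

# Assumed mitochondrial accession for the build (rCRS); illustrative only.
MT_ACCESSION = "NC_012920.1"


@dataclass
class ToyHGVS:
    """Hypothetical stand-in for a parsed HGVS name (substitutions only)."""
    accession: str
    kind: str
    position: int
    ref: str
    alt: str

    def format(self) -> str:
        return f"{self.accession}:{self.kind}.{self.position}{self.ref}>{self.alt}"


def to_coordinate(name: ToyHGVS) -> tuple:
    # Mirrors the grouping described above: 'g' and 'm' share one branch,
    # so the kind is dropped and only contig + position survive.
    if name.kind in ("g", "m"):
        return (name.accession, name.position, name.ref, name.alt)
    raise ValueError(f"kind {name.kind!r} is not handled in this sketch")


def from_coordinate(coord: tuple) -> ToyHGVS:
    accession, position, ref, alt = coord
    name = ToyHGVS(accession, "g", position, ref, alt)
    # Only the mitochondrial contig gets re-stamped as 'm'.
    if accession == MT_ACCESSION:
        name.kind = "m"
    return name


def round_trip(name: ToyHGVS) -> str:
    return from_coordinate(to_coordinate(name)).format()


# m. on a nuclear contig comes back as g. without any error.
print(round_trip(ToyHGVS("NC_000001.11", "m", 12345, "A", "G")))
```

Because the kind is discarded when converting to a coordinate, nothing downstream can tell that the input was m. rather than g.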
Suggested fix
Add validation (in VG's converter layer, so it applies regardless of which matcher runs) that rejects m. when the reference accession is not the genome build's mitochondrial accession.
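A minimal sketch of what such a check could look like, assuming the HGVS string is available before it reaches the matcher. The names validate_mito_kind and MitochondrialKindError and the prefix regex are hypothetical, not existing VG or pyhgvs API; in VG the expected accession would presumably come from genome_build.mitochondria_accession.

```python
import re

# Hypothetical pattern for "<accession>:<kind>.", e.g. "NC_012920.1:m.8993T>G".
_HGVS_PREFIX = re.compile(r"^(?P<accession>[^:\s]+):(?P<kind>[a-z])\.")


class MitochondrialKindError(ValueError):
    """Raised when m. is used with a non-mitochondrial reference (illustrative name)."""


def validate_mito_kind(hgvs_string: str, mitochondria_accession: str) -> None:
    """Reject m. HGVS whose reference is not the build's mitochondrial accession."""
    match = _HGVS_PREFIX.match(hgvs_string.strip())
    if match is None:
        # Not in "accession:kind." form; leave it to the normal parser.
        return
    if match["kind"] == "m" and match["accession"] != mitochondria_accession:
        raise MitochondrialKindError(
            f"'m.' is only valid on {mitochondria_accession}, "
            f"got reference {match['accession']}"
        )


# Assumed mitochondrial accession (rCRS), used here only for illustration.
MT = "NC_012920.1"
validate_mito_kind("NC_012920.1:m.8993T>G", MT)    # accepted
validate_mito_kind("NC_000001.11:g.12345A>G", MT)  # accepted
try:
    validate_mito_kind("NC_000001.11:m.12345A>G", MT)
except MitochondrialKindError as exc:
    print(f"rejected: {exc}")
```

Checking the raw string before parsing is one option. The same comparison could instead run on the parsed object's kind and reference accession, which would also cover inputs that clean_hgvs() has already normalised.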