HGVS m. (mitochondrial) on a non-MT reference is silently resolved as g. #1632

Description

@davmlaw

🤖 Written by Claude

Problem

Searching for or resolving an m. (mitochondrial) HGVS against a non-mitochondrial reference (e.g. a nuclear contig like chr1) is silently accepted and resolved as if it were a g. variant. The input m. round-trips back out as g. with no warning or error, so a nonsensical or mistyped coordinate just "works".

Example: NC_000001.11:m.12345A>G resolves position 12345 on chr1 (exactly as g. would) and is echoed back as g..

This is not cdot

cdot's clean_hgvs() does not rewrite m. to g. It only adds g. when a kind is entirely missing, and dedupes double-kinds. The coercion happens in the pyhgvs library + VG's converter layer.

Mechanism

  1. pyhgvs treats m identically to g. In pyhgvs/models/hgvs_name.py, coordinate handling groups them: get_raw_coords (elif self.kind in ('g', 'm')), get_ref_coords, and format. So m. coordinates are resolved against whatever contig the reference accession names — no check that m. is only valid on the mitochondrial reference.

  2. VG only re-stamps the kind as m for the MT contig. genes/hgvs/hgvs_converter.py:108-112:

    hgvs_variant = self._variant_coordinate_to_g_hgvs(vc)
    if hgvs_variant.contig_accession == self.genome_build.mitochondria_accession:
        hgvs_variant.kind = 'm'
    return hgvs_variant

    For a nuclear contig the kind stays g, so the input m. comes back as g..

Net effect: m. on a non-MT reference is silently parsed as genomic and reformatted as g., with no validation rejecting it.
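
To make the two steps above concrete, here is a minimal, self-contained sketch of the round trip. It does not call pyhgvs or VG: ToyHGVS, toy_round_trip and MT_ACCESSION are hypothetical names invented for illustration, and the logic only mimics the behaviour described in this issue.

```python
from dataclasses import dataclass

# Assumption: the human mitochondrial accession (rCRS); in VG this value
# would come from genome_build.mitochondria_accession.
MT_ACCESSION = "NC_012920.1"


@dataclass
class ToyHGVS:
    """Hypothetical stand-in for a parsed HGVS name (not the pyhgvs class)."""
    accession: str
    kind: str
    position: int
    ref: str
    alt: str

    def format(self) -> str:
        return f"{self.accession}:{self.kind}.{self.position}{self.ref}>{self.alt}"


def toy_round_trip(hgvs: ToyHGVS, mt_accession: str = MT_ACCESSION) -> ToyHGVS:
    # Step 1 (pyhgvs-like): 'g' and 'm' share one branch, so the position is
    # resolved on whatever contig the accession names, with no MT check.
    if hgvs.kind in ("g", "m"):
        contig, position = hgvs.accession, hgvs.position
    else:
        raise ValueError(f"kind {hgvs.kind!r} is not handled by this sketch")

    # Step 2 (VG-like): the result is rebuilt as g. and only re-stamped as m.
    # when the contig is the mitochondrial accession.
    result = ToyHGVS(contig, "g", position, hgvs.ref, hgvs.alt)
    if result.accession == mt_accession:
        result.kind = "m"
    return result


if __name__ == "__main__":
    nuclear = ToyHGVS("NC_000001.11", "m", 12345, "A", "G")
    print(toy_round_trip(nuclear).format())
```

In this toy model, NC_000001.11:m.12345A>G comes back as NC_000001.11:g.12345A>G, while an m. on the mitochondrial accession keeps its m. kind.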

Suggested fix

Add validation (in VG's converter layer, so it applies regardless of which matcher runs) that rejects m. when the reference accession is not the genome build's mitochondrial accession.
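
One possible shape for that check, as a sketch only: it works on the raw HGVS string so it can run before any matcher. validate_mito_kind and InvalidHGVSKindError are hypothetical names, not existing VG, cdot, or pyhgvs APIs, and the mitochondrial accession is passed in rather than read from the genome build.

```python
import re


class InvalidHGVSKindError(ValueError):
    """Hypothetical error type for an m. kind on a non-mitochondrial reference."""


# Matches "<accession>:<kind>." with an optional "(GENE)" suffix on the accession.
_HGVS_PREFIX = re.compile(
    r"^(?P<accession>[^:(\s]+)(?:\([^)]*\))?:(?P<kind>[a-z])\."
)


def validate_mito_kind(hgvs_string: str, mitochondria_accession: str) -> None:
    """Raise if the string uses m. on anything other than the MT accession."""
    match = _HGVS_PREFIX.match(hgvs_string.strip())
    if match is None:
        # Not in "accession:kind." form; leave it to the real parser to reject.
        return
    if match["kind"] == "m" and match["accession"] != mitochondria_accession:
        raise InvalidHGVSKindError(
            f"'{hgvs_string}': m. is only valid on the mitochondrial reference "
            f"({mitochondria_accession}), not {match['accession']}"
        )


if __name__ == "__main__":
    mt = "NC_012920.1"  # assumption: stands in for genome_build.mitochondria_accession
    validate_mito_kind("NC_012920.1:m.3243A>G", mt)  # accepted
    try:
        validate_mito_kind("NC_000001.11:m.12345A>G", mt)
    except InvalidHGVSKindError as exc:
        print(exc)
```

Raising an error rather than silently coercing to g. is a design choice in this sketch; a warning attached to the search result would be an alternative if rejecting the input outright is too strict.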
