Genomic to protein projection #437

Merged
theferrit32 merged 34 commits into main from kf/g-c-p-projection
Jun 26, 2026

@theferrit32 theferrit32 commented Apr 24, 2026

Contributor

Touch #146

This PR adds:

  • forward transcription of a genomic variant to a transcript variant (preferring MANE using cool-seq-tool's logic)
  • forward translation of a transcript variant to a protein variant
  • automatic registration of a transcript and protein form of a genomic variant in PUT /variation
  • automatic registration of a protein form of a transcript variant in PUT /variation

@theferrit32 theferrit32 self-assigned this Apr 24, 2026
Guard protein projection by checking the computed reference codon against the curated protein residue before translating alternate codons.

Add selenocysteine, stop-gain, and unsupported multi-residue edge coverage for the genomic-to-cDNA-to-protein path.
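A minimal sketch of the reference-codon guard described in the commit above, assuming the standard genetic code and a single-codon substitution. The names `CODON_TABLE` and `translate_codon_change` are hypothetical and are not taken from the AnyVar code; returning `None` on a mismatch is likewise an assumption about how a failed check might be reported.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): residue
    for codon, residue in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_codon_change(ref_codon, alt_codon, curated_ref_residue):
    """Translate alt_codon only if ref_codon agrees with the curated protein.

    Hypothetical helper: returns the alternate residue (one-letter code,
    "*" for stop), or None when the computed reference residue does not
    match the curated residue at that position.
    """
    computed_ref = CODON_TABLE[ref_codon.upper()]
    if computed_ref != curated_ref_residue:
        # Mismatch, e.g. a selenocysteine site: TGA reads as stop in the
        # standard table while the curated protein records "U".
        return None
    return CODON_TABLE[alt_codon.upper()]
```

Under these assumptions, a stop-gain such as `TGG` to `TGA` at a curated `W` yields `"*"`, while a `TGA` codon at a curated selenocysteine (`U`) fails the guard and is not translated.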
Add an explicit VariantProjector.close() hook and call it from REST lifespan and projection test teardown.

Cover shutdown with a unit test that verifies the loop thread starts, stops, closes, and tolerates repeated close calls.
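
The shutdown behavior described above (a loop thread that starts, stops, closes, and tolerates repeated close calls) could look roughly like the sketch below. `LoopRunner` and its methods are hypothetical stand-ins, not the actual `VariantProjector` implementation; the sketch only assumes that a background thread owns an asyncio event loop.

```python
import asyncio
import threading


class LoopRunner:
    """Hypothetical owner of a background asyncio loop with idempotent close()."""

    def __init__(self):
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self._loop.run_forever, daemon=True)
        self._thread.start()
        self._closed = False

    def run(self, coro):
        """Run a coroutine on the background loop and block for its result."""
        if self._closed:
            raise RuntimeError("runner is closed")
        return asyncio.run_coroutine_threadsafe(coro, self._loop).result()

    def close(self):
        """Stop the loop, join the thread, close the loop. Safe to call twice."""
        if self._closed:
            return
        self._closed = True
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join()
        self._loop.close()


async def add(a, b):
    # Trivial coroutine used only to exercise the runner.
    return a + b
```

A caller would invoke `close()` once from application shutdown (for example a REST lifespan handler) and again from test teardown without error, since the second call returns immediately.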
Add an explicit Allele-only projection guard, keep internal AttributeError failures on the unexpected-error path, and cover transcript-to-protein registration through the REST API.
@theferrit32 theferrit32 marked this pull request as ready for review May 1, 2026 17:59
@theferrit32 theferrit32 requested a review from a team as a code owner May 1, 2026 17:59

@jsstevenson jsstevenson left a comment

Contributor

some initial thoughts

Comment thread src/anyvar/mapping/projection.py Outdated
Comment thread src/anyvar/mapping/projection.py Outdated
Comment thread src/anyvar/mapping/projection.py Outdated
Comment thread src/anyvar/mapping/projection.py Outdated
@theferrit32 theferrit32 requested a review from jsstevenson May 5, 2026 19:38

@jsstevenson jsstevenson left a comment

Contributor

I'm getting weird GitHub server errors trying to leave a review, but:

  • add the env var to the .env config
  • add a blurb to features.rst about the automatic addition of projections during registration
  • I think the UTA_DB_URL check already happens in CoolSeqTool initialization, so it doesn't need to be handled by AnyVar

@jsstevenson jsstevenson self-requested a review May 6, 2026 16:59
Comment thread src/anyvar/mapping/projection.py Outdated
Comment thread src/anyvar/mapping/projection.py
Comment thread src/anyvar/mapping/projection.py Outdated
Comment thread src/anyvar/mapping/projection.py
Comment thread src/anyvar/mapping/projection.py Outdated
# the longest compatible remaining transcript when MANE is unavailable
# or incompatible.
result = self._run_async_projection(
    self.cst.mane_transcript.grch38_to_mane_c_p(

Contributor

It looks like this function is GRCh38-specific. What happens if someone registers a GRCh37 variant?

Contributor Author

I'm thinking through this. I don't think we want two paths for lifting over GRCh37 variants in our application. Our REST API layer already does liftover for GRCh37 inputs, so I think we should use that lifted-over GRCh38 version as the input to the projector and make the VariantProjector class require genomic inputs to be GRCh38, at least for now.

Contributor Author

Revised my thoughts above. We'll support GRCh37 inputs to VariantProjector and let cool-seq-tool attempt to project forward to GRCh38 to try to reach a MANE transcript, even if AnyVar's logic around unambiguous round-trippability decided not to persist a GRCh38 form.

Comment thread src/anyvar/mapping/projection.py
jsstevenson previously approved these changes Jun 23, 2026
Resolve conflicts from #480, which moved bulk-registration logic from
variations_router.py into the shared translate/register.py module.

- anyvar.py: keep both sides (projection helpers + main's queueing/error additions)
- variations_router.py: drop locally-defined registration funcs, import them from
  translate.register; keep `import os`
- translate/register.py: port the projection hook and add_projection_mappings onto
  main's relocated register_variations (also covers the Celery async path)

@jennifer-bowser jennifer-bowser left a comment

Contributor

Looks really good, just one last minor change and then I think this is good to go!

Comment thread compose.yaml
@theferrit32 theferrit32 changed the title from "Projection" to "Genomic to protein projection" Jun 26, 2026
@theferrit32 theferrit32 merged commit e31d1ed into main Jun 26, 2026
23 checks passed
@theferrit32 theferrit32 deleted the kf/g-c-p-projection branch June 26, 2026 20:05
3 participants