feat: implement suspect data workflow flags for #22 #73
Merged
Add comprehensive acceptance tests for suspect data workflow flags feature.

Tests cover:
- Workflow reads quality.json if present
- CLI flags: --include-suspect-alleles (default), --exclude-suspect-alleles
- Additional flags: --exclude-suspect-loci, --exclude-suspect-profiles
- Workflow filters allele database based on flags before MinHash/alignment
- Results note which alleles/loci were excluded (if any)
- Works when quality.json absent (no filtering)
- Tests verify filtering behavior at all three levels
- Documentation: flag semantics and defaults

All tests currently FAIL as expected (RED phase). 13 failures, 19 passed (placeholder tests).

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Implement CLI flags and WDL workflow enhancements for filtering suspect data:
- CLI flags: --include-suspect-alleles (default), --exclude-suspect-alleles
- Additional flags: --exclude-suspect-loci, --exclude-suspect-profiles
- WDL filter_alleles task filters allele database based on quality.json
- Results include excluded_alleles and excluded_loci information
- Hierarchical filtering: alleles -> loci -> profiles

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
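For reviewers who want the flag semantics in one place, here is a minimal sketch of how the flags listed above could be declared with `argparse`. It is an illustration only: the PR does not say which CLI framework is used, the program name `typing-cli` and the `build_parser` helper are hypothetical, and treating the include/exclude pair as mutually exclusive is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser; only the flag names are taken from the PR."""
    parser = argparse.ArgumentParser(prog="typing-cli")  # made-up program name
    parser.add_argument("--quality-json", default=None,
                        help="path to quality.json; no filtering when omitted")

    # The include/exclude pair is modelled as two spellings of one boolean,
    # so both write to the same destination and cannot be combined (assumption).
    alleles = parser.add_mutually_exclusive_group()
    alleles.add_argument("--include-suspect-alleles",
                         dest="exclude_suspect_alleles", action="store_false",
                         help="keep suspect alleles (default)")
    alleles.add_argument("--exclude-suspect-alleles",
                         dest="exclude_suspect_alleles", action="store_true",
                         help="drop individual suspect alleles")

    parser.add_argument("--exclude-suspect-loci", action="store_true",
                        help="drop every allele of a suspect locus")
    parser.add_argument("--exclude-suspect-profiles", action="store_true",
                        help="drop every locus that belongs to a suspect profile")

    # Make the documented default explicit: suspect alleles are included.
    parser.set_defaults(exclude_suspect_alleles=False)
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

Sharing one destination keeps the downstream code to a single boolean per filtering level, which maps directly onto the `exclude_*` workflow parameters.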
8447297 to 68d7df6
Both --strategy and --quality-json/suspect flags now available
Changed mlst_workflow_path fixture to point to balanced_typing.wdl instead of deleted mlst torch workflow
- Add filter_alleles.wdl task that reads quality.json and filters FASTA
- Update all three workflows (fast, balanced, sensitive) to:
  - Accept quality_json and exclude_* parameters
  - Call filter_alleles before sketching/alignment
  - Add exclusion metadata to final results
- filter_alleles extracts suspect data from quality.json structure:
  - Suspect alleles: low-similarity pairs below threshold
  - Suspect loci: flagged loci
  - Suspect profiles: loci from suspect profiles
- Three levels of filtering:
  - exclude_suspect_alleles: exclude specific alleles only
  - exclude_suspect_loci: exclude all alleles from suspect loci
  - exclude_suspect_profiles: exclude all loci from suspect profiles
- Result JSON includes exclusion counts and lists in notes.exclusions
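The three filtering levels described above can be sketched in a few lines of Python. This is not the code inside filter_alleles.wdl: the `suspect` dictionary layout (`alleles`, `loci`, `profile_loci`), the `locus_of` helper, and the `<locus>_<number>` header convention are all assumptions made for illustration, since the commit does not show the actual quality.json schema.

```python
def locus_of(allele_id: str) -> str:
    # Assumes allele IDs look like "<locus>_<number>", e.g. "adk_3".
    return allele_id.rsplit("_", 1)[0]


def filter_alleles(records, suspect, exclude_alleles=False,
                   exclude_loci=False, exclude_profiles=False):
    """Filter a {allele_id: sequence} mapping.

    `suspect` is a hypothetical, already-extracted view of quality.json with
    optional keys "alleles", "loci" and "profile_loci". An empty dict (for
    example when quality.json is absent) disables filtering.
    Returns (kept_records, exclusions).
    """
    drop_alleles = set(suspect.get("alleles", [])) if exclude_alleles else set()
    drop_loci = set()
    if exclude_loci:
        drop_loci |= set(suspect.get("loci", []))
    if exclude_profiles:
        drop_loci |= set(suspect.get("profile_loci", []))

    kept, excluded = {}, []
    for allele_id, sequence in records.items():
        if allele_id in drop_alleles or locus_of(allele_id) in drop_loci:
            excluded.append(allele_id)
        else:
            kept[allele_id] = sequence

    exclusions = {
        "excluded_alleles": sorted(excluded),
        "excluded_loci": sorted(drop_loci),
        "excluded_allele_count": len(excluded),
        "excluded_locus_count": len(drop_loci),
    }
    return kept, exclusions


if __name__ == "__main__":
    db = {"adk_1": "ACGT", "adk_2": "ACGA", "fumC_1": "GGCC"}
    flagged = {"alleles": ["adk_2"], "loci": ["fumC"]}
    print(filter_alleles(db, flagged, exclude_alleles=True, exclude_loci=True))
```

The returned `exclusions` dictionary mirrors the kind of counts and lists the commit says are written to notes.exclusions; the exact field names there may differ.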
Updated input JSON keys from old mlst_typing namespace to balanced_typing with correct parameter names:
- contigs -> query_sequences
- allele_database -> allele_fasta
- profiles -> profiles_table
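Anyone with older input files can apply the same renames mechanically. The helper below is a hypothetical sketch, not part of this PR; it assumes input keys follow the usual WDL `<workflow>.<parameter>` form and leaves every other key untouched.

```python
# Parameter renames taken from the commit message above.
RENAMES = {
    "contigs": "query_sequences",
    "allele_database": "allele_fasta",
    "profiles": "profiles_table",
}


def migrate_inputs(inputs, old_ns="mlst_typing", new_ns="balanced_typing"):
    """Return a copy of a WDL inputs dict with the namespace and keys updated."""
    migrated = {}
    for key, value in inputs.items():
        namespace, dot, name = key.partition(".")
        if dot and namespace == old_ns:
            # Unknown parameter names are carried over unchanged.
            key = f"{new_ns}.{RENAMES.get(name, name)}"
        migrated[key] = value
    return migrated


if __name__ == "__main__":
    old = {"mlst_typing.contigs": "sample.fasta",
           "mlst_typing.allele_database": "alleles.fasta"}
    print(migrate_inputs(old))
```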
These tests execute actual WDL workflows via miniwdl, which requires full workflow implementations and Docker. They should be excluded from the default test run with -m 'not miniwdl'.
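For `-m 'not miniwdl'` to work without unknown-marker warnings, the marker has to be registered. A minimal sketch of a `conftest.py` hook that does this is shown here; whether this repository registers the marker in conftest.py or in its pytest configuration file is not stated in the PR, so treat the placement and the description string as assumptions.

```python
# conftest.py (sketch)

MINIWDL_MARKER = (
    "miniwdl: executes a real WDL workflow via miniwdl "
    "(needs full workflow implementations and Docker)"
)


def pytest_configure(config):
    # Register the custom marker so that `pytest -m 'not miniwdl'` can
    # deselect these tests and strict marker checking does not reject it.
    config.addinivalue_line("markers", MINIWDL_MARKER)
```

Tests that run workflows would then carry `@pytest.mark.miniwdl`, and the default run would be invoked as `pytest -m 'not miniwdl'`.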
Summary
Implements CLI flags and WDL workflow enhancements for filtering suspect data based on quality.json:
Acceptance Criteria Met
Implementation Details
- filter_alleles WDL task that processes quality.json and filters FASTA database
- excluded_alleles and excluded_loci arrays

Test Results
🤖 Generated with Claude Code