Bug: parse_alignment_target_id fails with multiple KDMA names containing underscores #28

@PaulHax

Description

Problem

The current fix in the parse_alignment_target_id function fails when an alignment target ID contains multiple KDMA names and at least one of those names contains an underscore.

Current Logic Issue

The function uses the number of values to determine parsing strategy:

  • 1 value: treat entire KDMA part as single name (fixes personal_safety-0.0)
  • Multiple values: split KDMA part by underscores

Failing Case

Input: personal_safety_merit-0.0_1.0

  • Values: [0.0, 1.0] (2 values)
  • KDMA names after underscore split: ["personal", "safety", "merit"] (3 names)
  • 3 names ≠ 2 values → returns empty list ❌

Should parse as:

  • personal_safety with value 0.0
  • merit with value 1.0
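To reproduce the failure in isolation, here is a minimal sketch of the value-count heuristic as described above. It is not the code from experiment_models.py: the function name parse_by_value_count is hypothetical, and it assumes the ID has the form `<names>-<values>` with non-negative values joined by underscores.

```python
def parse_by_value_count(target_id: str) -> list:
    """Sketch of the heuristic described in this issue (hypothetical name)."""
    # Assumption: the last "-" separates the KDMA part from the value part.
    kdma_part, _, value_part = target_id.rpartition("-")
    values = [float(v) for v in value_part.split("_")]

    # 1 value: treat the entire KDMA part as a single name.
    if len(values) == 1:
        return [(kdma_part, values[0])]

    # Multiple values: split the KDMA part on every underscore.
    names = kdma_part.split("_")
    if len(names) != len(values):
        return []  # count mismatch, nothing is parsed
    return list(zip(names, values))


print(parse_by_value_count("personal_safety-0.0"))
print(parse_by_value_count("personal_safety_merit-0.0_1.0"))
```

The first call yields the single pair for personal_safety, while the second returns an empty list because three name fragments are compared against two values.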

Root Cause

Using value count as a heuristic is unreliable because:

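  1. A single KDMA name can contain underscores (personal_safety)
  2. An ID with multiple KDMAs can include such names too (personal_safety_merit)
  3. The underscore is both the separator between names and a character within names, so the parser cannot tell the two roles apart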
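The ambiguity can be made concrete by enumerating every way to cut the KDMA part into the required number of names. This is only an illustration; candidate_splits is a hypothetical helper, not part of the codebase.

```python
from itertools import combinations


def candidate_splits(kdma_part: str, n_names: int) -> list:
    """List every way to group underscore-separated tokens into n_names names."""
    tokens = kdma_part.split("_")
    results = []
    # Choose n_names - 1 cut positions among the gaps between tokens.
    for cuts in combinations(range(1, len(tokens)), n_names - 1):
        bounds = [0, *cuts, len(tokens)]
        results.append(
            ["_".join(tokens[a:b]) for a, b in zip(bounds, bounds[1:])]
        )
    return results


for split in candidate_splits("personal_safety_merit", 2):
    print(split)
```

For two values there are two equally valid groupings, personal + safety_merit and personal_safety + merit, and nothing in the string itself says which one was intended.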

Potential Solutions

  1. Delimiter approach: Use a different delimiter between KDMA names (e.g., double underscore __)
  2. Length-based parsing: Use known KDMA name lengths/patterns
  3. Registry approach: Maintain a list of valid KDMA names and match against them
  4. Format change: Restructure alignment target ID format to avoid ambiguity
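As a rough sketch of option 3 (the registry approach), the KDMA part could be matched against a set of known names, trying the longest candidate first and backtracking when the remainder does not match. The registry contents and the names KNOWN_KDMAS, split_kdma_names, and parse_with_registry are assumptions for illustration; the real list of valid KDMA names would have to come from the project. Malformed IDs (no "-", non-numeric values) are not handled here.

```python
from typing import Optional

# Hypothetical registry; the actual set of valid KDMA names is an assumption.
KNOWN_KDMAS = {"personal_safety", "merit", "affiliation"}


def split_kdma_names(kdma_part: str, known: set) -> Optional[list]:
    """Split kdma_part into known names, or return None if no split exists."""
    tokens = kdma_part.split("_")

    def solve(start: int) -> Optional[list]:
        if start == len(tokens):
            return []
        # Try the longest candidate first, then shorter ones (backtracking).
        for end in range(len(tokens), start, -1):
            candidate = "_".join(tokens[start:end])
            if candidate in known:
                rest = solve(end)
                if rest is not None:
                    return [candidate] + rest
        return None

    return solve(0)


def parse_with_registry(target_id: str, known: set = KNOWN_KDMAS) -> list:
    # Assumption: the last "-" separates the KDMA part from the value part.
    kdma_part, _, value_part = target_id.rpartition("-")
    names = split_kdma_names(kdma_part, known)
    values = [float(v) for v in value_part.split("_")]
    if names is None or len(names) != len(values):
        return []
    return list(zip(names, values))


print(parse_with_registry("personal_safety_merit-0.0_1.0"))
```

With this registry the failing input parses as personal_safety with 0.0 and merit with 1.0. The trade-off is that the registry must be kept in sync with the KDMAs actually in use; an unknown name yields an empty list rather than a guess.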

Location

File: align_browser/experiment_models.py:42-98
Function: parse_alignment_target_id

Priority

High - affects KDMA parsing accuracy for alignment targets
