Vectorize enhance.fill_zebra_stripes (replace Python pixel loop) #18

@rfrenchseti

Background

src/picmaker/enhance.py::fill_zebra_stripes runs a Python-level pixel loop over every row and every leading / trailing zero pixel:

for line in range(lines):
    ...
    for s in range(samples):
        if array2d[line, s] != 0:
            break
        ...
    for s in range(samples - 1, -1, -1):
        if array2d[line, s] != 0:
            break
        ...

For a 1024×1024 image the worst case is on the order of 1024 × 1024 (about 10^6) Python-level iterations per scan direction. The current implementation is correct (the existing tests in tests/test_zebra.py pin its behavior) but slow on real-mission data.

See CODEBASE_CRITIQUE.md §5 (Performance and resource use, "Finding (Medium) — enhance.fill_zebra_stripes Python-pixel loop").

Goal

Replace the inner loops with vectorized NumPy. The function fills in zero pixels at the leading and trailing edge of each row when the corresponding pixels in the rows immediately above and below are both non-zero.

Suggested approach

For each row, two boolean masks encode the extent of the leading-zero prefix and the trailing-zero suffix:

import numpy as np

is_zero = (array2d == 0)
# leading_zero[line, s] is True iff every column from 0..s on `line` is zero.
leading_zero = np.minimum.accumulate(is_zero, axis=1)
# trailing_zero[line, s] is True iff every column from s..end on `line` is zero.
trailing_zero = np.minimum.accumulate(is_zero[:, ::-1], axis=1)[:, ::-1]

# Neighbouring-row sources for the fill. np.roll returns a copy and wraps
# around, so the wrapped first and last rows are overwritten below.
prev_row = np.roll(array2d, 1, axis=0)
next_row = np.roll(array2d, -1, axis=0)
prev_row[0] = array2d[1]    # first row uses row 1 (matches current code)
next_row[-1] = array2d[-2]  # last row uses row -2

# Fill condition: pixel is in the leading/trailing zero region AND both
# neighbours have non-zero values at the same column.
fill_mask = (leading_zero | trailing_zero) & (prev_row != 0) & (next_row != 0)
filled = (prev_row.astype('int64') + next_row.astype('int64')) // 2
array2d[fill_mask] = filled[fill_mask]
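
To make the prefix/suffix masks concrete, here is a small worked example on a made-up 2×6 array (the values are illustrative only, not taken from mission data). It uses np.logical_and.accumulate, which on boolean input should give the same result as the np.minimum.accumulate call in the sketch above; the assertion checks that assumption.

```python
import numpy as np

# Illustrative input: row 0 has zero edges plus one interior zero,
# row 1 is entirely zero.
array2d = np.array([[0, 0, 5, 0, 7, 0],
                    [0, 0, 0, 0, 0, 0]])
is_zero = (array2d == 0)

# Running AND along each row: stays True only while every pixel so far is zero.
leading_zero = np.logical_and.accumulate(is_zero, axis=1)
# Same scan from the right edge, flipped back to the original column order.
trailing_zero = np.logical_and.accumulate(is_zero[:, ::-1], axis=1)[:, ::-1]

# On booleans, a running minimum is the same thing as a running AND.
assert np.array_equal(leading_zero, np.minimum.accumulate(is_zero, axis=1))

print(leading_zero.astype(int))
# [[1 1 0 0 0 0]
#  [1 1 1 1 1 1]]
print(trailing_zero.astype(int))
# [[0 0 0 0 0 1]
#  [1 1 1 1 1 1]]
```

Note that the interior zero at column 3 of row 0 is in neither mask, which mirrors the `break` on the first non-zero pixel in the loop version.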

The rounding convention needs to be preserved bit-for-bit so the snapshot tests stay byte-identical. The existing code applies int(...) casts to both operands and computes (array_above + array_below) / 2, which produces a float; assigning that float back into the integer array truncates it toward zero. The // 2 in the sketch above floors instead, so the two agree only when the sum is non-negative.
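
As a quick check of where the two conventions can diverge, the following sketch compares true division followed by assignment into an integer array against floor division. The neighbour values and the int16 dtype are hypothetical, chosen only to include one negative odd sum; whether negative pixel values can actually occur in the real data is not stated here.

```python
import numpy as np

# Hypothetical neighbour values; the first pair has a negative odd sum.
above = np.array([-3, 3, 4], dtype=np.int16)
below = np.array([0, 0, 0], dtype=np.int16)
total = above.astype(np.int64) + below.astype(np.int64)

# Convention described for the existing code: true division gives a float,
# and storing it in an integer array truncates toward zero.
out_truncate = np.zeros(3, dtype=np.int16)
out_truncate[:] = total / 2

# Convention in the vectorized sketch: floor division.
out_floor = total // 2

print(out_truncate)  # [-1  1  2]
print(out_floor)     # [-2  1  2]
```

If the pixel data is always non-negative the two results coincide and `// 2` is safe; otherwise the vectorized version would need to reproduce the truncating behavior explicitly.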

Acceptance criteria

  • fill_zebra_stripes no longer contains a Python-level for line / for s loop.
  • tests/test_zebra.py and tests/test_pipeline_branches.py::test_zebra_path pass unchanged.
  • Snapshot tests pass byte-identically (no regeneration needed).
  • Add one benchmark / micro-test showing the vectorized version produces identical output for a 256×256 synthetic array (uses the existing tiny_array pattern).
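
A minimal sketch of what the last criterion's micro-test might look like follows. Everything in it is an assumption made for illustration: make_zebra_array stands in for the existing tiny_array pattern (whose real shape is not shown in this issue), fill_loop is a stand-in reconstructed from the description under Goal rather than the actual pre-change code, and fill_vectorized wraps the suggested approach. A real test would import the pre-change function instead of fill_loop.

```python
import numpy as np


def make_zebra_array(size=256, edge=8, seed=0):
    """Hypothetical synthetic image: non-zero data, zeroed edges on odd rows."""
    rng = np.random.default_rng(seed)
    arr = rng.integers(1, 255, size=(size, size), dtype=np.int16)
    arr[1::2, :edge] = 0    # leading zero stripe on every odd row
    arr[1::2, -edge:] = 0   # trailing zero stripe on every odd row
    return arr


def fill_loop(array2d):
    """Stand-in reference loop (an assumption, not the real implementation)."""
    out = array2d.copy()
    lines, samples = array2d.shape
    for line in range(lines):
        above = array2d[line - 1] if line > 0 else array2d[1]
        below = array2d[line + 1] if line < lines - 1 else array2d[-2]
        # Scan inward from the left edge, then from the right edge.
        for scan in (range(samples), range(samples - 1, -1, -1)):
            for s in scan:
                if array2d[line, s] != 0:
                    break
                if above[s] != 0 and below[s] != 0:
                    # Float result is truncated on assignment to the int array.
                    out[line, s] = (int(above[s]) + int(below[s])) / 2
    return out


def fill_vectorized(array2d):
    """Vectorized candidate following the suggested approach."""
    out = array2d.copy()
    is_zero = (array2d == 0)
    leading = np.logical_and.accumulate(is_zero, axis=1)
    trailing = np.logical_and.accumulate(is_zero[:, ::-1], axis=1)[:, ::-1]
    prev_row = np.roll(array2d, 1, axis=0)
    next_row = np.roll(array2d, -1, axis=0)
    prev_row[0] = array2d[1]
    next_row[-1] = array2d[-2]
    mask = (leading | trailing) & (prev_row != 0) & (next_row != 0)
    filled = (prev_row.astype(np.int64) + next_row.astype(np.int64)) // 2
    out[mask] = filled[mask]
    return out


def test_vectorized_matches_loop():
    image = make_zebra_array()
    expected = fill_loop(image)
    actual = fill_vectorized(image)
    assert actual.dtype == expected.dtype
    assert np.array_equal(actual, expected)


test_vectorized_matches_loop()
```

The synthetic data only zeroes alternate rows, so every striped row has fully non-zero neighbours. That keeps the comparison independent of details the issue does not spell out, such as whether the real loop reads rows that were already filled earlier in the same pass.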

Related

  • CODEBASE_CRITIQUE.md §5 — original finding.
