Reject duplicate output processor names in composite workflows #675

@andreatgretel

Description

Priority Level

Low (Cosmetic / Minor)

Describe the bug

Workflow chaining validates name collisions between stage processors and output_processors, but it does not reject duplicate names within the output_processors list itself. If two output processors share the same name, they can write to the same processor artifact path and the later output can overwrite the earlier one.
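The overwrite can be illustrated with a small standalone sketch. The `artifact_path` helper and the `stage/processors/name` layout are assumptions made up for illustration, not the library's actual storage scheme; the only point is that a location keyed by processor name cannot hold two outputs.

```python
def artifact_path(stage_name: str, processor_name: str) -> str:
    # Hypothetical layout (assumption): one artifact location per processor name.
    return f"{stage_name}/processors/{processor_name}"


# Stand-in for the artifact store: maps a path to whatever was written there.
artifacts: dict[str, list[str]] = {}

# Two output processors that share a name but drop different columns.
output_processors = [
    ("drop_scratch", ["scratch"]),
    ("drop_scratch", ["other_scratch"]),
]

for name, dropped_columns in output_processors:
    # The second write lands on the same key and replaces the first.
    artifacts[artifact_path("drafts", name)] = dropped_columns

print(artifacts)
```

Under that assumed layout, only the second processor's output remains after both have run.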

Steps/Code to reproduce bug

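# Assumes `workflow` is an existing composite workflow and `builder` is a valid config builder.
workflow.add_stage(
    name="drafts",
    config_builder=builder,
    output_processors=[
        DropColumnsProcessorConfig(name="drop_scratch", column_names=["scratch"]),
        # Same name as the processor above; this duplicate is currently accepted.
        DropColumnsProcessorConfig(name="drop_scratch", column_names=["other_scratch"]),
    ],
)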

Expected behavior

add_stage() should raise a DataDesignerWorkflowError when output_processors contains duplicate processor names.
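A minimal sketch of what such a check might look like, written standalone so it runs without the library. `DataDesignerWorkflowError` here is a local stand-in for the real exception class, and `ProcessorStub` and `validate_unique_output_processor_names` are hypothetical names; where the check would live inside `add_stage()` is not specified in this issue.

```python
from collections import Counter
from dataclasses import dataclass


class DataDesignerWorkflowError(Exception):
    """Local stand-in for the library's error type (assumption)."""


@dataclass
class ProcessorStub:
    """Hypothetical minimal processor config; only `name` matters here."""

    name: str


def validate_unique_output_processor_names(output_processors) -> None:
    """Raise if two entries in output_processors share a name."""
    counts = Counter(processor.name for processor in output_processors)
    duplicates = sorted(name for name, count in counts.items() if count > 1)
    if duplicates:
        raise DataDesignerWorkflowError(
            f"Duplicate output processor names: {duplicates}"
        )
```

Reporting every duplicated name at once, rather than stopping at the first, lets the caller fix the whole list in one pass.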

Agent Diagnostic / Prior Investigation

Greptile flagged this on PR #636. The current validation checks output_processors names against processors already present in the stage config, but it does not check for duplicate names within output_processors itself.

Additional context

Follow-up from PR #636.

Checklist

  • I reproduced this issue or provided a minimal example
  • I searched the docs/issues myself, or had my agent do so
  • If I used an agent, I included its diagnostics above

Labels

bug (Something isn't working)