Skip to content

Simplify BaseRunner.run method #235

@MattBurn

Description

@MattBurn

I am working through issue #224 and I think the current BaseRunner.run output API could be simplified.

My understanding of the current behavior is:

  • capture=True uses subprocess.PIPE so stdout and stderr are returned on the CompletedProcess.
  • silent=True redirects output to os.devnull, unless overridden.
  • stdout / stderr can be set to Path objects to write the corresponding stream to disk.

This creates a few ambiguous combinations:

  • capture=True, silent=True silently ignores silent.
  • capture=False, stdout=path or stderr=path still redirects that stream to a file, despite capture being disabled.
  • stdout / stderr are currently path-oriented, so users cannot pass an open file object, in-memory buffer, callback, logger, or multiple destinations.
  • Internal code that needs to inspect stderr must either force capture=True or write stderr to a temporary file and read it back.

This is particularly awkward for OrcaMmRunner.run_orca_mm, which needs to inspect stderr to decide whether to raise OrcaMmException. At the moment, that requires coupling the internal error-checking behavior to the public output-routing behavior.

I think the issue is that capture, silent, stdout, and stderr are all competing controls for the same underlying concern: stream handling.

Proposal

Replace the current flag-based API with a stream-target API.

The runner would always capture stdout and stderr internally and return them on the result. The stdout and stderr arguments would only describe additional destinations for each stream.

For example:

StreamTarget = (
    None
    | Path
    | IO[str]
    | Callable[[str], None]
    | Sequence["StreamTarget"]
)

Proposed semantics:

  • None: no external output destination
  • Path: write the stream to a file
  • IO[str]: write the stream to an open file-like object
  • Callable[[str], None]: stream chunks to a callback
  • Sequence[StreamTarget]: tee the stream to multiple destinations

The run method could then look something like:

def run(
    self,
    binary: OrcaBinary,
    args: Sequence[str] = (),
    *,
    stdout: StreamTarget = None,
    stderr: StreamTarget = None,
    stdin: str | None = None,
    cwd: Path | None = None,
    timeout: int | None = None,
) -> RunResult:
    ...

Where RunResult exposes the captured output:

@dataclass
class RunResult:
    binary: str
    args: Sequence[str]
    returncode: int
    stdout: str
    stderr: str

This removes the need for both capture and silent:

# Current silent behavior
runner.run(..., stdout=None, stderr=None)

# Write stderr to a file while still allowing internal inspection
result = runner.run(..., stderr=Path("orca_mm.err"))

# Print stdout live
result = runner.run(..., stdout=lambda chunk: print(chunk, end=""))

# Print stdout live and write it to disk
result = runner.run(
    ...,
    stdout=[
        lambda chunk: print(chunk, end=""),
        Path("orca.out"),
    ],
)

For OrcaMmRunner.run_orca_mm, this would simplify the implementation because stderr would always be available on the returned result:

result = self.run(
    OrcaBinary.ORCA_MM,
    args,
    stdout=stdout,
    stderr=stderr,
)

if raise_on_error and result.stderr:
    raise OrcaMmException(command, arguments, result.stderr)

This would decouple internal validation from user-facing output routing. Users can decide where output should go, while the runner can still inspect process output consistently.

One possible caveat is memory usage for very large outputs, since this proposal implies always retaining stdout and stderr in memory. If that becomes a concern, we could add a later escape hatch such as max_capture_bytes, but I think the simpler default API is preferable unless large-output cases become a real problem.

Metadata

Metadata

Assignees

Labels

breaking changeIntroduces a backward-incompatible modification.enhancementNew feature or request
No fields configured for Feature.

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions