F059: Built-in File Operations Plugin #206

@pocky


User Stories

US1: Read File Content in Workflow (P1 - Must Have)

As a workflow author,
I want to read file contents from disk within a workflow step using file.read,
So that I can feed file data into downstream steps (prompts, validations, transformations) without shelling out to cat.

Acceptance Scenarios:

  • Given a workflow with operation: file.read and inputs: {path: "README.md"}, when the step executes, then {{states.read_step.output}} contains the file's UTF-8 text content and {{states.read_step.size}} contains the byte count
  • Given a workflow with operation: file.read and inputs: {path: "missing.txt"}, when the step executes, then the operation returns an error with code EXECUTION.OPERATION.FAILED and a hint containing the file path
  • Given a workflow with operation: file.read and inputs: {path: "secrets.txt"}, when the step executes and the process lacks read permission on that file inside the working directory, then the operation returns an EXECUTION.OPERATION.FAILED error indicating permission denied

Independent Test: Create a temp file with known content, run a single-step workflow with file.read, assert output matches content and size matches byte length.

US2: Write File Content in Workflow (P1 - Must Have)

As a workflow author,
I want to write content to a file using file.write,
So that I can persist generated output (reports, configs, code) to disk declaratively.

Acceptance Scenarios:

  • Given a workflow with operation: file.write and inputs: {path: "out.txt", content: "hello"}, when the step executes, then out.txt exists with content "hello" and the operation output includes bytes_written
  • Given a workflow with operation: file.write and inputs: {path: "existing.txt", content: "new"} where existing.txt already exists, when the step executes, then the file is overwritten (default behavior)
  • Given a workflow with operation: file.write and inputs: {path: "existing.txt", content: "appended", mode: "append"}, when the step executes, then the content is appended to the existing file
  • Given a workflow with operation: file.write targeting a path in a non-existent directory, when create_dirs input is true, then intermediate directories are created; when false or omitted, then the operation returns an error

Independent Test: Run a single-step workflow writing known content to a temp path, then read back the file and assert content matches. Repeat with append mode on a pre-existing file.

US3: Copy File in Workflow (P2 - Should Have)

As a workflow author,
I want to copy a file from source to destination using file.copy,
So that I can duplicate artifacts (backups, templates, staging) without shell commands.

Acceptance Scenarios:

  • Given a workflow with operation: file.copy and inputs: {src: "a.txt", dest: "b.txt"}, when the step executes, then b.txt exists with identical content to a.txt and the operation output includes bytes_copied
  • Given src does not exist, when the step executes, then the operation returns an error with a descriptive message
  • Given dest already exists, when overwrite input is false, then the operation returns an error; when true or omitted (default), then the destination is overwritten

Independent Test: Create a temp source file, run file.copy, verify destination content matches source byte-for-byte.

US4: Delete File in Workflow (P2 - Should Have)

As a workflow author,
I want to delete a file using file.delete,
So that I can clean up temporary or intermediate files within the workflow lifecycle.

Acceptance Scenarios:

  • Given a workflow with operation: file.delete and inputs: {path: "temp.txt"}, when the step executes and the file exists, then the file is removed and the operation output includes deleted: true
  • Given the file does not exist, when missing_ok input is true (default), then the operation succeeds with deleted: false; when false, then the operation returns an error

Independent Test: Create a temp file, run file.delete, verify the file no longer exists. Run again with missing_ok: true, verify success without error.

US5: Path Validation and Security Boundaries (P1 - Must Have)

As a platform operator,
I want file operations to validate paths against traversal attacks and restrict access to the workflow's working directory,
So that workflows cannot read or write arbitrary system files.

Acceptance Scenarios:

  • Given a workflow with operation: file.read and inputs: {path: "../../etc/passwd"}, when the step executes, then the operation returns a USER.INPUT.INVALID error rejecting path traversal
  • Given a workflow with operation: file.write and inputs: {path: "/tmp/outside.txt"} when the resolved path is outside the workflow working directory, then the operation returns a USER.INPUT.INVALID error
  • Given a workflow with operation: file.read and inputs: {path: "subdir/data.json"}, when the resolved path is within the working directory, then the operation succeeds normally

Independent Test: Attempt path traversal patterns (../, absolute paths outside workdir, symlink escape) and assert all are rejected with appropriate error codes.


Requirements

Functional Requirements

  • FR-001: The system shall provide a FileOperationProvider implementing ports.OperationProvider with four operations: file.read, file.write, file.copy, file.delete
  • FR-002: file.read shall accept a required path (string) input and return output (string, file content) and size (integer, byte count) outputs
  • FR-003: file.write shall accept required path (string) and content (string) inputs, optional mode (string: "overwrite" default, "append"), and optional create_dirs (boolean, default false), returning bytes_written (integer)
  • FR-004: file.copy shall accept required src (string) and dest (string) inputs, optional overwrite (boolean, default true), returning bytes_copied (integer)
  • FR-005: file.delete shall accept a required path (string) input and optional missing_ok (boolean, default true), returning deleted (boolean)
  • FR-006: All path inputs shall be resolved relative to the workflow working directory and validated against directory traversal (rejecting .. components that escape the root and absolute paths outside the working directory)
  • FR-007: The provider shall be registered via CompositeOperationProvider alongside existing github and notify providers in run.go CLI wiring
  • FR-008: Each operation shall define a typed OperationSchema with InputSchema validation (type, required, default, description) consistent with F054/F056 patterns
  • FR-009: Operation errors shall use StructuredError with appropriate error codes (USER.INPUT.INVALID for bad paths, EXECUTION.OPERATION.FAILED for I/O failures)
  • FR-010: file.write shall use atomic write (temp file + rename) for overwrite mode to prevent corruption on partial writes

Non-Functional Requirements

  • NFR-001: file.read shall support files up to 10 MB; reading a larger file shall fail with a descriptive error message suggesting streaming alternatives
  • NFR-002: No secrets or file contents shall appear in log output at INFO level; DEBUG level may include truncated content (first 256 bytes)
  • NFR-003: All operations shall respect context.Context cancellation, aborting mid-operation when the workflow is cancelled
  • NFR-004: The internal/infrastructure/fileops/ package shall have zero imports from other infrastructure packages (domain + stdlib only, per hexagonal architecture rules)
  • NFR-005: Path validation cost shall be independent of file size: lexical checks use string operations only, and the symlink boundary check (see Clarifications) may inspect path metadata but shall never read file contents

Success Criteria

  • All P1 user stories implemented and tested
  • All P2 user stories implemented and tested
  • Unit test coverage >= 80%
  • No lint errors (golangci-lint, go-arch-lint)
  • Documentation updated (workflow-syntax.md, plugins.md, architecture.md, project-structure.md, CHANGELOG.md)
  • YAML fixture workflows demonstrating each operation
  • Integration tests validating cross-component wiring

Key Entities

| Entity | Description | Attributes |
| --- | --- | --- |
| FileOperationProvider | Infrastructure adapter implementing ports.OperationProvider for file operations | operations map, workdir resolver |
| fileOperation | Internal dispatch target per operation type | name, handler func, schema |
| pathValidator | Validates and resolves paths against working directory boundary | workdir (base path), resolve func |

Metadata

  • Status: backlog
  • Version: v0.4.0
  • Priority: medium
  • Estimation: M

Dependencies

  • Blocked by: F057
  • Unblocks: none

Clarifications

  • F057 dependency: F057 (operation interface enhancements) must be completed first. If F057 introduces changes to OperationProvider, OperationSchema, or OperationResult, F059 must conform to the updated contracts.
  • Working directory resolution: File paths resolve relative to the directory containing the workflow YAML file, consistent with how ShellExecutor resolves dir for command steps.
  • Symlink policy: Symlinks are followed but the resolved absolute path must remain within the working directory boundary. Symlinks that escape trigger the same traversal rejection as .. patterns.
  • Binary files: file.read decodes file content as UTF-8 text. Invalid UTF-8 sequences are replaced with the Unicode replacement character (U+FFFD), so the output is not byte-exact for binary data. Callers working with binary data should use file.copy instead.
  • Concurrency: File operations in parallel steps are safe only when they operate on independent paths. No file-level locking is provided; conflicting writes to the same path are the workflow author's responsibility.

Notes

  • Architecture pattern: Follows F054 (github) and F056 (notify) built-in provider pattern exactly: internal/infrastructure/fileops/ with provider.go, operations.go, types.go, and per-operation handler files.
  • go-arch-lint: New infra-fileops component in .go-arch-lint.yml with dependency rules matching infra-github and infra-notify.
  • Atomic writes: file.write in overwrite mode uses the same temp-file-plus-rename pattern as JSONStore in internal/infrastructure/store/, ensuring crash safety.
  • No domain changes: All types are infrastructure-internal (YAGNI, consistent with ADR-005 from F054 and F056).
  • 10 MB limit: Chosen to prevent workflows from accidentally loading multi-gigabyte files into memory. The limit is a constant, not configurable (YAGNI until a concrete use case emerges).
