Add optional submodules input to reusable CI (public submodules) #39

@kazmosahebi

Problem

Consumer repos whose tests read content from a submodule fail in the reusable CI because the checkout steps in python-ci.yml and the other *-ci.yml workflows fetch no submodules. Example: dfe-engine resolves ClickHouse schema profiles from its schemas submodule (schemas/common-header/*.yaml). When the submodule is absent, it falls back to stale bundled profiles in src/, so test_schema_builder_v2.py::...test_source_specific_overrides_in_views fails in CI (Version '1.1.0' not found ... Available versions: 1.0.0) while passing locally.

Current guidance

docs/LESSONS.md advises, for private submodules (e.g. dfe-schemas):

mark dependent tests with @pytest.mark.skipif(not schemas_dir.exists(), ...) rather than trying to check out private submodules in CI (GITHUB_TOKEN can't)

That guidance still holds for private submodules, but it leaves no supported path for a repo whose submodule is (or becomes) public and which wants those tests to actually run in CI.

Proposal

Add an optional submodules input to the reusable workflow(s):

submodules:
  type: string
  default: ""   # no-op for every current consumer

When set (e.g. submodules: schemas), the test job runs git submodule update --init --depth 1 <paths> after checkout. The value is passed through an environment variable rather than interpolated into the script, to avoid script injection. With the default empty value, nothing changes for repos that don't set it.

  • Public submodule: works with the default GITHUB_TOKEN, no secret needed.
  • Private submodule: still needs skipif (per LESSONS) or a configured cross-repo token -- out of scope here.
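To make the proposal concrete, here is a minimal sketch of how the input and the init step might look in python-ci.yml. It is not the draft branch's actual implementation: the job layout, the runner, the actions/checkout version, the step name, the SUBMODULES variable name, and the choice of space-separated paths are all assumptions made for illustration.

```yaml
# Sketch only: reusable workflow side (python-ci.yml).
on:
  workflow_call:
    inputs:
      submodules:
        description: "Space-separated submodule paths to initialise in the test job"
        type: string
        required: false
        default: ""   # no-op for every current consumer

jobs:
  test:
    runs-on: ubuntu-latest          # assumption: runner is not specified in this issue
    steps:
      - uses: actions/checkout@v4   # assumption: version used by the real workflow may differ
      - name: Init requested submodules   # hypothetical step name
        if: ${{ inputs.submodules != '' }}
        env:
          # The input reaches the shell as data, not as part of the script text.
          SUBMODULES: ${{ inputs.submodules }}
        run: |
          # Left unquoted on purpose so several space-separated paths become
          # separate arguments; "--" stops a path from being read as an option.
          git submodule update --init --depth 1 -- $SUBMODULES
```

A consumer such as dfe-engine would then opt in from its own workflow. The `<owner>/<repo>` and `<ref>` parts are placeholders, since the issue does not name the repository that hosts the reusable workflows:

```yaml
jobs:
  ci:
    uses: <owner>/<repo>/.github/workflows/python-ci.yml@<ref>
    with:
      submodules: schemas
```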

Docs to update (part of the fix)

  • docs/LESSONS.md: reconcile the two cases: skipif for private submodules, the new submodules input for public ones.

Scope / open questions

  • Start with python-ci.yml (the dfe-engine case). Extend to rust-ci.yml / ts-ci.yml / go-ci.yml for parity, or add on demand?
  • Only the test job consumes submodule content today (quality lints src/, build packages src/). Keep it test-job-only, or apply to all tree-using jobs?

A draft branch implementing the python-ci.yml input + the test-job init step is ready.

Labels: enhancement
