Problem
Consumer repos whose tests read content from a submodule fail in the reusable CI because the workflow checkouts (python-ci.yml and the other *-ci.yml) fetch no submodules. Example: dfe-engine resolves ClickHouse schema profiles from its schemas submodule (schemas/common-header/*.yaml); when the submodule is absent it falls back to stale bundled profiles in src/, so test_schema_builder_v2.py::...test_source_specific_overrides_in_views fails in CI (Version '1.1.0' not found ... Available versions: 1.0.0) while passing locally.
Current guidance
docs/LESSONS.md advises, for private submodules (e.g. dfe-schemas):
mark dependent tests with @pytest.mark.skipif(not schemas_dir.exists(), ...) rather than trying to check out private submodules in CI (GITHUB_TOKEN can't)
That stands for private submodules. But it leaves no supported path for a repo whose submodule is (or becomes) public and which wants those tests to actually run in CI.
Proposal
Add an optional submodules input to the reusable workflow(s):
    submodules:
      type: string
      default: ""   # no-op for every current consumer
When set (e.g. submodules: schemas), the test job runs git submodule update --init --depth 1 <paths> after checkout (value passed via env to avoid script injection). Default empty -> nothing changes for repos that don't set it.
- Public submodule: works with the default GITHUB_TOKEN, no secret needed.
- Private submodule: still needs skipif (per LESSONS) or a configured cross-repo token -- out of scope here.
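For discussion, the proposal above could look roughly like the following in python-ci.yml. This is a minimal sketch, not the draft branch itself: the step name, the SUBMODULE_PATHS variable name, the runner label, and the actions/checkout version are assumptions, and the real workflow's other inputs and steps are omitted.

```yaml
# Hypothetical excerpt of the reusable workflow python-ci.yml.
on:
  workflow_call:
    inputs:
      submodules:
        description: Space-separated submodule paths to initialise in the test job
        type: string
        required: false
        default: ""   # no-op for every current consumer

jobs:
  test:
    runs-on: ubuntu-latest   # assumption: the real runner label may differ
    steps:
      - uses: actions/checkout@v4   # assumption: the real workflow may pin another version

      - name: Init requested submodules   # hypothetical step name
        if: ${{ inputs.submodules != '' }}
        env:
          # The input reaches the shell as data through env, never by
          # interpolating ${{ }} into the script text.
          SUBMODULE_PATHS: ${{ inputs.submodules }}
        run: |
          # Unquoted on purpose so several space-separated paths split into
          # separate arguments; "--" stops a path from being read as an option.
          # shellcheck disable=SC2086
          git submodule update --init --depth 1 -- $SUBMODULE_PATHS
```

A consumer such as dfe-engine would then opt in from its own workflow. The `<org>/<repo>` and `<ref>` parts below are placeholders, not real values:

```yaml
jobs:
  ci:
    uses: <org>/<repo>/.github/workflows/python-ci.yml@<ref>
    with:
      submodules: schemas
```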
Docs to update (part of the fix)
docs/LESSONS.md: reconcile -- skipif for private submodules; the new submodules input for public ones.
Scope / open questions
- Start with python-ci.yml (the dfe-engine case). Extend to rust-ci.yml / ts-ci.yml / go-ci.yml for parity, or add on demand?
- Only the test job consumes submodule content today (quality lints src/, build packages src/). Keep it test-job-only, or apply to all tree-using jobs?
A draft branch implementing the python-ci.yml input + the test-job init step is ready.