Add pyspark-test-runner asset (v1.10.0) #14
Merged
Single-file Python wrapper around pytest for local PySpark suites, packaged as an agentskills.io-style skill (SKILL.md + the runner script) mirroring dbx-ro-query. The wrapper runs pytest, writes full output to a log file, and prints only a bounded digest built from the JUnit XML: result, exit code, counts, runnable failing node ids, and failures deduplicated by a normalized signature so many tests failing with one cause collapse to a single block. Output is bounded on every axis (failing-list cap, top signatures, head+tail excerpt trim, hard total backstop), and it handles collection/import errors, no-tests, timeouts (with a process-tree reap guard), and malformed or missing XML. Includes per-asset tests (JUnit parsing, node-id reconstruction, signature dedup, excerpt trim, digest budget, interpreter and log-dir resolution), the test config, and ASSETS.md plus ROADMAP.md catalog entries.
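The signature dedup described above can be sketched roughly as follows. This is a minimal illustration, not the code in run-pyspark-tests.py: the function names (normalize_signature, node_id, group_failures), the normalization rules, and the CamelCase heuristic for spotting a test class are all assumptions made for the example.

```python
import re
import xml.etree.ElementTree as ET
from typing import Dict, List


def normalize_signature(message: str) -> str:
    """Reduce a failure message to a signature that ignores run-specific detail."""
    stripped = message.strip()
    first_line = stripped.splitlines()[0] if stripped else ""
    sig = re.sub(r"0x[0-9a-fA-F]+", "<addr>", first_line)  # memory addresses
    sig = re.sub(r"\b\d+\b", "<n>", sig)                   # ports, ids, counts
    return sig


def node_id(testcase: ET.Element) -> str:
    """Rebuild a runnable pytest node id from the JUnit classname/name pair."""
    parts = testcase.get("classname", "").split(".")
    name = testcase.get("name", "")
    # Assumption: a trailing CamelCase component is a test class,
    # and everything before it is the module path.
    if parts[-1][:1].isupper():
        return "/".join(parts[:-1]) + ".py::" + parts[-1] + "::" + name
    return "/".join(parts) + ".py::" + name


def group_failures(junit_xml: str) -> Dict[str, List[str]]:
    """Map each normalized signature to the node ids that failed with it."""
    groups: Dict[str, List[str]] = {}
    for case in ET.fromstring(junit_xml).iter("testcase"):
        for tag in ("failure", "error"):
            problem = case.find(tag)
            if problem is not None:
                text = problem.get("message") or problem.text or ""
                groups.setdefault(normalize_signature(text), []).append(node_id(case))
                break
    return groups


SAMPLE = """<testsuites><testsuite name="pytest" tests="3" failures="2">
<testcase classname="tests.test_etl" name="test_a">
  <failure message="Py4JJavaError: port 40123 refused">tb</failure></testcase>
<testcase classname="tests.test_etl.TestLoad" name="test_b">
  <failure message="Py4JJavaError: port 40999 refused">tb</failure></testcase>
<testcase classname="tests.test_etl" name="test_c"/>
</testsuite></testsuites>"""

if __name__ == "__main__":
    for signature, ids in group_failures(SAMPLE).items():
        print(f"{len(ids)} test(s): {signature}")
        for nid in ids:
            print("  " + nid)
```

Both sample failures differ only in the port number, so they land in a single signature block with two node ids.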
Finalize the changelog for the v1.10.0 release: rename [Unreleased] to [1.10.0] - 2026-06-20 and add a fresh empty [Unreleased]. Bump both version markers (pyproject.toml version and the generated bundle's _template_version) to 1.10.0 so they agree with the changelog, per the release-metadata guard test.
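A guard of this kind could look roughly like the sketch below. It is not the contents of tests/test_release_metadata.py; the helper names are hypothetical, and it assumes the changelog uses Markdown headings of the form `## [x.y.z] - date` and that both markers are plain quoted strings.

```python
import re


def latest_release(changelog: str) -> str:
    """Return the first versioned changelog heading, skipping [Unreleased]."""
    # Assumption: headings look like "## [1.10.0] - 2026-06-20".
    for match in re.finditer(r"^#+\s*\[([^\]]+)\]", changelog, flags=re.MULTILINE):
        if match.group(1).lower() != "unreleased":
            return match.group(1)
    raise ValueError("no released version heading found")


def pyproject_version(text: str) -> str:
    """Extract version = "x.y.z" from pyproject.toml text."""
    match = re.search(r'^version\s*=\s*"([^"]+)"', text, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no version field found")
    return match.group(1)


def template_version(text: str) -> str:
    """Extract "_template_version": "x.y.z" from the bundle config template."""
    match = re.search(r'"_template_version"\s*:\s*"([^"]+)"', text)
    if match is None:
        raise ValueError("no _template_version field found")
    return match.group(1)


def markers_agree(changelog: str, pyproject: str, template: str) -> bool:
    """True when the changelog and both version markers name the same release."""
    expected = latest_release(changelog)
    return pyproject_version(pyproject) == expected == template_version(template)


if __name__ == "__main__":
    changelog = "# Changelog\n\n## [Unreleased]\n\n## [1.10.0] - 2026-06-20\n"
    pyproject = '[project]\nname = "example"\nversion = "1.10.0"\n'
    template = '{"_template_version": "1.10.0"}'
    print(markers_agree(changelog, pyproject, template))
```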
Related Issue
N/A (no tracking issue)
Summary
Adds the pyspark-test-runner asset to the Asset Library and cuts the v1.10.0 release. The asset is a single-file Python wrapper around pytest for local PySpark suites: it runs pytest, writes the full output to a log file, and prints only a bounded, agent-friendly digest built from the JUnit XML. When many tests fail with the same error, the digest collapses them into one signature block instead of flooding a coding agent's context window.
Changes
assets/pyspark-test-runner/ (schema + README + template/{{.target_dir}}/skills/pyspark-test-runner/ with SKILL.md and scripts/run-pyspark-tests.py), mirroring the dbx-ro-query skill+script layout. Single target_dir prompt (default .agents).
--timeout-sec with a process-tree reap guard, and malformed/missing JUnit XML falling back to a bounded log tail.
tests/assets/test_pyspark_test_runner.py (JUnit parsing, node-id reconstruction, signature dedup, excerpt trim, digest budget, interpreter/log-dir resolution) and test config tests/configs/assets/pyspark_test_runner.json.
ASSETS.md catalog and ROADMAP.md "Shipped" list updated.
1.10.0: CHANGELOG.md finalized and both version markers bumped.
Change Area
assets/<name>/)
Configuration Axes Affected
databricks_template_schema.json)
Testing
pytest tests/ -v): 2415 passed, 163 skipped
Additionally validated against a real PySpark suite (PySpark 4.1.2, OpenJDK 17): all-pass, a repetitive-failure flood (797 raw pytest lines collapsed to a ~70-line digest with one signature block), collection/import error, no-tests, timeout (clean process-tree reap, no orphaned JVM workers), and a long Py4J traceback (head+tail trimmed).
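The head+tail trim applied to long tracebacks can be illustrated with a small sketch. The function name trim_excerpt, the default line counts, and the wording of the omission marker are assumptions for the example, not the wrapper's actual values.

```python
def trim_excerpt(text: str, head: int = 8, tail: int = 12) -> str:
    """Keep the first `head` and last `tail` lines, marking what was dropped."""
    lines = text.splitlines()
    if len(lines) <= head + tail:
        return text  # already within budget, leave untouched
    omitted = len(lines) - head - tail
    marker = f"... [{omitted} lines omitted] ..."
    # Slice from an explicit index so tail=0 keeps no trailing lines.
    return "\n".join(lines[:head] + [marker] + lines[len(lines) - tail:])


if __name__ == "__main__":
    traceback_text = "\n".join(f"line {i}" for i in range(100))
    print(trim_excerpt(traceback_text, head=3, tail=2))
```

Keeping both ends matters for Py4J failures in particular: the Python-side exception is usually near the top of the traceback, while the root JVM cause tends to appear near the bottom.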
Asset Changes (if applicable)
databricks bundle init . --template-dir assets/<name> --output-dir <dir>
library/helpers.tmpl or other assets)
tests/configs/assets/<name>.json added
Release (if this PR cuts a release)
CHANGELOG.md finalized: [Unreleased] renamed to [1.10.0] - 2026-06-20, fresh empty [Unreleased] added above
1.10.0 (pyproject.toml and template/{{.project_name}}/bundle_init_config.json.tmpl)
tests/test_release_metadata.py)
Checklist
{{ }} blocks)
.tmpl files appear in generated output