ci: fix Windows job flakiness caused by dirty workspace #3694

Open
Leiyks wants to merge 6 commits into master from leiyks/fix-windows-job-flakiness

Leiyks (Contributor) commented Mar 6, 2026

Summary

Fixes Windows CI job flakiness caused by leftover files from previous runs on persistent Windows runners.

Root causes fixed:

  • x64/Release/php_ddtrace.dll / .pdb: NTFS junction points/reparse points that PS 5.1 Remove-Item -Recurse and git checkout can't remove, causing git checkout to fail with "Invalid argument"
  • run-tests.php and PHP test files: still held open by a previous job's processes, so removing them fails with "Permission Denied"
  • Docker containers from previous runs holding php_ddtrace.dll open across jobs

Solution:

  • Extracted shared windows_git_setup() function in generate-common.php used by all three affected Windows jobs
  • Kills leftover Docker containers before cleanup
  • Uses cmd.exe rd /s /q from the parent directory (handles junction points that PS 5.1 can't)
  • Manual git clone + git checkout with $LASTEXITCODE guards (PS 5.1 ignores $PSNativeCommandUseErrorActionPreference)
  • All three jobs now use GIT_STRATEGY: none to skip GitLab's built-in checkout

Jobs fixed:

  • compile extension windows (generate-package.php)
  • windows test_c (generate-tracer.php)
  • verify windows (generate-package.php) — uses a variant windows_git_setup_with_packages() that saves/restores the packages/ artifact around the workspace wipe

datadog-official bot commented Mar 6, 2026

⚠️ Tests

⚠️ Warnings

🧪 1028 Tests failed

testSearchPhpBinaries from integration.DDTrace\Tests\Integration\PHPInstallerTest (Datadog)
DDTrace\Tests\Integration\PHPInstallerTest::testSearchPhpBinaries
Test code or tested code printed unexpected output: Searching for available php binaries, this operation might take a while.

testSimplePushAndProcess from laravel-58-test.DDTrace\Tests\Integrations\Laravel\V5_8\QueueTest (Datadog)
DDTrace\Tests\Integrations\Laravel\V5_8\QueueTest::testSimplePushAndProcess
Test code or tested code printed unexpected output: spanLinksTraceId: 69b2ea3b00000000046f2ca99853ebe0
tid: 69b2ea3b00000000
hexProcessTraceId: 046f2ca99853ebe0
hexProcessSpanId: df86221a5e291fd3
processTraceId: 319523205483326432
processSpanId: 16106598613981405139

phpvfscomposer://tests/vendor/phpunit/phpunit/phpunit:106

testSimplePushAndProcess from laravel-8x-test.DDTrace\Tests\Integrations\Laravel\V8_x\QueueTest (Datadog)
DDTrace\Tests\Integrations\Laravel\V8_x\QueueTest::testSimplePushAndProcess
Test code or tested code printed unexpected output: spanLinksTraceId: 69b2ea6a00000000c578b57468a1d1af
tid: 69b2ea6a00000000
hexProcessTraceId: c578b57468a1d1af
hexProcessSpanId: bfc8cd46aa21d8b7
processTraceId: 14229322534253351343
processSpanId: 13819521159972116663

ℹ️ Info

No other issues found

❄️ No new flaky tests detected

Commit SHA: b0d2cde

codecov-commenter commented Mar 6, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 62.30%. Comparing base (7d767af) to head (b0d2cde).
⚠️ Report is 1 commit behind head on master.

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #3694      +/-   ##
==========================================
- Coverage   62.40%   62.30%   -0.11%     
==========================================
  Files         142      142              
  Lines       13586    13586              
  Branches     1775     1775              
==========================================
- Hits         8479     8465      -14     
- Misses       4301     4314      +13     
- Partials      806      807       +1     

see 3 files with indirect coverage changes


Leiyks added 2 commits March 11, 2026 15:55
The cmd.exe "for /d" loop used to clean the workspace skips directory
entries during deletion (enumerates and deletes in the same pass), leaving
artifact output dirs from previous runs. When git clone then fails because
the workspace isn't empty, $PSNativeCommandUseErrorActionPreference = $true
is silently ignored on Windows PowerShell 5.1 (requires PS 7.3+), so the
script continues without source code and phpize.bat fails with exit 10.

Fixes:
- Replace cmd.exe cleanup loop with PowerShell-native Get-ChildItem | Remove-Item
  which handles each entry independently and tolerates locked files
- Add WARNING log line if any items could not be removed (aids debugging)
- Remove $PSNativeCommandUseErrorActionPreference (no-op on PS 5.1)
- Add explicit $LASTEXITCODE checks after git clone, checkout, and submodule init

Applied to both generate-package.php (compile extension windows) and
generate-tracer.php (windows test_c).
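
For illustration, the explicit exit-code checks described above could look roughly like the following on Windows PowerShell 5.1. This is a sketch, not the code in this PR: the helper name `Invoke-Native` is hypothetical, the use of GitLab's predefined `CI_REPOSITORY_URL` and `CI_COMMIT_SHA` variables is an assumption, and the exact submodule command the jobs run is not shown in the commit message.

```powershell
# Hypothetical helper (not from the PR): run a native command and stop the
# script if it exits non-zero. PS 5.1 does not do this on its own.
function Invoke-Native {
    param(
        [Parameter(Mandatory = $true)][string]$Description,
        [Parameter(Mandatory = $true)][scriptblock]$Command
    )
    & $Command
    if ($LASTEXITCODE -ne 0) {
        # throw is a terminating error, so later steps never run without sources
        throw "$Description failed with exit code $LASTEXITCODE"
    }
}

# Assumed inputs: GitLab's predefined CI variables for the repo URL and commit.
Invoke-Native 'git clone'     { git clone $env:CI_REPOSITORY_URL . }
Invoke-Native 'git checkout'  { git checkout $env:CI_COMMIT_SHA }
Invoke-Native 'git submodule' { git submodule update --init --recursive }
```

Throwing rather than continuing means a failed clone surfaces as the job's error, instead of a later phpize.bat failure that hides the real cause.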
… cleanup

PowerShell 5.1's Remove-Item -Recurse throws "mismatch between the tag
specified in the request and the tag present in the reparse point" when
the workspace contains Windows junction points (created by switch-php,
e.g. /php <<===>> /php-nts) or NTFS symlinks (from core.symlinks=true
git clone). This caused the entire cleanup to fail silently, leaving
the full previous repo tree in place and making git clone fail again.

Fix: navigate to the parent directory and run cmd.exe "rd /s /q" on
the whole workspace directory. cmd.exe rd removes junction entries
without following them into their targets, avoiding the reparse point
issue entirely. The directory is then recreated empty before returning.
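
A minimal sketch of that cleanup, assuming the workspace path comes from GitLab's predefined `CI_PROJECT_DIR` variable and uses backslashes (the PR's actual variable names are not shown here):

```powershell
# Assumption: CI_PROJECT_DIR is the job's workspace directory.
$workspace = $env:CI_PROJECT_DIR
$parent    = Split-Path -Parent $workspace

# Leave the workspace first; a directory that is the current working
# directory of this process cannot be removed.
Set-Location $parent

if (Test-Path -LiteralPath $workspace) {
    # rd removes junction entries themselves instead of descending into targets
    cmd.exe /c rd /s /q "$workspace"
}

# rd's exit code may not signal every failure, so check the result directly.
if (Test-Path -LiteralPath $workspace) {
    throw "Workspace cleanup failed: $workspace still exists"
}

# Recreate the workspace empty and return to it for the clone step.
New-Item -ItemType Directory -Path $workspace | Out-Null
Set-Location $workspace
```

The explicit `Test-Path` check after `rd` is an addition in this sketch; it makes a locked file show up as a cleanup error rather than as a later git clone failure.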

Leiyks force-pushed the leiyks/fix-windows-job-flakiness branch from f068d19 to b7d680b March 11, 2026 14:56

php_ddtrace.dll (and other workspace files) are locked with "Access is
denied" when a Docker container from a previous job run is still alive
with the workspace volume mounted. This causes rd /s /q to fail and
git clone to fail again.

Fix: force-remove all containers, running or stopped (docker rm -f $(docker ps -aq)),
before the rd /s /q workspace cleanup, releasing all file handles.
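
As a hedged sketch, the container cleanup can be guarded so that it is a no-op when nothing is left over, since `docker rm` with no container IDs exits with an error. The variable name and the warning text are illustrative, not taken from the PR:

```powershell
# List every container ID, running or stopped (-a), IDs only (-q).
$ids = docker ps -aq

if ($ids) {
    # PowerShell expands the array into one argument per container ID.
    docker rm -f $ids
    if ($LASTEXITCODE -ne 0) {
        Write-Host "WARNING: docker rm exited with code $LASTEXITCODE"
    }
}
```

Note that this removes every container on the runner, which assumes the runner does not host containers belonging to other concurrently running jobs.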

Leiyks changed the title from "ci: fix Windows workspace cleanup and fail-fast for git operations" to "ci: fix Windows job flakiness caused by dirty workspace" Mar 12, 2026
Leiyks force-pushed the leiyks/fix-windows-job-flakiness branch from 7f0beb4 to 181b7e6 March 12, 2026 12:37
Leiyks force-pushed the leiyks/fix-windows-job-flakiness branch from 181b7e6 to c286608 March 12, 2026 14:33
Leiyks marked this pull request as ready for review March 12, 2026 14:48
Leiyks requested a review from a team as a code owner March 12, 2026 14:48
Leiyks added 2 commits March 12, 2026 16:02
…leanup

Applies the same GIT_STRATEGY: none + manual clone pattern to the
verify windows job, saving/restoring the packages/ artifact around the
workspace cleanup to avoid git checkout failures on locked/junction files.
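
The save/restore step might be sketched as follows. The stash location next to the workspace is a hypothetical choice, `CI_PROJECT_DIR` is assumed to be the workspace, and the sketch assumes the fresh clone does not itself contain a packages directory:

```powershell
$workspace = $env:CI_PROJECT_DIR
$packages  = Join-Path $workspace 'packages'
# Hypothetical stash path outside the workspace, so the wipe cannot touch it.
$stash     = Join-Path (Split-Path -Parent $workspace) 'packages-stash'

# Drop any stash left behind by an earlier, interrupted run.
if (Test-Path -LiteralPath $stash) {
    Remove-Item -LiteralPath $stash -Recurse -Force
}

# Save the downloaded artifact before the workspace is wiped.
Move-Item -LiteralPath $packages -Destination $stash

# ... workspace wipe and manual git clone/checkout happen here ...

# Restore the artifact into the freshly cloned workspace.
Move-Item -LiteralPath $stash -Destination $packages
```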

Leiyks force-pushed the leiyks/fix-windows-job-flakiness branch from 394ad5e to b0d2cde March 12, 2026 16:16

2 participants