feat(pg-pg): Automated schema dump mode #4283
4c76639 to c6688d5
❌ 2 Tests Failed
❌ Test Failure
Analysis: Consistent, deterministic failures across all CI matrix jobs caused by
c6688d5 to d26ec76
🔄 Flaky Test Detected
Analysis: Multiple e2e tests in
✅ Automatically retrying the workflow
❌ Test Failure
Analysis: Multiple
🔄 Flaky Test Detected
Analysis: The e2e test suite timed out after exactly 900s (the configured limit) on the MariaDB 8.0 matrix job, with no specific assertion failures, a classic symptom of CI infrastructure slowness or a hanging test rather than a code regression.
✅ Automatically retrying the workflow
❌ Test Failure
Analysis: All PG e2e tests consistently time out on STATUS_SETUP across every CI matrix variant, indicating the
🔄 Flaky Test Detected
Analysis: All 7–8 failures across every matrix shard show "UNEXPECTED STATUS TIMEOUT STATUS_SETUP", meaning CDC workflows timed out during Temporal workflow setup, which is a classic infrastructure/timing flake, not a logic regression.
✅ Automatically retrying the workflow
❌ Test Failure
Analysis: The same set of e2e PostgreSQL tests (
🔄 Flaky Test Detected
Analysis: Three separate e2e matrix jobs failed due to timing/timeout issues: one run hit the 15-minute test suite timeout, causing a cascade panic, while the other two each had a different WaitFor-based test fail in a different matrix configuration, with no consistent assertion error pointing to a real code regression.
✅ Automatically retrying the workflow
❌ Test Failure
Analysis: The same two PG schema dump tests fail with STATUS_SETUP timeouts across all CI matrix configurations, consistent with a real regression introduced by the recent Docker image tag upgrade (PR #4285) rather than random flakiness.
🔄 Flaky Test Detected
Analysis: Both failing tests (Test_PG_Schema_Dump_No_Owner_No_Privileges and Test_PG_Schema_Dump_And_CDC) hit "UNEXPECTED STATUS TIMEOUT STATUS_SETUP" across two independent matrix configurations, indicating the Temporal workflow setup phase exceeded the wait deadline due to CI resource contention rather than a code regression.
✅ Automatically retrying the workflow
🔄 Flaky Test Detected
Analysis: Both failing tests (
✅ Automatically retrying the workflow
🔄 Flaky Test Detected
Analysis: All three failing tests hit setup/propagation timeouts ("UNEXPECTED STATUS TIMEOUT STATUS_SETUP", repeated record-count polling mismatches, ClickHouse table-not-found) that are consistent with CI resource contention in a parallelized e2e suite, not a logic regression from the unrelated ClickHouse GCS staging commit.
✅ Automatically retrying the workflow
❌ Test Failure
Analysis: CI setup fails deterministically with "E: Unable to locate package postgresql-client-18" because the PostgreSQL apt repository is not configured on the runner, so no tests execute at all.
🔄 Flaky Test Detected
Analysis: All failures are "UNEXPECTED STATUS TIMEOUT STATUS_SETUP" errors in Temporal-orchestrated e2e tests unrelated to the merged ClickHouse change, consistent with CI resource saturation causing setup phase timeouts.
✅ Automatically retrying the workflow
🔄 Flaky Test Detected
Analysis: Both failing tests (
✅ Automatically retrying the workflow
6cc1c5e to 97a5fc4
🔄 Flaky Test Detected
Analysis: The test
✅ Automatically retrying the workflow
❌ Test Failure
Analysis: Real build failure:
Code Review
Bug found in When the However, Suggested fix: Have
Also checked for CLAUDE.md compliance; no violations found.
323877a to 5aeaf1e
❌ Test Failure
Analysis: Real bug:
❌ Test Failure
Analysis: Test_PartitionBy fails deterministically across all 4 ClickHouse test suites with the same assertion mismatch (expected "num" but got "(num)" for the partition_key column), indicating a real regression in how partition keys are generated or reported, not a timing/flaky issue.
❌ Test Failure
Analysis: Test_PartitionBy deterministically fails across all 4 ClickHouse test suites because ClickHouse now returns "(num)" instead of "num" for partition_key in system.tables, indicating a real behavioral regression rather than a flaky failure.
80e2bf7 to b90d3a1
🔄 Flaky Test Detected
Analysis: All failures trace back to
✅ Automatically retrying the workflow
🔄 Flaky Test Detected
Analysis: The e2e test suite hit the hard 900-second timeout (ran 900.644s) with no assertion failures, indicating a slow or hanging test rather than a code regression.
✅ Automatically retrying the workflow
🔄 Flaky Test Detected
Analysis: The e2e test package timed out at exactly the 900s hard limit with no assertion failures, indicating a flaky infrastructure/timing issue rather than a code regression.
✅ Automatically retrying the workflow
❌ Test Failure
Analysis: All 4 NullEngine test variants fail deterministically at the resync/initial-snapshot stage because data inserted during the snapshot doesn't flow through the ClickHouse NullEngine materialized view to the target table, indicating a real bug, not a flaky failure.
This reverts commit 71af23a.
b90d3a1 to 9471f04
Why
What
SetupFlowWorkflow which runs pg_dump on the source database and pipes its output to the target database via psql.
pg_dumpall requires exact major version matching between the pg_dumpall version and the target version, which we cannot guarantee.
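
For illustration only, the dump-and-pipe step described above could be sketched roughly as follows. This is not the code in this PR: the function names (`dumpArgs`, `restoreArgs`, `pipeSchema`), the `SOURCE_DSN`/`TARGET_DSN` environment variables, and the exact flag selection are assumptions made for the example. It assumes `pg_dump` and `psql` are on `PATH` and that a schema-only dump without owners or privileges is wanted.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"os/exec"
)

// dumpArgs builds pg_dump arguments for a schema-only dump of the source.
// (Hypothetical helper; the flag choice is an assumption, not the PR's.)
func dumpArgs(sourceDSN string, noOwner, noPrivileges bool) []string {
	args := []string{"--schema-only"}
	if noOwner {
		args = append(args, "--no-owner")
	}
	if noPrivileges {
		args = append(args, "--no-privileges")
	}
	return append(args, "--dbname="+sourceDSN)
}

// restoreArgs builds psql arguments that replay SQL read from stdin on the
// target and stop at the first failing statement.
func restoreArgs(targetDSN string) []string {
	return []string{"--no-psqlrc", "--set=ON_ERROR_STOP=1", "--dbname=" + targetDSN}
}

// pipeSchema runs pg_dump against the source and streams its output into psql
// on the target through an OS pipe, so the dump is never held in memory.
func pipeSchema(ctx context.Context, sourceDSN, targetDSN string) error {
	r, w, err := os.Pipe()
	if err != nil {
		return fmt.Errorf("create pipe: %w", err)
	}
	dump := exec.CommandContext(ctx, "pg_dump", dumpArgs(sourceDSN, true, true)...)
	dump.Stdout, dump.Stderr = w, os.Stderr
	restore := exec.CommandContext(ctx, "psql", restoreArgs(targetDSN)...)
	restore.Stdin, restore.Stdout, restore.Stderr = r, os.Stderr, os.Stderr

	dumpErr := dump.Start()
	var restoreErr error
	if dumpErr == nil {
		restoreErr = restore.Start()
	}
	// Started children hold their own copies of the pipe ends. Close ours so
	// psql sees EOF as soon as pg_dump exits.
	w.Close()
	r.Close()
	if dumpErr != nil {
		return fmt.Errorf("start pg_dump: %w", dumpErr)
	}
	if restoreErr != nil {
		_ = dump.Wait() // pg_dump stops once the pipe has no reader
		return fmt.Errorf("start psql: %w", restoreErr)
	}
	dumpErr = dump.Wait()
	restoreErr = restore.Wait()
	if dumpErr != nil {
		return fmt.Errorf("pg_dump: %w", dumpErr)
	}
	if restoreErr != nil {
		return fmt.Errorf("psql: %w", restoreErr)
	}
	return nil
}

func main() {
	src, dst := os.Getenv("SOURCE_DSN"), os.Getenv("TARGET_DSN")
	if src == "" || dst == "" {
		fmt.Println("set SOURCE_DSN and TARGET_DSN to run the schema dump")
		return
	}
	if err := pipeSchema(context.Background(), src, dst); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Using `pg_dump` per database rather than `pg_dumpall` matches the constraint stated above. The sketch passes connection strings on the command line for brevity; a real deployment would more likely supply credentials through the environment or a password file.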