summary
Paused linux postgres branching benchmark campaign.
When we resume, rerun everything from start. Do not resume from old artifacts.
Goal:
- single-host linux
- postgres in docker
- branching ops fast (<5s p95)
- runtime performance high
- low storage overhead
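The "<5s p95" target for branching ops can be checked mechanically. Below is a minimal sketch, assuming one integer duration in milliseconds per line; the function names `p95_ms` and `passes_gate` are hypothetical and not part of the existing harness.

```shell
# p95_ms: read one integer duration (ms) per line on stdin and print the
# nearest-rank 95th percentile. Exits non-zero when there is no input.
p95_ms() {
  sort -n | awk '
    { v[NR] = $1 }
    END {
      if (NR == 0) exit 1
      rank = int((NR * 95 + 99) / 100)   # ceil(0.95 * NR)
      print v[rank]
    }'
}

# passes_gate: succeed only when the p95 of stdin is under the 5s target.
passes_gate() {
  local p95
  p95=$(p95_ms) || return 1
  [ "$p95" -lt 5000 ]
}
```

Nearest-rank is only one of several percentile definitions; whichever the harness uses should be applied the same way to every backend.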
cleanup done
Benchmark VM cleanup confirmed.
Current Hetzner servers:
frost-live
frost-demo-1771268678
No frost-pg-bench-* server is running.
what is implemented
Research harness + adapters + tracking docs are in benchmarks/:
run-branching-research-all.sh
run-branching-ops-bench.sh
run-branching-runtime-bench.sh
run-branching-scale-bench.sh
run-branching-soak-bench.sh
run-ext4-control-bench.sh
run-branching-hetzner-batch.sh
provision-linux-vm.sh
backends/*.sh
Tracking docs/csv are in benchmarks/branching-research-*.md|csv
important scripts (copy/paste)
benchmarks/run-branching-research-all.sh
benchmarks/run-branching-hetzner-batch.sh
benchmarks/provision-linux-vm.sh
benchmarks/run-branching-ops-bench.sh
benchmarks/run-ext4-control-bench.sh
benchmarks/run-branching-runtime-bench.sh
benchmarks/run-branching-scale-bench.sh
benchmarks/run-branching-soak-bench.sh
benchmarks/collect-metrics.sh
benchmarks/parse-results.sh
benchmarks/backends/lvm-thin-adapter-core.sh
benchmarks/backends/lvm-thin-ext4-adapter.sh
benchmarks/backends/lvm-thin-meta-adapter.sh
benchmarks/backends/zfs-clone-adapter.sh
benchmarks/backends/btrfs-subvolume-adapter.sh
benchmarks/backends/xfs-reflink-adapter.sh
benchmarks/backends/backend-contract.sh
# full campaign (from scratch)
./benchmarks/run-branching-research-all.sh
# phase-1 ops only
./benchmarks/run-branching-ops-bench.sh
# ext4 baseline only
./benchmarks/run-ext4-control-bench.sh
# runtime only
./benchmarks/run-branching-runtime-bench.sh
# scale only
./benchmarks/run-branching-scale-bench.sh
# soak only
./benchmarks/run-branching-soak-bench.sh
key fixes already made
- fixed CSV parse crash in ops bench (run-branching-ops-bench.sh) by sanitizing multiline/comma error text
- fixed lvm prepare issues (backends/lvm-thin-adapter-core.sh):
  - enforce min LV size for pgbench init
  - save state earlier so cleanup is reliable
- fixed runtime runner to skip failed backends instead of crashing entire batch (run-branching-runtime-bench.sh)
- added phase-1 result override in orchestrator (exists, but do not use for next campaign)
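For reference, the CSV sanitizing fix can be pictured roughly as follows. This is an illustrative sketch, not the code in run-branching-ops-bench.sh: the function name `sanitize_csv_field` and the choice of replacement characters are assumptions.

```shell
# sanitize_csv_field: flatten free-form error text into one CSV-safe field.
# Newlines and carriage returns become spaces, commas become semicolons,
# double quotes are dropped, and runs of spaces are squeezed.
sanitize_csv_field() {
  printf '%s' "$1" | tr '\r\n' '  ' | tr ',' ';' | tr -d '"' | tr -s ' '
}
```

A call such as `sanitize_csv_field "$err_text"` would be applied to the error column just before the row is appended, so a multiline backend error cannot split one record into several.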
previous run outputs (reference only)
These are historical signals only. Do not use them in final decision package for the next campaign.
Reference dirs:
benchmarks/results/20260227-085716-branching-research-ops/
benchmarks/results/20260227-093442-ext4-control/
benchmarks/results/20260227-120250-branching-research-all/
current blocker to fix before rerun
run-branching-research-all.sh can hang in phase-1 wrapper after remote completion.
Symptom:
- remote ops-gates.csv exists
- local process still stuck on SSH command in run-branching-hetzner-batch.sh
Likely area:
- SSH lifecycle / remote command return handling in run-branching-hetzner-batch.sh
restart plan (from scratch)
- Fix SSH hang behavior in run-branching-hetzner-batch.sh:
  - timeout around remote execution
  - completion marker check
  - forced teardown path
- Start a new full run from phase-1, no overrides: ./benchmarks/run-branching-research-all.sh
- Let full campaign complete in order:
  - phase1 ops
  - phase2 ext4 control
  - phase2 runtime
  - phase2 scale
  - phase3 ext4 control
  - phase3 runtime
  - phase3 scale
  - phase4 soak (24h)
  - decision package
- Use only the fresh run artifacts for final ranking/report.
- Verify benchmark VM auto-delete after each batch and at end.
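The SSH hang fix in the first step (timeout, completion marker check, forced teardown) might look roughly like the sketch below. It assumes GNU coreutils `timeout` is available locally and treats ops-gates.csv as the completion marker. The function names, the `teardown_vm` hook, and the variables in the wiring comment are hypothetical placeholders, not names from run-branching-hetzner-batch.sh.

```shell
# run_with_deadline: run a command under a hard time limit.
# GNU timeout exits 124 when it had to stop the command
# (137 if the follow-up KILL after 10s was needed).
run_with_deadline() {
  local limit="$1"; shift
  timeout --kill-after=10 "$limit" "$@"
}

# classify_batch: combine the wrapper's exit status with the marker check.
classify_batch() {
  local status="$1" marker_present="$2"
  if [ "$status" -eq 0 ] && [ "$marker_present" = yes ]; then
    echo ok
  elif [ "$marker_present" = yes ]; then
    echo finished-but-ssh-hung
  else
    echo failed
  fi
}

# Hypothetical wiring inside the batch script:
#   trap teardown_vm EXIT            # forced teardown on every exit path
#   run_with_deadline 2h ssh -o BatchMode=yes -o ServerAliveInterval=30 \
#     -o ServerAliveCountMax=4 "$host" "$remote_cmd"
#   status=$?
#   if run_with_deadline 60 ssh "$host" test -f "$remote_dir/ops-gates.csv"; then
#     marker=yes
#   else
#     marker=no
#   fi
#   classify_batch "$status" "$marker"
```

Keeping the marker check separate from the SSH exit status means a run whose remote work finished but whose session never returned can be told apart from a real failure, and artifacts can still be collected before teardown.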
done when
- full campaign completes from phase-1 without artifact reuse
- final report generated in new pipeline dir
- no leftover benchmark VM in Hetzner
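The last condition can be checked with a small filter over the server list. This is a sketch under assumptions: `leftover_bench_vms` is a hypothetical helper, and the way server names are produced (for example via the hcloud CLI) should be confirmed against the installed version before relying on it.

```shell
# leftover_bench_vms: read server names (one per line) on stdin, print any
# frost-pg-bench-* names, and succeed only when none are found.
leftover_bench_vms() {
  local leftovers
  leftovers=$(grep -E '^frost-pg-bench-' || true)
  if [ -n "$leftovers" ]; then
    printf '%s\n' "$leftovers"
    return 1
  fi
}

# Possible usage, assuming the hcloud CLI is configured for the project:
#   hcloud server list -o noheader -o columns=name | leftover_bench_vms
```

Running the check after each batch as well as at the end of the campaign would also cover the auto-delete verification step in the restart plan.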