docs: Add HAProxy architecture analysis for EDB PostgreSQL routing#32

Merged
chadmf merged 1 commit into main from docs/haproxy-pgbouncer-architecture
Apr 2, 2026
@chadmf (Collaborator) commented Apr 2, 2026

Summary

Add comprehensive architectural decision record (ADR) for replacing pgBouncer with HAProxy for AAP database connection routing.

Problem

  • Standard EDB reference architecture requires pgBouncer with EFM VIP management
  • AAP has compatibility issues with pgBouncer that prevent its use
  • Need alternative routing solution for AAP containers → PostgreSQL cluster

Solution

New Architecture Pattern:

AAP Containers → HAProxy (DB Router) → PostgreSQL VIP (EFM-managed) → PRIMARY

Clean separation of concerns:

  • EFM: Database-layer VIP failover orchestration (15s)
  • HAProxy: Routing layer with health checks (10s)
  • Total failover detection: ~25s (well within 5-minute RTO)

Changes

New Documentation

  • docs/haproxy-pgbouncer-architectural-analysis.md (500+ lines)
    • Architecture comparison: HAProxy vs pgBouncer
    • Design validation and trade-off analysis
    • Complete implementation guidance
    • External health check script (pg_is_in_recovery() validation)
    • Monitoring strategies and alerting
    • Long-term supportability considerations

Updated Documentation

  • docs/aap-containerized-enterprise-dr-architecture.md
    • Updated HAProxy configuration and network topology
    • Revised AAP inventory files (pg_host points to HAProxy)
    • Updated component specifications and naming

Other

  • .gitignore: Add .pub pattern

Key Architectural Trade-offs

| Aspect | Impact | Mitigation |
|---|---|---|
| No connection pooling | +67% PostgreSQL connections | Increase max_connections to 2500, add RAM (32GB → 48GB) |
| Query latency | ~0.5ms lower (TCP-level vs protocol-level proxying) | ✅ Benefit |
| Operational complexity | Simpler (standard HAProxy) | ✅ Benefit |
| Failover detection | +5-10s lag behind EFM | Acceptable for RTO |
| Infrastructure cost | +$300-500/month (RAM increase) | Budget approval needed |
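
The resource figures in this description can be cross-checked with a few lines of arithmetic. The sketch below only uses numbers quoted in this PR (max_connections 1500 → 2500, RAM 32GB → 48GB, shared_buffers 8GB → 12GB); the helper name `pct_increase` is illustrative and not part of the ADR.

```python
# Figures quoted in this PR description.
OLD_MAX_CONNECTIONS, NEW_MAX_CONNECTIONS = 1500, 2500
OLD_RAM_GB, NEW_RAM_GB = 32, 48
OLD_SHARED_BUFFERS_GB, NEW_SHARED_BUFFERS_GB = 8, 12


def pct_increase(old: float, new: float) -> float:
    """Return the relative increase from old to new, in percent."""
    return (new - old) / old * 100.0


connections_pct = pct_increase(OLD_MAX_CONNECTIONS, NEW_MAX_CONNECTIONS)
ram_pct = pct_increase(OLD_RAM_GB, NEW_RAM_GB)

# shared_buffers as a fraction of RAM, before and after the change.
old_ratio = OLD_SHARED_BUFFERS_GB / OLD_RAM_GB
new_ratio = NEW_SHARED_BUFFERS_GB / NEW_RAM_GB

print(f"max_connections: +{connections_pct:.0f}%")
print(f"RAM per node:    +{ram_pct:.0f}%")
print(f"shared_buffers / RAM: {old_ratio:.0%} -> {new_ratio:.0%}")
```

The connection increase rounds to the +67% in the table, and shared_buffers stays at the same fraction of RAM before and after the change.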

Critical Implementation Requirements

  1. PostgreSQL Resource Increase:

    • max_connections: 1500 → 2500
    • RAM per node: 32GB → 48GB
    • shared_buffers: 8GB → 12GB
  2. HAProxy External Health Check:

    • Script validates pg_is_in_recovery() = false (writable node)
    • 5-second check interval with rise/fall thresholds
  3. Monitoring:

    • PostgreSQL connection count (alert at 2000/2500)
    • HAProxy backend health status
    • Replication lag (existing)
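
As an illustration of requirement 2, here is a minimal sketch of an external health check, not the script shipped in the ADR. It assumes `psql` is on the PATH, that a hypothetical `haproxy_check` role can connect to the `postgres` database, and that authentication is handled elsewhere (for example through a password file). HAProxy invokes external checks with the proxy address and port followed by the server address and port, and treats exit code 0 as healthy.

```python
import subprocess
from typing import Callable, Sequence


def is_writable(psql_output: str) -> bool:
    """psql -tA prints 'f' when pg_is_in_recovery() is false (a primary)."""
    return psql_output.strip() == "f"


def run_psql(host: str, port: str) -> str:
    """Ask the server whether it is in recovery; return psql's raw stdout."""
    result = subprocess.run(
        # ASSUMPTION: role name and database are placeholders.
        ["psql", "-h", host, "-p", port, "-U", "haproxy_check",
         "-d", "postgres", "-tAc", "SELECT pg_is_in_recovery()"],
        capture_output=True, text=True, timeout=3, check=True,
    )
    return result.stdout


def check(argv: Sequence[str],
          query: Callable[[str, str], str] = run_psql) -> int:
    """Return 0 only when the target server is a writable primary."""
    # HAProxy passes: <proxy_addr> <proxy_port> <server_addr> <server_port>
    if len(argv) < 5:
        return 2  # usage error
    host, port = argv[3], argv[4]
    try:
        return 0 if is_writable(query(host, port)) else 1
    except (subprocess.SubprocessError, OSError):
        # Timeout, non-zero psql exit, or psql missing: report unhealthy.
        return 1
```

To use this as a standalone script, end the file with `import sys` and `sys.exit(check(sys.argv))`, then reference it from the backend with `option external-check` and `external-check command`. The `query` parameter exists only so the decision logic can be exercised without a live database.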

Impact on RTO/RPO

  • RTO: < 5 minutes ✅ (maintained)
  • RPO: < 5 seconds ✅ (maintained)
  • Failover detection: EFM (15s) + HAProxy health check (10s) = 25s
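
The detection budget above can be written out as a small calculation. This sketch assumes the 10s HAProxy figure comes from the 5-second check interval multiplied by a fall threshold of 2; the fall value is an assumption, since this description only says rise/fall thresholds are used. It also treats the two stages as strictly sequential, which is the conservative reading of the sum quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailoverBudget:
    efm_detection_s: float   # EFM failover time (15s in this PR)
    check_interval_s: float  # HAProxy check interval (5s in this PR)
    fall_count: int          # ASSUMPTION: failed checks before marking down
    rto_s: float             # recovery time objective (5 minutes)

    @property
    def haproxy_detection_s(self) -> float:
        """Time for HAProxy to mark a backend down after it stops passing."""
        return self.check_interval_s * self.fall_count

    @property
    def total_detection_s(self) -> float:
        """Sequential worst case: EFM failover, then HAProxy detection."""
        return self.efm_detection_s + self.haproxy_detection_s

    @property
    def rto_headroom_s(self) -> float:
        """Seconds of RTO left after failover detection."""
        return self.rto_s - self.total_detection_s


budget = FailoverBudget(
    efm_detection_s=15.0, check_interval_s=5.0, fall_count=2, rto_s=300.0
)
print(f"HAProxy detection: {budget.haproxy_detection_s:.0f}s")
print(f"Total detection:   {budget.total_detection_s:.0f}s")
print(f"RTO headroom:      {budget.rto_headroom_s:.0f}s")
```

With these inputs the total matches the 25s stated above, leaving 275s of the 5-minute RTO for everything else in the recovery path.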

Test Plan

  • Review architectural decision and trade-offs
  • Validate HAProxy configuration syntax
  • Review external health check script logic
  • Confirm PostgreSQL resource requirements
  • Validate integration with EFM failover
  • Review monitoring and alerting strategy
  • Assess long-term supportability concerns

Related Issues

Addresses AAP/pgBouncer compatibility constraints while maintaining EDB PostgreSQL high availability requirements.


🤖 Generated with Claude Code

Add comprehensive architectural decision record (ADR) for replacing
pgBouncer with HAProxy for AAP database connection routing due to
AAP/pgBouncer compatibility issues.

Changes:
- Add haproxy-pgbouncer-architectural-analysis.md: 500+ line ADR
  covering architecture comparison, design validation, implementation
  guidance, health check scripts, and trade-off analysis
- Update aap-containerized-enterprise-dr-architecture.md: Revise
  HAProxy configuration, network topology, and inventory files to
  reflect HAProxy database router pattern
- Update .gitignore: Add .pub pattern

Key architectural decision:
- HAProxy routes AAP containers to PostgreSQL VIP (EFM-managed)
- External health check validates writable node via pg_is_in_recovery()
- Clean separation: EFM handles DB failover, HAProxy handles routing
- Trade-off: Requires +67% max_connections (no pooling) but simpler ops

RTO/RPO impact: Failover detection ~25s (well within 5min target)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@chadmf chadmf requested a review from nickarellano April 2, 2026 16:26
@chadmf chadmf merged commit d50c4e3 into main Apr 2, 2026
8 checks passed