Overview
Design and implement a robust, self-healing backup service that ensures all platform databases are continuously backed up to multiple independent storage locations. In essence, the GitHub Actions backup scripts currently spread across the repos are replaced by a single service that the Cloud API can also use later.
Current State
We have basic backup workflows in place:
- service-secrets: Infisical API export → age encryption → Storacha + GitHub Artifacts
- service-cloud-api: PostgreSQL backup workflow (needs sidecar implementation)
- service-auth: SQLite backup workflow (needs sidecar implementation)
Current Limitations:
- PostgreSQL/SQLite are internal to Akash deployments (not accessible from GitHub Actions)
- No automatic verification that backups are valid
- No alerting on backup failures
- Manual restore process
- No backup rotation/retention management
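As an illustration of what closing the rotation/retention gap could look like, here is a minimal sketch of a retention rule. The function name `select_expired` and the defaults (30 days, always keep the newest 7) are assumptions made for this example, not values taken from the existing workflows.

```python
from datetime import datetime, timedelta, timezone


def select_expired(
    timestamps: list[datetime],
    now: datetime,
    keep_days: int = 30,   # illustrative default, not a decided policy
    min_keep: int = 7,     # illustrative default, not a decided policy
) -> list[datetime]:
    """Return the backup timestamps that are safe to delete.

    A backup expires once it is older than keep_days, except that the
    newest min_keep backups are always retained, even if they are old.
    """
    newest_first = sorted(timestamps, reverse=True)
    protected = set(newest_first[:min_keep])
    cutoff = now - timedelta(days=keep_days)
    return [ts for ts in newest_first if ts < cutoff and ts not in protected]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    history = [now - timedelta(days=i) for i in range(40)]
    for ts in select_expired(history, now):
        print("would delete backup from", ts.date())
```

Protecting the newest backups regardless of age means a sidecar that has been failing for a long time never causes the last good backups to be rotated away.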
Requirements
Core Features
Advanced Features
Technical Design
Architecture
┌─────────────────────────────────────────────────────────────┐
│                      Akash Deployment                       │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────────┐  │
│  │     App     │    │  Database   │    │ Backup Sidecar  │  │
│  │    (API)    │◄──►│ (PostgreSQL)│◄──►│ (pg_dump daily) │  │
│  └─────────────┘    └─────────────┘    └────────┬────────┘  │
└─────────────────────────────────────────────────┼───────────┘
                                                  │
                    ┌─────────────────────────────┼─────────────────────────────┐
                    │                             │                             │
                    ▼                             ▼                             ▼
               ┌──────────┐               ┌──────────────┐               ┌─────────────┐
               │ Storacha │               │    GitHub    │               │   Backup    │
               │  (IPFS)  │               │  Artifacts   │               │   Catalog   │
               └──────────┘               └──────────────┘               └─────────────┘
Backup Sidecar Container
- Alpine Linux + pg_dump/sqlite3 + age + w3 CLI
- Cron schedule: 3 AM UTC daily
- Process: dump → compress → encrypt → upload to Storacha → POST metadata to catalog
- Health endpoint: /health for monitoring
- Metrics: backup size, duration, success/failure
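The dump → compress → encrypt → upload flow listed above could be orchestrated roughly as follows. This is a sketch under stated assumptions, not the sidecar's actual implementation: the helper names (`dump_command`, `run_backup`), the file naming, and the choice of plain-format `pg_dump` followed by gzip are all illustrative. It also assumes `pg_dump`, `sqlite3`, `age`, and the `w3` CLI are on the PATH and that `w3` is already authorized. The catalog POST is left out because the catalog API is not specified yet.

```python
import gzip
import shutil
import subprocess
from pathlib import Path
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], None]


def run_checked(cmd: Sequence[str]) -> None:
    """Run an external command and raise if it exits non-zero."""
    subprocess.run(list(cmd), check=True)


def dump_command(engine: str, source: str, out: Path) -> list[str]:
    """Build the dump command for one engine (hypothetical helper)."""
    if engine == "postgres":
        # source is a connection URI; plain SQL output, compressed by gzip below
        return ["pg_dump", "--dbname", source, "--format=plain", "--file", str(out)]
    if engine == "sqlite":
        # source is the database file; .backup writes a consistent copy
        return ["sqlite3", source, f".backup '{out}'"]
    raise ValueError(f"unsupported engine: {engine}")


def gzip_file(src: Path) -> Path:
    """Compress src to src.gz and return the new path."""
    dst = src.with_name(src.name + ".gz")
    with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst


def run_backup(
    engine: str,
    source: str,
    workdir: Path,
    age_recipient: str,
    run: Runner = run_checked,
) -> Path:
    """Dump, compress, encrypt, and upload; return the encrypted file path."""
    dump = workdir / f"{engine}.dump"
    run(dump_command(engine, source, dump))
    compressed = gzip_file(dump)
    encrypted = compressed.with_name(compressed.name + ".age")
    run(["age", "--recipient", age_recipient, "--output", str(encrypted), str(compressed)])
    # Remove plaintext intermediates so only the encrypted artifact remains on disk.
    dump.unlink(missing_ok=True)
    compressed.unlink(missing_ok=True)
    run(["w3", "up", str(encrypted)])
    return encrypted
```

Passing the command runner in as a parameter keeps the sequence testable without a live database, and any failing step raises before the upload is attempted.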
GitHub Secrets Required (per service)
- AGE_PUBLIC_KEY: age public key (recipient) used to encrypt backups
- W3_PRINCIPAL: Storacha authentication
- W3_PROOF: Storacha delegation proof
- INFISICAL_CLIENT_*: For fetching other secrets
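A sidecar would likely want to fail fast when one of these secrets is missing. The sketch below assumes the secrets reach the container as environment variables under the same names. The `BackupSecrets` class and `load_secrets` function are hypothetical, and the `INFISICAL_CLIENT_*` variables are skipped because their exact names are not given here.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Names taken from the list above; INFISICAL_CLIENT_* is intentionally not expanded.
REQUIRED = ("AGE_PUBLIC_KEY", "W3_PRINCIPAL", "W3_PROOF")


@dataclass(frozen=True)
class BackupSecrets:
    age_public_key: str
    w3_principal: str
    w3_proof: str


def load_secrets(env: Mapping[str, str] = os.environ) -> BackupSecrets:
    """Read the required secrets, reporting every missing one in a single error."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing required secrets: " + ", ".join(missing))
    return BackupSecrets(
        age_public_key=env["AGE_PUBLIC_KEY"],
        w3_principal=env["W3_PRINCIPAL"],
        w3_proof=env["W3_PROOF"],
    )
```

Reporting all missing names at once avoids a fix-one-redeploy-find-the-next loop when a new service is onboarded.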
Implementation Steps
- Create backup sidecar Docker image
- Update Akash SDLs to include sidecar
- Implement backup catalog service (can be simple JSON in Storacha initially)
- Add backup verification job
- Add alerting integration
- Document restore procedures
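For the "simple JSON" catalog and the verification job, one possible starting point is a list of entries with a checksum per backup. The field names and helpers below (`CatalogEntry`, `append_entry`, `verify`) are assumptions for illustration. The check only confirms that a downloaded artifact matches what was recorded; a fuller verification job would presumably also decrypt the backup and restore it into a scratch database.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class CatalogEntry:
    service: str       # e.g. "service-auth"
    created_at: str    # ISO 8601 timestamp, UTC
    cid: str           # content identifier reported by the upload
    sha256: str        # hex digest of the encrypted artifact
    size_bytes: int


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large backups are not loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def append_entry(catalog: Path, entry: CatalogEntry) -> None:
    """Append one entry to a JSON catalog file, creating the file if needed."""
    entries = json.loads(catalog.read_text()) if catalog.exists() else []
    entries.append(asdict(entry))
    catalog.write_text(json.dumps(entries, indent=2))


def verify(artifact: Path, entry: CatalogEntry) -> bool:
    """Check that an artifact matches the size and digest recorded in the catalog."""
    return (
        artifact.stat().st_size == entry.size_bytes
        and sha256_of(artifact) == entry.sha256
    )
```

Recording the digest at upload time gives the verification and alerting steps something concrete to compare against, independent of the storage backend.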
Labels
infrastructure, backup, priority-high