fix(deploy): restrict firewall defaults, remove hardcoded Grafana password, bound Prometheus retention #9

Open
wpank wants to merge 1 commit into Nunchi-trade:main from wpank:fix/deploy-security-firewall-grafana-prometheus

wpank commented May 30, 2026

Summary

  • #181 -- Firewall trusted_ips open to internet: Changed trusted_ips default from 0.0.0.0/0 to 127.0.0.1/32 (localhost-only). Added a preflight Ansible assertion in the firewall role that fails the play if 0.0.0.0/0 is ever configured, preventing accidental re-introduction.

  • #182 -- Grafana admin password hardcoded as "admin": Removed the grafana_admin_password: admin default from group_vars/devnet.yml. The observe role now fails early if the password is undefined, shorter than 8 characters, or still "admin". The docker-compose Grafana service uses ${GF_SECURITY_ADMIN_PASSWORD:?...} so compose itself refuses to start without an explicit password.

  • #186 -- Prometheus retention unbounded: Added --storage.tsdb.retention.time=15d and --storage.tsdb.retention.size=10GB flags to the Prometheus command in docker-compose, preventing unbounded disk growth on long-running devnets.

  • #187 -- No alerting destination configured: Added an Alertmanager service to docker-compose (hardened with read_only, no-new-privileges, cap_drop ALL), created docker/config/alertmanager.yml with placeholder webhook receivers and inhibition rules, wired Prometheus alerting config to forward to Alertmanager, and added an Alertmanager health check to the observe Ansible role.

Closes #181, #182, #186, #187.
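
For reviewers who want a picture of the two preflight checks described above, here is a minimal sketch of how such assertions might be written with `ansible.builtin.assert`. It is not copied from the diff: the task names and messages are hypothetical, and it assumes `trusted_ips` is a list of CIDR strings.

```yaml
# Hypothetical tasks; names, messages, and file placement are illustrative only.
# Assumes trusted_ips is a list of CIDR strings.
- name: Refuse to open the firewall to the whole internet
  ansible.builtin.assert:
    that:
      - trusted_ips is defined
      - "'0.0.0.0/0' not in trusted_ips"
    fail_msg: "trusted_ips must not contain 0.0.0.0/0; list explicit CIDRs instead."

- name: Require an explicit, non-default Grafana admin password
  ansible.builtin.assert:
    that:
      # Conditions are checked in order, so the length test only runs
      # once the variable is known to be defined.
      - grafana_admin_password is defined
      - grafana_admin_password | length >= 8
      - grafana_admin_password != 'admin'
    fail_msg: "Set grafana_admin_password to at least 8 characters, and not 'admin'."
```

Placing checks like these at the top of each role makes the play fail before any firewall rule or container is touched.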

Test plan

  • Verify ansible-playbook provision.yml fails if trusted_ips contains 0.0.0.0/0
  • Verify ansible-playbook observe.yml fails without grafana_admin_password or with admin
  • Verify docker compose --profile observability config shows retention flags on Prometheus
  • Verify docker compose --profile observability config shows alertmanager service
  • Verify GF_SECURITY_ADMIN_PASSWORD is required (compose errors without it)
  • Deploy to a test devnet and confirm all services come up healthy
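
As a rough guide to what the `docker compose --profile observability config` checks above should surface, the relevant services might look something like the excerpt below. This is a sketch under assumptions: the image names, the config path, and the error text after `:?` are hypothetical, and only the retention flags, the `:?` expansion, and the hardening options come from this PR's description.

```yaml
# Hypothetical excerpt; images, paths, and profile wiring are assumptions.
services:
  prometheus:
    image: prom/prometheus
    profiles: ["observability"]
    command:
      - --config.file=/etc/prometheus/prometheus.yml
      # Cap the TSDB by age and by size.
      - --storage.tsdb.retention.time=15d
      - --storage.tsdb.retention.size=10GB

  grafana:
    image: grafana/grafana
    profiles: ["observability"]
    environment:
      # ${VAR:?message} makes compose abort with the message when VAR is unset or empty.
      GF_SECURITY_ADMIN_PASSWORD: "${GF_SECURITY_ADMIN_PASSWORD:?set GF_SECURITY_ADMIN_PASSWORD}"

  alertmanager:
    image: prom/alertmanager
    profiles: ["observability"]
    read_only: true
    security_opt:
      - no-new-privileges:true
    cap_drop:
      - ALL
```

With a file shaped like this, running the `config` command without `GF_SECURITY_ADMIN_PASSWORD` exported should exit with the message given after `:?` rather than printing the resolved configuration.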

🤖 Generated with Claude Code

…sword, bound Prometheus retention

- Change trusted_ips default from 0.0.0.0/0 to 127.0.0.1/32 so RPC,
  metrics, Prometheus, and Grafana are not exposed to the internet
  out of the box. Add a preflight assertion in the firewall role that
  rejects 0.0.0.0/0 to prevent accidental re-introduction. (Closes #181)

- Remove the hardcoded grafana_admin_password: admin from group_vars.
  The observe role now fails early if the password is unset, shorter
  than 8 characters, or still "admin". The docker-compose Grafana
  service uses a :? parameter expansion so compose itself also refuses
  to start without an explicit password. (Closes #182)

- Add --storage.tsdb.retention.time=15d and
  --storage.tsdb.retention.size=10GB to the Prometheus command so TSDB
  cannot grow unbounded on long-running devnets. (Closes #186)

- Add Alertmanager service (docker-compose), alertmanager.yml config
  with placeholder webhook receivers, wire Prometheus alerting section
  to forward firing alerts, and add Alertmanager health check to the
  observe Ansible role. (Closes #187)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
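
For context on the Alertmanager wiring described in the commit message, a placeholder `alertmanager.yml` and the matching Prometheus `alerting` section might look roughly like this. Everything specific here is an assumption for illustration: the receiver name, the webhook URL, the inhibition rule, and the `alertmanager:9093` target (a compose service named `alertmanager` on its default port).

```yaml
# Hypothetical docker/config/alertmanager.yml; receiver name and URL are placeholders.
route:
  receiver: default-webhook
  group_by: ["alertname"]

receivers:
  - name: default-webhook
    webhook_configs:
      - url: http://example.invalid/alert-hook  # placeholder, replace before use

inhibit_rules:
  # While a critical alert fires, mute the warning with the same alert name.
  - source_matchers: ['severity="critical"']
    target_matchers: ['severity="warning"']
    equal: ["alertname"]
---
# Hypothetical addition to prometheus.yml; assumes the compose service is
# named alertmanager and listens on its default port 9093.
alerting:
  alertmanagers:
    - static_configs:
        - targets: ["alertmanager:9093"]
```

The two documents are separate files in practice; they are shown together only to make the hand-off from Prometheus to Alertmanager visible in one place.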