diff --git a/.gitignore b/.gitignore index b12730e2..4c1b874d 100644 --- a/.gitignore +++ b/.gitignore @@ -1,3 +1,4 @@ .DS_STORE Chart.lock +charts/ .claude \ No newline at end of file diff --git a/cassandra/icon.png b/cassandra/icon.png new file mode 100644 index 00000000..653cc6c6 Binary files /dev/null and b/cassandra/icon.png differ diff --git a/cassandra/versions/1.0.0/Chart.yaml b/cassandra/versions/1.0.0/Chart.yaml new file mode 100644 index 00000000..6fe2b3ca --- /dev/null +++ b/cassandra/versions/1.0.0/Chart.yaml @@ -0,0 +1,17 @@ +apiVersion: v2 +name: cassandra +description: Cassandra cluster for Control Plane +type: application +version: 1.0.0 +appVersion: "5.0" + +annotations: + created: "2026-05-18" + lastModified: "2026-05-18" + category: "database" + createsGvc: false + +dependencies: + - name: cpln-common + version: 1.0.0 + repository: "oci://ghcr.io/controlplane-com/templates" diff --git a/cassandra/versions/1.0.0/README.md b/cassandra/versions/1.0.0/README.md new file mode 100644 index 00000000..8913d436 --- /dev/null +++ b/cassandra/versions/1.0.0/README.md @@ -0,0 +1,224 @@ +# Cassandra + +This app deploys a Cassandra 5.0 cluster in a single location. Each node runs as a stateful replica with its own persistent volume, forming a peer-to-peer cluster that distributes and replicates data across nodes according to the configured replication factor. The template includes optional scheduled backups (logical or physical) and periodic anti-entropy repair. 
+ +## Architecture + +- **Cassandra cluster**: Multi-node cluster deployed in a single location where each node owns a slice of the token ring and replicates data to peers +- **Per-node volumes**: Each node gets its own persistent volume so SSTable data survives restarts +- **Repair** (optional): Scheduled cron job that runs `nodetool repair` across all nodes to keep data consistent +- **Backup** (optional): Logical (`cqlsh COPY TO`) or physical (`nodetool snapshot`) backup to S3 or GCS + +## Configuration + +### Core Settings + +```yaml +replicas: 3 # Number of Cassandra nodes +replicationFactor: 3 # Copies of each partition stored across the cluster + # Must not exceed replicas + +superuserPassword: supersecretpassword # Built-in cassandra superuser password +username: username # Application user +password: password # Application user password +keyspaceName: mydatabase # Keyspace created on startup + +image: cassandra:5.0 +cpu: 1 +memory: 4Gi +jvmHeapSize: 2G # Set to ~50% of memory — Cassandra needs the rest for off-heap cache +clusterName: my-cassandra +``` + +**Volume** — set the initial storage capacity and optionally enable autoscaling: + +```yaml +volumes: + data: + initialCapacity: 10 # GiB + autoscaling: + maxCapacity: 100 + minFreePercentage: 20 + scalingFactor: 1.5 +``` + +Configure which workloads can reach Cassandra: + +```yaml +internal_access: + type: same-gvc # Options: same-gvc, same-org, workload-list + workloads: + # Uncomment and specify workloads if using workload-list + #- //gvc/GVC_NAME/workload/WORKLOAD_NAME +``` + +- `same-gvc`: Allow access from all workloads in the same GVC +- `same-org`: Allow access from all workloads in the org +- `workload-list`: Allow access only from specified workloads + +## Connecting + +Each Cassandra replica is reachable via its own DNS name: + +``` +Host: {release-name}-cassandra-{n}.{gvc}.cpln.local +Port: 9042 (CQL, native transport) +Username: {username} +Password: {password} +Keyspace: {keyspaceName} +``` 
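As a minimal sketch of the hostname pattern above, the helper below prints one contact point per replica. The function name `contact_points` and the sample release, GVC, and replica count are hypothetical placeholders, not part of the template. Replica indices are assumed to start at 0, which matches the init script's `seq 0 ...` loop.

```shell
# Hypothetical helper (not part of the chart): prints the per-replica
# hostnames that follow {release-name}-cassandra-{n}.{gvc}.cpln.local.
contact_points() {
  release="$1"   # Helm release name (placeholder)
  gvc="$2"       # GVC name (placeholder)
  replicas="$3"  # value of `replicas` in values.yaml
  i=0
  while [ "$i" -lt "$replicas" ]; do
    printf '%s-cassandra-%s.%s.cpln.local\n' "$release" "$i" "$gvc"
    i=$((i + 1))
  done
}

# Example: list contact points for a 3-node release named "my-release".
contact_points my-release my-gvc 3
```

The resulting hostnames can be passed to a driver as its contact-point list, together with port 9042 and the application credentials.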
+ +Provide multiple node hostnames as contact points in your application so it can discover the full cluster topology. + +## Replicas vs Replication Factor + +These are two separate settings that work together: + +- **`replicas`** — how many Cassandra nodes are deployed. More nodes means more capacity and better throughput, as the token ring is split across more nodes. +- **`replicationFactor`** — how many copies of each partition are stored across the cluster. A replication factor of 3 means every row exists on 3 different nodes, so `QUORUM` reads and writes (2 of 3 replicas) keep succeeding with one node down. + +`replicationFactor` must not exceed `replicas` — you cannot store 3 copies of data across only 2 nodes. + +## Multi-Zone + +When `multiZone.enabled: true`, Control Plane spreads replicas across availability zones within the location: + +```yaml +multiZone: + enabled: true +``` + +With a replication factor of 3 across 3 zones, each zone holds one copy of every partition. The cluster survives a complete zone outage with no data loss, provided your client uses `LOCAL_QUORUM` consistency (reads and writes succeed with responses from the surviving 2 zones). + +Verify your selected location supports multi-zone before enabling this option. + +## Repair + +Cassandra uses eventual consistency — when nodes miss writes during downtime, data can drift out of sync. `nodetool repair` runs an anti-entropy process that compares and reconciles data across all replicas. Repair must complete across all nodes at least once within `gc_grace_seconds` (default: 10 days) to prevent deleted data from reappearing. + +The template includes a scheduled repair cron job: + +```yaml +repair: + enabled: true + schedule: "0 2 * * 0" # Weekly, Sunday at 2am UTC +``` + +The default weekly schedule satisfies the 10-day `gc_grace_seconds` requirement with margin. Do not disable repair in production or increase the interval beyond 10 days. 
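The margin mentioned above can be worked out with simple arithmetic. This is a sketch only: the function name `repair_margin_days` is hypothetical, and the two inputs are the 10-day `gc_grace_seconds` default and the 7-day interval implied by the `0 2 * * 0` schedule.

```shell
# Hypothetical helper: whole days left between the repair interval and
# gc_grace_seconds. Both arguments are in seconds.
repair_margin_days() {
  gc_grace="$1"
  interval="$2"
  echo $(( (gc_grace - interval) / 86400 ))
}

GC_GRACE_SECONDS=$((10 * 24 * 3600))        # 10 days, the default cited above
REPAIR_INTERVAL_SECONDS=$((7 * 24 * 3600))  # weekly schedule "0 2 * * 0"

repair_margin_days "$GC_GRACE_SECONDS" "$REPAIR_INTERVAL_SECONDS"  # prints 3
```

The remaining days have to absorb the repair run itself and any failed run that needs a retry, so a longer interval or a shorter `gc_grace_seconds` leaves less slack.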
+ +Repair can be resource-intensive on large datasets. If it impacts query performance, consider running it during low-traffic windows or increasing node resources. + +## Backing Up + +Two backup modes are available: + +- **Logical** — exports keyspace tables as CSVs using `cqlsh COPY TO`, then uploads to cloud storage. Runs as a standalone cron workload on schedule. Suitable for smaller datasets or when portability matters. +- **Physical** — creates SSTable snapshots using `nodetool snapshot` and syncs them to cloud storage. Runs as a sidecar container on each Cassandra replica. Faster and more space-efficient for large datasets, but backups are per-node and must be restored node-by-node. + +Set `backup.enabled: true`, choose a `type`, set `backup.provider`, and fill in the corresponding cloud block: + +```yaml +backup: + enabled: true + type: logical # logical or physical + image: ghcr.io/controlplane-com/backup-images/cassandra-backup:5.0 + schedule: "0 2 * * *" # daily at 2am UTC + + resources: + cpu: 250m + memory: 256Mi + + provider: aws # aws or gcp + + aws: + bucket: my-backup-bucket + region: us-east-1 + cloudAccountName: my-backup-cloudaccount + policyName: my-s3-policy + prefix: cassandra/backups + + gcp: + bucket: my-backup-bucket + cloudAccountName: my-cloud-account + prefix: cassandra/backups +``` + +### AWS S3 + +1. Create your S3 bucket. Set `aws.bucket` and `aws.region` to match. + +2. If you do not have a Cloud Account set up, refer to the docs to [Create a Cloud Account](https://docs.controlplane.com/guides/create-cloud-account). Set `aws.cloudAccountName` to match. + +3. 
Create an AWS IAM policy with the following JSON (replace `YOUR_BUCKET_NAME`): + +```json +{ + "Version": "2012-10-17", + "Statement": [ + { + "Effect": "Allow", + "Action": [ + "s3:GetObject", + "s3:PutObject", + "s3:DeleteObject", + "s3:ListBucket", + "s3:GetObjectVersion", + "s3:DeleteObjectVersion" + ], + "Resource": [ + "arn:aws:s3:::YOUR_BUCKET_NAME", + "arn:aws:s3:::YOUR_BUCKET_NAME/*" + ] + } + ] +} +``` + +4. Set `aws.policyName` to the name of the policy created in step 3. + +### GCS + +1. Create your GCS bucket. Set `gcp.bucket` to match. + +2. If you do not have a Cloud Account set up, refer to the docs to [Create a Cloud Account](https://docs.controlplane.com/guides/create-cloud-account). Set `gcp.cloudAccountName` to match. + +**Important**: Add the `Storage Admin` role to the GCP service account created for the Cloud Account. + +## Restoring a Backup + +### Logical Restore + +Exec into the backup cron workload and run `restore.sh` with the timestamp of the backup you want to restore: + +```bash +RESTORE_TIMESTAMP=2026-05-15T02-00-00Z /usr/local/bin/restore.sh +``` + +The timestamp format matches the backup filename in your bucket (e.g. `cassandra/backups/2026-05-15T02-00-00Z/`). + +The script downloads the CSVs for the configured keyspace and replays them into Cassandra using `cqlsh COPY FROM`. Existing rows with matching primary keys are overwritten; rows not in the backup are left in place. + +### Physical Restore + +Physical backups are per-node — each replica backed up its own SSTable slice. To restore, exec into the **backup sidecar container** (not the cassandra container) on each replica that needs to be restored and run: + +```bash +RESTORE_TIMESTAMP=2026-05-15T02-00-00Z /usr/local/bin/restore.sh +``` + +The script downloads the snapshot files for that replica from `{prefix}/{timestamp}/{hostname}/`, writes them to the shared volume, then calls `nodetool import` to load the SSTables into the live Cassandra instance without a restart. 
+ +**Important**: Repeat this on every replica. Because each node owns a different token range, restoring only one replica leaves the cluster with incomplete data. + +## Important Notes + +- **Minimum replicas for production**: Use at least 3 replicas with a replication factor of 3 so the cluster can survive a node failure while still achieving quorum +- **JVM heap**: Set `jvmHeapSize` to approximately 50% of `memory` — Cassandra relies heavily on off-heap memory for bloom filters, row cache, and OS page cache +- **gc_grace_seconds**: The default is 10 days. Ensure repair runs at least once within this window on all nodes, or deleted data may reappear after a node recovers from downtime +- **Scaling up**: Adding replicas after initial deployment does not automatically rebalance data. Run `nodetool rebuild` on new nodes and then `nodetool cleanup` on existing nodes after scaling +- **Multi-zone**: Verify your selected location supports multi-zone before enabling + +## Supported External Services + +- [Cassandra Documentation](https://cassandra.apache.org/doc/latest/) +- [Cassandra Driver Documentation](https://docs.datastax.com/en/developer/driver-matrix/doc/common/driverMatrix.html) diff --git a/cassandra/versions/1.0.0/templates/_helpers.tpl b/cassandra/versions/1.0.0/templates/_helpers.tpl new file mode 100644 index 00000000..8dc8cf0c --- /dev/null +++ b/cassandra/versions/1.0.0/templates/_helpers.tpl @@ -0,0 +1,80 @@ +{{/* Resource Naming */}} + +{{- define "cassandra.workload.name" -}} +{{- printf "%s-cassandra" .Release.Name }} +{{- end }} + +{{- define "cassandra.secret.init.name" -}} +{{- printf "%s-cassandra-init" .Release.Name }} +{{- end }} + +{{- define "cassandra.secret.config.name" -}} +{{- printf "%s-cassandra-config" .Release.Name }} +{{- end }} + +{{- define "cassandra.identity.name" -}} +{{- printf "%s-cassandra-identity" .Release.Name }} +{{- end }} + +{{- define "cassandra.policy.name" -}} +{{- printf "%s-cassandra-policy" .Release.Name }} +{{- 
end }} + +{{- define "cassandra.volumeset.name" -}} +{{- printf "%s-cassandra-data" .Release.Name }} +{{- end }} + +{{- define "cassandra.secret.credentials.name" -}} +{{- printf "%s-cassandra-credentials" .Release.Name }} +{{- end }} + +{{- define "cassandra.workload.repair.name" -}} +{{- printf "%s-cassandra-repair" .Release.Name }} +{{- end }} + +{{- define "cassandra.workload.backup.name" -}} +{{- printf "%s-cassandra-backup" .Release.Name }} +{{- end }} + + +{{/* Validation */}} + +{{- define "cassandra.validate" -}} +{{- if gt (.Values.replicationFactor | int) (.Values.replicas | int) }} +{{- fail (printf "replicationFactor (%d) cannot exceed replicas (%d)" (.Values.replicationFactor | int) (.Values.replicas | int)) }} +{{- end }} +{{- if .Values.backup.enabled }} + {{- if not (or (eq .Values.backup.type "logical") (eq .Values.backup.type "physical")) }} + {{- fail (printf "backup.type must be 'logical' or 'physical', got: %s" .Values.backup.type) }} + {{- end }} + {{- if not (or (eq .Values.backup.provider "aws") (eq .Values.backup.provider "gcp")) }} + {{- fail (printf "backup.provider must be 'aws' or 'gcp', got: %s" .Values.backup.provider) }} + {{- end }} + {{- if eq .Values.backup.provider "aws" }} + {{- if not .Values.backup.aws.cloudAccountName }} + {{- fail "backup.aws.cloudAccountName is required when backup.provider is aws" }} + {{- end }} + {{- if not .Values.backup.aws.policyName }} + {{- fail "backup.aws.policyName is required when backup.provider is aws" }} + {{- end }} + {{- if not .Values.backup.aws.bucket }} + {{- fail "backup.aws.bucket is required when backup.provider is aws" }} + {{- end }} + {{- end }} + {{- if eq .Values.backup.provider "gcp" }} + {{- if not .Values.backup.gcp.cloudAccountName }} + {{- fail "backup.gcp.cloudAccountName is required when backup.provider is gcp" }} + {{- end }} + {{- if not .Values.backup.gcp.bucket }} + {{- fail "backup.gcp.bucket is required when backup.provider is gcp" }} + {{- end }} + {{- end }} +{{- 
end }} +{{- end }} + + +{{/* Labeling */}} + +{{- define "cassandra.tags" -}} +{{- include "cpln-common.tags" . }} +{{- end }} diff --git a/cassandra/versions/1.0.0/templates/identity.yaml b/cassandra/versions/1.0.0/templates/identity.yaml new file mode 100644 index 00000000..4a4229f8 --- /dev/null +++ b/cassandra/versions/1.0.0/templates/identity.yaml @@ -0,0 +1,22 @@ +kind: identity +name: {{ include "cassandra.identity.name" . }} +description: {{ include "cassandra.workload.name" . }} identity +tags: {{- include "cassandra.tags" . | nindent 2 }} +{{- if and .Values.backup.enabled (eq .Values.backup.provider "aws") }} +aws: + cloudAccountLink: //cloudaccount/{{ .Values.backup.aws.cloudAccountName }} + policyRefs: + - cpln-connector + - aws::ReadOnlyAccess + - {{ .Values.backup.aws.policyName | quote }} +{{- end }} +{{- if and .Values.backup.enabled (eq .Values.backup.provider "gcp") }} +gcp: + bindings: + - resource: //storage.googleapis.com/projects/_/buckets/{{ .Values.backup.gcp.bucket }} + roles: + - roles/storage.objectAdmin + cloudAccountLink: //cloudaccount/{{ .Values.backup.gcp.cloudAccountName }} + scopes: + - https://www.googleapis.com/auth/cloud-platform +{{- end }} diff --git a/cassandra/versions/1.0.0/templates/policy.yaml b/cassandra/versions/1.0.0/templates/policy.yaml new file mode 100644 index 00000000..40708455 --- /dev/null +++ b/cassandra/versions/1.0.0/templates/policy.yaml @@ -0,0 +1,13 @@ +kind: policy +name: {{ include "cassandra.policy.name" . }} +origin: default +bindings: + - permissions: + - reveal + principalLinks: + - //gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "cassandra.identity.name" . }} +targetKind: secret +targetLinks: + - //secret/{{ include "cassandra.secret.init.name" . }} + - //secret/{{ include "cassandra.secret.config.name" . }} + - //secret/{{ include "cassandra.secret.credentials.name" . 
}} diff --git a/cassandra/versions/1.0.0/templates/secret-config.yaml b/cassandra/versions/1.0.0/templates/secret-config.yaml new file mode 100644 index 00000000..19fe8c90 --- /dev/null +++ b/cassandra/versions/1.0.0/templates/secret-config.yaml @@ -0,0 +1,39 @@ +kind: secret +name: {{ include "cassandra.secret.config.name" . }} +type: opaque +data: + encoding: plain + payload: | + cluster_name: '{{ .Values.clusterName }}' + + # Networking — replaced at pod startup by the init script. + # listen_address and broadcast_address both use the pod's own FQDN (resolves + # to pod IP). Within a single location all pods share the same cluster network. + listen_address: LISTEN_ADDRESS_PLACEHOLDER + broadcast_address: LISTEN_ADDRESS_PLACEHOLDER + storage_port: 9043 + rpc_address: 0.0.0.0 + broadcast_rpc_address: LISTEN_ADDRESS_PLACEHOLDER + + # Seeds — replaced at pod startup by the init script + seed_provider: + - class_name: org.apache.cassandra.locator.SimpleSeedProvider + parameters: + - seeds: "SEEDS_PLACEHOLDER" + + # Required directives (Cassandra 5.x has no built-in fallback for these) + partitioner: org.apache.cassandra.dht.Murmur3Partitioner + num_tokens: 16 + commitlog_sync: periodic + commitlog_sync_period: 10000ms + + # Storage + data_file_directories: + - /var/lib/cassandra/data + commitlog_directory: /var/lib/cassandra/commitlog + saved_caches_directory: /var/lib/cassandra/saved_caches + + endpoint_snitch: GossipingPropertyFileSnitch + + authenticator: PasswordAuthenticator + authorizer: CassandraAuthorizer diff --git a/cassandra/versions/1.0.0/templates/secret-credentials.yaml b/cassandra/versions/1.0.0/templates/secret-credentials.yaml new file mode 100644 index 00000000..486ad1a6 --- /dev/null +++ b/cassandra/versions/1.0.0/templates/secret-credentials.yaml @@ -0,0 +1,15 @@ +kind: secret +name: {{ include "cassandra.secret.credentials.name" . 
}} +type: dictionary +data: + username: {{ .Values.username }} + password: {{ .Values.password }} + keyspace: {{ .Values.keyspaceName }} + superuser-password: {{ .Values.superuserPassword }} +{{- if and .Values.backup.enabled (eq .Values.backup.provider "aws") }} + backup-bucket: {{ .Values.backup.aws.bucket | quote }} + aws-region: {{ .Values.backup.aws.region | quote }} +{{- end }} +{{- if and .Values.backup.enabled (eq .Values.backup.provider "gcp") }} + backup-bucket: {{ .Values.backup.gcp.bucket | quote }} +{{- end }} diff --git a/cassandra/versions/1.0.0/templates/secret-init.yaml b/cassandra/versions/1.0.0/templates/secret-init.yaml new file mode 100644 index 00000000..e842d8ba --- /dev/null +++ b/cassandra/versions/1.0.0/templates/secret-init.yaml @@ -0,0 +1,158 @@ +kind: secret +name: {{ include "cassandra.secret.init.name" . }} +type: opaque +data: + encoding: plain + payload: | + #!/bin/bash + set -euo pipefail + + # Derive own internal FQDN and pod IP from /etc/hosts. + # Control Plane inserts a line like: + # 10.x.x.x cassandra-0.cassandra..svc.cluster.local cassandra-0 + MY_FQDN=$(grep -E "^[0-9]" /etc/hosts | grep "${HOSTNAME}" | awk '{print $2}') + + if [ -z "${MY_FQDN}" ]; then + echo "ERROR: Could not derive FQDN for ${HOSTNAME} from /etc/hosts" + cat /etc/hosts + exit 1 + fi + + MY_IP=$(grep -E "^[0-9]" /etc/hosts | grep "${HOSTNAME}" | awk '{print $1}') + REPLICA_INDEX=$(echo "${HOSTNAME}" | awk -F'-' '{print $NF}') + LOCATION=$(basename "${CPLN_LOCATION}") + WORKLOAD_NAME=$(basename "${CPLN_WORKLOAD}") + GVC_NAME=$(basename "${CPLN_GVC}") + + echo "HOSTNAME: ${HOSTNAME}" + echo "MY_FQDN: ${MY_FQDN}" + echo "MY_IP: ${MY_IP}" + echo "REPLICA_INDEX: ${REPLICA_INDEX}" + echo "LOCATION: ${LOCATION}" + echo "WORKLOAD_NAME: ${WORKLOAD_NAME}" + echo "GVC_NAME: ${GVC_NAME}" + + # Seeds: stable per-replica headless-service FQDNs (K8s stateful-set DNS). 
+ # Strip the pod-specific prefix from MY_FQDN to get the base domain suffix, + # then construct the FQDN for every replica without hard-coding the namespace hash. + FQDN_BASE="${MY_FQDN#${HOSTNAME}.}" + SEEDS="" + for i in $(seq 0 $(( {{ .Values.replicas | int }} - 1 ))); do + SEEDS="${SEEDS}${WORKLOAD_NAME}-${i}.${FQDN_BASE}," + done + SEEDS="${SEEDS%,}" + echo "SEEDS: ${SEEDS}" + + # Cassandra data dir ownership (cassandra runs as uid 999 in the official image). + mkdir -p /var/lib/cassandra + chown -R cassandra:cassandra /var/lib/cassandra 2>/dev/null || true + rm -rf /var/lib/cassandra/lost+found 2>/dev/null || true + + # Write cassandra-rackdc.properties for DC/rack awareness. + # DC = location name (e.g. aws-us-east-2), rack = rack1. + mkdir -p /etc/cassandra + printf 'dc=%s\nrack=rack1\nprefer_local=true\n' "${LOCATION}" > /etc/cassandra/cassandra-rackdc.properties + echo "cassandra-rackdc.properties written: dc=${LOCATION} rack=rack1 prefer_local=true" + + # JMX credentials required for remote nodetool (LOCAL_JMX=no enables auth) + printf 'cassandra %s\n' "{{ .Values.superuserPassword }}" > /etc/cassandra/jmxremote.password + chmod 400 /etc/cassandra/jmxremote.password + printf 'cassandra readwrite\n' > /etc/cassandra/jmxremote.access + chmod 644 /etc/cassandra/jmxremote.access + chown cassandra:cassandra /etc/cassandra/jmxremote.password /etc/cassandra/jmxremote.access 2>/dev/null || true + + # Copy the mounted config template and replace placeholders with runtime values. + # (Heredocs cannot be used inside a Helm YAML payload — << is a YAML merge key.) + cp /cassandra-config/cassandra.yaml /etc/cassandra/cassandra.yaml + sed -i "s|LISTEN_ADDRESS_PLACEHOLDER|${MY_FQDN}|g" /etc/cassandra/cassandra.yaml + sed -i "s|SEEDS_PLACEHOLDER|${SEEDS}|g" /etc/cassandra/cassandra.yaml + + echo "cassandra.yaml written." + + if [ "${REPLICA_INDEX}" = "0" ]; then + # Replica 0 runs bootstrap after Cassandra is ready. 
+ # exec cannot be used here because bootstrap runs post-startup, so we + # background Cassandra, trap SIGTERM for forwarding, and wait on its PID. + _stop() { kill -TERM "${CASS_PID}" 2>/dev/null; wait "${CASS_PID}" 2>/dev/null || true; } + trap _stop TERM INT + # Set HOSTNAME to FQDN so cassandra-env.sh advertises the full FQDN as the + # JMX/RMI callback address, making remote nodetool (repair cron) reachable. + export HOSTNAME="${MY_FQDN}" + if [ "$(id -u)" = "0" ]; then + chown -R cassandra:cassandra /var/lib/cassandra 2>/dev/null || true + gosu cassandra cassandra -f & + else + cassandra -f & + fi + CASS_PID=$! + + echo "Waiting for Cassandra to accept CQL connections..." + ATTEMPTS=0 + until cqlsh 127.0.0.1 9042 -u cassandra -p cassandra -e "SELECT now() FROM system.local" > /dev/null 2>&1 \ + || cqlsh 127.0.0.1 9042 -u cassandra -p "{{ .Values.superuserPassword }}" -e "SELECT now() FROM system.local" > /dev/null 2>&1; do + ATTEMPTS=$(( ATTEMPTS + 1 )) + if [ "${ATTEMPTS}" -ge 60 ]; then + echo "ERROR: Cassandra did not become CQL-ready after 300 seconds" + exit 1 + fi + sleep 5 + done + echo "CQL ready." + + BOOTSTRAP_FLAG="/var/lib/cassandra/.bootstrapped" + if [ -f "${BOOTSTRAP_FLAG}" ]; then + echo "Bootstrap flag found — skipping first-time initialisation." + else + # Wait for all replicas to join before writing any auth data. + # This ensures token ranges are finalised so writes land on the correct + # long-term owners and are not redistributed inconsistently after the fact. + echo "Waiting for all {{ .Values.replicas | int }} replicas to join before bootstrapping..." + WAIT_ATTEMPTS=0 + until [ "$(nodetool status 2>/dev/null | grep -c '^UN')" -eq "{{ .Values.replicas | int }}" ]; do + WAIT_ATTEMPTS=$(( WAIT_ATTEMPTS + 1 )) + if [ "${WAIT_ATTEMPTS}" -ge 60 ]; then + echo "WARN: Timed out waiting for all replicas — proceeding with available nodes" + break + fi + sleep 10 + done + + echo "Running first-time bootstrap..." 
+ + # Detect which cassandra password to use in case a previous partial bootstrap + # already ran ALTER USER before the script exited. + CQLSH_PASS="cassandra" + if ! cqlsh 127.0.0.1 9042 -u cassandra -p cassandra -e "SELECT now() FROM system.local" > /dev/null 2>&1; then + CQLSH_PASS="{{ .Values.superuserPassword }}" + echo "Default cassandra password no longer valid — using configured superuser password." + fi + + BOOTSTRAP_CQL=/tmp/cassandra-bootstrap.cql + printf "ALTER USER cassandra WITH PASSWORD '%s';\n" "{{ .Values.superuserPassword }}" > "${BOOTSTRAP_CQL}" + printf "ALTER KEYSPACE system_auth WITH replication = {'class': 'SimpleStrategy', 'replication_factor': %s};\n" "{{ .Values.replicas | int }}" >> "${BOOTSTRAP_CQL}" + printf "CREATE KEYSPACE IF NOT EXISTS %s WITH replication = {'class': 'SimpleStrategy', 'replication_factor': %s};\n" "{{ .Values.keyspaceName }}" "{{ .Values.replicationFactor | int }}" >> "${BOOTSTRAP_CQL}" + printf "CREATE ROLE IF NOT EXISTS '%s' WITH PASSWORD = '%s' AND LOGIN = true;\n" "{{ .Values.username }}" "{{ .Values.password }}" >> "${BOOTSTRAP_CQL}" + printf "GRANT ALL PERMISSIONS ON KEYSPACE %s TO '%s';\n" "{{ .Values.keyspaceName }}" "{{ .Values.username }}" >> "${BOOTSTRAP_CQL}" + + cqlsh 127.0.0.1 9042 -u cassandra -p "${CQLSH_PASS}" -f "${BOOTSTRAP_CQL}" + rm -f "${BOOTSTRAP_CQL}" + + nodetool repair system_auth \ + && echo "system_auth repair complete." \ + || echo "WARN: system_auth repair did not complete — repair cron will handle it" + + touch "${BOOTSTRAP_FLAG}" + echo "Bootstrap complete: keyspace '{{ .Values.keyspaceName }}', user '{{ .Values.username }}' created." + fi + + wait "${CASS_PID}" + else + # Non-replica-0: exec directly so Cassandra is PID 1. 
+ export HOSTNAME="${MY_FQDN}" + if [ "$(id -u)" = "0" ]; then + chown -R cassandra:cassandra /var/lib/cassandra 2>/dev/null || true + exec gosu cassandra cassandra -f + else + exec cassandra -f + fi + fi diff --git a/cassandra/versions/1.0.0/templates/volumeset.yaml b/cassandra/versions/1.0.0/templates/volumeset.yaml new file mode 100644 index 00000000..9f077cc1 --- /dev/null +++ b/cassandra/versions/1.0.0/templates/volumeset.yaml @@ -0,0 +1,12 @@ +kind: volumeset +name: {{ include "cassandra.volumeset.name" . }} +description: {{ include "cassandra.workload.name" . }} data +tags: {{- include "cassandra.tags" . | nindent 2 }} +spec: + initialCapacity: {{ .Values.volumes.data.initialCapacity }} + performanceClass: general-purpose-ssd + fileSystemType: ext4 + autoscaling: + maxCapacity: {{ .Values.volumes.data.autoscaling.maxCapacity }} + minFreePercentage: {{ .Values.volumes.data.autoscaling.minFreePercentage }} + scalingFactor: {{ .Values.volumes.data.autoscaling.scalingFactor }} diff --git a/cassandra/versions/1.0.0/templates/workload-backup.yaml b/cassandra/versions/1.0.0/templates/workload-backup.yaml new file mode 100644 index 00000000..9cd943e0 --- /dev/null +++ b/cassandra/versions/1.0.0/templates/workload-backup.yaml @@ -0,0 +1,65 @@ +{{- if and .Values.backup.enabled (eq .Values.backup.type "logical") }} +kind: workload +name: {{ include "cassandra.workload.backup.name" . }} +description: Scheduled Cassandra logical backup job +tags: {{- include "cassandra.tags" . | nindent 2 }} +spec: + type: cron + containers: + - name: backup + image: {{ .Values.backup.image }} + cpu: {{ .Values.backup.resources.cpu | quote }} + memory: {{ .Values.backup.resources.memory | quote }} + inheritEnv: false + env: + - name: BACKUP_TYPE + value: logical + - name: BACKUP_PROVIDER + value: {{ .Values.backup.provider | quote }} + - name: BACKUP_BUCKET + value: cpln://secret/{{ include "cassandra.secret.credentials.name" . 
}}.backup-bucket + {{- if eq .Values.backup.provider "aws" }} + - name: AWS_REGION + value: cpln://secret/{{ include "cassandra.secret.credentials.name" . }}.aws-region + - name: BACKUP_PREFIX + value: {{ .Values.backup.aws.prefix | quote }} + {{- end }} + {{- if eq .Values.backup.provider "gcp" }} + - name: BACKUP_PREFIX + value: {{ .Values.backup.gcp.prefix | quote }} + {{- end }} + - name: CASSANDRA_HOST + value: {{ include "cassandra.workload.name" . }}.{{ .Values.global.cpln.gvc }}.cpln.local + - name: CASSANDRA_PORT + value: "9042" + - name: CASSANDRA_USER + value: cassandra + - name: CASSANDRA_PASSWORD + value: cpln://secret/{{ include "cassandra.secret.credentials.name" . }}.superuser-password + - name: CASSANDRA_KEYSPACE + value: cpln://secret/{{ include "cassandra.secret.credentials.name" . }}.keyspace + identityLink: //gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "cassandra.identity.name" . }} + defaultOptions: + autoscaling: + maxConcurrency: 0 + maxScale: 1 + metric: disabled + minScale: 1 + scaleToZeroDelay: 300 + target: 95 + capacityAI: false + debug: false + suspend: false + firewallConfig: + external: + inboundAllowCIDR: [] + outboundAllowCIDR: + - 0.0.0.0/0 + internal: + inboundAllowType: none + job: + schedule: {{ .Values.backup.schedule | quote }} + concurrencyPolicy: Forbid + restartPolicy: Never + historyLimit: 5 +{{- end }} diff --git a/cassandra/versions/1.0.0/templates/workload-cassandra.yaml b/cassandra/versions/1.0.0/templates/workload-cassandra.yaml new file mode 100644 index 00000000..bf901424 --- /dev/null +++ b/cassandra/versions/1.0.0/templates/workload-cassandra.yaml @@ -0,0 +1,136 @@ +{{- include "cassandra.validate" . }} +kind: workload +name: {{ include "cassandra.workload.name" . }} +description: Cassandra cluster +tags: {{- include "cassandra.tags" . 
| nindent 2 }} +spec: + type: stateful + containers: + - name: cassandra + command: /bin/bash + args: + - '-c' + - >- + cp /scripts/cassandra-init.sh /tmp/cassandra-init.sh && + chmod +x /tmp/cassandra-init.sh && + /tmp/cassandra-init.sh + env: + - name: MAX_HEAP_SIZE + value: {{ .Values.jvmHeapSize | quote }} + - name: LOCAL_JMX + value: "no" + lifecycle: + preStop: + exec: + command: + - bash + - '-c' + - nodetool drain + image: {{ .Values.image }} + cpu: {{ .Values.cpu | quote }} + memory: {{ .Values.memory | quote }} + inheritEnv: false + ports: + - number: 7199 + protocol: tcp + - number: 9042 + protocol: tcp + - number: 9043 + protocol: tcp + livenessProbe: + failureThreshold: 5 + initialDelaySeconds: 120 + periodSeconds: 30 + successThreshold: 1 + tcpSocket: + port: 9042 + timeoutSeconds: 10 + readinessProbe: + failureThreshold: 10 + initialDelaySeconds: 30 + periodSeconds: 10 + successThreshold: 1 + tcpSocket: + port: 9042 + timeoutSeconds: 5 + volumes: + - path: /var/lib/cassandra + recoveryPolicy: retain + uri: 'cpln://volumeset/{{ include "cassandra.volumeset.name" . }}' + - path: /cassandra-config/cassandra.yaml + recoveryPolicy: retain + uri: 'cpln://secret/{{ include "cassandra.secret.config.name" . }}' + - path: /scripts/cassandra-init.sh + recoveryPolicy: retain + uri: 'cpln://secret/{{ include "cassandra.secret.init.name" . }}' +{{- if and .Values.backup.enabled (eq .Values.backup.type "physical") }} + - name: backup + image: {{ .Values.backup.image }} + cpu: {{ .Values.backup.resources.cpu | quote }} + memory: {{ .Values.backup.resources.memory | quote }} + inheritEnv: false + env: + - name: BACKUP_TYPE + value: physical + - name: BACKUP_SCHEDULE + value: {{ .Values.backup.schedule | quote }} + - name: BACKUP_PROVIDER + value: {{ .Values.backup.provider | quote }} + - name: BACKUP_BUCKET + value: cpln://secret/{{ include "cassandra.secret.credentials.name" . 
}}.backup-bucket + {{- if eq .Values.backup.provider "aws" }} + - name: AWS_REGION + value: cpln://secret/{{ include "cassandra.secret.credentials.name" . }}.aws-region + - name: BACKUP_PREFIX + value: {{ .Values.backup.aws.prefix | quote }} + {{- end }} + {{- if eq .Values.backup.provider "gcp" }} + - name: BACKUP_PREFIX + value: {{ .Values.backup.gcp.prefix | quote }} + {{- end }} + - name: CASSANDRA_JMX_PASSWORD + value: cpln://secret/{{ include "cassandra.secret.credentials.name" . }}.superuser-password + volumes: + - path: /var/lib/cassandra + recoveryPolicy: retain + uri: 'cpln://volumeset/{{ include "cassandra.volumeset.name" . }}' +{{- end }} + defaultOptions: + autoscaling: + maxConcurrency: 0 + maxScale: {{ .Values.replicas | int }} + metric: disabled + minScale: {{ .Values.replicas | int }} + scaleToZeroDelay: 300 + target: 95 + capacityAI: false + debug: false + multiZone: + enabled: {{ .Values.multiZone.enabled }} + suspend: false + timeoutSeconds: 60 + firewallConfig: + {{- if and .Values.backup.enabled (eq .Values.backup.type "physical") }} + external: + inboundAllowCIDR: [] + outboundAllowCIDR: + - 0.0.0.0/0 + {{- end }} + internal: + inboundAllowType: {{ .Values.internal_access.type }} + {{- if .Values.internal_access.workloads }} + inboundAllowWorkload: {{ .Values.internal_access.workloads | toYaml | nindent 8 }} + {{- end }} + identityLink: //gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "cassandra.identity.name" . 
}} + loadBalancer: + direct: + enabled: false + replicaDirect: true + rolloutOptions: + maxSurgeReplicas: 0% + maxUnavailableReplicas: '1' + minReadySeconds: 60 + scalingPolicy: OrderedReady + securityOptions: + filesystemGroupId: 999 + supportDynamicTags: false diff --git a/cassandra/versions/1.0.0/templates/workload-repair.yaml b/cassandra/versions/1.0.0/templates/workload-repair.yaml new file mode 100644 index 00000000..3613b1b2 --- /dev/null +++ b/cassandra/versions/1.0.0/templates/workload-repair.yaml @@ -0,0 +1,63 @@ +{{- if .Values.repair.enabled }} +kind: workload +name: {{ include "cassandra.workload.repair.name" . }} +description: Scheduled Cassandra cluster repair job +tags: {{- include "cassandra.tags" . | nindent 2 }} +spec: + type: cron + containers: + - name: repair + image: {{ .Values.image }} + cpu: 250m + memory: 256Mi + inheritEnv: false + command: /bin/bash + args: + - '-c' + - | + set -euo pipefail + + CASSANDRA_WORKLOAD="{{ include "cassandra.workload.name" . }}" + + # The svc.cluster.local hostnames resolve directly to pod IPs, bypassing + # the CP service mesh proxy that blocks JMX (port 7199) connections. + # The namespace hash is extracted from the pod's DNS search domain. + K8S_NAMESPACE=$(awk '/^search/{print $2}' /etc/resolv.conf | cut -d'.' -f1) + + echo "Starting full repair on all {{ .Values.replicas | int }} replicas..." + FAILED=0 + for i in $(seq 0 $(( {{ .Values.replicas | int }} - 1 ))); do + HOST="${CASSANDRA_WORKLOAD}-${i}.${CASSANDRA_WORKLOAD}.${K8S_NAMESPACE}.svc.cluster.local" + echo "--- Repairing replica ${i} (${HOST}) ---" + nodetool -u cassandra -pw "{{ .Values.superuserPassword }}" -h "${HOST}" -p 7199 repair \ + && echo "Replica ${i} repair complete." \ + || { echo "WARN: Replica ${i} repair failed."; FAILED=$(( FAILED + 1 )); } + done + + if [ "${FAILED}" -gt 0 ]; then + echo "ERROR: ${FAILED} replica(s) failed repair." + exit 1 + else + echo "All replicas repaired successfully." 
+ fi + identityLink: //gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "cassandra.identity.name" . }} + defaultOptions: + autoscaling: + maxConcurrency: 0 + maxScale: 1 + metric: disabled + minScale: 1 + scaleToZeroDelay: 300 + target: 95 + capacityAI: false + debug: false + suspend: false + firewallConfig: + internal: + inboundAllowType: {{ .Values.internal_access.type }} + job: + schedule: {{ .Values.repair.schedule | quote }} + concurrencyPolicy: Forbid + restartPolicy: Never + historyLimit: 5 +{{- end }} diff --git a/cassandra/versions/1.0.0/values.yaml b/cassandra/versions/1.0.0/values.yaml new file mode 100644 index 00000000..debbcdc0 --- /dev/null +++ b/cassandra/versions/1.0.0/values.yaml @@ -0,0 +1,60 @@ +replicas: 3 +# replicationFactor must not exceed replicas +replicationFactor: 1 + +# IMPORTANT: Change all credentials before deploying to production +superuserPassword: supersecretpassword +username: username +password: password +keyspaceName: mydatabase + +image: cassandra:5.0 +cpu: 1 +memory: 4Gi +# JVM heap: leave ~50% of container memory for off-heap (bloom filters, page cache, etc.) +# Cassandra 5.x uses G1GC — only MAX_HEAP_SIZE is valid; HEAP_NEWSIZE is ignored. 
+jvmHeapSize: 2G +clusterName: my-cassandra +volumes: + data: + initialCapacity: 10 + autoscaling: + maxCapacity: 100 + minFreePercentage: 20 + scalingFactor: 1.5 + +multiZone: + enabled: false + +internal_access: + type: same-gvc + workloads: + +backup: + enabled: false + type: logical # options: logical, physical + image: ghcr.io/controlplane-com/backup-images/cassandra-backup:5.0 + schedule: "0 2 * * *" # daily at 2am UTC + + resources: + cpu: 250m + memory: 256Mi + + provider: aws # options: aws, gcp + + aws: + bucket: my-backup-bucket + region: us-east-1 + cloudAccountName: my-backup-cloudaccount + policyName: my-s3-policy + prefix: cassandra/backups + + gcp: + bucket: my-backup-bucket + cloudAccountName: my-cloud-account + prefix: cassandra/backups + +repair: + enabled: true + # Cron schedule for full cluster repair (must run within gc_grace_seconds = 10 days) + schedule: "0 2 * * 0" diff --git a/cdc-pipeline/icon.png b/cdc-pipeline/icon.png new file mode 100644 index 00000000..f89204d1 Binary files /dev/null and b/cdc-pipeline/icon.png differ diff --git a/cdc-pipeline/versions/1.0.0/Chart.yaml b/cdc-pipeline/versions/1.0.0/Chart.yaml new file mode 100644 index 00000000..c53dc01b --- /dev/null +++ b/cdc-pipeline/versions/1.0.0/Chart.yaml @@ -0,0 +1,23 @@ +apiVersion: v2 +name: cdc-pipeline +description: CDC pipeline with PostgreSQL HA, Kafka, and Debezium. Auto-configured WAL, SASL credentials, and cross-service DNS. 
+type: application +version: 1.0.0 +appVersion: "1.0" + +dependencies: + - name: postgres-highly-available + version: 2.2.0 + repository: "oci://ghcr.io/controlplane-com/templates" + - name: kafka + version: 4.0.0 + repository: "oci://ghcr.io/controlplane-com/templates" + - name: debezium-server + version: 1.1.0 + repository: "oci://ghcr.io/controlplane-com/templates" + +annotations: + created: "2026-04-13" + lastModified: "2026-05-07" + category: "event-streaming" + createsGvc: false diff --git a/cdc-pipeline/versions/1.0.0/README.md b/cdc-pipeline/versions/1.0.0/README.md new file mode 100644 index 00000000..22262ac3 --- /dev/null +++ b/cdc-pipeline/versions/1.0.0/README.md @@ -0,0 +1,86 @@ +# CDC Pipeline + +A meta-template that deploys a complete Change Data Capture (CDC) pipeline on Control Plane, bundling: + +- **PostgreSQL HA** (Patroni + etcd + HAProxy) as the source database +- **Apache Kafka** (KRaft mode + Kafbat UI) as the event streaming platform +- **Debezium Server** as the CDC connector (PostgreSQL -> Kafka) + +## Why Use This Template? + +When deploying these three components individually, you must manually coordinate: + +- PostgreSQL WAL level (`logical` is required for CDC) +- Database credentials between PostgreSQL and Debezium +- Kafka SASL credentials between Kafka and Debezium +- Internal DNS hostnames for cross-service communication + +This meta-template handles all of that automatically. Shared values are defined once and validated at deploy time. + +## Quick Start + +1. Install the template and customize `values.yaml`: + - Set real passwords (replace all `changeme-*` values) + - Configure `source.tableIncludeList` to specify which tables to capture + - Adjust resource sizes and replica counts as needed + +2. 
Internal DNS names are computed automatically from the release name: + - PostgreSQL: `{release-name}-postgres-ha-proxy.{gvc}.cpln.local:5432` + - Kafka: `{release-name}-cluster.{gvc}.cpln.local:9092` + - Debezium: `{release-name}-debezium.{gvc}.cpln.local` + +## Configuration + +### Shared Values + +These values must match between components. The default `values.yaml` pre-coordinates them: + +| Value | PostgreSQL Path | Debezium Path | +|-------|----------------|---------------| +| DB Username | `postgres-highly-available.postgres.username` | `debezium-server.source.database.user` | +| DB Password | `postgres-highly-available.postgres.password` | `debezium-server.source.database.password` | +| DB Name | `postgres-highly-available.postgres.database` | `debezium-server.source.database.name` | + +| Value | Kafka Path | Debezium Path | +|-------|-----------|---------------| +| SASL Username | `kafka.kafka.listeners.client.sasl.users` | `debezium-server.sink.kafka.saslUsername` | +| SASL Password | `kafka.kafka.listeners.client.sasl.passwords` | `debezium-server.sink.kafka.saslPassword` | + +### Cross-Component Validation + +The template validates at deploy time that: + +- `postgres-highly-available.postgres.walLevel` is `logical` +- Database credentials match between PostgreSQL and Debezium +- Debezium's Kafka SASL username exists in Kafka's configured users + +### Connecting to External Instances + +To use an external PostgreSQL or Kafka instead of the bundled one, set the hostname/bootstrap servers explicitly: + +```yaml +debezium-server: + source: + database: + hostname: "my-external-postgres.example.com" + sink: + kafka: + bootstrapServers: "my-external-kafka.example.com:9092" +``` + +### Debezium Heartbeat (Recommended for HA) + +The default configuration enables Debezium heartbeats (every 5 seconds) to prevent WAL accumulation during low-traffic periods. 
You must create the heartbeat table in PostgreSQL after deployment: + +```sql +CREATE TABLE IF NOT EXISTS debezium_heartbeat (id INT PRIMARY KEY, ts TIMESTAMPTZ); +INSERT INTO debezium_heartbeat VALUES (1, now()); +``` + +## Component Versions + +| Component | Version | +|-----------|---------| +| PostgreSQL HA | 2.2.0 (Patroni, PostgreSQL 17) | +| Kafka | 4.0.0 (Apache Kafka 3.9.1, KRaft) | +| Debezium Server | 1.1.0 (Debezium 3.0) | diff --git a/cdc-pipeline/versions/1.0.0/templates/_helpers.tpl b/cdc-pipeline/versions/1.0.0/templates/_helpers.tpl new file mode 100644 index 00000000..b2472041 --- /dev/null +++ b/cdc-pipeline/versions/1.0.0/templates/_helpers.tpl @@ -0,0 +1,80 @@ +{{/* +================================================================================ +CDC Pipeline - Cross-Component Validation +================================================================================ +*/}} + +{{/* +Validate that PostgreSQL WAL level is set to "logical" (required for CDC) +*/}} +{{- define "cdc.validateWalLevel" -}} +{{- $walLevel := index .Values "postgres-highly-available" "postgres" "walLevel" -}} +{{- if ne $walLevel "logical" -}} +{{- fail (printf "postgres-highly-available.postgres.walLevel must be 'logical' for CDC, got '%s'" $walLevel) -}} +{{- end -}} +{{- end -}} + +{{/* +Validate that database credentials match between PostgreSQL and Debezium +*/}} +{{- define "cdc.validateCredentials" -}} +{{- $pgUser := index .Values "postgres-highly-available" "postgres" "username" -}} +{{- $pgPass := index .Values "postgres-highly-available" "postgres" "password" -}} +{{- $pgDb := index .Values "postgres-highly-available" "postgres" "database" -}} +{{- $dbzUser := index .Values "debezium-server" "source" "database" "user" -}} +{{- $dbzPass := index .Values "debezium-server" "source" "database" "password" -}} +{{- $dbzDb := index .Values "debezium-server" "source" "database" "name" -}} +{{- if ne $pgUser $dbzUser -}} +{{- fail (printf "Credential mismatch: 
postgres-highly-available.postgres.username ('%s') != debezium-server.source.database.user ('%s')" $pgUser $dbzUser) -}} +{{- end -}} +{{- if ne $pgPass $dbzPass -}} +{{- fail "Credential mismatch: postgres-highly-available.postgres.password != debezium-server.source.database.password" -}} +{{- end -}} +{{- if ne $pgDb $dbzDb -}} +{{- fail (printf "Database mismatch: postgres-highly-available.postgres.database ('%s') != debezium-server.source.database.name ('%s')" $pgDb $dbzDb) -}} +{{- end -}} +{{- end -}} + +{{/* +Validate that Kafka SASL credentials match between Kafka and Debezium +*/}} +{{- define "cdc.validateKafkaCredentials" -}} +{{- $dbzSinkType := index .Values "debezium-server" "sink" "type" -}} +{{- if eq $dbzSinkType "kafka" -}} +{{- $kafkaUsers := index .Values "kafka" "kafka" "listeners" "client" "sasl" "users" -}} +{{- $dbzUser := index .Values "debezium-server" "sink" "kafka" "saslUsername" -}} +{{- if not (contains $dbzUser $kafkaUsers) -}} +{{- fail (printf "Kafka SASL mismatch: debezium saslUsername ('%s') not found in kafka listeners.client.sasl.users ('%s')" $dbzUser $kafkaUsers) -}} +{{- end -}} +{{- end -}} +{{- end -}} + +{{/* +================================================================================ +Labeling +================================================================================ +*/}} + +{{/* +Create chart name and version as used by the chart label +*/}} +{{- define "cdc.chart" -}} +{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" }} +{{- end }} + +{{/* +Marketplace tags for the meta-template +*/}} +{{- define "cdc.tags" -}} +helm.sh/chart: {{ include "cdc.chart" . 
}} +app.cpln.io/name: {{ .Release.Name }} +app.cpln.io/instance: {{ .Release.Name }} +{{- if .Chart.AppVersion }} +app.cpln.io/version: {{ .Chart.AppVersion | quote }} +{{- end }} +app.cpln.io/managed-by: {{ .Release.Service }} +cpln/marketplace: "true" +cpln/marketplace-template: cdc-pipeline +cpln/marketplace-template-version: {{ .Chart.Version }} +cpln/marketplace-gvc: {{ .Values.global.cpln.gvc }} +{{- end }} diff --git a/cdc-pipeline/versions/1.0.0/templates/validation.yaml b/cdc-pipeline/versions/1.0.0/templates/validation.yaml new file mode 100644 index 00000000..ec2e6138 --- /dev/null +++ b/cdc-pipeline/versions/1.0.0/templates/validation.yaml @@ -0,0 +1,3 @@ +{{- include "cdc.validateWalLevel" . -}} +{{- include "cdc.validateCredentials" . -}} +{{- include "cdc.validateKafkaCredentials" . -}} diff --git a/cdc-pipeline/versions/1.0.0/values.yaml b/cdc-pipeline/versions/1.0.0/values.yaml new file mode 100644 index 00000000..339ea5e3 --- /dev/null +++ b/cdc-pipeline/versions/1.0.0/values.yaml @@ -0,0 +1,379 @@ +# ============================================================================= +# CDC Pipeline Meta-Template +# ============================================================================= +# Deploys a complete Change Data Capture pipeline: +# - PostgreSQL HA (Patroni + etcd + HAProxy) +# - Kafka (KRaft cluster + Kafbat UI) +# - Debezium Server (CDC connector: postgres -> kafka) +# +# Shared credentials are defined once here and wired to each component. 
+# Internal DNS names are computed automatically from the release name: +# PostgreSQL: {release-name}-postgres-ha-proxy.{gvc}.cpln.local:5432 +# Kafka: {release-name}-cluster.{gvc}.cpln.local:9092 +# Debezium: {release-name}-debezium.{gvc}.cpln.local +# ============================================================================= + +# ============================================================================= +# PostgreSQL HA +# ============================================================================= +postgres-highly-available: + replicas: 3 + + resources: + minCpu: 500m + minMemory: 1Gi + maxCpu: 1 + maxMemory: 2Gi + + image: controlplanecorporation/patroni-postgres:0.7 + + postgres: + username: cdc_user + password: "changeme-postgres-password" + database: cdcdb + walLevel: logical # Required for CDC -- do not change + + multiZone: false + + volumeset: + capacity: 10 + autoscaling: + enabled: false + maxCapacity: 100 + minFreePercentage: 10 + scalingFactor: 1.2 + + internal_access: + type: same-gvc + + etcd: + replicas: 3 + resources: + cpu: 500m + memory: 512Mi + multiZone: false + volumeset: + capacity: 10 + internal_access: + type: same-gvc + + pgbouncer: + enabled: false + + proxy: + enabled: true + image: haproxy:2.9 + resources: + cpu: 100m + memory: 128Mi + minReplicas: 2 + maxReplicas: 2 + + backup: + enabled: false + +# ============================================================================= +# Kafka +# ============================================================================= +kafka: + kafka: + name: cluster + image: apache/kafka:3.9.1 + suspend: false + deletionProtection: false + replicas: 3 + minReadySeconds: 0 + debug: false + multiZone: false + logDirs: /opt/kafka/logs-0,/opt/kafka/logs-1 + env: [] + volumes: + logs: + initialCapacity: 10 + performanceClass: general-purpose-ssd + fileSystemType: ext4 + snapshots: + createFinalSnapshot: true + retentionDuration: 7d + schedule: 0 0 * * * + autoscaling: + maxCapacity: 1000 + minFreePercentage: 20 + scalingFactor: 1.2 + cpu: 1000m + memory: 
2000Mi + minCpu: 250m + minMemory: 2000Mi + firewall: + internal_inboundAllowType: "same-gvc" + listeners: + client: + protocol: SASL_PLAINTEXT + name: CLIENT + containerPort: 9092 + sasl: + admin: + username: admin + password: "changeme-kafka-admin-password" + users: "debezium" + passwords: "changeme-kafka-debezium-password" + acl: + superUsers: "User:admin" + allowEveryoneIfNoAclFound: false + secrets: + kraft_cluster_id: "changeme-kraft-cluster-id" + inter_broker_password: "changeme-inter-broker-password" + controller_password: "changeme-controller-password" + extra_configurations: + default.replication.factor: 3 + auto.create.topics.enable: true + log.retention.hours: 168 + + kafka_exporter: + name: exporter + image: danielqsj/kafka-exporter:v1.9.0 + debug: false + cpu: 50m + memory: 128Mi + listener: client + env: [] + dropMetrics: [] + + jmx_exporter: + name: jmx-exporter + image: ghcr.io/controlplane-com/bitnami/jmx-exporter + kafkaJmxPort: 5557 + exporterPort: 5556 + debug: false + cpu: 250m + memory: 256Mi + minCpu: 80m + minMemory: 125Mi + listener: client + dropMetrics: [] + config: + jmxUrl: service:jmx:rmi:///jndi/rmi://127.0.0.1:5557/jmxrmi + lowercaseOutputName: true + lowercaseOutputLabelNames: true + ssl: false + whitelistObjectNames: + - kafka.controller:* + - kafka.server:* + - java.lang:* + - kafka.network:* + - kafka.log:* + - kafka.producer:* + - kafka.consumer:* + rules: + - labels: + request: "$3" + name: kafka_request_count + pattern: kafka.network<>(Count) + - labels: + request: "$3" + stat: "$4" + name: kafka_request_metrics_totaltimems + pattern: kafka.network<>(.+) + - labels: + request: "$3" + component: "$2" + stat: "$4" + name: kafka_request_latency_ms + pattern: kafka.network<>(.+) + - labels: + client_type: "$3" + metric: "$2" + stat: "$4" + name: kafka_client_metrics + pattern: kafka.network<>(.+) + - labels: + client_id: "$1" + metric: "$2" + name: kafka_consumer_metrics + pattern: kafka.consumer<>(.+) + - labels: + client_id: 
"$1" + metric: "$2" + name: kafka_producer_metrics + pattern: kafka.producer<>(.+) + - name: kafka_server_$1_$2_$3 + pattern: kafka.server<>(Count|Value) + - name: java_lang_$1_$2 + pattern: java.lang<>(.+) + + kafbat_ui: + enabled: true + deletionProtection: false + name: kafbat-ui + image: ghcr.io/kafbat/kafka-ui + cpu: 300m + memory: 1000Mi + minCpu: 100m + minMemory: 400Mi + replicas: 1 + timeoutSeconds: 30 + configuration_secret: kafka-kafbat-ui-config + firewall: + external_inboundAllowCIDR: "0.0.0.0/0" + external_outboundAllowCIDR: "0.0.0.0/0" + + kafka_rest_proxy: + enabled: false + + kafka_client: + name: client + image: apache/kafka:3.9.1 + cpu: 500m + memory: 1000Mi + firewall: + external_outboundAllowCIDR: "0.0.0.0/0" + + kafka_ui: + enabled: false + +# ============================================================================= +# Debezium Server +# ============================================================================= +# Database hostname and Kafka bootstrap servers are auto-computed from +# the release name when left empty. Override only if connecting to +# external instances not managed by this meta-template. 
+debezium-server: + image: quay.io/debezium/server:3.0 + + resources: + cpu: 500m + memory: 512Mi + + source: + type: postgres + + database: + hostname: "" # Auto-computed: {release-name}-postgres-ha-proxy.{gvc}.cpln.local + port: 5432 + name: cdcdb # Must match postgres-highly-available.postgres.database + user: cdc_user # Must match postgres-highly-available.postgres.username + password: "changeme-postgres-password" # Must match postgres-highly-available.postgres.password + + serverName: "dbserver1" + tableIncludeList: "" + tableExcludeList: "" + + postgres: + slotName: "debezium" + publicationName: "dbz_publication" + pluginName: "pgoutput" + slotDropOnStop: false + heartbeatIntervalMs: 5000 + heartbeatActionQuery: "UPDATE debezium_heartbeat SET ts = now() WHERE id = 1" + + offset: + storage: file + flushIntervalMs: 10000 + flushTimeoutMs: 60000 + file: + filename: "/debezium/data/offsets.dat" + redis: + address: "" + key: "debezium:offsets" + password: "" + ssl: false + jdbc: + url: "" + user: "" + password: "" + tableName: "debezium_offsets" + + schemaHistory: + storage: file + file: + filename: "/debezium/data/schema-history.dat" + redis: + address: "" + key: "debezium:schema-history" + password: "" + ssl: false + jdbc: + url: "" + user: "" + password: "" + tableName: "debezium_schema_history" + + errors: + retryDelayInitialMs: 300 + retryDelayMaxMs: 10000 + maxRetries: -1 + + sink: + type: kafka + kafka: + bootstrapServers: "" # Auto-computed: {release-name}-cluster.{gvc}.cpln.local:9092 + topic: "" + securityProtocol: "SASL_PLAINTEXT" + saslMechanism: "PLAIN" + saslUsername: "debezium" # Must match kafka.kafka.listeners.client.sasl.users + saslPassword: "changeme-kafka-debezium-password" # Must match kafka.kafka.listeners.client.sasl.passwords + + redis: + address: "" + password: "" + ssl: false + streamName: "" + + nats: + url: "" + subject: "" + username: "" + password: "" + + http: + url: "" + headers: {} + authType: "" + username: "" + password: "" + bearerToken: "" + + kinesis: + region: 
"" + streamName: "" + credentialsProvider: "default" + cloudAccount: + enabled: false + name: "" + + pubsub: + projectId: "" + topic: "" + cloudAccount: + enabled: false + name: "" + + pulsar: + serviceUrl: "" + topic: "" + authPluginClassName: "" + authToken: "" + + eventhubs: + connectionString: "" + hubName: "" + + format: + key: json + value: json + schemaRegistry: + url: "" + username: "" + password: "" + + volumeset: + capacity: 10 + performanceClass: general-purpose-ssd + + firewall: + internal: + inboundAllowType: same-gvc + workloads: [] + external: + outboundAllowCIDR: + - 0.0.0.0/0 diff --git a/cockroach/versions/1.4.0/Chart.yaml b/cockroach/versions/1.4.0/Chart.yaml new file mode 100644 index 00000000..b802b8be --- /dev/null +++ b/cockroach/versions/1.4.0/Chart.yaml @@ -0,0 +1,18 @@ +apiVersion: v2 +name: cockroach +description: Distributed PostgreSQL-compatible database for Control Plane + +type: application +version: 1.4.0 +appVersion: "25.4.0" + +annotations: + created: "2025-08-25" + lastModified: "2026-05-06" + category: "database" + createsGvc: true + +dependencies: + - name: cpln-common + version: 1.0.0 + repository: "oci://ghcr.io/controlplane-com/templates" \ No newline at end of file diff --git a/cockroach/versions/1.4.0/README.md b/cockroach/versions/1.4.0/README.md new file mode 100644 index 00000000..3b12d1ec --- /dev/null +++ b/cockroach/versions/1.4.0/README.md @@ -0,0 +1,164 @@ +# CockroachDB + +CockroachDB is a distributed SQL database built on a transactional and strongly-consistent key-value store. It provides automatic replication, distribution, and survivability across multiple locations with minimal latency and maximum throughput. CockroachDB offers ACID transactions, horizontal scalability, and built-in fault tolerance, making it ideal for applications requiring global data distribution and high availability. 
+ +## Configuration + +To configure your CockroachDB cluster across multiple locations, update the `gvc.locations` section in the `values.yaml` file. + +**Note**: While CockroachDB can run in a single location, a minimum of 3 locations and 3 replicas per location is recommended for high resilience. + +### Volume Storage + +Configure initial storage capacity and optional autoscaling for the CockroachDB data volume in `values.yaml`: + +```yaml +volumeset: + capacity: 10 # initial capacity in GiB (minimum is 10) + autoscaling: + enabled: false + maxCapacity: 100 # maximum capacity in GiB + minFreePercentage: 10 # scale when free space drops below this percentage + scalingFactor: 1.2 # multiply current capacity by this factor when scaling +``` + +### Database Initialization + +To create a database with a user on initialization, configure the `database` section in your `values.yaml` file. The database and user are created automatically on first deploy only; they are not re-created on restarts. + +### Internal Access Configuration + +To specify which workloads can access this CockroachDB cluster internally, configure the `internal_access` section in your `values.yaml` file: + +**Access Types:** +- `same-gvc`: Allow access from all workloads in the same GVC +- `same-org`: Allow access from all workloads in the same organization +- `workload-list`: Allow access only from the workloads listed in `outside_workloads`; can be combined with `same-gvc` + +Once deployed, CockroachDB will be available on port 26257. CockroachDB is configured in `--insecure` mode because Control Plane handles mTLS for all inter-workload communication. Connect using the internal hostname: + +```bash +cockroach sql --insecure --host={release-name}-cockroach.{gvc}.cpln.local:26257 +``` + +### Admin Dashboard + +The CockroachDB admin UI runs on port 8080 but is not exposed externally. Access it via port forwarding and open `http://localhost:8080` in your browser. 
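The port-forwarding step for the admin UI can be sketched as a small shell helper. This is a sketch under stated assumptions, not the template's documented procedure: the `admin_ui_forward_cmd` function, the `my-release` release name and the `my-gvc` GVC name are hypothetical, and the `cpln port-forward` syntax is assumed from the Control Plane CLI, so confirm it with `cpln port-forward --help` before relying on it. The workload name follows the `{release-name}-cockroach` pattern defined in the chart's `_helpers.tpl`.

```shell
#!/usr/bin/env bash
# Sketch only: the release and GVC names below are hypothetical, and the
# `cpln port-forward` syntax is an assumption to verify against the CLI help.

# Print the port-forward command for the CockroachDB admin UI.
#   $1 = release name, $2 = GVC name, $3 = local port (defaults to 8080)
admin_ui_forward_cmd() {
  local release="$1" gvc="$2" local_port="${3:-8080}"
  # The workload is named "<release>-cockroach"; the admin UI listens on 8080.
  printf 'cpln port-forward %s-cockroach %s:8080 --gvc %s\n' \
    "$release" "$local_port" "$gvc"
}

# Show the command for a hypothetical release and GVC.
admin_ui_forward_cmd my-release my-gvc
```

Printing the command instead of running it keeps the helper safe to source in scripts; copy the printed line into a terminal once the names match your deployment, then browse to the chosen local port. In a multi-location GVC the CLI may also need to be told which location or replica to target.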
+ +The cluster automatically handles data distribution and replication across your configured locations. + +**Note on GVC Naming** + +- This template creates a GVC with a default name defined in the `values.yaml`. If you plan to deploy multiple instances of this template, you **must assign a unique GVC name** for each deployment. + +### Multi-Region Survivability + +On first deploy, the cluster automatically configures the database with all configured locations as regions and sets the survival goal to `REGION`, meaning the cluster can tolerate the loss of an entire location without downtime. To verify: + +```sql +SHOW SURVIVAL GOAL FROM DATABASE mydb; +``` + +**Note**: A production CockroachDB setup can survive a location outage cleanly, but rolling out or restarting replicas in the remaining locations during that outage exceeds the cluster's fault tolerance and will cause a brief period of downtime for ranges on those restarting nodes. + +## PgBouncer Connection Pooling (Optional) + +PgBouncer multiplexes application connections into a smaller pool of real database connections, reducing overhead and protecting CockroachDB from connection exhaustion under high concurrency. It connects to all CockroachDB nodes across all locations, so failover and load distribution are handled transparently. + +When enabled, PgBouncer becomes the primary connection endpoint. Connect to `{release-name}-pgbouncer.{gvc}.cpln.local:5432` instead of the CockroachDB workload directly. + +```yaml +pgbouncer: + enabled: true + poolMode: transaction # options: session, transaction, statement + defaultPoolSize: 25 # real CockroachDB connections per PgBouncer pod + maxClientConn: 250 # max app connections per PgBouncer pod + maxDbConnections: 100 # hard cap on total CockroachDB connections regardless of how many PgBouncer pods are running + minReplicas: 2 + maxReplicas: 4 +``` + +**Pool modes:** +- `transaction` — connection held only for the duration of a transaction. 
Best for most web and API workloads. Not compatible with session-level features like `SET` variables, temporary tables, or advisory locks. +- `session` — connection held for the entire client session. Compatible with all features but provides less connection reuse. +- `statement` — connection returned after every statement. Transactions are not supported. Rarely used. + +**`maxDbConnections`** is a hard cap on the total number of real CockroachDB connections PgBouncer will open, shared across all PgBouncer pods. Set it to a value your cluster can safely handle regardless of how many PgBouncer pods are running. + +**Scaling:** PgBouncer autoscales on RPS between `minReplicas` and `maxReplicas`. Increase `maxReplicas` for high-throughput workloads where PgBouncer becomes the bottleneck before CockroachDB does. + +## Application Retry Logic + +**Your application must implement retry logic on database connections.** PgBouncer routes around failed CockroachDB nodes, but transient errors are still surfaced to the application during failover events such as a location outage or rolling restarts — while PgBouncer cycles through backends and Raft leader elections complete. Without retries, these transient errors will propagate directly to the client. + +## Backing Up + +Set your desired backup schedule in the values file and configure your AWS S3 or GCS bucket. You can also set a prefix where your backups will be stored in the bucket. + +Set `backup.location` to the region closest to your storage bucket to minimize cross-region transfer latency. CockroachDB nodes upload backup data directly to cloud storage using their own workload identity — the backup job only triggers the SQL command. + +### AWS S3 + +For the backup job to have access to an S3 bucket, ensure the following prerequisites are completed in your AWS account before installing: + +1. Create your bucket. Update `aws.bucket` to include its name and `aws.region` to include its region. + +2. 
If you do not have a Cloud Account set up, refer to the docs to [Create a Cloud Account](https://docs.controlplane.com/guides/create-cloud-account). Update `aws.cloudAccountName`. + +3. Create a new AWS IAM policy with the following JSON (replace `YOUR_BUCKET_NAME`): + +```json +{ + "Version": "2012-10-17", + "Statement": [ + { + "Effect": "Allow", + "Action": [ + "s3:GetObject", + "s3:PutObject", + "s3:DeleteObject", + "s3:ListBucket", + "s3:GetObjectVersion", + "s3:DeleteObjectVersion" + ], + "Resource": [ + "arn:aws:s3:::YOUR_BUCKET_NAME", + "arn:aws:s3:::YOUR_BUCKET_NAME/*" + ] + } + ] +} +``` + +4. Set `aws.policyName` to match the policy created in step 3. + +### GCS + +For the backup job to have access to a GCS bucket, ensure the following prerequisites are completed in your GCP account before installing: + +1. Create your bucket. Update `gcp.bucket` to include its name. + +2. If you do not have a Cloud Account set up, refer to the docs to [Create a Cloud Account](https://docs.controlplane.com/guides/create-cloud-account). Update `gcp.cloudAccountName`. + +**Important**: You must add the `Storage Admin` role to the created GCP service account. + +### Restoring a Backup + +Backups are stored at `BUCKET/PREFIX/`. To restore, run `cockroach sql` from a machine with access to the bucket and network access to the cluster. 
+ +**AWS S3** +```sh +cockroach sql --insecure \ + --host="WORKLOAD_INTERNAL_HOSTNAME:26257" \ + --execute="RESTORE FROM LATEST IN 's3://BUCKET_NAME/PREFIX?AUTH=implicit&AWS_REGION=BUCKET_REGION';" +``` + +**GCS** +```sh +cockroach sql --insecure \ + --host="WORKLOAD_INTERNAL_HOSTNAME:26257" \ + --execute="RESTORE FROM LATEST IN 'gs://BUCKET_NAME/PREFIX?AUTH=implicit';" +``` + +### Supported External Services +- [CockroachDB Documentation](https://www.cockroachlabs.com/docs/stable/) diff --git a/cockroach/versions/1.4.0/templates/_helpers.tpl b/cockroach/versions/1.4.0/templates/_helpers.tpl new file mode 100644 index 00000000..1d68cd9f --- /dev/null +++ b/cockroach/versions/1.4.0/templates/_helpers.tpl @@ -0,0 +1,86 @@ +{{/* Resource Naming */}} + +{{/* +Cockroach Workload Name +*/}} +{{- define "cockroach.name" -}} +{{- printf "%s-cockroach" .Release.Name }} +{{- end }} + +{{/* +Cockroach Secret Database Config Name +*/}} +{{- define "cockroach.secretDatabase.name" -}} +{{- printf "%s-cockroach-config" .Release.Name }} +{{- end }} + +{{/* +Cockroach Secret Startup Name +*/}} +{{- define "cockroach.secretStartup.name" -}} +{{- printf "%s-cockroach-startup" .Release.Name }} +{{- end }} + +{{/* +Cockroach Identity Name +*/}} +{{- define "cockroach.identity.name" -}} +{{- printf "%s-cockroach-identity" .Release.Name }} +{{- end }} + +{{/* +Cockroach Policy Name +*/}} +{{- define "cockroach.policy.name" -}} +{{- printf "%s-cockroach-policy" .Release.Name }} +{{- end }} + +{{/* +Cockroach Volume Set Name +*/}} +{{- define "cockroach.volume.name" -}} +{{- printf "%s-cockroach-vs" .Release.Name }} +{{- end }} + +{{/* +Cockroach Backup Workload Name +*/}} +{{- define "cockroach.backup.name" -}} +{{- printf "%s-cockroach-backup" .Release.Name }} +{{- end }} + +{{/* +Cockroach PgBouncer Workload Name +*/}} +{{- define "cockroach.pgbouncer.name" -}} +{{- printf "%s-cockroach-pgbouncer" .Release.Name }} +{{- end }} + +{{/* +Cockroach PgBouncer Startup Secret Name +*/}} +{{- 
define "cockroach.pgbouncer.secretStartup.name" -}} +{{- printf "%s-cockroach-pgbouncer-startup" .Release.Name }} +{{- end }} + + +{{/* Validation */}} + +{{/* +Validate that gvc.locations has at least 1 entry +*/}} +{{- define "cockroach.validateLocations" -}} +{{- if lt (len .Values.gvc.locations) 1 -}} +{{- fail "gvc.locations must contain at least 1 location" -}} +{{- end -}} +{{- end -}} + + +{{/* Labeling */}} + +{{/* +Common labels - delegated to cpln-common +*/}} +{{- define "cockroach.tags" -}} +{{- include "cpln-common.tags" . }} +{{- end }} \ No newline at end of file diff --git a/cockroach/versions/1.4.0/templates/gvc.yaml b/cockroach/versions/1.4.0/templates/gvc.yaml new file mode 100644 index 00000000..f86826ec --- /dev/null +++ b/cockroach/versions/1.4.0/templates/gvc.yaml @@ -0,0 +1,11 @@ +kind: gvc +name: {{ .Values.gvc.name }} +description: {{ .Values.gvc.name }} +tags: {{- include "cockroach.tags" . | nindent 4 }} +spec: + endpointNamingFormat: org + staticPlacement: + locationLinks: + {{- range .Values.gvc.locations }} + - //location/{{ .name }} + {{- end }} diff --git a/cockroach/versions/1.4.0/templates/identity.yaml b/cockroach/versions/1.4.0/templates/identity.yaml new file mode 100644 index 00000000..10dbc1a6 --- /dev/null +++ b/cockroach/versions/1.4.0/templates/identity.yaml @@ -0,0 +1,24 @@ +--- +kind: identity +gvc: {{ .Values.gvc.name }} +name: {{ include "cockroach.identity.name" . }} +description: CockroachDB identity +tags: {{- include "cockroach.tags" . 
| nindent 4 }} +{{- if and .Values.backup.enabled (eq .Values.backup.provider "aws") }} +aws: + cloudAccountLink: //cloudaccount/{{ .Values.backup.aws.cloudAccountName }} + policyRefs: + - cpln-connector + - aws::ReadOnlyAccess + - {{ .Values.backup.aws.policyName | quote }} +{{- end }} +{{- if and .Values.backup.enabled (eq .Values.backup.provider "gcp") }} +gcp: + bindings: + - resource: //storage.googleapis.com/projects/_/buckets/{{ .Values.backup.gcp.bucket }} + roles: + - roles/storage.objectAdmin + cloudAccountLink: //cloudaccount/{{ .Values.backup.gcp.cloudAccountName }} + scopes: + - https://www.googleapis.com/auth/cloud-platform +{{- end }} \ No newline at end of file diff --git a/cockroach/versions/1.4.0/templates/policy.yaml b/cockroach/versions/1.4.0/templates/policy.yaml new file mode 100644 index 00000000..811adaff --- /dev/null +++ b/cockroach/versions/1.4.0/templates/policy.yaml @@ -0,0 +1,17 @@ +--- +kind: policy +name: {{ include "cockroach.policy.name" . }} +description: CockroachDB policy +tags: {{- include "cockroach.tags" . | nindent 4 }} +bindings: + - permissions: + - reveal + principalLinks: + - //gvc/{{ .Values.gvc.name }}/identity/{{ include "cockroach.identity.name" . }} +targetKind: secret +targetLinks: + - //secret/{{ include "cockroach.secretStartup.name" . }} + - //secret/{{ include "cockroach.secretDatabase.name" . }} + {{- if .Values.pgbouncer.enabled }} + - //secret/{{ include "cockroach.pgbouncer.secretStartup.name" . }} + {{- end }} diff --git a/cockroach/versions/1.4.0/templates/secret-pgbouncer-startup.yaml b/cockroach/versions/1.4.0/templates/secret-pgbouncer-startup.yaml new file mode 100644 index 00000000..e040ffd6 --- /dev/null +++ b/cockroach/versions/1.4.0/templates/secret-pgbouncer-startup.yaml @@ -0,0 +1,55 @@ +{{- if .Values.pgbouncer.enabled }} +{{- $hosts := list }} +{{- $workload := include "cockroach.name" . 
}} +{{- $gvc := .Values.gvc.name }} +{{- range .Values.gvc.locations }} +{{- $loc := .name }} +{{- $reps := int .replicas }} +{{- range until $reps }} +{{- $hosts = append $hosts (printf "replica-%d.%s.%s.%s.cpln.local" . $workload $loc $gvc) }} +{{- end }} +{{- end }} +--- +kind: secret +name: {{ include "cockroach.pgbouncer.secretStartup.name" . }} +description: PgBouncer startup script +tags: {{- include "cockroach.tags" . | nindent 2 }} +type: opaque +data: + encoding: plain + payload: |- + #!/bin/sh + set -eu + + DB_NAME="{{ .Values.database.name }}" + DB_HOST="{{ join "," $hosts }}" + + echo "Starting PgBouncer, connecting to ${DB_HOST}" + + printf '"root" ""\n"{{ .Values.database.user }}" ""\n' > /tmp/userlist.txt + + cat > /tmp/pgbouncer.ini <&1) + EXIT_CODE=$? + set -e + + echo "Init attempt $ATTEMPT: $OUTPUT" + + ALREADY_INITIALIZED=$(echo "$OUTPUT" | tr -d '\r' | grep -qiE "already.*initialized" && echo "true" || echo "false") + + if [[ $EXIT_CODE -eq 0 ]]; then + echo "Cluster initialization succeeded." + SUCCESS=true + FRESH_INIT=true + break + elif [[ "$ALREADY_INITIALIZED" == "true" ]]; then + echo "Cluster already initialized — skipping init." + SUCCESS=true + FRESH_INIT=false + break + fi + + echo "Init not ready yet — retrying in 3s..." + sleep 3 + ((ATTEMPT++)) + done + + if [[ "$SUCCESS" == true && "$FRESH_INIT" == true ]]; then + echo "Proceeding to database/user creation..." + cockroach sql --insecure --host="$SELF_FQDN:26257" <&1) + ADD_EXIT_CODE=$? + set -e + + if [[ $ADD_EXIT_CODE -eq 0 ]]; then + echo "Region $loc added successfully." + ADD_REGION_SUCCESS=true + break + fi + + echo "Attempt $attempt failed to add region $loc: $ADD_OUTPUT — retrying in 5s..." + sleep 5 + done + + if [[ "$ADD_REGION_SUCCESS" == false ]]; then + echo "ERROR: Failed to add region $loc after 10 attempts." + fi + fi + done + + echo "Setting survival goal..." 
+ cockroach sql --insecure --host="$SELF_FQDN:26257" \ + --execute="ALTER DATABASE {{ .Values.database.name | quote }} SURVIVE REGION FAILURE;" + + echo "Multi-region configuration complete." + else + echo "Single/dual location deployment — skipping multi-region configuration." + fi + + elif [[ "$SUCCESS" == true && "$FRESH_INIT" == false ]]; then + echo "Existing cluster detected — skipping region & DB configuration." + else + echo "ERROR: Failed to initialize cluster after $MAX_ATTEMPTS attempts." + fi + ) & + fi + + # Replace shell with CockroachDB — receives SIGTERM, container exits if CockroachDB crashes + exec cockroach start \ + --insecure \ + --listen-addr=0.0.0.0:26257 \ + --http-addr=0.0.0.0:8080 \ + --advertise-addr="$SELF_FQDN" \ + --join="$JOIN_HOSTS" \ + --cluster-name={{ include "cockroach.name" . }} \ + --locality=region=${LOCATION} +--- +kind: secret +name: {{ include "cockroach.secretDatabase.name" . }} +description: CockroachDB config +tags: {{- include "cockroach.tags" . | nindent 4 }} +type: dictionary +data: + db: {{ .Values.database.name | quote }} + user: {{ .Values.database.user | quote }} diff --git a/cockroach/versions/1.4.0/templates/volumeset.yaml b/cockroach/versions/1.4.0/templates/volumeset.yaml new file mode 100644 index 00000000..ee6c6d31 --- /dev/null +++ b/cockroach/versions/1.4.0/templates/volumeset.yaml @@ -0,0 +1,18 @@ +kind: volumeset +name: {{ include "cockroach.volume.name" . }} +description: CockroachDB volumeset +gvc: {{ .Values.gvc.name }} +tags: {{- include "cockroach.tags" . 
| nindent 4 }} +spec: + fileSystemType: ext4 + initialCapacity: {{ .Values.volumeset.capacity }} + {{- if .Values.volumeset.autoscaling.enabled }} + autoscaling: + maxCapacity: {{ .Values.volumeset.autoscaling.maxCapacity }} + minFreePercentage: {{ .Values.volumeset.autoscaling.minFreePercentage }} + scalingFactor: {{ .Values.volumeset.autoscaling.scalingFactor }} + {{- end }} + performanceClass: general-purpose-ssd + snapshots: + createFinalSnapshot: true + retentionDuration: 7d diff --git a/cockroach/versions/1.4.0/templates/workload-backup.yaml b/cockroach/versions/1.4.0/templates/workload-backup.yaml new file mode 100644 index 00000000..2c9d2d55 --- /dev/null +++ b/cockroach/versions/1.4.0/templates/workload-backup.yaml @@ -0,0 +1,76 @@ +{{- if .Values.backup.enabled }} +--- +kind: workload +name: {{ include "cockroach.backup.name" . }} +description: CockroachDB Backup +tags: {{- include "cockroach.tags" . | nindent 4 }} +gvc: {{ .Values.gvc.name }} +spec: + type: cron + containers: + - name: backup-cockroach + cpu: {{ .Values.backup.resources.cpu | quote }} + memory: {{ .Values.backup.resources.memory | quote }} + image: {{ .Values.backup.image }} + inheritEnv: false + env: + - name: BACKUP_PROVIDER + value: {{ .Values.backup.provider }} + - name: COCKROACH_HOST + {{- if .Values.pgbouncer.enabled }} + value: {{ include "cockroach.pgbouncer.name" . }}.{{ .Values.gvc.name }}.cpln.local + {{- else }} + value: {{ include "cockroach.name" . 
}}.{{ .Values.gvc.name }}.cpln.local + {{- end }} + - name: COCKROACH_PORT + {{- if .Values.pgbouncer.enabled }} + value: "5432" + {{- else }} + value: "26257" + {{- end }} + - name: COCKROACH_DB + value: {{ .Values.database.name | quote }} + {{- if eq .Values.backup.provider "aws" }} + - name: AWS_BUCKET + value: {{ .Values.backup.aws.bucket }} + - name: AWS_REGION + value: {{ .Values.backup.aws.region }} + - name: AWS_PREFIX + value: {{ .Values.backup.aws.prefix | quote }} + {{- end }} + {{- if eq .Values.backup.provider "gcp" }} + - name: GCP_BUCKET + value: {{ .Values.backup.gcp.bucket }} + - name: GCP_PREFIX + value: {{ .Values.backup.gcp.prefix | quote }} + {{- end }} + defaultOptions: + autoscaling: + maxConcurrency: 0 + maxScale: 1 + metric: disabled + minScale: 1 + scaleToZeroDelay: 300 + target: 95 + capacityAI: false + debug: false + suspend: true + timeoutSeconds: 3600 + localOptions: + - location: //location/{{ .Values.backup.location }} + suspend: false + firewallConfig: + external: + inboundAllowCIDR: [] + outboundAllowCIDR: [] + internal: + inboundAllowType: same-gvc + inboundAllowWorkload: [] + job: + activeDeadlineSeconds: {{ .Values.backup.activeDeadlineSeconds }} + concurrencyPolicy: Forbid + historyLimit: 5 + restartPolicy: Never + schedule: {{ .Values.backup.schedule }} + supportDynamicTags: false +{{- end }} diff --git a/cockroach/versions/1.4.0/templates/workload-cockroach.yaml b/cockroach/versions/1.4.0/templates/workload-cockroach.yaml new file mode 100644 index 00000000..01c2ccc1 --- /dev/null +++ b/cockroach/versions/1.4.0/templates/workload-cockroach.yaml @@ -0,0 +1,105 @@ +{{- include "cockroach.validateLocations" . -}} +--- +kind: workload +name: {{ include "cockroach.name" . }} +description: CockroachDB cluster +tags: {{- include "cockroach.tags" . 
| nindent 2 }} +spec: + type: stateful + containers: + - name: cockroach + cpu: {{ .Values.resources.cpu | quote }} + memory: {{ .Values.resources.memory | quote }} + image: {{ .Values.image }} + command: "/bin/bash" + args: + - "/cockroach/start.sh" + inheritEnv: false + env: + - name: DB_NAME + value: cpln://secret/{{ include "cockroach.secretDatabase.name" . }}.db + - name: DB_USER + value: cpln://secret/{{ include "cockroach.secretDatabase.name" . }}.user + ports: + - number: 8080 + protocol: tcp + - number: 26257 + protocol: tcp + lifecycle: + preStop: + exec: + command: + - /bin/sh + - -c + - "cockroach node drain --insecure --host=localhost:26257 --self || true" + livenessProbe: + httpGet: + path: /health + port: 8080 + periodSeconds: 10 + timeoutSeconds: 5 + failureThreshold: 6 + readinessProbe: + httpGet: + path: /health + port: 8080 + periodSeconds: 5 + timeoutSeconds: 3 + failureThreshold: 3 + volumes: + - path: /cockroach/cockroach-data + recoveryPolicy: retain + uri: cpln://volumeset/{{ include "cockroach.volume.name" . }} + - path: /cockroach/start.sh + uri: cpln://secret/{{ include "cockroach.secretStartup.name" . }} + defaultOptions: + autoscaling: + maxConcurrency: 0 + maxScale: {{ (index .Values.gvc.locations 0).replicas | int }} + metric: disabled + minScale: {{ (index .Values.gvc.locations 0).replicas | int }} + scaleToZeroDelay: 300 + target: 100 + capacityAI: false + debug: false + multiZone: + enabled: {{ .Values.multiZone }} + suspend: false + timeoutSeconds: 10 + firewallConfig: + external: + outboundAllowCIDR: + - 0.0.0.0/0 + internal: + {{- if .Values.pgbouncer.enabled }} + inboundAllowType: workload-list + inboundAllowWorkload: + - //gvc/{{ .Values.gvc.name }}/workload/{{ include "cockroach.name" . }} + - //gvc/{{ .Values.gvc.name }}/workload/{{ include "cockroach.pgbouncer.name" . 
}} + {{- else }} + inboundAllowType: {{ .Values.internal_access.type }} + {{- if .Values.internal_access.workloads }} + inboundAllowWorkload: {{ .Values.internal_access.workloads | toYaml | nindent 8 }} + {{- end }} + {{- end }} + identityLink: //gvc/{{ .Values.gvc.name }}/identity/{{ include "cockroach.identity.name" . }} + loadBalancer: + replicaDirect: true + localOptions: + {{- range $location := .Values.gvc.locations }} + - autoscaling: + maxConcurrency: 0 + maxScale: {{ if eq ($location.replicas | int) 0 }}1{{ else }}{{ $location.replicas | int }}{{ end }} + metric: disabled + minScale: {{ if eq ($location.replicas | int) 0 }}0{{ else }}{{ $location.replicas | int }}{{ end }} + scaleToZeroDelay: 300 + target: 100 + capacityAI: false + debug: false + location: //location/{{ $location.name }} + multiZone: + enabled: {{ $.Values.multiZone }} + suspend: {{ if eq ($location.replicas | int) 0 }}true{{ else }}false{{ end }} + timeoutSeconds: 10 + {{- end }} + supportDynamicTags: false diff --git a/cockroach/versions/1.4.0/templates/workload-pgbouncer.yaml b/cockroach/versions/1.4.0/templates/workload-pgbouncer.yaml new file mode 100644 index 00000000..d03b82e3 --- /dev/null +++ b/cockroach/versions/1.4.0/templates/workload-pgbouncer.yaml @@ -0,0 +1,62 @@ +{{- if .Values.pgbouncer.enabled }} +--- +kind: workload +name: {{ include "cockroach.pgbouncer.name" . }} +description: PgBouncer Connection Pooler for CockroachDB +tags: + {{- include "cockroach.tags" . 
| nindent 4 }} +spec: + type: standard + containers: + - name: pgbouncer + image: {{ .Values.pgbouncer.image }} + cpu: {{ .Values.pgbouncer.resources.cpu | quote }} + minCpu: {{ .Values.pgbouncer.resources.minCpu | quote }} + memory: {{ .Values.pgbouncer.resources.memory | quote }} + minMemory: {{ .Values.pgbouncer.resources.minMemory | quote }} + inheritEnv: false + command: "/bin/sh" + args: + - "/pgbouncer/start.sh" + volumes: + - path: /pgbouncer/start.sh + uri: cpln://secret/{{ include "cockroach.pgbouncer.secretStartup.name" . }} + readinessProbe: + exec: + command: + - /bin/sh + - -c + - "psql -h 127.0.0.1 -p 5432 -U root -d {{ .Values.database.name }} -c 'SELECT 1' -q 2>/dev/null" + periodSeconds: 5 + timeoutSeconds: 3 + failureThreshold: 4 + ports: + - number: 5432 + protocol: tcp + identityLink: //gvc/{{ .Values.gvc.name }}/identity/{{ include "cockroach.identity.name" . }} + defaultOptions: + autoscaling: + metric: rps + minScale: {{ .Values.pgbouncer.minReplicas }} + maxScale: {{ .Values.pgbouncer.maxReplicas }} + maxConcurrency: 0 + scaleToZeroDelay: 300 + target: 100 + capacityAI: false + debug: false + timeoutSeconds: 5 + firewallConfig: + internal: + inboundAllowType: {{ .Values.pgbouncer.internal_access.type }} + {{- if .Values.pgbouncer.internal_access.workloads }} + inboundAllowWorkload: {{ .Values.pgbouncer.internal_access.workloads | toYaml | nindent 8 }} + {{- end }} + external: + inboundAllowCIDR: [] + outboundAllowCIDR: [] + loadBalancer: + direct: + enabled: false + ports: [] + replicaDirect: false +{{- end }} diff --git a/cockroach/versions/1.4.0/values.yaml b/cockroach/versions/1.4.0/values.yaml new file mode 100644 index 00000000..accbbed6 --- /dev/null +++ b/cockroach/versions/1.4.0/values.yaml @@ -0,0 +1,79 @@ +gvc: + name: cockroach-gvc + locations: + - name: aws-us-east-1 + replicas: 3 + - name: aws-eu-central-1 + replicas: 3 + - name: aws-us-west-2 + replicas: 3 + +image: cockroachdb/cockroach:v25.4.0 + +multiZone: false + 
+resources: + cpu: 2 + memory: 4Gi + +database: + name: mydb + user: myuser + +volumeset: + capacity: 10 # Initial capacity in GiB (minimum is 10) + autoscaling: + enabled: false # Set to true to enable autoscaling + maxCapacity: 100 # Maximum capacity in GiB when autoscaling is enabled + minFreePercentage: 10 # Minimum free percentage to trigger scaling when autoscaling is enabled + scalingFactor: 1.2 # Scaling factor to determine how much to scale up when autoscaling is triggered + +internal_access: + type: same-gvc # options: same-gvc, same-org, workload-list; used for CockroachDB when pgbouncer is disabled + workloads: # Note: can only be used if type is workload-list + #- //gvc/GVC_NAME/workload/WORKLOAD_NAME + +pgbouncer: + enabled: true + image: edoburu/pgbouncer:v1.25.1-p0 + poolMode: transaction # options: session, transaction, statement + defaultPoolSize: 25 # number of real CockroachDB connections PgBouncer maintains per pod + maxClientConn: 250 # maximum number of client connections PgBouncer accepts per pod + maxDbConnections: 100 # hard cap on total CockroachDB connections regardless of how many PgBouncer pods are running + minReplicas: 2 + maxReplicas: 4 + serverCheckDelay: 30 # seconds between idle server connection health checks (default 30) + serverConnectTimeout: 2 # seconds before giving up on a new server connection (default 15) + serverLoginRetry: 0 # seconds before retrying a failed server login; 0 = no caching of failures (default 15) + clientLoginTimeout: 10 # seconds before rejecting a client waiting for login (default 60) + queryWaitTimeout: 10 # seconds before rejecting a logged-in client waiting for a server connection (default 120) + internal_access: + type: same-gvc # options: same-gvc, same-org, workload-list; controls who can connect to PgBouncer + workloads: # Note: can only be used if type is workload-list + #- //gvc/GVC_NAME/workload/WORKLOAD_NAME + resources: + cpu: 200m + minCpu: 100m + memory: 1Gi + minMemory: 128Mi + +backup: 
+ enabled: false + image: ghcr.io/controlplane-com/backup-images/cockroach-backup:1.1 + schedule: "0 2 * * *" + activeDeadlineSeconds: 14400 # hard kill after 4 hours if backup hangs + location: aws-us-east-1 # run the backup job in the same region as your storage bucket + resources: + cpu: 500m + memory: 512Mi + provider: aws # options: aws, gcp + aws: + bucket: my-backup-bucket + region: us-east-1 + cloudAccountName: my-backup-cloudaccount + policyName: my-backup-policy + prefix: cockroach/backups + gcp: + bucket: my-backup-bucket + cloudAccountName: my-backup-cloudaccount + prefix: cockroach/backups diff --git a/coraza/versions/1.1.1/Chart.yaml b/coraza/versions/1.1.1/Chart.yaml new file mode 100644 index 00000000..e3bb23ea --- /dev/null +++ b/coraza/versions/1.1.1/Chart.yaml @@ -0,0 +1,18 @@ +apiVersion: v2 +name: coraza +description: Coraza web application firewall (WAF) for Control Plane + +type: application +version: 1.1.1 +appVersion: "20241018" + +dependencies: + - name: cpln-common + version: 1.0.0 + repository: "oci://ghcr.io/controlplane-com/templates" + +annotations: + created: "2025-10-23" + lastModified: "2026-05-13" + category: "security" + createsGvc: false \ No newline at end of file diff --git a/coraza/versions/1.1.1/README.md b/coraza/versions/1.1.1/README.md new file mode 100644 index 00000000..6789b964 --- /dev/null +++ b/coraza/versions/1.1.1/README.md @@ -0,0 +1,53 @@ +## Coraza WAF App + +Creates a Coraza Web Application Firewall (WAF) with OWASP Core Rule Set (CRS) integration that proxies traffic to a target workload, providing comprehensive security filtering and protection. 
+ +### Configuration + +The following values can be configured in your values file: + +- `targetWorkload`: The internal name of the workload to proxy traffic to (`WORKLOAD_NAME.GVC_NAME.cpln.local`) +- `targetPort`: The port of the target workload to proxy traffic to +- `WAFPort`: The port on the WAF workload to expose to the internet +- `resources`: Reserved resources for the workload +- `multiZone`: Deploys replicas across multiple zones +- `diskBodyInspection`: When `true` (default), request bodies exceeding the 512KB in-memory limit are buffered to disk at `/tmp/coraza` for full inspection up to 12.5MB. When `false`, all body inspection is kept in memory — bodies up to 12.5MB are held in memory rather than spilling to disk, which avoids disk I/O but increases memory pressure on large requests. + +### Logging + +All Coraza logging is currently sent to `/dev/stdout` to be readable in the Control Plane built-in logging interface. Logging can be redirected by using the existing environment variables in the workload configuration. + +### Advanced Configuration + +Coraza configuration is largely specified through environment variables and can be customized by the user once installed. You can modify these environment variables in the workload configuration to adjust Coraza's behavior, logging levels, and security policies according to your specific requirements. + +### Usage + +The Coraza WAF will act as a reverse proxy, filtering incoming requests before forwarding them to your target workload. Configure the `targetWorkload` and `targetPort` values to point to your application, then the WAF will be accessible on the specified `WAFPort`. + +**Important**: The target workload must be configured with internal access set to `same-gvc`, `same-org`, or specifically allow this workload in order for the WAF to reach it. 
+ +### Security Features + +Coraza provides web application firewall capabilities including: +- Automatic integration of OWASP Core Rule Set (CRS) for comprehensive protection +- Request filtering and validation +- Protection against common web attacks +- Custom rule configuration +- Traffic monitoring and logging + +### Custom Rules + +After installation, you can add custom rules by editing the created secret with the suffix `coraza-custom-rules`. The secret contains an example rule that blocks requests containing "attack" in the URI: + +``` +SecRule REQUEST_URI "@rx attack" "id:1001,phase:1,deny,msg:'Blocked attack attempt'" +``` + +**Note**: After modifying the custom rules secret, you must restart the workload replicas for the changes to take effect. See the Coraza and CRS documentation below for instructions on creating custom rules. + +## Additional Resources + +- [OWASP Coraza Docs](https://coraza.io/docs/tutorials/introduction/) +- [OWASP CRS Docs](https://coreruleset.org/docs/) +- [Coraza Caddy README](https://github.com/coreruleset/coraza-crs-docker#) \ No newline at end of file diff --git a/coraza/versions/1.1.1/templates/_helpers.tpl b/coraza/versions/1.1.1/templates/_helpers.tpl new file mode 100644 index 00000000..be9ee0b8 --- /dev/null +++ b/coraza/versions/1.1.1/templates/_helpers.tpl @@ -0,0 +1,46 @@ +{{/* Resource Naming */}} + +{{/* +Coraza Workload Name +*/}} +{{- define "coraza.name" -}} +{{- printf "%s-coraza-waf" .Release.Name }} +{{- end }} + +{{/* +Coraza Secret Custom Rules Name +*/}} +{{- define "coraza.secretRules.name" -}} +{{- printf "%s-coraza-custom-rules" .Release.Name }} +{{- end }} + +{{/* +Coraza Secret Startup Name +*/}} +{{- define "coraza.secretStartup.name" -}} +{{- printf "%s-coraza-startup" .Release.Name }} +{{- end }} + +{{/* +Coraza Identity Name +*/}} +{{- define "coraza.identity.name" -}} +{{- printf "%s-coraza-identity" .Release.Name }} +{{- end }} + +{{/* +Coraza Policy Name +*/}} +{{- define "coraza.policy.name" -}} 
+{{- printf "%s-coraza-policy" .Release.Name }} +{{- end }} + + +{{/* Labeling */}} + +{{/* +Common labels +*/}} +{{- define "coraza.tags" -}} +{{- include "cpln-common.tags" . }} +{{- end }} diff --git a/coraza/versions/1.1.1/templates/identity.yaml b/coraza/versions/1.1.1/templates/identity.yaml new file mode 100644 index 00000000..aa2418c8 --- /dev/null +++ b/coraza/versions/1.1.1/templates/identity.yaml @@ -0,0 +1,6 @@ +--- +kind: identity +gvc: {{ .Values.global.cpln.gvc }} +name: {{ include "coraza.identity.name" . }} +description: Coraza identity +tags: {{- include "coraza.tags" . | nindent 4 }} \ No newline at end of file diff --git a/coraza/versions/1.1.1/templates/policy.yaml b/coraza/versions/1.1.1/templates/policy.yaml new file mode 100644 index 00000000..c560cae9 --- /dev/null +++ b/coraza/versions/1.1.1/templates/policy.yaml @@ -0,0 +1,14 @@ +--- +kind: policy +name: {{ include "coraza.policy.name" . }} +description: Coraza WAF policy +tags: {{- include "coraza.tags" . | nindent 4 }} +bindings: + - permissions: + - reveal + principalLinks: + - //gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "coraza.identity.name" . }} +targetKind: secret +targetLinks: + - //secret/{{ include "coraza.secretStartup.name" . }} + - //secret/{{ include "coraza.secretRules.name" . }} \ No newline at end of file diff --git a/coraza/versions/1.1.1/templates/secret-custom-rules.yaml b/coraza/versions/1.1.1/templates/secret-custom-rules.yaml new file mode 100644 index 00000000..0e025824 --- /dev/null +++ b/coraza/versions/1.1.1/templates/secret-custom-rules.yaml @@ -0,0 +1,15 @@ +--- +kind: secret +name: {{ include "coraza.secretRules.name" . }} +description: Coraza WAF custom rules +tags: {{- include "coraza.tags" . 
| nindent 4 }} +type: opaque +data: + encoding: plain + payload: |- + # Add your custom rules here + + # Example Rule: block requests containing "attack" in the URI + # Test by querying the endpoint with /attack + + SecRule REQUEST_URI "@rx attack" "id:1001,phase:1,deny,msg:'Blocked attack attempt'" \ No newline at end of file diff --git a/coraza/versions/1.1.1/templates/secret-startup.yaml b/coraza/versions/1.1.1/templates/secret-startup.yaml new file mode 100644 index 00000000..50c37dda --- /dev/null +++ b/coraza/versions/1.1.1/templates/secret-startup.yaml @@ -0,0 +1,88 @@ +--- +kind: secret +name: {{ include "coraza.secretStartup.name" . }} +description: Coraza WAF startup script +tags: {{- include "coraza.tags" . | nindent 4 }} +type: opaque +data: + encoding: plain + payload: > + #!/bin/sh + + set -u + + # Script sends patch request to Caddy admin API to configure proper header for reverse proxy to target workload + + # Caddy admin API URL and port + + ADMIN_URL="127.0.0.1" + + ADMIN_PORT="2019" + + ROUTE_PATH="/config/apps/http/servers/srv0/routes/0/handle/1" + + PATCH_FILE="/tmp/proxy_patch.json" + + + {{ if .Values.diskBodyInspection }} + mkdir -p /tmp/coraza + {{ end }} + + echo "[INFO] Waiting for Caddy admin API..." + + until wget -q --spider "http://${ADMIN_URL}:${ADMIN_PORT}/config/" + 2>/dev/null; do + sleep 1 + done + + echo "[INFO] Caddy admin API is up." + + + # Create patch file with proper header + + cat > "${PATCH_FILE}" <<'EOF' + + { + "handler": "reverse_proxy", + "headers": { + "request": { + "set": { + "Host": ["{{ .Values.targetWorkload }}"] + } + } + }, + "trusted_proxies": ["192.168.0.0/16","172.16.0.0/12","10.0.0.0/8","127.0.0.1/8","fd00::/8","::1"], + "upstreams": [ + { "dial": "{{ .Values.targetWorkload }}:{{ .Values.targetPort }}" } + ] + } + + EOF + + + echo "[INFO] Applying PATCH via netcat..." 
+ + LEN=$(wc -c < "${PATCH_FILE}") + + # Requests patch command to admin API using netcat + + RESPONSE=$( + { + printf "PATCH %s HTTP/1.1\r\nHost: %s:%s\r\nContent-Type: application/json\r\nContent-Length: %s\r\nConnection: close\r\n\r\n" \ + "${ROUTE_PATH}" "${ADMIN_URL}" "${ADMIN_PORT}" "${LEN}" + cat "${PATCH_FILE}" + } | nc ${ADMIN_URL} ${ADMIN_PORT} || true + ) + + + echo "$RESPONSE" + + + if echo "$RESPONSE" | grep -q "200 OK"; then + echo "[INFO] Patch succeeded." + else + echo "[WARN] Patch may have failed — see response above." + fi + + + rm -f "${PATCH_FILE}" \ No newline at end of file diff --git a/coraza/versions/1.1.1/templates/workload.yaml b/coraza/versions/1.1.1/templates/workload.yaml new file mode 100644 index 00000000..868d2e55 --- /dev/null +++ b/coraza/versions/1.1.1/templates/workload.yaml @@ -0,0 +1,82 @@ +kind: workload +name: {{ include "coraza.name" . }} +description: Coraza Web Application Firewall (WAF) +tags: {{- include "coraza.tags" . | nindent 4 }} +spec: + type: standard + containers: + - name: coraza-crs + cpu: {{ .Values.resources.cpu | quote }} + env: + - name: ACCESSLOG + value: /dev/stdout + - name: BACKEND + value: http://{{ .Values.targetWorkload }}:{{ .Values.targetPort }} + - name: CORAZA_AUDIT_LOG + value: /dev/stdout + - name: CORAZA_AUDIT_LOG_PARTS + value: ABDEFHIJZ + - name: CORAZA_DEBUG_LOG + value: /dev/stdout + - name: CORAZA_DEBUG_LOGLEVEL + value: '1' + - name: CORAZA_RULE_ENGINE + value: 'On' + {{- if not .Values.diskBodyInspection }} + - name: CORAZA_REQ_BODY_NOFILES_LIMIT + value: '13107200' + {{- end }} + - name: PORT + value: "{{ .Values.WAFPort }}" + image: {{ .Values.image }} + inheritEnv: false + lifecycle: + postStart: + exec: + command: + - /bin/sh + - /opt/coraza/startup.sh + memory: {{ .Values.resources.memory | quote }} + ports: + - number: {{ .Values.WAFPort }} + protocol: http + volumes: + - path: /opt/coraza/rules.d/custom.conf + recoveryPolicy: retain + uri: cpln://secret/{{ include 
"coraza.secretRules.name" . }} + - path: /opt/coraza/startup.sh + recoveryPolicy: retain + uri: cpln://secret/{{ include "coraza.secretStartup.name" . }} + defaultOptions: + multiZone: + enabled: {{ .Values.multiZone }} + autoscaling: + maxConcurrency: 0 + maxScale: 3 + metric: cpu + minScale: 1 + scaleToZeroDelay: 300 + target: 100 + capacityAI: false + debug: false + suspend: false + timeoutSeconds: 5 + firewallConfig: + external: + inboundAllowCIDR: + - 0.0.0.0/0 + inboundBlockedCIDR: [] + outboundAllowCIDR: [] + outboundAllowHostname: [] + outboundAllowPort: [] + outboundBlockedCIDR: [] + internal: + inboundAllowType: same-gvc + inboundAllowWorkload: [] + identityLink: //gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "coraza.identity.name" . }} + loadBalancer: + direct: + enabled: false + ports: [] + replicaDirect: false + supportDynamicTags: false diff --git a/coraza/versions/1.1.1/values.yaml b/coraza/versions/1.1.1/values.yaml new file mode 100644 index 00000000..a28eee54 --- /dev/null +++ b/coraza/versions/1.1.1/values.yaml @@ -0,0 +1,16 @@ +image: ghcr.io/coreruleset/coraza-crs@sha256:eed7280e0de4820507b500b1ee10de820c175165d5cce329609bf34f32977af8 # Coraza GitHub image + +# MUST BE CHANGED +targetWorkload: my-workload.my-gvc.cpln.local # Workload internal name of the workload to proxy traffic to + +targetPort: 8080 # Port of the workload to proxy traffic to + +WAFPort: 80 # Port on the WAF workload to expose to the internet + +resources: + cpu: 50m + memory: 128Mi + +multiZone: false + +diskBodyInspection: true # When true, request bodies exceeding the in-memory limit are buffered to disk for inspection. Disable to keep all body inspection in memory. 
\ No newline at end of file diff --git a/debezium-server/deploy-aws-us-east-2.yaml b/debezium-server/deploy-aws-us-east-2.yaml new file mode 100644 index 00000000..cc7b8eef --- /dev/null +++ b/debezium-server/deploy-aws-us-east-2.yaml @@ -0,0 +1,37 @@ +source: + type: postgres + database: + hostname: data-postgres-ha-proxy.aws-us-east-2.cpln.local + port: 5432 + name: test + user: username + password: password + serverName: dbserver1 + postgres: + slotName: debezium + publicationName: dbz_publication + pluginName: pgoutput + slotDropOnStop: false + heartbeatIntervalMs: 5000 + heartbeatActionQuery: "UPDATE debezium_heartbeat SET ts = now() WHERE id = 1" + offset: + storage: file + flushIntervalMs: 5000 + flushTimeoutMs: 10000 + errors: + retryDelayInitialMs: 300 + retryDelayMaxMs: 10000 + maxRetries: 10 + +sink: + type: kafka + kafka: + bootstrapServers: "replica-0.etl-cluster.aws-us-east-2.aws-us-east-2.cpln.local:9092,replica-1.etl-cluster.aws-us-east-2.aws-us-east-2.cpln.local:9092,replica-2.etl-cluster.aws-us-east-2.aws-us-east-2.cpln.local:9092" + securityProtocol: SASL_PLAINTEXT + saslMechanism: PLAIN + saslUsername: admin + saslPassword: your-admin-password + +format: + key: json + value: json diff --git a/debezium-server/icon.png b/debezium-server/icon.png new file mode 100644 index 00000000..f89204d1 Binary files /dev/null and b/debezium-server/icon.png differ diff --git a/debezium-server/versions/1.0.0/Chart.yaml b/debezium-server/versions/1.0.0/Chart.yaml new file mode 100644 index 00000000..de9327f6 --- /dev/null +++ b/debezium-server/versions/1.0.0/Chart.yaml @@ -0,0 +1,12 @@ +apiVersion: v2 +name: debezium-server +description: Debezium Server CDC app for Control Plane (standalone mode) +type: application +version: 1.0.0 +appVersion: "3.0" + +annotations: + created: "2026-04-03" + lastModified: "2026-04-03" + category: "event-streaming" + createsGvc: false diff --git a/debezium-server/versions/1.0.0/README.md b/debezium-server/versions/1.0.0/README.md 
new file mode 100644 index 00000000..02c3ac45 --- /dev/null +++ b/debezium-server/versions/1.0.0/README.md @@ -0,0 +1,347 @@ +# Debezium Server Template + +Debezium Server is a standalone Change Data Capture (CDC) application that streams database changes to various messaging systems. Unlike Debezium connectors that run on Kafka Connect, Debezium Server runs as a standalone application and can send events directly to Kafka, Redis, NATS, HTTP endpoints, cloud services, and more. + +## Overview + +This template deploys Debezium Server on Control Plane with: + +- Configurable source database connectors (PostgreSQL, MySQL, MongoDB, SQL Server, Oracle) +- Multiple sink options (Kafka, Redis, NATS JetStream, HTTP, AWS Kinesis, GCP Pub/Sub, Pulsar, Event Hubs) +- Flexible offset storage (file, Redis, JDBC) +- Universal Cloud Identity integration for AWS and GCP sinks +- Automatic secret management for credentials + +## Quick Start + +### PostgreSQL to Kafka + +```yaml +source: + type: postgres + database: + hostname: postgres.mygvc.cpln.local + port: 5432 + name: mydb + user: debezium + password: secret123 + serverName: myserver + tableIncludeList: "public.users,public.orders" + postgres: + slotName: debezium_slot + publicationName: dbz_publication + +sink: + type: kafka + kafka: + bootstrapServers: kafka.mygvc.cpln.local:9092 + topic: cdc-events + +format: + key: json + value: json +``` + +### MySQL to Redis Streams + +```yaml +source: + type: mysql + database: + hostname: mysql.mygvc.cpln.local + port: 3306 + name: mydb + user: debezium + password: secret123 + serverName: myserver + mysql: + serverId: 85744 + includeSchemaChanges: true + +sink: + type: redis + redis: + address: redis.mygvc.cpln.local:6379 + streamName: cdc-stream +``` + +### PostgreSQL to AWS Kinesis (Universal Cloud Identity) + +```yaml +source: + type: postgres + database: + hostname: my-rds-instance.us-east-1.rds.amazonaws.com + port: 5432 + name: mydb + user: debezium + password: secret123 + 
serverName: myserver + +sink: + type: kinesis + kinesis: + region: us-east-1 + streamName: cdc-events + credentialsProvider: default + cloudAccount: + enabled: true + name: my-aws-account +``` + +## Supported Sources + +| Database | Connector | Default Port | Key Configuration | +|----------|-----------|--------------|-------------------| +| PostgreSQL | PostgresConnector | 5432 | `slotName`, `publicationName`, `pluginName` | +| MySQL | MySqlConnector | 3306 | `serverId`, `includeSchemaChanges` | +| MongoDB | MongoDbConnector | 27017 | `connectionString`, `replicaSet` | +| SQL Server | SqlServerConnector | 1433 | `databaseNames`, `snapshotMode` | +| Oracle | OracleConnector | 1521 | `pdbName`, `logMiningStrategy` | + +### PostgreSQL Prerequisites + +1. Enable logical replication in `postgresql.conf`: + ``` + wal_level = logical + max_replication_slots = 4 + max_wal_senders = 4 + ``` + +2. Create a publication and replication slot: + ```sql + CREATE PUBLICATION dbz_publication FOR ALL TABLES; + -- Slot is created automatically by Debezium + ``` + +3. Grant permissions: + ```sql + GRANT USAGE ON SCHEMA public TO debezium; + GRANT SELECT ON ALL TABLES IN SCHEMA public TO debezium; + ALTER USER debezium REPLICATION; + ``` + +### MySQL Prerequisites + +1. Enable binary logging in `my.cnf`: + ``` + server-id = 1 + log_bin = mysql-bin + binlog_format = ROW + binlog_row_image = FULL + ``` + +2. 
Grant permissions: + ```sql + GRANT SELECT, RELOAD, SHOW DATABASES, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'debezium'@'%'; + ``` + +## Supported Sinks + +| Sink | Required Configuration | Notes | +|------|------------------------|-------| +| Kafka | `bootstrapServers` | Simple Kafka producer (no Kafka Connect required) | +| Redis | `address` | Redis Streams for real-time event streaming | +| NATS JetStream | `url` | Cloud-native messaging with persistence | +| HTTP | `url` | Webhooks and custom HTTP endpoints | +| Kinesis | `region`, `streamName` | AWS Kinesis (uses Universal Cloud Identity) | +| Pub/Sub | `projectId` | GCP Pub/Sub (uses Universal Cloud Identity) | +| Pulsar | `serviceUrl` | Apache Pulsar with optional authentication | +| Event Hubs | `connectionString`, `hubName` | Azure Event Hubs | + +## Offset Storage + +Debezium tracks the position of captured changes using offset storage. Three options are available: + +### File Storage (Default) + +Stores offsets in a local file. Requires a volumeset for persistence. + +```yaml +source: + offset: + storage: file + file: + filename: /debezium/data/offsets.dat + +volumeset: + capacity: 10 + performanceClass: general-purpose-ssd +``` + +### Redis Storage + +Stores offsets in Redis. No volumeset required. + +```yaml +source: + offset: + storage: redis + redis: + address: redis.mygvc.cpln.local:6379 + key: debezium:offsets + password: "" + ssl: false +``` + +### JDBC Storage + +Stores offsets in a relational database. No volumeset required. 
+ +```yaml +source: + offset: + storage: jdbc + jdbc: + url: jdbc:postgresql://postgres.mygvc.cpln.local:5432/offsets + user: debezium + password: secret123 + tableName: debezium_offsets +``` + +## Schema History (MySQL/SQL Server Only) + +MySQL and SQL Server connectors require schema history storage to track DDL changes: + +```yaml +source: + type: mysql + schemaHistory: + storage: file # or: redis, jdbc + file: + filename: /debezium/data/schema-history.dat +``` + +## Serialization Formats + +Supports JSON, Avro, and Protobuf serialization: + +```yaml +format: + key: json + value: json + + # For Avro/Protobuf, configure schema registry: + schemaRegistry: + url: http://schema-registry.mygvc.cpln.local:8081 + username: "" + password: "" +``` + +## Universal Cloud Identity + +For AWS Kinesis and GCP Pub/Sub sinks, this template integrates with Control Plane's Universal Cloud Identity for credential-less authentication. + +### AWS Kinesis + +1. Create an AWS cloud account in Control Plane +2. Configure the identity with appropriate IAM policies +3. 
Enable the cloud account in your values: + +```yaml +sink: + type: kinesis + kinesis: + region: us-east-1 + streamName: my-stream + credentialsProvider: default + cloudAccount: + enabled: true + name: my-aws-account +``` + +### GCP Pub/Sub + +```yaml +sink: + type: pubsub + pubsub: + projectId: my-gcp-project + cloudAccount: + enabled: true + name: my-gcp-account +``` + +## Resource Configuration + +```yaml +resources: + cpu: 500m # CPU allocation + memory: 512Mi # Memory allocation + +volumeset: + capacity: 10 # GiB (only used with file-based offset or schema history storage) + performanceClass: general-purpose-ssd +``` + +## Firewall Configuration + +```yaml +firewall: + internal: + inboundAllowType: same-gvc # none, same-gvc, same-org, workload-list + workloads: [] # For workload-list type + external: + outboundAllowCIDR: + - 0.0.0.0/0 # Required for external database connectivity +``` + +## Health Checks + +Debezium Server exposes Quarkus health endpoints: + +- **Readiness**: `/q/health/ready` - Checks if the connector is ready +- **Liveness**: `/q/health/live` - Checks if the server is alive + +## Installation + +```bash +cpln helm install debezium ./debezium-server/versions/1.0.0 \ + --gvc my-gvc \ + -f my-values.yaml +``` + +## Verification + +1. Check workload status: + ```bash + cpln workload get debezium-debezium --gvc my-gvc + ``` + +2. Check health endpoint: + ```bash + curl http://debezium-debezium.my-gvc.cpln.local:8080/q/health + ``` + +3. View logs: + ```bash + cpln workload logs debezium-debezium --gvc my-gvc + ``` + +4. Test CDC by making changes in the source database and verifying events appear in the configured sink. 
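
The health check in the verification steps can also be scripted so that a pipeline waits for the connector before testing CDC. The following is a minimal sketch, not part of the template: the function names `health_status` and `wait_ready` are hypothetical, and it assumes the readiness endpoint returns a JSON body whose first `"status"` field is `UP` once the connector is ready (the usual Quarkus health response shape).

```bash
# Hypothetical helper functions; not shipped with the template.

# Print the first "status" value found in a health JSON document read from stdin.
health_status() {
  grep -o '"status"[[:space:]]*:[[:space:]]*"[A-Z]*"' | head -n 1 | sed 's/.*"\([A-Z]*\)"$/\1/'
}

# wait_ready URL [ATTEMPTS] [DELAY_SECONDS]
# Poll URL until it reports UP; return 1 once the attempts are exhausted.
wait_ready() {
  url="$1"; attempts="${2:-30}"; delay="${3:-5}"; i=1
  while [ "$i" -le "$attempts" ]; do
    # With -f, curl treats HTTP error responses as failures, so status stays empty.
    status="$(curl -fsS --max-time 5 "$url" 2>/dev/null | health_status)"
    if [ "$status" = "UP" ]; then
      echo "ready after $i attempt(s)"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "not ready after $attempts attempt(s)" >&2
  return 1
}
```

To use it, pass the workload's readiness URL, for example `wait_ready "http://<workload>.<gvc>.cpln.local:8080/q/health/ready" 30 5`, where `<workload>` and `<gvc>` are placeholders for your deployment.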
+ +## Troubleshooting + +### Connector Not Starting + +- Check database connectivity and credentials +- Verify replication permissions are granted +- Review logs for specific error messages + +### Offset Storage Issues + +- For file storage: ensure volumeset is properly mounted +- For Redis/JDBC: verify connectivity and credentials +- Check that the storage backend is accessible from the GVC + +### Sink Delivery Failures + +- Verify sink connectivity and authentication +- For cloud sinks (Kinesis/Pub/Sub): ensure cloud account is properly configured +- Check firewall rules allow outbound traffic to the sink + +## Resources + +- [Debezium Documentation](https://debezium.io/documentation/) +- [Debezium Server Documentation](https://debezium.io/documentation/reference/stable/operations/debezium-server.html) +- [Control Plane Documentation](https://docs.controlplane.com/) diff --git a/debezium-server/versions/1.0.0/templates/_helpers.tpl b/debezium-server/versions/1.0.0/templates/_helpers.tpl new file mode 100644 index 00000000..d99d2eb9 --- /dev/null +++ b/debezium-server/versions/1.0.0/templates/_helpers.tpl @@ -0,0 +1,264 @@ +{{/* +================================================================================ +Resource Naming +================================================================================ +*/}} + +{{/* +Debezium Server Workload Name +*/}} +{{- define "debezium.name" -}} +{{- printf "%s-debezium" .Release.Name }} +{{- end }} + +{{/* +Debezium Identity Name +*/}} +{{- define "debezium.identity.name" -}} +{{- printf "%s-debezium-identity" .Release.Name }} +{{- end }} + +{{/* +Debezium Policy Name +*/}} +{{- define "debezium.policy.name" -}} +{{- printf "%s-debezium-policy" .Release.Name }} +{{- end }} + +{{/* +Debezium Config Secret Name (opaque - application.properties) +*/}} +{{- define "debezium.config.name" -}} +{{- printf "%s-debezium-config" .Release.Name }} +{{- end }} + +{{/* +Debezium Credentials Secret Name (dictionary) +*/}} +{{- define 
"debezium.credentials.name" -}} +{{- printf "%s-debezium-credentials" .Release.Name }} +{{- end }} + +{{/* +Debezium Volumeset Name +*/}} +{{- define "debezium.volumeset.name" -}} +{{- printf "%s-debezium-data" .Release.Name }} +{{- end }} + +{{/* +Debezium Entrypoint Secret Name +*/}} +{{- define "debezium.entrypoint.name" -}} +{{- printf "%s-debezium-entrypoint" .Release.Name }} +{{- end }} + +{{/* +================================================================================ +Validation Helpers +================================================================================ +*/}} + +{{/* +Validate source configuration +*/}} +{{- define "debezium.validateSource" -}} +{{- $validTypes := list "postgres" "mysql" "mongodb" "sqlserver" "oracle" -}} +{{- if not (has .Values.source.type $validTypes) -}} +{{- fail (printf "Invalid source.type '%s'. Must be one of: %s" .Values.source.type (join ", " $validTypes)) -}} +{{- end -}} +{{- if not .Values.source.database.hostname -}} +{{- fail "source.database.hostname is required" -}} +{{- end -}} +{{- if not .Values.source.database.name -}} +{{- fail "source.database.name is required" -}} +{{- end -}} +{{- if not .Values.source.database.user -}} +{{- fail "source.database.user is required" -}} +{{- end -}} +{{- if not .Values.source.database.password -}} +{{- fail "source.database.password is required" -}} +{{- end -}} +{{- end -}} + +{{/* +Validate sink configuration +*/}} +{{- define "debezium.validateSink" -}} +{{- $validTypes := list "kafka" "redis" "nats-jetstream" "http" "kinesis" "pubsub" "pulsar" "eventhubs" -}} +{{- if not (has .Values.sink.type $validTypes) -}} +{{- fail (printf "Invalid sink.type '%s'. 
Must be one of: %s" .Values.sink.type (join ", " $validTypes)) -}} +{{- end -}} +{{- if eq .Values.sink.type "kafka" -}} + {{- if not .Values.sink.kafka.bootstrapServers -}} + {{- fail "sink.kafka.bootstrapServers is required when sink.type is 'kafka'" -}} + {{- end -}} +{{- end -}} +{{- if eq .Values.sink.type "redis" -}} + {{- if not .Values.sink.redis.address -}} + {{- fail "sink.redis.address is required when sink.type is 'redis'" -}} + {{- end -}} +{{- end -}} +{{- if eq .Values.sink.type "nats-jetstream" -}} + {{- if not .Values.sink.nats.url -}} + {{- fail "sink.nats.url is required when sink.type is 'nats-jetstream'" -}} + {{- end -}} +{{- end -}} +{{- if eq .Values.sink.type "http" -}} + {{- if not .Values.sink.http.url -}} + {{- fail "sink.http.url is required when sink.type is 'http'" -}} + {{- end -}} +{{- end -}} +{{- if eq .Values.sink.type "kinesis" -}} + {{- if not .Values.sink.kinesis.region -}} + {{- fail "sink.kinesis.region is required when sink.type is 'kinesis'" -}} + {{- end -}} + {{- if not .Values.sink.kinesis.streamName -}} + {{- fail "sink.kinesis.streamName is required when sink.type is 'kinesis'" -}} + {{- end -}} +{{- end -}} +{{- if eq .Values.sink.type "pubsub" -}} + {{- if not .Values.sink.pubsub.projectId -}} + {{- fail "sink.pubsub.projectId is required when sink.type is 'pubsub'" -}} + {{- end -}} +{{- end -}} +{{- if eq .Values.sink.type "pulsar" -}} + {{- if not .Values.sink.pulsar.serviceUrl -}} + {{- fail "sink.pulsar.serviceUrl is required when sink.type is 'pulsar'" -}} + {{- end -}} +{{- end -}} +{{- if eq .Values.sink.type "eventhubs" -}} + {{- if not .Values.sink.eventhubs.connectionString -}} + {{- fail "sink.eventhubs.connectionString is required when sink.type is 'eventhubs'" -}} + {{- end -}} + {{- if not .Values.sink.eventhubs.hubName -}} + {{- fail "sink.eventhubs.hubName is required when sink.type is 'eventhubs'" -}} + {{- end -}} +{{- end -}} +{{- end -}} + +{{/* +Validate offset storage configuration +*/}} +{{- 
define "debezium.validateOffsetStorage" -}} +{{- $validTypes := list "file" "redis" "jdbc" -}} +{{- if not (has .Values.source.offset.storage $validTypes) -}} +{{- fail (printf "Invalid source.offset.storage '%s'. Must be one of: %s" .Values.source.offset.storage (join ", " $validTypes)) -}} +{{- end -}} +{{- if eq .Values.source.offset.storage "redis" -}} + {{- if not .Values.source.offset.redis.address -}} + {{- fail "source.offset.redis.address is required when offset storage is 'redis'" -}} + {{- end -}} +{{- end -}} +{{- if eq .Values.source.offset.storage "jdbc" -}} + {{- if not .Values.source.offset.jdbc.url -}} + {{- fail "source.offset.jdbc.url is required when offset storage is 'jdbc'" -}} + {{- end -}} +{{- end -}} +{{- end -}} + +{{/* +================================================================================ +Connector Class Mapping +================================================================================ +*/}} + +{{/* +Get the Debezium connector class for the source type +*/}} +{{- define "debezium.connectorClass" -}} +{{- $connectorMap := dict + "postgres" "io.debezium.connector.postgresql.PostgresConnector" + "mysql" "io.debezium.connector.mysql.MySqlConnector" + "mongodb" "io.debezium.connector.mongodb.MongoDbConnector" + "sqlserver" "io.debezium.connector.sqlserver.SqlServerConnector" + "oracle" "io.debezium.connector.oracle.OracleConnector" +-}} +{{- get $connectorMap .Values.source.type -}} +{{- end -}} + +{{/* +Get the default port for the source type +*/}} +{{- define "debezium.defaultPort" -}} +{{- $portMap := dict + "postgres" 5432 + "mysql" 3306 + "mongodb" 27017 + "sqlserver" 1433 + "oracle" 1521 +-}} +{{- get $portMap .Values.source.type -}} +{{- end -}} + +{{/* +Get the effective database port +*/}} +{{- define "debezium.databasePort" -}} +{{- if .Values.source.database.port -}} +{{- .Values.source.database.port -}} +{{- else -}} +{{- include "debezium.defaultPort" . 
-}} +{{- end -}} +{{- end -}} + +{{/* +Check if schema history is required (MySQL and SQL Server need it) +*/}} +{{- define "debezium.requiresSchemaHistory" -}} +{{- if or (eq .Values.source.type "mysql") (eq .Values.source.type "sqlserver") -}} +true +{{- else -}} +false +{{- end -}} +{{- end -}} + +{{/* +Check if file-based storage is used (requires volumeset) +*/}} +{{- define "debezium.requiresVolumeset" -}} +{{- if eq .Values.source.offset.storage "file" -}} +true +{{- else if and (eq (include "debezium.requiresSchemaHistory" .) "true") (eq .Values.source.schemaHistory.storage "file") -}} +true +{{- else -}} +false +{{- end -}} +{{- end -}} + +{{/* +================================================================================ +Labeling +================================================================================ +*/}} + +{{/* +Create chart name and version as used by the chart label +*/}} +{{- define "debezium.chart" -}} +{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" }} +{{- end }} + +{{/* +Common labels/tags +*/}} +{{- define "debezium.tags" -}} +helm.sh/chart: {{ include "debezium.chart" . }} +{{ include "debezium.selectorLabels" . 
}} +{{- if .Chart.AppVersion }} +app.cpln.io/version: {{ .Chart.AppVersion | quote }} +{{- end }} +app.cpln.io/managed-by: {{ .Release.Service }} +cpln/marketplace: "true" +cpln/marketplace-template: debezium-server +cpln/marketplace-template-version: {{ .Chart.Version }} +cpln/marketplace-gvc: {{ .Values.global.cpln.gvc }} +{{- end }} + +{{/* +Selector labels +*/}} +{{- define "debezium.selectorLabels" -}} +app.cpln.io/name: {{ .Release.Name }} +app.cpln.io/instance: {{ .Release.Name }} +{{- end }} diff --git a/debezium-server/versions/1.0.0/templates/identity.yaml b/debezium-server/versions/1.0.0/templates/identity.yaml new file mode 100644 index 00000000..ac53fdd2 --- /dev/null +++ b/debezium-server/versions/1.0.0/templates/identity.yaml @@ -0,0 +1,19 @@ +{{- include "debezium.validateSource" . -}} +{{- include "debezium.validateSink" . -}} +{{- include "debezium.validateOffsetStorage" . -}} +kind: identity +name: {{ include "debezium.identity.name" . }} +description: Debezium Server identity for secret access and cloud integration +gvc: {{ .Values.global.cpln.gvc }} +tags: + {{- include "debezium.tags" . | nindent 2 }} +{{- if and (eq .Values.sink.type "kinesis") .Values.sink.kinesis.cloudAccount.enabled }} +aws: + cloudAccountLink: //cloudaccount/{{ .Values.sink.kinesis.cloudAccount.name }} +{{- end }} +{{- if and (eq .Values.sink.type "pubsub") .Values.sink.pubsub.cloudAccount.enabled }} +gcp: + cloudAccountLink: //cloudaccount/{{ .Values.sink.pubsub.cloudAccount.name }} + scopes: + - https://www.googleapis.com/auth/pubsub +{{- end }} diff --git a/debezium-server/versions/1.0.0/templates/policy.yaml b/debezium-server/versions/1.0.0/templates/policy.yaml new file mode 100644 index 00000000..b9e40ef1 --- /dev/null +++ b/debezium-server/versions/1.0.0/templates/policy.yaml @@ -0,0 +1,17 @@ +kind: policy +name: {{ include "debezium.policy.name" . }} +description: Debezium Server policy for secret access +tags: + {{- include "debezium.tags" . 
| nindent 2 }} +bindings: + - permissions: + - reveal + principalLinks: + - //gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "debezium.identity.name" . }} +targetKind: secret +targetLinks: + - //secret/{{ include "debezium.config.name" . }} + - //secret/{{ include "debezium.credentials.name" . }} + {{- if and (eq .Values.source.type "postgres") (gt (int .Values.source.postgres.heartbeatIntervalMs) 0) }} + - //secret/{{ include "debezium.entrypoint.name" . }} + {{- end }} diff --git a/debezium-server/versions/1.0.0/templates/secret-config.yaml b/debezium-server/versions/1.0.0/templates/secret-config.yaml new file mode 100644 index 00000000..b32381b5 --- /dev/null +++ b/debezium-server/versions/1.0.0/templates/secret-config.yaml @@ -0,0 +1,260 @@ +kind: secret +name: {{ include "debezium.config.name" . }} +description: Debezium Server application.properties configuration +tags: + {{- include "debezium.tags" . | nindent 2 }} +type: opaque +data: + encoding: plain + payload: |- + # ============================================================================= + # Debezium Server Configuration + # Generated by Control Plane Debezium Server Template + # ============================================================================= + + # ----------------------------------------------------------------------------- + # Quarkus Settings + # ----------------------------------------------------------------------------- + quarkus.http.port=8080 + quarkus.log.console.json=false + + # ----------------------------------------------------------------------------- + # Source Connector Configuration + # ----------------------------------------------------------------------------- + debezium.source.connector.class={{ include "debezium.connectorClass" . }} + debezium.source.topic.prefix={{ .Values.source.serverName }} + + # Database connection + debezium.source.database.hostname=${DB_HOSTNAME} + debezium.source.database.port={{ include "debezium.databasePort" . 
}} + debezium.source.database.user=${DB_USER} + debezium.source.database.password=${DB_PASSWORD} + {{- if or (eq .Values.source.type "oracle") (eq .Values.source.type "postgres") }} + debezium.source.database.dbname={{ .Values.source.database.name }} + {{- else }} + debezium.source.database.name={{ .Values.source.database.name }} + {{- end }} + + {{- if .Values.source.tableIncludeList }} + debezium.source.table.include.list={{ .Values.source.tableIncludeList }} + {{- end }} + {{- if .Values.source.tableExcludeList }} + debezium.source.table.exclude.list={{ .Values.source.tableExcludeList }} + {{- end }} + + {{- /* PostgreSQL-specific settings */}} + {{- if eq .Values.source.type "postgres" }} + debezium.source.plugin.name={{ .Values.source.postgres.pluginName }} + debezium.source.slot.name={{ .Values.source.postgres.slotName }} + debezium.source.publication.name={{ .Values.source.postgres.publicationName }} + debezium.source.slot.drop.on.stop={{ .Values.source.postgres.slotDropOnStop }} + {{- if gt (int .Values.source.postgres.heartbeatIntervalMs) 0 }} + debezium.source.heartbeat.interval.ms={{ .Values.source.postgres.heartbeatIntervalMs }} + {{- if .Values.source.postgres.heartbeatActionQuery }} + debezium.source.heartbeat.action.query={{ .Values.source.postgres.heartbeatActionQuery }} + {{- end }} + {{- end }} + {{- end }} + + {{- /* MySQL-specific settings */}} + {{- if eq .Values.source.type "mysql" }} + debezium.source.database.server.id={{ .Values.source.mysql.serverId }} + debezium.source.include.schema.changes={{ .Values.source.mysql.includeSchemaChanges }} + {{- end }} + + {{- /* MongoDB-specific settings */}} + {{- if eq .Values.source.type "mongodb" }} + {{- if .Values.source.mongodb.connectionString }} + debezium.source.mongodb.connection.string=${MONGODB_CONNECTION_STRING} + {{- end }} + {{- if .Values.source.mongodb.replicaSet }} + debezium.source.mongodb.replica.set={{ .Values.source.mongodb.replicaSet }} + {{- end }} + {{- end }} + + {{- /* SQL 
Server-specific settings */}} + {{- if eq .Values.source.type "sqlserver" }} + {{- if .Values.source.sqlserver.databaseNames }} + debezium.source.database.names={{ .Values.source.sqlserver.databaseNames }} + {{- end }} + debezium.source.snapshot.mode={{ .Values.source.sqlserver.snapshotMode }} + {{- end }} + + {{- /* Oracle-specific settings */}} + {{- if eq .Values.source.type "oracle" }} + {{- if .Values.source.oracle.pdbName }} + debezium.source.database.pdb.name={{ .Values.source.oracle.pdbName }} + {{- end }} + debezium.source.log.mining.strategy={{ .Values.source.oracle.logMiningStrategy }} + {{- end }} + + # ----------------------------------------------------------------------------- + # Offset Storage Configuration + # ----------------------------------------------------------------------------- + {{- if eq .Values.source.offset.storage "file" }} + debezium.source.offset.storage=org.apache.kafka.connect.storage.FileOffsetBackingStore + debezium.source.offset.storage.file.filename={{ .Values.source.offset.file.filename }} + {{- else if eq .Values.source.offset.storage "redis" }} + debezium.source.offset.storage=io.debezium.storage.redis.offset.RedisOffsetBackingStore + debezium.source.offset.storage.redis.address=${OFFSET_REDIS_ADDRESS} + debezium.source.offset.storage.redis.key={{ .Values.source.offset.redis.key }} + {{- if .Values.source.offset.redis.password }} + debezium.source.offset.storage.redis.password=${OFFSET_REDIS_PASSWORD} + {{- end }} + {{- if .Values.source.offset.redis.ssl }} + debezium.source.offset.storage.redis.ssl.enabled=true + {{- end }} + {{- else if eq .Values.source.offset.storage "jdbc" }} + debezium.source.offset.storage=io.debezium.storage.jdbc.offset.JdbcOffsetBackingStore + debezium.source.offset.storage.jdbc.url=${OFFSET_JDBC_URL} + debezium.source.offset.storage.jdbc.user=${OFFSET_JDBC_USER} + debezium.source.offset.storage.jdbc.password=${OFFSET_JDBC_PASSWORD} + debezium.source.offset.storage.jdbc.offset.table.name={{ 
.Values.source.offset.jdbc.tableName }} + {{- end }} + debezium.source.offset.flush.interval.ms={{ .Values.source.offset.flushIntervalMs }} + debezium.source.offset.flush.timeout.ms={{ .Values.source.offset.flushTimeoutMs }} + + # ----------------------------------------------------------------------------- + # Error Retry Configuration + # ----------------------------------------------------------------------------- + debezium.source.errors.retry.delay.initial.ms={{ .Values.source.errors.retryDelayInitialMs }} + debezium.source.errors.retry.delay.max.ms={{ .Values.source.errors.retryDelayMaxMs }} + debezium.source.errors.max.retries={{ .Values.source.errors.maxRetries }} + + {{- /* Schema History Storage (MySQL and SQL Server only) */}} + {{- if eq (include "debezium.requiresSchemaHistory" .) "true" }} + + # ----------------------------------------------------------------------------- + # Schema History Storage Configuration + # ----------------------------------------------------------------------------- + {{- if eq .Values.source.schemaHistory.storage "file" }} + debezium.source.schema.history.internal=io.debezium.storage.file.history.FileSchemaHistory + debezium.source.schema.history.internal.file.filename={{ .Values.source.schemaHistory.file.filename }} + {{- else if eq .Values.source.schemaHistory.storage "redis" }} + debezium.source.schema.history.internal=io.debezium.storage.redis.history.RedisSchemaHistory + debezium.source.schema.history.internal.redis.address=${SCHEMA_HISTORY_REDIS_ADDRESS} + debezium.source.schema.history.internal.redis.key={{ .Values.source.schemaHistory.redis.key }} + {{- if .Values.source.schemaHistory.redis.password }} + debezium.source.schema.history.internal.redis.password=${SCHEMA_HISTORY_REDIS_PASSWORD} + {{- end }} + {{- if .Values.source.schemaHistory.redis.ssl }} + debezium.source.schema.history.internal.redis.ssl.enabled=true + {{- end }} + {{- else if eq .Values.source.schemaHistory.storage "jdbc" }} + 
debezium.source.schema.history.internal=io.debezium.storage.jdbc.history.JdbcSchemaHistory + debezium.source.schema.history.internal.jdbc.url=${SCHEMA_HISTORY_JDBC_URL} + debezium.source.schema.history.internal.jdbc.user=${SCHEMA_HISTORY_JDBC_USER} + debezium.source.schema.history.internal.jdbc.password=${SCHEMA_HISTORY_JDBC_PASSWORD} + debezium.source.schema.history.internal.jdbc.schema.history.table.name={{ .Values.source.schemaHistory.jdbc.tableName }} + {{- end }} + {{- end }} + + # ----------------------------------------------------------------------------- + # Sink Configuration + # ----------------------------------------------------------------------------- + {{- if eq .Values.sink.type "kafka" }} + debezium.sink.type=kafka + debezium.sink.kafka.producer.bootstrap.servers=${KAFKA_BOOTSTRAP_SERVERS} + debezium.sink.kafka.producer.key.serializer=org.apache.kafka.common.serialization.StringSerializer + debezium.sink.kafka.producer.value.serializer=org.apache.kafka.common.serialization.StringSerializer + {{- if .Values.sink.kafka.topic }} + debezium.sink.kafka.producer.topic.prefix={{ .Values.sink.kafka.topic }} + {{- end }} + debezium.sink.kafka.producer.security.protocol={{ .Values.sink.kafka.securityProtocol }} + {{- if .Values.sink.kafka.saslMechanism }} + debezium.sink.kafka.producer.sasl.mechanism={{ .Values.sink.kafka.saslMechanism }} + debezium.sink.kafka.producer.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="${KAFKA_SASL_USERNAME}" password="${KAFKA_SASL_PASSWORD}"; + {{- end }} + {{- end }} + + {{- if eq .Values.sink.type "redis" }} + debezium.sink.type=redis + debezium.sink.redis.address=${SINK_REDIS_ADDRESS} + {{- if .Values.sink.redis.password }} + debezium.sink.redis.password=${SINK_REDIS_PASSWORD} + {{- end }} + {{- if .Values.sink.redis.ssl }} + debezium.sink.redis.ssl.enabled=true + {{- end }} + {{- if .Values.sink.redis.streamName }} + debezium.sink.redis.stream.name={{ 
.Values.sink.redis.streamName }} + {{- end }} + {{- end }} + + {{- if eq .Values.sink.type "nats-jetstream" }} + debezium.sink.type=nats-jetstream + debezium.sink.nats-jetstream.url=${NATS_URL} + {{- if .Values.sink.nats.subject }} + debezium.sink.nats-jetstream.subject={{ .Values.sink.nats.subject }} + {{- end }} + {{- if .Values.sink.nats.username }} + debezium.sink.nats-jetstream.username=${NATS_USERNAME} + debezium.sink.nats-jetstream.password=${NATS_PASSWORD} + {{- end }} + {{- end }} + + {{- if eq .Values.sink.type "http" }} + debezium.sink.type=http + debezium.sink.http.url=${HTTP_SINK_URL} + {{- if eq .Values.sink.http.authType "basic" }} + debezium.sink.http.authentication.type=basic + debezium.sink.http.authentication.username=${HTTP_SINK_USERNAME} + debezium.sink.http.authentication.password=${HTTP_SINK_PASSWORD} + {{- else if eq .Values.sink.http.authType "bearer" }} + debezium.sink.http.authentication.type=bearer + debezium.sink.http.authentication.bearer.token=${HTTP_SINK_BEARER_TOKEN} + {{- end }} + {{- range $key, $value := .Values.sink.http.headers }} + debezium.sink.http.headers.{{ $key }}={{ $value }} + {{- end }} + {{- end }} + + {{- if eq .Values.sink.type "kinesis" }} + debezium.sink.type=kinesis + debezium.sink.kinesis.region={{ .Values.sink.kinesis.region }} + debezium.sink.kinesis.stream={{ .Values.sink.kinesis.streamName }} + debezium.sink.kinesis.credentials.provider={{ .Values.sink.kinesis.credentialsProvider }} + {{- end }} + + {{- if eq .Values.sink.type "pubsub" }} + debezium.sink.type=pubsub + debezium.sink.pubsub.project.id={{ .Values.sink.pubsub.projectId }} + {{- if .Values.sink.pubsub.topic }} + debezium.sink.pubsub.topic.prefix={{ .Values.sink.pubsub.topic }} + {{- end }} + {{- end }} + + {{- if eq .Values.sink.type "pulsar" }} + debezium.sink.type=pulsar + debezium.sink.pulsar.client.serviceUrl=${PULSAR_SERVICE_URL} + {{- if .Values.sink.pulsar.topic }} + debezium.sink.pulsar.topic.prefix={{ .Values.sink.pulsar.topic }} + {{- 
end }} + {{- if .Values.sink.pulsar.authPluginClassName }} + debezium.sink.pulsar.client.authPluginClassName={{ .Values.sink.pulsar.authPluginClassName }} + debezium.sink.pulsar.client.authParams=token:${PULSAR_AUTH_TOKEN} + {{- end }} + {{- end }} + + {{- if eq .Values.sink.type "eventhubs" }} + debezium.sink.type=eventhubs + debezium.sink.eventhubs.connection.string=${EVENTHUBS_CONNECTION_STRING} + debezium.sink.eventhubs.hub.name={{ .Values.sink.eventhubs.hubName }} + {{- end }} + + # ----------------------------------------------------------------------------- + # Serialization Format + # ----------------------------------------------------------------------------- + debezium.format.key={{ .Values.format.key }} + debezium.format.value={{ .Values.format.value }} + {{- if or (eq .Values.format.key "avro") (eq .Values.format.key "protobuf") (eq .Values.format.value "avro") (eq .Values.format.value "protobuf") }} + {{- if .Values.format.schemaRegistry.url }} + debezium.format.key.schemas.enable=true + debezium.format.value.schemas.enable=true + debezium.format.schema.registry.url=${SCHEMA_REGISTRY_URL} + {{- if .Values.format.schemaRegistry.username }} + debezium.format.schema.registry.basic.auth.credentials.source=USER_INFO + debezium.format.schema.registry.basic.auth.user.info=${SCHEMA_REGISTRY_USERNAME}:${SCHEMA_REGISTRY_PASSWORD} + {{- end }} + {{- end }} + {{- end }} diff --git a/debezium-server/versions/1.0.0/templates/secret-credentials.yaml b/debezium-server/versions/1.0.0/templates/secret-credentials.yaml new file mode 100644 index 00000000..c4008cf6 --- /dev/null +++ b/debezium-server/versions/1.0.0/templates/secret-credentials.yaml @@ -0,0 +1,99 @@ +kind: secret +name: {{ include "debezium.credentials.name" . }} +description: Debezium Server credentials +tags: + {{- include "debezium.tags" . 
| nindent 2 }} +type: dictionary +data: + # Database credentials + db-hostname: {{ .Values.source.database.hostname | quote }} + db-user: {{ .Values.source.database.user | quote }} + db-password: {{ .Values.source.database.password | quote }} + + {{- /* MongoDB connection string */}} + {{- if and (eq .Values.source.type "mongodb") .Values.source.mongodb.connectionString }} + mongodb-connection-string: {{ .Values.source.mongodb.connectionString | quote }} + {{- end }} + + {{- /* Offset storage credentials */}} + {{- if eq .Values.source.offset.storage "redis" }} + offset-redis-address: {{ .Values.source.offset.redis.address | quote }} + {{- if .Values.source.offset.redis.password }} + offset-redis-password: {{ .Values.source.offset.redis.password | quote }} + {{- end }} + {{- end }} + {{- if eq .Values.source.offset.storage "jdbc" }} + offset-jdbc-url: {{ .Values.source.offset.jdbc.url | quote }} + offset-jdbc-user: {{ .Values.source.offset.jdbc.user | quote }} + offset-jdbc-password: {{ .Values.source.offset.jdbc.password | quote }} + {{- end }} + + {{- /* Schema history storage credentials (MySQL/SQL Server only) */}} + {{- if eq (include "debezium.requiresSchemaHistory" .) 
"true" }} + {{- if eq .Values.source.schemaHistory.storage "redis" }} + schema-history-redis-address: {{ .Values.source.schemaHistory.redis.address | quote }} + {{- if .Values.source.schemaHistory.redis.password }} + schema-history-redis-password: {{ .Values.source.schemaHistory.redis.password | quote }} + {{- end }} + {{- end }} + {{- if eq .Values.source.schemaHistory.storage "jdbc" }} + schema-history-jdbc-url: {{ .Values.source.schemaHistory.jdbc.url | quote }} + schema-history-jdbc-user: {{ .Values.source.schemaHistory.jdbc.user | quote }} + schema-history-jdbc-password: {{ .Values.source.schemaHistory.jdbc.password | quote }} + {{- end }} + {{- end }} + + {{- /* Sink credentials */}} + {{- if eq .Values.sink.type "kafka" }} + kafka-bootstrap-servers: {{ .Values.sink.kafka.bootstrapServers | quote }} + {{- if .Values.sink.kafka.saslUsername }} + kafka-sasl-username: {{ .Values.sink.kafka.saslUsername | quote }} + kafka-sasl-password: {{ .Values.sink.kafka.saslPassword | quote }} + {{- end }} + {{- end }} + + {{- if eq .Values.sink.type "redis" }} + sink-redis-address: {{ .Values.sink.redis.address | quote }} + {{- if .Values.sink.redis.password }} + sink-redis-password: {{ .Values.sink.redis.password | quote }} + {{- end }} + {{- end }} + + {{- if eq .Values.sink.type "nats-jetstream" }} + nats-url: {{ .Values.sink.nats.url | quote }} + {{- if .Values.sink.nats.username }} + nats-username: {{ .Values.sink.nats.username | quote }} + nats-password: {{ .Values.sink.nats.password | quote }} + {{- end }} + {{- end }} + + {{- if eq .Values.sink.type "http" }} + http-sink-url: {{ .Values.sink.http.url | quote }} + {{- if eq .Values.sink.http.authType "basic" }} + http-sink-username: {{ .Values.sink.http.username | quote }} + http-sink-password: {{ .Values.sink.http.password | quote }} + {{- end }} + {{- if eq .Values.sink.http.authType "bearer" }} + http-sink-bearer-token: {{ .Values.sink.http.bearerToken | quote }} + {{- end }} + {{- end }} + + {{- if eq 
.Values.sink.type "pulsar" }} + pulsar-service-url: {{ .Values.sink.pulsar.serviceUrl | quote }} + {{- if .Values.sink.pulsar.authToken }} + pulsar-auth-token: {{ .Values.sink.pulsar.authToken | quote }} + {{- end }} + {{- end }} + + {{- if eq .Values.sink.type "eventhubs" }} + eventhubs-connection-string: {{ .Values.sink.eventhubs.connectionString | quote }} + {{- end }} + + {{- /* Schema registry credentials */}} + {{- if .Values.format.schemaRegistry.url }} + schema-registry-url: {{ .Values.format.schemaRegistry.url | quote }} + {{- if .Values.format.schemaRegistry.username }} + schema-registry-username: {{ .Values.format.schemaRegistry.username | quote }} + schema-registry-password: {{ .Values.format.schemaRegistry.password | quote }} + {{- end }} + {{- end }} diff --git a/debezium-server/versions/1.0.0/templates/secret-entrypoint.yaml b/debezium-server/versions/1.0.0/templates/secret-entrypoint.yaml new file mode 100644 index 00000000..469c0503 --- /dev/null +++ b/debezium-server/versions/1.0.0/templates/secret-entrypoint.yaml @@ -0,0 +1,46 @@ +{{- if and (eq .Values.source.type "postgres") (gt (int .Values.source.postgres.heartbeatIntervalMs) 0) }} +kind: secret +name: {{ include "debezium.entrypoint.name" . }} +description: Debezium Server entrypoint script for PostgreSQL prerequisites +tags: + {{- include "debezium.tags" . | nindent 2 }} +type: opaque +data: + encoding: plain + payload: | + #!/bin/bash + + set -o nounset + + echo "=== Debezium Server Entrypoint ===" + + # Find the PostgreSQL JDBC driver JAR (bundled with Debezium Server) + JDBC_JAR=$(find /debezium -name "postgresql-*.jar" -print -quit 2>/dev/null || true) + if [ -z "${JDBC_JAR}" ]; then + echo "WARNING: PostgreSQL JDBC driver not found. Skipping prerequisites." 
+ exec /debezium/run.sh + fi + echo "Using JDBC driver: ${JDBC_JAR}" + + # Decode pre-compiled PgInit.class (Java 17+ compatible) + # Connects via JDBC, creates heartbeat table + failover replication slot + echo "yv66vgAAAD0AtQoAAgADBwAEDAAFAAYBABBqYXZhL2xhbmcvT2JqZWN0AQAGPGluaXQ+AQADKClWCQAIAAkHAAoMAAsADAEAEGphdmEvbGFuZy9TeXN0ZW0BAANlcnIBABVMamF2YS9pby9QcmludFN0cmVhbTsIAA4BAE5Vc2FnZTogUGdJbml0IDxob3N0PiA8cG9ydD4gPGRibmFtZT4gPHVzZXI+IDxwYXNzd29yZD4gPHNsb3ROYW1lPiA8cGx1Z2luTmFtZT4KABAAEQcAEgwAEwAUAQATamF2YS9pby9QcmludFN0cmVhbQEAB3ByaW50bG4BABUoTGphdmEvbGFuZy9TdHJpbmc7KVYKAAgAFgwAFwAYAQAEZXhpdAEABChJKVYSAAAAGgwAGwAcAQAXbWFrZUNvbmNhdFdpdGhDb25zdGFudHMBAEooTGphdmEvbGFuZy9TdHJpbmc7TGphdmEvbGFuZy9TdHJpbmc7TGphdmEvbGFuZy9TdHJpbmc7KUxqYXZhL2xhbmcvU3RyaW5nOwoAHgAfBwAgDAAhACIBABZqYXZhL3NxbC9Ecml2ZXJNYW5hZ2VyAQANZ2V0Q29ubmVjdGlvbgEATShMamF2YS9sYW5nL1N0cmluZztMamF2YS9sYW5nL1N0cmluZztMamF2YS9sYW5nL1N0cmluZzspTGphdmEvc3FsL0Nvbm5lY3Rpb247CQAIACQMACUADAEAA291dAgAJwEAGENvbm5lY3RlZCB0byBQb3N0Z3JlU1FMLgcAKQEAFWphdmEvc3FsL1NRTEV4Y2VwdGlvbgoAKAArDAAsAC0BAApnZXRNZXNzYWdlAQAUKClMamF2YS9sYW5nL1N0cmluZzsSAAEALwwAGwAwAQAnKElMamF2YS9sYW5nL1N0cmluZzspTGphdmEvbGFuZy9TdHJpbmc7EgACADIMABsAMwEAKChJSUxqYXZhL2xhbmcvU3RyaW5nOylMamF2YS9sYW5nL1N0cmluZzsFAAAAAAAAE4gKADcAOAcAOQwAOgA7AQAQamF2YS9sYW5nL1RocmVhZAEABXNsZWVwAQAEKEopVgcAPQEAHmphdmEvbGFuZy9JbnRlcnJ1cHRlZEV4Y2VwdGlvbgoANwA/DABAAEEBAA1jdXJyZW50VGhyZWFkAQAUKClMamF2YS9sYW5nL1RocmVhZDsKADcAQwwARAAGAQAJaW50ZXJydXB0CABGAQArRW5zdXJpbmcgZGViZXppdW1faGVhcnRiZWF0IHRhYmxlIGV4aXN0cy4uLgsASABJBwBKDABLAEwBABNqYXZhL3NxbC9Db25uZWN0aW9uAQAPY3JlYXRlU3RhdGVtZW50AQAWKClMamF2YS9zcWwvU3RhdGVtZW50OwgATgEAa0NSRUFURSBUQUJMRSBJRiBOT1QgRVhJU1RTIGRlYmV6aXVtX2hlYXJ0YmVhdCAoaWQgSU5URUdFUiBQUklNQVJZIEtFWSwgdHMgVElNRVNUQU1QIE5PVCBOVUxMIERFRkFVTFQgbm93KCkpCwBQAFEHAFIMAFMAVAEAEmphdmEvc3FsL1N0YXRlbWVudAEAB2V4ZWN1dGUBABUoTGphdmEvbGFuZy9TdHJpbmc7KVoIAFYBAFVJTlNFUlQgSU5UTyBkZWJleml1bV9oZWFydGJlYXQgKGlkLCB0cykgVkFMVUVTICgxLCBub3coKSkgT04gQ09ORkxJQ1QgKGlkKSBETyBOT1RISU5HCwBQAFgMAFkABgE
ABWNsb3NlBwBbAQATamF2YS9sYW5nL1Rocm93YWJsZQoAWgBdDABeAF8BAA1hZGRTdXBwcmVzc2VkAQAYKExqYXZhL2xhbmcvVGhyb3dhYmxlOylWCABhAQAWSGVhcnRiZWF0IHRhYmxlIHJlYWR5LhIAAwBjDAAbAGQBACYoTGphdmEvbGFuZy9TdHJpbmc7KUxqYXZhL2xhbmcvU3RyaW5nOwgAZgEAPVNFTEVDVCBjb3VudCgqKSBGUk9NIHBnX3JlcGxpY2F0aW9uX3Nsb3RzIFdIRVJFIHNsb3RfbmFtZSA9ID8LAEgAaAwAaQBqAQAQcHJlcGFyZVN0YXRlbWVudAEAMChMamF2YS9sYW5nL1N0cmluZzspTGphdmEvc3FsL1ByZXBhcmVkU3RhdGVtZW50OwsAbABtBwBuDABvAHABABpqYXZhL3NxbC9QcmVwYXJlZFN0YXRlbWVudAEACXNldFN0cmluZwEAFihJTGphdmEvbGFuZy9TdHJpbmc7KVYLAGwAcgwAcwB0AQAMZXhlY3V0ZVF1ZXJ5AQAWKClMamF2YS9zcWwvUmVzdWx0U2V0OwsAdgB3BwB4DAB5AHoBABJqYXZhL3NxbC9SZXN1bHRTZXQBAARuZXh0AQADKClaCwB2AHwMAH0AfgEABmdldEludAEABChJKUkSAAQAgAwAGwCBAQA4KExqYXZhL2xhbmcvU3RyaW5nO0xqYXZhL2xhbmcvU3RyaW5nOylMamF2YS9sYW5nL1N0cmluZzsSAAUAYxIABgBjCwBsAFgLAEgAWAgAhwEAF1ByZXJlcXVpc2l0ZXMgY29tcGxldGUuEgAHAGMIAIoBACJEZWJleml1bSBTZXJ2ZXIgd2lsbCBzdGFydCBhbnl3YXkuBwCMAQAGUGdJbml0AQAEQ29kZQEAD0xpbmVOdW1iZXJUYWJsZQEABG1haW4BABYoW0xqYXZhL2xhbmcvU3RyaW5nOylWAQANU3RhY2tNYXBUYWJsZQcAkwEAE1tMamF2YS9sYW5nL1N0cmluZzsHAJUBABBqYXZhL2xhbmcvU3RyaW5nAQAKU291cmNlRmlsZQEAC1BnSW5pdC5qYXZhAQAQQm9vdHN0cmFwTWV0aG9kcwgAmgEAF2pkYmM6cG9zdGdyZXNxbDovLwE6AS8BCACcAQAsRVJST1I6IENvdWxkIG5vdCBjb25uZWN0IGFmdGVyIAEgYXR0ZW1wdHM6IAEIAJ4BACUgIEF0dGVtcHQgAS8BIC0gcmV0cnlpbmcgaW4gNXMuLi4gKAEpCACgAQAwRW5zdXJpbmcgZmFpbG92ZXIgcmVwbGljYXRpb24gc2xvdCAnAScgZXhpc3RzLi4uCACiAQBAU0VMRUNUIHBnX2NyZWF0ZV9sb2dpY2FsX3JlcGxpY2F0aW9uX3Nsb3QoJwEnLCAnAScsIGZhbHNlLCB0cnVlKQgApAEAJkNyZWF0ZWQgZmFpbG92ZXIgcmVwbGljYXRpb24gc2xvdCAnAScuCACmAQAkUmVwbGljYXRpb24gc2xvdCAnAScgYWxyZWFkeSBleGlzdHMuCACoAQAlV0FSTklORzogUHJlcmVxdWlzaXRlIHNldHVwIGZhaWxlZDogAQ8GAKoKAKsArAcArQwAGwCuAQAkamF2YS9sYW5nL2ludm9rZS9TdHJpbmdDb25jYXRGYWN0b3J5AQCYKExqYXZhL2xhbmcvaW52b2tlL01ldGhvZEhhbmRsZXMkTG9va3VwO0xqYXZhL2xhbmcvU3RyaW5nO0xqYXZhL2xhbmcvaW52b2tlL01ldGhvZFR5cGU7TGphdmEvbGFuZy9TdHJpbmc7W0xqYXZhL2xhbmcvT2JqZWN0OylMamF2YS9sYW5nL2ludm9rZS9DYWxsU2l0ZTsBAAxJbm5lckNsYXNzZXMHALEBACVqYXZhL2xhbmcvaW52b2tlL01ldGhvZEhhbmRsZXMkTG9va3VwBwC
zAQAeamF2YS9sYW5nL2ludm9rZS9NZXRob2RIYW5kbGVzAQAGTG9va3VwACEAiwACAAAAAAACAAEABQAGAAEAjQAAAB0AAQABAAAABSq3AAGxAAAAAQCOAAAABgABAAAAAwAJAI8AkAABAI0AAARsAAQAEAAAAgIqvhAHogAPsgAHEg22AA8EuAAVKgMyTCoEMk0qBTJOKgYyOgQqBzI6BSoIMjoGKhAGMjoHKywtugAZAAA6CBAeNgkBOgoENgsVCxUJowBjGQgZBBkFuAAdOgqyACMSJrYAD6cATToMFQsVCaAAGbIABxUJGQy2ACq6AC4AALYADwO4ABWyACMVCxUJGQy2ACq6ADEAALYADxQANLgANqcACzoNuAA+tgBChAsBp/+csgAjEkW2AA8ZCrkARwEAOgsZCxJNuQBPAgBXGQsSVbkATwIAVxkLxgAqGQu5AFcBAKcAIDoMGQvGABYZC7kAVwEApwAMOg0ZDBkNtgBcGQy/sgAjEmC2AA+yACMZBroAYgAAtgAPGQoSZbkAZwIAOgsZCwQZBrkAawMAGQu5AHEBADoMGQy5AHUBAFcZDAS5AHsCAJoAWRkKuQBHAQA6DRkNGQYZB7oAfwAAuQBPAgBXGQ3GACoZDbkAVwEApwAgOg4ZDcYAFhkNuQBXAQCnAAw6DxkOGQ+2AFwZDr+yACMZBroAggAAtgAPpwAQsgAjGQa6AIMAALYADxkLxgAqGQu5AIQBAKcAIDoMGQvGABYZC7kAhAEApwAMOg0ZDBkNtgBcGQy/GQq5AIUBALIAIxKGtgAPpwAdOguyAAcZC7YAKroAiAAAtgAPsgAHEom2AA+xAAkATwBiAGUAKACYAJ4AoQA8AMAA1ADjAFoA6gDxAPQAWgFPAWABbwBaAXYBfQGAAFoBIAGpAbgAWgG/AcYByQBaAK8B5AHnACgAAgCOAAAAwgAwAAAABQAHAAYADwAHABMACQAfAAoAKQALADQADAA+AA4AQgAPAEUAEABPABIAWgATAGIAFABlABUAZwAWAG4AFwCAABgAhAAaAJgAGwCpABAArwAgALcAIQDAACIAygAjANQAJADjACEBAAAlAQgAJwEVACgBIAApASoAKgEzACsBOwAsAUYALQFPAC4BYAAvAW8ALQGMADABnAAyAakANAG4ACgB1QA2AdwANwHkADsB5wA4AekAOQH5ADoCAQA8AJEAAAFIABcT/wA0AAwHAJIHAJQHAJQHAJQHAJQHAJQHAJQHAJQHAJQBBwBIAQAAXAcAKPwAHgcAKFwHADz6AAf6AAX/ADMADAcAkgcAlAcAlAcAlAcAlAcAlAcAlAcAlAcAlAEHAEgHAFAAAQcAWv8AEAANBwCSBwCUBwCUBwCUBwCUBwCUBwCUBwCUBwCUAQcASAcAUAcAWgABBwBaCPkAAv8AbgAOBwCSBwCUBwCUBwCUBwCUBwCUBwCUBwCUBwCUAQcASAcAbAcAdgcAUAABBwBa/wAQAA8HAJIHAJQHAJQHAJQHAJQHAJQHAJQHAJQHAJQBBwBIBwBsBwB2BwBQBwBaAAEHAFoI+QACD/oADE4HAFr/ABAADQcAkgcAlAcAlAcAlAcAlAcAlAcAlAcAlAcAlAEHAEgHAGwHAFoAAQcAWgj5AAJRBwAoGQADAJYAAAACAJcAmAAAADIACACpAAEAmQCpAAEAmwCpAAEAnQCpAAEAnwCpAAEAoQCpAAEAowCpAAEApQCpAAEApwCvAAAACgABALAAsgC0ABk=" | base64 -d > /tmp/PgInit.class + + if [ $? -ne 0 ]; then + echo "WARNING: Failed to decode PgInit.class. Skipping prerequisites." + exec /debezium/run.sh + fi + + echo "Running PostgreSQL prerequisites..." 
+ java -cp "${JDBC_JAR}:/tmp" PgInit \ + "${DB_HOSTNAME}" \ + "{{ include "debezium.databasePort" . }}" \ + "{{ .Values.source.database.name }}" \ + "${DB_USER}" \ + "${DB_PASSWORD}" \ + "{{ .Values.source.postgres.slotName }}" \ + "{{ .Values.source.postgres.pluginName }}" || echo "WARNING: Prerequisites script returned non-zero. Continuing anyway." + + echo "=== Starting Debezium Server ===" + exec /debezium/run.sh +{{- end }} diff --git a/debezium-server/versions/1.0.0/templates/volumeset.yaml b/debezium-server/versions/1.0.0/templates/volumeset.yaml new file mode 100644 index 00000000..b6c4a5c3 --- /dev/null +++ b/debezium-server/versions/1.0.0/templates/volumeset.yaml @@ -0,0 +1,15 @@ +{{- if eq (include "debezium.requiresVolumeset" .) "true" }} +kind: volumeset +name: {{ include "debezium.volumeset.name" . }} +gvc: {{ .Values.global.cpln.gvc }} +description: Debezium Server data volumeset for offset and schema history storage +tags: + {{- include "debezium.tags" . | nindent 2 }} +spec: + fileSystemType: ext4 + initialCapacity: {{ .Values.volumeset.capacity }} + performanceClass: {{ .Values.volumeset.performanceClass }} + snapshots: + createFinalSnapshot: true + retentionDuration: 7d +{{- end }} diff --git a/debezium-server/versions/1.0.0/templates/workload-debezium.yaml b/debezium-server/versions/1.0.0/templates/workload-debezium.yaml new file mode 100644 index 00000000..e71cc327 --- /dev/null +++ b/debezium-server/versions/1.0.0/templates/workload-debezium.yaml @@ -0,0 +1,221 @@ +kind: workload +name: {{ include "debezium.name" . }} +gvc: {{ .Values.global.cpln.gvc }} +description: Debezium Server CDC workload +tags: + {{- include "debezium.tags" . | nindent 2 }} +spec: + {{- if eq (include "debezium.requiresVolumeset" .) "true" }} + type: stateful + {{- else }} + type: standard + {{- end }} + identityLink: //identity/{{ include "debezium.identity.name" . 
}} + containers: + - name: debezium-server + {{- if and (eq .Values.source.type "postgres") (gt (int .Values.source.postgres.heartbeatIntervalMs) 0) }} + command: /bin/bash + args: + - '-c' + - >- + cp /scripts/debezium-entrypoint.sh /tmp/ && chmod +x /tmp/debezium-entrypoint.sh && + /tmp/debezium-entrypoint.sh + {{- end }} + image: {{ .Values.image }} + inheritEnv: false + cpu: {{ .Values.resources.cpu | quote }} + memory: {{ .Values.resources.memory | quote }} + ports: + - number: 8080 + protocol: http + env: + # Database credentials from secret + - name: DB_HOSTNAME + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.db-hostname' + - name: DB_USER + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.db-user' + - name: DB_PASSWORD + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.db-password' + + {{- /* MongoDB connection string */}} + {{- if and (eq .Values.source.type "mongodb") .Values.source.mongodb.connectionString }} + - name: MONGODB_CONNECTION_STRING + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.mongodb-connection-string' + {{- end }} + + {{- /* Offset storage credentials */}} + {{- if eq .Values.source.offset.storage "redis" }} + - name: OFFSET_REDIS_ADDRESS + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.offset-redis-address' + {{- if .Values.source.offset.redis.password }} + - name: OFFSET_REDIS_PASSWORD + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.offset-redis-password' + {{- end }} + {{- end }} + {{- if eq .Values.source.offset.storage "jdbc" }} + - name: OFFSET_JDBC_URL + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.offset-jdbc-url' + - name: OFFSET_JDBC_USER + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.offset-jdbc-user' + - name: OFFSET_JDBC_PASSWORD + value: 'cpln://secret/{{ include "debezium.credentials.name" . 
}}.offset-jdbc-password' + {{- end }} + + {{- /* Schema history storage credentials */}} + {{- if eq (include "debezium.requiresSchemaHistory" .) "true" }} + {{- if eq .Values.source.schemaHistory.storage "redis" }} + - name: SCHEMA_HISTORY_REDIS_ADDRESS + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.schema-history-redis-address' + {{- if .Values.source.schemaHistory.redis.password }} + - name: SCHEMA_HISTORY_REDIS_PASSWORD + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.schema-history-redis-password' + {{- end }} + {{- end }} + {{- if eq .Values.source.schemaHistory.storage "jdbc" }} + - name: SCHEMA_HISTORY_JDBC_URL + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.schema-history-jdbc-url' + - name: SCHEMA_HISTORY_JDBC_USER + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.schema-history-jdbc-user' + - name: SCHEMA_HISTORY_JDBC_PASSWORD + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.schema-history-jdbc-password' + {{- end }} + {{- end }} + + {{- /* Sink credentials */}} + {{- if eq .Values.sink.type "kafka" }} + - name: KAFKA_BOOTSTRAP_SERVERS + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.kafka-bootstrap-servers' + {{- if .Values.sink.kafka.saslUsername }} + - name: KAFKA_SASL_USERNAME + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.kafka-sasl-username' + - name: KAFKA_SASL_PASSWORD + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.kafka-sasl-password' + {{- end }} + {{- end }} + + {{- if eq .Values.sink.type "redis" }} + - name: SINK_REDIS_ADDRESS + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.sink-redis-address' + {{- if .Values.sink.redis.password }} + - name: SINK_REDIS_PASSWORD + value: 'cpln://secret/{{ include "debezium.credentials.name" . 
}}.sink-redis-password' + {{- end }} + {{- end }} + + {{- if eq .Values.sink.type "nats-jetstream" }} + - name: NATS_URL + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.nats-url' + {{- if .Values.sink.nats.username }} + - name: NATS_USERNAME + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.nats-username' + - name: NATS_PASSWORD + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.nats-password' + {{- end }} + {{- end }} + + {{- if eq .Values.sink.type "http" }} + - name: HTTP_SINK_URL + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.http-sink-url' + {{- if eq .Values.sink.http.authType "basic" }} + - name: HTTP_SINK_USERNAME + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.http-sink-username' + - name: HTTP_SINK_PASSWORD + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.http-sink-password' + {{- end }} + {{- if eq .Values.sink.http.authType "bearer" }} + - name: HTTP_SINK_BEARER_TOKEN + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.http-sink-bearer-token' + {{- end }} + {{- end }} + + {{- if eq .Values.sink.type "pulsar" }} + - name: PULSAR_SERVICE_URL + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.pulsar-service-url' + {{- if .Values.sink.pulsar.authToken }} + - name: PULSAR_AUTH_TOKEN + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.pulsar-auth-token' + {{- end }} + {{- end }} + + {{- if eq .Values.sink.type "eventhubs" }} + - name: EVENTHUBS_CONNECTION_STRING + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.eventhubs-connection-string' + {{- end }} + + {{- /* Schema registry credentials */}} + {{- if .Values.format.schemaRegistry.url }} + - name: SCHEMA_REGISTRY_URL + value: 'cpln://secret/{{ include "debezium.credentials.name" . 
}}.schema-registry-url' + {{- if .Values.format.schemaRegistry.username }} + - name: SCHEMA_REGISTRY_USERNAME + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.schema-registry-username' + - name: SCHEMA_REGISTRY_PASSWORD + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.schema-registry-password' + {{- end }} + {{- end }} + + volumes: + # Mount application.properties from opaque secret + - path: /debezium/config/application.properties + recoveryPolicy: retain + uri: cpln://secret/{{ include "debezium.config.name" . }}.payload + {{- if and (eq .Values.source.type "postgres") (gt (int .Values.source.postgres.heartbeatIntervalMs) 0) }} + # Mount entrypoint script for PostgreSQL prerequisites + - path: /scripts/debezium-entrypoint.sh + recoveryPolicy: retain + uri: cpln://secret/{{ include "debezium.entrypoint.name" . }}.payload + {{- end }} + {{- if eq (include "debezium.requiresVolumeset" .) "true" }} + # Mount data volume for offset and schema history storage + - path: /debezium/data + uri: cpln://volumeset/{{ include "debezium.volumeset.name" . }} + {{- end }} + + readinessProbe: + httpGet: + path: /q/health/ready + port: 8080 + scheme: HTTP + failureThreshold: 3 + initialDelaySeconds: 10 + periodSeconds: 10 + successThreshold: 1 + timeoutSeconds: 5 + + livenessProbe: + httpGet: + path: /q/health/live + port: 8080 + scheme: HTTP + failureThreshold: 3 + initialDelaySeconds: 30 + periodSeconds: 30 + successThreshold: 1 + timeoutSeconds: 5 + + defaultOptions: + # Debezium should run as a single instance for CDC consistency + autoscaling: + metric: disabled + minScale: 1 + maxScale: 1 + capacityAI: false + debug: false + suspend: false + timeoutSeconds: 60 + + {{- if eq (include "debezium.requiresVolumeset" .) 
"true" }} + securityOptions: + filesystemGroupId: 185 + {{- end }} + + firewallConfig: + external: + outboundAllowCIDR: + {{- toYaml .Values.firewall.external.outboundAllowCIDR | nindent 8 }} + internal: + inboundAllowType: {{ .Values.firewall.internal.inboundAllowType }} + {{- if .Values.firewall.internal.workloads }} + inboundAllowWorkload: + {{- toYaml .Values.firewall.internal.workloads | nindent 8 }} + {{- end }} diff --git a/debezium-server/versions/1.0.0/values.yaml b/debezium-server/versions/1.0.0/values.yaml new file mode 100644 index 00000000..9719f2d1 --- /dev/null +++ b/debezium-server/versions/1.0.0/values.yaml @@ -0,0 +1,219 @@ +# Debezium Server CDC Template +# Documentation: https://debezium.io/documentation/reference/stable/operations/debezium-server.html + +image: quay.io/debezium/server:3.0 + +resources: + cpu: 500m + memory: 512Mi + +# ============================================================================= +# Source Database Configuration +# ============================================================================= +source: + # Database type: postgres, mysql, mongodb, sqlserver, oracle + type: postgres + + # Database connection settings + database: + hostname: "" # Required: database hostname or IP + port: 5432 # Default port varies by type (postgres:5432, mysql:3306, mongodb:27017, sqlserver:1433, oracle:1521) + name: "" # Required: database name (or SID for Oracle) + user: "" # Required: database username + password: "" # Required: database password (stored in credentials secret) + + # Server name used as prefix for topic names + serverName: "dbserver1" + + # Tables to capture (comma-separated, e.g., "public.users,public.orders") + # Leave empty to capture all tables + tableIncludeList: "" + + # Tables to exclude (comma-separated) + tableExcludeList: "" + + # PostgreSQL-specific settings + postgres: + slotName: "debezium" # Replication slot name + publicationName: "dbz_publication" # Publication name + pluginName: "pgoutput" # Logical 
decoding plugin: pgoutput (default), decoderbufs + slotDropOnStop: false # Keep replication slot on stop (required for HA/failover) + heartbeatIntervalMs: 0 # Heartbeat interval in ms (0=disabled; set 5000 for HA) + heartbeatActionQuery: "" # SQL executed on heartbeat (e.g., "UPDATE debezium_heartbeat SET ts = now() WHERE id = 1") + + # MySQL-specific settings + mysql: + serverId: 85744 # Unique server ID for MySQL replication + includeSchemaChanges: true # Include DDL events + + # MongoDB-specific settings + mongodb: + connectionString: "" # Full connection string (overrides hostname/port) + replicaSet: "" # Replica set name + + # SQL Server-specific settings + sqlserver: + databaseNames: "" # Comma-separated database names to capture + snapshotMode: "initial" # Snapshot mode: initial, schema_only, initial_only + + # Oracle-specific settings + oracle: + pdbName: "" # Pluggable database name + logMiningStrategy: "online_catalog" # Log mining strategy: online_catalog, redo_log_catalog + + # Offset storage configuration + offset: + # Storage type: file, redis, jdbc + storage: file + + # Flush settings + flushIntervalMs: 10000 # How often offsets flush to storage (ms) + flushTimeoutMs: 60000 # Timeout for offset flush operations (ms) + + # File storage settings (requires volumeset) + file: + filename: "/debezium/data/offsets.dat" + + # Redis storage settings + redis: + address: "" # Redis address (e.g., redis.mygvc.cpln.local:6379) + key: "debezium:offsets" # Redis key for offsets + password: "" # Redis password (stored in credentials secret) + ssl: false # Enable SSL/TLS + + # JDBC storage settings + jdbc: + url: "" # JDBC URL (e.g., jdbc:postgresql://host:5432/dbname) + user: "" # JDBC username + password: "" # JDBC password (stored in credentials secret) + tableName: "debezium_offsets" # Table name for storing offsets + + # Schema history storage (required for MySQL and SQL Server) + schemaHistory: + # Storage type: file, redis, jdbc (only used for mysql/sqlserver) 
+ storage: file + + # File storage settings + file: + filename: "/debezium/data/schema-history.dat" + + # Redis storage settings + redis: + address: "" + key: "debezium:schema-history" + password: "" + ssl: false + + # JDBC storage settings + jdbc: + url: "" + user: "" + password: "" + tableName: "debezium_schema_history" + + # Error retry configuration + errors: + retryDelayInitialMs: 300 # Initial retry delay (ms) + retryDelayMaxMs: 10000 # Max retry delay (ms) + maxRetries: -1 # Max retries (-1 = infinite) + +# ============================================================================= +# Sink Configuration +# ============================================================================= +sink: + # Sink type: kafka, redis, nats-jetstream, http, kinesis, pubsub, pulsar, eventhubs + type: kafka + + # Kafka sink settings + kafka: + bootstrapServers: "" # Required: Kafka bootstrap servers (e.g., kafka.mygvc.cpln.local:9092) + topic: "" # Topic prefix (events sent to {topic}.{table}) + securityProtocol: "PLAINTEXT" # Security protocol: PLAINTEXT, SSL, SASL_PLAINTEXT, SASL_SSL + saslMechanism: "" # SASL mechanism: PLAIN, SCRAM-SHA-256, SCRAM-SHA-512 + saslUsername: "" # SASL username + saslPassword: "" # SASL password (stored in credentials secret) + + # Redis sink settings (Redis Streams) + redis: + address: "" # Required: Redis address (e.g., redis.mygvc.cpln.local:6379) + password: "" # Redis password (stored in credentials secret) + ssl: false # Enable SSL/TLS + streamName: "" # Stream name prefix (events sent to {streamName}.{table}) + + # NATS JetStream sink settings + nats: + url: "" # Required: NATS URL (e.g., nats://nats.mygvc.cpln.local:4222) + subject: "" # Subject prefix (events sent to {subject}.{table}) + username: "" # NATS username + password: "" # NATS password (stored in credentials secret) + + # HTTP sink settings (webhooks) + http: + url: "" # Required: HTTP endpoint URL + headers: {} # Additional headers (key-value pairs) + authType: "" # Auth 
type: none, basic, bearer + username: "" # Basic auth username + password: "" # Basic auth password (stored in credentials secret) + bearerToken: "" # Bearer token (stored in credentials secret) + + # AWS Kinesis sink settings (uses Universal Cloud Identity) + kinesis: + region: "" # Required: AWS region + streamName: "" # Required: Kinesis stream name + credentialsProvider: "default" # Use "default" for Universal Cloud Identity + # Cloud account for Universal Cloud Identity + cloudAccount: + enabled: false + name: "" # AWS cloud account name in Control Plane + + # GCP Pub/Sub sink settings (uses Universal Cloud Identity) + pubsub: + projectId: "" # Required: GCP project ID + topic: "" # Topic prefix (events sent to {topic}.{table}) + # Cloud account for Universal Cloud Identity + cloudAccount: + enabled: false + name: "" # GCP cloud account name in Control Plane + + # Apache Pulsar sink settings + pulsar: + serviceUrl: "" # Required: Pulsar service URL + topic: "" # Topic prefix + authPluginClassName: "" # Auth plugin class (e.g., org.apache.pulsar.client.impl.auth.AuthenticationToken) + authToken: "" # Auth token (stored in credentials secret) + + # Azure Event Hubs sink settings + eventhubs: + connectionString: "" # Required: Event Hubs connection string (stored in credentials secret) + hubName: "" # Required: Event Hub name + +# ============================================================================= +# Serialization Format +# ============================================================================= +format: + key: json # Key format: json, avro, protobuf + value: json # Value format: json, avro, protobuf + + # Schema registry settings (for avro/protobuf) + schemaRegistry: + url: "" # Schema registry URL + username: "" # Schema registry username + password: "" # Schema registry password (stored in credentials secret) + +# ============================================================================= +# Volumeset Configuration (for file-based 
offset/schema-history storage) +# ============================================================================= +volumeset: + capacity: 10 # Initial capacity in GiB (minimum 10) + performanceClass: general-purpose-ssd # Performance class: general-purpose-ssd, high-throughput-ssd + +# ============================================================================= +# Firewall Configuration +# ============================================================================= +firewall: + internal: + inboundAllowType: same-gvc # Options: none, same-gvc, same-org, workload-list + workloads: [] # Workload list for inbound access (when type is workload-list) + external: + outboundAllowCIDR: + - 0.0.0.0/0 # Allow all outbound by default (required for database connectivity) diff --git a/debezium-server/versions/1.1.0/Chart.yaml b/debezium-server/versions/1.1.0/Chart.yaml new file mode 100644 index 00000000..1fff4825 --- /dev/null +++ b/debezium-server/versions/1.1.0/Chart.yaml @@ -0,0 +1,12 @@ +apiVersion: v2 +name: debezium-server +description: Debezium Server CDC app for Control Plane (standalone mode) +type: application +version: 1.1.0 +appVersion: "3.0" + +annotations: + created: "2026-04-03" + lastModified: "2026-04-13" + category: "event-streaming" + createsGvc: false diff --git a/debezium-server/versions/1.1.0/README.md b/debezium-server/versions/1.1.0/README.md new file mode 100644 index 00000000..02c3ac45 --- /dev/null +++ b/debezium-server/versions/1.1.0/README.md @@ -0,0 +1,347 @@ +# Debezium Server Template + +Debezium Server is a standalone Change Data Capture (CDC) application that streams database changes to various messaging systems. Unlike Debezium connectors that run on Kafka Connect, Debezium Server runs as a standalone application and can send events directly to Kafka, Redis, NATS, HTTP endpoints, cloud services, and more. 
+ +## Overview + +This template deploys Debezium Server on Control Plane with: + +- Configurable source database connectors (PostgreSQL, MySQL, MongoDB, SQL Server, Oracle) +- Multiple sink options (Kafka, Redis, NATS JetStream, HTTP, AWS Kinesis, GCP Pub/Sub, Pulsar, Event Hubs) +- Flexible offset storage (file, Redis, JDBC) +- Universal Cloud Identity integration for AWS and GCP sinks +- Automatic secret management for credentials + +## Quick Start + +### PostgreSQL to Kafka + +```yaml +source: + type: postgres + database: + hostname: postgres.mygvc.cpln.local + port: 5432 + name: mydb + user: debezium + password: secret123 + serverName: myserver + tableIncludeList: "public.users,public.orders" + postgres: + slotName: debezium_slot + publicationName: dbz_publication + +sink: + type: kafka + kafka: + bootstrapServers: kafka.mygvc.cpln.local:9092 + topic: cdc-events + +format: + key: json + value: json +``` + +### MySQL to Redis Streams + +```yaml +source: + type: mysql + database: + hostname: mysql.mygvc.cpln.local + port: 3306 + name: mydb + user: debezium + password: secret123 + serverName: myserver + mysql: + serverId: 85744 + includeSchemaChanges: true + +sink: + type: redis + redis: + address: redis.mygvc.cpln.local:6379 + streamName: cdc-stream +``` + +### PostgreSQL to AWS Kinesis (Universal Cloud Identity) + +```yaml +source: + type: postgres + database: + hostname: my-rds-instance.us-east-1.rds.amazonaws.com + port: 5432 + name: mydb + user: debezium + password: secret123 + serverName: myserver + +sink: + type: kinesis + kinesis: + region: us-east-1 + streamName: cdc-events + credentialsProvider: default + cloudAccount: + enabled: true + name: my-aws-account +``` + +## Supported Sources + +| Database | Connector | Default Port | Key Configuration | +|----------|-----------|--------------|-------------------| +| PostgreSQL | PostgresConnector | 5432 | `slotName`, `publicationName`, `pluginName` | +| MySQL | MySqlConnector | 3306 | `serverId`, 
`includeSchemaChanges` | +| MongoDB | MongoDbConnector | 27017 | `connectionString`, `replicaSet` | +| SQL Server | SqlServerConnector | 1433 | `databaseNames`, `snapshotMode` | +| Oracle | OracleConnector | 1521 | `pdbName`, `logMiningStrategy` | + +### PostgreSQL Prerequisites + +1. Enable logical replication in `postgresql.conf`: + ``` + wal_level = logical + max_replication_slots = 4 + max_wal_senders = 4 + ``` + +2. Create a publication (Debezium creates the replication slot automatically): + ```sql + CREATE PUBLICATION dbz_publication FOR ALL TABLES; + -- Slot is created automatically by Debezium + ``` + +3. Grant permissions: + ```sql + GRANT USAGE ON SCHEMA public TO debezium; + GRANT SELECT ON ALL TABLES IN SCHEMA public TO debezium; + ALTER USER debezium REPLICATION; + ``` + +### MySQL Prerequisites + +1. Enable binary logging in `my.cnf`: + ``` + server-id = 1 + log_bin = mysql-bin + binlog_format = ROW + binlog_row_image = FULL + ``` + +2. Grant permissions: + ```sql + GRANT SELECT, RELOAD, SHOW DATABASES, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'debezium'@'%'; + ``` + +## Supported Sinks + +| Sink | Required Configuration | Notes | +|------|------------------------|-------| +| Kafka | `bootstrapServers` | Simple Kafka producer (no Kafka Connect required) | +| Redis | `address` | Redis Streams for real-time event streaming | +| NATS JetStream | `url` | Cloud-native messaging with persistence | +| HTTP | `url` | Webhooks and custom HTTP endpoints | +| Kinesis | `region`, `streamName` | AWS Kinesis (uses Universal Cloud Identity) | +| Pub/Sub | `projectId` | GCP Pub/Sub (uses Universal Cloud Identity) | +| Pulsar | `serviceUrl` | Apache Pulsar with optional authentication | +| Event Hubs | `connectionString`, `hubName` | Azure Event Hubs | + +## Offset Storage + +Debezium tracks the position of captured changes using offset storage. Three options are available: + +### File Storage (Default) + +Stores offsets in a local file. Requires a volumeset for persistence. 
+ +```yaml +source: + offset: + storage: file + file: + filename: /debezium/data/offsets.dat + +volumeset: + capacity: 10 + performanceClass: general-purpose-ssd +``` + +### Redis Storage + +Stores offsets in Redis. No volumeset required. + +```yaml +source: + offset: + storage: redis + redis: + address: redis.mygvc.cpln.local:6379 + key: debezium:offsets + password: "" + ssl: false +``` + +### JDBC Storage + +Stores offsets in a relational database. No volumeset required. + +```yaml +source: + offset: + storage: jdbc + jdbc: + url: jdbc:postgresql://postgres.mygvc.cpln.local:5432/offsets + user: debezium + password: secret123 + tableName: debezium_offsets +``` + +## Schema History (MySQL/SQL Server Only) + +MySQL and SQL Server connectors require schema history storage to track DDL changes: + +```yaml +source: + type: mysql + schemaHistory: + storage: file # or: redis, jdbc + file: + filename: /debezium/data/schema-history.dat +``` + +## Serialization Formats + +Supports JSON, Avro, and Protobuf serialization: + +```yaml +format: + key: json + value: json + + # For Avro/Protobuf, configure schema registry: + schemaRegistry: + url: http://schema-registry.mygvc.cpln.local:8081 + username: "" + password: "" +``` + +## Universal Cloud Identity + +For AWS Kinesis and GCP Pub/Sub sinks, this template integrates with Control Plane's Universal Cloud Identity for credential-less authentication. + +### AWS Kinesis + +1. Create an AWS cloud account in Control Plane +2. Configure the identity with appropriate IAM policies +3. 
Enable the cloud account in your values: + +```yaml +sink: + type: kinesis + kinesis: + region: us-east-1 + streamName: my-stream + credentialsProvider: default + cloudAccount: + enabled: true + name: my-aws-account +``` + +### GCP Pub/Sub + +```yaml +sink: + type: pubsub + pubsub: + projectId: my-gcp-project + cloudAccount: + enabled: true + name: my-gcp-account +``` + +## Resource Configuration + +```yaml +resources: + cpu: 500m # CPU allocation + memory: 512Mi # Memory allocation + +volumeset: + capacity: 10 # GiB (only used with file storage) + performanceClass: general-purpose-ssd +``` + +## Firewall Configuration + +```yaml +firewall: + internal: + inboundAllowType: same-gvc # none, same-gvc, same-org, workload-list + workloads: [] # For workload-list type + external: + outboundAllowCIDR: + - 0.0.0.0/0 # Required for external database connectivity +``` + +## Health Checks + +Debezium Server exposes Quarkus health endpoints: + +- **Readiness**: `/q/health/ready` - Checks if the connector is ready +- **Liveness**: `/q/health/live` - Checks if the server is alive + +## Installation + +```bash +cpln helm install debezium ./debezium-server/versions/1.1.0 \ + --gvc my-gvc \ + -f my-values.yaml +``` + +## Verification + +With the release name `debezium` used above, the workload is named `debezium-debezium` (`{release-name}-debezium`). + +1. Check workload status: + ```bash + cpln workload get debezium-debezium --gvc my-gvc + ``` + +2. Check health endpoint: + ```bash + curl http://debezium-debezium.my-gvc.cpln.local:8080/q/health + ``` + +3. View logs: + ```bash + cpln workload logs debezium-debezium --gvc my-gvc + ``` + +4. Test CDC by making changes in the source database and verifying events appear in the configured sink. 
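The health-endpoint check in the verification steps can be automated. The following is a minimal sketch, not part of the template: `parse_status` and `wait_until_ready` are hypothetical helper names, the attempt count and sleep interval are illustrative defaults, and the parsing assumes the endpoint returns a JSON document whose overall `"status"` field appears before any per-check statuses.

```bash
#!/usr/bin/env bash
# Sketch: poll a Debezium Server readiness URL until it reports UP.
# parse_status and wait_until_ready are hypothetical helper names, and the
# attempt count and sleep interval are illustrative defaults.

# Print the first "status" value found in a health JSON document on stdin.
# Assumes the overall status appears before the per-check statuses.
parse_status() {
  grep -o '"status"[[:space:]]*:[[:space:]]*"[A-Z]*"' \
    | head -n 1 \
    | sed 's/.*"\([A-Z]*\)"$/\1/'
}

# Usage: wait_until_ready URL [MAX_ATTEMPTS] [SLEEP_SECONDS]
wait_until_ready() {
  local url="$1" max_attempts="${2:-30}" sleep_seconds="${3:-5}"
  local attempt status
  for ((attempt = 1; attempt <= max_attempts; attempt++)); do
    # curl -f turns HTTP error responses (for example a 503) into a failure,
    # so an unhealthy or unreachable server yields an empty status.
    status="$(curl -fsS "$url" 2>/dev/null | parse_status)"
    if [ "$status" = "UP" ]; then
      echo "ready after ${attempt} attempt(s)"
      return 0
    fi
    sleep "$sleep_seconds"
  done
  echo "not ready after ${max_attempts} attempts" >&2
  return 1
}
```

To use it, source the file from a shell that can reach the workload's internal endpoint and call `wait_until_ready "http://<workload>.<gvc>.cpln.local:8080/q/health/ready"`, substituting your own workload and GVC names.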
+ +## Troubleshooting + +### Connector Not Starting + +- Check database connectivity and credentials +- Verify replication permissions are granted +- Review logs for specific error messages + +### Offset Storage Issues + +- For file storage: ensure volumeset is properly mounted +- For Redis/JDBC: verify connectivity and credentials +- Check that the storage backend is accessible from the GVC + +### Sink Delivery Failures + +- Verify sink connectivity and authentication +- For cloud sinks (Kinesis/Pub/Sub): ensure cloud account is properly configured +- Check firewall rules allow outbound traffic to the sink + +## Resources + +- [Debezium Documentation](https://debezium.io/documentation/) +- [Debezium Server Documentation](https://debezium.io/documentation/reference/stable/operations/debezium-server.html) +- [Control Plane Documentation](https://docs.controlplane.com/) diff --git a/debezium-server/versions/1.1.0/templates/_helpers.tpl b/debezium-server/versions/1.1.0/templates/_helpers.tpl new file mode 100644 index 00000000..ddc0309a --- /dev/null +++ b/debezium-server/versions/1.1.0/templates/_helpers.tpl @@ -0,0 +1,293 @@ +{{/* +================================================================================ +Resource Naming +================================================================================ +*/}} + +{{/* +Debezium Server Workload Name +*/}} +{{- define "debezium.name" -}} +{{- printf "%s-debezium" .Release.Name }} +{{- end }} + +{{/* +Debezium Identity Name +*/}} +{{- define "debezium.identity.name" -}} +{{- printf "%s-debezium-identity" .Release.Name }} +{{- end }} + +{{/* +Debezium Policy Name +*/}} +{{- define "debezium.policy.name" -}} +{{- printf "%s-debezium-policy" .Release.Name }} +{{- end }} + +{{/* +Debezium Config Secret Name (opaque - application.properties) +*/}} +{{- define "debezium.config.name" -}} +{{- printf "%s-debezium-config" .Release.Name }} +{{- end }} + +{{/* +Debezium Credentials Secret Name (dictionary) +*/}} +{{- define 
"debezium.credentials.name" -}} +{{- printf "%s-debezium-credentials" .Release.Name }} +{{- end }} + +{{/* +Debezium Volumeset Name +*/}} +{{- define "debezium.volumeset.name" -}} +{{- printf "%s-debezium-data" .Release.Name }} +{{- end }} + +{{/* +Debezium Entrypoint Secret Name +*/}} +{{- define "debezium.entrypoint.name" -}} +{{- printf "%s-debezium-entrypoint" .Release.Name }} +{{- end }} + +{{/* +================================================================================ +Auto-Computation Helpers (for meta-template / umbrella chart usage) +================================================================================ +*/}} + +{{/* +Resolve database hostname: use explicit value if set, otherwise compute from Release.Name. +When used standalone, hostname is always set. When used as a subchart in a meta-template, +hostname can be left empty and will auto-compute to the postgres-ha-proxy DNS name. +*/}} +{{- define "debezium.dbHostname" -}} +{{- if .Values.source.database.hostname -}} +{{- .Values.source.database.hostname -}} +{{- else -}} +{{- printf "%s-postgres-ha-proxy.%s.cpln.local" .Release.Name .Values.global.cpln.gvc -}} +{{- end -}} +{{- end -}} + +{{/* +Resolve Kafka bootstrap servers: use explicit value if set, otherwise compute from Release.Name. +When used standalone, bootstrapServers is always set. When used as a subchart in a meta-template, +it can be left empty and will auto-compute to the kafka cluster DNS name. 
+*/}} +{{- define "debezium.kafkaBootstrapServers" -}} +{{- if .Values.sink.kafka.bootstrapServers -}} +{{- .Values.sink.kafka.bootstrapServers -}} +{{- else -}} +{{- printf "%s-cluster.%s.cpln.local:9092" .Release.Name .Values.global.cpln.gvc -}} +{{- end -}} +{{- end -}} + +{{/* +================================================================================ +Validation Helpers +================================================================================ +*/}} + +{{/* +Validate source configuration +*/}} +{{- define "debezium.validateSource" -}} +{{- $validTypes := list "postgres" "mysql" "mongodb" "sqlserver" "oracle" -}} +{{- if not (has .Values.source.type $validTypes) -}} +{{- fail (printf "Invalid source.type '%s'. Must be one of: %s" .Values.source.type (join ", " $validTypes)) -}} +{{- end -}} +{{- if not .Values.source.database.name -}} +{{- fail "source.database.name is required" -}} +{{- end -}} +{{- if not .Values.source.database.user -}} +{{- fail "source.database.user is required" -}} +{{- end -}} +{{- if not .Values.source.database.password -}} +{{- fail "source.database.password is required" -}} +{{- end -}} +{{- end -}} + +{{/* +Validate sink configuration +*/}} +{{- define "debezium.validateSink" -}} +{{- $validTypes := list "kafka" "redis" "nats-jetstream" "http" "kinesis" "pubsub" "pulsar" "eventhubs" -}} +{{- if not (has .Values.sink.type $validTypes) -}} +{{- fail (printf "Invalid sink.type '%s'. Must be one of: %s" .Values.sink.type (join ", " $validTypes)) -}} +{{- end -}} +{{- if eq .Values.sink.type "kafka" -}} + {{- if not (include "debezium.kafkaBootstrapServers" .) 
-}} + {{- fail "sink.kafka.bootstrapServers is required when sink.type is 'kafka'" -}} + {{- end -}} +{{- end -}} +{{- if eq .Values.sink.type "redis" -}} + {{- if not .Values.sink.redis.address -}} + {{- fail "sink.redis.address is required when sink.type is 'redis'" -}} + {{- end -}} +{{- end -}} +{{- if eq .Values.sink.type "nats-jetstream" -}} + {{- if not .Values.sink.nats.url -}} + {{- fail "sink.nats.url is required when sink.type is 'nats-jetstream'" -}} + {{- end -}} +{{- end -}} +{{- if eq .Values.sink.type "http" -}} + {{- if not .Values.sink.http.url -}} + {{- fail "sink.http.url is required when sink.type is 'http'" -}} + {{- end -}} +{{- end -}} +{{- if eq .Values.sink.type "kinesis" -}} + {{- if not .Values.sink.kinesis.region -}} + {{- fail "sink.kinesis.region is required when sink.type is 'kinesis'" -}} + {{- end -}} + {{- if not .Values.sink.kinesis.streamName -}} + {{- fail "sink.kinesis.streamName is required when sink.type is 'kinesis'" -}} + {{- end -}} +{{- end -}} +{{- if eq .Values.sink.type "pubsub" -}} + {{- if not .Values.sink.pubsub.projectId -}} + {{- fail "sink.pubsub.projectId is required when sink.type is 'pubsub'" -}} + {{- end -}} +{{- end -}} +{{- if eq .Values.sink.type "pulsar" -}} + {{- if not .Values.sink.pulsar.serviceUrl -}} + {{- fail "sink.pulsar.serviceUrl is required when sink.type is 'pulsar'" -}} + {{- end -}} +{{- end -}} +{{- if eq .Values.sink.type "eventhubs" -}} + {{- if not .Values.sink.eventhubs.connectionString -}} + {{- fail "sink.eventhubs.connectionString is required when sink.type is 'eventhubs'" -}} + {{- end -}} + {{- if not .Values.sink.eventhubs.hubName -}} + {{- fail "sink.eventhubs.hubName is required when sink.type is 'eventhubs'" -}} + {{- end -}} +{{- end -}} +{{- end -}} + +{{/* +Validate offset storage configuration +*/}} +{{- define "debezium.validateOffsetStorage" -}} +{{- $validTypes := list "file" "redis" "jdbc" -}} +{{- if not (has .Values.source.offset.storage $validTypes) -}} +{{- fail 
(printf "Invalid source.offset.storage '%s'. Must be one of: %s" .Values.source.offset.storage (join ", " $validTypes)) -}} +{{- end -}} +{{- if eq .Values.source.offset.storage "redis" -}} + {{- if not .Values.source.offset.redis.address -}} + {{- fail "source.offset.redis.address is required when offset storage is 'redis'" -}} + {{- end -}} +{{- end -}} +{{- if eq .Values.source.offset.storage "jdbc" -}} + {{- if not .Values.source.offset.jdbc.url -}} + {{- fail "source.offset.jdbc.url is required when offset storage is 'jdbc'" -}} + {{- end -}} +{{- end -}} +{{- end -}} + +{{/* +================================================================================ +Connector Class Mapping +================================================================================ +*/}} + +{{/* +Get the Debezium connector class for the source type +*/}} +{{- define "debezium.connectorClass" -}} +{{- $connectorMap := dict + "postgres" "io.debezium.connector.postgresql.PostgresConnector" + "mysql" "io.debezium.connector.mysql.MySqlConnector" + "mongodb" "io.debezium.connector.mongodb.MongoDbConnector" + "sqlserver" "io.debezium.connector.sqlserver.SqlServerConnector" + "oracle" "io.debezium.connector.oracle.OracleConnector" +-}} +{{- get $connectorMap .Values.source.type -}} +{{- end -}} + +{{/* +Get the default port for the source type +*/}} +{{- define "debezium.defaultPort" -}} +{{- $portMap := dict + "postgres" 5432 + "mysql" 3306 + "mongodb" 27017 + "sqlserver" 1433 + "oracle" 1521 +-}} +{{- get $portMap .Values.source.type -}} +{{- end -}} + +{{/* +Get the effective database port +*/}} +{{- define "debezium.databasePort" -}} +{{- if .Values.source.database.port -}} +{{- .Values.source.database.port -}} +{{- else -}} +{{- include "debezium.defaultPort" . 
-}} +{{- end -}} +{{- end -}} + +{{/* +Check if schema history is required (MySQL and SQL Server need it) +*/}} +{{- define "debezium.requiresSchemaHistory" -}} +{{- if or (eq .Values.source.type "mysql") (eq .Values.source.type "sqlserver") -}} +true +{{- else -}} +false +{{- end -}} +{{- end -}} + +{{/* +Check if file-based storage is used (requires volumeset) +*/}} +{{- define "debezium.requiresVolumeset" -}} +{{- if eq .Values.source.offset.storage "file" -}} +true +{{- else if and (eq (include "debezium.requiresSchemaHistory" .) "true") (eq .Values.source.schemaHistory.storage "file") -}} +true +{{- else -}} +false +{{- end -}} +{{- end -}} + +{{/* +================================================================================ +Labeling +================================================================================ +*/}} + +{{/* +Create chart name and version as used by the chart label +*/}} +{{- define "debezium.chart" -}} +{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" }} +{{- end }} + +{{/* +Common labels/tags +*/}} +{{- define "debezium.tags" -}} +helm.sh/chart: {{ include "debezium.chart" . }} +{{ include "debezium.selectorLabels" . 
}} +{{- if .Chart.AppVersion }} +app.cpln.io/version: {{ .Chart.AppVersion | quote }} +{{- end }} +app.cpln.io/managed-by: {{ .Release.Service }} +cpln/marketplace: "true" +cpln/marketplace-template: debezium-server +cpln/marketplace-template-version: {{ .Chart.Version }} +cpln/marketplace-gvc: {{ .Values.global.cpln.gvc }} +{{- end }} + +{{/* +Selector labels +*/}} +{{- define "debezium.selectorLabels" -}} +app.cpln.io/name: {{ .Release.Name }} +app.cpln.io/instance: {{ .Release.Name }} +{{- end }} diff --git a/debezium-server/versions/1.1.0/templates/identity.yaml b/debezium-server/versions/1.1.0/templates/identity.yaml new file mode 100644 index 00000000..ac53fdd2 --- /dev/null +++ b/debezium-server/versions/1.1.0/templates/identity.yaml @@ -0,0 +1,19 @@ +{{- include "debezium.validateSource" . -}} +{{- include "debezium.validateSink" . -}} +{{- include "debezium.validateOffsetStorage" . -}} +kind: identity +name: {{ include "debezium.identity.name" . }} +description: Debezium Server identity for secret access and cloud integration +gvc: {{ .Values.global.cpln.gvc }} +tags: + {{- include "debezium.tags" . | nindent 2 }} +{{- if and (eq .Values.sink.type "kinesis") .Values.sink.kinesis.cloudAccount.enabled }} +aws: + cloudAccountLink: //cloudaccount/{{ .Values.sink.kinesis.cloudAccount.name }} +{{- end }} +{{- if and (eq .Values.sink.type "pubsub") .Values.sink.pubsub.cloudAccount.enabled }} +gcp: + cloudAccountLink: //cloudaccount/{{ .Values.sink.pubsub.cloudAccount.name }} + scopes: + - https://www.googleapis.com/auth/pubsub +{{- end }} diff --git a/debezium-server/versions/1.1.0/templates/policy.yaml b/debezium-server/versions/1.1.0/templates/policy.yaml new file mode 100644 index 00000000..b9e40ef1 --- /dev/null +++ b/debezium-server/versions/1.1.0/templates/policy.yaml @@ -0,0 +1,17 @@ +kind: policy +name: {{ include "debezium.policy.name" . }} +description: Debezium Server policy for secret access +tags: + {{- include "debezium.tags" . 
| nindent 2 }} +bindings: + - permissions: + - reveal + principalLinks: + - //gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "debezium.identity.name" . }} +targetKind: secret +targetLinks: + - //secret/{{ include "debezium.config.name" . }} + - //secret/{{ include "debezium.credentials.name" . }} + {{- if and (eq .Values.source.type "postgres") (gt (int .Values.source.postgres.heartbeatIntervalMs) 0) }} + - //secret/{{ include "debezium.entrypoint.name" . }} + {{- end }} diff --git a/debezium-server/versions/1.1.0/templates/secret-config.yaml b/debezium-server/versions/1.1.0/templates/secret-config.yaml new file mode 100644 index 00000000..b32381b5 --- /dev/null +++ b/debezium-server/versions/1.1.0/templates/secret-config.yaml @@ -0,0 +1,260 @@ +kind: secret +name: {{ include "debezium.config.name" . }} +description: Debezium Server application.properties configuration +tags: + {{- include "debezium.tags" . | nindent 2 }} +type: opaque +data: + encoding: plain + payload: |- + # ============================================================================= + # Debezium Server Configuration + # Generated by Control Plane Debezium Server Template + # ============================================================================= + + # ----------------------------------------------------------------------------- + # Quarkus Settings + # ----------------------------------------------------------------------------- + quarkus.http.port=8080 + quarkus.log.console.json=false + + # ----------------------------------------------------------------------------- + # Source Connector Configuration + # ----------------------------------------------------------------------------- + debezium.source.connector.class={{ include "debezium.connectorClass" . }} + debezium.source.topic.prefix={{ .Values.source.serverName }} + + # Database connection + debezium.source.database.hostname=${DB_HOSTNAME} + debezium.source.database.port={{ include "debezium.databasePort" . 
}} + debezium.source.database.user=${DB_USER} + debezium.source.database.password=${DB_PASSWORD} + {{- if or (eq .Values.source.type "oracle") (eq .Values.source.type "postgres") }} + debezium.source.database.dbname={{ .Values.source.database.name }} + {{- else }} + debezium.source.database.name={{ .Values.source.database.name }} + {{- end }} + + {{- if .Values.source.tableIncludeList }} + debezium.source.table.include.list={{ .Values.source.tableIncludeList }} + {{- end }} + {{- if .Values.source.tableExcludeList }} + debezium.source.table.exclude.list={{ .Values.source.tableExcludeList }} + {{- end }} + + {{- /* PostgreSQL-specific settings */}} + {{- if eq .Values.source.type "postgres" }} + debezium.source.plugin.name={{ .Values.source.postgres.pluginName }} + debezium.source.slot.name={{ .Values.source.postgres.slotName }} + debezium.source.publication.name={{ .Values.source.postgres.publicationName }} + debezium.source.slot.drop.on.stop={{ .Values.source.postgres.slotDropOnStop }} + {{- if gt (int .Values.source.postgres.heartbeatIntervalMs) 0 }} + debezium.source.heartbeat.interval.ms={{ .Values.source.postgres.heartbeatIntervalMs }} + {{- if .Values.source.postgres.heartbeatActionQuery }} + debezium.source.heartbeat.action.query={{ .Values.source.postgres.heartbeatActionQuery }} + {{- end }} + {{- end }} + {{- end }} + + {{- /* MySQL-specific settings */}} + {{- if eq .Values.source.type "mysql" }} + debezium.source.database.server.id={{ .Values.source.mysql.serverId }} + debezium.source.include.schema.changes={{ .Values.source.mysql.includeSchemaChanges }} + {{- end }} + + {{- /* MongoDB-specific settings */}} + {{- if eq .Values.source.type "mongodb" }} + {{- if .Values.source.mongodb.connectionString }} + debezium.source.mongodb.connection.string=${MONGODB_CONNECTION_STRING} + {{- end }} + {{- if .Values.source.mongodb.replicaSet }} + debezium.source.mongodb.replica.set={{ .Values.source.mongodb.replicaSet }} + {{- end }} + {{- end }} + + {{- /* SQL 
Server-specific settings */}} + {{- if eq .Values.source.type "sqlserver" }} + {{- if .Values.source.sqlserver.databaseNames }} + debezium.source.database.names={{ .Values.source.sqlserver.databaseNames }} + {{- end }} + debezium.source.snapshot.mode={{ .Values.source.sqlserver.snapshotMode }} + {{- end }} + + {{- /* Oracle-specific settings */}} + {{- if eq .Values.source.type "oracle" }} + {{- if .Values.source.oracle.pdbName }} + debezium.source.database.pdb.name={{ .Values.source.oracle.pdbName }} + {{- end }} + debezium.source.log.mining.strategy={{ .Values.source.oracle.logMiningStrategy }} + {{- end }} + + # ----------------------------------------------------------------------------- + # Offset Storage Configuration + # ----------------------------------------------------------------------------- + {{- if eq .Values.source.offset.storage "file" }} + debezium.source.offset.storage=org.apache.kafka.connect.storage.FileOffsetBackingStore + debezium.source.offset.storage.file.filename={{ .Values.source.offset.file.filename }} + {{- else if eq .Values.source.offset.storage "redis" }} + debezium.source.offset.storage=io.debezium.storage.redis.offset.RedisOffsetBackingStore + debezium.source.offset.storage.redis.address=${OFFSET_REDIS_ADDRESS} + debezium.source.offset.storage.redis.key={{ .Values.source.offset.redis.key }} + {{- if .Values.source.offset.redis.password }} + debezium.source.offset.storage.redis.password=${OFFSET_REDIS_PASSWORD} + {{- end }} + {{- if .Values.source.offset.redis.ssl }} + debezium.source.offset.storage.redis.ssl.enabled=true + {{- end }} + {{- else if eq .Values.source.offset.storage "jdbc" }} + debezium.source.offset.storage=io.debezium.storage.jdbc.offset.JdbcOffsetBackingStore + debezium.source.offset.storage.jdbc.url=${OFFSET_JDBC_URL} + debezium.source.offset.storage.jdbc.user=${OFFSET_JDBC_USER} + debezium.source.offset.storage.jdbc.password=${OFFSET_JDBC_PASSWORD} + debezium.source.offset.storage.jdbc.offset.table.name={{ 
.Values.source.offset.jdbc.tableName }} + {{- end }} + debezium.source.offset.flush.interval.ms={{ .Values.source.offset.flushIntervalMs }} + debezium.source.offset.flush.timeout.ms={{ .Values.source.offset.flushTimeoutMs }} + + # ----------------------------------------------------------------------------- + # Error Retry Configuration + # ----------------------------------------------------------------------------- + debezium.source.errors.retry.delay.initial.ms={{ .Values.source.errors.retryDelayInitialMs }} + debezium.source.errors.retry.delay.max.ms={{ .Values.source.errors.retryDelayMaxMs }} + debezium.source.errors.max.retries={{ .Values.source.errors.maxRetries }} + + {{- /* Schema History Storage (MySQL and SQL Server only) */}} + {{- if eq (include "debezium.requiresSchemaHistory" .) "true" }} + + # ----------------------------------------------------------------------------- + # Schema History Storage Configuration + # ----------------------------------------------------------------------------- + {{- if eq .Values.source.schemaHistory.storage "file" }} + debezium.source.schema.history.internal=io.debezium.storage.file.history.FileSchemaHistory + debezium.source.schema.history.internal.file.filename={{ .Values.source.schemaHistory.file.filename }} + {{- else if eq .Values.source.schemaHistory.storage "redis" }} + debezium.source.schema.history.internal=io.debezium.storage.redis.history.RedisSchemaHistory + debezium.source.schema.history.internal.redis.address=${SCHEMA_HISTORY_REDIS_ADDRESS} + debezium.source.schema.history.internal.redis.key={{ .Values.source.schemaHistory.redis.key }} + {{- if .Values.source.schemaHistory.redis.password }} + debezium.source.schema.history.internal.redis.password=${SCHEMA_HISTORY_REDIS_PASSWORD} + {{- end }} + {{- if .Values.source.schemaHistory.redis.ssl }} + debezium.source.schema.history.internal.redis.ssl.enabled=true + {{- end }} + {{- else if eq .Values.source.schemaHistory.storage "jdbc" }} + 
debezium.source.schema.history.internal=io.debezium.storage.jdbc.history.JdbcSchemaHistory + debezium.source.schema.history.internal.jdbc.url=${SCHEMA_HISTORY_JDBC_URL} + debezium.source.schema.history.internal.jdbc.user=${SCHEMA_HISTORY_JDBC_USER} + debezium.source.schema.history.internal.jdbc.password=${SCHEMA_HISTORY_JDBC_PASSWORD} + debezium.source.schema.history.internal.jdbc.schema.history.table.name={{ .Values.source.schemaHistory.jdbc.tableName }} + {{- end }} + {{- end }} + + # ----------------------------------------------------------------------------- + # Sink Configuration + # ----------------------------------------------------------------------------- + {{- if eq .Values.sink.type "kafka" }} + debezium.sink.type=kafka + debezium.sink.kafka.producer.bootstrap.servers=${KAFKA_BOOTSTRAP_SERVERS} + debezium.sink.kafka.producer.key.serializer=org.apache.kafka.common.serialization.StringSerializer + debezium.sink.kafka.producer.value.serializer=org.apache.kafka.common.serialization.StringSerializer + {{- if .Values.sink.kafka.topic }} + debezium.sink.kafka.producer.topic.prefix={{ .Values.sink.kafka.topic }} + {{- end }} + debezium.sink.kafka.producer.security.protocol={{ .Values.sink.kafka.securityProtocol }} + {{- if .Values.sink.kafka.saslMechanism }} + debezium.sink.kafka.producer.sasl.mechanism={{ .Values.sink.kafka.saslMechanism }} + debezium.sink.kafka.producer.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="${KAFKA_SASL_USERNAME}" password="${KAFKA_SASL_PASSWORD}"; + {{- end }} + {{- end }} + + {{- if eq .Values.sink.type "redis" }} + debezium.sink.type=redis + debezium.sink.redis.address=${SINK_REDIS_ADDRESS} + {{- if .Values.sink.redis.password }} + debezium.sink.redis.password=${SINK_REDIS_PASSWORD} + {{- end }} + {{- if .Values.sink.redis.ssl }} + debezium.sink.redis.ssl.enabled=true + {{- end }} + {{- if .Values.sink.redis.streamName }} + debezium.sink.redis.stream.name={{ 
.Values.sink.redis.streamName }} + {{- end }} + {{- end }} + + {{- if eq .Values.sink.type "nats-jetstream" }} + debezium.sink.type=nats-jetstream + debezium.sink.nats-jetstream.url=${NATS_URL} + {{- if .Values.sink.nats.subject }} + debezium.sink.nats-jetstream.subject={{ .Values.sink.nats.subject }} + {{- end }} + {{- if .Values.sink.nats.username }} + debezium.sink.nats-jetstream.username=${NATS_USERNAME} + debezium.sink.nats-jetstream.password=${NATS_PASSWORD} + {{- end }} + {{- end }} + + {{- if eq .Values.sink.type "http" }} + debezium.sink.type=http + debezium.sink.http.url=${HTTP_SINK_URL} + {{- if eq .Values.sink.http.authType "basic" }} + debezium.sink.http.authentication.type=basic + debezium.sink.http.authentication.username=${HTTP_SINK_USERNAME} + debezium.sink.http.authentication.password=${HTTP_SINK_PASSWORD} + {{- else if eq .Values.sink.http.authType "bearer" }} + debezium.sink.http.authentication.type=bearer + debezium.sink.http.authentication.bearer.token=${HTTP_SINK_BEARER_TOKEN} + {{- end }} + {{- range $key, $value := .Values.sink.http.headers }} + debezium.sink.http.headers.{{ $key }}={{ $value }} + {{- end }} + {{- end }} + + {{- if eq .Values.sink.type "kinesis" }} + debezium.sink.type=kinesis + debezium.sink.kinesis.region={{ .Values.sink.kinesis.region }} + debezium.sink.kinesis.stream={{ .Values.sink.kinesis.streamName }} + debezium.sink.kinesis.credentials.provider={{ .Values.sink.kinesis.credentialsProvider }} + {{- end }} + + {{- if eq .Values.sink.type "pubsub" }} + debezium.sink.type=pubsub + debezium.sink.pubsub.project.id={{ .Values.sink.pubsub.projectId }} + {{- if .Values.sink.pubsub.topic }} + debezium.sink.pubsub.topic.prefix={{ .Values.sink.pubsub.topic }} + {{- end }} + {{- end }} + + {{- if eq .Values.sink.type "pulsar" }} + debezium.sink.type=pulsar + debezium.sink.pulsar.client.serviceUrl=${PULSAR_SERVICE_URL} + {{- if .Values.sink.pulsar.topic }} + debezium.sink.pulsar.topic.prefix={{ .Values.sink.pulsar.topic }} + {{- 
end }} + {{- if .Values.sink.pulsar.authPluginClassName }} + debezium.sink.pulsar.client.authPluginClassName={{ .Values.sink.pulsar.authPluginClassName }} + debezium.sink.pulsar.client.authParams=token:${PULSAR_AUTH_TOKEN} + {{- end }} + {{- end }} + + {{- if eq .Values.sink.type "eventhubs" }} + debezium.sink.type=eventhubs + debezium.sink.eventhubs.connection.string=${EVENTHUBS_CONNECTION_STRING} + debezium.sink.eventhubs.hub.name={{ .Values.sink.eventhubs.hubName }} + {{- end }} + + # ----------------------------------------------------------------------------- + # Serialization Format + # ----------------------------------------------------------------------------- + debezium.format.key={{ .Values.format.key }} + debezium.format.value={{ .Values.format.value }} + {{- if or (eq .Values.format.key "avro") (eq .Values.format.key "protobuf") (eq .Values.format.value "avro") (eq .Values.format.value "protobuf") }} + {{- if .Values.format.schemaRegistry.url }} + debezium.format.key.schemas.enable=true + debezium.format.value.schemas.enable=true + debezium.format.schema.registry.url=${SCHEMA_REGISTRY_URL} + {{- if .Values.format.schemaRegistry.username }} + debezium.format.schema.registry.basic.auth.credentials.source=USER_INFO + debezium.format.schema.registry.basic.auth.user.info=${SCHEMA_REGISTRY_USERNAME}:${SCHEMA_REGISTRY_PASSWORD} + {{- end }} + {{- end }} + {{- end }} diff --git a/debezium-server/versions/1.1.0/templates/secret-credentials.yaml b/debezium-server/versions/1.1.0/templates/secret-credentials.yaml new file mode 100644 index 00000000..d516d6a0 --- /dev/null +++ b/debezium-server/versions/1.1.0/templates/secret-credentials.yaml @@ -0,0 +1,99 @@ +kind: secret +name: {{ include "debezium.credentials.name" . }} +description: Debezium Server credentials +tags: + {{- include "debezium.tags" . | nindent 2 }} +type: dictionary +data: + # Database credentials + db-hostname: {{ include "debezium.dbHostname" . 
| quote }} + db-user: {{ .Values.source.database.user | quote }} + db-password: {{ .Values.source.database.password | quote }} + + {{- /* MongoDB connection string */}} + {{- if and (eq .Values.source.type "mongodb") .Values.source.mongodb.connectionString }} + mongodb-connection-string: {{ .Values.source.mongodb.connectionString | quote }} + {{- end }} + + {{- /* Offset storage credentials */}} + {{- if eq .Values.source.offset.storage "redis" }} + offset-redis-address: {{ .Values.source.offset.redis.address | quote }} + {{- if .Values.source.offset.redis.password }} + offset-redis-password: {{ .Values.source.offset.redis.password | quote }} + {{- end }} + {{- end }} + {{- if eq .Values.source.offset.storage "jdbc" }} + offset-jdbc-url: {{ .Values.source.offset.jdbc.url | quote }} + offset-jdbc-user: {{ .Values.source.offset.jdbc.user | quote }} + offset-jdbc-password: {{ .Values.source.offset.jdbc.password | quote }} + {{- end }} + + {{- /* Schema history storage credentials (MySQL/SQL Server only) */}} + {{- if eq (include "debezium.requiresSchemaHistory" .) "true" }} + {{- if eq .Values.source.schemaHistory.storage "redis" }} + schema-history-redis-address: {{ .Values.source.schemaHistory.redis.address | quote }} + {{- if .Values.source.schemaHistory.redis.password }} + schema-history-redis-password: {{ .Values.source.schemaHistory.redis.password | quote }} + {{- end }} + {{- end }} + {{- if eq .Values.source.schemaHistory.storage "jdbc" }} + schema-history-jdbc-url: {{ .Values.source.schemaHistory.jdbc.url | quote }} + schema-history-jdbc-user: {{ .Values.source.schemaHistory.jdbc.user | quote }} + schema-history-jdbc-password: {{ .Values.source.schemaHistory.jdbc.password | quote }} + {{- end }} + {{- end }} + + {{- /* Sink credentials */}} + {{- if eq .Values.sink.type "kafka" }} + kafka-bootstrap-servers: {{ include "debezium.kafkaBootstrapServers" . 
| quote }} + {{- if .Values.sink.kafka.saslUsername }} + kafka-sasl-username: {{ .Values.sink.kafka.saslUsername | quote }} + kafka-sasl-password: {{ .Values.sink.kafka.saslPassword | quote }} + {{- end }} + {{- end }} + + {{- if eq .Values.sink.type "redis" }} + sink-redis-address: {{ .Values.sink.redis.address | quote }} + {{- if .Values.sink.redis.password }} + sink-redis-password: {{ .Values.sink.redis.password | quote }} + {{- end }} + {{- end }} + + {{- if eq .Values.sink.type "nats-jetstream" }} + nats-url: {{ .Values.sink.nats.url | quote }} + {{- if .Values.sink.nats.username }} + nats-username: {{ .Values.sink.nats.username | quote }} + nats-password: {{ .Values.sink.nats.password | quote }} + {{- end }} + {{- end }} + + {{- if eq .Values.sink.type "http" }} + http-sink-url: {{ .Values.sink.http.url | quote }} + {{- if eq .Values.sink.http.authType "basic" }} + http-sink-username: {{ .Values.sink.http.username | quote }} + http-sink-password: {{ .Values.sink.http.password | quote }} + {{- end }} + {{- if eq .Values.sink.http.authType "bearer" }} + http-sink-bearer-token: {{ .Values.sink.http.bearerToken | quote }} + {{- end }} + {{- end }} + + {{- if eq .Values.sink.type "pulsar" }} + pulsar-service-url: {{ .Values.sink.pulsar.serviceUrl | quote }} + {{- if .Values.sink.pulsar.authToken }} + pulsar-auth-token: {{ .Values.sink.pulsar.authToken | quote }} + {{- end }} + {{- end }} + + {{- if eq .Values.sink.type "eventhubs" }} + eventhubs-connection-string: {{ .Values.sink.eventhubs.connectionString | quote }} + {{- end }} + + {{- /* Schema registry credentials */}} + {{- if .Values.format.schemaRegistry.url }} + schema-registry-url: {{ .Values.format.schemaRegistry.url | quote }} + {{- if .Values.format.schemaRegistry.username }} + schema-registry-username: {{ .Values.format.schemaRegistry.username | quote }} + schema-registry-password: {{ .Values.format.schemaRegistry.password | quote }} + {{- end }} + {{- end }} diff --git 
a/debezium-server/versions/1.1.0/templates/secret-entrypoint.yaml b/debezium-server/versions/1.1.0/templates/secret-entrypoint.yaml new file mode 100644 index 00000000..469c0503 --- /dev/null +++ b/debezium-server/versions/1.1.0/templates/secret-entrypoint.yaml @@ -0,0 +1,46 @@ +{{- if and (eq .Values.source.type "postgres") (gt (int .Values.source.postgres.heartbeatIntervalMs) 0) }} +kind: secret +name: {{ include "debezium.entrypoint.name" . }} +description: Debezium Server entrypoint script for PostgreSQL prerequisites +tags: + {{- include "debezium.tags" . | nindent 2 }} +type: opaque +data: + encoding: plain + payload: | + #!/bin/bash + + set -o nounset + + echo "=== Debezium Server Entrypoint ===" + + # Find the PostgreSQL JDBC driver JAR (bundled with Debezium Server) + JDBC_JAR=$(find /debezium -name "postgresql-*.jar" -print -quit 2>/dev/null || true) + if [ -z "${JDBC_JAR}" ]; then + echo "WARNING: PostgreSQL JDBC driver not found. Skipping prerequisites." + exec /debezium/run.sh + fi + echo "Using JDBC driver: ${JDBC_JAR}" + + # Decode pre-compiled PgInit.class (Java 17+ compatible) + # Connects via JDBC, creates heartbeat table + failover replication slot + echo 
"yv66vgAAAD0AtQoAAgADBwAEDAAFAAYBABBqYXZhL2xhbmcvT2JqZWN0AQAGPGluaXQ+AQADKClWCQAIAAkHAAoMAAsADAEAEGphdmEvbGFuZy9TeXN0ZW0BAANlcnIBABVMamF2YS9pby9QcmludFN0cmVhbTsIAA4BAE5Vc2FnZTogUGdJbml0IDxob3N0PiA8cG9ydD4gPGRibmFtZT4gPHVzZXI+IDxwYXNzd29yZD4gPHNsb3ROYW1lPiA8cGx1Z2luTmFtZT4KABAAEQcAEgwAEwAUAQATamF2YS9pby9QcmludFN0cmVhbQEAB3ByaW50bG4BABUoTGphdmEvbGFuZy9TdHJpbmc7KVYKAAgAFgwAFwAYAQAEZXhpdAEABChJKVYSAAAAGgwAGwAcAQAXbWFrZUNvbmNhdFdpdGhDb25zdGFudHMBAEooTGphdmEvbGFuZy9TdHJpbmc7TGphdmEvbGFuZy9TdHJpbmc7TGphdmEvbGFuZy9TdHJpbmc7KUxqYXZhL2xhbmcvU3RyaW5nOwoAHgAfBwAgDAAhACIBABZqYXZhL3NxbC9Ecml2ZXJNYW5hZ2VyAQANZ2V0Q29ubmVjdGlvbgEATShMamF2YS9sYW5nL1N0cmluZztMamF2YS9sYW5nL1N0cmluZztMamF2YS9sYW5nL1N0cmluZzspTGphdmEvc3FsL0Nvbm5lY3Rpb247CQAIACQMACUADAEAA291dAgAJwEAGENvbm5lY3RlZCB0byBQb3N0Z3JlU1FMLgcAKQEAFWphdmEvc3FsL1NRTEV4Y2VwdGlvbgoAKAArDAAsAC0BAApnZXRNZXNzYWdlAQAUKClMamF2YS9sYW5nL1N0cmluZzsSAAEALwwAGwAwAQAnKElMamF2YS9sYW5nL1N0cmluZzspTGphdmEvbGFuZy9TdHJpbmc7EgACADIMABsAMwEAKChJSUxqYXZhL2xhbmcvU3RyaW5nOylMamF2YS9sYW5nL1N0cmluZzsFAAAAAAAAE4gKADcAOAcAOQwAOgA7AQAQamF2YS9sYW5nL1RocmVhZAEABXNsZWVwAQAEKEopVgcAPQEAHmphdmEvbGFuZy9JbnRlcnJ1cHRlZEV4Y2VwdGlvbgoANwA/DABAAEEBAA1jdXJyZW50VGhyZWFkAQAUKClMamF2YS9sYW5nL1RocmVhZDsKADcAQwwARAAGAQAJaW50ZXJydXB0CABGAQArRW5zdXJpbmcgZGViZXppdW1faGVhcnRiZWF0IHRhYmxlIGV4aXN0cy4uLgsASABJBwBKDABLAEwBABNqYXZhL3NxbC9Db25uZWN0aW9uAQAPY3JlYXRlU3RhdGVtZW50AQAWKClMamF2YS9zcWwvU3RhdGVtZW50OwgATgEAa0NSRUFURSBUQUJMRSBJRiBOT1QgRVhJU1RTIGRlYmV6aXVtX2hlYXJ0YmVhdCAoaWQgSU5URUdFUiBQUklNQVJZIEtFWSwgdHMgVElNRVNUQU1QIE5PVCBOVUxMIERFRkFVTFQgbm93KCkpCwBQAFEHAFIMAFMAVAEAEmphdmEvc3FsL1N0YXRlbWVudAEAB2V4ZWN1dGUBABUoTGphdmEvbGFuZy9TdHJpbmc7KVoIAFYBAFVJTlNFUlQgSU5UTyBkZWJleml1bV9oZWFydGJlYXQgKGlkLCB0cykgVkFMVUVTICgxLCBub3coKSkgT04gQ09ORkxJQ1QgKGlkKSBETyBOT1RISU5HCwBQAFgMAFkABgEABWNsb3NlBwBbAQATamF2YS9sYW5nL1Rocm93YWJsZQoAWgBdDABeAF8BAA1hZGRTdXBwcmVzc2VkAQAYKExqYXZhL2xhbmcvVGhyb3dhYmxlOylWCABhAQAWSGVhcnRiZWF0IHRhYmxlIHJlYWR5LhIAAwBjDAAbAGQBACYoTGphdmEvbGFuZy9TdHJpbmc7KUxqYXZhL2xhbmcvU3R
yaW5nOwgAZgEAPVNFTEVDVCBjb3VudCgqKSBGUk9NIHBnX3JlcGxpY2F0aW9uX3Nsb3RzIFdIRVJFIHNsb3RfbmFtZSA9ID8LAEgAaAwAaQBqAQAQcHJlcGFyZVN0YXRlbWVudAEAMChMamF2YS9sYW5nL1N0cmluZzspTGphdmEvc3FsL1ByZXBhcmVkU3RhdGVtZW50OwsAbABtBwBuDABvAHABABpqYXZhL3NxbC9QcmVwYXJlZFN0YXRlbWVudAEACXNldFN0cmluZwEAFihJTGphdmEvbGFuZy9TdHJpbmc7KVYLAGwAcgwAcwB0AQAMZXhlY3V0ZVF1ZXJ5AQAWKClMamF2YS9zcWwvUmVzdWx0U2V0OwsAdgB3BwB4DAB5AHoBABJqYXZhL3NxbC9SZXN1bHRTZXQBAARuZXh0AQADKClaCwB2AHwMAH0AfgEABmdldEludAEABChJKUkSAAQAgAwAGwCBAQA4KExqYXZhL2xhbmcvU3RyaW5nO0xqYXZhL2xhbmcvU3RyaW5nOylMamF2YS9sYW5nL1N0cmluZzsSAAUAYxIABgBjCwBsAFgLAEgAWAgAhwEAF1ByZXJlcXVpc2l0ZXMgY29tcGxldGUuEgAHAGMIAIoBACJEZWJleml1bSBTZXJ2ZXIgd2lsbCBzdGFydCBhbnl3YXkuBwCMAQAGUGdJbml0AQAEQ29kZQEAD0xpbmVOdW1iZXJUYWJsZQEABG1haW4BABYoW0xqYXZhL2xhbmcvU3RyaW5nOylWAQANU3RhY2tNYXBUYWJsZQcAkwEAE1tMamF2YS9sYW5nL1N0cmluZzsHAJUBABBqYXZhL2xhbmcvU3RyaW5nAQAKU291cmNlRmlsZQEAC1BnSW5pdC5qYXZhAQAQQm9vdHN0cmFwTWV0aG9kcwgAmgEAF2pkYmM6cG9zdGdyZXNxbDovLwE6AS8BCACcAQAsRVJST1I6IENvdWxkIG5vdCBjb25uZWN0IGFmdGVyIAEgYXR0ZW1wdHM6IAEIAJ4BACUgIEF0dGVtcHQgAS8BIC0gcmV0cnlpbmcgaW4gNXMuLi4gKAEpCACgAQAwRW5zdXJpbmcgZmFpbG92ZXIgcmVwbGljYXRpb24gc2xvdCAnAScgZXhpc3RzLi4uCACiAQBAU0VMRUNUIHBnX2NyZWF0ZV9sb2dpY2FsX3JlcGxpY2F0aW9uX3Nsb3QoJwEnLCAnAScsIGZhbHNlLCB0cnVlKQgApAEAJkNyZWF0ZWQgZmFpbG92ZXIgcmVwbGljYXRpb24gc2xvdCAnAScuCACmAQAkUmVwbGljYXRpb24gc2xvdCAnAScgYWxyZWFkeSBleGlzdHMuCACoAQAlV0FSTklORzogUHJlcmVxdWlzaXRlIHNldHVwIGZhaWxlZDogAQ8GAKoKAKsArAcArQwAGwCuAQAkamF2YS9sYW5nL2ludm9rZS9TdHJpbmdDb25jYXRGYWN0b3J5AQCYKExqYXZhL2xhbmcvaW52b2tlL01ldGhvZEhhbmRsZXMkTG9va3VwO0xqYXZhL2xhbmcvU3RyaW5nO0xqYXZhL2xhbmcvaW52b2tlL01ldGhvZFR5cGU7TGphdmEvbGFuZy9TdHJpbmc7W0xqYXZhL2xhbmcvT2JqZWN0OylMamF2YS9sYW5nL2ludm9rZS9DYWxsU2l0ZTsBAAxJbm5lckNsYXNzZXMHALEBACVqYXZhL2xhbmcvaW52b2tlL01ldGhvZEhhbmRsZXMkTG9va3VwBwCzAQAeamF2YS9sYW5nL2ludm9rZS9NZXRob2RIYW5kbGVzAQAGTG9va3VwACEAiwACAAAAAAACAAEABQAGAAEAjQAAAB0AAQABAAAABSq3AAGxAAAAAQCOAAAABgABAAAAAwAJAI8AkAABAI0AAARsAAQAEAAAAgIqvhAHogAPsgAHEg22AA8EuAAVKgMyTCoEMk0qBTJOKgYyOgQqBzI
6BSoIMjoGKhAGMjoHKywtugAZAAA6CBAeNgkBOgoENgsVCxUJowBjGQgZBBkFuAAdOgqyACMSJrYAD6cATToMFQsVCaAAGbIABxUJGQy2ACq6AC4AALYADwO4ABWyACMVCxUJGQy2ACq6ADEAALYADxQANLgANqcACzoNuAA+tgBChAsBp/+csgAjEkW2AA8ZCrkARwEAOgsZCxJNuQBPAgBXGQsSVbkATwIAVxkLxgAqGQu5AFcBAKcAIDoMGQvGABYZC7kAVwEApwAMOg0ZDBkNtgBcGQy/sgAjEmC2AA+yACMZBroAYgAAtgAPGQoSZbkAZwIAOgsZCwQZBrkAawMAGQu5AHEBADoMGQy5AHUBAFcZDAS5AHsCAJoAWRkKuQBHAQA6DRkNGQYZB7oAfwAAuQBPAgBXGQ3GACoZDbkAVwEApwAgOg4ZDcYAFhkNuQBXAQCnAAw6DxkOGQ+2AFwZDr+yACMZBroAggAAtgAPpwAQsgAjGQa6AIMAALYADxkLxgAqGQu5AIQBAKcAIDoMGQvGABYZC7kAhAEApwAMOg0ZDBkNtgBcGQy/GQq5AIUBALIAIxKGtgAPpwAdOguyAAcZC7YAKroAiAAAtgAPsgAHEom2AA+xAAkATwBiAGUAKACYAJ4AoQA8AMAA1ADjAFoA6gDxAPQAWgFPAWABbwBaAXYBfQGAAFoBIAGpAbgAWgG/AcYByQBaAK8B5AHnACgAAgCOAAAAwgAwAAAABQAHAAYADwAHABMACQAfAAoAKQALADQADAA+AA4AQgAPAEUAEABPABIAWgATAGIAFABlABUAZwAWAG4AFwCAABgAhAAaAJgAGwCpABAArwAgALcAIQDAACIAygAjANQAJADjACEBAAAlAQgAJwEVACgBIAApASoAKgEzACsBOwAsAUYALQFPAC4BYAAvAW8ALQGMADABnAAyAakANAG4ACgB1QA2AdwANwHkADsB5wA4AekAOQH5ADoCAQA8AJEAAAFIABcT/wA0AAwHAJIHAJQHAJQHAJQHAJQHAJQHAJQHAJQHAJQBBwBIAQAAXAcAKPwAHgcAKFwHADz6AAf6AAX/ADMADAcAkgcAlAcAlAcAlAcAlAcAlAcAlAcAlAcAlAEHAEgHAFAAAQcAWv8AEAANBwCSBwCUBwCUBwCUBwCUBwCUBwCUBwCUBwCUAQcASAcAUAcAWgABBwBaCPkAAv8AbgAOBwCSBwCUBwCUBwCUBwCUBwCUBwCUBwCUBwCUAQcASAcAbAcAdgcAUAABBwBa/wAQAA8HAJIHAJQHAJQHAJQHAJQHAJQHAJQHAJQHAJQBBwBIBwBsBwB2BwBQBwBaAAEHAFoI+QACD/oADE4HAFr/ABAADQcAkgcAlAcAlAcAlAcAlAcAlAcAlAcAlAcAlAEHAEgHAGwHAFoAAQcAWgj5AAJRBwAoGQADAJYAAAACAJcAmAAAADIACACpAAEAmQCpAAEAmwCpAAEAnQCpAAEAnwCpAAEAoQCpAAEAowCpAAEApQCpAAEApwCvAAAACgABALAAsgC0ABk=" | base64 -d > /tmp/PgInit.class + + if [ $? -ne 0 ]; then + echo "WARNING: Failed to decode PgInit.class. Skipping prerequisites." + exec /debezium/run.sh + fi + + echo "Running PostgreSQL prerequisites..." + java -cp "${JDBC_JAR}:/tmp" PgInit \ + "${DB_HOSTNAME}" \ + "{{ include "debezium.databasePort" . 
}}" \ + "{{ .Values.source.database.name }}" \ + "${DB_USER}" \ + "${DB_PASSWORD}" \ + "{{ .Values.source.postgres.slotName }}" \ + "{{ .Values.source.postgres.pluginName }}" || echo "WARNING: Prerequisites script returned non-zero. Continuing anyway." + + echo "=== Starting Debezium Server ===" + exec /debezium/run.sh +{{- end }} diff --git a/debezium-server/versions/1.1.0/templates/volumeset.yaml b/debezium-server/versions/1.1.0/templates/volumeset.yaml new file mode 100644 index 00000000..b6c4a5c3 --- /dev/null +++ b/debezium-server/versions/1.1.0/templates/volumeset.yaml @@ -0,0 +1,15 @@ +{{- if eq (include "debezium.requiresVolumeset" .) "true" }} +kind: volumeset +name: {{ include "debezium.volumeset.name" . }} +gvc: {{ .Values.global.cpln.gvc }} +description: Debezium Server data volumeset for offset and schema history storage +tags: + {{- include "debezium.tags" . | nindent 2 }} +spec: + fileSystemType: ext4 + initialCapacity: {{ .Values.volumeset.capacity }} + performanceClass: {{ .Values.volumeset.performanceClass }} + snapshots: + createFinalSnapshot: true + retentionDuration: 7d +{{- end }} diff --git a/debezium-server/versions/1.1.0/templates/workload-debezium.yaml b/debezium-server/versions/1.1.0/templates/workload-debezium.yaml new file mode 100644 index 00000000..e71cc327 --- /dev/null +++ b/debezium-server/versions/1.1.0/templates/workload-debezium.yaml @@ -0,0 +1,221 @@ +kind: workload +name: {{ include "debezium.name" . }} +gvc: {{ .Values.global.cpln.gvc }} +description: Debezium Server CDC workload +tags: + {{- include "debezium.tags" . | nindent 2 }} +spec: + {{- if eq (include "debezium.requiresVolumeset" .) "true" }} + type: stateful + {{- else }} + type: standard + {{- end }} + identityLink: //identity/{{ include "debezium.identity.name" . 
}} + containers: + - name: debezium-server + {{- if and (eq .Values.source.type "postgres") (gt (int .Values.source.postgres.heartbeatIntervalMs) 0) }} + command: /bin/bash + args: + - '-c' + - >- + cp /scripts/debezium-entrypoint.sh /tmp/ && chmod +x /tmp/debezium-entrypoint.sh && + /tmp/debezium-entrypoint.sh + {{- end }} + image: {{ .Values.image }} + inheritEnv: false + cpu: {{ .Values.resources.cpu | quote }} + memory: {{ .Values.resources.memory | quote }} + ports: + - number: 8080 + protocol: http + env: + # Database credentials from secret + - name: DB_HOSTNAME + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.db-hostname' + - name: DB_USER + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.db-user' + - name: DB_PASSWORD + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.db-password' + + {{- /* MongoDB connection string */}} + {{- if and (eq .Values.source.type "mongodb") .Values.source.mongodb.connectionString }} + - name: MONGODB_CONNECTION_STRING + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.mongodb-connection-string' + {{- end }} + + {{- /* Offset storage credentials */}} + {{- if eq .Values.source.offset.storage "redis" }} + - name: OFFSET_REDIS_ADDRESS + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.offset-redis-address' + {{- if .Values.source.offset.redis.password }} + - name: OFFSET_REDIS_PASSWORD + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.offset-redis-password' + {{- end }} + {{- end }} + {{- if eq .Values.source.offset.storage "jdbc" }} + - name: OFFSET_JDBC_URL + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.offset-jdbc-url' + - name: OFFSET_JDBC_USER + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.offset-jdbc-user' + - name: OFFSET_JDBC_PASSWORD + value: 'cpln://secret/{{ include "debezium.credentials.name" . 
}}.offset-jdbc-password' + {{- end }} + + {{- /* Schema history storage credentials */}} + {{- if eq (include "debezium.requiresSchemaHistory" .) "true" }} + {{- if eq .Values.source.schemaHistory.storage "redis" }} + - name: SCHEMA_HISTORY_REDIS_ADDRESS + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.schema-history-redis-address' + {{- if .Values.source.schemaHistory.redis.password }} + - name: SCHEMA_HISTORY_REDIS_PASSWORD + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.schema-history-redis-password' + {{- end }} + {{- end }} + {{- if eq .Values.source.schemaHistory.storage "jdbc" }} + - name: SCHEMA_HISTORY_JDBC_URL + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.schema-history-jdbc-url' + - name: SCHEMA_HISTORY_JDBC_USER + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.schema-history-jdbc-user' + - name: SCHEMA_HISTORY_JDBC_PASSWORD + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.schema-history-jdbc-password' + {{- end }} + {{- end }} + + {{- /* Sink credentials */}} + {{- if eq .Values.sink.type "kafka" }} + - name: KAFKA_BOOTSTRAP_SERVERS + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.kafka-bootstrap-servers' + {{- if .Values.sink.kafka.saslUsername }} + - name: KAFKA_SASL_USERNAME + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.kafka-sasl-username' + - name: KAFKA_SASL_PASSWORD + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.kafka-sasl-password' + {{- end }} + {{- end }} + + {{- if eq .Values.sink.type "redis" }} + - name: SINK_REDIS_ADDRESS + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.sink-redis-address' + {{- if .Values.sink.redis.password }} + - name: SINK_REDIS_PASSWORD + value: 'cpln://secret/{{ include "debezium.credentials.name" . 
}}.sink-redis-password' + {{- end }} + {{- end }} + + {{- if eq .Values.sink.type "nats-jetstream" }} + - name: NATS_URL + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.nats-url' + {{- if .Values.sink.nats.username }} + - name: NATS_USERNAME + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.nats-username' + - name: NATS_PASSWORD + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.nats-password' + {{- end }} + {{- end }} + + {{- if eq .Values.sink.type "http" }} + - name: HTTP_SINK_URL + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.http-sink-url' + {{- if eq .Values.sink.http.authType "basic" }} + - name: HTTP_SINK_USERNAME + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.http-sink-username' + - name: HTTP_SINK_PASSWORD + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.http-sink-password' + {{- end }} + {{- if eq .Values.sink.http.authType "bearer" }} + - name: HTTP_SINK_BEARER_TOKEN + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.http-sink-bearer-token' + {{- end }} + {{- end }} + + {{- if eq .Values.sink.type "pulsar" }} + - name: PULSAR_SERVICE_URL + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.pulsar-service-url' + {{- if .Values.sink.pulsar.authToken }} + - name: PULSAR_AUTH_TOKEN + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.pulsar-auth-token' + {{- end }} + {{- end }} + + {{- if eq .Values.sink.type "eventhubs" }} + - name: EVENTHUBS_CONNECTION_STRING + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.eventhubs-connection-string' + {{- end }} + + {{- /* Schema registry credentials */}} + {{- if .Values.format.schemaRegistry.url }} + - name: SCHEMA_REGISTRY_URL + value: 'cpln://secret/{{ include "debezium.credentials.name" . 
}}.schema-registry-url' + {{- if .Values.format.schemaRegistry.username }} + - name: SCHEMA_REGISTRY_USERNAME + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.schema-registry-username' + - name: SCHEMA_REGISTRY_PASSWORD + value: 'cpln://secret/{{ include "debezium.credentials.name" . }}.schema-registry-password' + {{- end }} + {{- end }} + + volumes: + # Mount application.properties from opaque secret + - path: /debezium/config/application.properties + recoveryPolicy: retain + uri: cpln://secret/{{ include "debezium.config.name" . }}.payload + {{- if and (eq .Values.source.type "postgres") (gt (int .Values.source.postgres.heartbeatIntervalMs) 0) }} + # Mount entrypoint script for PostgreSQL prerequisites + - path: /scripts/debezium-entrypoint.sh + recoveryPolicy: retain + uri: cpln://secret/{{ include "debezium.entrypoint.name" . }}.payload + {{- end }} + {{- if eq (include "debezium.requiresVolumeset" .) "true" }} + # Mount data volume for offset and schema history storage + - path: /debezium/data + uri: cpln://volumeset/{{ include "debezium.volumeset.name" . }} + {{- end }} + + readinessProbe: + httpGet: + path: /q/health/ready + port: 8080 + scheme: HTTP + failureThreshold: 3 + initialDelaySeconds: 10 + periodSeconds: 10 + successThreshold: 1 + timeoutSeconds: 5 + + livenessProbe: + httpGet: + path: /q/health/live + port: 8080 + scheme: HTTP + failureThreshold: 3 + initialDelaySeconds: 30 + periodSeconds: 30 + successThreshold: 1 + timeoutSeconds: 5 + + defaultOptions: + # Debezium should run as a single instance for CDC consistency + autoscaling: + metric: disabled + minScale: 1 + maxScale: 1 + capacityAI: false + debug: false + suspend: false + timeoutSeconds: 60 + + {{- if eq (include "debezium.requiresVolumeset" .) 
"true" }} + securityOptions: + filesystemGroupId: 185 + {{- end }} + + firewallConfig: + external: + outboundAllowCIDR: + {{- toYaml .Values.firewall.external.outboundAllowCIDR | nindent 8 }} + internal: + inboundAllowType: {{ .Values.firewall.internal.inboundAllowType }} + {{- if .Values.firewall.internal.workloads }} + inboundAllowWorkload: + {{- toYaml .Values.firewall.internal.workloads | nindent 8 }} + {{- end }} diff --git a/debezium-server/versions/1.1.0/values.yaml b/debezium-server/versions/1.1.0/values.yaml new file mode 100644 index 00000000..9719f2d1 --- /dev/null +++ b/debezium-server/versions/1.1.0/values.yaml @@ -0,0 +1,219 @@ +# Debezium Server CDC Template +# Documentation: https://debezium.io/documentation/reference/stable/operations/debezium-server.html + +image: quay.io/debezium/server:3.0 + +resources: + cpu: 500m + memory: 512Mi + +# ============================================================================= +# Source Database Configuration +# ============================================================================= +source: + # Database type: postgres, mysql, mongodb, sqlserver, oracle + type: postgres + + # Database connection settings + database: + hostname: "" # Required: database hostname or IP + port: 5432 # Default port varies by type (postgres:5432, mysql:3306, mongodb:27017, sqlserver:1433, oracle:1521) + name: "" # Required: database name (or SID for Oracle) + user: "" # Required: database username + password: "" # Required: database password (stored in credentials secret) + + # Server name used as prefix for topic names + serverName: "dbserver1" + + # Tables to capture (comma-separated, e.g., "public.users,public.orders") + # Leave empty to capture all tables + tableIncludeList: "" + + # Tables to exclude (comma-separated) + tableExcludeList: "" + + # PostgreSQL-specific settings + postgres: + slotName: "debezium" # Replication slot name + publicationName: "dbz_publication" # Publication name + pluginName: "pgoutput" # Logical 
decoding plugin: pgoutput (default), decoderbufs + slotDropOnStop: false # Keep replication slot on stop (required for HA/failover) + heartbeatIntervalMs: 0 # Heartbeat interval in ms (0=disabled; set 5000 for HA) + heartbeatActionQuery: "" # SQL executed on heartbeat (e.g., "UPDATE debezium_heartbeat SET ts = now() WHERE id = 1") + + # MySQL-specific settings + mysql: + serverId: 85744 # Unique server ID for MySQL replication + includeSchemaChanges: true # Include DDL events + + # MongoDB-specific settings + mongodb: + connectionString: "" # Full connection string (overrides hostname/port) + replicaSet: "" # Replica set name + + # SQL Server-specific settings + sqlserver: + databaseNames: "" # Comma-separated database names to capture + snapshotMode: "initial" # Snapshot mode: initial, schema_only, initial_only + + # Oracle-specific settings + oracle: + pdbName: "" # Pluggable database name + logMiningStrategy: "online_catalog" # Log mining strategy: online_catalog, redo_log_catalog + + # Offset storage configuration + offset: + # Storage type: file, redis, jdbc + storage: file + + # Flush settings + flushIntervalMs: 10000 # How often offsets flush to storage (ms) + flushTimeoutMs: 60000 # Timeout for offset flush operations (ms) + + # File storage settings (requires volumeset) + file: + filename: "/debezium/data/offsets.dat" + + # Redis storage settings + redis: + address: "" # Redis address (e.g., redis.mygvc.cpln.local:6379) + key: "debezium:offsets" # Redis key for offsets + password: "" # Redis password (stored in credentials secret) + ssl: false # Enable SSL/TLS + + # JDBC storage settings + jdbc: + url: "" # JDBC URL (e.g., jdbc:postgresql://host:5432/dbname) + user: "" # JDBC username + password: "" # JDBC password (stored in credentials secret) + tableName: "debezium_offsets" # Table name for storing offsets + + # Schema history storage (required for MySQL and SQL Server) + schemaHistory: + # Storage type: file, redis, jdbc (only used for mysql/sqlserver) 
+ storage: file + + # File storage settings + file: + filename: "/debezium/data/schema-history.dat" + + # Redis storage settings + redis: + address: "" + key: "debezium:schema-history" + password: "" + ssl: false + + # JDBC storage settings + jdbc: + url: "" + user: "" + password: "" + tableName: "debezium_schema_history" + + # Error retry configuration + errors: + retryDelayInitialMs: 300 # Initial retry delay (ms) + retryDelayMaxMs: 10000 # Max retry delay (ms) + maxRetries: -1 # Max retries (-1 = infinite) + +# ============================================================================= +# Sink Configuration +# ============================================================================= +sink: + # Sink type: kafka, redis, nats-jetstream, http, kinesis, pubsub, pulsar, eventhubs + type: kafka + + # Kafka sink settings + kafka: + bootstrapServers: "" # Required: Kafka bootstrap servers (e.g., kafka.mygvc.cpln.local:9092) + topic: "" # Topic prefix (events sent to {topic}.{table}) + securityProtocol: "PLAINTEXT" # Security protocol: PLAINTEXT, SSL, SASL_PLAINTEXT, SASL_SSL + saslMechanism: "" # SASL mechanism: PLAIN, SCRAM-SHA-256, SCRAM-SHA-512 + saslUsername: "" # SASL username + saslPassword: "" # SASL password (stored in credentials secret) + + # Redis sink settings (Redis Streams) + redis: + address: "" # Required: Redis address (e.g., redis.mygvc.cpln.local:6379) + password: "" # Redis password (stored in credentials secret) + ssl: false # Enable SSL/TLS + streamName: "" # Stream name prefix (events sent to {streamName}.{table}) + + # NATS JetStream sink settings + nats: + url: "" # Required: NATS URL (e.g., nats://nats.mygvc.cpln.local:4222) + subject: "" # Subject prefix (events sent to {subject}.{table}) + username: "" # NATS username + password: "" # NATS password (stored in credentials secret) + + # HTTP sink settings (webhooks) + http: + url: "" # Required: HTTP endpoint URL + headers: {} # Additional headers (key-value pairs) + authType: "" # Auth 
type: none, basic, bearer + username: "" # Basic auth username + password: "" # Basic auth password (stored in credentials secret) + bearerToken: "" # Bearer token (stored in credentials secret) + + # AWS Kinesis sink settings (uses Universal Cloud Identity) + kinesis: + region: "" # Required: AWS region + streamName: "" # Required: Kinesis stream name + credentialsProvider: "default" # Use "default" for Universal Cloud Identity + # Cloud account for Universal Cloud Identity + cloudAccount: + enabled: false + name: "" # AWS cloud account name in Control Plane + + # GCP Pub/Sub sink settings (uses Universal Cloud Identity) + pubsub: + projectId: "" # Required: GCP project ID + topic: "" # Topic prefix (events sent to {topic}.{table}) + # Cloud account for Universal Cloud Identity + cloudAccount: + enabled: false + name: "" # GCP cloud account name in Control Plane + + # Apache Pulsar sink settings + pulsar: + serviceUrl: "" # Required: Pulsar service URL + topic: "" # Topic prefix + authPluginClassName: "" # Auth plugin class (e.g., org.apache.pulsar.client.impl.auth.AuthenticationToken) + authToken: "" # Auth token (stored in credentials secret) + + # Azure Event Hubs sink settings + eventhubs: + connectionString: "" # Required: Event Hubs connection string (stored in credentials secret) + hubName: "" # Required: Event Hub name + +# ============================================================================= +# Serialization Format +# ============================================================================= +format: + key: json # Key format: json, avro, protobuf + value: json # Value format: json, avro, protobuf + + # Schema registry settings (for avro/protobuf) + schemaRegistry: + url: "" # Schema registry URL + username: "" # Schema registry username + password: "" # Schema registry password (stored in credentials secret) + +# ============================================================================= +# Volumeset Configuration (for file-based 
offset/schema-history storage) +# ============================================================================= +volumeset: + capacity: 10 # Initial capacity in GiB (minimum 10) + performanceClass: general-purpose-ssd # Performance class: general-purpose-ssd, high-throughput-ssd + +# ============================================================================= +# Firewall Configuration +# ============================================================================= +firewall: + internal: + inboundAllowType: same-gvc # Options: none, same-gvc, same-org, workload-list + workloads: [] # Workload list for inbound access (when type is workload-list) + external: + outboundAllowCIDR: + - 0.0.0.0/0 # Allow all outbound by default (required for database connectivity) diff --git a/ess/versions/1.4.0/Chart.yaml b/ess/versions/1.4.0/Chart.yaml new file mode 100644 index 00000000..b177ec9c --- /dev/null +++ b/ess/versions/1.4.0/Chart.yaml @@ -0,0 +1,17 @@ +apiVersion: v2 +name: ess +description: External Secret Syncer for Control Plane +type: application +version: 1.4.0 +appVersion: "1.3.4" + +dependencies: + - name: cpln-common + version: 1.0.0 + repository: "oci://ghcr.io/controlplane-com/templates" + +annotations: + created: "2025-03-12" + lastModified: "2026-05-05" + category: "secrets" + createsGvc: false \ No newline at end of file diff --git a/ess/versions/1.4.0/README.md b/ess/versions/1.4.0/README.md new file mode 100644 index 00000000..1b963f0c --- /dev/null +++ b/ess/versions/1.4.0/README.md @@ -0,0 +1,207 @@ +## External Secret Syncer (ESS) + +### Overview + +Creates an application that continuously syncs secrets from external providers into Control Plane secrets on a configurable schedule. Supported providers: **HashiCorp Vault**, **AWS Secrets Manager**, **AWS Parameter Store**, **Doppler**, **GCP Secret Manager**, **1Password**, and **1Password Connect**. + +--- + +### How It Works + +ESS runs as a workload on Control Plane. 
Your provider configuration and secrets list are stored in a Control Plane secret and mounted into the workload as `sync.yaml`. On startup, ESS schedules a polling loop for each configured secret. At each interval, it fetches the latest value from the external provider and creates or updates the corresponding Control Plane secret via the API. + +ESS tags every secret it manages with `syncer.cpln.io/source` (set to the workload path). This prevents two ESS instances from accidentally overwriting each other's secrets. An hourly cleanup job also deletes any Control Plane secrets that ESS owns but that have been removed from your `sync.yaml` config. + +--- + +### Configuring `values.yaml` + +#### Top-level fields + +| Field | Description | +|---|---| +| `image` | The ESS container image. Do not change unless upgrading. | +| `resources.cpu` / `resources.memory` | Resource limits for the workload container. | +| `port` | Port for the ESS HTTP admin API (default: `3004`). Used for health checks and manual sync triggers. | +| `allowedIp` | List of CIDRs allowed to reach the ESS admin API externally. Replace the placeholder with your IP, or use `0.0.0.0/0` to allow all. | +| `essConfig` | The full sync configuration — providers and secrets (see below). | + +--- + +#### `essConfig.providers` + +Each provider entry requires a unique `name` and exactly one provider block. An optional `syncInterval` sets the default interval for all secrets using that provider. 
+ +**Vault** +```yaml +- name: my-vault + vault: + address: https://my-vault.com:8200 # required + token: # required + syncInterval: 1m # optional — overrides global default +``` + +**AWS Parameter Store** +```yaml +- name: my-aws-ssm + awsParameterStore: + region: us-east-1 + accessKeyId: # optional if using an IAM-linked identity + secretAccessKey: # optional if using an IAM-linked identity +``` + +**AWS Secrets Manager** +```yaml +- name: my-aws-secrets-manager + awsSecretsManager: + region: us-east-1 + accessKeyId: + secretAccessKey: +``` + +**Doppler** +```yaml +- name: my-doppler + doppler: + accessToken: # use a Doppler service token (dp.st....) +``` + +**GCP Secret Manager** +```yaml +- name: my-gcp + gcpSecretManager: + projectId: 123456789876 + credentials: # optional — omit to use Application Default Credentials + clientEmail: + privateKey: +``` + +**1Password** +```yaml +- name: my-1password + onePassword: + serviceAccountToken: + integrationName: my-ess # optional + integrationVersion: 1.0.0 # optional +``` + +**1Password Connect** +```yaml +- name: my-1password-connect + onePasswordConnect: + serverURL: https://my-connect-server.example.com # required + token: # required +``` + +--- + +#### `essConfig.secrets` + +Each secret entry syncs one value (or a set of values) from a provider into a Control Plane secret. + +| Field | Description | +|---|---| +| `name` | Name of the Control Plane secret to create or update. | +| `provider` | Must match a provider `name` defined above. | +| `syncInterval` | Optional. Overrides the provider-level and global default for this specific secret. 
| + +Each secret must use exactly one of the following sync types: + +--- + +##### `opaque` — Single value (stored as a Control Plane `opaque` secret) + +Shorthand (path only, no fallback): +```yaml +- name: my-secret + provider: my-vault + opaque: /v1/secret/data/myapp +``` + +With options: +```yaml +- name: my-secret + provider: my-vault + opaque: + path: /v1/secret/data/myapp # path to fetch + parse: data.password # optional — extract a key from a JSON/YAML response + default: fallback-value # optional — used if fetch fails + encoding: base64 # optional — base64-decode the fetched value +``` + +> **Note:** If you use the shorthand form (`opaque: /some/path`) with no `default`, a fetch failure causes the sync to fail with no fallback. + +--- + +##### `dictionary` — Multiple values (stored as a Control Plane `dictionary` secret) + +Each key in the dictionary is fetched independently: +```yaml +- name: my-secret + provider: my-vault + dictionary: + PORT: + path: /v1/secret/data/app + parse: data.port + default: 5432 + PASSWORD: + path: /v1/secret/data/app + parse: data.password + USERNAME: + path: /v1/secret/data/app + parse: data.username + default: "no username" +``` + +Each key supports `path`, `parse`, `default`, and `encoding` — the same options as `opaque`. A failure on one key does not block others. + +--- + +##### `dictionaryFromProject` — Entire Doppler project (Doppler only) + +Syncs all secrets from a Doppler project+config in one operation, stored as a Control Plane `dictionary` secret: +```yaml +- name: my-doppler-config + provider: my-doppler + dictionaryFromProject: + path: my-project/dev # format: "project/config" — exactly two segments +``` + +> **Note:** `dictionaryFromProject` is only valid with a Doppler provider. Using it with any other provider causes ESS to exit at startup. 
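The rules in this section (one sync type per secret, a `provider` that matches a defined provider name, and `dictionaryFromProject` limited to Doppler with a two-segment path) can be checked before deploying. The following is a minimal sketch of such a check, not ESS's own startup validation; `validate_secret` and `provider_kinds` are hypothetical names introduced only for illustration.

```python
# Hypothetical pre-deploy check; ESS's real startup validation may differ.
SYNC_TYPES = ("opaque", "dictionary", "dictionaryFromProject")


def validate_secret(secret: dict, provider_kinds: dict) -> list:
    """Return a list of problems found in one `essConfig.secrets` entry.

    `provider_kinds` maps each provider name to its provider block key,
    for example {"my-vault": "vault", "my-doppler": "doppler"}.
    """
    problems = []
    name = secret.get("name", "<unnamed>")
    provider = secret.get("provider")

    # The provider must match a name defined under essConfig.providers.
    if provider not in provider_kinds:
        problems.append(f"{name}: unknown provider {provider!r}")

    # Exactly one of the three sync types may be present.
    used = [t for t in SYNC_TYPES if t in secret]
    if len(used) != 1:
        problems.append(f"{name}: expected exactly one sync type, found {used}")

    if "dictionaryFromProject" in secret:
        # dictionaryFromProject is documented as Doppler-only.
        if provider_kinds.get(provider) != "doppler":
            problems.append(f"{name}: dictionaryFromProject requires a Doppler provider")
        # The path must be "project/config": exactly two non-empty segments.
        path = (secret["dictionaryFromProject"] or {}).get("path", "")
        segments = path.split("/")
        if len(segments) != 2 or not all(segments):
            problems.append(f"{name}: path must be 'project/config', got {path!r}")

    return problems
```

Running this over every entry in `essConfig.secrets` and failing the deployment when any list is non-empty would surface a misconfigured entry before ESS starts.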
+ +--- + +#### Doppler Path Formats + +| Sync type | Path format | Example | +|---|---|---| +| `opaque` or `dictionary` key | `project/config/SECRET_NAME` | `my-app/production/DATABASE_URL` | +| `dictionaryFromProject` | `project/config` | `my-app/production` | + +--- + +#### Sync Interval Format + +Intervals use the format `NhNmNs`, where each `N` is a number of hours, minutes, or seconds. All parts are optional, but at least one is required. + +Examples: `10s`, `5m`, `1h`, `1h30m`, `1h30m10s` + +Priority (highest wins): +1. Secret-level `syncInterval` +2. Provider-level `syncInterval` +3. Global default (`300s`) + +--- + +### Important Notes + +- **Conflict protection:** If a Control Plane secret already exists and is managed by a different ESS instance, the sync for that secret will fail. Two ESS instances cannot manage the same secret. +- **Secret type changes:** Changing a secret from `opaque` to `dictionary` (or vice versa) causes ESS to delete the existing secret and recreate it. There is a brief window where the secret does not exist. +- **Cleanup:** ESS runs an hourly job that deletes Control Plane secrets it owns but that no longer appear in `sync.yaml`. Removing a secret from the config will eventually result in its deletion from Control Plane. +- **Doppler `parse`:** The `parse` field only works when the Doppler secret's value is JSON or YAML. Using `parse` on a plain string secret throws an error. +- **`sync.yaml` hot reload:** ESS watches its config file and automatically restarts when changes are detected (every ~5 seconds). No workload restart is needed after updating the config secret.
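The interval format and priority order described under "Sync Interval Format" can be sketched as follows. This is an illustration under stated assumptions rather than ESS's actual parser: `parse_interval` and `effective_interval` are hypothetical helpers, and the sketch assumes each part is a non-negative integer written in hours, minutes, seconds order.

```python
import re

# Assumption: parts are integers and appear in h, m, s order (e.g. "1h30m10s").
_INTERVAL = re.compile(r"(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?")

# Global default taken from the README's priority list.
GLOBAL_DEFAULT = "300s"


def parse_interval(text: str) -> int:
    """Convert an interval such as '1h30m' into a number of seconds."""
    match = _INTERVAL.fullmatch(text)
    # All parts are optional, but at least one must be present.
    if match is None or not any(match.groups()):
        raise ValueError(f"invalid sync interval: {text!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds


def effective_interval(secret: dict, provider: dict) -> int:
    """Resolve the interval: secret-level, then provider-level, then global."""
    text = secret.get("syncInterval") or provider.get("syncInterval") or GLOBAL_DEFAULT
    return parse_interval(text)
```

For example, a secret with `syncInterval: 20s` under a provider with `syncInterval: 1m` resolves to 20 seconds, while a secret with no interval under that same provider resolves to 60 seconds.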
+ +### Resources + +- [ESS Documentation](https://docs.controlplane.com/template-catalog/templates/external-secret-syncer) +- [Image Source Code](https://github.com/controlplane-com/external-secret-syncer) \ No newline at end of file diff --git a/ess/versions/1.4.0/templates/_helpers.tpl b/ess/versions/1.4.0/templates/_helpers.tpl new file mode 100644 index 00000000..95668c35 --- /dev/null +++ b/ess/versions/1.4.0/templates/_helpers.tpl @@ -0,0 +1,39 @@ +{{/* Resource Naming */}} + +{{/* +ESS Workload Name +*/}} +{{- define "ess.name" -}} +{{- printf "%s-ess" .Release.Name }} +{{- end }} + +{{/* +ESS Identity Name +*/}} +{{- define "ess.identity.name" -}} +{{- printf "%s-ess-identity" .Release.Name }} +{{- end }} + +{{/* +ESS Policy Name +*/}} +{{- define "ess.policy.name" -}} +{{- printf "%s-ess-policy" .Release.Name }} +{{- end }} + +{{/* +ESS Secret Config Name +*/}} +{{- define "ess.secret.name" -}} +{{- printf "%s-ess-config" .Release.Name }} +{{- end }} + + +{{/* Labeling */}} + +{{/* +Common labels +*/}} +{{- define "ess.tags" -}} +{{- include "cpln-common.tags" . }} +{{- end }} \ No newline at end of file diff --git a/ess/versions/1.4.0/templates/identity.yaml b/ess/versions/1.4.0/templates/identity.yaml new file mode 100644 index 00000000..e7176ee7 --- /dev/null +++ b/ess/versions/1.4.0/templates/identity.yaml @@ -0,0 +1,5 @@ +kind: identity +gvc: {{ .Values.global.cpln.gvc }} +name: {{ include "ess.identity.name" . }} +description: ESS identity +tags: {{- include "ess.tags" . | nindent 4 }} diff --git a/ess/versions/1.4.0/templates/policy.yaml b/ess/versions/1.4.0/templates/policy.yaml new file mode 100644 index 00000000..cba2f1dd --- /dev/null +++ b/ess/versions/1.4.0/templates/policy.yaml @@ -0,0 +1,10 @@ +kind: policy +name: {{ include "ess.policy.name" . }} +description: ESS policy +bindings: + - permissions: + - manage + principalLinks: + - //gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "ess.identity.name" . 
}} +target: all +targetKind: secret diff --git a/ess/versions/1.4.0/templates/secret.yaml b/ess/versions/1.4.0/templates/secret.yaml new file mode 100644 index 00000000..764bc110 --- /dev/null +++ b/ess/versions/1.4.0/templates/secret.yaml @@ -0,0 +1,9 @@ +kind: secret +name: {{ include "ess.secret.name" . }} +description: ESS config +tags: {{- include "ess.tags" . | nindent 4 }} +type: opaque +data: + encoding: plain + payload: | +{{- toYaml .Values.essConfig | nindent 4 }} \ No newline at end of file diff --git a/ess/versions/1.4.0/templates/workload.yaml b/ess/versions/1.4.0/templates/workload.yaml new file mode 100644 index 00000000..a4106066 --- /dev/null +++ b/ess/versions/1.4.0/templates/workload.yaml @@ -0,0 +1,61 @@ +kind: workload +name: {{ include "ess.name" . }} +description: External Secret Syncer +tags: {{- include "ess.tags" . | nindent 4 }} +spec: + type: standard + containers: + - name: ess + cpu: {{ .Values.resources.cpu | quote }} + image: {{ .Values.image }} + inheritEnv: false + memory: {{ .Values.resources.memory | quote }} + ports: + - number: {{ .Values.port }} + protocol: http + readinessProbe: + failureThreshold: 3 + httpGet: + httpHeaders: [] + path: /about + port: {{ .Values.port }} + scheme: HTTP + initialDelaySeconds: 0 + periodSeconds: 10 + successThreshold: 1 + timeoutSeconds: 1 + volumes: + - path: /usr/src/app/sync.yaml + recoveryPolicy: retain + uri: cpln://secret/{{ include "ess.secret.name" . 
}} + defaultOptions: + autoscaling: + maxConcurrency: 0 + maxScale: 3 + metric: cpu + minScale: 1 + scaleToZeroDelay: 300 + target: 100 + capacityAI: false + debug: false + suspend: false + timeoutSeconds: 5 + firewallConfig: + external: + inboundAllowCIDR: + {{- toYaml .Values.allowedIp | nindent 8 }} + inboundBlockedCIDR: [] + outboundAllowCIDR: + - 0.0.0.0/0 + outboundAllowHostname: [] + outboundAllowPort: [] + outboundBlockedCIDR: [] + internal: + inboundAllowType: none + inboundAllowWorkload: [] + identityLink: //gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "ess.identity.name" . }} + loadBalancer: + direct: + enabled: false + ports: [] + supportDynamicTags: false diff --git a/ess/versions/1.4.0/values.yaml b/ess/versions/1.4.0/values.yaml new file mode 100644 index 00000000..df98db40 --- /dev/null +++ b/ess/versions/1.4.0/values.yaml @@ -0,0 +1,82 @@ +image: ghcr.io/controlplane-com/cpln-build/external-secret-syncer:v1.3.4 + +resources: + cpu: 200m + memory: 256Mi + +port: 3004 + +allowedIp: + - 1.2.3.4 # Replace with your IP + +essConfig: + providers: + - name: my-vault + vault: + address: https://my-vault.com:8200 + token: + syncInterval: 1m + - name: my-aws-ssm + awsParameterStore: + region: us-east-1 + accessKeyId: # alternatively configure identity to natively use AWS permissions + secretAccessKey: # alternatively configure identity to natively use AWS permissions + # - name: my-aws-secrets-manager + # awsSecretsManager: + # region: us-east-1 + # accessKeyId: + # secretAccessKey: + # - name: my-1password + # onePassword: + # serviceAccountToken: + # integrationName: my-ess + # integrationVersion: 1.0.0 + # - name: my-1password-connect + # onePasswordConnect: + # serverURL: https://my-connect-server.example.com + # token: + # - name: my-doppler + # doppler: + # accessToken: + # - name: my-gcp + # gcpSecretManager: + # projectId: 123456789876 + # credentials: + # clientEmail: + # privateKey: + secrets: + - name: auth + provider: my-vault + 
syncInterval: 20s + dictionary: + PORT: + path: /v1/secret/data/app + parse: data.port + default: 5432 + PASSWORD: + path: /v1/secret/data/app + parse: data.password + USERNAME: + default: "no username" + path: /v1/secret/data/app + parse: data.username + - name: ssm + provider: my-aws-ssm + syncInterval: 20s + opaque: /example/app + # - name: secrets-manager + # provider: my-aws-secrets-manager + # dictionary: + # PASSWORD: + # path: /example/app + # parse: password + # - name: doppler-secret + # provider: my-doppler + # opaque: /project/config/SECRET_NAME + # - name: doppler-project + # provider: my-doppler + # dictionaryFromProject: + # path: project/config # syncs all secrets from a Doppler project+config + # - name: gcp + # provider: my-gcp + # opaque: database-password diff --git a/ess/versions/1.5.0/Chart.yaml b/ess/versions/1.5.0/Chart.yaml new file mode 100644 index 00000000..77148d6d --- /dev/null +++ b/ess/versions/1.5.0/Chart.yaml @@ -0,0 +1,17 @@ +apiVersion: v2 +name: ess +description: External Secret Syncer for Control Plane +type: application +version: 1.5.0 +appVersion: "1.3.5" + +dependencies: + - name: cpln-common + version: 1.0.0 + repository: "oci://ghcr.io/controlplane-com/templates" + +annotations: + created: "2025-03-12" + lastModified: "2026-05-06" + category: "secrets" + createsGvc: false \ No newline at end of file diff --git a/ess/versions/1.5.0/README.md b/ess/versions/1.5.0/README.md new file mode 100644 index 00000000..1fcf5ace --- /dev/null +++ b/ess/versions/1.5.0/README.md @@ -0,0 +1,211 @@ +## External Secret Syncer (ESS) + +### Overview + +Creates an application that continuously syncs secrets from external providers into Control Plane secrets on a configurable schedule. Supported providers: **HashiCorp Vault**, **AWS Secrets Manager**, **AWS Parameter Store**, **Doppler**, **GCP Secret Manager**, **1Password**, and **1Password Connect**. + +--- + +### How It Works + +ESS runs as a workload on Control Plane. 
Your provider configuration and secrets list are stored in a Control Plane secret and mounted into the workload as `sync.yaml`. On startup, ESS schedules a polling loop for each configured secret. At each interval, it fetches the latest value from the external provider and creates or updates the corresponding Control Plane secret via the API. + +ESS tags every secret it manages with `syncer.cpln.io/source` (set to the workload path). This prevents two ESS instances from accidentally overwriting each other's secrets. An hourly cleanup job also deletes any Control Plane secrets that ESS owns but that have been removed from your `sync.yaml` config. + +--- + +### Patch Notes + +This version of ESS fixes a bug preventing the cleanup from running + +### Configuring `values.yaml` + +#### Top-level fields + +| Field | Description | +|---|---| +| `image` | The ESS container image. Do not change unless upgrading. | +| `resources.cpu` / `resources.memory` | Resource limits for the workload container. | +| `port` | Port for the ESS HTTP admin API (default: `3004`). Used for health checks and manual sync triggers. | +| `allowedIp` | List of CIDRs allowed to reach the ESS admin API externally. Replace the placeholder with your IP, or use `0.0.0.0/0` to allow all. | +| `essConfig` | The full sync configuration — providers and secrets (see below). | + +--- + +#### `essConfig.providers` + +Each provider entry requires a unique `name` and exactly one provider block. An optional `syncInterval` sets the default interval for all secrets using that provider. 
+ +**Vault** +```yaml +- name: my-vault + vault: + address: https://my-vault.com:8200 # required + token: # required + syncInterval: 1m # optional — overrides global default +``` + +**AWS Parameter Store** +```yaml +- name: my-aws-ssm + awsParameterStore: + region: us-east-1 + accessKeyId: # optional if using an IAM-linked identity + secretAccessKey: # optional if using an IAM-linked identity +``` + +**AWS Secrets Manager** +```yaml +- name: my-aws-secrets-manager + awsSecretsManager: + region: us-east-1 + accessKeyId: + secretAccessKey: +``` + +**Doppler** +```yaml +- name: my-doppler + doppler: + accessToken: # use a Doppler service token (dp.st....) +``` + +**GCP Secret Manager** +```yaml +- name: my-gcp + gcpSecretManager: + projectId: 123456789876 + credentials: # optional — omit to use Application Default Credentials + clientEmail: + privateKey: +``` + +**1Password** +```yaml +- name: my-1password + onePassword: + serviceAccountToken: + integrationName: my-ess # optional + integrationVersion: 1.0.0 # optional +``` + +**1Password Connect** +```yaml +- name: my-1password-connect + onePasswordConnect: + serverURL: https://my-connect-server.example.com # required + token: # required +``` + +--- + +#### `essConfig.secrets` + +Each secret entry syncs one value (or a set of values) from a provider into a Control Plane secret. + +| Field | Description | +|---|---| +| `name` | Name of the Control Plane secret to create or update. | +| `provider` | Must match a provider `name` defined above. | +| `syncInterval` | Optional. Overrides the provider-level and global default for this specific secret. 
| + +Each secret must use exactly one of the following sync types: + +--- + +##### `opaque` — Single value (stored as a Control Plane `opaque` secret) + +Shorthand (path only, no fallback): +```yaml +- name: my-secret + provider: my-vault + opaque: /v1/secret/data/myapp +``` + +With options: +```yaml +- name: my-secret + provider: my-vault + opaque: + path: /v1/secret/data/myapp # path to fetch + parse: data.password # optional — extract a key from a JSON/YAML response + default: fallback-value # optional — used if fetch fails + encoding: base64 # optional — base64-decode the fetched value +``` + +> **Note:** If you use the shorthand form (`opaque: /some/path`) with no `default`, a fetch failure causes the sync to fail with no fallback. + +--- + +##### `dictionary` — Multiple values (stored as a Control Plane `dictionary` secret) + +Each key in the dictionary is fetched independently: +```yaml +- name: my-secret + provider: my-vault + dictionary: + PORT: + path: /v1/secret/data/app + parse: data.port + default: 5432 + PASSWORD: + path: /v1/secret/data/app + parse: data.password + USERNAME: + path: /v1/secret/data/app + parse: data.username + default: "no username" +``` + +Each key supports `path`, `parse`, `default`, and `encoding` — the same options as `opaque`. A failure on one key does not block others. + +--- + +##### `dictionaryFromProject` — Entire Doppler project (Doppler only) + +Syncs all secrets from a Doppler project+config in one operation, stored as a Control Plane `dictionary` secret: +```yaml +- name: my-doppler-config + provider: my-doppler + dictionaryFromProject: + path: my-project/dev # format: "project/config" — exactly two segments +``` + +> **Note:** `dictionaryFromProject` is only valid with a Doppler provider. Using it with any other provider causes ESS to exit at startup. 
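
The `path`, `parse`, and `default` options described above for `opaque` and `dictionary` entries can be illustrated with a small Python sketch. This is not the ESS implementation: `extract`, `resolve_key`, `resolve_dictionary`, and the `fetch` callable are hypothetical names introduced here, and treating a failed `parse` the same way as a failed fetch is an assumption.

```python
from typing import Any, Callable, Dict, Tuple


def extract(document: Any, dotted_path: str) -> Any:
    """Walk a nested mapping using a dotted path such as 'data.password'."""
    current = document
    for key in dotted_path.split("."):
        current = current[key]  # raises KeyError/TypeError if the path is absent
    return current


def resolve_key(fetch: Callable[[str], Any], spec: Dict[str, Any]) -> Any:
    """Fetch one value, apply `parse` if present, fall back to `default` on failure."""
    try:
        value = fetch(spec["path"])
        if "parse" in spec:
            value = extract(value, spec["parse"])
        return value
    except Exception:
        # Assumption: any fetch or parse error triggers the fallback.
        if "default" in spec:
            return spec["default"]
        raise


def resolve_dictionary(
    fetch: Callable[[str], Any], keys: Dict[str, Dict[str, Any]]
) -> Tuple[Dict[str, Any], Dict[str, Exception]]:
    """Resolve every key independently so one failure does not block the others."""
    resolved: Dict[str, Any] = {}
    errors: Dict[str, Exception] = {}
    for name, spec in keys.items():
        try:
            resolved[name] = resolve_key(fetch, spec)
        except Exception as exc:
            errors[name] = exc
    return resolved, errors
```

With a stubbed `fetch`, a key whose path cannot be read and has no `default` ends up in `errors`, while the remaining keys still resolve.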
+ +--- + +#### Doppler Path Formats + +| Sync type | Path format | Example | +|---|---|---| +| `opaque` or `dictionary` key | `project/config/SECRET_NAME` | `my-app/production/DATABASE_URL` | +| `dictionaryFromProject` | `project/config` | `my-app/production` | + +--- + +#### Sync Interval Format + +Intervals use the format `<hours>h<minutes>m<seconds>s`. All parts are optional, but at least one is required. + +Examples: `10s`, `5m`, `1h`, `1h30m`, `1h30m10s` + +Priority (highest wins): +1. Secret-level `syncInterval` +2. Provider-level `syncInterval` +3. Global default (`300s`) + +--- + +### Important Notes + +- **Conflict protection:** If a Control Plane secret already exists and is managed by a different ESS instance, the sync for that secret will fail. Two ESS instances cannot manage the same secret. +- **Secret type changes:** Changing a secret from `opaque` to `dictionary` (or vice versa) causes ESS to delete the existing secret and recreate it. There is a brief window where the secret does not exist. +- **Cleanup:** ESS runs an hourly job that deletes Control Plane secrets it owns but that no longer appear in `sync.yaml`. Removing a secret from the config will eventually result in its deletion from Control Plane. +- **Doppler `parse`:** The `parse` field only works when the Doppler secret's value is JSON or YAML. Using `parse` on a plain string secret throws an error. +- **`sync.yaml` hot reload:** ESS watches its config file and automatically restarts when changes are detected (every ~5 seconds). No workload restart is needed after updating the config secret. 
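
The interval format and priority order from the Sync Interval Format section above can be sketched in Python as follows. This is an illustration under stated assumptions, not the parser ESS ships: `parse_interval` and `effective_interval` are hypothetical names, and only the examples and the `300s` global default come from the README.

```python
import re
from typing import Optional

# Hypothetical pattern: optional hours, minutes and seconds, in that order.
_INTERVAL_RE = re.compile(r"(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?")

GLOBAL_DEFAULT = "300s"  # global default quoted in the README


def parse_interval(text: str) -> int:
    """Convert an interval such as '1h30m10s' into a number of seconds."""
    match = _INTERVAL_RE.fullmatch(text)
    # The pattern also matches the empty string, so require at least one part.
    if not text or match is None:
        raise ValueError(f"invalid sync interval: {text!r}")
    hours, minutes, seconds = (int(part or 0) for part in match.groups())
    return hours * 3600 + minutes * 60 + seconds


def effective_interval(secret: Optional[str], provider: Optional[str]) -> int:
    """Secret-level beats provider-level, which beats the global default."""
    return parse_interval(secret or provider or GLOBAL_DEFAULT)
```

For the sample configuration, a secret with `syncInterval: 20s` on a provider with `syncInterval: 1m` would resolve to 20 seconds under this sketch, and a secret with no interval on that provider would resolve to 60 seconds.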
+ +### Resources + +- [ESS Documentation](https://docs.controlplane.com/template-catalog/templates/external-secret-syncer) +- [Image Source Code](https://github.com/controlplane-com/external-secret-syncer) \ No newline at end of file diff --git a/ess/versions/1.5.0/templates/_helpers.tpl b/ess/versions/1.5.0/templates/_helpers.tpl new file mode 100644 index 00000000..95668c35 --- /dev/null +++ b/ess/versions/1.5.0/templates/_helpers.tpl @@ -0,0 +1,39 @@ +{{/* Resource Naming */}} + +{{/* +ESS Workload Name +*/}} +{{- define "ess.name" -}} +{{- printf "%s-ess" .Release.Name }} +{{- end }} + +{{/* +ESS Identity Name +*/}} +{{- define "ess.identity.name" -}} +{{- printf "%s-ess-identity" .Release.Name }} +{{- end }} + +{{/* +ESS Policy Name +*/}} +{{- define "ess.policy.name" -}} +{{- printf "%s-ess-policy" .Release.Name }} +{{- end }} + +{{/* +ESS Secret Config Name +*/}} +{{- define "ess.secret.name" -}} +{{- printf "%s-ess-config" .Release.Name }} +{{- end }} + + +{{/* Labeling */}} + +{{/* +Common labels +*/}} +{{- define "ess.tags" -}} +{{- include "cpln-common.tags" . }} +{{- end }} \ No newline at end of file diff --git a/ess/versions/1.5.0/templates/identity.yaml b/ess/versions/1.5.0/templates/identity.yaml new file mode 100644 index 00000000..e7176ee7 --- /dev/null +++ b/ess/versions/1.5.0/templates/identity.yaml @@ -0,0 +1,5 @@ +kind: identity +gvc: {{ .Values.global.cpln.gvc }} +name: {{ include "ess.identity.name" . }} +description: ESS identity +tags: {{- include "ess.tags" . | nindent 4 }} diff --git a/ess/versions/1.5.0/templates/policy.yaml b/ess/versions/1.5.0/templates/policy.yaml new file mode 100644 index 00000000..cba2f1dd --- /dev/null +++ b/ess/versions/1.5.0/templates/policy.yaml @@ -0,0 +1,10 @@ +kind: policy +name: {{ include "ess.policy.name" . }} +description: ESS policy +bindings: + - permissions: + - manage + principalLinks: + - //gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "ess.identity.name" . 
}} +target: all +targetKind: secret diff --git a/ess/versions/1.5.0/templates/secret.yaml b/ess/versions/1.5.0/templates/secret.yaml new file mode 100644 index 00000000..764bc110 --- /dev/null +++ b/ess/versions/1.5.0/templates/secret.yaml @@ -0,0 +1,9 @@ +kind: secret +name: {{ include "ess.secret.name" . }} +description: ESS config +tags: {{- include "ess.tags" . | nindent 4 }} +type: opaque +data: + encoding: plain + payload: | +{{- toYaml .Values.essConfig | nindent 4 }} \ No newline at end of file diff --git a/ess/versions/1.5.0/templates/workload.yaml b/ess/versions/1.5.0/templates/workload.yaml new file mode 100644 index 00000000..a4106066 --- /dev/null +++ b/ess/versions/1.5.0/templates/workload.yaml @@ -0,0 +1,61 @@ +kind: workload +name: {{ include "ess.name" . }} +description: External Secret Syncer +tags: {{- include "ess.tags" . | nindent 4 }} +spec: + type: standard + containers: + - name: ess + cpu: {{ .Values.resources.cpu | quote }} + image: {{ .Values.image }} + inheritEnv: false + memory: {{ .Values.resources.memory | quote }} + ports: + - number: {{ .Values.port }} + protocol: http + readinessProbe: + failureThreshold: 3 + httpGet: + httpHeaders: [] + path: /about + port: {{ .Values.port }} + scheme: HTTP + initialDelaySeconds: 0 + periodSeconds: 10 + successThreshold: 1 + timeoutSeconds: 1 + volumes: + - path: /usr/src/app/sync.yaml + recoveryPolicy: retain + uri: cpln://secret/{{ include "ess.secret.name" . 
}} + defaultOptions: + autoscaling: + maxConcurrency: 0 + maxScale: 3 + metric: cpu + minScale: 1 + scaleToZeroDelay: 300 + target: 100 + capacityAI: false + debug: false + suspend: false + timeoutSeconds: 5 + firewallConfig: + external: + inboundAllowCIDR: + {{- toYaml .Values.allowedIp | nindent 8 }} + inboundBlockedCIDR: [] + outboundAllowCIDR: + - 0.0.0.0/0 + outboundAllowHostname: [] + outboundAllowPort: [] + outboundBlockedCIDR: [] + internal: + inboundAllowType: none + inboundAllowWorkload: [] + identityLink: //gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "ess.identity.name" . }} + loadBalancer: + direct: + enabled: false + ports: [] + supportDynamicTags: false diff --git a/ess/versions/1.5.0/values.yaml b/ess/versions/1.5.0/values.yaml new file mode 100644 index 00000000..e012170e --- /dev/null +++ b/ess/versions/1.5.0/values.yaml @@ -0,0 +1,82 @@ +image: ghcr.io/controlplane-com/cpln-build/external-secret-syncer:v1.3.5 + +resources: + cpu: 200m + memory: 256Mi + +port: 3004 + +allowedIp: + - 1.2.3.4 # Replace with your IP + +essConfig: + providers: + - name: my-vault + vault: + address: https://my-vault.com:8200 + token: + syncInterval: 1m + - name: my-aws-ssm + awsParameterStore: + region: us-east-1 + accessKeyId: # alternatively configure identity to natively use AWS permissions + secretAccessKey: # alternatively configure identity to natively use AWS permissions + # - name: my-aws-secrets-manager + # awsSecretsManager: + # region: us-east-1 + # accessKeyId: + # secretAccessKey: + # - name: my-1password + # onePassword: + # serviceAccountToken: + # integrationName: my-ess + # integrationVersion: 1.0.0 + # - name: my-1password-connect + # onePasswordConnect: + # serverURL: https://my-connect-server.example.com + # token: + # - name: my-doppler + # doppler: + # accessToken: + # - name: my-gcp + # gcpSecretManager: + # projectId: 123456789876 + # credentials: + # clientEmail: + # privateKey: + secrets: + - name: auth + provider: my-vault + 
syncInterval: 20s + dictionary: + PORT: + path: /v1/secret/data/app + parse: data.port + default: 5432 + PASSWORD: + path: /v1/secret/data/app + parse: data.password + USERNAME: + default: "no username" + path: /v1/secret/data/app + parse: data.username + - name: ssm + provider: my-aws-ssm + syncInterval: 20s + opaque: /example/app + # - name: secrets-manager + # provider: my-aws-secrets-manager + # dictionary: + # PASSWORD: + # path: /example/app + # parse: password + # - name: doppler-secret + # provider: my-doppler + # opaque: /project/config/SECRET_NAME + # - name: doppler-project + # provider: my-doppler + # dictionaryFromProject: + # path: project/config # syncs all secrets from a Doppler project+config + # - name: gcp + # provider: my-gcp + # opaque: database-password diff --git a/ess/versions/1.6.0/Chart.yaml b/ess/versions/1.6.0/Chart.yaml new file mode 100644 index 00000000..f01e943f --- /dev/null +++ b/ess/versions/1.6.0/Chart.yaml @@ -0,0 +1,17 @@ +apiVersion: v2 +name: ess +description: External Secret Syncer for Control Plane +type: application +version: 1.6.0 +appVersion: v1.4.0 + +dependencies: + - name: cpln-common + version: 1.0.0 + repository: "oci://ghcr.io/controlplane-com/templates" + +annotations: + created: "2025-03-12" + lastModified: "2026-05-11" + category: "secrets" + createsGvc: false \ No newline at end of file diff --git a/ess/versions/1.6.0/README.md b/ess/versions/1.6.0/README.md new file mode 100644 index 00000000..6d52c20e --- /dev/null +++ b/ess/versions/1.6.0/README.md @@ -0,0 +1,222 @@ +## External Secret Syncer (ESS) + +### Overview + +Creates an application that continuously syncs secrets from external providers into Control Plane secrets on a configurable schedule. Supported providers: **HashiCorp Vault**, **AWS Secrets Manager**, **AWS Parameter Store**, **Doppler**, **GCP Secret Manager**, **1Password**, and **1Password Connect**. + +--- + +### How It Works + +ESS runs as a workload on Control Plane. 
Your provider configuration and secrets list are stored in a Control Plane secret and mounted into the workload as `sync.yaml`. On startup, ESS schedules a polling loop for each configured secret. At each interval, it fetches the latest value from the external provider and creates or updates the corresponding Control Plane secret via the API. + +ESS tags every secret it manages with `syncer.cpln.io/source` (set to the workload path). This prevents two ESS instances from accidentally overwriting each other's secrets. An hourly cleanup job also deletes any Control Plane secrets that ESS owns but that have been removed from your `sync.yaml` config. + +--- + +### Patch Notes + +This version of ESS fixes a bug preventing the cleanup from running + +### Configuring `values.yaml` + +#### Top-level fields + +| Field | Description | +|---|---| +| `image` | The ESS container image. Do not change unless upgrading. | +| `resources.cpu` / `resources.memory` | Resource limits for the workload container. | +| `port` | Port for the ESS HTTP admin API (default: `3004`). Used for health checks and manual sync triggers. | +| `allowedIp` | List of CIDRs allowed to reach the ESS admin API externally. Replace the placeholder with your IP, or use `0.0.0.0/0` to allow all. | +| `essConfig` | The full sync configuration — providers and secrets (see below). | + +--- + +#### `essConfig.providers` + +Each provider entry requires a unique `name` and exactly one provider block. An optional `syncInterval` sets the default interval for all secrets using that provider. 
+ +**Vault** +```yaml +- name: my-vault + vault: + address: https://my-vault.com:8200 # required + token: # required + syncInterval: 1m # optional — overrides global default +``` + +**AWS Parameter Store** +```yaml +- name: my-aws-ssm + awsParameterStore: + region: us-east-1 + accessKeyId: # optional if using an IAM-linked identity + secretAccessKey: # optional if using an IAM-linked identity +``` + +**AWS Secrets Manager** +```yaml +- name: my-aws-secrets-manager + awsSecretsManager: + region: us-east-1 + accessKeyId: + secretAccessKey: +``` + +**Doppler** +```yaml +- name: my-doppler + doppler: + accessToken: # use a Doppler service token (dp.st....) +``` + +**GCP Secret Manager** +```yaml +- name: my-gcp + gcpSecretManager: + projectId: 123456789876 + credentials: # optional — omit to use Application Default Credentials + clientEmail: + privateKey: +``` + +**1Password** +```yaml +- name: my-1password + onePassword: + serviceAccountToken: + integrationName: my-ess # optional + integrationVersion: 1.0.0 # optional +``` + +**1Password Connect** +```yaml +- name: my-1password-connect + onePasswordConnect: + serverURL: https://my-connect-server.example.com # required + token: # required +``` + +--- + +#### `essConfig.secrets` + +Each secret entry syncs one value (or a set of values) from a provider into a Control Plane secret. + +| Field | Description | +|---|---| +| `name` | Name of the Control Plane secret to create or update. | +| `provider` | Must match a provider `name` defined above. | +| `syncInterval` | Optional. Overrides the provider-level and global default for this specific secret. 
| + +Each secret must use exactly one of the following sync types: + +--- + +##### `opaque` — Single value (stored as a Control Plane `opaque` secret) + +Shorthand (path only, no fallback): +```yaml +- name: my-secret + provider: my-vault + opaque: /v1/secret/data/myapp +``` + +With options: +```yaml +- name: my-secret + provider: my-vault + opaque: + path: /v1/secret/data/myapp # path to fetch + parse: data.password # optional — extract a key from a JSON/YAML response + default: fallback-value # optional — used if fetch fails + encoding: base64 # optional — base64-decode the fetched value +``` + +> **Note:** If you use the shorthand form (`opaque: /some/path`) with no `default`, a fetch failure causes the sync to fail with no fallback. + +--- + +##### `dictionary` — Multiple values (stored as a Control Plane `dictionary` secret) + +Each key in the dictionary is fetched independently: +```yaml +- name: my-secret + provider: my-vault + dictionary: + PORT: + path: /v1/secret/data/app + parse: data.port + default: 5432 + PASSWORD: + path: /v1/secret/data/app + parse: data.password + USERNAME: + path: /v1/secret/data/app + parse: data.username + default: "no username" +``` + +Each key supports `path`, `parse`, `default`, and `encoding` — the same options as `opaque`. A failure on one key does not block others. + +--- + +##### `dictionaryFromProject` — Sync an entire project (Doppler or GCP Secret Manager) + +Syncs all secrets from a provider project in one operation, stored as a Control Plane `dictionary` secret. The expected shape depends on the provider. 
+ +**Doppler** — specify a `project/config` path: +```yaml +- name: my-doppler-config + provider: my-doppler + dictionaryFromProject: + path: my-project/dev # format: "project/config" — exactly two segments +``` + +**GCP Secret Manager** — set to `true` to pull every accessible secret from the project configured on the provider: +```yaml +- name: my-gcp-config + provider: my-gcp + dictionaryFromProject: true +``` + +Each fetched secret's latest version becomes one key in the resulting dictionary. Secrets with no accessible latest version (no versions, disabled, or destroyed) are skipped. + +> **Note:** `dictionaryFromProject` is only valid with the Doppler or GCP Secret Manager providers. Doppler requires the `{ path: ... }` object form; GCP requires the `true` form. Mixing them (or using either with another provider) causes ESS to exit at startup. + +--- + +#### Doppler Path Formats + +| Sync type | Path format | Example | +|---|---|---| +| `opaque` or `dictionary` key | `project/config/SECRET_NAME` | `my-app/production/DATABASE_URL` | +| `dictionaryFromProject` | `project/config` | `my-app/production` | + +--- + +#### Sync Interval Format + +Intervals use the format `<hours>h<minutes>m<seconds>s`. All parts are optional, but at least one is required. + +Examples: `10s`, `5m`, `1h`, `1h30m`, `1h30m10s` + +Priority (highest wins): +1. Secret-level `syncInterval` +2. Provider-level `syncInterval` +3. Global default (`300s`) + +--- + +### Important Notes + +- **Conflict protection:** If a Control Plane secret already exists and is managed by a different ESS instance, the sync for that secret will fail. Two ESS instances cannot manage the same secret. +- **Secret type changes:** Changing a secret from `opaque` to `dictionary` (or vice versa) causes ESS to delete the existing secret and recreate it. There is a brief window where the secret does not exist. +- **Cleanup:** ESS runs an hourly job that deletes Control Plane secrets it owns but that no longer appear in `sync.yaml`. 
Removing a secret from the config will eventually result in its deletion from Control Plane. +- **Doppler `parse`:** The `parse` field only works when the Doppler secret's value is JSON or YAML. Using `parse` on a plain string secret throws an error. +- **`sync.yaml` hot reload:** ESS watches its config file and automatically restarts when changes are detected (every ~5 seconds). No workload restart is needed after updating the config secret. + +### Resources + +- [ESS Documentation](https://docs.controlplane.com/template-catalog/templates/external-secret-syncer) +- [Image Source Code](https://github.com/controlplane-com/external-secret-syncer) \ No newline at end of file diff --git a/ess/versions/1.6.0/templates/_helpers.tpl b/ess/versions/1.6.0/templates/_helpers.tpl new file mode 100644 index 00000000..95668c35 --- /dev/null +++ b/ess/versions/1.6.0/templates/_helpers.tpl @@ -0,0 +1,39 @@ +{{/* Resource Naming */}} + +{{/* +ESS Workload Name +*/}} +{{- define "ess.name" -}} +{{- printf "%s-ess" .Release.Name }} +{{- end }} + +{{/* +ESS Identity Name +*/}} +{{- define "ess.identity.name" -}} +{{- printf "%s-ess-identity" .Release.Name }} +{{- end }} + +{{/* +ESS Policy Name +*/}} +{{- define "ess.policy.name" -}} +{{- printf "%s-ess-policy" .Release.Name }} +{{- end }} + +{{/* +ESS Secret Config Name +*/}} +{{- define "ess.secret.name" -}} +{{- printf "%s-ess-config" .Release.Name }} +{{- end }} + + +{{/* Labeling */}} + +{{/* +Common labels +*/}} +{{- define "ess.tags" -}} +{{- include "cpln-common.tags" . }} +{{- end }} \ No newline at end of file diff --git a/ess/versions/1.6.0/templates/identity.yaml b/ess/versions/1.6.0/templates/identity.yaml new file mode 100644 index 00000000..e7176ee7 --- /dev/null +++ b/ess/versions/1.6.0/templates/identity.yaml @@ -0,0 +1,5 @@ +kind: identity +gvc: {{ .Values.global.cpln.gvc }} +name: {{ include "ess.identity.name" . }} +description: ESS identity +tags: {{- include "ess.tags" . 
| nindent 4 }} diff --git a/ess/versions/1.6.0/templates/policy.yaml b/ess/versions/1.6.0/templates/policy.yaml new file mode 100644 index 00000000..cba2f1dd --- /dev/null +++ b/ess/versions/1.6.0/templates/policy.yaml @@ -0,0 +1,10 @@ +kind: policy +name: {{ include "ess.policy.name" . }} +description: ESS policy +bindings: + - permissions: + - manage + principalLinks: + - //gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "ess.identity.name" . }} +target: all +targetKind: secret diff --git a/ess/versions/1.6.0/templates/secret.yaml b/ess/versions/1.6.0/templates/secret.yaml new file mode 100644 index 00000000..764bc110 --- /dev/null +++ b/ess/versions/1.6.0/templates/secret.yaml @@ -0,0 +1,9 @@ +kind: secret +name: {{ include "ess.secret.name" . }} +description: ESS config +tags: {{- include "ess.tags" . | nindent 4 }} +type: opaque +data: + encoding: plain + payload: | +{{- toYaml .Values.essConfig | nindent 4 }} \ No newline at end of file diff --git a/ess/versions/1.6.0/templates/workload.yaml b/ess/versions/1.6.0/templates/workload.yaml new file mode 100644 index 00000000..a4106066 --- /dev/null +++ b/ess/versions/1.6.0/templates/workload.yaml @@ -0,0 +1,61 @@ +kind: workload +name: {{ include "ess.name" . }} +description: External Secret Syncer +tags: {{- include "ess.tags" . | nindent 4 }} +spec: + type: standard + containers: + - name: ess + cpu: {{ .Values.resources.cpu | quote }} + image: {{ .Values.image }} + inheritEnv: false + memory: {{ .Values.resources.memory | quote }} + ports: + - number: {{ .Values.port }} + protocol: http + readinessProbe: + failureThreshold: 3 + httpGet: + httpHeaders: [] + path: /about + port: {{ .Values.port }} + scheme: HTTP + initialDelaySeconds: 0 + periodSeconds: 10 + successThreshold: 1 + timeoutSeconds: 1 + volumes: + - path: /usr/src/app/sync.yaml + recoveryPolicy: retain + uri: cpln://secret/{{ include "ess.secret.name" . 
}} + defaultOptions: + autoscaling: + maxConcurrency: 0 + maxScale: 3 + metric: cpu + minScale: 1 + scaleToZeroDelay: 300 + target: 100 + capacityAI: false + debug: false + suspend: false + timeoutSeconds: 5 + firewallConfig: + external: + inboundAllowCIDR: + {{- toYaml .Values.allowedIp | nindent 8 }} + inboundBlockedCIDR: [] + outboundAllowCIDR: + - 0.0.0.0/0 + outboundAllowHostname: [] + outboundAllowPort: [] + outboundBlockedCIDR: [] + internal: + inboundAllowType: none + inboundAllowWorkload: [] + identityLink: //gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "ess.identity.name" . }} + loadBalancer: + direct: + enabled: false + ports: [] + supportDynamicTags: false diff --git a/ess/versions/1.6.0/values.yaml b/ess/versions/1.6.0/values.yaml new file mode 100644 index 00000000..a445bd60 --- /dev/null +++ b/ess/versions/1.6.0/values.yaml @@ -0,0 +1,86 @@ +image: ghcr.io/controlplane-com/cpln-build/external-secret-syncer:v1.4.0 + +resources: + cpu: 200m + memory: 256Mi + +port: 3004 + +allowedIp: + - 1.2.3.4 # Replace with your IP + +essConfig: + providers: + - name: my-vault + vault: + address: https://my-vault.com:8200 + token: + syncInterval: 1m + - name: my-aws-ssm + awsParameterStore: + region: us-east-1 + accessKeyId: # alternatively configure identity to natively use AWS permissions + secretAccessKey: # alternatively configure identity to natively use AWS permissions + # - name: my-aws-secrets-manager + # awsSecretsManager: + # region: us-east-1 + # accessKeyId: + # secretAccessKey: + # - name: my-1password + # onePassword: + # serviceAccountToken: + # integrationName: my-ess + # integrationVersion: 1.0.0 + # - name: my-1password-connect + # onePasswordConnect: + # serverURL: https://my-connect-server.example.com + # token: + # - name: my-doppler + # doppler: + # accessToken: + # - name: my-gcp + # gcpSecretManager: + # projectId: 123456789876 + # credentials: + # clientEmail: + # privateKey: + secrets: + - name: auth + provider: my-vault + 
syncInterval: 20s + dictionary: + PORT: + path: /v1/secret/data/app + parse: data.port + default: 5432 + PASSWORD: + path: /v1/secret/data/app + parse: data.password + USERNAME: + default: "no username" + path: /v1/secret/data/app + parse: data.username + - name: ssm + provider: my-aws-ssm + syncInterval: 20s + opaque: /example/app + # - name: secrets-manager + # provider: my-aws-secrets-manager + # dictionary: + # PASSWORD: + # path: /example/app + # parse: password + # - name: doppler-secret + # provider: my-doppler + # opaque: /project/config/SECRET_NAME + # - name: doppler-project + # provider: my-doppler + # dictionaryFromProject: + # path: project/config # syncs all secrets from a Doppler project+config + # - name: gcp + # provider: my-gcp + # opaque: database-password + # - name: gcp-project + # provider: my-gcp + # dictionaryFromProject: true # syncs all secrets from the GCP project + diff --git a/etcd/versions/1.4.0/templates/_helpers.tpl b/etcd/versions/1.4.0/templates/_helpers.tpl index 24caaa8a..e9646d75 100644 --- a/etcd/versions/1.4.0/templates/_helpers.tpl +++ b/etcd/versions/1.4.0/templates/_helpers.tpl @@ -39,14 +39,28 @@ etcd Volume Set Name {{/* Validation */}} {{/* -Validate replicas value - must be minimum 3 and an odd number +Validate replicas value - must be minimum 3 and odd (single-location), +or 1 when multi-location is configured (1 per location, locations provide the count) */}} {{- define "etcd.validateReplicas" -}} -{{- if lt (int .Values.replicas) 3 -}} -{{- fail "Error: .Values.replicas must be at least 3" -}} -{{- end -}} -{{- if eq (mod (int .Values.replicas) 2) 0 -}} -{{- fail "Error: .Values.replicas must be an odd number" -}} +{{- if .Values.global.locations -}} + {{- if ne (int .Values.replicas) 1 -}} + {{- fail "Error: .Values.replicas must be 1 when global.locations is set (1 replica per location)" -}} + {{- end -}} + {{- $locCount := len .Values.global.locations -}} + {{- if lt $locCount 3 -}} + {{- fail "Error: global.locations 
must have at least 3 entries for etcd quorum" -}} + {{- end -}} + {{- if eq (mod $locCount 2) 0 -}} + {{- fail "Error: global.locations must have an odd number of entries for etcd quorum" -}} + {{- end -}} +{{- else -}} + {{- if lt (int .Values.replicas) 3 -}} + {{- fail "Error: .Values.replicas must be at least 3" -}} + {{- end -}} + {{- if eq (mod (int .Values.replicas) 2) 0 -}} + {{- fail "Error: .Values.replicas must be an odd number" -}} + {{- end -}} {{- end -}} {{- end -}} @@ -70,10 +84,6 @@ helm.sh/chart: {{ include "etcd.chart" . }} app.cpln.io/version: {{ .Chart.AppVersion | quote }} {{- end }} app.cpln.io/managed-by: {{ .Release.Service }} -cpln/marketplace: "true" -cpln/marketplace-template: etcd -cpln/marketplace-template-version: {{ .Chart.Version }} -cpln/marketplace-gvc: {{ .Values.global.cpln.gvc }} {{- end }} {{/* @@ -82,4 +92,4 @@ Selector labels {{- define "etcd.selectorLabels" -}} app.cpln.io/name: {{ .Release.Name }} app.cpln.io/instance: {{ .Release.Name }} -{{- end }} \ No newline at end of file +{{- end }} diff --git a/etcd/versions/1.4.0/templates/secret.yaml b/etcd/versions/1.4.0/templates/secret.yaml index 9d31c0ce..205b818e 100644 --- a/etcd/versions/1.4.0/templates/secret.yaml +++ b/etcd/versions/1.4.0/templates/secret.yaml @@ -22,7 +22,23 @@ data: # Self FQDN for peer URLs SELF_FQDN="replica-${REPLICA_INDEX}.${WORKLOAD_NAME}.${LOCATION}.{{ .Values.global.cpln.gvc }}.cpln.local" - # Build initial cluster list based on replicas + {{- if .Values.global.locations }} + # Multi-location mode: build cluster from all locations (1 replica per location) + LOCATIONS=({{ range .Values.global.locations }}"{{ . 
}}" {{ end }}) + ETCD_NAME="${WORKLOAD_NAME}-${LOCATION}" + INITIAL_CLUSTER="" + for loc in "${LOCATIONS[@]}"; do + peer="replica-0.${WORKLOAD_NAME}.${loc}.{{ .Values.global.cpln.gvc }}.cpln.local" + entry="${WORKLOAD_NAME}-${loc}=http://${peer}:2380" + if [[ -z "$INITIAL_CLUSTER" ]]; then + INITIAL_CLUSTER="$entry" + else + INITIAL_CLUSTER="${INITIAL_CLUSTER},$entry" + fi + done + {{- else }} + # Single-location mode: build cluster from local replicas + ETCD_NAME="${WORKLOAD_NAME}-${REPLICA_INDEX}" INITIAL_CLUSTER="" for i in $(seq 0 $(({{ .Values.replicas }} - 1))); do peer="replica-${i}.${WORKLOAD_NAME}.${LOCATION}.{{ .Values.global.cpln.gvc }}.cpln.local" @@ -33,6 +49,7 @@ data: INITIAL_CLUSTER="${INITIAL_CLUSTER},$entry" fi done + {{- end }} # Determine cluster state if [ -d "/var/lib/etcd/member" ] && [ "$(ls -A /var/lib/etcd/member)" ]; then @@ -41,11 +58,12 @@ data: INITIAL_CLUSTER_STATE="new" fi - echo "Starting etcd with cluster state: $INITIAL_CLUSTER_STATE" + echo "Starting etcd node '${ETCD_NAME}' with cluster state: $INITIAL_CLUSTER_STATE" + echo "Initial cluster: $INITIAL_CLUSTER" # Run etcd exec etcd \ - --name "${WORKLOAD_NAME}-${REPLICA_INDEX}" \ + --name "${ETCD_NAME}" \ --data-dir /var/lib/etcd \ --listen-client-urls "http://0.0.0.0:2379" \ --advertise-client-urls "http://${SELF_FQDN}:2379" \ diff --git a/kafka/RELEASES.md b/kafka/RELEASES.md index 9375966c..95e981e7 100644 --- a/kafka/RELEASES.md +++ b/kafka/RELEASES.md @@ -4,6 +4,33 @@ - **Kafka Cluster Parallel Scaling Policy**: Changed the default `scalingPolicy` for the Kafka cluster stateful workload from `OrderedReady` to `Parallel` +# Release Notes - Version 4.0.0 + +## What's New + +- **kafka-orchestrator sidecar for accurate readiness**: The Kafka cluster workload now runs `ghcr.io/controlplane-com/kafka-orchestrator` as a sidecar container. 
The sidecar exposes an HTTP `/health/ready` endpoint that validates broker registration, controller election, under-replicated partition count, and log-directory health using franz-go — a much stronger readiness signal than the previous TCP-socket check on port 9093. + - Sidecar readiness probe: `httpGet /health/ready` on port 8080 + - Prometheus metrics exposed at `/metrics` (cgroup memory and OOM-risk ratios) + - SASL credentials are wired automatically from the configured listener (default: `client`) + - The kafka container's existing TCP probes on port 9093 are preserved; workload readiness is now gated on both probes passing + - Configurable under the new `kafka_orchestrator:` section in `values.yaml`; set to `null` or comment out to disable the sidecar + +- **Graceful broker shutdown**: The kafka container's `terminationGracePeriodSeconds` is now exposed via `kafka.terminationGracePeriodSeconds` in `values.yaml` (default `600` seconds, up from the previous hardcoded `30`). Brokers carrying large amounts of data now have time to complete `controlled.shutdown` (leadership transfer + log flush) before SIGKILL. + +- **Init script signal propagation**: The kafka container's bash wrapper now `exec`s into `/tmp/kafka-init.sh`, which already `exec`s into the Kafka run script. PID 1 is now the Kafka JVM itself, so SIGTERM from Control Plane reaches the broker directly and triggers `controlled.shutdown` instead of being absorbed by the bash wrapper. + +- **Suppressed Control Plane's default preStop drain delay (all four containers)**: Control Plane's default container lifecycle injects a `preStop sleep $((terminationGracePeriodSeconds / 2))` on **every** container (the actuator's `getLifecycle` runs per-container in the `for...containers` loop in `workloadDeployment.ts:246-258`). For our 600s grace period that means a 300s idle preStop on each of `kafka`, `kafka-orchestrator`, `kafka-exporter`, and `jmx-exporter`. 
The drain delay is intended for L7 envoy/ingress connections, none of which apply to a kafka stateful workload — clients reconnect via Metadata refresh, inter-broker traffic is handled by `controlled.shutdown`'s leadership transfer, and the prometheus-scrape sidecars have no draining semantics. All four containers now declare an explicit no-op `preStop: exec: ['true']`, suppressing the default on each. Net effect: the entire pod terminates in seconds (bounded by the kafka container's `controlled.shutdown`), not 300s+ of useless sleep on three sidecars holding the pod hostage. + +- **`cpln/publishNotReadyAddresses=true` on the Kafka cluster workload**: Required so the headless Service exposes not-yet-Ready broker pods in DNS, which is what lets the KRaft controller quorum form on cold start (or after suspend/unsuspend). Earlier versions of the chart got away with this missing because the Kafka container's TCP probe on 9093 briefly flickered Ready every crash-loop iteration, just long enough to publish endpoints. The new kafka-orchestrator sidecar's `/health/ready` probe (correctly) requires actual cluster health, which closes that race — making the tag mandatory rather than optional. Without it, pods crash-loop with `UnknownHostException: etl-cluster-N.etl-cluster:9093`. 
+ +- **Reliability and recovery defaults in `server.properties`**: The chart now emits the following defaults in the broker config (each can be overridden via `kafka.extra_configurations`): + - `default.replication.factor` — auto-derived as `min(3, kafka.replicas)`; clamps correctly when scaling below 3 replicas + - `min.insync.replicas` — auto-derived as `max(1, default.replication.factor - 1)` + - `controlled.shutdown.enable=true`, `controlled.shutdown.max.retries=3`, `controlled.shutdown.retry.backoff.ms=5000` — clean shutdown with retry on leadership-transfer failures + - `unclean.leader.election.enable=false` — never promote out-of-sync replicas; prevents data loss + - `num.recovery.threads.per.data.dir` — auto-derived as `8 * ceil(cores)` from `kafka.cpu` (e.g. `1000m` → 8, `2000m` → 16, `4` → 32). Recovery only runs after a *dirty* shutdown; a clean `controlled.shutdown` (now achievable thanks to the grace-period and signal-propagation fixes above) skips it entirely. + - `num.replica.fetchers=4` — faster follower replication so brokers rejoin the ISR quickly after transient outages + # Release Notes - Version 3.4.0 ## What's New diff --git a/kafka/versions/4.0.0/Chart.yaml b/kafka/versions/4.0.0/Chart.yaml new file mode 100644 index 00000000..a5569a4f --- /dev/null +++ b/kafka/versions/4.0.0/Chart.yaml @@ -0,0 +1,17 @@ +apiVersion: v2 +name: kafka +description: Kafka cluster app for Control Plane +type: application +version: 4.0.0 +appVersion: "3.9" + +dependencies: + - name: cpln-common + version: 1.0.0 + repository: "oci://ghcr.io/controlplane-com/templates" + +annotations: + created: "2026-04-28" + lastModified: "2026-04-28" + category: "event-streaming" + createsGvc: false \ No newline at end of file diff --git a/kafka/versions/4.0.0/README.md b/kafka/versions/4.0.0/README.md new file mode 100644 index 00000000..47dc2aa5 --- /dev/null +++ b/kafka/versions/4.0.0/README.md @@ -0,0 +1,169 @@ +## Kafka App + +### How to connect to the cluster + +You can 
connect to Kafka from the same GVC in which it's deployed using the following methods: + +- To connect using the cluster's general address, use `{kafka-cluster-workload-name}:9092`. + +- To connect to a specific replica, use one of the following addresses based on the replica you wish to connect to: + - `{kafka-cluster-workload-name}-0.{kafka-cluster-workload-name}:9092` + - `{kafka-cluster-workload-name}-1.{kafka-cluster-workload-name}:9092` + - `{kafka-cluster-workload-name}-2.{kafka-cluster-workload-name}:9092` + +- If you're configuring your Kafka for external access, you'll need to provide a domain name for the public address of the listener you want to use. Prerequisites: + - Make sure the dedicated load balancer is enabled on the GVC. See [Configure Domain documentation](https://docs.controlplane.com/guides/configure-domain#dedicated-load-balancing). + - Make sure to register your [Apex domain](https://docs.controlplane.com/reference/domain#apex-domain-considerations) name with Control Plane and set up a DNS record for the Kafka public address CNAME with the canonical GVC endpoint in your DNS provider. + +### Test Kafka Cluster with Kafka Client + +1. To activate the Kafka client, make sure `kafka_client` is uncommented in your values file. If necessary, reinstall the chart with the command: + ```bash + cpln helm install kafka-dev -f values-example.yaml + ``` + +2. To connect to the `kafka-client` workload, navigate through the UI to the appropriate GVC and select the `kafka-client` workload. In the workload details, find and use the **Connect** feature to establish a connection, which can be done either via the UI or by utilizing the CLI command provided there. + +3. Once connected, you can write and consume messages through the `kafka-client` workload. 
If the listener protocol is `PLAINTEXT`, omit the `client.properties` file and the `--producer.config` and `--consumer.config` options below: + +```BASH +# Change to bin directory +cd /opt/kafka/bin + +# Create client.properties +echo "security.protocol=SASL_PLAINTEXT +sasl.mechanism=PLAIN +sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username=\"admin\" password=\"your-admin-password\";" > ./client.properties + +# Produce messages to the 'controlplane' topic +kafka-console-producer.sh --bootstrap-server {kafka-cluster-workload-name}:9092 --topic controlplane --producer.config ./client.properties + +# Consume messages from the 'controlplane' topic +kafka-console-consumer.sh --bootstrap-server {kafka-cluster-workload-name}:9092 --topic controlplane --from-beginning --consumer.config ./client.properties +``` + +### Public Listener Domain Configuration + +When configuring Kafka for external access via a public listener, you can choose between two domain routing modes: + +#### **Direct Replica Routing Mode (Recommended)** + +The recommended approach with automatic replica endpoint generation: + +```yaml +kafka: + listeners: + public: + protocol: SASL_PLAINTEXT + name: PUBLIC + directReplicaRouting: + enabled: true + containerPort: 9095 # ports 9091, 9093 and 9094 are reserved + publicAddress: kafka.example.com + sasl: + users: "public-user" + passwords: "your-password" +``` + +**Behavior:** +- Single domain configuration with the specified container port +- DNS01 certificate challenge for automatic SSL +- Platform automatically generates replica-specific subdomains in format: `{replica-name}-{location}.{publicAddress}` +- Replica-aware routing reduces cross-zone traffic costs in multi-zone deployments +- Connection endpoints (auto-generated examples): + - `kafka-cluster-0-aws-us-east-1.kafka.example.com:9095` + - `kafka-cluster-1-aws-us-east-1.kafka.example.com:9095` + - `kafka-cluster-2-aws-us-east-1.kafka.example.com:9095` + +**Prerequisites for Direct Routing:** +- DNS provider must 
support CNAME records +- Create DNS records for each replica and the ACME challenge record: + 1. `CNAME kafka-cluster-0-aws-us-east-1.kafka.example.com → kafka-cluster--0.aws-us-east-1.controlplane.us` + 2. `CNAME kafka-cluster-1-aws-us-east-1.kafka.example.com → kafka-cluster--1.aws-us-east-1.controlplane.us` + 3. `CNAME kafka-cluster-2-aws-us-east-1.kafka.example.com → kafka-cluster--2.aws-us-east-1.controlplane.us` + 4. `CNAME _acme-challenge.kafka → _acme-challenge.cpln.app` (for certificate validation) + +#### **Multi-Port Routing** + +Each replica gets its own port. Not recommended for multi-zone clusters: + +```yaml +kafka: + listeners: + public: + protocol: SASL_PLAINTEXT + name: PUBLIC + publicAddress: kafka.example.com + sasl: + users: "public-user" + passwords: "your-password" +``` + +**Behavior:** +- Creates ports 3000, 3001, 3002 (one per replica) +- Each port routes to a specific replica +- Custom TLS cipher suites configuration +- Connection format: `kafka.example.com:3000`, `kafka.example.com:3001`, etc. 
+- **Note**: Not recommended for multi-zone deployments as cross-zone traffic charges may occur + +**Which Mode to Use:** +- Use **Direct Replica Routing** for new deployments that require automatic SSL with zone-aware routing and per-replica hostnames +- Avoid **Multi-Port Routing** unless you have a specific need for it or existing clients configured with port numbers (3000-300X) + +**Configuration Rules:** +- Cannot use both `publicAddress` and `directReplicaRouting.enabled: true` in the same listener +- When `directReplicaRouting.enabled: true`, both `containerPort` and `publicAddress` must be specified within the `directReplicaRouting` section +- Only one listener across all listeners can have a public address configured +- Direct Replica Routing automatically generates replica endpoints in the format: `{replica-name}-{location}.{publicAddress}:{containerPort}` + +### Enable Custom Encryption using AWS Key Management Service (KMS) + +Custom encryption for volumes can be configured by setting the values under `kafka.volumes.customEncryption`. + +A key must be created in AWS before installing the template. + +In the values file, set `enabled` to `true` and add the proper `region` and `keyId`. + +**Important** - To finish configuring in AWS once the template is installed: + +1. Navigate in the console to the created volume +2. Click on `spec` +3. Follow the `AWS Custom Encryption Instructions` +4. 
Repeat for each encrypted volume created + +### Kafbat configuration example + +Full configuration Docs: https://ui.docs.kafbat.io/configuration/configuration-file + +```YAML +kafka: + clusters: + - name: "apache-kafka" + bootstrapServers: "kafka-dev-cluster.kafka-dev.cpln.local:9092" + kafkaConnect: + - name: kafka-dev-connect-connect-cluster + address: http://kafka-dev-connect-connect-cluster.kafka-dev.cpln.local:8083 + properties: + security.protocol: "SASL_PLAINTEXT" + sasl.mechanism: "PLAIN" + sasl.jaas.config: "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"admin\" password=\"your-admin-password\";" + +management: + health: + ldap: + enabled: false + +auth: + type: "LOGIN_FORM" +spring: + security: + user: + name: "admin" + password: "adminPassword" + +server: + port: 8080 +``` + +### Release Notes +See [RELEASES.md](https://github.com/controlplane-com/templates/blob/main/kafka/RELEASES.md) diff --git a/kafka/versions/4.0.0/charts/cpln-common-1.0.0.tgz b/kafka/versions/4.0.0/charts/cpln-common-1.0.0.tgz new file mode 100644 index 00000000..c41e0905 Binary files /dev/null and b/kafka/versions/4.0.0/charts/cpln-common-1.0.0.tgz differ diff --git a/kafka/versions/4.0.0/templates/_helpers.tpl b/kafka/versions/4.0.0/templates/_helpers.tpl new file mode 100644 index 00000000..3da03e4b --- /dev/null +++ b/kafka/versions/4.0.0/templates/_helpers.tpl @@ -0,0 +1,908 @@ +{{/* +Name +*/}} +{{- define "kafka.name" -}} +{{- printf "%s" .Release.Name -}} +{{- end }} + +{{/* +Cluster Workload Name +*/}} +{{- define "kafka.clusterName" -}} +{{- printf "%s-%s" (include "kafka.name" .) .Values.kafka.name -}} +{{- end }} + +{{/* +Convert .Values.kafka.memory to appropriate JVM heap size settings. 
+*/}} +{{- define "kafka.heap.opts" -}} +{{- $memory := default "512Mi" .Values.kafka.memory }} +{{- $memoryInMi := 0 }} +{{- if hasSuffix "Gi" $memory }} + {{- $value := trimSuffix "Gi" $memory | float64 }} + {{- $memoryInMi = mul $value 1024 | int }} +{{- else if hasSuffix "Mi" $memory }} + {{- $memoryInMi = trimSuffix "Mi" $memory | int }} +{{- else }} + {{- $memoryInMi = 512 }}{{/* Default to 512Mi if the value has no Gi/Mi suffix */}} +{{- end }} +{{- $heapSize := div (mul $memoryInMi 60) 100 | int }} +-Xmx{{ $heapSize }}m -Xms{{ $heapSize }}m +{{- end }} + +{{/* +Recovery threads per data dir. Kafka serializes log recovery within a single thread per +data dir on broker startup, so a high partition count + slow disk = long recovery. We size +this proportionally to the broker's CPU budget: 8 * ceil(cpuCores). Accepts millicore +("1000m") or whole-core ("2") forms; ceils up so half-cores still get a useful pool. +*/}} +{{- define "kafka.recoveryThreads" -}} +{{- $cpu := default "1000m" .Values.kafka.cpu | toString -}} +{{- $millicores := 1000 -}} +{{- if hasSuffix "m" $cpu -}} + {{- $millicores = trimSuffix "m" $cpu | int -}} +{{- else -}} + {{- $millicores = mul (atoi $cpu) 1000 -}} +{{- end -}} +{{- $cores := div (add $millicores 999) 1000 -}} +{{- mul 8 $cores -}} +{{- end }} + +{{/* +Sensible default for default.replication.factor: min(3, replicaCount). Kafka rejects +default.replication.factor > replicas, so we clamp to the cluster size. +*/}} +{{- define "kafka.defaultReplicationFactor" -}} +{{- $replicas := .Values.kafka.replicas | int -}} +{{- if lt $replicas 3 -}}{{ $replicas }}{{- else -}}3{{- end -}} +{{- end }} + +{{/* +Sensible default for min.insync.replicas: max(1, defaultReplicationFactor - 1). With +replication.factor=3 this gives 2 (tolerates one broker loss without unavailability and +without losing acked writes); with replicas=2 it gives 1; with replicas=1 it gives 1. +*/}} +{{- define "kafka.minInsyncReplicas" -}} +{{- $rf := include "kafka.defaultReplicationFactor" . 
| int -}} +{{- if le $rf 1 -}}1{{- else -}}{{ sub $rf 1 }}{{- end -}} +{{- end }} + +{{- define "kafka.validateListenerConfig" -}} + {{- if not .name -}} + {{- fail "Error: 'name' must be provided for the listener" -}} + {{- end -}} + {{- if not .protocol -}} + {{- fail "Error: 'protocol' must be provided for the listener" -}} + {{- end -}} + {{- $hasValidConfig := or .publicAddress .containerPort (and .directReplicaRouting .directReplicaRouting.enabled .directReplicaRouting.containerPort .directReplicaRouting.publicAddress) -}} + {{- if not $hasValidConfig -}} + {{- fail "Error: At least one of 'publicAddress', 'containerPort', or valid 'directReplicaRouting' (with enabled: true, containerPort, and publicAddress) must be provided for the listener" -}} + {{- end -}} + {{- if and .publicAddress .containerPort -}} + {{- fail "Error: When publicAddress is set for the listener, containerPort should not be specified as it will be automatically set to port range 3000-3004" -}} + {{- end -}} + {{- if .containerPort -}} + {{- $port := .containerPort | toString }} + {{- if or (eq $port "9091") (eq $port "9093") (eq $port "9094") -}} + {{- fail "Error: containerPort cannot be 9091, 9093, or 9094 for listener" -}} + {{- end -}} + {{- end -}} + {{- if and .directReplicaRouting .directReplicaRouting.enabled -}} + {{- if .directReplicaRouting.containerPort -}} + {{- $port := .directReplicaRouting.containerPort | toString }} + {{- if or (eq $port "9091") (eq $port "9093") (eq $port "9094") -}} + {{- fail "Error: directReplicaRouting.containerPort cannot be 9091, 9093, or 9094 for listener" -}} + {{- end -}} + {{- end -}} + {{- end -}} +{{- end -}} + +{{- define "kafka.validateAdminExists" -}} +{{- $adminFound := false -}} +{{- $saslPlaintextExists := false -}} +{{- range .Values.kafka.listeners -}} + {{- if eq .protocol "SASL_PLAINTEXT" -}} + {{- $saslPlaintextExists = true -}} + {{- if and .sasl .sasl.admin -}} + {{- $adminFound = true -}} + {{- end -}} + {{- end -}} +{{- 
end -}} +{{- if and $saslPlaintextExists (not $adminFound) -}} + {{- fail "Error: At least one SASL_PLAINTEXT listener must have an admin user configured in sasl.admin" -}} +{{- end -}} +{{- end -}} + +{{- define "kafka.validateAuthConfig" -}} +{{- if eq .protocol "SASL_PLAINTEXT" -}} + {{- if not .sasl -}} + {{- fail (printf "Error: SASL_PLAINTEXT protocol requires sasl configuration to be enabled for listener '%s'" .name) -}} + {{- else if not .sasl.users -}} + {{- fail (printf "Error: SASL_PLAINTEXT protocol requires at least one user to be defined in sasl.users for listener '%s'" .name) -}} + {{- else -}} + {{- $userCount := len (splitList "," .sasl.users) -}} + {{- if not .sasl.passwords -}} + {{- fail (printf "Error: sasl.passwords must be provided when sasl.users is defined for listener '%s'" .name) -}} + {{- else -}} + {{- $passwordCount := len (splitList "," .sasl.passwords) -}} + {{- if ne $userCount $passwordCount -}} + {{- fail (printf "Error: Number of users (%d) does not match number of passwords (%d) for listener '%s'" $userCount $passwordCount .name) -}} + {{- end -}} + {{- end -}} + {{- end -}} +{{- end -}} +{{- end -}} + +{{- define "kafka.validateReplicas" -}} +{{- $replicas := .Values.kafka.replicas | int }} +{{- if or (gt $replicas 5) (eq $replicas 2) -}} + {{- fail "Invalid value for kafka.replicas. It must be less than or equal to 5 and not equal to 2." 
-}} +{{- end -}} +{{- end -}} + +{{- define "kafka.validateOnePublicAddress" -}} +{{- $publicAddressCount := 0 -}} +{{- range .Values.kafka.listeners }} + {{- if .publicAddress }} + {{- $publicAddressCount = add $publicAddressCount 1 -}} + {{- end }} + {{- if and .directReplicaRouting .directReplicaRouting.enabled .directReplicaRouting.publicAddress }} + {{- $publicAddressCount = add $publicAddressCount 1 -}} + {{- end }} +{{- end }} +{{- if gt $publicAddressCount 1 -}} + {{- fail "There must be at most one listener with a publicAddress set (either listener.publicAddress or listener.directReplicaRouting.publicAddress)." -}} +{{- end }} +{{- end -}} + +{{- define "kafka.validatedirectReplicaRoutingConfig" -}} +{{- range $key, $listener := .Values.kafka.listeners }} + {{- if and $listener.publicAddress $listener.directReplicaRouting }} + {{- if $listener.directReplicaRouting.enabled }} + {{- fail (printf "Error in listener '%s': Cannot have both 'publicAddress' at listener level and 'directReplicaRouting.enabled: true'. Use either legacy mode (publicAddress only) or new mode (directReplicaRouting with enabled: true)." $key) -}} + {{- end }} + {{- end }} + {{- if and $listener.directReplicaRouting $listener.directReplicaRouting.enabled }} + {{- if not $listener.directReplicaRouting.publicAddress }} + {{- fail (printf "Error in listener '%s': When directReplicaRouting.enabled is true, directReplicaRouting.publicAddress must be specified." $key) -}} + {{- end }} + {{- if not $listener.directReplicaRouting.containerPort }} + {{- fail (printf "Error in listener '%s': When directReplicaRouting.enabled is true, directReplicaRouting.containerPort must be specified." $key) -}} + {{- end }} + {{- end }} +{{- end }} +{{- end -}} + +{{- define "kafka.validateKafkaImage" -}} +{{- $image := .Values.kafka.image -}} +{{- if contains "bitnami" $image -}} + {{- fail (printf "Error: This chart does not support Bitnami images, please use Apache Kafka images instead. 
Current value: %s" $image) -}} +{{- end -}} +{{- end -}} + +{{- define "kafka.validateImage" -}} +{{- $image := .image -}} +{{- if contains "bitnami" $image -}} + {{- fail (printf "Error: This chart does not support Bitnami images, please use Apache Kafka images instead. Current value: %s" $image) -}} +{{- end -}} +{{- end -}} + +{{- define "kafka.clientBootstrapAddress" -}} +{{- $clusterName := include "kafka.clusterName" . -}} +{{- $bootstrapAddress := "" -}} +{{- $listenerName := "" -}} + +{{- if .listenerName -}} + {{- $listenerName = .listenerName -}} +{{- else if .Values.kafka_connectors -}} + {{- range .Values.kafka_connectors -}} + {{- if .listener -}} + {{- $listenerName = .listener -}} + {{- end -}} + {{- end -}} +{{- end -}} + +{{- if $listenerName -}} + {{- if hasKey .Values.kafka.listeners $listenerName -}} + {{- $listener := index .Values.kafka.listeners $listenerName -}} + {{- if $listener.publicAddress -}} + {{- $bootstrapAddress = printf "%s:3000" $listener.publicAddress -}} + {{- else -}} + {{- $containerPort := $listener.containerPort | int -}} + {{- $bootstrapAddress = printf "%s:%d" $clusterName $containerPort -}} + {{- end -}} + {{- else -}} + {{- $bootstrapAddress = include "kafka.bootstrapAddress" . -}} + {{- end -}} +{{- else -}} + {{- $bootstrapAddress = include "kafka.bootstrapAddress" . -}} +{{- end -}} + +{{- $bootstrapAddress -}} +{{- end -}} + +{{- define "kafka.propertiesMapToList" -}} +{{- range $key, $value := . -}} +{{ $key }}={{ $value }} +{{- end -}} +{{- end -}} + + + +{{- define "kafka.connectors.download.script" -}} +#!/bin/sh +set -e{{- if .verbose }}x{{- end }} + +download_file() { + local url=$1 + local output_file=$2 + + if echo "$url" | grep -q "@.*jfrog"; then + echo "Handling JFrog redirect for: $url" + local redirect_url=$(wget -S --spider "$url" 2>&1 | grep 'Location:' | awk '{print $2}') + if [ -n "$redirect_url" ]; then + echo "Downloading from redirect URL..." 
+ wget -q "$redirect_url" -O "$output_file" + else + echo "Failed to get redirect URL, trying direct download..." + wget -q "$url" -O "$output_file" + fi + else + wget -q "$url" -O "$output_file" + fi +} + +# Function to download and extract artifacts +download_and_extract() { + local artifact_type=$1 + local artifact_url=$2 + local plugin_path=$3 + local plugin_name=$4 + local temp_dir=$(mktemp -d) + + echo "Downloading artifact from $artifact_url" + + if [ "$artifact_type" = "jar" ]; then + # For jar files, create a directory for the plugin if it doesn't exist + local plugin_dir="$plugin_path/$plugin_name" + mkdir -p "$plugin_dir" + + # Download jar file to the plugin-specific directory + download_file "$artifact_url" "$plugin_dir/${plugin_name}.jar" + echo "Downloaded JAR file to $plugin_dir/${plugin_name}.jar" + else + # For archives, download to temp dir and extract + local archive_file="$temp_dir/archive.${artifact_type}" + download_file "$artifact_url" "$archive_file" + + echo "Extracting $artifact_type archive to $plugin_path" + case "$artifact_type" in + "tgz"|"tar.gz") + tar -xzf "$archive_file" -C "$plugin_path" + ;; + "tar") + tar -xf "$archive_file" -C "$plugin_path" + ;; + "zip") + unzip -o "$archive_file" -d "$plugin_path" + ;; + *) + echo "Unsupported archive type: $artifact_type" + ;; + esac + fi + + rm -rf "$temp_dir" +} +# Process each Kafka connector +echo "Setting up Kafka connector plugins" + +# Download and extract artifacts for each enabled plugin +{{- range .plugins }} +{{- if eq .enabled true }} +echo "Processing plugin: {{ .name }}" +{{- $pluginName := .name }} +{{- range .artifacts }} +download_and_extract "{{ .type }}" "{{ .url }}" "{{ $.plugins_folder }}" "{{ $pluginName }}" +{{- end }} +{{- else }} +echo "Skipping disabled plugin: {{ .name }}" +{{- end }} +{{- end }} + +echo "All Kafka connector plugins have been downloaded and extracted." +echo "Sleeping..." 
+sleep infinity +{{- end }} + +{{- define "kafka.connectors.run.script" -}} +#!/bin/bash +set -e{{- if .verbose }}x{{- end }} + +# Function to create or update a connector +create_or_update_connector() { + local connector_name=$1 + local config=$2 + local cluster_connectors=$3 + + echo "Checking if connector $connector_name exists in cluster..." + + # Check if connector exists in the cluster-wide list (reliable in distributed mode) + local exists=false + if echo "$cluster_connectors" | grep -q "\"$connector_name\""; then + exists=true + echo "Connector $connector_name found in cluster list" + else + echo "Connector $connector_name does not exist in cluster" + fi + + if [ "$exists" = true ]; then + echo "Connector $connector_name exists. Updating configuration..." + + # Extract just the config part from the full connector JSON + # Remove the "name" line, remove everything up to and including "config": {, remove last two lines (both closing braces) + local config_content=$(echo "$config" | sed '/^[[:space:]]*"name":/d' | sed '1,/^[[:space:]]*"config":[[:space:]]*{/d' | sed '$d' | sed '$d') + local update_config="{${config_content}}" + + echo "Updating connector $connector_name with config:" + echo "$update_config" + + # Update the connector using PUT with nc (BusyBox wget doesn't support PUT) + local content_length=$(echo -n "$update_config" | wc -c | xargs) + local response + response=$(echo -e "PUT /connectors/$connector_name/config HTTP/1.1\r\nHost: localhost:8083\r\nContent-Type: application/json\r\nContent-Length: $content_length\r\nConnection: close\r\n\r\n$update_config" | nc localhost 8083 2>&1) + echo "HTTP response for $connector_name update: $(echo "$response" | head -1)" + if ! echo "$response" | grep -q "HTTP/1\.. 2"; then + echo "ERROR: Failed to update connector $connector_name. Response: $(echo "$response" | head -5)" + else + echo "Connector $connector_name updated successfully" + fi + else + echo "Connector $connector_name does not exist. 
Creating..." + + echo "Creating connector $connector_name with config:" + echo "$config" + + # Create the connector using POST (this may also update if connector exists) + wget -q -O /dev/null "http://localhost:8083/connectors" \ + --header="Content-Type: application/json" \ + --post-data="$config" + + echo "Connector $connector_name created successfully" + + # Add a delay after creation to allow connector to initialize + sleep 2 + fi +} + +# Function to check and setup truststore for SSL connections +truststore_init() { + local hostname=$1 + local port=$2 + local alias=$3 + local jdbc_props=$4 + + echo "Setting up truststore for $hostname:$port with alias $alias" + + # Parse JDBC connection properties + local truststore_path + local truststore_password + + # First check if ssl.truststore.location is provided in the config + if [[ -n "${SSL_TRUSTSTORE_LOCATION}" ]]; then + truststore_path="${SSL_TRUSTSTORE_LOCATION}" + echo "Using ssl.truststore.location from config: $truststore_path" + # Then check if it's in JDBC properties + elif [[ "$jdbc_props" =~ ssl\.truststore\.location=([^;]+) ]]; then + truststore_path="${BASH_REMATCH[1]}" + echo "Using ssl.truststore.location from JDBC properties: $truststore_path" + elif [[ "$jdbc_props" =~ trustStorePath=([^;]+) ]]; then + truststore_path="${BASH_REMATCH[1]}" + echo "Using trustStorePath from JDBC properties: $truststore_path" + else + # Use default path if not specified + truststore_path="/tmp/kafka.client.truststore.jks" + echo "No truststore path specified, using default: $truststore_path" + fi + + # Check if ssl.truststore.password is provided + if [[ -n "${SSL_TRUSTSTORE_PASSWORD}" ]]; then + truststore_password="${SSL_TRUSTSTORE_PASSWORD}" + echo "Using ssl.truststore.password from config" + # Then check if it's in JDBC properties + elif [[ "$jdbc_props" =~ ssl\.truststore\.password=([^;]+) ]]; then + truststore_password="${BASH_REMATCH[1]}" + echo "Using ssl.truststore.password from JDBC properties" + elif [[ 
"$jdbc_props" =~ trustStorePassword=([^;]+) ]]; then + truststore_password="${BASH_REMATCH[1]}" + echo "Using trustStorePassword from JDBC properties" + else + # Generate random password if not specified + truststore_password=$(openssl rand -base64 12) + export SSL_TRUSTSTORE_PASSWORD="${truststore_password}" + echo "Generated random ssl.truststore.password: ${truststore_password}" + fi + + # Create certs directory if it doesn't exist + mkdir -p $(dirname "$truststore_path") + + # Download CA certificate + echo "Downloading CA certificate for $hostname:$port" + echo | openssl s_client -connect $hostname:$port -showcerts 2>/dev/null | \ + openssl x509 -outform PEM > $(dirname "$truststore_path")/$alias.pem || \ + echo "WARNING: Failed to download certificate from $hostname:$port, truststore may be incomplete" + + # Create truststore if it doesn't exist or override existing one + echo "Creating new truststore from $JAVA_HOME/lib/security/cacerts" + if [[ -f "$JAVA_HOME/lib/security/cacerts" ]]; then + cp "$JAVA_HOME/lib/security/cacerts" "$truststore_path" || \ + echo "WARNING: Failed to copy cacerts to $truststore_path, truststore setup may be incomplete" + else + echo "WARNING: $JAVA_HOME/lib/security/cacerts not found, skipping truststore creation for $hostname" + return 0 + fi + + # Change the default password to our password + echo "Setting truststore password" + keytool -storepasswd -keystore "$truststore_path" \ + -storepass "changeit" -new "${truststore_password}" || \ + echo "WARNING: Failed to change truststore password for $hostname, continuing with default password" + + # Import certificate into truststore (only if cert file is non-empty) + local cert_file="$(dirname "$truststore_path")/$alias.pem" + if [[ -s "$cert_file" ]]; then + echo "Importing certificate into truststore" + keytool -import -noprompt -alias $alias -file "$cert_file" \ + -keystore "$truststore_path" -storepass "${truststore_password}" || \ + echo "WARNING: Failed to import certificate 
for $alias, truststore may be incomplete" + else + echo "WARNING: Skipping certificate import for $alias, cert file is empty or missing" + fi + + echo "Truststore setup completed for $hostname" +} + +# Function to setup multi-domain truststore from values configuration +setup_multi_domain_truststore() { + local plugin_name="$1" + local ssl_truststore_config="$2" + + echo "Setting up multi-domain truststore for plugin: $plugin_name" + + # Parse the ssl_truststore configuration (passed as JSON-like string) + local generate=$(echo "$ssl_truststore_config" | grep -o '"generate"[[:space:]]*:[[:space:]]*true' | wc -l) + + if [[ $generate -eq 0 ]]; then + echo "Multi-domain truststore generation disabled for $plugin_name" + return 0 + fi + + echo "Multi-domain truststore generation enabled for $plugin_name" + + # Extract truststore path (REQUIRED) + local truststore_path=$(echo "$ssl_truststore_config" | grep -o '"truststore_path"[[:space:]]*:[[:space:]]*"[^"]*"' | sed 's/.*"truststore_path"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/') + if [[ -z "$truststore_path" ]]; then + echo "ERROR: ssl_truststore.truststore_path is required when ssl_truststore.generate is true for plugin $plugin_name" + exit 1 + fi + + # Extract password environment variable name (REQUIRED) + local password_env=$(echo "$ssl_truststore_config" | grep -o '"truststore_password_env"[[:space:]]*:[[:space:]]*"[^"]*"' | sed 's/.*"truststore_password_env"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/') + if [[ -z "$password_env" ]]; then + echo "ERROR: ssl_truststore.truststore_password_env is required when ssl_truststore.generate is true for plugin $plugin_name" + exit 1 + fi + + # Check if password already exists + if [[ -n "${!password_env}" ]]; then + echo "Using existing password from environment variable: $password_env" + local truststore_password="${!password_env}" + else + # Generate random password + local truststore_password=$(openssl rand -base64 12) + export "$password_env"="$truststore_password" + 
echo "Generated random password for $password_env (value not logged)" + fi + + # Create truststore directory if it doesn't exist + mkdir -p "$(dirname "$truststore_path")" + + # Create truststore if it doesn't exist or if we're starting fresh + if [[ ! -f "$truststore_path" ]]; then + echo "Creating new multi-domain truststore at: $truststore_path" + + # Validate JAVA_HOME exists + if [[ -z "$JAVA_HOME" ]]; then + echo "ERROR: JAVA_HOME environment variable is not set, required for truststore creation for plugin $plugin_name" + exit 1 + fi + + # Validate cacerts file exists + if [[ ! -f "$JAVA_HOME/lib/security/cacerts" ]]; then + echo "ERROR: Java cacerts file not found at $JAVA_HOME/lib/security/cacerts for plugin $plugin_name" + exit 1 + fi + + # Copy cacerts as base truststore + if ! cp "$JAVA_HOME/lib/security/cacerts" "$truststore_path"; then + echo "ERROR: Failed to create truststore file at $truststore_path for plugin $plugin_name" + exit 1 + fi + + # Change the default password to our password + echo "Setting truststore password" + if ! 
keytool -storepasswd -keystore "$truststore_path" \ + -storepass "changeit" -new "$truststore_password" >/dev/null 2>&1; then + echo "ERROR: Failed to set truststore password for plugin $plugin_name" + exit 1 + fi + fi + + # Extract and process hostnames (REQUIRED) + local hostnames=$(echo "$ssl_truststore_config" | grep -o '"hostnames"[[:space:]]*:[[:space:]]*\[[^]]*\]' | sed 's/.*"hostnames"[[:space:]]*:[[:space:]]*\[\([^]]*\)\].*/\1/' | tr ',' '\n') + + if [[ -z "$hostnames" ]]; then + echo "ERROR: ssl_truststore.hostnames is required and must be a non-empty array when ssl_truststore.generate is true for plugin $plugin_name" + exit 1 + fi + + # Validate that hostnames array is not empty + local hostname_count=$(echo "$hostnames" | grep -v '^$' | wc -l) + if [[ $hostname_count -eq 0 ]]; then + echo "ERROR: ssl_truststore.hostnames must contain at least one hostname when ssl_truststore.generate is true for plugin $plugin_name" + exit 1 + fi + + # Download and import certificates for each hostname + while IFS= read -r hostname_entry; do + if [[ -n "$hostname_entry" ]]; then + # Clean up the hostname (remove quotes and whitespace) + local clean_hostname=$(echo "$hostname_entry" | sed 's/[[:space:]]*"\([^"]*\)".*/\1/' | xargs) + + if [[ -n "$clean_hostname" ]]; then + local hostname=$(echo "$clean_hostname" | cut -d':' -f1) + local port=$(echo "$clean_hostname" | cut -d':' -f2) + + # Validate hostname:port format + if [[ -z "$hostname" || -z "$port" || "$hostname" == "$port" ]]; then + echo "ERROR: Invalid hostname format '$clean_hostname' in ssl_truststore.hostnames for plugin $plugin_name. Expected format: 'hostname:port'" + exit 1 + fi + + # Validate port is numeric + if ! [[ "$port" =~ ^[0-9]+$ ]]; then + echo "ERROR: Invalid port '$port' in hostname '$clean_hostname' for plugin $plugin_name. Port must be numeric." + exit 1 + fi + + local alias="$plugin_name-$(echo $hostname | tr '.' 
'-')" + + echo "Downloading certificate for $hostname:$port with alias $alias" + + # Download CA certificate + local cert_file="$(dirname "$truststore_path")/$alias.pem" + if ! echo | openssl s_client -connect $hostname:$port -showcerts 2>/dev/null | \ + openssl x509 -outform PEM > "$cert_file"; then + echo "ERROR: Failed to download certificate from $hostname:$port for plugin $plugin_name" + exit 1 + fi + + # Validate certificate file is not empty + if [[ ! -s "$cert_file" ]]; then + echo "ERROR: Downloaded certificate from $hostname:$port is empty for plugin $plugin_name" + exit 1 + fi + + # Import certificate into truststore (skip if already exists) + if keytool -list -keystore "$truststore_path" -storepass "$truststore_password" -alias "$alias" >/dev/null 2>&1; then + echo "Certificate with alias $alias already exists in truststore, skipping" + else + echo "Importing certificate with alias $alias into truststore" + if ! keytool -import -noprompt -alias "$alias" -file "$cert_file" \ + -keystore "$truststore_path" -storepass "$truststore_password" >/dev/null 2>&1; then + echo "ERROR: Failed to import certificate with alias $alias into truststore for plugin $plugin_name" + exit 1 + fi + fi + fi + fi + done <<< "$hostnames" + + + + echo "Multi-domain truststore setup completed for $plugin_name at: $truststore_path" +} + +# Function to setup connectors in the background +setup_connectors() { + echo "Starting connector setup process..." + + # Wait for Kafka Connect to start + echo "Waiting for Kafka Connect to start..." + until wget -q http://localhost:8083/ -O /dev/null; do + echo "Waiting for Kafka Connect REST API..." + sleep 5 + done + + echo "Kafka Connect REST API is up. Waiting for connector plugins to load..." 
+ # Wait for connector plugins to be available (indicates full initialization) + local retry_count=0 + local max_retries=12 + until wget -q -O - http://localhost:8083/connector-plugins 2>/dev/null | grep -q "class" || [ $retry_count -ge $max_retries ]; do + echo "Waiting for connector plugins to load... (attempt $((retry_count+1))/$max_retries)" + sleep 5 + retry_count=$((retry_count+1)) + done + + echo "Kafka Connect plugins loaded. Now waiting for existing connectors to be restored from connect-config topic..." + # Poll until the connector count has stayed unchanged for max_connector_wait consecutive one-second checks + local connector_wait=0 + local max_connector_wait=20 + local prev_count=-1 + while [ $connector_wait -lt $max_connector_wait ]; do + INSTALLED_CONNECTORS=$(wget -q -O - http://localhost:8083/connectors 2>/dev/null || echo "[]") + local current_count=$(echo "$INSTALLED_CONNECTORS" | tr -d '[]"' | tr ',' '\n' | grep -v '^$' | wc -l | xargs) + + if [ "$current_count" != "$prev_count" ]; then + echo "Connectors being restored... Found $current_count connector(s) so far: $INSTALLED_CONNECTORS" + prev_count=$current_count + connector_wait=0 # Reset wait counter when we see changes + else + if [ $connector_wait -eq 0 ] && [ "$current_count" -gt 0 ]; then + echo "Connector count stable at $current_count. Waiting about $max_connector_wait more seconds to ensure restoration is complete..." + fi + connector_wait=$((connector_wait+1)) + fi + + sleep 1 + done + + # Get final list of currently installed connectors + echo "Fetching final list of installed connectors..." 
+ INSTALLED_CONNECTORS=$(wget -q -O - http://localhost:8083/connectors 2>/dev/null || echo "[]") + echo "Installed connectors: $INSTALLED_CONNECTORS" + + # Build list of desired connectors from values file + DESIRED_CONNECTORS=({{- range .plugins }} "{{ .name }}"{{- end }}) + echo "Desired connectors from values: ${DESIRED_CONNECTORS[@]}" + echo "Number of desired connectors: ${#DESIRED_CONNECTORS[@]}" + + # Remove connectors that are not in the desired list + if [[ "$INSTALLED_CONNECTORS" != "[]" && "$INSTALLED_CONNECTORS" != "" ]]; then + echo "$INSTALLED_CONNECTORS" | tr -d '[]"' | tr ',' '\n' | while IFS= read -r connector; do + connector=$(echo "$connector" | xargs) # trim whitespace + if [[ -n "$connector" ]]; then + found=false + for desired in "${DESIRED_CONNECTORS[@]}"; do + if [[ "$connector" == "$desired" ]]; then + found=true + break + fi + done + if [[ "$found" == "false" ]]; then + echo "Connector '$connector' is not enabled. Removing..." + (echo -e "DELETE /connectors/$connector HTTP/1.1\r\nHost: localhost:8083\r\nConnection: close\r\n\r\n" | nc localhost 8083 > /dev/null 2>&1) || true + fi + fi + done + fi + + # Create/update connectors + {{- range .plugins }} + echo "Processing connector: {{ .name }}" + {{- if hasKey . "enabled" }} + {{- if not .enabled }} + echo "Connector {{ .name }} is disabled. Removing if it exists..." + (echo -e "DELETE /connectors/{{ .name }} HTTP/1.1\r\nHost: localhost:8083\r\nConnection: close\r\n\r\n" | nc localhost 8083 > /dev/null 2>&1) || true + + # Wait a bit between connectors to allow Kafka Connect API to stabilize + sleep 3 + {{- else }} + echo "Creating/updating connector: {{ .name }}" + +{{- if hasKey . 
"ssl_truststore" }} +# Setup multi-domain truststore if configured +SSL_TRUSTSTORE_CONFIG='{"generate":{{ if hasKey .ssl_truststore "generate" }}{{ .ssl_truststore.generate }}{{ else }}false{{ end }}{{- if hasKey .ssl_truststore "truststore_path" }},"truststore_path":"{{ .ssl_truststore.truststore_path }}"{{- end }}{{- if hasKey .ssl_truststore "truststore_password_env" }},"truststore_password_env":"{{ .ssl_truststore.truststore_password_env }}"{{- end }}{{- if hasKey .ssl_truststore "hostnames" }},"hostnames":[{{- range $i, $hostname := .ssl_truststore.hostnames }}{{- if $i }},{{- end }}"{{ $hostname }}"{{- end }}]{{- end }}}' +setup_multi_domain_truststore "{{ .name }}" "$SSL_TRUSTSTORE_CONFIG" +{{- end }} + +{{- if and (hasKey .config "ssl") (eq .config.ssl "true") }} +# Check if SSL is enabled +# Export ssl.truststore.location if it exists +{{- if hasKey .config "ssl.truststore.location" }} +export SSL_TRUSTSTORE_LOCATION={{ index .config "ssl.truststore.location" | quote }} +{{- end }} +# Export ssl.truststore.password if it exists +{{- if hasKey .config "ssl.truststore.password" }} +export SSL_TRUSTSTORE_PASSWORD={{ index .config "ssl.truststore.password" | quote }} +{{- end }} +# Setup truststore +truststore_init "{{ .config.hostname }}" "{{ .config.port }}" "{{ .name }}" "{{ default "" .config.jdbcConnectionProperties }}" +{{- end }} + +CONFIG=$(cat << 'EOF' +{ + "name": "{{ .name }}", + "config": { + {{- $first := true }} + {{- range $key, $value := .config }} + {{- if $first }}{{ $first = false }}{{ else }},{{ end }} + "{{ $key }}": "{{ $value }}" + {{- end }} + {{- if and (hasKey .config "ssl") (eq .config.ssl "true") (not (hasKey .config "ssl.truststore.password")) }} + ,"ssl.truststore.password": "${SSL_TRUSTSTORE_PASSWORD}" + {{- end }} + } +} +EOF +) + +# If we have a generated password, replace it in the config +if [[ -n "${SSL_TRUSTSTORE_PASSWORD}" && ! 
"{{ if hasKey .config "ssl.truststore.password" }}true{{ else }}false{{ end }}" == "true" ]]; then + CONFIG=$(echo "$CONFIG" | sed "s|\${SSL_TRUSTSTORE_PASSWORD}|${SSL_TRUSTSTORE_PASSWORD}|g") +fi + + {{- if hasKey . "ssl_truststore" }} + {{- if hasKey .ssl_truststore "truststore_password_env" }} + # Replace plugin-specific truststore password if it exists + PLUGIN_PASSWORD_VAR="{{ .ssl_truststore.truststore_password_env }}" + echo "DEBUG: Looking for password in environment variable: $PLUGIN_PASSWORD_VAR" + echo "DEBUG: Password value: [redacted]" + if [[ -n "${!PLUGIN_PASSWORD_VAR}" ]]; then + echo "DEBUG: Replacing \${${PLUGIN_PASSWORD_VAR}} with password in config" + CONFIG=$(echo "$CONFIG" | sed "s|\${${PLUGIN_PASSWORD_VAR}}|${!PLUGIN_PASSWORD_VAR}|g") + echo "DEBUG: Replacement applied for {{ .name }}" + echo "DEBUG: Config is not printed because it now contains the password" + else + echo "DEBUG: No password found in $PLUGIN_PASSWORD_VAR" + fi + {{- end }} + {{- end }} + +# Try to create connector with retry logic +max_retries=5 +retry_count=0 +while [ $retry_count -lt $max_retries ]; do + if create_or_update_connector "{{ .name }}" "$CONFIG" "$INSTALLED_CONNECTORS"; then + echo "Successfully created/updated connector {{ .name }} on attempt $((retry_count+1))" + break + else + retry_count=$((retry_count+1)) + if [ $retry_count -lt $max_retries ]; then + echo "Failed to create/update connector {{ .name }}, retrying in 10 seconds (attempt $retry_count/$max_retries)..." 
+ sleep 10 + else + echo "Failed to create/update connector {{ .name }} after $max_retries attempts" + fi + fi +done + +# Wait a bit between connectors to allow Kafka Connect API to stabilize +sleep 3 +{{- end }} + {{- else }} + echo "Creating/updating connector: {{ .name }}" + {{- if and (hasKey .config "ssl") (eq .config.ssl "true") }} + # Check if SSL is enabled + # Export ssl.truststore.location if it exists + {{- if hasKey .config "ssl.truststore.location" }} + export SSL_TRUSTSTORE_LOCATION={{ index .config "ssl.truststore.location" | quote }} + {{- end }} + # Export ssl.truststore.password if it exists + {{- if hasKey .config "ssl.truststore.password" }} + export SSL_TRUSTSTORE_PASSWORD={{ index .config "ssl.truststore.password" | quote }} + {{- end }} + # Setup truststore + truststore_init "{{ .config.hostname }}" "{{ .config.port }}" "{{ .name }}" "{{ default "" .config.jdbcConnectionProperties }}" + {{- end }} + + CONFIG=$(cat << 'EOF' +{ + "name": "{{ .name }}", + "config": { + {{- $first := true }} + {{- range $key, $value := .config }} + {{- if $first }}{{ $first = false }}{{ else }},{{ end }} + "{{ $key }}": "{{ $value }}" + {{- end }} + {{- if and (hasKey .config "ssl") (eq .config.ssl "true") (not (hasKey .config "ssl.truststore.password")) }} + {{- if not $first }},{{ end }} + "ssl.truststore.password": "${SSL_TRUSTSTORE_PASSWORD}" + {{- end }} + } +} +EOF +) + + # If we have a generated password, replace it in the config + if [[ -n "${SSL_TRUSTSTORE_PASSWORD}" && ! "{{ if hasKey .config "ssl.truststore.password" }}true{{ else }}false{{ end }}" == "true" ]]; then + CONFIG=$(echo "$CONFIG" | sed "s|\${SSL_TRUSTSTORE_PASSWORD}|${SSL_TRUSTSTORE_PASSWORD}|g") + fi + + {{- if hasKey . 
"ssl_truststore" }} +{{- if hasKey .ssl_truststore "truststore_password_env" }} +# Replace plugin-specific truststore password if it exists +PLUGIN_PASSWORD_VAR="{{ .ssl_truststore.truststore_password_env }}" +echo "DEBUG: Looking for password in environment variable: $PLUGIN_PASSWORD_VAR" +echo "DEBUG: Password value: [redacted]" +if [[ -n "${!PLUGIN_PASSWORD_VAR}" ]]; then + echo "DEBUG: Replacing \${${PLUGIN_PASSWORD_VAR}} with password in config" + CONFIG=$(echo "$CONFIG" | sed "s|\${${PLUGIN_PASSWORD_VAR}}|${!PLUGIN_PASSWORD_VAR}|g") + echo "DEBUG: Replacement applied for {{ .name }}" + echo "DEBUG: Config is not printed because it now contains the password" +else + echo "DEBUG: No password found in $PLUGIN_PASSWORD_VAR" +fi +{{- end }} +{{- end }} + + # Try to create connector with retry logic +max_retries=5 +retry_count=0 +while [ $retry_count -lt $max_retries ]; do + if create_or_update_connector "{{ .name }}" "$CONFIG" "$INSTALLED_CONNECTORS"; then + echo "Successfully created/updated connector {{ .name }} on attempt $((retry_count+1))" + break + else + retry_count=$((retry_count+1)) + if [ $retry_count -lt $max_retries ]; then + echo "Failed to create/update connector {{ .name }}, retrying in 10 seconds (attempt $retry_count/$max_retries)..." + sleep 10 + else + echo "Failed to create/update connector {{ .name }} after $max_retries attempts" + fi + fi +done + +# Wait a bit between connectors to allow Kafka Connect API to stabilize +sleep 3 + {{- end }} + {{- end }} + + echo "All Kafka connectors have been configured and started." +} + +# Signal handler for graceful shutdown +cleanup() { + echo "Received shutdown signal, stopping Kafka Connect..." + if [[ -n $KAFKA_PID ]]; then + kill -TERM $KAFKA_PID + wait $KAFKA_PID + fi + exit 0 +} + +# Set up signal handlers +trap cleanup SIGTERM SIGINT + +echo "Starting Kafka Connect distributed worker..." 
+ +# Set rest.advertised.host.name dynamically +POD_ID=$(echo "$POD_NAME" | rev | cut -d'-' -f 1 | rev) +WORKLOAD_NAME=$(echo "$CPLN_WORKLOAD" | sed 's|.*/workload/\([^/]*\)$|\1|') +cp /opt/kafka/config/connect-distributed.properties /opt/kafka/config/connect-distributed-updated.properties +echo "" >> /opt/kafka/config/connect-distributed-updated.properties +echo "rest.advertised.host.name=${WORKLOAD_NAME}-${POD_ID}.${WORKLOAD_NAME}" >> /opt/kafka/config/connect-distributed-updated.properties + +# Start the connector setup process in the background +setup_connectors & +SETUP_PID=$! + +# Start Kafka Connect as a background job so the cleanup trap can forward signals to it +echo "Starting Kafka Connect..." +exec /opt/kafka/bin/connect-distributed.sh /opt/kafka/config/connect-distributed-updated.properties & +KAFKA_PID=$! + +# Wait for Kafka Connect to exit +wait $KAFKA_PID +{{- end }} + + +{{/* Labeling */}} + +{{/* +Common labels +*/}} +{{- define "kafka.tags" -}} +{{- include "cpln-common.tags" . }} +{{- end }} \ No newline at end of file diff --git a/kafka/versions/4.0.0/templates/domain.yaml b/kafka/versions/4.0.0/templates/domain.yaml new file mode 100644 index 00000000..acf09fa9 --- /dev/null +++ b/kafka/versions/4.0.0/templates/domain.yaml @@ -0,0 +1,64 @@ +{{- include "kafka.validateOnePublicAddress" . }} +{{- include "kafka.validatedirectReplicaRoutingConfig" . 
}} +{{- range $key, $listener := .Values.kafka.listeners }} +{{- if and $listener.directReplicaRouting $listener.directReplicaRouting.enabled }} +--- +kind: domain +name: {{ $listener.directReplicaRouting.publicAddress }} +description: {{ $listener.directReplicaRouting.publicAddress }} +spec: + acceptAllHosts: false + acceptAllSubdomains: false + certChallengeType: dns01 + dnsMode: cname + ports: + - number: {{ $listener.directReplicaRouting.containerPort }} + protocol: tcp + routes: + - port: {{ $listener.directReplicaRouting.containerPort }} + prefix: / + workloadLink: //gvc/{{ $.Values.global.cpln.gvc }}/workload/{{ include "kafka.clusterName" $ }} + tls: + cipherSuites: + - ECDHE-ECDSA-AES256-GCM-SHA384 + - ECDHE-ECDSA-CHACHA20-POLY1305 + - ECDHE-ECDSA-AES128-GCM-SHA256 + - ECDHE-RSA-AES256-GCM-SHA384 + - ECDHE-RSA-CHACHA20-POLY1305 + - ECDHE-RSA-AES128-GCM-SHA256 + - AES256-GCM-SHA384 + - AES128-GCM-SHA256 + minProtocolVersion: TLSV1_2 + workloadLink: //gvc/{{ $.Values.global.cpln.gvc }}/workload/{{ include "kafka.clusterName" $ }} +{{- else if $listener.publicAddress }} +--- +kind: domain +name: {{ $listener.publicAddress }} +description: {{ $listener.publicAddress }} +spec: + acceptAllHosts: false + dnsMode: cname + ports: + {{- $replicaCount := $.Values.kafka.replicas | int }} + {{- range $i := until $replicaCount }} + - number: {{ add 3000 $i }} + protocol: tcp + routes: + - port: {{ add 3000 $i }} + prefix: / + replica: {{ $i }} + workloadLink: //gvc/{{ $.Values.global.cpln.gvc }}/workload/{{ include "kafka.clusterName" $ }} + tls: + cipherSuites: + - ECDHE-ECDSA-AES256-GCM-SHA384 + - ECDHE-ECDSA-CHACHA20-POLY1305 + - ECDHE-ECDSA-AES128-GCM-SHA256 + - ECDHE-RSA-AES256-GCM-SHA384 + - ECDHE-RSA-CHACHA20-POLY1305 + - ECDHE-RSA-AES128-GCM-SHA256 + - AES256-GCM-SHA384 + - AES128-GCM-SHA256 + minProtocolVersion: TLSV1_2 + {{- end }} +{{- end }} +{{- end }} \ No newline at end of file diff --git a/kafka/versions/4.0.0/templates/identity.yaml 
b/kafka/versions/4.0.0/templates/identity.yaml new file mode 100644 index 00000000..571ed3b6 --- /dev/null +++ b/kafka/versions/4.0.0/templates/identity.yaml @@ -0,0 +1,4 @@ +kind: identity +name: {{ include "kafka.name" . }} +description: {{ include "kafka.clusterName" . }} identity +gvc: {{ .Values.global.cpln.gvc }} \ No newline at end of file diff --git a/kafka/versions/4.0.0/templates/kafbat-ui.yaml b/kafka/versions/4.0.0/templates/kafbat-ui.yaml new file mode 100644 index 00000000..992c5ba1 --- /dev/null +++ b/kafka/versions/4.0.0/templates/kafbat-ui.yaml @@ -0,0 +1,114 @@ +{{- if .Values.kafbat_ui.enabled }} +{{- if .Values.kafbat_ui.domain }} +kind: domain +name: {{ .Values.kafbat_ui.domain }} +description: {{ .Values.kafbat_ui.domain }} +spec: + acceptAllHosts: false + dnsMode: cname + ports: + - number: 443 + protocol: http2 + routes: + - port: 8080 + prefix: / + workloadLink: //gvc/{{ $.Values.global.cpln.gvc }}/workload/{{ include "kafka.name" $ }}-{{ .Values.kafbat_ui.name }} + tls: + cipherSuites: + - ECDHE-ECDSA-AES256-GCM-SHA384 + - ECDHE-ECDSA-CHACHA20-POLY1305 + - ECDHE-ECDSA-AES128-GCM-SHA256 + - ECDHE-RSA-AES256-GCM-SHA384 + - ECDHE-RSA-CHACHA20-POLY1305 + - ECDHE-RSA-AES128-GCM-SHA256 + - AES256-GCM-SHA384 + - AES128-GCM-SHA256 + minProtocolVersion: TLSV1_2 +--- +{{- end }} +kind: policy +name: {{ include "kafka.name" $ }}-{{ .Values.kafbat_ui.name }} +description: {{ include "kafka.name" $ }}-{{ .Values.kafbat_ui.name }} +tags: {{- include "kafka.tags" . 
| nindent 2 }} +bindings: + - permissions: + - reveal + principalLinks: + - //gvc/{{ $.Values.global.cpln.gvc }}/identity/{{ include "kafka.name" $ }}-{{ .Values.kafbat_ui.name }} +targetKind: secret +targetLinks: + - //secret/{{ .Values.kafbat_ui.configuration_secret }} +--- +kind: identity +name: {{ include "kafka.name" $ }}-{{ .Values.kafbat_ui.name }} +description: {{ include "kafka.name" $ }}-{{ .Values.kafbat_ui.name }} +gvc: {{ $.Values.global.cpln.gvc }} +--- +kind: workload +name: {{ include "kafka.name" $ }}-{{ .Values.kafbat_ui.name }} +description: {{ include "kafka.name" $ }}-{{ .Values.kafbat_ui.name }} +tags: + {{- if .Values.kafbat_ui.deletionProtection }} + cpln/protected: true + {{- end }} + {{- include "kafka.tags" . | nindent 2 }} +spec: + type: standard + containers: + - name: kafbat-ui + cpu: {{ .Values.kafbat_ui.cpu }} + {{- if .Values.kafbat_ui.minCpu }} + minCpu: '{{ .Values.kafbat_ui.minCpu }}' + {{- end }} + env: + - name: SPRING_CONFIG_ADDITIONAL-LOCATION + value: /etc/config.yaml + image: {{ .Values.kafbat_ui.image }} + inheritEnv: false + memory: {{ .Values.kafbat_ui.memory }} + {{- if .Values.kafbat_ui.minMemory }} + minMemory: {{ .Values.kafbat_ui.minMemory }} + {{- end }} + ports: + - number: 8080 + volumes: + - path: /etc/config.yaml + recoveryPolicy: retain + uri: cpln://secret/{{ .Values.kafbat_ui.configuration_secret }} + defaultOptions: + autoscaling: + maxConcurrency: 0 + maxScale: {{ .Values.kafbat_ui.replicas }} + metric: disabled + minScale: {{ .Values.kafbat_ui.replicas }} + scaleToZeroDelay: 300 + target: 100 + {{- if or .Values.kafbat_ui.minCpu .Values.kafbat_ui.minMemory }} + capacityAI: true + {{- else }} + capacityAI: false + {{- end }} + debug: false + suspend: false + timeoutSeconds: {{ .Values.kafbat_ui.timeoutSeconds }} +{{- if .Values.kafbat_ui.firewall }} + firewallConfig: + {{- if or (hasKey .Values.kafbat_ui.firewall "external_inboundAllowCIDR") (hasKey .Values.kafbat_ui.firewall "external_outboundAllowCIDR") 
}} + external: + inboundAllowCIDR: {{- if .Values.kafbat_ui.firewall.external_inboundAllowCIDR }}{{ .Values.kafbat_ui.firewall.external_inboundAllowCIDR | splitList "," | toYaml | nindent 8 }}{{- else }} []{{- end }} + outboundAllowCIDR: {{- if .Values.kafbat_ui.firewall.external_outboundAllowCIDR }}{{ .Values.kafbat_ui.firewall.external_outboundAllowCIDR | splitList "," | toYaml | nindent 8 }}{{- else }} []{{- end }} + {{- end }} + {{- if hasKey .Values.kafbat_ui.firewall "internal_inboundAllowType" }} + internal: + inboundAllowType: {{ default "none" .Values.kafbat_ui.firewall.internal_inboundAllowType }} + {{- end }} +{{- end }} + identityLink: //gvc/{{ $.Values.global.cpln.gvc }}/identity/{{ include "kafka.name" $ }}-{{ .Values.kafbat_ui.name }} + loadBalancer: + direct: + enabled: false + ports: [] + securityOptions: + filesystemGroupId: 101 + supportDynamicTags: false +{{- end }} diff --git a/kafka/versions/4.0.0/templates/kafka-connectors.yaml b/kafka/versions/4.0.0/templates/kafka-connectors.yaml new file mode 100644 index 00000000..4d0b8300 --- /dev/null +++ b/kafka/versions/4.0.0/templates/kafka-connectors.yaml @@ -0,0 +1,224 @@ +{{- if .Values.kafka_connectors }} +{{- range .Values.kafka_connectors }} +{{- include "kafka.validateImage" . 
-}} +kind: policy +name: {{ include "kafka.name" $ }}-connect-{{ .name }} +description: {{ include "kafka.name" $ }}-connect-{{ .name }} +tags: {{- include "kafka.tags" $ | nindent 2 }} +bindings: + - permissions: + - reveal + principalLinks: + - //gvc/{{ $.Values.global.cpln.gvc }}/identity/{{ include "kafka.name" $ }}-connect-{{ .name }} +targetKind: secret +targetLinks: + - //secret/{{ include "kafka.name" $ }}-connect-{{ .name }}-props + - //secret/{{ include "kafka.name" $ }}-connect-{{ .name }}-init + - //secret/{{ include "kafka.name" $ }}-connect-{{ .name }}-download + {{- if and .extraVolumes (ne (len .extraVolumes) 0) }} + {{- range .extraVolumes }} + {{- if contains "/secret/" .uri }} + {{- $secretName := regexReplaceAll ".*//secret/([^.]+).*" .uri "${1}" }} + - //secret/{{ $secretName }} + {{- end }} + {{- end }} + {{- end }} + {{- /* Check for secrets in ssl_truststore configuration values */ -}} + {{- if hasKey . "ssl_truststore" }} + {{- range $key, $value := .ssl_truststore }} + {{- if and (kindIs "string" $value) (contains "//secret/" $value) }} + {{- $secretName := regexReplaceAll ".*//secret/([^.]+).*" $value "${1}" }} + - //secret/{{ $secretName }} + {{- end }} + {{- end }} + {{- end }} + {{- /* Check for secrets in env variables */ -}} + {{- if hasKey . "env" }} + {{- range .env }} + {{- if and (hasKey . "value") (contains "//secret/" .value) }} + {{- $secretName := regexReplaceAll ".*//secret/([^.]+).*" .value "${1}" }} + - //secret/{{ $secretName }} + {{- end }} + {{- end }} + {{- end }} + {{- /* Check for secrets in connector_properties */ -}} + {{- if hasKey . 
"connector_properties" }} + {{- range $key, $value := .connector_properties }} + {{- if and (kindIs "string" $value) (contains "//secret/" $value) }} + {{- $secretName := regexReplaceAll ".*//secret/([^.]+).*" $value "${1}" }} + - //secret/{{ $secretName }} + {{- end }} + {{- end }} + {{- end }} +--- +kind: secret +name: {{ include "kafka.name" $ }}-connect-{{ .name }}-download +type: opaque +data: + encoding: plain + payload: | + {{- include "kafka.connectors.download.script" (dict "plugins" .plugins "plugins_folder" .plugins_folder "verbose" .verbose) | nindent 4 }} +--- +kind: secret +name: {{ include "kafka.name" $ }}-connect-{{ .name }}-init +type: opaque +data: + encoding: plain + payload: | + {{- include "kafka.connectors.run.script" (dict "plugins" .plugins "plugins_folder" .plugins_folder "verbose" .verbose) | nindent 4 }} +--- +kind: secret +name: {{ include "kafka.name" $ }}-connect-{{ .name }}-props +description: {{ include "kafka.name" $ }}-connect-{{ .name }}-props +tags: {{- include "kafka.tags" $ | nindent 2 }} +type: opaque +data: + encoding: plain + payload: |- + {{- if not (hasKey .connector_properties "bootstrap.servers") }} + bootstrap.servers={{ include "kafka.clientBootstrapAddress" $ }} + {{- end }} + {{- range $key, $value := .connector_properties }} + {{ $key }}={{ $value }} + {{- end }} +--- +kind: identity +name: {{ include "kafka.name" $ }}-connect-{{ .name }} +description: {{ include "kafka.name" $ }}-connect-{{ .name }} +gvc: {{ $.Values.global.cpln.gvc }} +--- +kind: volumeset +name: {{ include "kafka.name" $ }}-connect-{{ .name }} +description: {{ include "kafka.name" $ }}-connect-{{ .name }} +tags: {{- include "kafka.tags" $ | nindent 2 }} +spec: + fileSystemType: {{ dig "volumes" "fileSystemType" "ext4" . }} + initialCapacity: {{ dig "volumes" "initialCapacity" 10 . }} + performanceClass: {{ dig "volumes" "performanceClass" "general-purpose-ssd" . 
}} + {{- if and .volumes .volumes.customEncryption .volumes.customEncryption.enabled }} + customEncryption: + regions: + {{ .volumes.customEncryption.region }}: + keyId: '{{ .volumes.customEncryption.keyId }}' + {{- end }} + {{- if and .volumes .volumes.snapshots }} + snapshots: + createFinalSnapshot: {{ .volumes.snapshots.createFinalSnapshot | default true }} + retentionDuration: {{ .volumes.snapshots.retentionDuration | default "7d" }} + {{- if .volumes.snapshots.schedule }} + schedule: {{ .volumes.snapshots.schedule }} + {{- end }} + {{- else }} + snapshots: + createFinalSnapshot: true + retentionDuration: 7d + {{- end }} +--- +kind: workload +name: {{ include "kafka.name" $ }}-connect-{{ .name }} +description: {{ include "kafka.name" $ }}-connect-{{ .name }} +gvc: {{ $.Values.global.cpln.gvc }} +tags: + {{- if .deletionProtection }} + cpln/protected: true + {{- end }} + {{- include "kafka.tags" $ | nindent 2 }} +spec: + type: stateful + containers: + - name: kafka-connect + {{- if and .env (ne (len .env) 0) }} + env: + {{- toYaml .env | nindent 8 }} + {{- end }} + args: + - '-c' + - sleep 60 && cp /opt/kafka/init.sh /opt/kafka/init-run.sh && chmod +x /opt/kafka/init-run.sh && /opt/kafka/init-run.sh + command: /bin/bash + cpu: {{ .cpu }} + {{- if .minCpu }} + minCpu: {{ .minCpu }} + {{- end }} + image: {{ .image }} + inheritEnv: false + memory: {{ .memory }} + {{- if .minMemory }} + minMemory: {{ .minMemory }} + {{- end }} + ports: + - number: 8083 + protocol: http + volumes: + - path: /opt/kafka/plugins + recoveryPolicy: retain + uri: cpln://volumeset/{{ include "kafka.name" $ }}-connect-{{ .name }} + - path: /opt/kafka/config/connect-distributed.properties + recoveryPolicy: retain + uri: cpln://secret/{{ include "kafka.name" $ }}-connect-{{ .name }}-props + - path: /opt/kafka/init.sh + recoveryPolicy: retain + uri: cpln://secret/{{ include "kafka.name" $ }}-connect-{{ .name }}-init + {{- if .extraVolumes }} + {{- toYaml .extraVolumes | nindent 8 }} + {{- end 
}} + - name: plugins-downloader + args: + - '-c' + - cp /opt/kafka/download.sh /opt/kafka/download-run.sh && chmod +x /opt/kafka/download-run.sh && /opt/kafka/download-run.sh + command: /bin/sh + cpu: 80m + image: busybox:musl + inheritEnv: false + memory: 120Mi + ports: [] + volumes: + - path: /opt/kafka/plugins + recoveryPolicy: retain + uri: cpln://volumeset/{{ include "kafka.name" $ }}-connect-{{ .name }} + - path: /opt/kafka/download.sh + recoveryPolicy: retain + uri: cpln://secret/{{ include "kafka.name" $ }}-connect-{{ .name }}-download + defaultOptions: + autoscaling: + maxConcurrency: 0 + maxScale: {{ .replicas }} + metric: cpu + minScale: {{ .replicas }} + scaleToZeroDelay: 300 + target: 100 + capacityAI: false + debug: false + {{- if .multiZone }} + multiZone: + enabled: true + {{- else }} + multiZone: + enabled: false + {{- end }} + suspend: false + timeoutSeconds: {{ .timeoutSeconds }} +{{- if .firewall }} + firewallConfig: + {{- if or (hasKey .firewall "external_inboundAllowCIDR") (hasKey .firewall "external_outboundAllowCIDR") }} + external: + inboundAllowCIDR: {{- if .firewall.external_inboundAllowCIDR }}{{ .firewall.external_inboundAllowCIDR | splitList "," | toYaml | nindent 8 }}{{- else }} []{{- end }} + outboundAllowCIDR: {{- if .firewall.external_outboundAllowCIDR }}{{ .firewall.external_outboundAllowCIDR | splitList "," | toYaml | nindent 8 }}{{- else }} []{{- end }} + {{- end }} + {{- if hasKey .firewall "internal_inboundAllowType" }} + internal: + inboundAllowType: {{ default "none" .firewall.internal_inboundAllowType }} + {{- if hasKey .firewall "inboundAllowWorkload" }} + inboundAllowWorkload: {{ .firewall.inboundAllowWorkload | toYaml | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} + identityLink: //gvc/{{ $.Values.global.cpln.gvc }}/identity/{{ include "kafka.name" $ }}-connect-{{ .name }} + loadBalancer: + direct: + enabled: false + ports: [] + securityOptions: + filesystemGroupId: 1001 + supportDynamicTags: false +{{- end }} +{{- 
end }} \ No newline at end of file diff --git a/kafka/versions/4.0.0/templates/kafka-rest-proxy.yaml b/kafka/versions/4.0.0/templates/kafka-rest-proxy.yaml new file mode 100644 index 00000000..177f8c2b --- /dev/null +++ b/kafka/versions/4.0.0/templates/kafka-rest-proxy.yaml @@ -0,0 +1,169 @@ +{{- if .Values.kafka_rest_proxy.enabled }} +{{- if .Values.kafka_rest_proxy.password_properties }} +kind: secret +name: {{ include "kafka.name" . }}-rest-password-properties +description: {{ include "kafka.name" . }}-rest-password-properties +tags: {{- include "kafka.tags" . | nindent 2 }} +type: opaque +data: + encoding: plain + payload: |- + {{- range $key, $value := .Values.kafka_rest_proxy.password_properties }} + {{ $key }}: {{ $value }} + {{- end }} +{{- end }} +--- +kind: secret +name: {{ include "kafka.name" . }}-rest-properties +description: {{ include "kafka.name" . }}-rest-properties +tags: {{- include "kafka.tags" . | nindent 2 }} +type: opaque +data: + encoding: plain + payload: |- + {{- range $key, $value := .Values.kafka_rest_proxy.properties }} + {{ $key }}={{ $value }} + {{- end }} +--- +kind: secret +name: {{ include "kafka.name" . }}-rest-jaas-conf +description: {{ include "kafka.name" . }}-rest-jaas-conf +tags: {{- include "kafka.tags" . | nindent 2 }} +type: opaque +data: + encoding: plain + payload: >- + {{- .Values.kafka_rest_proxy.jaas_conf | nindent 4 }} +--- +kind: identity +name: {{ include "kafka.name" . }}-rest-proxy-identity +description: Identity for Kafka Rest Proxy {{ include "kafka.name" . }} +gvc: {{ .Values.global.cpln.gvc }} +--- +kind: policy +name: {{ include "kafka.name" . }}-rest-proxy-policy +origin: default +bindings: + - permissions: + - reveal + principalLinks: + - //gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "kafka.name" . }}-rest-proxy-identity +targetKind: secret +targetLinks: +{{- if .Values.kafka_rest_proxy.password_properties }} + - //secret/{{ include "kafka.name" . 
}}-rest-password-properties +{{- end }} + - //secret/{{ include "kafka.name" . }}-rest-properties + - //secret/{{ include "kafka.name" . }}-rest-jaas-conf +--- +kind: workload +name: {{ include "kafka.name" . }}-{{ .Values.kafka_rest_proxy.name }} +description: Kafka Rest Proxy +gvc: {{ .Values.global.cpln.gvc }} +tags: + {{- if .Values.kafka_rest_proxy.deletionProtection }} + cpln/protected: true + {{- end }} + cpln/marketplace: "true" + cpln/marketplace-template: kafka + cpln/marketplace-template-version: {{ .Chart.Version }} +spec: + type: standard + containers: + - name: rest-proxy + args: + - '-c' + - >- + KAFKAREST_OPTS="-Djava.security.auth.login.config=/etc/kafka-rest/kafka-rest.jaas.conf" + kafka-rest-start /etc/kafka-rest/kafka-rest.properties + command: /bin/bash + cpu: {{ .Values.kafka_rest_proxy.cpu }} + image: {{ .Values.kafka_rest_proxy.image }} + inheritEnv: false + memory: {{ .Values.kafka_rest_proxy.memory }} + {{- if and .Values.kafka_rest_proxy.capacityAI .Values.kafka_rest_proxy.capacityAI.enabled }} + {{- if .Values.kafka_rest_proxy.capacityAI.minCpu }} + minCpu: {{ .Values.kafka_rest_proxy.capacityAI.minCpu }} + {{- end }} + {{- if .Values.kafka_rest_proxy.capacityAI.minMemory }} + minMemory: {{ .Values.kafka_rest_proxy.capacityAI.minMemory }} + {{- end }} + {{- end }} + ports: + - number: 8082 + protocol: http + volumes: + {{- if .Values.kafka_rest_proxy.password_properties }} + - path: /etc/kafka-rest/password.properties + recoveryPolicy: retain + uri: cpln://secret/{{ include "kafka.name" . }}-rest-password-properties + {{- end }} + - path: /etc/kafka-rest/kafka-rest.jaas.conf + recoveryPolicy: retain + uri: cpln://secret/{{ include "kafka.name" . }}-rest-jaas-conf + - path: /etc/kafka-rest/kafka-rest.properties + recoveryPolicy: retain + uri: cpln://secret/{{ include "kafka.name" . 
}}-rest-properties + defaultOptions: + autoscaling: + maxConcurrency: 0 + maxScale: {{ .Values.kafka_rest_proxy.replicas }} + metric: disabled + minScale: {{ .Values.kafka_rest_proxy.replicas }} + scaleToZeroDelay: 300 + target: 100 + capacityAI: {{ .Values.kafka_rest_proxy.capacityAI.enabled }} + debug: false + suspend: false + timeoutSeconds: {{ .Values.kafka_rest_proxy.timeoutSeconds }} +{{- if .Values.kafka_rest_proxy.firewall }} + firewallConfig: + {{- if or (hasKey .Values.kafka_rest_proxy.firewall "external_inboundAllowCIDR") (hasKey .Values.kafka_rest_proxy.firewall "external_outboundAllowCIDR") }} + external: + inboundAllowCIDR: {{- if .Values.kafka_rest_proxy.firewall.external_inboundAllowCIDR }}{{ .Values.kafka_rest_proxy.firewall.external_inboundAllowCIDR | splitList "," | toYaml | nindent 8 }}{{- else }} []{{- end }} + outboundAllowCIDR: {{- if .Values.kafka_rest_proxy.firewall.external_outboundAllowCIDR }}{{ .Values.kafka_rest_proxy.firewall.external_outboundAllowCIDR | splitList "," | toYaml | nindent 8 }}{{- else }} []{{- end }} + {{- end }} + {{- if hasKey .Values.kafka_rest_proxy.firewall "internal_inboundAllowType" }} + internal: + inboundAllowType: {{ default "none" .Values.kafka_rest_proxy.firewall.internal_inboundAllowType }} + {{- if .Values.kafka_rest_proxy.firewall.inboundAllowWorkload }} + inboundAllowWorkload: {{ .Values.kafka_rest_proxy.firewall.inboundAllowWorkload | toYaml | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} + identityLink: //identity/{{ include "kafka.name" . 
}}-rest-proxy-identity + loadBalancer: + direct: + enabled: false + ports: [] + securityOptions: + filesystemGroupId: 1000 + supportDynamicTags: false +{{ if .Values.kafka_rest_proxy.domain }} +--- +kind: domain +name: {{ .Values.kafka_rest_proxy.domain }} +description: {{ .Values.kafka_rest_proxy.domain }} +spec: + acceptAllHosts: false + dnsMode: cname + ports: + - number: 443 + protocol: http2 + routes: + - port: 8082 + prefix: / + workloadLink: //gvc/{{ .Values.global.cpln.gvc }}/workload/{{ include "kafka.name" . }}-{{ .Values.kafka_rest_proxy.name }} + tls: + cipherSuites: + - ECDHE-ECDSA-AES256-GCM-SHA384 + - ECDHE-ECDSA-CHACHA20-POLY1305 + - ECDHE-ECDSA-AES128-GCM-SHA256 + - ECDHE-RSA-AES256-GCM-SHA384 + - ECDHE-RSA-CHACHA20-POLY1305 + - ECDHE-RSA-AES128-GCM-SHA256 + - AES256-GCM-SHA384 + - AES128-GCM-SHA256 + minProtocolVersion: TLSV1_2 +{{- end }} +{{- end }} \ No newline at end of file diff --git a/kafka/versions/4.0.0/templates/policy.yaml b/kafka/versions/4.0.0/templates/policy.yaml new file mode 100644 index 00000000..fb9a338b --- /dev/null +++ b/kafka/versions/4.0.0/templates/policy.yaml @@ -0,0 +1,16 @@ +kind: policy +name: {{ include "kafka.name" . }} +origin: default +bindings: + - permissions: + - reveal + principalLinks: + - //gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "kafka.name" . }} +targetKind: secret +targetLinks: + - //secret/{{ include "kafka.name" . }}-controller-configuration + - //secret/{{ include "kafka.name" . }}-init + - //secret/{{ include "kafka.name" . }}-secrets +{{- if .Values.jmx_exporter }} + - //secret/{{ include "kafka.name" . 
}}-jmx-exporter-conf +{{- end }} \ No newline at end of file diff --git a/kafka/versions/4.0.0/templates/secret-controller-configuration.yaml b/kafka/versions/4.0.0/templates/secret-controller-configuration.yaml new file mode 100644 index 00000000..c66a36e8 --- /dev/null +++ b/kafka/versions/4.0.0/templates/secret-controller-configuration.yaml @@ -0,0 +1,99 @@ +kind: secret +name: {{ include "kafka.name" . }}-controller-configuration +type: opaque +data: + encoding: plain + payload: | + {{- include "kafka.validateReplicas" . }} + + # Listeners configuration + listeners-placeholder + advertised.listeners=INTERNAL://advertised-address-placeholder:9094,CONTROLLER://advertised-controller-address-placeholder:9093{{- range .Values.kafka.listeners }}{{- include "kafka.validateListenerConfig" . }},{{ .name | upper }}://advertised-{{ .name | lower }}-address-placeholder{{- end }} + listener.security.protocol.map=INTERNAL:SASL_PLAINTEXT,CONTROLLER:SASL_PLAINTEXT{{- range .Values.kafka.listeners }},{{ .name | upper }}:{{ .protocol }}{{- end }} + + # KRaft process roles + process.roles=process-roles-placeholder + + #node.id= + controller.listener.names=CONTROLLER + {{$replicaCount := int .Values.kafka.replicas -}} + {{- if eq $replicaCount 2 -}} + {{- fail "Invalid number of Kraft replicas: must not be 2" -}} + {{- end -}} + controller.quorum.voters= {{- $result := "" }} + {{- range $i := until $replicaCount }} + {{- if and (ge $i 0) (lt $i 5) }} + {{- if $i }} + {{- $result = print $result "," }} + {{- end }} + {{- $result = print $result (printf "%d@%s-%s-%d.%s-%s:9093" $i $.Release.Name $.Values.kafka.name $i $.Release.Name $.Values.kafka.name ) }} + {{- end }} + {{- end }} + {{- $result }} + + # Kraft Controller listener SASL settings + sasl.mechanism.controller.protocol=PLAIN + listener.name.controller.sasl.enabled.mechanisms=PLAIN + listener.name.controller.plain.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="controller_user" 
password="controller-password-placeholder" user_controller_user="controller-password-placeholder"; + log.dirs={{ .Values.kafka.logDirs }} + sasl.enabled.mechanisms=PLAIN,SCRAM-SHA-256,SCRAM-SHA-512 + + # Interbroker configuration + inter.broker.listener.name=INTERNAL + sasl.mechanism.inter.broker.protocol=PLAIN + + # Listeners SASL JAAS configuration +{{- include "kafka.validateAdminExists" . }} +{{- range .Values.kafka.listeners }} + {{- include "kafka.validateAuthConfig" . }} + {{- if .sasl }} + {{- $adminConfig := "" }} + {{- if .sasl.admin }} + {{- $adminConfig = printf "user_%s=\"%s\"" .sasl.admin.username .sasl.admin.password }} + {{- end }} + listener.name.{{ .name | lower }}.plain.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required {{- if $adminConfig }} {{ $adminConfig }}{{- end }}{{- $users := .sasl.users | split "," }}{{- $passwords := .sasl.passwords | split "," }}{{- range $index, $user := $users }}{{- $password := index $passwords $index }} user_{{ $user }}="{{ $password }}"{{- end }}; + listener.name.{{ .name | lower }}.scram-sha-256.sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required {{- if $adminConfig }} {{ $adminConfig }}{{- end }}{{- range $index, $user := $users }}{{- $password := index $passwords $index }} user_{{ $user }}="{{ $password }}"{{- end }}; + listener.name.{{ .name | lower }}.scram-sha-512.sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required {{- if $adminConfig }} {{ $adminConfig }}{{- end }}{{- range $index, $user := $users }}{{- $password := index $passwords $index }} user_{{ $user }}="{{ $password }}"{{- end }}; + {{- end }} +{{- end }} + listener.name.internal.plain.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="inter_broker_user" password="interbroker-password-placeholder" user_inter_broker_user="interbroker-password-placeholder"{{- range .Values.kafka.listeners }}{{- if and .sasl .sasl.admin }} 
user_{{ .sasl.admin.username }}="{{ .sasl.admin.password }}"{{- break }}{{- end }}{{- end }}; + listener.name.internal.scram-sha-256.sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required username="inter_broker_user" password="interbroker-password-placeholder"; + listener.name.internal.scram-sha-512.sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required username="inter_broker_user" password="interbroker-password-placeholder"; + # End of SASL JAAS configuration + + + {{- if .Values.kafka.acl }} + + # Enable ACL + authorizer.class.name=org.apache.kafka.metadata.authorizer.StandardAuthorizer + super.users=User:controller_user;User:inter_broker_user{{- if .Values.kafka.acl.superUsers }};{{ .Values.kafka.acl.superUsers }}{{- end }} + allow.everyone.if.no.acl.found={{ .Values.kafka.acl.allowEveryoneIfNoAclFound | default "false" }} + # End of ACL configuration + {{- end }} + + # Reliability and recovery defaults. Kafka's properties file uses last-value-wins for + # duplicate keys, so anything the operator sets in `extra_configurations` below + # overrides these defaults. + default.replication.factor={{ include "kafka.defaultReplicationFactor" . }} + min.insync.replicas={{ include "kafka.minInsyncReplicas" . }} + # controlled.shutdown drives leadership transfer + log flush before exit so a SIGTERM + # to the broker (now PID 1 thanks to exec in the init script) results in a clean exit + # within terminationGracePeriodSeconds and no recovery on next start. + controlled.shutdown.enable=true + controlled.shutdown.max.retries=3 + controlled.shutdown.retry.backoff.ms=5000 + # Never elect an out-of-sync replica as leader: prevents data loss after broker + # restart. Pair with min.insync.replicas above. + unclean.leader.election.enable=false + # Recovery thread pool sized to the broker's CPU budget. Recovery only runs on a + # *dirty* shutdown; a clean controlled.shutdown skips it entirely. 
When recovery + # does happen, this controls per-data-dir parallelism. + num.recovery.threads.per.data.dir={{ include "kafka.recoveryThreads" . }} + # Faster follower replication so replicas catch up and rejoin the ISR quickly after + # a transient broker outage. + num.replica.fetchers=4 + + # Extra configurations (these override any defaults above) + {{- range $key, $value := .Values.kafka.extra_configurations }} + {{ $key }}={{ $value }} + {{- end }} diff --git a/kafka/versions/4.0.0/templates/secret-init.yaml b/kafka/versions/4.0.0/templates/secret-init.yaml new file mode 100644 index 00000000..ff7575ef --- /dev/null +++ b/kafka/versions/4.0.0/templates/secret-init.yaml @@ -0,0 +1,184 @@ +{{- include "kafka.validatedirectReplicaRoutingConfig" . }} +kind: secret +name: {{ include "kafka.name" . }}-init +type: opaque +data: + encoding: plain + payload: | + #!/bin/bash + + set -o errexit + set -o nounset + set -o pipefail + + WORKLOAD_NAME=$(echo $CPLN_WORKLOAD | sed 's|.*/workload/\([^/]*\)$|\1|') + + error(){ + local message="${1:?missing message}" + echo "ERROR: ${message}" + exit 1 + } + + retry_while() { + local -r cmd="${1:?cmd is missing}" + local -r retries="${2:-12}" + local -r sleep_time="${3:-5}" + local return_value=1 + + read -r -a command <<< "$cmd" + for ((i = 1 ; i <= retries ; i+=1 )); do + "${command[@]}" && return_value=0 && break + sleep "$sleep_time" + done + return $return_value + } + + replace_in_file() { + local filename="${1:?filename is required}" + local match_regex="${2:?match regex is required}" + local substitute_regex="${3:?substitute regex is required}" + local posix_regex=${4:-true} + + local result + + # We should avoid using 'sed in-place' substitutions + # 1) They are not compatible with files mounted from ConfigMap(s) + # 2) We found incompatibility issues with Debian10 and "in-place" substitutions + local -r del=$'\001' # Use a non-printable character as a 'sed' delimiter to avoid issues + if [[ $posix_regex = true ]]; then + 
result="$(sed -E "s${del}${match_regex}${del}${substitute_regex}${del}g" "$filename")" + else + result="$(sed "s${del}${match_regex}${del}${substitute_regex}${del}g" "$filename")" + fi + echo "$result" > "$filename" + } + + kafka_conf_set() { + local file="${1:?missing file}" + local key="${2:?missing key}" + local value="${3:?missing value}" + + # Check if the value was set before + if grep -q "^[#\\s]*$key\s*=.*" "$file"; then + # Update the existing key + replace_in_file "$file" "^[#\\s]*${key}\s*=.*" "${key}=${value}" false + else + # Add a new key + printf '\n%s=%s' "$key" "$value" >>"$file" + fi + } + + replace_placeholder() { + local placeholder="${1:?missing placeholder value}" + local password="${2:?missing password value}" + sed -i "s|$placeholder|$password|g" "$KAFKA_CONFIG_FILE" + } + + configure_external_access() { + # Configure external hostname + if [[ -f "/shared/external-host.txt" ]]; then + host=$(cat "/shared/external-host.txt") + elif [[ -n "${EXTERNAL_ACCESS_HOST:-}" ]]; then + host="$EXTERNAL_ACCESS_HOST" + elif [[ -n "${EXTERNAL_ACCESS_HOSTS_LIST:-}" ]]; then + read -r -a hosts <<<"$(tr ',' ' ' <<<"${EXTERNAL_ACCESS_HOSTS_LIST}")" + host="${hosts[$POD_ID]}" + elif [[ "$EXTERNAL_ACCESS_HOST_USE_PUBLIC_IP" =~ ^(yes|true)$ ]]; then + host=$(curl -s https://ipinfo.io/ip) + else + error "External access hostname not provided" + fi + + # Configure external port + if [[ -f "/shared/external-port.txt" ]]; then + port=$(cat "/shared/external-port.txt") + elif [[ -n "${EXTERNAL_ACCESS_PORT:-}" ]]; then + if [[ "${EXTERNAL_ACCESS_PORT_AUTOINCREMENT:-}" =~ ^(yes|true)$ ]]; then + port="$((EXTERNAL_ACCESS_PORT + POD_ID))" + else + port="$EXTERNAL_ACCESS_PORT" + fi + elif [[ -n "${EXTERNAL_ACCESS_PORTS_LIST:-}" ]]; then + read -r -a ports <<<"$(tr ',' ' ' <<<"${EXTERNAL_ACCESS_PORTS_LIST}")" + port="${ports[$POD_ID]}" + else + error "External access port not provided" + fi + # Configure Kafka advertised listeners + sed -i -E 
"s|^(advertised\.listeners=\S+)$|\1,EXTERNAL://${host}:${port}|" "$KAFKA_CONFIG_FILE" + } + + configure_kafka_sasl() { + + # Replace placeholders with passwords + replace_placeholder "interbroker-password-placeholder" "$KAFKA_INTER_BROKER_PASSWORD" + replace_placeholder "controller-password-placeholder" "$KAFKA_CONTROLLER_PASSWORD" + } + + export KAFKA_CONFIG_FILE=${KAFKA_CONFIG_FILE:-/mnt/shared/config/server.properties} + cp /configmaps/server.properties $KAFKA_CONFIG_FILE + + # Get pod ID and role, last and second last fields in the pod name respectively + POD_ID=$(echo "$POD_NAME" | rev | cut -d'-' -f 1 | rev) + export KAFKA_CFG_NODE_ID="$POD_ID" + LOCATION_NAME=$(echo "$CPLN_LOCATION" | sed 's|.*/location/\([^/]*\)$|\1|') + + # Configure POD Role + if [ "$POD_ID" -le 4 ]; then + replace_placeholder "process-roles-placeholder" "controller,broker" + replace_placeholder "advertised-controller-address-placeholder" "${POD_NAME}.${WORKLOAD_NAME}.${CPLN_GVC_ALIAS}.svc.cluster.local" + replace_placeholder "listeners-placeholder" "listeners=INTERNAL://:9094,CONTROLLER://:9093{{- range $i, $key := keys .Values.kafka.listeners | sortAlpha }} + {{- $listener := index $.Values.kafka.listeners $key -}} + {{- include "kafka.validateListenerConfig" $listener -}},{{ $listener.name | upper }}://:{{- if and $listener.directReplicaRouting $listener.directReplicaRouting.enabled }}{{ $listener.directReplicaRouting.containerPort }}{{- else if $listener.publicAddress }}300${POD_ID}{{- else }}{{ $listener.containerPort }}{{- end }}{{- end }}" + else + replace_placeholder "process-roles-placeholder" "broker" + replace_placeholder ",CONTROLLER://advertised-controller-address-placeholder:9093" "" + replace_placeholder "listeners-placeholder" "listeners=INTERNAL://:9094{{- range $i, $key := keys .Values.kafka.listeners | sortAlpha }} + {{- $listener := index $.Values.kafka.listeners $key -}} + {{- include "kafka.validateListenerConfig" $listener -}},{{ $listener.name | upper }}://:{{- if 
and $listener.directReplicaRouting $listener.directReplicaRouting.enabled }}{{ $listener.directReplicaRouting.containerPort }}{{- else if $listener.publicAddress }}300${POD_ID}{{- else }}{{ $listener.containerPort }}{{- end }}{{- end }}" + fi + + # Configure node.id and/or broker.id + ID=$((POD_ID + KAFKA_MIN_ID)) + kafka_conf_set "$KAFKA_CONFIG_FILE" "node.id" "$ID" + + replace_placeholder "advertised-address-placeholder" "${POD_NAME}.${WORKLOAD_NAME}.${CPLN_GVC_ALIAS}.svc.cluster.local" + + {{- range $key, $listener := .Values.kafka.listeners }} + {{- include "kafka.validateListenerConfig" . }} + {{- if and $listener.directReplicaRouting $listener.directReplicaRouting.enabled }} + replace_placeholder "advertised-{{ $listener.name | lower }}-address-placeholder" "${POD_NAME}-${LOCATION_NAME}.{{ $listener.directReplicaRouting.publicAddress }}:{{ $listener.directReplicaRouting.containerPort }}" + {{- else if $listener.publicAddress }} + replace_placeholder "advertised-{{ $listener.name | lower }}-address-placeholder" "{{ $listener.publicAddress }}:300${POD_ID}" + {{- else }} + replace_placeholder "advertised-{{ $listener.name | lower }}-address-placeholder" "${POD_NAME}.${WORKLOAD_NAME}.${CPLN_GVC_ALIAS}.svc.cluster.local:{{ .containerPort }}" + {{- end }} + {{- end }} + + if [[ "${EXTERNAL_ACCESS_ENABLED:-false}" =~ ^(yes|true)$ ]]; then + configure_external_access + fi + + configure_kafka_sasl + + # Initialize log directories for Apache Kafka + {{- $root := . 
-}} + {{- $logDirs := split "," $root.Values.kafka.logDirs }} + {{- $counter := 0 }} + {{- range $path := $logDirs }} + # Create log directory if it doesn't exist + mkdir -p {{ $path }} + + # Remove lost+found if it exists (common with mounted volumes) + rm -rf {{ $path }}/lost+found 2>/dev/null || true + + # Ensure proper ownership for Apache Kafka (runs as appuser) + chown -R $(id -u):$(id -g) {{ $path }} 2>/dev/null || true + {{- $counter = add $counter 1 }} + {{- end }} + + # Ensure data directory exists (Apache Kafka default) + mkdir -p /var/lib/kafka/data + chown -R $(id -u):$(id -g) /var/lib/kafka/data 2>/dev/null || true + + exec /etc/kafka/docker/run diff --git a/kafka/versions/4.0.0/templates/secret-secrets.yaml b/kafka/versions/4.0.0/templates/secret-secrets.yaml new file mode 100644 index 00000000..86060a2d --- /dev/null +++ b/kafka/versions/4.0.0/templates/secret-secrets.yaml @@ -0,0 +1,12 @@ +kind: secret +name: {{ include "kafka.name" . }}-secrets +type: dictionary +data: + kraft-cluster-id: {{ .Values.kafka.secrets.kraft_cluster_id }} + {{- range $key, $listener := .Values.kafka.listeners }} + {{- if and $listener.sasl $listener.sasl.admin }} + {{ $listener.name | lower }}-admin-password: {{ $listener.sasl.admin.password }} + {{- end }} + {{- end }} + inter-broker-password: {{ .Values.kafka.secrets.inter_broker_password }} + controller-password: {{ .Values.kafka.secrets.controller_password }} \ No newline at end of file diff --git a/kafka/versions/4.0.0/templates/volumesets.yaml b/kafka/versions/4.0.0/templates/volumesets.yaml new file mode 100644 index 00000000..dba17d86 --- /dev/null +++ b/kafka/versions/4.0.0/templates/volumesets.yaml @@ -0,0 +1,31 @@ +{{- $root := . 
-}} +{{- $logDirs := split "," $root.Values.kafka.logDirs }} +{{- $counter := 0 }} +{{- range $index, $path := $logDirs }} +kind: volumeset +name: {{ include "kafka.name" $root }}-logs-{{ $counter }} +description: {{ include "kafka.name" $root }} logs {{ $counter }} +gvc: {{ $root.Values.global.cpln.gvc }} +spec: + initialCapacity: {{ $root.Values.kafka.volumes.logs.initialCapacity }} + performanceClass: {{ $root.Values.kafka.volumes.logs.performanceClass }} + fileSystemType: {{ $root.Values.kafka.volumes.logs.fileSystemType }} + {{- if and $root.Values.kafka.volumes.logs.customEncryption $root.Values.kafka.volumes.logs.customEncryption.enabled }} + customEncryption: + regions: + {{ $root.Values.kafka.volumes.logs.customEncryption.region }}: + keyId: '{{ $root.Values.kafka.volumes.logs.customEncryption.keyId }}' + {{- end }} + autoscaling: + maxCapacity: {{ $root.Values.kafka.volumes.logs.autoscaling.maxCapacity }} + minFreePercentage: {{ $root.Values.kafka.volumes.logs.autoscaling.minFreePercentage }} + scalingFactor: {{ $root.Values.kafka.volumes.logs.autoscaling.scalingFactor }} +{{- if $root.Values.kafka.volumes.logs.snapshots }} + snapshots: + createFinalSnapshot: {{ $root.Values.kafka.volumes.logs.snapshots.createFinalSnapshot }} + retentionDuration: {{ $root.Values.kafka.volumes.logs.snapshots.retentionDuration }} + schedule: {{ $root.Values.kafka.volumes.logs.snapshots.schedule }} +{{- end }} +--- +{{- $counter = add $counter 1 }} +{{- end }} \ No newline at end of file diff --git a/kafka/versions/4.0.0/templates/workload-kafka-client.yaml b/kafka/versions/4.0.0/templates/workload-kafka-client.yaml new file mode 100644 index 00000000..08a0a78a --- /dev/null +++ b/kafka/versions/4.0.0/templates/workload-kafka-client.yaml @@ -0,0 +1,46 @@ +{{- if .Values.kafka_client }} +kind: workload +name: {{ include "kafka.name" . 
}}-{{ .Values.kafka_client.name }} +gvc: {{ .Values.global.cpln.gvc }} +spec: + type: standard + containers: + - name: kafka + args: + - '-c' + - sleep infinity + command: /bin/bash + cpu: {{ .Values.kafka_client.cpu }} + image: {{ .Values.kafka_client.image }} + inheritEnv: false + memory: {{ .Values.kafka_client.memory }} + ports: + - number: 9092 + protocol: tcp + defaultOptions: + autoscaling: + maxConcurrency: 0 + maxScale: 3 + metric: cpu + minScale: 1 + scaleToZeroDelay: 300 + target: 100 + capacityAI: false + debug: false + suspend: false + timeoutSeconds: 5 +{{- if .Values.kafka_client.firewall }} + firewallConfig: + {{- if or (hasKey .Values.kafka_client.firewall "external_inboundAllowCIDR") (hasKey .Values.kafka_client.firewall "external_outboundAllowCIDR") }} + external: + inboundAllowCIDR: {{- if .Values.kafka_client.firewall.external_inboundAllowCIDR }}{{ .Values.kafka_client.firewall.external_inboundAllowCIDR | splitList "," | toYaml | nindent 8 }}{{- else }} []{{- end }} + outboundAllowCIDR: {{- if .Values.kafka_client.firewall.external_outboundAllowCIDR }}{{ .Values.kafka_client.firewall.external_outboundAllowCIDR | splitList "," | toYaml | nindent 8 }}{{- else }} []{{- end }} + {{- end }} + {{- if hasKey .Values.kafka_client.firewall "internal_inboundAllowType" }} + internal: + inboundAllowType: {{ default "none" .Values.kafka_client.firewall.internal_inboundAllowType }} + {{- end }} +{{- end }} + localOptions: [] + supportDynamicTags: false +{{- end }} \ No newline at end of file diff --git a/kafka/versions/4.0.0/templates/workload-kafka-cluster.yaml b/kafka/versions/4.0.0/templates/workload-kafka-cluster.yaml new file mode 100644 index 00000000..c35064e2 --- /dev/null +++ b/kafka/versions/4.0.0/templates/workload-kafka-cluster.yaml @@ -0,0 +1,353 @@ +{{- include "kafka.validateKafkaImage" . -}} +{{- include "kafka.validatedirectReplicaRoutingConfig" . -}} +{{- if .Values.jmx_exporter }} +kind: secret +name: {{ include "kafka.name" . 
}}-jmx-exporter-conf +type: opaque +data: + encoding: plain + payload: |- + {{ .Values.jmx_exporter.config | toYaml | nindent 4 }} +--- +{{- end }} +kind: workload +name: {{ include "kafka.clusterName" . }} +gvc: {{ .Values.global.cpln.gvc }} +tags: + {{- if .Values.kafka.deletionProtection }} + cpln/protected: true + {{- end }} + # KRaft brokers must resolve `.:9093` to find their peers and form + # the controller quorum on cold start, before any replica is Ready. This tag flips + # publishNotReadyAddresses=true on the headless Service so EndpointSlice exposes + # not-yet-Ready pods for peer DNS. Without it, suspend/unsuspend (or any cold + # start where the kafka-orchestrator readiness probe gates the workload-Ready + # signal) deadlocks: pods can't form a quorum because they can't resolve each + # other, and they can't become Ready until the quorum forms. + cpln/publishNotReadyAddresses: "true" + {{- include "kafka.tags" . | nindent 2 }} +spec: + type: stateful + containers: + - name: kafka + args: + - '-c' + - >- + cp /scripts/kafka-init.sh /tmp/ && chmod +x /tmp/kafka-init.sh && + exec /tmp/kafka-init.sh + command: /bin/bash + # Override Control Plane's default preStop hook (sleep for half the grace + # period to drain envoy connections). Kafka has no L7 connection drain to + # wait for — its clients reconnect to other brokers via Metadata refresh, + # and inter-broker traffic is handled by controlled.shutdown's leadership + # transfer. Sleeping here just delays SIGTERM and shortens the window + # available for the actual graceful shutdown. 
+ lifecycle: + preStop: + exec: + command: ['true'] + cpu: '{{ .Values.kafka.cpu }}' + {{- if .Values.kafka.minCpu }} + minCpu: '{{ .Values.kafka.minCpu }}' + {{- end }} + env: + {{- if .Values.kafka.env }} +{{ toYaml .Values.kafka.env | indent 8 }} + {{- end }} + {{- if .Values.jmx_exporter }} + - name: JMX_PORT + value: {{ .Values.jmx_exporter.kafkaJmxPort | quote }} + {{- end }} + - name: KAFKA_CONTROLLER_PASSWORD + value: 'cpln://secret/{{ include "kafka.name" . }}-secrets.controller-password' + - name: KAFKA_CONTROLLER_USER + value: controller_user + - name: KAFKA_HEAP_OPTS + value: "{{ .Values.kafka.overrideHeapOpts | default (include "kafka.heap.opts" .) | trim }}" + - name: KAFKA_INTER_BROKER_PASSWORD + value: 'cpln://secret/{{ include "kafka.name" . }}-secrets.inter-broker-password' + - name: KAFKA_INTER_BROKER_USER + value: inter_broker_user + - name: KAFKA_KRAFT_BOOTSTRAP_SCRAM_USERS + value: 'true' + - name: CLUSTER_ID + value: 'cpln://secret/{{ include "kafka.name" . }}-secrets.kraft-cluster-id' + - name: KAFKA_MIN_ID + value: '0' + image: {{ .Values.kafka.image }} + inheritEnv: false + livenessProbe: + failureThreshold: 5 + initialDelaySeconds: 60 + periodSeconds: 15 + successThreshold: 1 + tcpSocket: + port: 9093 + timeoutSeconds: 15 + memory: {{ .Values.kafka.memory }} + {{- if .Values.kafka.minMemory }} + minMemory: {{ .Values.kafka.minMemory }} + {{- end }} + ports: +{{ range $key, $listener := .Values.kafka.listeners }} +{{- include "kafka.validateListenerConfig" $listener }} +{{- if and $listener.directReplicaRouting $listener.directReplicaRouting.enabled }} + - number: {{ $listener.directReplicaRouting.containerPort }} + protocol: tcp +{{- else if $listener.publicAddress }} + {{- $startPort := 3000 }} + {{- $replicas := $.Values.kafka.replicas | int }} + {{- range $replicaIndex := until $replicas }} + - number: {{ add $startPort $replicaIndex }} + protocol: tcp + {{- end }} +{{- else }} + - number: {{ $listener.containerPort }} + protocol: tcp 
+{{- end }} +{{- end }} + - number: 9093 + protocol: tcp + - number: 9094 + protocol: tcp +{{- if .Values.jmx_exporter }} + - number: {{ .Values.jmx_exporter.kafkaJmxPort }} + protocol: tcp +{{- end }} + readinessProbe: + failureThreshold: 20 + initialDelaySeconds: 20 + periodSeconds: 10 + successThreshold: 6 + tcpSocket: + port: 9093 + timeoutSeconds: 5 + volumes: + {{- $root := . -}} + {{- $logDirs := split "," $root.Values.kafka.logDirs }} + {{- $counter := 0 }} + {{- range $path := $logDirs }} + - path: {{ $path | trim }} + recoveryPolicy: retain + uri: 'cpln://volumeset/{{ include "kafka.name" $root }}-logs-{{ $counter }}' + {{- $counter = add $counter 1 }} + {{- end }} + - path: /configmaps/server.properties + recoveryPolicy: retain + uri: 'cpln://secret/{{ include "kafka.name" $root }}-controller-configuration' + - path: /scripts/kafka-init.sh + recoveryPolicy: retain + uri: 'cpln://secret/{{ include "kafka.name" $root }}-init' +{{- if .Values.kafka_orchestrator }} +{{- $orchestratorListenerName := .Values.kafka_orchestrator.listener }} +{{- if not (hasKey .Values.kafka.listeners $orchestratorListenerName) }} + {{- fail (printf "Error: Listener '%s' specified in kafka_orchestrator.listener does not exist" $orchestratorListenerName) }} +{{- end }} +{{- $orchestratorListener := index .Values.kafka.listeners $orchestratorListenerName }} + - name: kafka-orchestrator + image: {{ .Values.kafka_orchestrator.image }} + inheritEnv: false + # Suppress Control Plane's default preStop sleep (half of grace period). + # The orchestrator is a read-only health/metrics HTTP server; nothing to + # drain. Letting it exit immediately keeps pod termination time bounded + # by the kafka container's controlled.shutdown rather than 300s of idle. 
+ lifecycle: + preStop: + exec: + command: ['true'] + cpu: {{ .Values.kafka_orchestrator.cpu }} + {{- if .Values.kafka_orchestrator.minCpu }} + minCpu: '{{ .Values.kafka_orchestrator.minCpu }}' + {{- end }} + memory: {{ .Values.kafka_orchestrator.memory }} + {{- if .Values.kafka_orchestrator.minMemory }} + minMemory: {{ .Values.kafka_orchestrator.minMemory }} + {{- end }} + ports: + - number: 8080 + protocol: http + metrics: + path: /metrics + port: 8080 + readinessProbe: + failureThreshold: 20 + initialDelaySeconds: 20 + periodSeconds: 10 + successThreshold: 1 + httpGet: + path: /health/ready + port: 8080 + timeoutSeconds: 5 + env: + {{- if .Values.kafka_orchestrator.env }} +{{ toYaml .Values.kafka_orchestrator.env | indent 8 }} + {{- end }} + - name: REPLICA_COUNT + value: '{{ .Values.kafka.replicas }}' + - name: KAFKA_PORT + value: '{{ $orchestratorListener.containerPort }}' + - name: LOG_LEVEL + value: '{{ .Values.kafka_orchestrator.logLevel | default "info" }}' + - name: CHECK_TIMEOUT + value: '{{ .Values.kafka_orchestrator.checkTimeout | default "10s" }}' + {{- if eq $orchestratorListener.protocol "SASL_PLAINTEXT" }} + - name: SASL_ENABLED + value: 'true' + - name: SASL_MECHANISM + value: PLAIN + - name: SASL_USERNAME + value: '{{ $orchestratorListener.sasl.admin.username }}' + - name: SASL_PASSWORD + value: 'cpln://secret/{{ include "kafka.name" . 
}}-secrets.{{ $orchestratorListener.name | lower }}-admin-password' + {{- end }} +{{- end }} +{{- if .Values.kafka_exporter }} + - name: kafka-exporter + args: + - '-c' + - >- +{{- $listenerName := .Values.kafka_exporter.listener }} +{{- if not (hasKey .Values.kafka.listeners $listenerName) }} + {{- fail (printf "Error: Listener '%s' specified in kafka_exporter.listener does not exist" $listenerName) }} +{{- end }} +{{- $listener := index .Values.kafka.listeners $listenerName }} +{{- $port := 3000 }} +{{- if $listener.containerPort }} + {{- $port = $listener.containerPort }} +{{- else }} + {{- $port = "$(echo $((3000 + $POD_ID)))" }} +{{- end }} +{{- if eq $listener.protocol "SASL_PLAINTEXT" }} + {{- if not (and $listener.sasl $listener.sasl.admin) }} + {{- fail (printf "Error: SASL_PLAINTEXT listener '%s' must have sasl.admin configured for kafka_exporter" $listenerName) }} + {{- end }} + sleep 60 && POD_ID=$(echo "$POD_NAME" | rev | cut -d'-' -f 1 | rev) && kafka_exporter --kafka.server=localhost:{{ if not $listener.containerPort }}$(echo $((3000 + $POD_ID))){{ else }}{{ $port }}{{ end }} + --sasl.enabled --sasl.username={{ $listener.sasl.admin.username }} --sasl.mechanism=plain + --sasl.password=${KAFKA_CLIENT_PASSWORDS} --web.listen-address=:9308 +{{- else if eq $listener.protocol "PLAINTEXT" }} + sleep 60 && POD_ID=$(echo "$POD_NAME" | rev | cut -d'-' -f 1 | rev) && kafka_exporter --kafka.server=localhost:{{ if not $listener.containerPort }}$(echo $((3000 + $POD_ID))){{ else }}{{ $port }}{{ end }} + --no-sasl.handshake --web.listen-address=:9308 +{{- else }} + sleep 60 && POD_ID=$(echo "$POD_NAME" | rev | cut -d'-' -f 1 | rev) && kafka_exporter --kafka.server=localhost:{{ if not $listener.containerPort }}$(echo $((3000 + $POD_ID))){{ else }}{{ $port }}{{ end }} + --no-sasl.handshake --web.listen-address=:9308 +{{- end }} + command: /bin/sh + cpu: {{ .Values.kafka_exporter.cpu }} + # Suppress Control Plane's default preStop sleep — prometheus-scrape + # sidecar 
with no L7 connections to drain. See kafka-orchestrator block + # for full reasoning. + lifecycle: + preStop: + exec: + command: ['true'] + metrics: + path: /metrics + port: 9308 + dropMetrics: {{- if .Values.kafka_exporter.dropMetrics }}{{ .Values.kafka_exporter.dropMetrics | toYaml | nindent 8 }}{{- else }} []{{- end }} + env: +{{- if .Values.kafka_exporter.env }} +{{ toYaml .Values.kafka_exporter.env | indent 8 }} +{{- end }} +{{- $listenerName := .Values.kafka_exporter.listener }} +{{- if not (hasKey .Values.kafka.listeners $listenerName) }} + {{- fail (printf "Error: Listener '%s' specified in kafka_exporter.listener does not exist" $listenerName) }} +{{- end }} +{{- $listener := index .Values.kafka.listeners $listenerName }} +{{- if eq $listener.protocol "SASL_PLAINTEXT" }} + - name: KAFKA_CLIENT_PASSWORDS + value: 'cpln://secret/{{ include "kafka.name" $ }}-secrets.{{ $listener.name | lower }}-admin-password' +{{- end }} + image: {{ .Values.kafka_exporter.image }} + inheritEnv: false + memory: {{ .Values.kafka_exporter.memory }} + ports: + - number: 9308 + protocol: tcp +{{- end }} +{{- if .Values.jmx_exporter }} + - name: jmx-exporter + command: java + args: + - -XX:MaxRAMPercentage=100 + - -XshowSettings:vm + - -jar + - jmx_prometheus_standalone.jar + - {{ .Values.jmx_exporter.exporterPort | quote }} + - /etc/jmx-kafka/jmx-kafka-prometheus.yml + cpu: {{ .Values.jmx_exporter.cpu }} + {{- if .Values.jmx_exporter.minCpu }} + minCpu: '{{ .Values.jmx_exporter.minCpu }}' + {{- end }} + # Suppress Control Plane's default preStop sleep — prometheus-scrape + # sidecar with no L7 connections to drain. See kafka-orchestrator block + # for full reasoning. 
+ lifecycle: + preStop: + exec: + command: ['true'] + metrics: + path: /metrics + port: {{ .Values.jmx_exporter.exporterPort }} + dropMetrics: {{- if .Values.jmx_exporter.dropMetrics }}{{ .Values.jmx_exporter.dropMetrics | toYaml | nindent 8 }}{{- else }} []{{- end }} + image: {{ .Values.jmx_exporter.image }} + inheritEnv: false + memory: {{ .Values.jmx_exporter.memory }} + {{- if .Values.jmx_exporter.minMemory }} + minMemory: {{ .Values.jmx_exporter.minMemory }} + {{- end }} + ports: + - number: {{ .Values.jmx_exporter.exporterPort }} + protocol: tcp + volumes: + - path: /etc/jmx-kafka/jmx-kafka-prometheus.yml + recoveryPolicy: retain + uri: cpln://secret/{{ include "kafka.name" . }}-jmx-exporter-conf +{{- end }} + defaultOptions: + autoscaling: + maxConcurrency: 0 + maxScale: {{ .Values.kafka.replicas }} + metric: disabled + minScale: {{ .Values.kafka.replicas }} + scaleToZeroDelay: 300 + target: 95 + capacityAI: false + debug: false + {{- if .Values.kafka.multiZone }} + multiZone: + enabled: true + {{- else }} + multiZone: + enabled: false + {{- end }} + suspend: {{ .Values.kafka.suspend }} + timeoutSeconds: {{ .Values.kafka.terminationGracePeriodSeconds | default 600 }} +{{- if .Values.kafka.firewall }} + firewallConfig: + {{- if or (hasKey .Values.kafka.firewall "external_inboundAllowCIDR") (hasKey .Values.kafka.firewall "external_outboundAllowCIDR") }} + external: + inboundAllowCIDR: {{- if .Values.kafka.firewall.external_inboundAllowCIDR }}{{ .Values.kafka.firewall.external_inboundAllowCIDR | splitList "," | toYaml | nindent 8 }}{{- else }} []{{- end }} + outboundAllowCIDR: {{- if .Values.kafka.firewall.external_outboundAllowCIDR }}{{ .Values.kafka.firewall.external_outboundAllowCIDR | splitList "," | toYaml | nindent 8 }}{{- else }} []{{- end }} + {{- end }} + {{- if hasKey .Values.kafka.firewall "internal_inboundAllowType" }} + internal: + inboundAllowType: {{ default "[]" .Values.kafka.firewall.internal_inboundAllowType }} + {{- if 
.Values.kafka.firewall.inboundAllowWorkload }} + inboundAllowWorkload: {{ .Values.kafka.firewall.inboundAllowWorkload | toYaml | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} + loadBalancer: + direct: + enabled: false + ports: [] + replicaDirect: true + identityLink: //identity/{{ include "kafka.name" . }} + rolloutOptions: + maxSurgeReplicas: 25% + maxUnavailableReplicas: '1' + minReadySeconds: {{ .Values.kafka.minReadySeconds }} + scalingPolicy: Parallel + securityOptions: + filesystemGroupId: 1001 + supportDynamicTags: false \ No newline at end of file diff --git a/kafka/versions/4.0.0/templates/workload-kafka-ui.yaml b/kafka/versions/4.0.0/templates/workload-kafka-ui.yaml new file mode 100644 index 00000000..dc2726df --- /dev/null +++ b/kafka/versions/4.0.0/templates/workload-kafka-ui.yaml @@ -0,0 +1,66 @@ +{{- if and .Values.kafka_ui .Values.kafka_ui.enabled }} +kind: workload +name: {{ include "kafka.name" . }}-{{ .Values.kafka_ui.name }} +description: kafka-ui +gvc: {{ .Values.global.cpln.gvc }} +spec: + type: standard + containers: + - name: kafka-ui + cpu: {{ .Values.kafka_ui.cpu }} + env: + - name: KAFKA_CLUSTERS_0_BOOTSTRAPSERVERS + value: "{{- $replicas := int .Values.kafka.replicas -}}{{- $bootstrapServers := list -}}{{- range $i := until $replicas -}}{{- if $i -}},{{- end -}}{{- printf "%s-%s-%d.%s-%s:9092" $.Release.Name $.Values.kafka.name $i $.Release.Name $.Values.kafka.name -}}{{- end }}" + - name: KAFKA_CLUSTERS_0_NAME + value: {{ include "kafka.name" . 
}} + - name: KAFKA_CLUSTERS_0_PROPERTIES_SSL_ENDPOINT_IDENTIFICATION_ALGORITHM + value: '' + - name: LOGGING_LEVEL_ROOT + value: INFO +{{- $listenerName := .Values.kafka_ui.listener }} +{{- $listener := index .Values.kafka.listeners $listenerName }} +{{- if eq $listener.protocol "SASL_PLAINTEXT" }} + {{- if not (and $listener.sasl $listener.sasl.admin) }} + {{- fail (printf "Error: SASL_PLAINTEXT listener '%s' must have sasl.admin configured for kafka_ui" $listenerName) }} + {{- end }} + - name: KAFKA_CLUSTERS_0_PROPERTIES_SECURITY_PROTOCOL + value: {{ $listener.protocol }} + - name: KAFKA_CLUSTERS_0_PROPERTIES_SASL_MECHANISM + value: PLAIN + - name: KAFKA_CLUSTERS_0_PROPERTIES_SASL_JAAS_CONFIG + value: >- + org.apache.kafka.common.security.plain.PlainLoginModule required username="{{ $listener.sasl.admin.username }}" password="{{ $listener.sasl.admin.password }}"; +{{- end }} + image: 'provectuslabs/kafka-ui:latest' + inheritEnv: false + memory: {{ .Values.kafka_ui.memory }} + ports: + - number: 8080 + protocol: http + defaultOptions: + autoscaling: + maxConcurrency: 0 + maxScale: 1 + metric: cpu + minScale: 1 + scaleToZeroDelay: 300 + target: 100 + capacityAI: false + debug: false + suspend: false + timeoutSeconds: 5 +{{- if .Values.kafka_ui.firewall }} + firewallConfig: + {{- if or (hasKey .Values.kafka_ui.firewall "external_inboundAllowCIDR") (hasKey .Values.kafka_ui.firewall "external_outboundAllowCIDR") }} + external: + inboundAllowCIDR: {{- if .Values.kafka_ui.firewall.external_inboundAllowCIDR }}{{ .Values.kafka_ui.firewall.external_inboundAllowCIDR | splitList "," | toYaml | nindent 8 }}{{- else }} []{{- end }} + outboundAllowCIDR: {{- if .Values.kafka_ui.firewall.external_outboundAllowCIDR }}{{ .Values.kafka_ui.firewall.external_outboundAllowCIDR | splitList "," | toYaml | nindent 8 }}{{- else }} []{{- end }} + {{- end }} + {{- if hasKey .Values.kafka_ui.firewall "internal_inboundAllowType" }} + internal: + inboundAllowType: {{ default "[]" 
.Values.kafka_ui.firewall.internal_inboundAllowType }} + {{- end }} +{{- end }} + localOptions: [] + supportDynamicTags: false +{{- end }} \ No newline at end of file diff --git a/kafka/versions/4.0.0/values.yaml b/kafka/versions/4.0.0/values.yaml new file mode 100644 index 00000000..bbd3a797 --- /dev/null +++ b/kafka/versions/4.0.0/values.yaml @@ -0,0 +1,503 @@ +kafka: + name: cluster + image: apache/kafka:3.9.1 + suspend: false + deletionProtection: false + replicas: 3 # must not be 2 + minReadySeconds: 0 + debug: false + multiZone: false # If true: It's recommended to enable multi-zone on the Dedicated Load Balancer setting on GVC to reduce the cross-zone traffic + logDirs: /opt/kafka/logs-0,/opt/kafka/logs-1 + env: [] # If you need to set environment variables, add them here + # How long Control Plane waits for a graceful broker shutdown (controlled.shutdown completes, + # leadership transfers, log flushes finish) before SIGKILL. Brokers carrying large amounts + # of data benefit from a long grace window — set this comfortably above your largest broker's + # observed shutdown time. Default 600s (10m); raise for very large clusters. 
+ terminationGracePeriodSeconds: 600 + volumes: + logs: + initialCapacity: 10 # In GB + performanceClass: general-purpose-ssd # general-purpose-ssd / high-throughput-ssd (Min 1000GB) + fileSystemType: ext4 # ext4 / xfs + snapshots: + createFinalSnapshot: true + retentionDuration: 7d + schedule: 0 0 * * * # UTC + autoscaling: + maxCapacity: 1000 # In GB + minFreePercentage: 20 + scalingFactor: 1.2 + # customEncryption: + # enabled: false + # region: aws-us-east-2 # Replace with the appropriate region + # keyId: arn:aws:kms:us-east-2:1234567890:key/d411f35a-1d31-4515-9934-4f193e042d80 # Replace with your AWS KMS key ARN + cpu: 1000m # For millicores use 'm', like 500m + memory: 2000Mi # Gi / Mi + minCpu: 250m # For millicores use 'm', like 500m + minMemory: 2000Mi # Gi / Mi + # overrideHeapOpts: "-Xmx1024m -Xms1024m" # Override the default heap options + # To disable all traffic, comment out the corresponding rule. Docs: https://docs.controlplane.com/concepts/security#firewall + firewall: + internal_inboundAllowType: "same-gvc" # Options: same-org / same-gvc (Recommended) + # external_inboundAllowCIDR: 0.0.0.0/0 # Provide a comma-separated list + # # You can specify additional workloads with either same-gvc or workload-list: + # inboundAllowWorkload: + # - //gvc/main-kafka/workload/main-kafka-kafbat-ui + # - //gvc/client-gvc/workload/client + # external_outboundAllowCIDR: "111.222.333.444/16,111.222.444.333/32" # Provide a comma-separated list + listeners: + # @param listeners.client.name Name for the Kafka client listener + # @param listeners.client.containerPort Port for the Kafka client listener. Except ports 9091,9093,9094 + # @param listeners.client.protocol Security protocol for the Kafka client listener. Allowed values are 'PLAINTEXT', 'SASL_PLAINTEXT' + # @param listeners.client.publicAddress DNS address for public access to brokers. 
Must be the same as kafka.replicas + client: + protocol: SASL_PLAINTEXT + name: CLIENT + containerPort: 9092 # If publicAddress is enabled, Client automatically set to port range 3000-3004 + sasl: + ## @param listeners.client.sasl.users Comma-separated list of usernames for client communications when SASL is enabled + ## @param listeners.client.passwords Comma-separated list of passwords for client communications when SASL is enabled, must match the number of client.sasl.users + ## @param listeners.client.admin Admin username and password for client communications when SASL is enabled + admin: + username: admin + password: "your-admin-password" + users: "user" + passwords: "your-user-password" + # public: + # protocol: SASL_PLAINTEXT # TLS enforced, Kafka clients should use SASL_SSL to access 'publicAddress' if provided + # name: PUBLIC + # # containerPort: 9095 # Uncomment only when no directReplicaRouting or publicAddress is provided + # # Use directReplicaRouting for automatic public replica endpoints with DNS01 cert challenge + # directReplicaRouting: + # enabled: true + # containerPort: 9095 # ports 9093 and 9094 are reserved for controller and inter-broker communication + # publicAddress: kafka.example.com # Make sure Dedicated Load Balancer is enabled on the GVC + # sasl: + # ## @param listeners.client.sasl.users Comma-separated list of usernames for client communications when SASL is enabled + # ## @param listeners.client.brokersAddresses Comma-separated list of passwords for client communications when SASL is enabled, must match the number of client.sasl.users + # ## @param listeners.client.admin Admin username and password for client communications when SASL is enabled + # # admin: + # # username: admin + # # password: tgtgtg + # users: "public-user" + # passwords: "your-public-user-password" + acl: + superUsers: "User:admin" # User:admin;User:connectors (for multiple users) + allowEveryoneIfNoAclFound: false + secrets: + kraft_cluster_id: 
your-kraft-cluster-id # Example:bkdDtS1Rsf536si7BGM0JY + inter_broker_password: your-inter-broker-password # Example: HfcgCHp32e + controller_password: your-controller-password # Example: ayd8iJwqXe + extra_configurations: + # default.replication.factor and min.insync.replicas are auto-derived from kafka.replicas + # in the chart (defaults to min(3, replicas) and rf-1 respectively). Set them here only + # if you want to override. + auto.create.topics.enable: true # auto.create.topics.enable + log.retention.hours: 168 # The number of hours to keep a log file before deleting it (in hours) + +# Sidecar that exposes /health/ready, /health/live, and Prometheus metrics for the Kafka brokers. +# Set to null (or comment the block out) to disable the sidecar. +kafka_orchestrator: + image: ghcr.io/controlplane-com/kafka-orchestrator:v0.1.0 + cpu: 100m + memory: 128Mi + minCpu: 50m + minMemory: 64Mi + listener: client # name of an entry under kafka.listeners; the sidecar uses this listener's port and SASL config for health checks + logLevel: info # debug / info / warn / error + checkTimeout: 10s # per-check timeout used by the orchestrator's franz-go client + env: [] # extra env vars (e.g. 
BROKER_ID / BOOTSTRAP_SERVERS overrides for non-standard setups) + +kafka_exporter: + name: exporter + image: danielqsj/kafka-exporter:v1.9.0 + debug: false + cpu: 50m + memory: 128Mi + listener: client + env: [] # If you need to set environment variables, add them here + dropMetrics: [] # e.g., ["kafka_consumergroup.*", "^kafka_topic_partition_current_offset"] + +jmx_exporter: + name: jmx-exporter + image: ghcr.io/controlplane-com/bitnami/jmx-exporter + kafkaJmxPort: 5557 # Ensure this port matches the port in the jmxUrl below + exporterPort: 5556 + debug: false + cpu: 250m + memory: 256Mi + minCpu: 80m + minMemory: 125Mi + listener: client + dropMetrics: [] # e.g., ["kafka_consumergroup.*", "^kafka_topic_partition_current_offset"] + config: + jmxUrl: service:jmx:rmi:///jndi/rmi://127.0.0.1:5557/jmxrmi + lowercaseOutputName: true + lowercaseOutputLabelNames: true + ssl: false + whitelistObjectNames: + - kafka.controller:* + - kafka.server:* + - java.lang:* + - kafka.network:* + - kafka.log:* + - kafka.producer:* + - kafka.consumer:* + rules: + - labels: + request: "$3" + name: kafka_request_count + pattern: kafka.network<>(Count) + - labels: + request: "$3" + stat: "$4" + name: kafka_request_metrics_totaltimems + pattern: kafka.network<>(.+) + - labels: + request: "$3" + component: "$2" + stat: "$4" + name: kafka_request_latency_ms + pattern: kafka.network<>(.+) + - labels: + client_type: "$3" + metric: "$2" + stat: "$4" + name: kafka_client_metrics + pattern: kafka.network<>(.+) + - labels: + client_id: "$1" + metric: "$2" + name: kafka_consumer_metrics + pattern: kafka.consumer<>(.+) + - labels: + client_id: "$1" + metric: "$2" + name: kafka_producer_metrics + pattern: kafka.producer<>(.+) + - name: kafka_server_$1_$2_$3 + pattern: kafka.server<>(Count|Value) + - name: java_lang_$1_$2 + pattern: java.lang<>(.+) + +kafbat_ui: + enabled: true + deletionProtection: false + name: kafbat-ui + image: ghcr.io/kafbat/kafka-ui + cpu: 300m + memory: 1000Mi + minCpu: 100m 
+ minMemory: 400Mi + replicas: 1 + timeoutSeconds: 30 + configuration_secret: kafka-kafbat-ui-config # Pre-create a secret with the configuration; Example in README + # Domain name for the UI. + # Make sure the required DNS records are created in your DNS server + # https://docs.controlplane.com/guides/configure-domain#subdomain-e-g-sample-domain-com-cname-mode-path-based-routing + # domain: kafbat-ui.example.com # Domain name for the UI. + # To disable all traffic, comment out the corresponding rule. Docs: https://docs.controlplane.com/concepts/security#firewall + firewall: + # internal_inboundAllowType: "same-gvc" # Options: same-org / same-gvc + external_inboundAllowCIDR: "0.0.0.0/0" # Provide a comma-separated list + external_outboundAllowCIDR: "0.0.0.0/0" # Provide a comma-separated list + +# kafka_connectors: +# - name: cluster +# image: apache/kafka:3.9.1 +# multiZone: true +# cpu: 400m +# memory: 1500Mi +# minCpu: 100m +# minMemory: 375Mi +# plugins_folder: /opt/kafka/plugins +# timeoutSeconds: 15 +# replicas: 1 +# verbose: false +# extraVolumes: [] +# # Volume configuration for Kafka Connect (Optional - defaults shown below) +# volumes: +# initialCapacity: 10 # In GB (Default: 10) +# performanceClass: general-purpose-ssd # general-purpose-ssd / high-throughput-ssd (Min 1000GB) (Default: general-purpose-ssd) +# fileSystemType: ext4 # ext4 / xfs (Default: ext4) +# snapshots: +# createFinalSnapshot: true # Default: true +# retentionDuration: 7d # Default: 7d +# # schedule: 0 0 * * * # UTC (Optional) +# # customEncryption: +# # enabled: true # Encrypting is only possible for new volumes. Existing volumes cannot be re-encrypted after creation. 
+# # region: aws-us-east-2 # Replace with the appropriate region +# # keyId: arn:aws:kms:us-east-2:1234567890:key/fewf2f43-1d31-2332-9934-efhg4334gfe # Replace with your AWS KMS key ARN +# env: +# - name: KAFKA_HEAP_OPTS +# value: '-Xms900m -Xmx900m' # set to 50%-75% of the memory +# # To disable all traffic, comment out the corresponding rule. Docs: https://docs.controlplane.com/concepts/security#firewall +# firewall: +# external_inboundAllowCIDR: 0.0.0.0/0 # Provide a comma-separated list +# internal_inboundAllowType: "same-gvc" # Options: same-org / same-gvc +# # You can specify additional workloads with either same-gvc or workload-list: +# inboundAllowWorkload: +# - //gvc/main-kafka/workload/main-kafka-kafbat-ui +# - //gvc/client-gvc/workload/client +# external_outboundAllowCIDR: "0.0.0.0/0" # Provide a comma-separated list +# listener: client # Provide the listener name to connect to +# connector_properties: +# # bootstrap.servers: "kafka-dev-cluster:9092" # Optional. If not set, the bootstrap address will be the cluster name or publicAddress +# group.id: "connect-cluster" +# security.protocol: "SASL_PLAINTEXT" +# sasl.mechanism: "PLAIN" +# sasl.jaas.config: "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"admin\" password=\"your-admin-password\";" +# consumer.security.protocol: "SASL_PLAINTEXT" +# consumer.sasl.mechanism: "PLAIN" +# consumer.sasl.jaas.config: "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"admin\" password=\"your-admin-password\";" +# producer.security.protocol: "SASL_PLAINTEXT" +# producer.sasl.mechanism: "PLAIN" +# producer.sasl.jaas.config: "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"admin\" password=\"your-admin-password\";" +# key.converter.schemas.enable: "false" +# value.converter.schemas.enable: "false" +# offset.storage.topic: "connect-offsets" +# offset.storage.replication.factor: "3" +# config.storage.topic: "connect-configs" +# 
config.storage.replication.factor: "3" +# status.storage.topic: "connect-status" +# status.storage.replication.factor: "3" +# offset.flush.interval.ms: "10000" +# plugin.path: "/opt/kafka/plugins" +# key.converter: "org.apache.kafka.connect.storage.StringConverter" +# value.converter: "org.apache.kafka.connect.converters.ByteArrayConverter" +# plugins: +# - name: kafka-mirror-test-1 +# enabled: true +# config: +# connector.class: org.apache.kafka.connect.mirror.MirrorSourceConnector +# tasks.max: '1' +# offset-syncs.topic.location: source +# source.cluster.alias: remote +# target.cluster.alias: local +# source.bootstrap.servers: kafka-dev-cluster:9092 +# target.bootstrap.servers: kafka-dev-cluster:9092 +# source.consumer.bootstrap.servers: kafka-dev-cluster:9092 +# target.consumer.bootstrap.servers: kafka-dev-cluster:9092 +# source.producer.bootstrap.servers: kafka-dev-cluster:9092 +# target.producer.bootstrap.servers: kafka-dev-cluster:9092 +# source.admin.bootstrap.servers: kafka-dev-cluster:9092 +# target.admin.bootstrap.servers: kafka-dev-cluster:9092 +# replication.policy.class: org.apache.kafka.connect.mirror.DefaultReplicationPolicy +# topics: mirror-test-1-a,mirror-test-1-b +# groups: .* +# sync.topic.configs.enabled: 'true' +# sync.topic.acls.enabled: 'false' +# refresh.topics.interval.seconds: '60' +# refresh.groups.interval.seconds: '60' +# replication.factor: '3' +# offset.syncs.topic.replication.factor: '3' +# checkpoints.topic.replication.factor: '3' +# heartbeats.topic.replication.factor: '3' +# source.consumer.security.protocol: SASL_PLAINTEXT +# source.consumer.sasl.mechanism: PLAIN +# source.consumer.sasl.jaas.config: >- +# org.apache.kafka.common.security.plain.PlainLoginModule required +# username='admin' password='your-admin-password'; +# source.producer.security.protocol: SASL_PLAINTEXT +# source.producer.sasl.mechanism: PLAIN +# source.producer.sasl.jaas.config: >- +# org.apache.kafka.common.security.plain.PlainLoginModule required +# 
username='admin' password='your-admin-password'; +# target.producer.security.protocol: SASL_PLAINTEXT +# target.producer.sasl.mechanism: PLAIN +# target.producer.sasl.jaas.config: >- +# org.apache.kafka.common.security.plain.PlainLoginModule required +# username='admin' password='your-admin-password'; +# target.consumer.security.protocol: SASL_PLAINTEXT +# target.consumer.sasl.mechanism: PLAIN +# target.consumer.sasl.jaas.config: >- +# org.apache.kafka.common.security.plain.PlainLoginModule required +# username='admin' password='your-admin-password'; +# admin.security.protocol: SASL_PLAINTEXT +# admin.sasl.mechanism: PLAIN +# admin.sasl.jaas.config: >- +# org.apache.kafka.common.security.plain.PlainLoginModule required +# username='admin' password='your-admin-password'; +# consumer.security.protocol: SASL_PLAINTEXT +# consumer.sasl.mechanism: PLAIN +# consumer.sasl.jaas.config: >- +# org.apache.kafka.common.security.plain.PlainLoginModule required +# username='admin' password='your-admin-password'; +# producer.security.protocol: SASL_PLAINTEXT +# producer.sasl.mechanism: PLAIN +# producer.sasl.jaas.config: >- +# org.apache.kafka.common.security.plain.PlainLoginModule required +# username='admin' password='your-admin-password'; +# - name: "camel-s3-sink" +# enabled: true +# artifacts: +# - type: tgz +# url: https://repo.maven.apache.org/maven2/org/apache/camel/kafkaconnector/camel-aws-s3-sink-kafka-connector/4.8.5/camel-aws-s3-sink-kafka-connector-4.8.5-package.tar.gz +# config: +# "connector.class": "org.apache.camel.kafkaconnector.awss3sink.CamelAwss3sinkSinkConnector" +# "tasks.max": "1" +# "topics": "your-topic" +# "camel.kamelet.aws-s3-sink.useSessionCredentials": "false" +# "camel.kamelet.aws-s3-sink.bucketNameOrArn": "your-bucket-name" +# "camel.kamelet.aws-s3-sink.keyName": "your-topic-sink-${exchangeId}.txt" +# "camel.kamelet.aws-s3-sink.region": "your-region" +# "camel.kamelet.aws-s3-sink.autoCreateBucket": "true" +# "camel.kamelet.aws-s3-sink.accessKey": 
"your-access-key" +# "camel.kamelet.aws-s3-sink.secretKey": "your-secret-key" +# - name: "clickhouse-sink" +# enabled: true +# ssl_truststore: +# generate: true +# truststore_path: /tmp/kafka.autogenerated.truststore.jks +# truststore_password_env: "SSL_CLICKHOUSE_SINK_TRUSTSTORE_PASSWORD" +# hostnames: +# - domain1.clickhouse-sink.com +# - domain2.clickhouse-sink.com +# artifacts: +# - type: zip +# url: https://github.com/ClickHouse/clickhouse-kafka-connect/releases/download/v1.2.8/clickhouse-kafka-connect-v1.2.8.zip +# config: +# "connector.class": "com.clickhouse.kafka.connect.ClickHouseSinkConnector" +# "tasks.max": "1" +# "topics": "your-topic" +# "security.protocol": "SASL_PLAINTEXT" # Connect to Kafka cluster using PLAINTEXT protocol - Internal connection mTLS encrypted +# "hostname": "your-hostname" +# "username": "your-username" +# "database": "your-database" +# "password": "your-password" +# "port": "8443" +# "value.converter.schemas.enable": "false" +# "ssl": "true" # Connect to ClickHouse using SSL protocol +# "value.converter": "org.apache.kafka.connect.json.JsonConverter" +# "key.converter": "org.apache.kafka.connect.storage.StringConverter" +# "errors.retry.timeout": "30" +# "schemas.enable": "false" +# "jdbcConnectionProperties": "?sslmode=STRICT" +# "ssl.truststore.location": "/tmp/kafka.autogenerated.truststore.jks" +# "ssl.truststore.password": "${SSL_CLICKHOUSE_SINK_TRUSTSTORE_PASSWORD}" +# "errors.tolerance": "all" +# "errors.log.enable": "true" +# "errors.log.include.messages": "true" +# - name: "snowflake-sink" +# enabled: true +# artifacts: +# - type: jar +# url: https://repo1.maven.org/maven2/com/snowflake/snowflake-kafka-connector/3.1.1/snowflake-kafka-connector-3.1.1.jar +# - type: jar +# url: https://repo1.maven.org/maven2/org/bouncycastle/bc-fips/2.1.0/bc-fips-2.1.0.jar +# - type: jar +# url: https://repo1.maven.org/maven2/org/bouncycastle/bcpkix-fips/2.1.9/bcpkix-fips-2.1.9.jar +# config: +# "connector.class": 
"com.snowflake.kafka.connector.SnowflakeSinkConnector" +# "tasks.max": "1" +# "topics": "your-topic" +# "key.converter": "org.apache.kafka.connect.storage.StringConverter" +# "value.converter": "com.snowflake.kafka.connector.records.SnowflakeJsonConverter" +# "value.converter.schemas.enable": "false" +# "security.protocol": "SASL_PLAINTEXT" # Connect to Kafka cluster using SASL_PLAINTEXT protocol - Internal connection mTLS encrypted +# "snowflake.url.name": "your-snowflake-url" +# "snowflake.user.name": "your-snowflake-username" +# "snowflake.private.key": "your-snowflake-private-key" +# "snowflake.private.key.passphrase": "your-snowflake-private-key-passphrase" +# "snowflake.warehouse.name": "your-snowflake-warehouse-name" +# "snowflake.database.name": "your-snowflake-database-name" +# "snowflake.schema.name": "your-snowflake-schema-name" +# "snowflake.topic2table.map": "your-topic:your-table" +# "snowflake.role.name": "your-snowflake-role-name" +# "snowflake.enable.schematization": "false" +# "snowflake.disable.ssl.certificate.verification": "true" +# "snowflake.log.enable": "true" +# "snowflake.log.level": "DEBUG" +# "buffer.count.records": "10000" +# "buffer.flush.time": "120" +# "buffer.size.bytes": "10000000" +# "errors.tolerance": "all" +# "errors.log.enable": "true" +# "errors.log.include.messages": "true" + +kafka_rest_proxy: + enabled: true + deletionProtection: false + name: rest-proxy + image: confluentinc/cp-kafka-rest:latest + cpu: 500m + memory: 1000Mi + capacityAI: + enabled: true + minCpu: 125m # This only applied when capacityAI is enabled + minMemory: 200Mi # This only applied when capacityAI is enabled + replicas: 1 + timeoutSeconds: 15 + # domain: kafka-rest.example.com # Domain name for the Kafka Rest Proxy. + + # To disable all traffic, comment out the corresponding rule. 
Docs: https://docs.controlplane.com/concepts/security#firewall + firewall: + # internal_inboundAllowType: "same-gvc" # Options: same-org / same-gvc(Recommended) + external_inboundAllowCIDR: 0.0.0.0/0 # Provide a comma-separated list + # # You can specify additional workloads with either same-gvc or workload-list: + # inboundAllowWorkload: + # - //gvc/main-kafka/workload/main-kafka-kafbat-ui + # - //gvc/client-gvc/workload/client + external_outboundAllowCIDR: "0.0.0.0/0" # Provide a comma-separated list + properties: + # host.name: kafka-rest.example.com + bootstrap.servers: SASL_PLAINTEXT://kafka-dev-cluster:9092 + resource.extension: ALL + api.v3.enable: true + api.v2.enable: true + client.sasl.mechanism: PLAIN + api.compatibility.mode: BOTH + log4j.opts: -Dlog4j.configuration=file:/tmp/log4j.properties + listeners: http://0.0.0.0:8082 + authentication.realm: KafkaRest + authentication.method: BASIC + authentication.roles: user + client.security.protocol: SASL_PLAINTEXT + + # JAAS configuration for Kafka client and Kafka Rest Proxy + # https://docs.confluent.io/platform/current/kafka-rest/production-deployment/confluent-server/security.html#authentication-between-the-admin-rest-and-ak-brokers + jaas_conf: + KafkaClient { + org.apache.kafka.common.security.plain.PlainLoginModule required + username="admin" + password="your-admin-password"; + }; + KafkaRest { + org.eclipse.jetty.jaas.spi.PropertyFileLoginModule required + debug="true" + file="/etc/kafka-rest/password.properties"; + }; + + # # Password properties for Kafka Rest Proxy + # # Required when authentication.method is set to BASIC + # # https://docs.confluent.io/platform/current/kafka-rest/production-deployment/confluent-server/security.html#password-properties + password_properties: + user: your-user-password,user + user1: password213,user + user2: password214,user + +kafka_client: + name: client + image: apache/kafka:3.9.1 + cpu: 500m + memory: 1000Mi + # To disable all traffic, comment out the 
corresponding rule. Docs: https://docs.controlplane.com/concepts/security#firewall + firewall: + # internal_inboundAllowType: "same-gvc" # Options: same-org / same-gvc + # external_inboundAllowCIDR: 0.0.0.0/0 # Provide a comma-separated list + external_outboundAllowCIDR: "0.0.0.0/0" # Provide a comma-separated list + +# DEPRECATED NOTICE https://github.com/provectus/kafka-ui +# PLEASE USE KAFBAT UI INSTEAD +kafka_ui: + enabled: false + name: ui + image: provectuslabs/kafka-ui:latest + cpu: 200m + memory: 600Mi + listener: client + # To disable all traffic, comment out the corresponding rule. Docs: https://docs.controlplane.com/concepts/security#firewall + firewall: {} + # internal_inboundAllowType: "same-gvc" # Options: same-org / same-gvc + # external_inboundAllowCIDR: 0.0.0.0/0 # Provide a comma-separated list + # external_outboundAllowCIDR: "111.222.333.444/16,111.222.444.333/32" # Provide a comma-separated list diff --git a/manticore/versions/2.0.1/Chart.yaml b/manticore/versions/2.0.1/Chart.yaml new file mode 100644 index 00000000..fc33a7ac --- /dev/null +++ b/manticore/versions/2.0.1/Chart.yaml @@ -0,0 +1,18 @@ +apiVersion: v2 +name: manticore +description: Distributed Manticore Search cluster with intelligent orchestration. 
+ +type: application +version: 2.0.1 +appVersion: "25.0.0" + +annotations: + created: "2026-01-05" + lastModified: "2026-04-17" + category: "search" + createsGvc: false + +dependencies: + - name: cpln-common + version: 1.0.0 + repository: "oci://ghcr.io/controlplane-com/templates" \ No newline at end of file diff --git a/manticore/versions/2.0.1/README.md b/manticore/versions/2.0.1/README.md new file mode 100644 index 00000000..fe07cae8 --- /dev/null +++ b/manticore/versions/2.0.1/README.md @@ -0,0 +1,258 @@ +# Manticore Search Cluster + +Deploys a distributed Manticore Search cluster on Control Plane with automatic Galera-based replication, zero-downtime data imports, multi-table support, backup/restore, and a web UI for cluster management. + +## Architecture + +The template deploys several components that work together: + +- **Manticore Workload** - Stateful replicas running Manticore searchd, each with a sidecar agent for local operations +- **Orchestrator API** - REST API that coordinates cluster-wide operations (initialization, imports, repairs, backups) +- **Orchestrator Job** - Cron workload for on-demand job execution +- **UI** - Web dashboard for monitoring and managing the cluster + +The orchestrator handles cluster initialization, coordinates imports across all replicas using a dual-slot (A/B) system for zero-downtime swaps, and provides automatic repair for split-brain scenarios. All replicas stay in sync via Galera cluster replication. + +## Prerequisites + +1. **S3 Bucket** - Create an S3 bucket to store your CSV source files +2. **Control Plane Cloud Account** - Follow the [Create a Cloud Account](https://docs.controlplane.com/guides/create-cloud-account) guide to establish trust between Control Plane and your AWS account + +## Installation + +1. 
**Configure S3 access** in `values.yaml`: + ```yaml + buckets: + cloudAccountName: your-cloud-account + awsPolicyRefs: + - aws::AmazonS3ReadOnlyAccess # or your custom policy + sourceBucket: your-bucket-name + ``` + +2. **Define your tables**: + ```yaml + tables: + - name: products + csvPath: imports/products/data.csv + config: + haStrategy: noerrors # HA strategy for distributed queries; 'noerrors' skips agents that return errors + agentRetryCount: 3 # Number of times to retry failed agent connections + clusterMain: false # Set to true to replicate the main table across all cluster nodes + memLimit: 2G # Memory limit for indexer during import (max = 2G) + hasHeader: true # Set to true if the CSV file includes a header row + schema: + columns: + - name: title + type: field + - name: price + type: attr_float + ``` + +3. **Generate an authentication token**: + ```bash + openssl rand -base64 32 + ``` + Set this in `orchestrator.agent.token`. This bearer token secures all internal API communication between components. + +**Note:** After installation, the cluster will be initialized but tables will be empty until you run an import. See [Operations](#operations) below. + +## Authentication + +All internal communication is secured with the bearer token set in `orchestrator.agent.token`. This token is shared across the orchestrator, agents, and UI. + +- Must be set before deployment +- Should be cryptographically random (use `openssl rand -base64 32`) +- Rotating requires redeploying all components + +**Security note:** The UI injects this token automatically, so anyone with network access to the UI can perform admin operations. Restrict access by setting `orchestrator.ui.allowExternalAccess: false` or using a domain with authentication. 
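The token step above can be sketched as a small script. This is a minimal sketch, not part of the template: the file name `token-values.yaml` is a hypothetical choice, and it assumes `openssl` is available locally. It generates the token and writes it into a values override under `orchestrator.agent.token`, quoted so that the `+`, `/`, and `=` characters from base64 are always read as a string.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Generate 32 random bytes, base64-encoded (44 characters including padding).
TOKEN="$(openssl rand -base64 32)"

# Hypothetical override file name; any values file supplied at install time works.
VALUES_FILE="${VALUES_FILE:-token-values.yaml}"

# Write the token under orchestrator.agent.token, quoted for YAML safety.
printf 'orchestrator:\n  agent:\n    token: "%s"\n' "${TOKEN}" > "${VALUES_FILE}"

# The file holds a credential, so restrict it to the current user.
chmod 600 "${VALUES_FILE}"

echo "Wrote a ${#TOKEN}-character token to ${VALUES_FILE}"
```

Keeping the token in a separate override file avoids committing it alongside the main `values.yaml`. Because the token is shared by the orchestrator, agents, and UI, regenerating it means redeploying all of them.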
+ +## Configuration Reference + +### Core Settings + +| Path | Description | Default | +|------|-------------|---------| +| `buckets.cloudAccountName` | AWS Cloud Account name | - | +| `buckets.sourceBucket` | S3 bucket with CSV files | - | +| `manticore.clusterName` | Galera cluster name | `manticore` | +| `manticore.autoscaling.minScale` | Minimum replicas | `3` | +| `manticore.autoscaling.maxScale` | Maximum replicas | `4` | + +### Table Configuration + +Each entry in `tables[]` supports: + +| Field | Description | +|-------|-------------| +| `name` | Table name | +| `csvPath` | Path to CSV in S3 bucket, or a list of paths for multi-segment tables (see [Multi-Segment Tables](#multi-segment-tables)) | +| `config.haStrategy` | HA strategy: `noerrors`, `nodeads`, etc. | +| `config.agentRetryCount` | Retry count for distributed queries | +| `config.clusterMain` | Replicate main tables across cluster | +| `config.segmentCount` | Number of distributed table segments; must match the number of entries in `csvPath` (default: `1`) | +| `config.importMethod` | Import method: `indexer` or `sql` | +| `config.charsetTable` | Manticore `charset_table` tokenization preset (e.g., `non_cont`) — omit to use the Manticore default | +| `config.memLimit` | Memory limit for indexer operations (e.g., `2G`) | +| `config.hasHeader` | Whether the CSV file has a header row (`true`/`false`) | +| `schema.columns` | Column definitions (see column types below) | + +### Column Types + +| Type | Description | +|------|-------------| +| `field` | Full-text searchable field | +| `field_string` | Full-text field (string variant) | +| `attr_uint` | Unsigned integer attribute | +| `attr_bigint` | Big integer attribute | +| `attr_float` | Float attribute | +| `attr_bool` | Boolean attribute | +| `attr_string` | String attribute (not full-text indexed) | +| `attr_timestamp` | Timestamp attribute | +| `attr_multi` | Multi-value integer attribute | +| `attr_multi_64` | Multi-value 64-bit integer 
attribute | +| `attr_json` | JSON attribute | + +**Note**: If column 1 is numeric, it's used as the document ID (don't declare it). If not numeric, an ID is auto-generated. + +### Orchestrator Settings + +| Path | Description | Default | +|------|-------------|---------| +| `orchestrator.schedule` | Cron schedule for imports | `0 * * * *` | +| `orchestrator.action` | Action: `init`, `import`, `health`, `repair` | `import` | +| `orchestrator.tableName` | Table to import | - | +| `orchestrator.suspend` | Start suspended | `true` | +| `orchestrator.agent.token` | Bearer token for auth | **required** | + +## Multi-Segment Tables + +Large datasets can be split across multiple CSV files and imported as a distributed table with multiple independent segments. Manticore fans queries across all segments automatically. + +Set `csvPath` to a list of S3 paths and set `segmentCount` to match the number of entries: + +```yaml +tables: + - name: addresses + csvPath: + - large-file/part1.csv + - large-file/part2.csv + config: + segmentCount: 2 # must match the number of csvPath entries + importMethod: indexer + memLimit: 2G + hasHeader: true + schema: + columns: + - name: street_name + type: field +``` + +`segmentCount` must equal the number of items in `csvPath`. The template will fail at render time with a descriptive error if they don't match. + +When a table has multiple segments, a backup backs up **all segments** as one file. Restores on multi-segment tables will restore all segments on the table. + +## Operations + +Operations can be triggered via the **Orchestrator UI** or the **Control Plane CLI/API**. 
+ +### Via Orchestrator UI + +The web dashboard provides controls for: +- **Import, Backup and Restore** - Select a table and trigger a coordinated import, backup, or restore process +- **Repair** - Recover the cluster from split-brain scenarios +- **Monitoring** - View cluster health, replica status, and table details + +### Via Control Plane + +Run the orchestrator cron workload to execute operations: + +```bash +# Trigger an import +cpln workload run-cron {release-name}-orchestrator-job --gvc {gvc-name} + +# Trigger a repair (set ACTION=repair on the workload first) +cpln workload run-cron {release-name}-orchestrator-job --gvc {gvc-name} +``` + +## Load Testing + +Enable k6 load testing to validate search performance: + +```yaml +loadTest: + enabled: true + vus: 10 + duration: "5m" + query: + index: products + query: + match: + "*": "test" +``` + +Trigger via Control Plane: +```bash +cpln workload run-cron {release-name}-load-test-controller --gvc {gvc-name} +``` + +Or set `loadTest.controller.schedule` to run on a cron schedule. + +## Backup & Restore + +Backup and restore is available for both **delta** (real-time updates) and **main** (full indexed dataset) tables. Backups are stored as compressed archives in S3. + +### Prerequisites + +1. **S3 Bucket** for storing backups (can be shared with or separate from source data) +2. **IAM Policy** with `s3:GetObject`, `s3:PutObject`, `s3:DeleteObject`, `s3:ListBucket` permissions on the bucket +3. 
**Cloud Account** with the above policy attached + +### Configuration + +Enable backups in `values.yaml`: + +```yaml +orchestrator: + backup: + enabled: true + cloudAccountName: my-backup-cloud-account + s3Bucket: my-backup-bucket + s3Policy: + - my-backup-policy + s3Region: us-east-1 + prefix: manticore-backups + schedules: [ # Automated backup schedules (optional) + {"table": "products", "type": "delta", "schedule": "0 2 * * *"}, + {"table": "products", "type": "main", "schedule": "0 3 * * 0"} + ] +``` + +### Usage + +**Via Orchestrator UI:** +- **Backup**: Select a type (delta/main) and click "Backup" +- **Restore**: Select a type, choose a backup file from the list, and confirm +- **Rotate Main**: After a main restore, swap the active slot + +**Via API:** +```bash +# Backup +curl -X POST "https://{orchestrator-api-url}/api/backup" \ + -H "Authorization: Bearer {token}" \ + -H "Content-Type: application/json" \ + -d '{"tableName": "products", "type": "delta"}' + +# List backups +curl "https://{orchestrator-api-url}/api/backups/files?tableName=products" \ + -H "Authorization: Bearer {token}" + +# Restore +curl -X POST "https://{orchestrator-api-url}/api/restore" \ + -H "Authorization: Bearer {token}" \ + -H "Content-Type: application/json" \ + -d '{"tableName": "products", "type": "delta", "filename": "products_delta-2024-01-28T22-50-49Z.tar.gz"}' +``` + +## Links +- [Manticore Search Docs](https://manual.manticoresearch.com/) +- [Orchestrator, Agent, UI and Backup source code](https://github.com/controlplane-com/manticore-orchestrator) \ No newline at end of file diff --git a/manticore/versions/2.0.1/templates/_helpers.tpl b/manticore/versions/2.0.1/templates/_helpers.tpl new file mode 100644 index 00000000..c0326b1d --- /dev/null +++ b/manticore/versions/2.0.1/templates/_helpers.tpl @@ -0,0 +1,242 @@ +{{/* Resource Naming */}} + +{{/* +Manticore Workload Name +*/}} +{{- define "manticore.name" -}} +{{- printf "%s-manticore" .Release.Name }} +{{- end }} + +{{/* 
+Manticore Orchestrator Job Workload Name +*/}} +{{- define "manticore.orchestratorJobName" -}} +{{- printf "%s-orchestrator-job" .Release.Name }} +{{- end }} + +{{/* +Manticore Orchestrator API Workload Name +*/}} +{{- define "manticore.orchestratorAPIName" -}} +{{- printf "%s-orchestrator-api" .Release.Name }} +{{- end }} + +{{/* +Manticore UI Workload Name +*/}} +{{- define "manticore.UIName" -}} +{{- printf "%s-ui" .Release.Name }} +{{- end }} + +{{/* +Manticore Backup Workload Name +*/}} +{{- define "manticore.backupName" -}} +{{- printf "%s-manticore-backup" .Release.Name }} +{{- end }} + +{{/* +Manticore Load Test Workload Name +*/}} +{{- define "manticore.loadTestName" -}} +{{- printf "%s-load-test" .Release.Name }} +{{- end }} + +{{/* +Manticore Load Test Controller Workload Name +*/}} +{{- define "manticore.loadTestControllerName" -}} +{{- printf "%s-load-test-controller" .Release.Name }} +{{- end }} + +{{/* +Manticore Secret Config Name +*/}} +{{- define "manticore.secretConfigName" -}} +{{- printf "%s-manticore-config" .Release.Name }} +{{- end }} + +{{/* +Manticore Secret Startup Name +*/}} +{{- define "manticore.secretStartupName" -}} +{{- printf "%s-manticore-startup" .Release.Name }} +{{- end }} + +{{/* +Manticore Secret Schema Config Name +*/}} +{{- define "manticore.secretSchemaConfigName" -}} +{{- printf "%s-manticore-schema" .Release.Name }} +{{- end }} + +{{/* +Manticore Secret Agent Token Name +*/}} +{{- define "manticore.secretAgentTokenName" -}} +{{- printf "%s-manticore-agent-token" .Release.Name }} +{{- end }} + +{{/* +Manticore Secret K6 Script Name +*/}} +{{- define "manticore.secretK6ScriptName" -}} +{{- printf "%s-manticore-k6-script" .Release.Name }} +{{- end }} + +{{/* +Manticore Identity Name +*/}} +{{- define "manticore.identityName" -}} +{{- printf "%s-manticore-identity" .Release.Name }} +{{- end }} + +{{/* +Manticore Orchestrator Identity Name +*/}} +{{- define "manticore.orchestratorIdentityName" -}} +{{- printf 
"%s-manticore-orchestrator-identity" .Release.Name }} +{{- end }} + +{{/* +Manticore Orchestrator Job Identity Name +*/}} +{{- define "manticore.orchestratorJobIdentityName" -}} +{{- printf "%s-manticore-orchestrator-job-identity" .Release.Name }} +{{- end }} + +{{/* +Manticore Load Test Identity Name +*/}} +{{- define "manticore.loadTestIdentityName" -}} +{{- printf "%s-manticore-load-test-identity" .Release.Name }} +{{- end }} + +{{/* +Manticore Load Test Controller Identity Name +*/}} +{{- define "manticore.loadTestControllerIdentityName" -}} +{{- printf "%s-manticore-load-test-controller-identity" .Release.Name }} +{{- end }} + +{{/* +Manticore Backup Identity Name +*/}} +{{- define "manticore.backupIdentityName" -}} +{{- printf "%s-manticore-backup-identity" .Release.Name }} +{{- end }} + +{{/* +Manticore Config Policy Name +*/}} +{{- define "manticore.configPolicyName" -}} +{{- printf "%s-manticore-config-policy" .Release.Name }} +{{- end }} + +{{/* +Manticore Exec Policy Name +*/}} +{{- define "manticore.execPolicyName" -}} +{{- printf "%s-manticore-exec-policy" .Release.Name }} +{{- end }} + +{{/* +Manticore Orchestrator Policy Name +*/}} +{{- define "manticore.orchestratorPolicyName" -}} +{{- printf "%s-manticore-orchestrator-policy" .Release.Name }} +{{- end }} + +{{/* +Manticore Load Test Policy Name +*/}} +{{- define "manticore.loadTestPolicyName" -}} +{{- printf "%s-manticore-load-test-policy" .Release.Name }} +{{- end }} + +{{/* +Manticore Load Test Controller Policy Name +*/}} +{{- define "manticore.loadTestControllerPolicyName" -}} +{{- printf "%s-manticore-load-test-controller-policy" .Release.Name }} +{{- end }} + +{{/* +Manticore Volume Set Name +*/}} +{{- define "manticore.volumeName" -}} +{{- printf "%s-manticore-vs" .Release.Name }} +{{- end }} + +{{/* +Manticore Shared Volume Set Name +*/}} +{{- define "manticore.sharedVolumeName" -}} +{{- printf "%s-manticore-vs-shared" .Release.Name }} +{{- end }} + + +{{/* Functions */}} + +{{/* +Generate 
JSON mapping of table names to CSV paths for orchestrator. +csvPath accepts a single string or a list for multi-segment tables. +Output (single): {"addresses":"imports/addresses/data.csv"} +Output (multi): {"addresses":["imports/addresses/data_1.csv","imports/addresses/data_2.csv"]} +*/}} +{{- define "manticore.tablesConfigJSON" -}} +{{- $config := dict -}} +{{- range . -}} +{{- $_ := set $config .name .csvPath -}} +{{- end -}} +{{- $config | toJson -}} +{{- end }} + +{{/* +Validate that each table's csvPath length matches its config.segmentCount. +csvPath may be a single string (segmentCount must be 1) or a list (length must equal segmentCount). +segmentCount defaults to 1 when omitted, matching the README. +*/}} +{{- define "manticore.validateTables" -}} +{{- range .Values.tables -}} +{{- $tableName := .name -}} +{{- $segmentCount := .config.segmentCount | default 1 | int -}} +{{- if kindIs "slice" .csvPath -}} + {{- $csvCount := len .csvPath -}} + {{- if ne $csvCount $segmentCount -}} + {{- fail (printf "Table %q: csvPath has %d entries but segmentCount is %d; they must match." $tableName $csvCount $segmentCount) -}} + {{- end -}} +{{- else -}} + {{- if ne $segmentCount 1 -}} + {{- fail (printf "Table %q: csvPath is a single string but segmentCount is %d; it must be 1 when csvPath is a single value." 
$tableName $segmentCount) -}} + {{- end -}} +{{- end -}} +{{- end -}} +{{- end }} + +{{/* +Calculate total load test duration in seconds (duration + buffer) +Parses single-unit duration strings like "5m", "1h", "30s". +Compound values such as "1m30s" and values without a unit are not supported and count as 0 seconds. +*/}} +{{- define "loadTest.totalDurationSeconds" -}} +{{- $duration := .Values.loadTest.duration -}} +{{- $buffer := .Values.loadTest.controller.testDurationBuffer | int -}} +{{- $seconds := 0 -}} +{{- if hasSuffix "s" $duration -}} + {{- $seconds = trimSuffix "s" $duration | int -}} +{{- else if hasSuffix "m" $duration -}} + {{- $seconds = mul (trimSuffix "m" $duration | int) 60 -}} +{{- else if hasSuffix "h" $duration -}} + {{- $seconds = mul (trimSuffix "h" $duration | int) 3600 -}} +{{- end -}} +{{- add $seconds $buffer -}} +{{- end }} + + +{{/* Labeling */}} + +{{/* +Common labels - delegated to cpln-common +*/}} +{{- define "manticore.tags" -}} +{{- include "cpln-common.tags" . }} +{{- end }} \ No newline at end of file diff --git a/manticore/versions/2.0.1/templates/domain.yaml b/manticore/versions/2.0.1/templates/domain.yaml new file mode 100644 index 00000000..aad6906a --- /dev/null +++ b/manticore/versions/2.0.1/templates/domain.yaml @@ -0,0 +1,34 @@ +# ============================================================================= +# External Domain (Optional) +# ============================================================================= +# Routes /api/* to orchestrator-api, /* to UI +{{- if .Values.domain.enabled }} +kind: domain +name: {{ .Values.domain.name }} +description: External domain for Manticore cluster UI and API +tags: {{- include "manticore.tags" . 
| nindent 4 }} +spec: + dnsMode: {{ .Values.domain.dnsMode }} + gvcLink: /org/{{ .Values.global.cpln.org }}/gvc/{{ .Values.global.cpln.gvc }} + acceptAllHosts: false + ports: + - number: 443 + protocol: http2 + cors: + allowOrigins: + - exact: '*' + allowMethods: + - GET + - POST + - OPTIONS + allowHeaders: + - '*' + allowCredentials: true + routes: + - prefix: /api/ + workloadLink: //gvc/{{ .Values.global.cpln.gvc }}/workload/{{ include "manticore.orchestratorAPIName" . }} + port: 8080 + - prefix: / + workloadLink: //gvc/{{ .Values.global.cpln.gvc }}/workload/{{ include "manticore.UIName" . }} + port: 3000 +{{- end }} diff --git a/manticore/versions/2.0.1/templates/identity.yaml b/manticore/versions/2.0.1/templates/identity.yaml new file mode 100644 index 00000000..cad0607e --- /dev/null +++ b/manticore/versions/2.0.1/templates/identity.yaml @@ -0,0 +1,66 @@ +# ============================================================================= +# Workload Identities +# ============================================================================= +# Manticore identity - secret access (no cloud account attached) +kind: identity +name: {{ include "manticore.identityName" . }} +description: Manticore workload identity for secret access +tags: {{- include "manticore.tags" . | nindent 4 }} + +--- +# Orchestrator identity - secret access + CPLN API for cluster operations + S3 access (when backups are enabled) +kind: identity +name: {{ include "manticore.orchestratorIdentityName" . }} +description: Orchestrator identity for secrets, CPLN API, and S3 access +tags: {{- include "manticore.tags" . | nindent 4 }} +{{- if .Values.orchestrator.backup.enabled }} +aws: + cloudAccountLink: //cloudaccount/{{ .Values.orchestrator.backup.cloudAccountName }} + policyRefs: + {{- range .Values.orchestrator.backup.s3Policy }} + - {{ . }} + {{- end }} +{{- end }} + +--- +# Orchestrator job identity - secret access + S3 access for cron operations +kind: identity +name: {{ include "manticore.orchestratorJobIdentityName" . 
}} +description: Orchestrator job identity for secrets and S3 access +tags: {{- include "manticore.tags" . | nindent 4 }} +aws: + cloudAccountLink: //cloudaccount/{{ .Values.buckets.cloudAccountName }} + policyRefs: + {{- range .Values.buckets.awsPolicyRefs }} + - {{ . }} + {{- end }} + +{{- if .Values.orchestrator.backup.enabled }} +--- +# Backup identity - S3 access for backups +kind: identity +name: {{ include "manticore.backupIdentityName" . }} +description: Manticore backup identity for S3 access +tags: {{- include "manticore.tags" . | nindent 4 }} +aws: + cloudAccountLink: //cloudaccount/{{ .Values.orchestrator.backup.cloudAccountName }} + policyRefs: + {{- range .Values.orchestrator.backup.s3Policy }} + - {{ . }} + {{- end }} +{{- end }} + +{{- if .Values.loadTest.enabled }} +--- +# Load test identity - access to k6 script secret +kind: identity +name: {{ include "manticore.loadTestIdentityName" . }} +description: Load test workload identity for script access +tags: {{- include "manticore.tags" . | nindent 4 }} +--- +# Load test controller identity - workload scaling permissions +kind: identity +name: {{ include "manticore.loadTestControllerIdentityName" . }} +description: Controller identity for scaling load-test workload +tags: {{- include "manticore.tags" . | nindent 4 }} +{{- end }} diff --git a/manticore/versions/2.0.1/templates/policy.yaml b/manticore/versions/2.0.1/templates/policy.yaml new file mode 100644 index 00000000..78c046f7 --- /dev/null +++ b/manticore/versions/2.0.1/templates/policy.yaml @@ -0,0 +1,99 @@ +# ============================================================================= +# Access Policies +# ============================================================================= +# Secret access policy - grants reveal on config secrets +kind: policy +name: {{ include "manticore.configPolicyName" . }} +description: Secret access for manticore and orchestrator workloads +tags: {{- include "manticore.tags" . 
| nindent 4 }} +bindings: + - permissions: + - reveal + principalLinks: + - /org/{{ .Values.global.cpln.org }}/gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "manticore.identityName" . }} + - /org/{{ .Values.global.cpln.org }}/gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "manticore.orchestratorIdentityName" . }} + - /org/{{ .Values.global.cpln.org }}/gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "manticore.orchestratorJobIdentityName" . }} + - /org/{{ .Values.global.cpln.org }}/gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "manticore.backupIdentityName" . }} +targetKind: secret +targetLinks: + - //secret/{{ include "manticore.secretConfigName" . }} + - //secret/{{ include "manticore.secretStartupName" . }} + - //secret/{{ include "manticore.secretSchemaConfigName" . }} + - //secret/{{ include "manticore.secretAgentTokenName" . }} +targetQuery: + kind: secret + fetch: items + spec: + match: all + terms: [] +--- +# Cron execution policy - allows triggering orchestrator jobs +kind: policy +name: {{ include "manticore.execPolicyName" . }} +description: Permission to trigger orchestrator cron workload executions +tags: {{- include "manticore.tags" . | nindent 4 }} +bindings: + - permissions: + - view + - edit + - exec.runCronWorkload + principalLinks: + - /org/{{ .Values.global.cpln.org }}/gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "manticore.identityName" . }} + - /org/{{ .Values.global.cpln.org }}/gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "manticore.orchestratorIdentityName" . }} +targetKind: workload +targetLinks: + - //gvc/{{ .Values.global.cpln.gvc }}/workload/{{ include "manticore.orchestratorJobName" . }} + - //gvc/{{ .Values.global.cpln.gvc }}/workload/{{ include "manticore.name" . }} + {{- if .Values.orchestrator.backup.enabled }} + - //gvc/{{ .Values.global.cpln.gvc }}/workload/{{ include "manticore.backupName" . 
}} + {{- end }} + +--- +# Workload access policy - allows the orchestrator job and backup identities to view and edit the Manticore workload +kind: policy +name: {{ include "manticore.orchestratorPolicyName" . }} +description: Permission to view and edit the Manticore workload configuration +tags: {{- include "manticore.tags" . | nindent 4 }} +bindings: + - permissions: + - view + - edit + principalLinks: + - /org/{{ .Values.global.cpln.org }}/gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "manticore.orchestratorJobIdentityName" . }} + - /org/{{ .Values.global.cpln.org }}/gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "manticore.backupIdentityName" . }} +targetKind: workload +targetLinks: + - //gvc/{{ .Values.global.cpln.gvc }}/workload/{{ include "manticore.name" . }} + +{{- if .Values.loadTest.enabled }} +--- +# Load test script access +kind: policy +name: {{ include "manticore.loadTestPolicyName" . }} +description: K6 workload access to load test script secret +tags: {{- include "manticore.tags" . | nindent 4 }} +bindings: + - permissions: + - use + - reveal + principalLinks: + - //gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "manticore.loadTestIdentityName" . }} +targetKind: secret +targetLinks: + - //secret/{{ include "manticore.secretK6ScriptName" . }} +--- +# Load test controller policy - allows scaling k6 workload +kind: policy +name: {{ include "manticore.loadTestControllerPolicyName" . }} +description: Controller permission to scale load-test workload +tags: {{- include "manticore.tags" . | nindent 4 }} +bindings: + - permissions: + - manage + - edit + principalLinks: + - //gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "manticore.loadTestControllerIdentityName" . }} +targetKind: workload +targetLinks: + - //gvc/{{ .Values.global.cpln.gvc }}/workload/{{ include "manticore.loadTestName" . 
}} +{{- end }} diff --git a/manticore/versions/2.0.1/templates/secret.yaml b/manticore/versions/2.0.1/templates/secret.yaml new file mode 100644 index 00000000..b8deb254 --- /dev/null +++ b/manticore/versions/2.0.1/templates/secret.yaml @@ -0,0 +1,330 @@ +{{- include "manticore.validateTables" . -}} +# ============================================================================= +# Manticore Search Configuration +# ============================================================================= +# Base searchd config (startup script generates runtime config with IP-bound listeners) +kind: secret +name: {{ include "manticore.secretConfigName" . }} +description: Manticore searchd base configuration +tags: {{- include "manticore.tags" . | nindent 4 }} +type: opaque +data: + encoding: plain + payload: |- + searchd { + listen = 9306:mysql + listen = 9308:http + listen = 9312 + data_dir = /var/lib/manticore + binlog_path = /var/lib/manticore/binlog + log = /dev/stdout + query_log = /dev/stdout + pid_file = /var/run/searchd.pid + seamless_rotate = 1 + preopen_tables = 1 + } +--- +# ============================================================================= +# Schema Registry +# ============================================================================= +# Table schemas for agent (creates RT delta tables, parses CSV, builds distributed tables) +kind: secret +name: {{ include "manticore.secretSchemaConfigName" . }} +description: Schema registry for multi-table configuration +tags: {{- include "manticore.tags" . 
| nindent 4 }} +type: opaque +data: + encoding: plain + payload: |- + # Schema Registry (YAML format, read by agent) + # See README.md for column types and configuration options + {{- range .Values.tables }} + + {{ .name }}: + config: + {{- .config | toYaml | nindent 8 }} + schema: + columns: + {{- range .schema.columns }} + - name: {{ .name }} + type: {{ .type }} + {{- end }} + {{- end }} +--- +# ============================================================================= +# Manticore Startup Script +# ============================================================================= +# Generates runtime config, starts searchd, handles graceful shutdown +# Cluster init (bootstrap/join) handled by agent via orchestrator API +kind: secret +name: {{ include "manticore.secretStartupName" . }} +description: Manticore searchd startup and shutdown handler +tags: {{- include "manticore.tags" . | nindent 4 }} +type: opaque +data: + encoding: plain + payload: |- + #!/usr/bin/env bash + set -euo pipefail + + # ============================================================================= + # CONFIGURATION + # ============================================================================= + + CLUSTER_NAME="{{ .Values.manticore.clusterName }}" + MYSQL_PORT=9306 + HTTP_PORT=9308 + REPL_PORT=9312 + WORKLOAD_NAME="$(echo "${HOSTNAME}" | sed 's/-[0-9]*$//')" + REPLICA_INDEX="$(echo "${HOSTNAME}" | awk -F'-' '{print $NF}')" + LOCATION=$(basename "${CPLN_LOCATION}") + # Internal DNS format: {workloadName}-{replicaIndex}.{workloadName} + NODE0_FQDN="${WORKLOAD_NAME}-0.${WORKLOAD_NAME}" + NODE0_ADDR="${NODE0_FQDN}:${REPL_PORT}" + + echo "============================================" + echo "Manticore Startup" + echo "============================================" + echo "Hostname: ${HOSTNAME}" + echo "Replica Index: ${REPLICA_INDEX}" + echo "Cluster: ${CLUSTER_NAME}" + echo "Node 0 FQDN: ${NODE0_FQDN}" + echo "============================================" + echo "" + + # Create required 
directories + echo "Creating directories..." + mkdir -p /var/lib/manticore/binlog + mkdir -p /var/lib/manticore/data + echo "Directories created." + + RUNTIME_CONFIG="/tmp/manticore-runtime.conf" + + # ============================================================================= + # HELPER FUNCTIONS + # ============================================================================= + + mysql_exec() { + local host="${1:-127.0.0.1}" + shift || true + mysql --protocol=tcp -h "${host}" -P "${MYSQL_PORT}" -N -B "$@" + } + + wait_for_manticore() { + echo "Waiting for Manticore MySQL port ${MYSQL_PORT}..." + for i in $(seq 1 60); do + if mysql --protocol=tcp -h 127.0.0.1 -P "${MYSQL_PORT}" -e "SELECT 1" >/dev/null 2>&1; then + echo "Manticore is ready." + return 0 + fi + echo " [$i/60] not ready yet..." + sleep 1 + done + echo "ERROR: Manticore did not become ready on ${MYSQL_PORT}" + return 1 + } + + # ============================================================================= + # GRACEFUL SHUTDOWN HANDLER + # ============================================================================= + + shutdown() { + echo "" + echo "============================================" + echo "SIGTERM received: graceful shutdown" + echo "============================================" + + echo "Stopping searchd gracefully..." + if searchd --stopwait --config "${RUNTIME_CONFIG}"; then + echo "searchd stopped gracefully" + else + echo "searchd --stopwait failed, falling back to kill" + if [[ -n "${SEARCHD_PID:-}" ]] && kill -0 "$SEARCHD_PID" >/dev/null 2>&1; then + kill "$SEARCHD_PID" || true + for _ in $(seq 1 30); do + kill -0 "$SEARCHD_PID" >/dev/null 2>&1 || break + sleep 1 + done + fi + fi + + # Keep manticore.json on all replicas so they remember cluster state and + # table associations. This allows IST (incremental sync) instead of full SST + # when rejoining, and ensures local tables are recognized as cluster tables. 
+ if [[ "${REPLICA_INDEX}" != "0" ]]; then + echo "Requesting cluster node refresh on replica-0..." + mysql_exec "${NODE0_FQDN}" -e "ALTER CLUSTER ${CLUSTER_NAME} UPDATE nodes" || true + fi + + echo "Shutdown sequence complete." + echo "============================================" + } + + trap shutdown TERM INT + + # ============================================================================= + # START MANTICORE + # ============================================================================= + + echo "" + echo "============================================" + echo "Preparing searchd configuration..." + echo "============================================" + + # Get the pod's IP address and FQDN for replication binding + MY_IP=$(hostname -i | awk '{print $1}') + MY_FQDN="${WORKLOAD_NAME}-${REPLICA_INDEX}.${WORKLOAD_NAME}" + echo "Pod IP address: ${MY_IP}" + echo "Pod FQDN: ${MY_FQDN}" + + # Create runtime config with dynamic replication listener + echo "Generating runtime config: ${RUNTIME_CONFIG}" + + echo "searchd {" > "${RUNTIME_CONFIG}" + echo " listen = 127.0.0.1:9306:mysql" >> "${RUNTIME_CONFIG}" + echo " listen = ${MY_IP}:9306:mysql" >> "${RUNTIME_CONFIG}" + echo " listen = 127.0.0.1:9308:http" >> "${RUNTIME_CONFIG}" + echo " listen = ${MY_IP}:9308:http" >> "${RUNTIME_CONFIG}" + echo " listen = 127.0.0.1:9312" >> "${RUNTIME_CONFIG}" + echo " listen = ${MY_IP}:9312" >> "${RUNTIME_CONFIG}" + echo " listen = 0.0.0.0:9322-9323:replication" >> "${RUNTIME_CONFIG}" + + echo " node_address = ${MY_FQDN}" >> "${RUNTIME_CONFIG}" + echo " data_dir = /var/lib/manticore" >> "${RUNTIME_CONFIG}" + echo " binlog_path = /var/lib/manticore/binlog" >> "${RUNTIME_CONFIG}" + echo " log = /tmp/searchd.log" >> "${RUNTIME_CONFIG}" + echo " query_log = /dev/stdout" >> "${RUNTIME_CONFIG}" + echo " pid_file = /var/run/searchd.pid" >> "${RUNTIME_CONFIG}" + echo " seamless_rotate = 1" >> "${RUNTIME_CONFIG}" + echo " preopen_tables = 1" >> "${RUNTIME_CONFIG}" + echo " server_id = 
${REPLICA_INDEX}" >> "${RUNTIME_CONFIG}" + echo " buddy_path = /usr/share/manticore/modules/manticore-buddy/bin/manticore-buddy" >> "${RUNTIME_CONFIG}" + echo "}" >> "${RUNTIME_CONFIG}" + echo "common {" >> "${RUNTIME_CONFIG}" + echo " plugin_dir = /usr/share/manticore/modules" >> "${RUNTIME_CONFIG}" + echo "}" >> "${RUNTIME_CONFIG}" + + echo "Runtime config generated:" + cat "${RUNTIME_CONFIG}" + echo "" + + chmod +x /usr/share/manticore/modules/manticore-buddy/bin/manticore-buddy + + echo "============================================" + echo "Starting searchd..." + echo "============================================" + + echo "" + echo "Launching searchd in background..." + touch /tmp/searchd.log + tail -f /tmp/searchd.log & + searchd --config "${RUNTIME_CONFIG}" --nodetach 2>&1 & + SEARCHD_PID="$!" + echo "searchd launched with PID: ${SEARCHD_PID}" + + sleep 2 + + if ! kill -0 "${SEARCHD_PID}" 2>/dev/null; then + echo "ERROR: searchd exited immediately! Check config." + exit 1 + fi + + echo "searchd is running. Waiting for MySQL port..." + wait_for_manticore + + # ============================================================================= + # CLUSTER STATUS CHECK (cluster setup is handled by orchestrator init) + # ============================================================================= + + echo "" + echo "============================================" + echo "Cluster Status" + echo "============================================" + + # Check if already part of a cluster (from preserved state in manticore.json) + # Wait a few seconds for cluster module to load state + sleep 3 + CLUSTER_STATUS=$(mysql_exec 127.0.0.1 -e "SHOW STATUS LIKE 'cluster_${CLUSTER_NAME}_status'" 2>/dev/null || echo "") + + if echo "${CLUSTER_STATUS}" | grep -qE "(primary|synced)"; then + echo "Already part of cluster ${CLUSTER_NAME}." + else + echo "Not part of any cluster. Run orchestrator init to bootstrap/join cluster." 
+ fi + + # ============================================================================= + # KEEP CONTAINER ALIVE + # ============================================================================= + + echo "" + echo "============================================" + echo "Startup complete" + echo "============================================" + echo "searchd PID: ${SEARCHD_PID}" + echo "============================================" + echo "" + echo "Import operations are handled by the agent sidecar." + echo "Use the orchestrator workload for coordinated imports." + echo "" + + wait "${SEARCHD_PID}" +--- +# ============================================================================= +# Agent Authentication Token +# ============================================================================= +# Bearer token for orchestrator/agent/UI communication +# Generate with: openssl rand -base64 32 +kind: secret +name: {{ include "manticore.secretAgentTokenName" . }} +description: Bearer token for orchestrator-agent authentication +tags: {{- include "manticore.tags" . | nindent 4 }} +type: opaque +data: + encoding: plain + payload: {{ required "agent.token is required" .Values.orchestrator.agent.token }} + +# ============================================================================= +# K6 Load Test Script +# ============================================================================= +{{- if .Values.loadTest.enabled }} +--- +# Generated k6 script from loadTest.* values +kind: secret +name: {{ include "manticore.secretK6ScriptName" . }} +description: K6 load test script for Manticore search +tags: {{- include "manticore.tags" . 
| nindent 4 }} +type: opaque +data: + encoding: plain + payload: |- + import http from 'k6/http'; + import { check } from 'k6'; + import { Rate } from 'k6/metrics'; + + const errorRate = new Rate('errors'); + + export const options = { + vus: {{ .Values.loadTest.vus }}, + duration: '{{ .Values.loadTest.duration }}', + thresholds: { + 'http_req_duration': ['p(95)<{{ .Values.loadTest.thresholds.p95ResponseTime }}'], + 'http_req_failed': ['rate<{{ .Values.loadTest.thresholds.errorRate }}'], + }, + }; + + const BASE_URL = 'http://{{ include "manticore.name" . }}.{{ .Values.global.cpln.gvc }}.cpln.local:{{ .Values.loadTest.target.port }}'; + + const QUERY = JSON.stringify({{ .Values.loadTest.query | toJson }}); + + export default function () { + const res = http.post(`${BASE_URL}/{{ .Values.loadTest.target.endpoint }}`, QUERY, { + headers: { 'Content-Type': 'application/json' }, + }); + + const success = check(res, { + 'status is 200': (r) => r.status === 200, + }); + + errorRate.add(!success); + } +{{- end }} diff --git a/manticore/versions/2.0.1/templates/volumeset-shared.yaml b/manticore/versions/2.0.1/templates/volumeset-shared.yaml new file mode 100644 index 00000000..77934a40 --- /dev/null +++ b/manticore/versions/2.0.1/templates/volumeset-shared.yaml @@ -0,0 +1,11 @@ +# ============================================================================= +# Shared Volumeset +# ============================================================================= +# Shared storage across replicas and orchestrator +kind: volumeset +name: {{ include "manticore.sharedVolumeName" . }} +description: Shared storage across Manticore replicas and orchestrator +tags: {{- include "manticore.tags" . 
| nindent 4 }} +spec: + fileSystemType: shared + initialCapacity: {{ .Values.manticore.sharedVolumeset.capacity }} diff --git a/manticore/versions/2.0.1/templates/volumeset.yaml b/manticore/versions/2.0.1/templates/volumeset.yaml new file mode 100644 index 00000000..7935d42b --- /dev/null +++ b/manticore/versions/2.0.1/templates/volumeset.yaml @@ -0,0 +1,15 @@ +# ============================================================================= +# Manticore Volumeset +# ============================================================================= +# Persistent storage per replica (data, binlog, cluster state) +kind: volumeset +name: {{ include "manticore.volumeName" . }} +description: Persistent storage for Manticore data and cluster state +tags: {{- include "manticore.tags" . | nindent 4 }} +spec: + fileSystemType: ext4 + initialCapacity: {{ .Values.manticore.volumeset.capacity }} + performanceClass: general-purpose-ssd + snapshots: + createFinalSnapshot: true + retentionDuration: 7d \ No newline at end of file diff --git a/manticore/versions/2.0.1/templates/workload.yaml b/manticore/versions/2.0.1/templates/workload.yaml new file mode 100644 index 00000000..74055172 --- /dev/null +++ b/manticore/versions/2.0.1/templates/workload.yaml @@ -0,0 +1,566 @@ +# ============================================================================= +# Manticore Search Workload (Stateful) +# ============================================================================= +# Each replica runs manticore (searchd) + agent sidecar for orchestrator coordination +kind: workload +name: {{ include "manticore.name" . }} +description: Manticore search cluster with agent sidecar +tags: {{- include "manticore.tags" . 
| nindent 4 }} + cpln/publishNotReadyAddresses: "true" +spec: + type: stateful + containers: + - name: manticore + cpu: {{ .Values.manticore.resources.cpu | quote }} + image: {{ .Values.manticore.image }} + command: "/bin/bash" + args: + - "/usr/local/bin/start.sh" + inheritEnv: false + memory: {{ .Values.manticore.resources.memory | quote }} + metrics: + path: "/metrics" + port: 9308 + ports: + - number: 9306 + protocol: tcp + - number: 9308 + protocol: http + - number: 9312 + protocol: tcp + # Galera gcomm + IST ports + - number: 9322 + protocol: tcp + - number: 9323 + protocol: tcp + readinessProbe: + failureThreshold: 6 + initialDelaySeconds: 30 + periodSeconds: 15 + successThreshold: 1 + tcpSocket: + port: 9306 + timeoutSeconds: 10 + volumes: + - path: /var/lib/manticore + recoveryPolicy: retain + uri: cpln://volumeset/{{ include "manticore.volumeName" . }} + - path: /mnt/shared + recoveryPolicy: retain + uri: cpln://volumeset/{{ include "manticore.sharedVolumeName" . }} + - path: /etc/manticore/manticore.conf + recoveryPolicy: retain + uri: cpln://secret/{{ include "manticore.secretConfigName" . }}.payload + - path: /etc/manticore/schema.conf + recoveryPolicy: retain + uri: cpln://secret/{{ include "manticore.secretSchemaConfigName" . }}.payload + - path: /usr/local/bin/start.sh + recoveryPolicy: retain + uri: cpln://secret/{{ include "manticore.secretStartupName" . 
}}.payload + - name: agent + cpu: {{ .Values.orchestrator.agent.resources.cpu | quote }} + minCpu: {{ .Values.orchestrator.agent.resources.minCpu | default .Values.orchestrator.agent.resources.cpu | quote }} + image: {{ .Values.orchestrator.agent.image }}:{{ .Values.orchestrator.agent.version }} + inheritEnv: false + memory: {{ .Values.orchestrator.agent.resources.memory | quote }} + minMemory: {{ .Values.orchestrator.agent.resources.minMemory | default .Values.orchestrator.agent.resources.memory | quote }} + ports: + - number: 8080 + protocol: http + env: + - name: MYSQL_HOST + value: "127.0.0.1" + - name: MYSQL_PORT + value: "9306" + - name: SCHEMA_FILE + value: "/etc/manticore/schema.conf" + - name: CLUSTER_NAME + value: "{{ .Values.manticore.clusterName }}" + - name: LISTEN_ADDR + value: ":8080" + - name: AUTH_TOKEN + value: "cpln://secret/{{ include "manticore.secretAgentTokenName" . }}.payload" + # Cluster recovery settings + - name: WORKLOAD_NAME + value: "{{ include "manticore.name" . }}" + - name: DATA_DIR + value: "/var/lib/manticore" + - name: MAX_SCALE + value: "{{ .Values.manticore.autoscaling.maxScale }}" + - name: RECOVERY_MAX_RETRIES + value: "{{ .Values.orchestrator.agent.recovery.maxRetries | default 5 }}" + - name: RECOVERY_INITIAL_BACKOFF_SEC + value: "{{ .Values.orchestrator.agent.recovery.initialBackoffSec | default 5 }}" + - name: RECOVERY_MAX_BACKOFF_SEC + value: "{{ .Values.orchestrator.agent.recovery.maxBackoffSec | default 60 }}" + - name: REPLICATION_PORT + value: "9312" + - name: IMPORT_BATCH_SIZE + value: "{{ .Values.orchestrator.agent.import.batchSize | default 1000 }}" + - name: ORCHESTRATOR_API_URL + value: "http://{{ include "manticore.orchestratorAPIName" . 
}}.{{ .Values.global.cpln.gvc }}.cpln.local:8080" + readinessProbe: + failureThreshold: 20 + initialDelaySeconds: 10 + periodSeconds: 15 + successThreshold: 1 + httpGet: + path: /api/ready + port: 8080 + timeoutSeconds: 5 + volumes: + - path: /etc/manticore/schema.conf + recoveryPolicy: retain + uri: cpln://secret/{{ include "manticore.secretSchemaConfigName" . }}.payload + - path: /mnt/shared + recoveryPolicy: retain + uri: cpln://volumeset/{{ include "manticore.sharedVolumeName" . }} + - path: /var/lib/manticore + recoveryPolicy: retain + uri: cpln://volumeset/{{ include "manticore.volumeName" . }} + defaultOptions: + autoscaling: + maxScale: {{ .Values.manticore.autoscaling.maxScale }} + metric: {{ .Values.manticore.autoscaling.metric }} + minScale: {{ .Values.manticore.autoscaling.minScale }} + scaleToZeroDelay: {{ .Values.manticore.autoscaling.scaleToZeroDelay }} + target: {{ .Values.manticore.autoscaling.target }} + capacityAI: false + debug: false + suspend: false + timeoutSeconds: 5 + firewallConfig: + external: + inboundAllowCIDR: [] + outboundAllowHostname: [] + internal: + inboundAllowType: {{ .Values.manticore.firewall.internalAccess.type }} + {{- if .Values.manticore.firewall.internalAccess.workloads }} + inboundAllowWorkload: {{ .Values.manticore.firewall.internalAccess.workloads | toYaml | nindent 8 }} + {{- end }} + identityLink: /org/{{ .Values.global.cpln.org }}/gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "manticore.identityName" . 
}} + loadBalancer: + direct: + enabled: false + ports: [] + replicaDirect: true + rolloutOptions: + maxSurgeReplicas: {{ .Values.manticore.rolloutOptions.maxSurgeReplicas }} + maxUnavailableReplicas: '{{ .Values.manticore.rolloutOptions.maxUnavailableReplicas }}' + minReadySeconds: {{ .Values.manticore.rolloutOptions.minReadySeconds }} + scalingPolicy: {{ .Values.manticore.rolloutOptions.scalingPolicy }} + terminationGracePeriodSeconds: {{ .Values.manticore.rolloutOptions.terminationGracePeriodSeconds }} + supportDynamicTags: false +--- +# ============================================================================= +# Orchestrator Cron Workload +# ============================================================================= +# Executes scheduled/manual actions: init, import, health, repair +# Starts suspended by default (trigger via UI/API) +kind: workload +name: {{ include "manticore.orchestratorJobName" . }} +description: Manticore orchestrator for scheduled/manual cluster operations +tags: {{- include "manticore.tags" . | nindent 4 }} +spec: + type: cron + containers: + - name: orchestrator + cpu: {{ .Values.orchestrator.resources.cpu | quote }} + image: {{ .Values.orchestrator.image }}:{{ .Values.orchestrator.version }} + inheritEnv: false + memory: {{ .Values.orchestrator.resources.memory | quote }} + env: + - name: MODE + value: "cli" + - name: ACTION + value: "{{ .Values.orchestrator.action }}" + - name: REPLICA_COUNT + value: "{{ .Values.manticore.autoscaling.minScale }}" + - name: AGENT_PORT + value: "8080" + - name: WORKLOAD_NAME + value: "{{ include "manticore.name" . 
}}" + - name: GVC + value: "{{ .Values.global.cpln.gvc }}" + - name: LOCATION + value: "{{ .Values.global.cpln.location }}" + - name: TABLE_NAME + value: "{{ .Values.orchestrator.tableName }}" + - name: TABLES_CONFIG + value: '{{ include "manticore.tablesConfigJSON" .Values.tables }}' + - name: STATE_FILE + value: "/tmp/orchestrator_state.json" + - name: AUTH_TOKEN + value: "cpln://secret/{{ include "manticore.secretAgentTokenName" . }}.payload" + - name: LOG_LEVEL + value: "{{ .Values.orchestrator.logLevel }}" + - name: IMPORT_MEM_LIMIT + value: "{{ .Values.orchestrator.importMemLimit }}" + - name: INDEXER_WORK_DIR + value: "/mnt/s3/indexer-temp" + - name: S3_MOUNT + value: "/mnt/s3" + - name: SHARED_VOLUME_MOUNT + value: "/mnt/shared" + volumes: + - path: /mnt/shared + recoveryPolicy: retain + uri: cpln://volumeset/{{ include "manticore.sharedVolumeName" . }} + - path: /mnt/s3 + uri: s3://{{ .Values.buckets.sourceBucket }} + recoveryPolicy: retain + defaultOptions: + autoscaling: + maxScale: 1 + minScale: 1 + capacityAI: false + debug: false + suspend: {{ .Values.orchestrator.suspend }} + timeoutSeconds: {{ .Values.orchestrator.timeoutSeconds }} + firewallConfig: + external: + inboundAllowCIDR: [] + outboundAllowCIDR: + - 0.0.0.0/0 + outboundAllowHostname: [] + internal: + inboundAllowType: none + identityLink: /org/{{ .Values.global.cpln.org }}/gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "manticore.orchestratorJobIdentityName" . 
}} + job: + schedule: {{ .Values.orchestrator.schedule | quote }} + concurrencyPolicy: Forbid + restartPolicy: Never + activeDeadlineSeconds: {{ .Values.orchestrator.activeDeadlineSeconds }} + supportDynamicTags: false +--- +# ============================================================================= +# Orchestrator API Workload (Standard) +# ============================================================================= +# REST API for cluster coordination (init, import, repair, cluster status) +# Backend for UI dashboard, communicates with agents via bearer token +kind: workload +name: {{ include "manticore.orchestratorAPIName" . }} +description: Manticore orchestrator REST API for cluster coordination +tags: {{- include "manticore.tags" . | nindent 4 }} +spec: + type: standard + containers: + - name: api + cpu: {{ .Values.orchestrator.api.resources.cpu | quote }} + image: {{ .Values.orchestrator.api.image }}:{{ .Values.orchestrator.api.version }} + inheritEnv: false + memory: {{ .Values.orchestrator.api.resources.memory | quote }} + port: 8080 + readinessProbe: + httpGet: + path: /api/status + port: 8080 + periodSeconds: 10 + failureThreshold: 3 + env: + - name: MODE + value: "server" + - name: LISTEN_ADDR + value: ":8080" + - name: REPLICA_COUNT + value: "{{ .Values.manticore.autoscaling.minScale }}" + - name: AGENT_PORT + value: "8080" + - name: WORKLOAD_NAME + value: "{{ include "manticore.name" . }}" + # CPLN_GVC, CPLN_LOCATION auto-injected by Control Plane + - name: TABLES_CONFIG + value: '{{ include "manticore.tablesConfigJSON" .Values.tables }}' + - name: STATE_FILE + value: "/tmp/orchestrator_state.json" + - name: AUTH_TOKEN + value: "cpln://secret/{{ include "manticore.secretAgentTokenName" . }}.payload" + - name: LOG_LEVEL + value: "{{ .Values.orchestrator.logLevel }}" + # CPLN_TOKEN, CPLN_ORG, CPLN_GVC auto-injected by Control Plane + - name: ORCHESTRATOR_WORKLOAD + value: "{{ include "manticore.orchestratorJobName" . 
}}" + - name: IMPORT_POLL_INTERVAL + value: "{{ .Values.orchestrator.api.importPollInterval | default "30s" }}" + - name: IMPORT_POLL_TIMEOUT + value: "{{ .Values.orchestrator.api.importPollTimeout | default "2h" }}" + {{- if .Values.orchestrator.backup.enabled }} + - name: BACKUP_SCHEDULES + value: '{{ .Values.orchestrator.backup.schedules | toJson }}' + - name: BACKUP_BUCKET + value: {{ .Values.orchestrator.backup.s3Bucket }} + - name: BACKUP_PREFIX + value: {{ .Values.orchestrator.backup.prefix }} + - name: BACKUP_PROVIDER + value: aws + - name: BACKUP_REGION + value: {{ .Values.orchestrator.backup.s3Region }} + {{- end }} + defaultOptions: + autoscaling: + maxScale: {{ .Values.orchestrator.api.autoscaling.maxScale }} + minScale: {{ .Values.orchestrator.api.autoscaling.minScale }} + metric: {{ .Values.orchestrator.api.autoscaling.metric }} + target: {{ .Values.orchestrator.api.autoscaling.target }} + capacityAI: false + debug: false + suspend: false + timeoutSeconds: 30 + firewallConfig: + external: + inboundAllowCIDR: [] + outboundAllowCIDR: + - 0.0.0.0/0 + internal: + inboundAllowType: same-gvc + identityLink: /org/{{ .Values.global.cpln.org }}/gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "manticore.orchestratorIdentityName" . }} + supportDynamicTags: false +--- +# ============================================================================= +# Web UI Workload (Standard) +# ============================================================================= +# Dashboard for cluster monitoring, table status, imports, and repairs +kind: workload +name: {{ include "manticore.UIName" . }} +description: Manticore cluster management dashboard +tags: {{- include "manticore.tags" . 
| nindent 4 }} +spec: + type: standard + containers: + - name: ui + cpu: {{ .Values.orchestrator.ui.resources.cpu | quote }} + image: {{ .Values.orchestrator.ui.image }}:{{ .Values.orchestrator.ui.version }} + inheritEnv: false + memory: {{ .Values.orchestrator.ui.resources.memory | quote }} + port: 3000 + env: + - name: ORCHESTRATOR_API_URL + value: "http://{{ include "manticore.orchestratorAPIName" . }}.{{ .Values.global.cpln.gvc }}.cpln.local:8080" + - name: ORCHESTRATOR_AUTH_TOKEN + value: "cpln://secret/{{ include "manticore.secretAgentTokenName" . }}.payload" + - name: PORT + value: "3000" + readinessProbe: + httpGet: + path: / + port: 3000 + periodSeconds: 10 + failureThreshold: 3 + defaultOptions: + autoscaling: + maxScale: {{ .Values.orchestrator.ui.autoscaling.maxScale }} + minScale: {{ .Values.orchestrator.ui.autoscaling.minScale }} + metric: {{ .Values.orchestrator.ui.autoscaling.metric }} + target: {{ .Values.orchestrator.ui.autoscaling.target }} + capacityAI: false + debug: false + suspend: false + timeoutSeconds: 30 + firewallConfig: + external: + inboundAllowCIDR: {{ if .Values.orchestrator.ui.allowExternalAccess }}[ "0.0.0.0/0" ]{{ else }}[ ]{{ end }} + outboundAllowCIDR: [] + internal: + inboundAllowType: same-gvc + identityLink: /org/{{ .Values.global.cpln.org }}/gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "manticore.orchestratorIdentityName" . }} + supportDynamicTags: false + +{{- if .Values.orchestrator.backup.enabled }} +--- +# ============================================================================= +# Backup Workload (Cron) +# ============================================================================= +# Cron job for logical backups on delta table to S3 bucket +kind: workload +name: {{ include "manticore.backupName" . }} +description: manticore backup +tags: {{- include "manticore.tags" . 
| nindent 4 }} +spec: + type: cron + containers: + - name: backup + cpu: {{ .Values.orchestrator.backup.resources.cpu | quote }} + env: + - name: ACTION + value: backup + - name: TYPE + value: delta + - name: BACKUP_BUCKET + value: {{ .Values.orchestrator.backup.s3Bucket }} + - name: BACKUP_PREFIX + value: {{ .Values.orchestrator.backup.prefix }} + - name: BACKUP_PROVIDER + value: aws + - name: BACKUP_REGION + value: {{ .Values.orchestrator.backup.s3Region }} + - name: DATASET + value: {{ .Values.orchestrator.backup.dataSet }} + - name: MANTICORE_HOST + value: "{{ include "manticore.name" . }}" + - name: MANTICORE_PORT + value: "9306" + - name: AUTH_TOKEN + value: "cpln://secret/{{ include "manticore.secretAgentTokenName" . }}.payload" + image: {{ .Values.orchestrator.backup.image }}:{{ .Values.orchestrator.backup.version }} + inheritEnv: false + memory: {{ .Values.orchestrator.backup.resources.memory | quote }} + ports: + - number: 8080 + protocol: http + volumes: + - path: /mnt/shared + recoveryPolicy: retain + uri: cpln://volumeset/{{ include "manticore.sharedVolumeName" . }} + defaultOptions: + autoscaling: + maxScale: 1 + metric: disabled + minScale: 1 + capacityAI: false + debug: false + multiZone: + enabled: false + suspend: true + timeoutSeconds: 5 + firewallConfig: + external: + inboundAllowCIDR: [] + inboundBlockedCIDR: [] + outboundAllowCIDR: + - 0.0.0.0/0 + outboundAllowHostname: [] + outboundAllowPort: [] + outboundBlockedCIDR: [] + internal: + inboundAllowType: same-gvc + inboundAllowWorkload: [] + identityLink: //gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "manticore.backupIdentityName" . 
}} + job: + concurrencyPolicy: Forbid + historyLimit: 5 + restartPolicy: Never + schedule: '0 2 * * *' + activeDeadlineSeconds: {{ .Values.orchestrator.backup.activeDeadlineSeconds }} + localOptions: [] + supportDynamicTags: false +{{- end }} + +# ============================================================================= +# K6 Load Test Workload (Standard) +# ============================================================================= +# Runs k6 load tests (starts at scale 0, scaled up by controller) +{{- if .Values.loadTest.enabled }} +--- +kind: workload +name: {{ include "manticore.loadTestName" . }} +description: K6 load test runner for Manticore search +tags: {{- include "manticore.tags" . | nindent 4 }} +spec: + type: standard + containers: + - name: k6 + image: {{ .Values.loadTest.image }} + cpu: {{ .Values.loadTest.resources.cpu | quote }} + memory: {{ .Values.loadTest.resources.memory | quote }} + inheritEnv: false + command: /bin/sh + args: + - '-c' + - >- + k6 run + --vus {{ .Values.loadTest.vus }} + --duration {{ .Values.loadTest.duration }} + {{- if .Values.loadTest.rps }} + --rps {{ .Values.loadTest.rps }} + {{- end }} + /scripts/payload + volumes: + - path: /scripts + recoveryPolicy: retain + uri: 'cpln://secret/{{ include "manticore.secretK6ScriptName" . }}.payload' + defaultOptions: + autoscaling: + maxScale: 0 + minScale: 0 + scaleToZeroDelay: 30 + target: 95 + capacityAI: false + debug: false + suspend: false + timeoutSeconds: 3600 + firewallConfig: + external: + inboundAllowCIDR: [] + outboundAllowCIDR: + - 0.0.0.0/0 + internal: + inboundAllowType: none + identityLink: /org/{{ .Values.global.cpln.org }}/gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "manticore.loadTestIdentityName" . 
}} + supportDynamicTags: false +--- +# ============================================================================= +# Load Test Controller (Cron) +# ============================================================================= +# Scales k6 workload up, waits for test duration, then scales back to 0 +# Suspended if no schedule (manual trigger only) +kind: workload +name: {{ include "manticore.loadTestControllerName" . }} +description: Controller for triggering and managing load tests +tags: {{- include "manticore.tags" . | nindent 4 }} +spec: + type: cron + containers: + - name: controller + image: {{ .Values.loadTest.controller.image }} + cpu: '0.25' + memory: 256Mi + inheritEnv: false + command: /bin/sh + args: + - '-c' + - | + set -e + WORKLOAD="{{ include "manticore.loadTestName" . }}" + REPLICAS="{{ .Values.loadTest.replicas }}" + + echo "Scaling $WORKLOAD to $REPLICAS replicas..." + curl -sf -X PATCH \ + -H "Authorization: Bearer $CPLN_TOKEN" \ + -H "Content-Type: application/json" \ + -d "{\"spec\":{\"defaultOptions\":{\"autoscaling\":{\"minScale\":$REPLICAS,\"maxScale\":$REPLICAS}}}}" \ + "http://api.cpln.io/org/$CPLN_ORG/gvc/$CPLN_GVC/workload/$WORKLOAD" + + echo "Waiting for test duration + buffer..." + sleep {{ include "loadTest.totalDurationSeconds" . }} + + echo "Scaling $WORKLOAD back to 0..." + curl -sf -X PATCH \ + -H "Authorization: Bearer $CPLN_TOKEN" \ + -H "Content-Type: application/json" \ + -d '{"spec":{"defaultOptions":{"autoscaling":{"minScale":0,"maxScale":0}}}}' \ + "http://api.cpln.io/org/$CPLN_ORG/gvc/$CPLN_GVC/workload/$WORKLOAD" + + echo "Load test complete." 
+ defaultOptions: + autoscaling: + maxScale: 1 + minScale: 1 + capacityAI: false + debug: false + suspend: {{ if .Values.loadTest.controller.schedule }}false{{ else }}true{{ end }} + firewallConfig: + external: + inboundAllowCIDR: [] + outboundAllowCIDR: + - 0.0.0.0/0 + internal: + inboundAllowType: none + identityLink: /org/{{ .Values.global.cpln.org }}/gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "manticore.loadTestControllerIdentityName" . }} + job: + schedule: '{{ .Values.loadTest.controller.schedule | default "0 0 1 1 *" }}' + concurrencyPolicy: Forbid + restartPolicy: Never + historyLimit: 10 + activeDeadlineSeconds: 7200 + supportDynamicTags: false +{{- end }} diff --git a/manticore/versions/2.0.1/values.yaml b/manticore/versions/2.0.1/values.yaml new file mode 100644 index 00000000..ebad6a68 --- /dev/null +++ b/manticore/versions/2.0.1/values.yaml @@ -0,0 +1,220 @@ +# ============================================================================= +# AWS Cloud Account and S3 Configuration +# ============================================================================= +# See README.md for Cloud Account and IAM policy setup instructions. +buckets: + cloudAccountName: my-cloudaccount # Name of your configured Cloud Account + awsPolicyRefs: # IAM policies for S3 access + - my-manticore-policy # Note: if using a custom policy, omit the aws:: prefix as this is only for AWS managed policies + awsRegion: us-east-1 # Region of your S3 bucket + sourceBucket: my-manticore-bucket # S3 bucket containing files to import + +# ============================================================================= +# Tables Configuration +# ============================================================================= +# See README.md for table structure, column types, and config options. +# Add/remove tables as needed. 
+tables: + - name: addresses + csvPath: + - imports/addresses.csv + config: + haStrategy: noerrors + agentRetryCount: 3 + clusterMain: false + segmentCount: 1 + charsetTable: non_cont + memLimit: 2G + hasHeader: true + secondaryIndexes: false + schema: + columns: + - name: address_id + type: attr_uint + - name: street_number + type: attr_uint + - name: street_name + type: field + - name: city + type: field + - name: county + type: field + - name: state + type: field + - name: postal_code + type: field + - name: country + type: field + - name: latitude + type: attr_float + - name: longitude + type: attr_float + +# ============================================================================= +# Manticore Search Configuration +# ============================================================================= +manticore: + image: manticoresearch/manticore:25.0.0 + clusterName: manticore # Galera cluster name + resources: + cpu: 4 + memory: 8Gi + volumeset: + capacity: 200 # GB per replica + sharedVolumeset: + capacity: 100 # GB shared across replicas and orchestrator + + # minScale = replica count used by orchestrator for coordination + autoscaling: + minScale: 3 + maxScale: 4 + metric: rps # rps for stateful, cpu for standard + target: 100 + scaleToZeroDelay: 300 + + rolloutOptions: + maxSurgeReplicas: 25% + maxUnavailableReplicas: '0' + minReadySeconds: 10 + scalingPolicy: OrderedReady + terminationGracePeriodSeconds: 60 + + # IMPORTANT: same-gvc required for Galera replication + firewall: + internalAccess: + type: same-gvc + workloads: [] + +# ============================================================================= +# Orchestrator Configuration +# ============================================================================= +orchestrator: + version: v6.0.5 + image: ghcr.io/controlplane-com/manticore-orchestrator/manticore-cpln-api + logLevel: debug # debug, info, warn, error + resources: + cpu: 1 + memory: 2Gi + + # Cron job settings + schedule: "0 * * * *" # Cron 
schedule (default = every hour) + action: import # init, import, health, repair + tableName: addresses # Must match a name in tables[] + suspend: true # Start suspended (trigger via UI/API) + timeoutSeconds: 900 # Container timeout (seconds, default 15 minutes) + importMemLimit: 2G # Memory limit for import jobs + activeDeadlineSeconds: 14400 # Max job runtime (seconds, default 4 hours) + + # Orchestrator API + api: + version: v6.0.5 + image: ghcr.io/controlplane-com/manticore-orchestrator/manticore-cpln-api + logLevel: debug + importPollInterval: 30s + importPollTimeout: 2h + resources: + cpu: 0.25 + memory: 256Mi + autoscaling: + maxScale: 3 + minScale: 2 + metric: cpu + target: 80 + + # Agent sidecar + agent: + version: v6.0.5 + image: ghcr.io/controlplane-com/manticore-orchestrator/manticore-cpln-agent + # REQUIRED: Replace this example value before deploying. Generate with `openssl rand -base64 32` + token: "6Gl5uO9KkKAh1u+ymoBW98WCtjTFpljuhpLdKb+tNAA=" + resources: + cpu: 250m + minCpu: 100m + memory: 512Mi + minMemory: 128Mi + import: + batchSize: 20000 # Rows per INSERT statement + recovery: + maxRetries: 5 # Retry attempts for cluster recovery + initialBackoffSec: 5 # Initial delay between retries + maxBackoffSec: 60 # Max backoff delay (exponential) + + # Web UI - See README.md "Authentication" section for security notes + ui: + version: v6.0.5 + image: ghcr.io/controlplane-com/manticore-orchestrator/manticore-cpln-ui + resources: + cpu: 0.25 + memory: 0.25Gi + allowExternalAccess: true # false = internal GVC access only + autoscaling: + maxScale: 2 + minScale: 1 + metric: cpu + target: 80 + + # Backup Configuration (Optional) - Cron job runs logical backup on delta table to S3 bucket. 
+ backup: + enabled: false + version: v6.0.5 + image: ghcr.io/controlplane-com/manticore-orchestrator/manticore-cpln-backup + cloudAccountName: my-backup-cloud-account + s3Bucket: my-backup-bucket # S3 bucket for backups + s3Policy: # IAM policies for S3 access + - my-backup-policy # Custom policy created in S3 setup instructions + s3Region: us-east-1 + dataSet: addresses # Data set to back up + prefix: manticore-backups # S3 prefix/folder for backups + schedules: [ + {"table":"addresses","type":"delta","schedule":"0 2 * * *"}, # Daily at 2am UTC + {"table":"addresses","type":"main","schedule":"0 2 1 * *"} # Monthly full backup on 1st at 2am UTC + ] + activeDeadlineSeconds: 14400 # Max job runtime (seconds, default 4 hours) + resources: + cpu: 1 + memory: 1Gi + +# ============================================================================= +# Domain Configuration (Optional) +# ============================================================================= +# Routes /api/* to orchestrator-api, everything else to UI. 
+domain: + enabled: false + name: "" # FQDN, e.g., manticore.example.com + dnsMode: cname # cname (subdomains) or ns (zone delegation) + +# ============================================================================= +# Load Testing Configuration (Optional) +# ============================================================================= +loadTest: + enabled: false + image: grafana/k6:0.47.0 + resources: + cpu: 0.5 + memory: 512Mi + + vus: 10 # Virtual users + duration: "5m" # Test duration (e.g., 30s, 5m, 1h) + rps: null # Target RPS (null = unlimited) + replicas: 1 # Number of k6 pods to spawn + + controller: + image: alpine/curl # Alpine image with curl pre-installed + schedule: "" # Cron expression (empty = manual only) + testDurationBuffer: 60 # Seconds added to duration before scale-down + + target: + port: 9308 # Manticore HTTP API port + endpoint: search # "search" or "sql" + + # Full JSON body for /search endpoint + query: + index: addresses + query: + match: + "*": "test" + limit: 10 + + thresholds: + p95ResponseTime: 500 # ms + errorRate: 0.01 # 1% diff --git a/ollama/icon.png b/ollama/icon.png new file mode 100644 index 00000000..b087107a Binary files /dev/null and b/ollama/icon.png differ diff --git a/ollama/versions/1.0.0/.helmignore b/ollama/versions/1.0.0/.helmignore new file mode 100644 index 00000000..0e8a0eb3 --- /dev/null +++ b/ollama/versions/1.0.0/.helmignore @@ -0,0 +1,23 @@ +# Patterns to ignore when building packages. +# This supports shell glob matching, relative path matching, and +# negation (prefixed with !). Only one pattern per line. 
+.DS_Store +# Common VCS dirs +.git/ +.gitignore +.bzr/ +.bzrignore +.hg/ +.hgignore +.svn/ +# Common backup files +*.swp +*.bak +*.tmp +*.orig +*~ +# Various IDEs +.project +.idea/ +*.tmproj +.vscode/ diff --git a/ollama/versions/1.0.0/Chart.yaml b/ollama/versions/1.0.0/Chart.yaml new file mode 100644 index 00000000..27e1498c --- /dev/null +++ b/ollama/versions/1.0.0/Chart.yaml @@ -0,0 +1,13 @@ +apiVersion: v2 +name: ollama +description: An Ollama app for Control Plane + +type: application +version: 1.0.0 +appVersion: "0.4" + +annotations: + created: "2024-12-03" + lastModified: "2024-12-03" + category: "llm" + createsGvc: false \ No newline at end of file diff --git a/ollama/versions/1.0.0/PREREQUISITES.md b/ollama/versions/1.0.0/PREREQUISITES.md new file mode 100644 index 00000000..e69de29b diff --git a/ollama/versions/1.0.0/README.md b/ollama/versions/1.0.0/README.md new file mode 100644 index 00000000..cb316d10 --- /dev/null +++ b/ollama/versions/1.0.0/README.md @@ -0,0 +1,36 @@ +## Ollama App + +### Warning + +You will need to request a quota increase for CPU and memory if your org is at the default quotas. + +### Overview + +The user interface is provided by the Open WebUI project: https://github.com/open-webui/open-webui +It runs on port 8080 as a sidecar to the Ollama API. Because 8080 is the first port specified in the workload definition, all external traffic is forwarded to it. + +The Ollama API is provided by the Ollama project: https://github.com/ollama/ollama +It runs on port 11434 and is accessed by the Open WebUI sidecar. A persistent volume (10 GiB by default) stores the models. On startup, a script downloads the default model (`llama3` unless `defaultModel` is changed) if it does not yet exist on the filesystem. + +On Control Plane, you can access GPUs from any cloud provider. You can even deploy this example to multiple cloud provider locations at the same time, and end users will be routed to the closest available location. 
+ +### Specification + +- NVIDIA T4 GPU + +### Access the web-ui using the deployment link, found with [CLI](#CLI) or [UI](#UI) + +Documentation and examples of how to use the ollama open-webui are available here: +https://github.com/open-webui/open-webui + +#### CLI + +1. Run the command below to get the deployment link (replacing gvc and workload as needed) + +```bash +cpln workload get {workload-name} --gvc {gvc-name} -o json | jq -r '.status.endpoint' +``` + +#### UI + +1. Navigate to the generated workload and click `Open` next to the workload name \ No newline at end of file diff --git a/ollama/versions/1.0.0/templates/_helpers.tpl b/ollama/versions/1.0.0/templates/_helpers.tpl new file mode 100644 index 00000000..0e796eca --- /dev/null +++ b/ollama/versions/1.0.0/templates/_helpers.tpl @@ -0,0 +1,6 @@ +{{/* +Ollama Name +*/}} +{{- define "ollama.name" -}} +{{- printf "%s" .Release.Name }} +{{- end }} \ No newline at end of file diff --git a/ollama/versions/1.0.0/templates/identity.yaml b/ollama/versions/1.0.0/templates/identity.yaml new file mode 100644 index 00000000..5ab990ca --- /dev/null +++ b/ollama/versions/1.0.0/templates/identity.yaml @@ -0,0 +1,4 @@ +kind: identity +gvc: {{ .Values.global.cpln.gvc }} +name: {{ include "ollama.name" . }} +description: Auto-managed identity for the workload {{ include "ollama.name" . }} \ No newline at end of file diff --git a/ollama/versions/1.0.0/templates/policy.yaml b/ollama/versions/1.0.0/templates/policy.yaml new file mode 100644 index 00000000..b2cbe7ff --- /dev/null +++ b/ollama/versions/1.0.0/templates/policy.yaml @@ -0,0 +1,11 @@ +kind: policy +name: {{ include "ollama.name" . }} +description: Gives access to the ollama entrypoint +bindings: + - permissions: + - reveal + principalLinks: + - //gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "ollama.name" . }} +targetKind: secret +targetLinks: + - //secret/{{ include "ollama.name" . 
}} \ No newline at end of file diff --git a/ollama/versions/1.0.0/templates/secret.yaml b/ollama/versions/1.0.0/templates/secret.yaml new file mode 100644 index 00000000..cd8fefdd --- /dev/null +++ b/ollama/versions/1.0.0/templates/secret.yaml @@ -0,0 +1,8 @@ +kind: secret +name: {{ include "ollama.name" . }} +description: The entrypoint for the ollama container +type: opaque +data: + encoding: plain + payload: | +{{ .Values.entrypoint.payload | indent 4}} \ No newline at end of file diff --git a/ollama/versions/1.0.0/templates/volumeset.yaml b/ollama/versions/1.0.0/templates/volumeset.yaml new file mode 100644 index 00000000..a6b786d3 --- /dev/null +++ b/ollama/versions/1.0.0/templates/volumeset.yaml @@ -0,0 +1,10 @@ +kind: volumeset +gvc: {{ .Values.global.cpln.gvc }} +name: {{ include "ollama.name" . }} +spec: + fileSystemType: ext4 + initialCapacity: {{ .Values.volumeset.initialCapacity }} + performanceClass: {{ .Values.volumeset.performanceClass }} + snapshots: + createFinalSnapshot: true + retentionDuration: {{ .Values.volumeset.snapshots.retentionDuration }} \ No newline at end of file diff --git a/ollama/versions/1.0.0/templates/workload.yaml b/ollama/versions/1.0.0/templates/workload.yaml new file mode 100644 index 00000000..fe1a0e3a --- /dev/null +++ b/ollama/versions/1.0.0/templates/workload.yaml @@ -0,0 +1,106 @@ +kind: workload +gvc: {{ .Values.global.cpln.gvc }} +name: {{ include "ollama.name" . 
}} +spec: + type: stateful + containers: + - name: {{ .Values.workload.containers.ui.name }} + cpu: {{ .Values.workload.containers.ui.resources.cpu }} + env: + - name: DEFAULT_MODELS + value: {{ .Values.defaultModel }} + - name: OLLAMA_BASE_URL + value: http://localhost:{{ .Values.workload.containers.api.port }} + image: {{ .Values.workload.containers.ui.image }} + inheritEnv: false + memory: {{ .Values.workload.containers.ui.resources.memory }} + ports: + - number: {{ .Values.workload.containers.ui.port }} + protocol: http + readinessProbe: + failureThreshold: 3 + httpGet: + httpHeaders: [] + path: / + port: {{ .Values.workload.containers.ui.port }} + scheme: HTTP + initialDelaySeconds: 0 + periodSeconds: 10 + successThreshold: 1 + timeoutSeconds: 1 + livenessProbe: + failureThreshold: 3 + httpGet: + httpHeaders: [] + path: / + port: {{ .Values.workload.containers.ui.port }} + scheme: HTTP + initialDelaySeconds: 120 + periodSeconds: 10 + successThreshold: 1 + timeoutSeconds: 1 + volumes: + - path: /app/backend/data + recoveryPolicy: retain + uri: cpln://volumeset/{{ include "ollama.name" . 
 }} + - name: {{ .Values.workload.containers.api.name }} + args: + - '-c' + - /startup/entrypoint.sh + command: bash + cpu: {{ .Values.workload.containers.api.resources.cpu }} + {{- if .Values.workload.containers.api.gpu }} + gpu: + {{- toYaml .Values.workload.containers.api.gpu | nindent 8 }} + {{- end }} + image: {{ .Values.workload.containers.api.image }} + inheritEnv: false + livenessProbe: + failureThreshold: 5 + initialDelaySeconds: 180 + periodSeconds: 30 + successThreshold: 1 + tcpSocket: + port: {{ .Values.workload.containers.api.port }} + timeoutSeconds: 1 + memory: {{ .Values.workload.containers.api.resources.memory }} + ports: + - number: {{ .Values.workload.containers.api.port }} + protocol: http + readinessProbe: + failureThreshold: 6 + initialDelaySeconds: 10 + periodSeconds: 10 + successThreshold: 1 + tcpSocket: + port: {{ .Values.workload.containers.api.port }} + timeoutSeconds: 1 + volumes: + - path: /root/.ollama + recoveryPolicy: retain + uri: 'cpln://volumeset/{{ include "ollama.name" . }}' + - path: /startup/entrypoint.sh + recoveryPolicy: retain + uri: 'cpln://secret/{{ include "ollama.name" . }}' + defaultOptions: + autoscaling: + maxConcurrency: 0 + maxScale: 1 + metric: cpu + minScale: 1 + scaleToZeroDelay: 300 + target: 100 + capacityAI: false + timeoutSeconds: 600 + firewallConfig: + external: + inboundAllowCIDR: + - 0.0.0.0/0 + outboundAllowCIDR: + - 0.0.0.0/0 + internal: + inboundAllowType: {{ .Values.internal_access.type }} + {{- if .Values.internal_access.workloads }} + inboundAllowWorkload: {{ .Values.internal_access.workloads | toYaml | nindent 8 }} + {{- end }} + identityLink: //gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "ollama.name" . }} \ No newline at end of file diff --git a/ollama/versions/1.0.0/values.yaml b/ollama/versions/1.0.0/values.yaml new file mode 100644 index 00000000..774a7908 --- /dev/null +++ b/ollama/versions/1.0.0/values.yaml @@ -0,0 +1,56 @@ +# Alternatives: llava, gemma, mistral, etc. 
+defaultModel: llama3 + +workload: + containers: + ui: + name: ollama-ui + image: ghcr.io/open-webui/open-webui:main + port: 8080 + resources: + cpu: 500m + memory: 1Gi + api: + name: ollama + image: ollama/ollama + port: 11434 + resources: + cpu: 6000m + memory: 7Gi + gpu: + nvidia: + model: t4 + quantity: 1 + +volumeset: + initialCapacity: 10 + performanceClass: general-purpose-ssd + snapshots: + retentionDuration: 7d + +internal_access: + type: none # options: none, same-gvc, same-org, workload-list + workloads: # Note: can only be used if type is same-gvc or workload-list + #- //gvc/GVC_NAME/workload/WORKLOAD_NAME + #- //gvc/GVC_NAME/workload/WORKLOAD_NAME + +entrypoint: + payload: | + #!/bin/bash + # Define the model directory + MODEL_DIR="/root/.ollama/models/manifests/registry.ollama.ai/library/$DEFAULT_MODELS/" + # Start ollama serve in the background + /bin/ollama serve & + # Check if the model directory exists + if [ ! -d "$MODEL_DIR" ]; then + echo "Model directory not found. Pulling the $DEFAULT_MODELS model..." + # Pull the $DEFAULT_MODELS model using the Ollama API (double quotes so the shell expands the variable) + apt-get update && apt-get install curl -y + curl http://localhost:11434/api/pull -d "{ + \"name\": \"$DEFAULT_MODELS\" + }" + else + echo "Model directory exists. No action required." 
+ fi + # Keep the script running + while true; do sleep 86400; done diff --git a/ollama/versions/1.1.0/Chart.yaml b/ollama/versions/1.1.0/Chart.yaml new file mode 100644 index 00000000..23f6d49f --- /dev/null +++ b/ollama/versions/1.1.0/Chart.yaml @@ -0,0 +1,18 @@ +apiVersion: v2 +name: ollama +description: An Ollama app for Control Plane + +type: application +version: 1.1.0 +appVersion: "0.4" + +dependencies: + - name: cpln-common + version: 1.0.0 + repository: "oci://ghcr.io/controlplane-com/templates" + +annotations: + created: "2024-12-03" + lastModified: "2026-04-27" + category: "llm" + createsGvc: false \ No newline at end of file diff --git a/ollama/versions/1.1.0/README.md b/ollama/versions/1.1.0/README.md new file mode 100644 index 00000000..068c7e77 --- /dev/null +++ b/ollama/versions/1.1.0/README.md @@ -0,0 +1,93 @@ +## Ollama + +### Warning + +You will need to request a quota increase for CPU and memory if your org is at the default quotas. GPU resources require explicit enablement — contact Control Plane support if you do not have access. + +### Overview + +Deploys [Ollama](https://github.com/ollama/ollama) as a stateful workload with the [Open WebUI](https://github.com/open-webui/open-webui) as a sidecar. The WebUI runs on port 8080 and is the externally exposed interface. The Ollama API runs on port 11434 and is accessed internally by the WebUI. On first startup, a script downloads the configured default model if it is not already present on the volume. + +On Control Plane, GPUs are available across multiple cloud provider locations. You can deploy this template to several regions simultaneously and end users will be routed to the closest available instance. + +### Configuration + +**Default model** — set the model to pull on first startup. 
Any model available in the [Ollama library](https://ollama.com/library) can be used: +```yaml +defaultModel: llama3 +``` +Common alternatives: `llava`, `gemma`, `mistral`, `phi3` + +**UI container** — configure the Open WebUI image and resources: +```yaml +workload: + containers: + ui: + image: ghcr.io/open-webui/open-webui:main + resources: + cpu: 500m + memory: 1Gi +``` + +**API container** — configure the Ollama image, resources, and GPU: +```yaml +workload: + containers: + api: + image: ollama/ollama + resources: + cpu: 6 + memory: 7Gi + gpu: + nvidia: + model: t4 + quantity: 1 +``` + +**Volume** — persistent storage for downloaded models. Default is 10 GiB. Optionally enable autoscaling to expand as models accumulate: +```yaml +volumeset: + initialCapacity: 10 + autoscaling: + enabled: true + maxCapacity: 100 + minFreePercentage: 10 + scalingFactor: 1.2 +``` + +**Firewall** — restrict inbound and outbound access. Defaults to open: +```yaml +firewall: + external: + inboundAllowCIDR: + - 0.0.0.0/0 + outboundAllowCIDR: + - 0.0.0.0/0 +``` + +**Internal access** — controls which workloads can reach Ollama internally: +```yaml +internal_access: + type: same-gvc # options: none, same-gvc, same-org, workload-list + workloads: + - //gvc/my-gvc/workload/my-app +``` + +### Connecting + +Once deployed, access the Open WebUI through the Control Plane endpoint: + +``` +https://RELEASE_NAME-ollama.GVC_NAME.cpln.app +``` + +The Ollama API is also available internally to other workloads in the same GVC: + +``` +http://RELEASE_NAME-ollama.GVC_NAME.cpln.local:11434 +``` + +### Supported External Services +- [Ollama Documentation](https://github.com/ollama/ollama) +- [Open WebUI Documentation](https://github.com/open-webui/open-webui) +- [Ollama Model Library](https://ollama.com/library) \ No newline at end of file diff --git a/ollama/versions/1.1.0/templates/_helpers.tpl b/ollama/versions/1.1.0/templates/_helpers.tpl new file mode 100644 index 00000000..9eba200c --- /dev/null 
+++ b/ollama/versions/1.1.0/templates/_helpers.tpl @@ -0,0 +1,46 @@ +{{/* Resource Naming */}} + +{{/* +Ollama Workload Name +*/}} +{{- define "ollama.name" -}} +{{- printf "%s-ollama" .Release.Name }} +{{- end }} + +{{/* +Ollama Secret Entrypoint Name +*/}} +{{- define "ollama.secret.name" -}} +{{- printf "%s-ollama-secret" .Release.Name }} +{{- end }} + +{{/* +Ollama Identity Name +*/}} +{{- define "ollama.identity.name" -}} +{{- printf "%s-ollama-identity" .Release.Name }} +{{- end }} + +{{/* +Ollama Policy Name +*/}} +{{- define "ollama.policy.name" -}} +{{- printf "%s-ollama-policy" .Release.Name }} +{{- end }} + +{{/* +Ollama Volume Set Name +*/}} +{{- define "ollama.volume.name" -}} +{{- printf "%s-ollama-vs" .Release.Name }} +{{- end }} + + +{{/* Labeling */}} + +{{/* +Common labels +*/}} +{{- define "ollama.tags" -}} +{{- include "cpln-common.tags" . }} +{{- end }} \ No newline at end of file diff --git a/ollama/versions/1.1.0/templates/identity.yaml b/ollama/versions/1.1.0/templates/identity.yaml new file mode 100644 index 00000000..cb71433f --- /dev/null +++ b/ollama/versions/1.1.0/templates/identity.yaml @@ -0,0 +1,5 @@ +kind: identity +gvc: {{ .Values.global.cpln.gvc }} +name: {{ include "ollama.identity.name" . }} +description: Ollama identity +tags: {{- include "ollama.tags" . | nindent 4 }} \ No newline at end of file diff --git a/ollama/versions/1.1.0/templates/policy.yaml b/ollama/versions/1.1.0/templates/policy.yaml new file mode 100644 index 00000000..c466374d --- /dev/null +++ b/ollama/versions/1.1.0/templates/policy.yaml @@ -0,0 +1,12 @@ +kind: policy +name: {{ include "ollama.policy.name" . }} +description: Ollama policy +tags: {{- include "ollama.tags" . | nindent 4 }} +bindings: + - permissions: + - reveal + principalLinks: + - //gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "ollama.identity.name" . }} +targetKind: secret +targetLinks: + - //secret/{{ include "ollama.secret.name" . 
}} \ No newline at end of file diff --git a/ollama/versions/1.1.0/templates/secret.yaml b/ollama/versions/1.1.0/templates/secret.yaml new file mode 100644 index 00000000..874803f1 --- /dev/null +++ b/ollama/versions/1.1.0/templates/secret.yaml @@ -0,0 +1,9 @@ +kind: secret +name: {{ include "ollama.secret.name" . }} +description: Ollama entrypoint secret +tags: {{- include "ollama.tags" . | nindent 4 }} +type: opaque +data: + encoding: plain + payload: | +{{ .Values.entrypoint.payload | indent 4}} \ No newline at end of file diff --git a/ollama/versions/1.1.0/templates/volumeset.yaml b/ollama/versions/1.1.0/templates/volumeset.yaml new file mode 100644 index 00000000..91835fb8 --- /dev/null +++ b/ollama/versions/1.1.0/templates/volumeset.yaml @@ -0,0 +1,16 @@ +kind: volumeset +gvc: {{ .Values.global.cpln.gvc }} +name: {{ include "ollama.volume.name" . }} +spec: + fileSystemType: ext4 + initialCapacity: {{ .Values.volumeset.initialCapacity }} + {{- if .Values.volumeset.autoscaling.enabled }} + autoscaling: + maxCapacity: {{ .Values.volumeset.autoscaling.maxCapacity }} + minFreePercentage: {{ .Values.volumeset.autoscaling.minFreePercentage }} + scalingFactor: {{ .Values.volumeset.autoscaling.scalingFactor }} + {{- end }} + performanceClass: {{ .Values.volumeset.performanceClass }} + snapshots: + createFinalSnapshot: true + retentionDuration: {{ .Values.volumeset.snapshots.retentionDuration }} \ No newline at end of file diff --git a/ollama/versions/1.1.0/templates/workload.yaml b/ollama/versions/1.1.0/templates/workload.yaml new file mode 100644 index 00000000..d6bc88f0 --- /dev/null +++ b/ollama/versions/1.1.0/templates/workload.yaml @@ -0,0 +1,104 @@ +kind: workload +gvc: {{ .Values.global.cpln.gvc }} +name: {{ include "ollama.name" . 
}} +spec: + type: stateful + containers: + - name: {{ .Values.workload.containers.ui.name }} + cpu: {{ .Values.workload.containers.ui.resources.cpu | quote }} + env: + - name: DEFAULT_MODELS + value: {{ .Values.defaultModel }} + - name: OLLAMA_BASE_URL + value: http://localhost:{{ .Values.workload.containers.api.port }} + image: {{ .Values.workload.containers.ui.image }} + inheritEnv: false + memory: {{ .Values.workload.containers.ui.resources.memory | quote }} + ports: + - number: {{ .Values.workload.containers.ui.port }} + protocol: http + readinessProbe: + failureThreshold: 3 + httpGet: + httpHeaders: [] + path: / + port: {{ .Values.workload.containers.ui.port }} + scheme: HTTP + initialDelaySeconds: 0 + periodSeconds: 10 + successThreshold: 1 + timeoutSeconds: 1 + livenessProbe: + failureThreshold: 3 + httpGet: + httpHeaders: [] + path: / + port: {{ .Values.workload.containers.ui.port }} + scheme: HTTP + initialDelaySeconds: 120 + periodSeconds: 10 + successThreshold: 1 + timeoutSeconds: 1 + volumes: + - path: /app/backend/data + recoveryPolicy: retain + uri: cpln://volumeset/{{ include "ollama.volume.name" . 
}} + - name: {{ .Values.workload.containers.api.name }} + args: + - '-c' + - /startup/entrypoint.sh + command: bash + cpu: {{ .Values.workload.containers.api.resources.cpu | quote }} + {{- if .Values.workload.containers.api.gpu }} + gpu: + {{- toYaml .Values.workload.containers.api.gpu | nindent 8 }} + {{- end }} + image: {{ .Values.workload.containers.api.image }} + inheritEnv: false + livenessProbe: + failureThreshold: 5 + initialDelaySeconds: 180 + periodSeconds: 30 + successThreshold: 1 + tcpSocket: + port: {{ .Values.workload.containers.api.port }} + timeoutSeconds: 1 + memory: {{ .Values.workload.containers.api.resources.memory | quote }} + ports: + - number: {{ .Values.workload.containers.api.port }} + protocol: http + readinessProbe: + failureThreshold: 6 + initialDelaySeconds: 10 + periodSeconds: 10 + successThreshold: 1 + tcpSocket: + port: {{ .Values.workload.containers.api.port }} + timeoutSeconds: 1 + volumes: + - path: /root/.ollama + recoveryPolicy: retain + uri: 'cpln://volumeset/{{ include "ollama.volume.name" . }}' + - path: /startup/entrypoint.sh + recoveryPolicy: retain + uri: 'cpln://secret/{{ include "ollama.secret.name" . 
 }}' + defaultOptions: + autoscaling: + maxConcurrency: 0 + maxScale: 1 + metric: cpu + minScale: 1 + scaleToZeroDelay: 300 + target: 100 + capacityAI: false + timeoutSeconds: 600 + firewallConfig: + external: + inboundAllowCIDR: {{ if .Values.firewall.external.inboundAllowCIDR }}{{ toYaml .Values.firewall.external.inboundAllowCIDR | nindent 8 }}{{ else }}[]{{ end }} + outboundAllowCIDR: {{ if .Values.firewall.external.outboundAllowCIDR }}{{ toYaml .Values.firewall.external.outboundAllowCIDR | nindent 8 }}{{ else }}[]{{ end }} + internal: + inboundAllowType: {{ .Values.internal_access.type }} + {{- if .Values.internal_access.workloads }} + inboundAllowWorkload: {{ .Values.internal_access.workloads | toYaml | nindent 8 }} + {{- end }} + identityLink: //gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "ollama.identity.name" . }} \ No newline at end of file diff --git a/ollama/versions/1.1.0/values.yaml b/ollama/versions/1.1.0/values.yaml new file mode 100644 index 00000000..780bc807 --- /dev/null +++ b/ollama/versions/1.1.0/values.yaml @@ -0,0 +1,67 @@ +# Alternatives: llava, gemma, mistral, etc. 
+defaultModel: llama3 + +workload: + containers: + ui: + name: ollama-ui + image: ghcr.io/open-webui/open-webui:main + port: 8080 + resources: + cpu: 500m + memory: 1Gi + api: + name: ollama + image: ollama/ollama + port: 11434 + resources: + cpu: 6 + memory: 8Gi + gpu: + nvidia: + model: t4 + quantity: 1 + +volumeset: + initialCapacity: 10 + autoscaling: + enabled: false # Set to true to enable autoscaling + maxCapacity: 100 # Maximum capacity in GiB when autoscaling is enabled + minFreePercentage: 10 # Free-space percentage below which the volume is scaled up + scalingFactor: 1.2 # Multiplier applied to the current capacity when autoscaling is triggered + performanceClass: general-purpose-ssd + snapshots: + retentionDuration: 7d + +firewall: + external: # Change to restrict access to the workload if needed + inboundAllowCIDR: + - 0.0.0.0/0 + outboundAllowCIDR: + - 0.0.0.0/0 +internal_access: + type: same-gvc # options: none, same-gvc, same-org, workload-list + workloads: # Note: can only be used if type is same-gvc or workload-list + #- //gvc/GVC_NAME/workload/WORKLOAD_NAME + #- //gvc/GVC_NAME/workload/WORKLOAD_NAME + +entrypoint: + payload: | + #!/bin/bash + # Define the model directory + MODEL_DIR="/root/.ollama/models/manifests/registry.ollama.ai/library/$DEFAULT_MODELS/" + # Start ollama serve in the background + /bin/ollama serve & + # Check if the model directory exists + if [ ! -d "$MODEL_DIR" ]; then + echo "Model directory not found. Pulling the $DEFAULT_MODELS model..." + # Pull the $DEFAULT_MODELS model using the Ollama API (double quotes so the shell expands the variable) + apt-get update && apt-get install curl -y + curl http://localhost:11434/api/pull -d "{ + \"name\": \"$DEFAULT_MODELS\" + }" + else + echo "Model directory exists. No action required." 
+ fi + # Keep the script running + while true; do sleep 86400; done \ No newline at end of file diff --git a/postgres-highly-available/versions/2.2.0/charts/etcd-1.4.0.tgz b/postgres-highly-available/versions/2.2.0/charts/etcd-1.4.0.tgz new file mode 100644 index 00000000..cdc97d44 Binary files /dev/null and b/postgres-highly-available/versions/2.2.0/charts/etcd-1.4.0.tgz differ diff --git a/postgres-highly-available/versions/2.2.0/templates/_helpers.tpl b/postgres-highly-available/versions/2.2.0/templates/_helpers.tpl index 08b97cfa..14c31511 100644 --- a/postgres-highly-available/versions/2.2.0/templates/_helpers.tpl +++ b/postgres-highly-available/versions/2.2.0/templates/_helpers.tpl @@ -130,6 +130,18 @@ Validate backup configuration - when backup is enabled, backup.provider must be {{- end }} +{{/* +Validate multi-location configuration +*/}} +{{- define "pg-ha.validateLocations" -}} +{{- if .Values.global.locations }} + {{- if lt (len .Values.global.locations) 3 }} + {{- fail "at least 3 locations are required for multi-DC deployment (Patroni recommendation)" }} + {{- end }} +{{- end }} +{{- end }} + + {{/* Labeling */}} {{/* diff --git a/postgres-highly-available/versions/2.2.0/templates/secret-ha-proxy.yaml b/postgres-highly-available/versions/2.2.0/templates/secret-ha-proxy.yaml index f0207abe..a03639bd 100644 --- a/postgres-highly-available/versions/2.2.0/templates/secret-ha-proxy.yaml +++ b/postgres-highly-available/versions/2.2.0/templates/secret-ha-proxy.yaml @@ -29,15 +29,32 @@ data: CFG="/tmp/haproxy.cfg" SERVERS="" + {{- if .Values.global.locations }} + # Multi-location mode: 1 replica per location, check all locations + {{- range $idx, $loc := .Values.global.locations }} + HOST="replica-0.${WORKLOAD}.{{ $loc }}.${GVC}.cpln.local" + SERVERS="${SERVERS} + server pg-{{ $loc }} ${HOST}:5432 check resolvers cpln_dns init-addr last,libc,none" + {{- end }} + {{- else }} + # Single-location mode: all replicas in local location i=0 while [ "${i}" -lt 
"${REPLICAS}" ]; do HOST="replica-${i}.${WORKLOAD}.${LOCATION}.${GVC}.cpln.local" SERVERS="${SERVERS} - server pg${i} ${HOST}:5432 check" + server pg${i} ${HOST}:5432 check resolvers cpln_dns init-addr last,libc,none" i=$((i + 1)) done + {{- end }} cat > "${CFG}" <> "$CONFIG_FILE" < +``` +5. Re-point the postgres workload to the restored volume set and restart the workload. +6. **After restore**: Change the WAL-G prefix before re-enabling backups to avoid system identifier conflicts. + +## Supported External Services + +- [Patroni Documentation](https://patroni.readthedocs.io/) +- [Postgres Doccumentation](https://www.postgresql.org/docs/) +- [etcd Documentation](https://etcd.io/docs/v3.6/) \ No newline at end of file diff --git a/postgres-highly-available/versions/2.3.1/templates/_helpers.tpl b/postgres-highly-available/versions/2.3.1/templates/_helpers.tpl new file mode 100644 index 00000000..d9532a5d --- /dev/null +++ b/postgres-highly-available/versions/2.3.1/templates/_helpers.tpl @@ -0,0 +1,140 @@ +{{/* Resource Naming */}} + +{{/* +Postgres HA Workload Name +*/}} +{{- define "pg-ha.name" -}} +{{- printf "%s-postgres-ha" .Release.Name }} +{{- end }} + +{{/* +Postgres HA etcd Workload Name +*/}} +{{- define "pg-ha.etcd.name" -}} +{{- printf "%s-etcd" .Release.Name }} +{{- end }} + +{{/* +Postgres HA Proxy Workload Name +*/}} +{{- define "pg-ha.proxy.name" -}} +{{- printf "%s-postgres-ha-proxy" .Release.Name }} +{{- end }} + +{{/* +Postgres HA Workload Logical Backup Name +*/}} +{{- define "pg-ha.backup.name" -}} +{{- printf "%s-postgres-ha-backup" .Release.Name }} +{{- end }} + +{{/* +Postgres HA Secret Database Config Name +*/}} +{{- define "pg-ha.secretDatabase.name" -}} +{{- printf "%s-postgres-config" .Release.Name }} +{{- end }} + +{{/* +Postgres HA Secret Startup Name +*/}} +{{- define "pg-ha.secretStartup.name" -}} +{{- printf "%s-postgres-proxy-startup" .Release.Name }} +{{- end }} + +{{/* +Postgres HA Secret Proxy Startup Name +*/}} +{{- define 
"pg-ha.secretProxyStartup.name" -}} +{{- printf "%s-patroni-startup" .Release.Name }} +{{- end }} + +{{/* +Postgres HA Secret WAL-G Backup Startup Name +*/}} +{{- define "pg-ha.secretWALGStartup.name" -}} +{{- printf "%s-wal-g-backup-script" .Release.Name }} +{{- end }} + +{{/* +Postgres HA Identity Name +*/}} +{{- define "pg-ha.identity.name" -}} +{{- printf "%s-postgres-ha-identity" .Release.Name }} +{{- end }} + +{{/* +Postgres HA Policy Name +*/}} +{{- define "pg-ha.policy.name" -}} +{{- printf "%s-postgres-ha-policy" .Release.Name }} +{{- end }} + +{{/* +Postgres HA Volume Set Name +*/}} +{{- define "pg-ha.volume.name" -}} +{{- printf "%s-postgres-ha-vs" .Release.Name }} +{{- end }} + +{{/* +PgBouncer Workload Name +*/}} +{{- define "pg-ha.pgbouncer.name" -}} +{{- printf "%s-pgbouncer" .Release.Name }} +{{- end }} + + +{{/* Validation */}} + +{{/* +Validate backup mode - must be "logical" or "wal-g" +*/}} +{{- define "pg-ha.validateBackupMode" -}} +{{- $mode := .Values.backup.mode -}} +{{- if and .Values.backup.enabled (not (or (eq $mode "logical") (eq $mode "wal-g"))) -}} + {{- fail (printf "Invalid backup.mode: '%s'. Must be either 'logical' or 'wal-g'." $mode) -}} +{{- end -}} +{{- end }} + +{{/* +Validate backup configuration - when backup is enabled, backup.provider must be set to 'aws' or 'gcp' +*/}} +{{- define "pg-ha.validateBackupConfig" -}} +{{- include "pg-ha.validateBackupMode" . -}} +{{- if .Values.backup.enabled -}} + {{- $provider := .Values.backup.provider -}} + {{- if not (or (eq $provider "aws") (eq $provider "gcp")) -}} + {{- fail "Invalid backup configuration: backup.provider must be set to 'aws' or 'gcp'." -}} + {{- end -}} + {{- if eq $provider "aws" -}} + {{- if not .Values.backup.aws.bucket -}} + {{- fail "Invalid backup configuration: backup.aws.bucket is required when provider is 'aws'." 
-}} + {{- end -}} + {{- if not .Values.backup.aws.region -}} + {{- fail "Invalid backup configuration: backup.aws.region is required when provider is 'aws'." -}} + {{- end -}} + {{- if not .Values.backup.aws.cloudAccountName -}} + {{- fail "Invalid backup configuration: backup.aws.cloudAccountName is required when provider is 'aws'." -}} + {{- end -}} + {{- end -}} + {{- if eq $provider "gcp" -}} + {{- if not .Values.backup.gcp.bucket -}} + {{- fail "Invalid backup configuration: backup.gcp.bucket is required when provider is 'gcp'." -}} + {{- end -}} + {{- if not .Values.backup.gcp.cloudAccountName -}} + {{- fail "Invalid backup configuration: backup.gcp.cloudAccountName is required when provider is 'gcp'." -}} + {{- end -}} + {{- end -}} +{{- end -}} +{{- end }} + + +{{/* Labeling */}} + +{{/* +Common labels - delegated to cpln-common +*/}} +{{- define "pg-ha.tags" -}} +{{- include "cpln-common.tags" . }} +{{- end }} \ No newline at end of file diff --git a/postgres-highly-available/versions/2.3.1/templates/identity.yaml b/postgres-highly-available/versions/2.3.1/templates/identity.yaml new file mode 100644 index 00000000..838626cf --- /dev/null +++ b/postgres-highly-available/versions/2.3.1/templates/identity.yaml @@ -0,0 +1,25 @@ +{{- include "pg-ha.validateBackupConfig" . -}} +kind: identity +gvc: {{ .Values.global.cpln.gvc }} +name: {{ include "pg-ha.identity.name" . }} +description: Postgres Highly Available identity +tags: + {{- include "pg-ha.tags" . 
| nindent 4 }} +{{- if and .Values.backup.enabled (eq .Values.backup.provider "aws") }} +aws: + cloudAccountLink: //cloudaccount/{{ .Values.backup.aws.cloudAccountName }} + policyRefs: + - cpln-connector + - aws::ReadOnlyAccess + - "{{ .Values.backup.aws.policyName }}" +{{- end }} +{{- if and .Values.backup.enabled (eq .Values.backup.provider "gcp") }} +gcp: + bindings: + - resource: //storage.googleapis.com/projects/_/buckets/{{ .Values.backup.gcp.bucket }} + roles: + - roles/storage.objectAdmin + cloudAccountLink: //cloudaccount/{{ .Values.backup.gcp.cloudAccountName }} + scopes: + - https://www.googleapis.com/auth/cloud-platform +{{- end }} \ No newline at end of file diff --git a/postgres-highly-available/versions/2.3.1/templates/policy.yaml b/postgres-highly-available/versions/2.3.1/templates/policy.yaml new file mode 100644 index 00000000..3fe654f0 --- /dev/null +++ b/postgres-highly-available/versions/2.3.1/templates/policy.yaml @@ -0,0 +1,20 @@ +kind: policy +name: {{ include "pg-ha.policy.name" . }} +description: Postgres Highly Available policy +tags: + {{- include "pg-ha.tags" . | nindent 4 }} +bindings: + - permissions: + - reveal + principalLinks: + - //gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "pg-ha.identity.name" . }} +targetKind: secret +targetLinks: + - //secret/{{ include "pg-ha.secretStartup.name" . }} + - //secret/{{ include "pg-ha.secretDatabase.name" . }} + {{- if .Values.proxy.enabled }} + - //secret/{{ include "pg-ha.secretProxyStartup.name" . }} + {{- end }} + {{- if and .Values.backup.enabled (eq .Values.backup.mode "wal-g") }} + - //secret/{{ include "pg-ha.secretWALGStartup.name" . 
}} + {{- end }} \ No newline at end of file diff --git a/postgres-highly-available/versions/2.3.1/templates/secret-config.yaml b/postgres-highly-available/versions/2.3.1/templates/secret-config.yaml new file mode 100644 index 00000000..f3cd9b45 --- /dev/null +++ b/postgres-highly-available/versions/2.3.1/templates/secret-config.yaml @@ -0,0 +1,17 @@ +kind: secret +name: {{ include "pg-ha.secretDatabase.name" . }} +description: Postgres Highly Available config +tags: + {{- include "pg-ha.tags" . | nindent 4 }} +type: dictionary +data: + username: {{ .Values.postgres.username | quote }} + password: {{ .Values.postgres.password | quote }} + database: {{ .Values.postgres.database | quote }} +{{- if and .Values.backup.enabled (eq .Values.backup.provider "aws") }} + backup-bucket: {{ .Values.backup.aws.bucket | quote }} + aws-region: {{ .Values.backup.aws.region | quote }} +{{- end }} +{{- if and .Values.backup.enabled (eq .Values.backup.provider "gcp") }} + backup-bucket: {{ .Values.backup.gcp.bucket | quote }} +{{- end }} \ No newline at end of file diff --git a/postgres-highly-available/versions/2.3.1/templates/secret-ha-proxy.yaml b/postgres-highly-available/versions/2.3.1/templates/secret-ha-proxy.yaml new file mode 100644 index 00000000..362460e8 --- /dev/null +++ b/postgres-highly-available/versions/2.3.1/templates/secret-ha-proxy.yaml @@ -0,0 +1,91 @@ +{{- if or .Values.proxy.enabled .Values.pgbouncer.enabled }} +kind: secret +name: {{ include "pg-ha.secretProxyStartup.name" . }} +description: HAProxy startup script for Postgres Highly Available +tags: + {{- include "pg-ha.tags" . | nindent 4 }} +type: opaque +data: + encoding: plain + payload: |- + #!/usr/bin/env sh + set -eu + + LOCATION="$(basename "${CPLN_LOCATION:-}")" + if [ -z "${LOCATION}" ]; then + echo "ERROR: CPLN_LOCATION is empty; cannot derive location" + exit 1 + fi + + GVC="{{ .Values.global.cpln.gvc }}" + WORKLOAD="{{ include "pg-ha.name" . 
}}" + REPLICAS="{{ .Values.replicas }}" + + echo "Starting HAProxy Patroni leader proxy" + echo "Derived location: ${LOCATION}" + echo "Target workload: ${WORKLOAD}" + echo "Replicas: ${REPLICAS}" + + CFG="/tmp/haproxy.cfg" + + SERVERS="" + i=0 + while [ "${i}" -lt "${REPLICAS}" ]; do + HOST="replica-${i}.${WORKLOAD}.${LOCATION}.${GVC}.cpln.local" + SERVERS="${SERVERS} + server pg${i} ${HOST}:5432 check port 8008" + i=$((i + 1)) + done + + cat > "${CFG}" < "$CONFIG_FILE" <> "$CONFIG_FILE" < + sh -lc 'psql -h 127.0.0.1 -p 5432 -U {{ .Values.postgres.username }} -d postgres -c "CREATE DATABASE {{ .Values.postgres.database }}"' + EOF + else + echo "PGDATA exists — starting Patroni normally" + fi + + # --- drop privileges and start Patroni --- + echo "Dropping privileges to postgres user..." + exec gosu postgres patroni "$CONFIG_FILE" \ No newline at end of file diff --git a/postgres-highly-available/versions/2.3.1/templates/secret-wal-g-script.yaml b/postgres-highly-available/versions/2.3.1/templates/secret-wal-g-script.yaml new file mode 100644 index 00000000..88ebe64a --- /dev/null +++ b/postgres-highly-available/versions/2.3.1/templates/secret-wal-g-script.yaml @@ -0,0 +1,39 @@ +kind: secret +name: {{ include "pg-ha.secretWALGStartup.name" . }} +description: WAL-G base backup sidecar script +type: opaque +data: + encoding: plain + payload: |- + #!/usr/bin/env bash + set -euo pipefail + + : "${PGDATA:?Missing PGDATA}" + : "${PATRONI_API:=http://127.0.0.1:8008}" + : "${WALG_BACKUP_INTERVAL_SECONDS:=3600}" + + echo "WAL-G base backup sidecar started" + echo "PGDATA=${PGDATA}" + echo "PATRONI_API=${PATRONI_API}" + echo "Interval=${WALG_BACKUP_INTERVAL_SECONDS}s" + + # small jitter so fleets don't backup at the same second + sleep $(( RANDOM % 30 )) + + while true; do + code="$(curl -s -o /dev/null -w "%{http_code}" "${PATRONI_API}/primary" || true)" + + if [ "${code}" = "200" ]; then + echo "Leader confirmed. 
Running wal-g backup-push ${PGDATA}" + wal-g backup-push "${PGDATA}" + echo "Backup complete" + + # Optional retention (uncomment if you want it) + # : "${WALG_RETAIN_FULL:=7}" + # wal-g delete retain FULL "${WALG_RETAIN_FULL}" --confirm || true + else + echo "Not leader (/primary=${code}). Skipping backup." + fi + + sleep "${WALG_BACKUP_INTERVAL_SECONDS}" + done \ No newline at end of file diff --git a/postgres-highly-available/versions/2.3.1/templates/volumeset.yaml b/postgres-highly-available/versions/2.3.1/templates/volumeset.yaml new file mode 100644 index 00000000..6a39ba8c --- /dev/null +++ b/postgres-highly-available/versions/2.3.1/templates/volumeset.yaml @@ -0,0 +1,20 @@ +kind: volumeset +name: {{ include "pg-ha.volume.name" . }} +description: Postgres Highly Available volumeset +gvc: {{ .Values.global.cpln.gvc }} +tags: + {{- include "pg-ha.tags" . | nindent 2 }} + workload: {{ include "pg-ha.name" . }} +spec: + fileSystemType: ext4 + initialCapacity: {{ .Values.volumeset.capacity }} + {{- if .Values.volumeset.autoscaling.enabled }} + autoscaling: + maxCapacity: {{ .Values.volumeset.autoscaling.maxCapacity }} + minFreePercentage: {{ .Values.volumeset.autoscaling.minFreePercentage }} + scalingFactor: {{ .Values.volumeset.autoscaling.scalingFactor }} + {{- end }} + performanceClass: general-purpose-ssd + snapshots: + createFinalSnapshot: true + retentionDuration: 7d \ No newline at end of file diff --git a/postgres-highly-available/versions/2.3.1/templates/workload-ha-proxy.yaml b/postgres-highly-available/versions/2.3.1/templates/workload-ha-proxy.yaml new file mode 100644 index 00000000..4ed0855f --- /dev/null +++ b/postgres-highly-available/versions/2.3.1/templates/workload-ha-proxy.yaml @@ -0,0 +1,56 @@ +{{- if or .Values.proxy.enabled .Values.pgbouncer.enabled }} +kind: workload +name: {{ include "pg-ha.proxy.name" . }} +description: HAProxy leader-only endpoint for Patroni Postgres +tags: + {{- include "pg-ha.tags" . 
| nindent 4 }} +spec: + type: standard + containers: + - name: haproxy + image: {{ .Values.proxy.image }} + inheritEnv: false + command: /bin/sh + args: + - /proxy/start.sh + cpu: {{ .Values.proxy.resources.cpu }} + memory: {{ .Values.proxy.resources.memory }} + ports: + - number: 5432 + protocol: tcp + - number: 8404 + protocol: http + - number: 8405 + protocol: http + volumes: + - path: /proxy/start.sh + uri: cpln://secret/{{ include "pg-ha.secretProxyStartup.name" . }}.payload + identityLink: //gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "pg-ha.identity.name" . }} + defaultOptions: + autoscaling: + metric: rps + minScale: {{ .Values.proxy.minReplicas }} + maxScale: {{ .Values.proxy.maxReplicas }} + maxConcurrency: 0 + scaleToZeroDelay: 300 + target: 100 + capacityAI: false + debug: false + multiZone: + enabled: {{ .Values.multiZone }} + timeoutSeconds: 5 + firewallConfig: + internal: + inboundAllowType: {{ .Values.internal_access.type }} + {{- if .Values.internal_access.workloads }} + inboundAllowWorkload: {{ .Values.internal_access.workloads | toYaml | nindent 8 }} + {{- end }} + external: + inboundAllowCIDR: [] + outboundAllowCIDR: [] + loadBalancer: + direct: + enabled: false + ports: [] + replicaDirect: false +{{- end }} \ No newline at end of file diff --git a/postgres-highly-available/versions/2.3.1/templates/workload-logical-backup.yaml b/postgres-highly-available/versions/2.3.1/templates/workload-logical-backup.yaml new file mode 100644 index 00000000..a9897b60 --- /dev/null +++ b/postgres-highly-available/versions/2.3.1/templates/workload-logical-backup.yaml @@ -0,0 +1,74 @@ +{{- if and .Values.backup.enabled (eq .Values.backup.mode "logical") }} +kind: workload +name: {{ include "pg-ha.backup.name" . }} +description: Scheduled logical backup +tags: + {{- include "pg-ha.tags" . 
| nindent 2 }} +spec: + type: cron + containers: + - name: backup + image: {{ .Values.backup.logical.image }} + cpu: {{ .Values.backup.resources.cpu | quote }} + memory: {{ .Values.backup.resources.memory | quote }} + inheritEnv: false + env: + - name: PG_HOST + value: {{ include "pg-ha.proxy.name" . }}.{{ .Values.global.cpln.gvc }}.cpln.local + - name: PG_PORT + value: '5432' + - name: PG_USER + value: cpln://secret/{{ include "pg-ha.secretDatabase.name" . }}.username + - name: PG_PASSWORD + value: cpln://secret/{{ include "pg-ha.secretDatabase.name" . }}.password + - name: BACKUP_BUCKET + value: cpln://secret/{{ include "pg-ha.secretDatabase.name" . }}.backup-bucket + {{- if eq .Values.backup.provider "aws" }} + - name: AWS_REGION + value: cpln://secret/{{ include "pg-ha.secretDatabase.name" . }}.aws-region + - name: BACKUP_PROVIDER + value: aws + - name: BACKUP_PREFIX + value: {{ .Values.backup.aws.prefix }} + {{- end }} + {{- if eq .Values.backup.provider "gcp" }} + - name: BACKUP_PROVIDER + value: gcp + - name: BACKUP_PREFIX + value: {{ .Values.backup.gcp.prefix }} + {{- end }} + identityLink: //identity/{{ include "pg-ha.identity.name" . 
}} + defaultOptions: + autoscaling: + maxConcurrency: 0 + maxScale: 1 + metric: disabled + minScale: 1 + scaleToZeroDelay: 300 + target: 95 + capacityAI: false + debug: false + multiZone: + enabled: false + firewallConfig: + external: + inboundAllowCIDR: [] + inboundBlockedCIDR: [] + outboundAllowPort: + - number: 443 + protocol: tcp + {{- if eq .Values.backup.provider "aws" }} + outboundAllowHostname: + - s3.{{ .Values.backup.aws.region }}.amazonaws.com + - s3.amazonaws.com + {{- end }} + {{- if eq .Values.backup.provider "gcp" }} + outboundAllowHostname: + - storage.googleapis.com + {{- end }} + job: + schedule: {{ .Values.backup.logical.schedule | quote }} + concurrencyPolicy: Forbid + restartPolicy: Never + historyLimit: 5 +{{- end }} \ No newline at end of file diff --git a/postgres-highly-available/versions/2.3.1/templates/workload-patroni-postgres.yaml b/postgres-highly-available/versions/2.3.1/templates/workload-patroni-postgres.yaml new file mode 100644 index 00000000..49a7b568 --- /dev/null +++ b/postgres-highly-available/versions/2.3.1/templates/workload-patroni-postgres.yaml @@ -0,0 +1,179 @@ +kind: workload +name: {{ include "pg-ha.name" . }} +description: Postgres Highly Available +tags: + {{- include "pg-ha.tags" . 
| nindent 4 }} +spec: + type: stateful + containers: + - name: patroni-postgres + command: "/bin/bash" + args: + - "/patroni/start.sh" + minCpu: {{ .Values.resources.minCpu | quote }} + minMemory: {{ .Values.resources.minMemory | quote }} + cpu: {{ .Values.resources.maxCpu | quote }} + memory: {{ .Values.resources.maxMemory | quote }} + env: + - name: PGDATA + value: /var/lib/postgresql/data/pgdata + {{- if and .Values.backup.enabled (eq .Values.backup.mode "wal-g") }} + - name: WALG_COMPRESSION_METHOD + value: zstd + {{- if .Values.backup.aws.enabled }} + - name: WALG_S3_PREFIX + value: s3://{{ .Values.backup.aws.s3.bucket }}/{{ .Values.backup.aws.s3.prefix }} + - name: AWS_REGION + value: {{ .Values.backup.aws.s3.region | quote }} + - name: AWS_S3_FORCE_PATH_STYLE + value: "true" + {{- end }} + {{- if .Values.backup.gcp.enabled }} + - name: WALG_GS_PREFIX + value: gs://{{ .Values.backup.gcp.gcs.bucket }}/{{ .Values.backup.gcp.gcs.prefix }} + {{- end }} + {{- end }} + image: {{ .Values.image }} + inheritEnv: false + ports: + - number: 8008 + protocol: tcp + - number: 5432 + protocol: tcp + volumes: + - path: /var/lib/postgresql/data + recoveryPolicy: retain + uri: cpln://volumeset/{{ include "pg-ha.volume.name" . }} + - path: /patroni/start.sh + uri: cpln://secret/{{ include "pg-ha.secretStartup.name" . 
}}.payload + lifecycle: + preStop: + exec: + command: + - /bin/bash + - -c + - | + set -uo pipefail + ROLE=$(curl -fs --max-time 2 http://localhost:8008/patroni | python3 -c 'import sys,json; print(json.load(sys.stdin).get("role",""))' 2>/dev/null || echo "") + if [ "$ROLE" = "master" ] || [ "$ROLE" = "primary" ]; then + curl -fs --max-time 10 -X POST http://localhost:8008/switchover -H 'Content-Type: application/json' -d "{\"leader\":\"${HOSTNAME}\"}" || true + for i in $(seq 1 20); do + R=$(curl -fs --max-time 1 http://localhost:8008/patroni | python3 -c 'import sys,json; print(json.load(sys.stdin).get("role",""))' 2>/dev/null) + [ "$R" = "replica" ] && break + sleep 1 + done + fi + livenessProbe: + httpGet: + path: /liveness + port: 8008 + periodSeconds: 10 + timeoutSeconds: 5 + failureThreshold: 6 + readinessProbe: + httpGet: + path: /readiness + port: 8008 + periodSeconds: 5 + timeoutSeconds: 3 + failureThreshold: 3 + # WAL-G backup sidecar + {{- if and .Values.backup.enabled (eq .Values.backup.mode "wal-g") }} + - name: wal-g-backup + image: {{ .Values.image }} + command: "/bin/bash" + args: + - "/walg/backup.sh" + cpu: {{ .Values.backup.resources.cpu | quote }} + memory: {{ .Values.backup.resources.memory | quote }} + inheritEnv: false + env: + - name: PGDATA + value: /var/lib/postgresql/data/pgdata + - name: PATRONI_API + value: http://127.0.0.1:8008 + - name: WALG_BACKUP_INTERVAL_SECONDS + value: {{ .Values.backup.walg.intervalSeconds | quote }} + - name: WALG_COMPRESSION_METHOD + value: zstd + - name: PGHOST + value: 127.0.0.1 + - name: PGPORT + value: "5432" + - name: PGUSER + value: cpln://secret/{{ include "pg-ha.secretDatabase.name" . }}.username + - name: PGPASSWORD + value: cpln://secret/{{ include "pg-ha.secretDatabase.name" . 
}}.password + - name: PGDATABASE + value: postgres + {{- if .Values.backup.aws.enabled }} + - name: WALG_S3_PREFIX + value: s3://{{ .Values.backup.aws.s3.bucket }}/{{ .Values.backup.aws.s3.prefix }} + - name: AWS_REGION + value: {{ .Values.backup.aws.s3.region | quote }} + - name: AWS_S3_FORCE_PATH_STYLE + value: "true" + {{- end }} + {{- if .Values.backup.gcp.enabled }} + - name: WALG_GS_PREFIX + value: gs://{{ .Values.backup.gcp.gcs.bucket }}/{{ .Values.backup.gcp.gcs.prefix }} + {{- end }} + volumes: + - path: /var/lib/postgresql/data + uri: cpln://volumeset/{{ include "pg-ha.volume.name" . }} + recoveryPolicy: retain + - path: /walg/backup.sh + uri: cpln://secret/{{ include "pg-ha.secretWALGStartup.name" . }}.payload + {{- end }} + identityLink: //gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "pg-ha.identity.name" . }} + defaultOptions: + autoscaling: + keda: + cooldownPeriod: 1 + initialCooldownPeriod: 1 + pollingInterval: 1 + triggers: [] + maxConcurrency: 0 + maxScale: {{ .Values.replicas }} + metric: disabled + minScale: {{ .Values.replicas }} + scaleToZeroDelay: 300 + target: 100 + capacityAI: false + debug: false + multiZone: + enabled: {{ .Values.multiZone }} + suspend: false + timeoutSeconds: 5 + firewallConfig: + external: + inboundAllowCIDR: [] + inboundBlockedCIDR: [] + outboundAllowPort: [] + {{- if and .Values.backup.enabled (eq .Values.backup.mode "wal-g") }} + {{- if .Values.backup.aws.enabled }} + outboundAllowHostname: + - s3.{{ .Values.backup.aws.s3.region }}.amazonaws.com + - s3.amazonaws.com + {{- end }} + {{- end }} + {{- if and .Values.backup.enabled (eq .Values.backup.mode "wal-g") }} + {{- if .Values.backup.gcp.enabled }} + outboundAllowHostname: + - storage.googleapis.com + {{- end }} + {{- end }} + outboundBlockedCIDR: [] + internal: + inboundAllowType: {{ .Values.internal_access.type }} + {{- if .Values.internal_access.workloads }} + inboundAllowWorkload: {{ .Values.internal_access.workloads | toYaml | nindent 8 }} + {{- end 
}} + loadBalancer: + direct: + enabled: false + ports: [] + replicaDirect: true + securityOptions: + filesystemGroupId: 999 + supportDynamicTags: false \ No newline at end of file diff --git a/postgres-highly-available/versions/2.3.1/templates/workload-pgbouncer.yaml b/postgres-highly-available/versions/2.3.1/templates/workload-pgbouncer.yaml new file mode 100644 index 00000000..4969b460 --- /dev/null +++ b/postgres-highly-available/versions/2.3.1/templates/workload-pgbouncer.yaml @@ -0,0 +1,67 @@ +{{- if .Values.pgbouncer.enabled }} +kind: workload +name: {{ include "pg-ha.pgbouncer.name" . }} +description: PgBouncer Connection Pooler +tags: + {{- include "pg-ha.tags" . | nindent 4 }} +spec: + type: standard + containers: + - name: pgbouncer + image: {{ .Values.pgbouncer.image }} + cpu: {{ .Values.pgbouncer.resources.cpu | quote }} + memory: {{ .Values.pgbouncer.resources.memory | quote }} + inheritEnv: false + env: + - name: DB_HOST + value: {{ include "pg-ha.proxy.name" . }}.{{ .Values.global.cpln.gvc }}.cpln.local + - name: DB_PORT + value: '5432' + - name: DB_USER + value: cpln://secret/{{ include "pg-ha.secretDatabase.name" . }}.username + - name: DB_PASSWORD + value: cpln://secret/{{ include "pg-ha.secretDatabase.name" . }}.password + - name: DB_NAME + value: {{ .Values.postgres.database | quote }} + - name: POOL_MODE + value: {{ .Values.pgbouncer.poolMode | quote }} + - name: DEFAULT_POOL_SIZE + value: {{ .Values.pgbouncer.defaultPoolSize | quote }} + - name: MAX_CLIENT_CONN + value: {{ .Values.pgbouncer.maxClientConn | quote }} + - name: MAX_DB_CONNECTIONS + value: {{ .Values.pgbouncer.maxDbConnections | quote }} + - name: AUTH_TYPE + value: plain + ports: + - number: 5432 + protocol: tcp + identityLink: //gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "pg-ha.identity.name" . 
}} + defaultOptions: + autoscaling: + metric: rps + minScale: {{ .Values.pgbouncer.minReplicas }} + maxScale: {{ .Values.pgbouncer.maxReplicas }} + maxConcurrency: 0 + scaleToZeroDelay: 300 + target: 100 + capacityAI: false + debug: false + multiZone: + enabled: {{ .Values.multiZone }} + timeoutSeconds: 5 + firewallConfig: + internal: + inboundAllowType: {{ .Values.internal_access.type }} + {{- if .Values.internal_access.workloads }} + inboundAllowWorkload: {{ .Values.internal_access.workloads | toYaml | nindent 8 }} + {{- end }} + external: + inboundAllowCIDR: [] + outboundAllowCIDR: [] + loadBalancer: + direct: + enabled: false + ports: [] + replicaDirect: false +{{- end }} diff --git a/postgres-highly-available/versions/2.3.1/values.yaml b/postgres-highly-available/versions/2.3.1/values.yaml new file mode 100644 index 00000000..3b7315e7 --- /dev/null +++ b/postgres-highly-available/versions/2.3.1/values.yaml @@ -0,0 +1,96 @@ +replicas: 3 + +resources: + minCpu: 500m + minMemory: 1Gi + maxCpu: 1 + maxMemory: 2Gi + +image: controlplanecorporation/patroni-postgres:0.7 + +postgres: + username: username + password: password + database: test + +multiZone: false + +volumeset: + capacity: 10 # initial capacity in GiB (minimum is 10) + autoscaling: + enabled: false # Set to true to enable autoscaling + maxCapacity: 100 # Maximum capacity in GiB when autoscaling is enabled + minFreePercentage: 10 # Minimum free percentage to trigger scaling when autoscaling is enabled + scalingFactor: 1.2 # Scaling factor to determine how much to scale up when autoscaling is triggered + +internal_access: + type: same-gvc # options: same-gvc, same-org, workload-list + workloads: # Note: can only be used if type is same-gvc or workload-list + #- //gvc/GVC_NAME/workload/WORKLOAD_NAME + #- //gvc/GVC_NAME/workload/WORKLOAD_NAME + +etcd: + replicas: 3 + resources: + cpu: 500m + memory: 512Mi + multiZone: false + volumeset: + capacity: 10 # initial capacity in GiB (minimum is 10) + 
internal_access: + type: same-gvc # options: same-gvc, same-org, workload-list + workloads: # Note: can only be used if type is same-gvc or workload-list + #- //gvc/GVC_NAME/workload/WORKLOAD_NAME + #- //gvc/GVC_NAME/workload/WORKLOAD_NAME + +pgbouncer: + enabled: false + image: edoburu/pgbouncer:v1.25.1-p0 + poolMode: transaction # options: session, transaction, statement + defaultPoolSize: 25 # number of real Postgres connections PgBouncer maintains per pod + maxClientConn: 1000 # maximum number of client connections PgBouncer accepts per pod + maxDbConnections: 100 # hard cap on total Postgres connections regardless of how many PgBouncer pods are running + minReplicas: 2 + maxReplicas: 4 + + resources: + cpu: 200m + memory: 128Mi + +proxy: # HAProxy endpoint to write to leader replica. Automatically enabled when pgbouncer is enabled. + enabled: true + image: haproxy:2.9 + resources: + cpu: 100m + memory: 128Mi + minReplicas: 2 + maxReplicas: 2 + +backup: + enabled: false + mode: logical # logical or wal-g + resources: # applies to whichever mode is enabled + cpu: 100m + memory: 128Mi + + logical: # logical backup settings + image: controlplanecorporation/pg-backup:17.1.0 # tag 17.1.0 = Postgres 17, no other tags currently supported + schedule: "0 2 * * *" # cron schedule, default is daily at 2am UTC + + walg: # wal-g backup settings + intervalSeconds: 21600 # interval in seconds between backups, default is every 6 hours + + # storage settings are applied to whichever mode is enabled + provider: aws # Options: aws or gcp + + aws: + bucket: pg-ha-backup-bucket + region: us-east-1 + cloudAccountName: my-s3-cloud-account + policyName: pg-ha-backup-policy + prefix: postgres/backups # folder name where your backups will be stored + + gcp: + bucket: pg-ha-backup-bucket + cloudAccountName: my-gcs-cloud-account + prefix: postgres/backups # folder name where your backups will be stored \ No newline at end of file diff --git a/redis/RELEASES.md b/redis/RELEASES.md index
42a55397..4d92b4d6 100644 --- a/redis/RELEASES.md +++ b/redis/RELEASES.md @@ -1,3 +1,11 @@ +# Release Notes - Version 3.3.0 + +## What's New + +- **Smart Master Discovery**: Non-primary replicas now query Sentinel at startup to find the current master rather than hardcoding replica-0, ensuring correct replication after any failover. +- **Resilient Sentinel Targeting**: All modes query the Sentinel service endpoint so any healthy Sentinel instance can respond, rather than always targeting replica-0. + + # Release Notes - Version 3.2.0 ## What's New @@ -5,6 +13,15 @@ - **Backup Support**: Added optional scheduled backup to AWS S3 or GCS via a dedicated cron workload. Configure with `backup.enabled`, `backup.provider`, and your cloud provider settings. Supports Redis password authentication (inline or from secret). See the README for full setup instructions. +# Release Notes - Version 3.1.1 + +## What's New + +- **Template Refactoring**: Centralized all resource naming into helper functions, improving consistency across templates. +- **Password Quoting Fix**: Secret password values are now properly quoted, preventing YAML parsing issues with special characters. +- **README Rewrite**: Documentation updated with clearer configuration examples and internal endpoint reference. 
+ + # Release Notes - Version 3.1.0 ## What's New diff --git a/redis/versions/3.2.0/templates/secret-redis-config.yaml b/redis/versions/3.2.0/templates/secret-redis-config.yaml index 7b35f60d..6f894191 100644 --- a/redis/versions/3.2.0/templates/secret-redis-config.yaml +++ b/redis/versions/3.2.0/templates/secret-redis-config.yaml @@ -9,4 +9,7 @@ data: save 900 1 save 300 10 save 60 10000 - appendonly yes \ No newline at end of file + appendonly yes + repl-backlog-size {{ .Values.redis.replication.backlogSize }} + repl-timeout {{ .Values.redis.replication.timeout }} + client-output-buffer-limit slave {{ .Values.redis.replication.slaveOutputBufferLimit }} \ No newline at end of file diff --git a/redis/versions/3.2.0/templates/workload-redis.yaml b/redis/versions/3.2.0/templates/workload-redis.yaml index 91cb9e8d..43167f17 100644 --- a/redis/versions/3.2.0/templates/workload-redis.yaml +++ b/redis/versions/3.2.0/templates/workload-redis.yaml @@ -6,6 +6,11 @@ tags: {{- if .Values.redis.tags }} {{ toYaml .Values.redis.tags | indent 2 }} {{- end }} + # Sentinel discovery and replica startup must resolve `.:6379` even + # while the pod is still resyncing or marked NotReady by the replication-aware + # readiness probe. This tag exposes not-yet-Ready pods on the headless service so + # peers can reach them for replication and Sentinel can monitor them. + cpln/publishNotReadyAddresses: "true" {{- include "redis.tags" . | nindent 2 }} spec: type: stateful @@ -59,15 +64,19 @@ spec: echo "\nreplica-announce-port 6379" >> /etc/redis/redis.conf {{ end }} + # exec replaces the shell with redis-server so SIGTERM from the + # platform routes directly to redis-server instead of being swallowed + # by /bin/sh. Without exec, shutdown waits the full grace period + # before SIGKILL since the shell doesn't forward signals to children. if [ "$(hostname)" = "{{ include "redis.name" . 
}}-0" ]; then - {{ .Values.redis.serverCommand }} /etc/redis/redis.conf {{ .Values.redis.extraArgs }} --dir {{ .Values.redis.dataDir }} + exec {{ .Values.redis.serverCommand }} /etc/redis/redis.conf {{ .Values.redis.extraArgs }} --dir {{ .Values.redis.dataDir }} else {{- if and (hasKey .Values.redis "publicAccess") .Values.redis.publicAccess.enabled }} - {{ .Values.redis.serverCommand }} /etc/redis/redis.conf {{ .Values.redis.extraArgs }} --dir {{ .Values.redis.dataDir }} --replicaof {{ .Values.redis.publicAccess.address }} 6380 + exec {{ .Values.redis.serverCommand }} /etc/redis/redis.conf {{ .Values.redis.extraArgs }} --dir {{ .Values.redis.dataDir }} --replicaof {{ .Values.redis.publicAccess.address }} 6380 {{- else if and (hasKey .Values.redis "replicaDirect") .Values.redis.replicaDirect }} - {{ .Values.redis.serverCommand }} /etc/redis/redis.conf {{ .Values.redis.extraArgs }} --dir {{ .Values.redis.dataDir }} --replicaof replica-0.${CPLN_WORKLOAD_NAME}.${LOCATION}.${CPLN_GVC}.cpln.local 6379 + exec {{ .Values.redis.serverCommand }} /etc/redis/redis.conf {{ .Values.redis.extraArgs }} --dir {{ .Values.redis.dataDir }} --replicaof replica-0.${CPLN_WORKLOAD_NAME}.${LOCATION}.${CPLN_GVC}.cpln.local 6379 {{- else }} - {{ .Values.redis.serverCommand }} /etc/redis/redis.conf {{ .Values.redis.extraArgs }} --dir {{ .Values.redis.dataDir }} --replicaof {{ include "redis.name" . }}-0.{{ include "redis.name" . }} 6379 + exec {{ .Values.redis.serverCommand }} /etc/redis/redis.conf {{ .Values.redis.extraArgs }} --dir {{ .Values.redis.dataDir }} --replicaof {{ include "redis.name" . }}-0.{{ include "redis.name" . }} 6379 {{- end }} fi command: /bin/sh @@ -88,16 +97,48 @@ spec: {{- else }} PORT=6379 {{- end }} - if [ ! 
-z "$CUSTOM_REDIS_PASSWORD" ]; then - redis-cli -p $PORT --no-auth-warning -a "$CUSTOM_REDIS_PASSWORD" ping; - else - redis-cli -p $PORT ping; + rcli() { + if [ -n "$CUSTOM_REDIS_PASSWORD" ]; then + redis-cli -p $PORT --no-auth-warning -a "$CUSTOM_REDIS_PASSWORD" "$@" + else + redis-cli -p $PORT "$@" + fi + } + rcli ping >/dev/null && \ + if [ "$(rcli role | head -1)" = "slave" ]; then + [ "$(rcli info replication | awk -F: '/master_link_status/{print $2}' | tr -d '\r')" = "up" ] && \ + [ "$(rcli info replication | awk -F: '/master_sync_in_progress/{print $2}' | tr -d '\r')" = "0" ] fi failureThreshold: 10 initialDelaySeconds: 10 periodSeconds: 5 successThreshold: 1 timeoutSeconds: 4 + # Explicit liveness probe — keep it permissive (just "is the process up?") + # so the platform doesn't kill a pod mid-resync. cpln defaults the liveness + # probe to whatever the readiness probe is, and our readiness probe + # intentionally returns failure during full resync (master_link_status:up + # AND master_sync_in_progress:0). Without this override, a slave doing a + # full resync would be killed before it could finish. + livenessProbe: + {{- if and (hasKey .Values.redis "publicAccess") .Values.redis.publicAccess.enabled }} + exec: + command: + - /bin/bash + - "-c" + - |- + POD_ID=$(echo "$POD_NAME" | rev | cut -d'-' -f 1 | rev) + PORT=$((6380 + POD_ID)) + exec 3<>/dev/tcp/127.0.0.1/$PORT + {{- else }} + tcpSocket: + port: 6379 + {{- end }} + failureThreshold: 5 + initialDelaySeconds: 30 + periodSeconds: 10 + successThreshold: 1 + timeoutSeconds: 5 inheritEnv: false ports: {{- if and (hasKey .Values.redis "publicAccess") .Values.redis.publicAccess.enabled (gt (.Values.redis.replicas | int) 0) }} @@ -139,6 +180,17 @@ spec: multiZone: enabled: false {{- end }} + # Parallel pod management. 
Default OrderedReady serializes replica boots, + # which forces sentinels and redis pods to wait for each peer to be Ready + # before the next starts and produces extended cold-start +sdown noise. + # Parallel boots all replicas at once; publishNotReadyAddresses keeps peer + # DNS resolvable during the simultaneous cold start so the cluster can form. + rolloutOptions: + scalingPolicy: Parallel +{{- if .Values.redis.requestRetryPolicy }} + requestRetryPolicy: +{{ toYaml .Values.redis.requestRetryPolicy | indent 4 }} +{{- end }} {{- if .Values.redis.firewall }} firewallConfig: {{- if or (hasKey .Values.redis.firewall "external_inboundAllowCIDR") (hasKey .Values.redis.firewall "external_outboundAllowCIDR") }} diff --git a/redis/versions/3.2.0/templates/workload-sentinel.yaml b/redis/versions/3.2.0/templates/workload-sentinel.yaml index 6ef60f7c..82ab5787 100644 --- a/redis/versions/3.2.0/templates/workload-sentinel.yaml +++ b/redis/versions/3.2.0/templates/workload-sentinel.yaml @@ -6,6 +6,13 @@ tags: {{- if .Values.sentinel.tags }} {{ toYaml .Values.sentinel.tags | indent 2 }} {{- end }} + # Currently a no-op: the sentinel readiness probe is a plain `redis-cli ping`, so + # pods become Ready as soon as the port answers and the headless service exposes + # them anyway. Set here as a hedge — if the probe is ever tightened (e.g. to + # require quorum visibility or a known master), peers must still resolve each + # other via `.` during cold start to form the quorum, otherwise + # they deadlock: NotReady because no peers, no peers because NotReady. + cpln/publishNotReadyAddresses: "true" {{- include "redis.tags" . 
| nindent 2 }} spec: type: stateful @@ -19,43 +26,72 @@ spec: {{- else }} mkdir -p /etc/sentinel {{- end }} - cp /config/sentinel.conf /etc/sentinel/sentinel.conf - if [ -n "$CUSTOM_REDIS_PASSWORD" ]; then - echo "\nsentinel auth-pass mymaster $CUSTOM_REDIS_PASSWORD" >> /etc/sentinel/sentinel.conf - fi + # bootstrap_sentinel_conf: write /etc/sentinel/sentinel.conf only if + # sentinel hasn't already taken ownership of the file. The marker we + # check is the `sentinel myid ` directive — sentinel writes this + # via CONFIG REWRITE within milliseconds of its first successful + # startup, and the chart's static template never contains it. So: + # - marker present → previous boot got far enough that sentinel + # owns the file. Preserve everything (current master after any + # failover, known replicas, peer sentinels). + # - marker absent → file missing, empty, or written but sentinel + # never reached its first CONFIG REWRITE. Re-run bootstrap; + # idempotent against the static config so safe to re-run. + # When sentinel.persistence is disabled this still runs every boot + # (ephemeral filesystem, no marker survives) — same behavior as the + # pre-marker chart. The check only changes behavior when /etc/sentinel + # is backed by a persistent volume. 
+ bootstrap_sentinel_conf() { + if [ -s /etc/sentinel/sentinel.conf ] && grep -q "^sentinel myid " /etc/sentinel/sentinel.conf; then + echo "sentinel.conf already bootstrapped; preserving sentinel-managed state" + return 0 + fi + echo "Bootstrapping sentinel.conf from static config" + cp /config/sentinel.conf /etc/sentinel/sentinel.conf - if [ -n "$CUSTOM_SENTINEL_PASSWORD" ]; then - echo "\nrequirepass $CUSTOM_SENTINEL_PASSWORD" >> /etc/sentinel/sentinel.conf - fi + if [ -n "$CUSTOM_REDIS_PASSWORD" ]; then + echo "\nsentinel auth-pass mymaster $CUSTOM_REDIS_PASSWORD" >> /etc/sentinel/sentinel.conf + fi - {{- if and (hasKey .Values.sentinel "publicAccess") .Values.sentinel.publicAccess.enabled }} - POD_ID=$(echo "$POD_NAME" | rev | cut -d'-' -f 1 | rev) - PORT=$((26380 + POD_ID)) - echo "\nport $PORT" >> /etc/sentinel/sentinel.conf - echo "\nsentinel announce-ip {{ .Values.sentinel.publicAccess.address }}" >> /etc/sentinel/sentinel.conf - echo "\nsentinel announce-port $PORT" >> /etc/sentinel/sentinel.conf - {{ else }} - echo "\nport 26379" >> /etc/sentinel/sentinel.conf - POD_ID=$(echo "$POD_NAME" | rev | cut -d'-' -f 1 | rev) - LOCATION=${CPLN_LOCATION##*/} - CPLN_WORKLOAD_NAME="${CPLN_WORKLOAD##*/}" - if [ -n "$REPLICA_DIRECT" ]; then - echo "\nsentinel announce-ip replica-${POD_ID}.${CPLN_WORKLOAD_NAME}.${LOCATION}.${CPLN_GVC}.cpln.local" >> /etc/sentinel/sentinel.conf - else - echo "\nsentinel announce-ip ${HOSTNAME}.{{ include "redis.sentinel.name" . }}" >> /etc/sentinel/sentinel.conf - fi - echo "\nsentinel announce-port 26379" >> /etc/sentinel/sentinel.conf - {{ end }} - {{- if and (hasKey .Values.redis "publicAccess") .Values.redis.publicAccess.enabled }} - echo "sentinel monitor mymaster {{ .Values.redis.publicAccess.address }} 6380 ${REDIS_SENTINEL_QUORUM}" >> /etc/sentinel/sentinel.conf - {{- else if and (hasKey .Values.redis "replicaDirect") .Values.redis.replicaDirect }} - echo "sentinel monitor mymaster replica-0.{{ include "redis.name" . 
}}.${LOCATION}.${CPLN_GVC}.cpln.local 6379 ${REDIS_SENTINEL_QUORUM}" >> /etc/sentinel/sentinel.conf - {{- else }} - echo "sentinel monitor mymaster {{ include "redis.name" . }}-0.{{ include "redis.name" . }} 6379 ${REDIS_SENTINEL_QUORUM}" >> /etc/sentinel/sentinel.conf - {{- end }} + if [ -n "$CUSTOM_SENTINEL_PASSWORD" ]; then + echo "\nrequirepass $CUSTOM_SENTINEL_PASSWORD" >> /etc/sentinel/sentinel.conf + fi + + {{- if and (hasKey .Values.sentinel "publicAccess") .Values.sentinel.publicAccess.enabled }} + POD_ID=$(echo "$POD_NAME" | rev | cut -d'-' -f 1 | rev) + PORT=$((26380 + POD_ID)) + echo "\nport $PORT" >> /etc/sentinel/sentinel.conf + echo "\nsentinel announce-ip {{ .Values.sentinel.publicAccess.address }}" >> /etc/sentinel/sentinel.conf + echo "\nsentinel announce-port $PORT" >> /etc/sentinel/sentinel.conf + {{ else }} + echo "\nport 26379" >> /etc/sentinel/sentinel.conf + POD_ID=$(echo "$POD_NAME" | rev | cut -d'-' -f 1 | rev) + LOCATION=${CPLN_LOCATION##*/} + CPLN_WORKLOAD_NAME="${CPLN_WORKLOAD##*/}" + if [ -n "$REPLICA_DIRECT" ]; then + echo "\nsentinel announce-ip replica-${POD_ID}.${CPLN_WORKLOAD_NAME}.${LOCATION}.${CPLN_GVC}.cpln.local" >> /etc/sentinel/sentinel.conf + else + echo "\nsentinel announce-ip ${HOSTNAME}.{{ include "redis.sentinel.name" . }}" >> /etc/sentinel/sentinel.conf + fi + echo "\nsentinel announce-port 26379" >> /etc/sentinel/sentinel.conf + {{ end }} + {{- if and (hasKey .Values.redis "publicAccess") .Values.redis.publicAccess.enabled }} + echo "sentinel monitor mymaster {{ .Values.redis.publicAccess.address }} 6380 ${REDIS_SENTINEL_QUORUM}" >> /etc/sentinel/sentinel.conf + {{- else if and (hasKey .Values.redis "replicaDirect") .Values.redis.replicaDirect }} + echo "sentinel monitor mymaster replica-0.{{ include "redis.name" . }}.${LOCATION}.${CPLN_GVC}.cpln.local 6379 ${REDIS_SENTINEL_QUORUM}" >> /etc/sentinel/sentinel.conf + {{- else }} + echo "sentinel monitor mymaster {{ include "redis.name" . }}-0.{{ include "redis.name" . 
}} 6379 ${REDIS_SENTINEL_QUORUM}" >> /etc/sentinel/sentinel.conf + {{- end }} + } - redis-sentinel /etc/sentinel/sentinel.conf + bootstrap_sentinel_conf + + # exec replaces the shell with redis-sentinel so SIGTERM from the + # platform routes directly to redis-sentinel for clean shutdown. + # Without exec, the shell swallows the signal and sentinel only dies + # when the platform sends SIGKILL after the grace period expires. + exec redis-sentinel /etc/sentinel/sentinel.conf command: /bin/sh cpu: {{ .Values.sentinel.resources.cpu }} memory: {{ .Values.sentinel.resources.memory }} @@ -151,6 +187,14 @@ spec: multiZone: enabled: false {{- end }} + # See workload-redis.yaml for rationale. Parallel boots all sentinels at + # once instead of waiting for each to be Ready in sequence. + rolloutOptions: + scalingPolicy: Parallel +{{- if .Values.sentinel.requestRetryPolicy }} + requestRetryPolicy: +{{ toYaml .Values.sentinel.requestRetryPolicy | indent 4 }} +{{- end }} {{- if .Values.sentinel.firewall }} firewallConfig: {{- if or (hasKey .Values.sentinel.firewall "external_inboundAllowCIDR") (hasKey .Values.sentinel.firewall "external_outboundAllowCIDR") }} diff --git a/redis/versions/3.2.0/values.yaml b/redis/versions/3.2.0/values.yaml index 480f054e..28adcd93 100644 --- a/redis/versions/3.2.0/values.yaml +++ b/redis/versions/3.2.0/values.yaml @@ -32,6 +32,30 @@ redis: # external_outboundAllowCIDR: "0.0.0.0/0" # Provide a comma-separated list env: [] tags: {} + # requestRetryPolicy: + # attempts: 2 + # retryOn: + # - connect-failure + # - refused-stream + # - unavailable + # - cancelled + # - resource-exhausted + # - retriable-status-codes + requestRetryPolicy: {} + # Replication tuning. See secret-redis-config.yaml for how these are rendered. + # backlogSize: sized for (peak write throughput × tolerable disconnect window). + # 1mb (Redis default) escalates any brief disconnect to a full RDB resync. 1gb + # covers ~5 minutes of disconnect at ~3MB/s of writes. 
+ # timeout (seconds): bound on full-resync transfer + RDB load + heartbeat. + # 60s (Redis default) is too low for multi-GB datasets — master/slave drop the + # link mid-sync. 300s covers ~30GB at typical 1Gbps + load throughput. + # slaveOutputBufferLimit: "<hard limit> <soft limit> <soft seconds>". Default + # "256mb 64mb 60" can't sustain a full resync of a multi-GB dataset at high + # write rate — master kills the replica mid-stream. Bump for production loads. + replication: + backlogSize: 1gb + timeout: 300 + slaveOutputBufferLimit: "2gb 512mb 300" dataDir: /data persistence: enabled: false @@ -86,6 +110,19 @@ sentinel: # external_outboundAllowCIDR: "0.0.0.0/0" # Provide a comma-separated list env: [] tags: {} + # requestRetryPolicy: + # attempts: 2 + # retryOn: + # - connect-failure + # - refused-stream + # - unavailable + # - cancelled + # - resource-exhausted + # - retriable-status-codes + requestRetryPolicy: {} + # If all sentinels are lost AND the topology changed underfoot during the + # outage, persisted `mymaster` survives but disagrees with the chart's + # redis-0-as-default-master bootstrap, deadlocking recovery. Default OFF. persistence: enabled: false volumes: diff --git a/redis/versions/3.3.0/Chart.yaml b/redis/versions/3.3.0/Chart.yaml new file mode 100644 index 00000000..5a24875d --- /dev/null +++ b/redis/versions/3.3.0/Chart.yaml @@ -0,0 +1,12 @@ +apiVersion: v2 +name: redis +description: A master-replica Redis configuration with Redis Sentinel +type: application +version: 3.3.0 +appVersion: "7.4" + +annotations: + created: "2026-01-29" + lastModified: "2026-05-07" + category: "cache" + createsGvc: false diff --git a/redis/versions/3.3.0/README.md b/redis/versions/3.3.0/README.md new file mode 100644 index 00000000..fcb9b08f --- /dev/null +++ b/redis/versions/3.3.0/README.md @@ -0,0 +1,209 @@ +## Redis Sentinel + +Creates a Redis Sentinel cluster on Control Plane with automatic leader election, failover, and an optional backup configuration. 
+ +### Configuration + +**Redis and Sentinel** — set replicas, resources, and timeouts for each. Sentinel replicas must be an odd number for quorum: +```yaml +redis: + replicas: 2 + resources: + minCpu: 80m + minMemory: 128Mi + cpu: 200m + memory: 256Mi + +sentinel: + replicas: 3 + quorumAutoCalculation: true # calculates as (replicas/2)+1 +``` + +**Authentication** — enable one method. Apply the same config under both `redis.auth` and `sentinel.auth`: +```yaml +redis: + auth: + password: + enabled: true + value: your-password + # fromSecret: + # enabled: true + # name: my-redis-secret + # passwordKey: password +``` + +**Persistence** — disabled by default. Enable to attach a persistent volume to Redis: +```yaml +redis: + persistence: + enabled: true + volumes: + data: + initialCapacity: 10 + performanceClass: general-purpose-ssd # or high-throughput-ssd (min 1000 GiB) + fileSystemType: ext4 +``` + +**Firewall** — set the internal access scope for both Redis and Sentinel: +```yaml +firewall: + internal_inboundAllowType: same-gvc # same-gvc, same-org, or workload-list +``` + +### Public Access (External TCP) + +Redis and Sentinel can be exposed over the internet via TCP using Control Plane's domain resource with per-replica port routing. + +#### Prerequisites + +1. **A domain you control** with DNS managed by your registrar (e.g. Cloudflare) +2. **Dedicated Load Balancer** enabled on your GVC — required for arbitrary TCP port routing. Enable this under your GVC settings in the Control Plane console. +3. **DNS records added before deploying** — Control Plane will reject the domain resource on first deploy if ownership has not been proven. Add the following records in your DNS provider for each address before running the deployment. **Disable proxying** (e.g. 
Cloudflare's orange cloud) — TCP traffic must pass through directly: + +| Type | Name | Value | +|------|------|-------| +| TXT | `_cpln-<subdomain>` | your Control Plane org name or org ID (either is accepted) | +| CNAME | `<subdomain>` | `<gvc-alias>.cpln.app` | + +Your GVC alias is visible in the Control Plane console under GVC settings. The TXT record proves domain ownership — without it, the first deploy will fail with an `Unable to apply domain` error. + +#### Configuration + +Enable `publicAccess` for Redis and/or Sentinel and set a subdomain you own: + +```yaml +redis: + publicAccess: + enabled: true + address: redis.your-domain.com + firewall: + internal_inboundAllowType: same-gvc + external_inboundAllowCIDR: "0.0.0.0/0" # or restrict to specific CIDRs + +sentinel: + publicAccess: + enabled: true + address: redis-sentinel.your-domain.com + firewall: + internal_inboundAllowType: same-gvc + external_inboundAllowCIDR: "0.0.0.0/0" +``` + +When enabled, a Control Plane `domain` resource is created for each address. Port mapping is one port per replica: +- **Redis**: ports `6380`, `6381`, ... (replica 0, 1, ...) +- **Sentinel**: ports `26380`, `26381`, `26382`, ... (replica 0, 1, 2, ...) + +After DNS propagates, the domain status in Control Plane will show **Ready**. 
You can verify the full DNS chain resolves correctly with: + +```bash +dig <subdomain>.your-domain.com CNAME # should return <gvc-alias>.cpln.app +dig <gvc-alias>.cpln.app # should return an IP address +``` + +#### Connecting Externally (Public Access Enabled) + +```bash +# add -a <password> if auth is enabled + +# Redis replica 0 +redis-cli -h redis.your-domain.com -p 6380 ping + +# Redis replica 1 +redis-cli -h redis.your-domain.com -p 6381 ping + +# Sentinel replica 0 +redis-cli -h redis-sentinel.your-domain.com -p 26380 ping +``` + +### Connecting + +Redis is accessible internally on port 6379: +``` +RELEASE_NAME-redis.GVC_NAME.cpln.local:6379 +``` + +Sentinel is accessible on port 26379: +``` +RELEASE_NAME-sentinel.GVC_NAME.cpln.local:26379 +``` + +To route writes to the current master: +```bash +MASTER_INFO=$(redis-cli -h RELEASE_NAME-sentinel.GVC_NAME.cpln.local -p 26379 SENTINEL get-master-addr-by-name mymaster) +MASTER_HOST=$(echo $MASTER_INFO | cut -d' ' -f1) +MASTER_PORT=$(echo $MASTER_INFO | cut -d' ' -f2) +redis-cli -h $MASTER_HOST -p $MASTER_PORT SET my-key "value" +``` + +### Backing Up + +Set `backup.enabled` to `true`, configure your provider, and set your desired schedule. The backup image is compatible with all Redis versions. + +```yaml +backup: + enabled: true + schedule: "0 2 * * *" # daily at 2am UTC + provider: aws # Options: aws or gcp +``` + +#### AWS S3 + +For the backup cron job to access an S3 bucket, complete the following in your AWS account first: + +1. Create your bucket. Set `backup.aws.bucket` to its name and `backup.aws.region` to its region. + +2. If you do not have a Cloud Account set up, refer to the docs to [Create a Cloud Account](https://docs.controlplane.com/guides/create-cloud-account). Set `backup.aws.cloudAccountName` to its name. + +3. 
Create a new IAM policy with the following JSON (replace `YOUR_BUCKET_NAME`) and set `backup.aws.policyName` to match: + +```json +{ + "Version": "2012-10-17", + "Statement": [ + { + "Effect": "Allow", + "Action": [ + "s3:GetObject", + "s3:PutObject", + "s3:DeleteObject", + "s3:ListBucket", + "s3:GetObjectVersion", + "s3:DeleteObjectVersion" + ], + "Resource": [ + "arn:aws:s3:::YOUR_BUCKET_NAME", + "arn:aws:s3:::YOUR_BUCKET_NAME/*" + ] + } + ] +} +``` + +#### GCS + +For the backup cron job to access a GCS bucket, complete the following in your GCP account first: + +1. Create your bucket. Set `backup.gcp.bucket` to its name. + +2. If you do not have a Cloud Account set up, refer to the docs to [Create a Cloud Account](https://docs.controlplane.com/guides/create-cloud-account). Set `backup.gcp.cloudAccountName` to its name. + +**Important**: You must add the `Storage Admin` role when creating your GCP service account. + +### Restoring a Backup + +Run the following command from a client with access to the bucket (replace `aws s3 cp` with `gsutil cp` for GCS): + +```sh +aws s3 cp s3://BUCKET_NAME/PREFIX/BACKUP_FILE.rdb /tmp/dump.rdb +redis-cli \ + -h RELEASE_NAME-redis.GVC_NAME.cpln.local \ + -p 6379 \ + --rdb /tmp/dump.rdb +``` + +### Supported External Services +- [Redis Documentation](https://redis.io/docs/) +- [Redis Sentinel Documentation](https://redis.io/docs/latest/operate/oss_and_stack/management/sentinel/) + +### Release Notes +See [RELEASES.md](https://github.com/controlplane-com/templates/blob/main/redis/RELEASES.md) diff --git a/redis/versions/3.3.0/templates/_helpers.tpl b/redis/versions/3.3.0/templates/_helpers.tpl new file mode 100644 index 00000000..93eed473 --- /dev/null +++ b/redis/versions/3.3.0/templates/_helpers.tpl @@ -0,0 +1,267 @@ +{{/* Resource Naming */}} + +{{/* +Redis Workload Name +*/}} +{{- define "redis.name" -}} +{{- printf "%s-redis" .Release.Name }} +{{- end }} + +{{/* +Redis Sentinel Workload Name +*/}} +{{- define 
"redis.sentinel.name" -}} +{{- printf "%s-sentinel" .Release.Name }} +{{- end }} + +{{/* +Redis Secret Config Name +*/}} +{{- define "redis.secretConfig.name" -}} +{{- printf "%s-redis-config" .Release.Name }} +{{- end }} + +{{/* +Redis Secret Auth Password Name +*/}} +{{- define "redis.secretPassword.name" -}} +{{- printf "%s-redis-auth-password" .Release.Name }} +{{- end }} + +{{/* +Redis Sentinel Secret Config Name +*/}} +{{- define "redis.sentinelSecretConfig.name" -}} +{{- printf "%s-sentinel-config" .Release.Name }} +{{- end }} + +{{/* +Redis Sentinel Secret Auth Password Name +*/}} +{{- define "redis.sentinelSecretPassword.name" -}} +{{- printf "%s-sentinel-auth-password" .Release.Name }} +{{- end }} + +{{/* +Redis Identity Name +*/}} +{{- define "redis.identity.name" -}} +{{- printf "%s-redis-identity" .Release.Name }} +{{- end }} + +{{/* +Redis Sentinel Identity Name +*/}} +{{- define "redis.sentinelIdentity.name" -}} +{{- printf "%s-sentinel-identity" .Release.Name }} +{{- end }} + +{{/* +Redis Policy Name +*/}} +{{- define "redis.policy.name" -}} +{{- printf "%s-redis-policy" .Release.Name }} +{{- end }} + +{{/* +Redis Sentinel Policy Name +*/}} +{{- define "redis.sentinelPolicy.name" -}} +{{- printf "%s-sentinel-policy" .Release.Name }} +{{- end }} + +{{/* +Redis Volume Set Name +*/}} +{{- define "redis.volume.name" -}} +{{- printf "%s-redis-vs" .Release.Name }} +{{- end }} + +{{/* +Redis Sentinel Volume Set Name +*/}} +{{- define "redis.sentinelVolume.name" -}} +{{- printf "%s-sentinel-vs" .Release.Name }} +{{- end }} + + +{{/* +Redis Backup Workload Name +*/}} +{{- define "redis.backup.name" -}} +{{- printf "%s-redis-backup" .Release.Name }} +{{- end }} + +{{/* +Redis Backup Secret Config Name +*/}} +{{- define "redis.secretBackup.name" -}} +{{- printf "%s-redis-backup-config" .Release.Name }} +{{- end }} + +{{/* +Redis Backup Policy Name +*/}} +{{- define "redis.backupPolicy.name" -}} +{{- printf "%s-redis-backup-policy" .Release.Name }} +{{- end }} 
+ + +{{/* Validation */}} + +{{/* +Validate backup configuration - when backup is enabled, backup.provider must be set to 'aws' or 'gcp' +*/}} +{{- define "redis.validateBackupConfig" -}} +{{- if .Values.backup.enabled -}} + {{- $provider := .Values.backup.provider -}} + {{- if not (or (eq $provider "aws") (eq $provider "gcp")) -}} + {{- fail "Invalid backup configuration: backup.provider must be set to 'aws' or 'gcp'." -}} + {{- end -}} + {{- if eq $provider "aws" -}} + {{- if not .Values.backup.aws.bucket -}} + {{- fail "All fields are required for AWS backup. Missing: backup.aws.bucket" -}} + {{- end -}} + {{- if not .Values.backup.aws.region -}} + {{- fail "All fields are required for AWS backup. Missing: backup.aws.region" -}} + {{- end -}} + {{- if not .Values.backup.aws.cloudAccountName -}} + {{- fail "All fields are required for AWS backup. Missing: backup.aws.cloudAccountName" -}} + {{- end -}} + {{- if not .Values.backup.aws.policyName -}} + {{- fail "All fields are required for AWS backup. Missing: backup.aws.policyName" -}} + {{- end -}} + {{- end -}} + {{- if eq $provider "gcp" -}} + {{- if not .Values.backup.gcp.bucket -}} + {{- fail "All fields are required for GCP backup. Missing: backup.gcp.bucket" -}} + {{- end -}} + {{- if not .Values.backup.gcp.cloudAccountName -}} + {{- fail "All fields are required for GCP backup. 
Missing: backup.gcp.cloudAccountName" -}} + {{- end -}} + {{- end -}} +{{- end -}} +{{- end }} + +{{- define "calculateWorkloadCounts" -}} +{{- $quorumCount := int .Values.sentinel.quorum }} +{{- $workloadCount := 0 }} +{{- if eq $quorumCount 1 }} + {{- $workloadCount = 1 }} +{{- else }} + {{- $workloadCount = int (add $quorumCount 1) }} +{{- end }} +{{- $locations := default (list) .Values.locations }} +{{- if and $locations (gt (len $locations) 0) }} + {{- $locationCount := (len $locations) }} + {{- $baseCount := int (div $workloadCount $locationCount) }} + {{- $remainderCount := int (mod $workloadCount $locationCount) }} + {{- if not .Values.global }} + {{- $ := set .Values "global" (dict) }} + {{- end }} + {{- $ := set .Values.global "baseCount" $baseCount }} + {{- $ := set .Values.global "remainderCount" $remainderCount }} + {{- $ := set .Values.global "locationCount" $locationCount }} + {{- $ := set .Values.global "workloadCount" $workloadCount }} +{{- end }} +{{- end }} + + +{{ include "redis.auth" (dict "auth" .Values.redis.auth) }} + +redis: + image: redis/redis-stack:7.4.0-v3 + resources: + cpu: 200m + memory: 256Mi + minCpu: 80m + minMemory: 128Mi + replicas: 3 + timeoutSeconds: 15 + auth: + fromSecret: + enabled: false + name: example-redis-auth-password + passwordKey: password + password: + enabled: true + value: fu3h4f9834f8 + +{{/* +Validate auth configuration block +*/}} +{{- define "validateAuth" -}} +{{- $auth := .auth -}} + +{{- /* Check if auth block exists */ -}} +{{- if $auth -}} + {{- /* Count enabled auth methods */ -}} + {{- $enabledCount := 0 -}} + + {{- /* Check fromSecret */ -}} + {{- if and (hasKey $auth "fromSecret") $auth.fromSecret.enabled -}} + {{- $enabledCount = add1 $enabledCount -}} + {{- end -}} + + {{- /* Check password */ -}} + {{- if and (hasKey $auth "password") $auth.password.enabled -}} + {{- $enabledCount = add1 $enabledCount -}} + {{- end -}} + + {{- /* Validate that at most one method is enabled */ -}} + {{- if gt 
$enabledCount 1 -}} + {{- fail "Only one authentication method can be enabled at a time" -}} + {{- end -}} + + {{- /* If fromSecret is enabled, validate its configuration */ -}} + {{- if and (hasKey $auth "fromSecret") $auth.fromSecret.enabled -}} + {{- if not (hasKey $auth.fromSecret "name") -}} + {{- fail "fromSecret authentication requires a name" -}} + {{- end -}} + {{- if not (hasKey $auth.fromSecret "passwordKey") -}} + {{- fail "fromSecret authentication requires a passwordKey" -}} + {{- end -}} + {{- end -}} + + {{- /* If password is enabled, validate its configuration */ -}} + {{- if and (hasKey $auth "password") $auth.password.enabled -}} + {{- if not (hasKey $auth.password "value") -}} + {{- fail "password authentication requires a value" -}} + {{- end -}} + {{- end -}} +{{- end -}} +{{- end -}} + + +{{/* Labeling */}} + +{{/* +Create chart name and version as used by the chart label. +*/}} +{{- define "redis.chart" -}} +{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" }} +{{- end }} + +{{/* +Common labels +*/}} +{{- define "redis.tags" -}} +helm.sh/chart: {{ include "redis.chart" . }} +{{ include "redis.selectorLabels" . 
}} +{{- if .Chart.AppVersion }} +app.cpln.io/version: {{ .Chart.AppVersion | quote }} +{{- end }} +app.cpln.io/managed-by: {{ .Release.Service }} +cpln/marketplace: "true" +cpln/marketplace-template: redis +cpln/marketplace-template-version: {{ .Chart.Version }} +cpln/marketplace-gvc: {{ .Values.global.cpln.gvc }} +{{- end }} + +{{/* +Selector labels +*/}} +{{- define "redis.selectorLabels" -}} +app.cpln.io/name: {{ .Release.Name }} +app.cpln.io/instance: {{ .Release.Name }} +{{- end }} \ No newline at end of file diff --git a/redis/versions/3.3.0/templates/domain-redis.yaml b/redis/versions/3.3.0/templates/domain-redis.yaml new file mode 100644 index 00000000..313c6e67 --- /dev/null +++ b/redis/versions/3.3.0/templates/domain-redis.yaml @@ -0,0 +1,20 @@ +{{- if and (hasKey .Values.redis "publicAccess") .Values.redis.publicAccess.enabled }} +kind: domain +name: {{ .Values.redis.publicAccess.address }} +description: {{ .Values.redis.publicAccess.address }} +spec: + acceptAllHosts: false + dnsMode: cname + {{- if gt (.Values.redis.replicas | int) 0 }} + ports: + {{- range $i := until (int .Values.redis.replicas) }} + - number: {{ add 6380 $i }} + protocol: tcp + routes: + - port: {{ add 6380 $i }} + prefix: / + replica: {{ $i }} + workloadLink: //gvc/{{ $.Values.global.cpln.gvc }}/workload/{{ include "redis.name" $ }} + {{- end }} + {{- end }} +{{- end }} diff --git a/redis/versions/3.3.0/templates/domain-sentinel.yaml b/redis/versions/3.3.0/templates/domain-sentinel.yaml new file mode 100644 index 00000000..0c5c6464 --- /dev/null +++ b/redis/versions/3.3.0/templates/domain-sentinel.yaml @@ -0,0 +1,20 @@ +{{- if and (hasKey .Values.sentinel "publicAccess") .Values.sentinel.publicAccess.enabled }} +kind: domain +name: {{ .Values.sentinel.publicAccess.address }} +description: {{ .Values.sentinel.publicAccess.address }} +spec: + acceptAllHosts: false + dnsMode: cname + {{- if gt (.Values.sentinel.replicas | int) 0 }} + ports: + {{- range $i := until (int 
.Values.sentinel.replicas) }} + - number: {{ add 26380 $i }} + protocol: tcp + routes: + - port: {{ add 26380 $i }} + prefix: / + replica: {{ $i }} + workloadLink: //gvc/{{ $.Values.global.cpln.gvc }}/workload/{{ include "redis.sentinel.name" $ }} + {{- end }} + {{- end }} +{{- end }} diff --git a/redis/versions/3.3.0/templates/identity-redis.yaml b/redis/versions/3.3.0/templates/identity-redis.yaml new file mode 100644 index 00000000..9cd19d54 --- /dev/null +++ b/redis/versions/3.3.0/templates/identity-redis.yaml @@ -0,0 +1,23 @@ +{{- include "redis.validateBackupConfig" . -}} +kind: identity +name: {{ include "redis.identity.name" . }} +description: Redis Identity +gvc: {{ .Values.global.cpln.gvc }} +{{- if and .Values.backup.enabled (eq .Values.backup.provider "aws") }} +aws: + cloudAccountLink: //cloudaccount/{{ .Values.backup.aws.cloudAccountName }} + policyRefs: + - cpln-connector + - aws::ReadOnlyAccess + - {{ .Values.backup.aws.policyName | quote }} +{{- end }} +{{- if and .Values.backup.enabled (eq .Values.backup.provider "gcp") }} +gcp: + bindings: + - resource: //storage.googleapis.com/projects/_/buckets/{{ .Values.backup.gcp.bucket }} + roles: + - roles/storage.objectAdmin + cloudAccountLink: //cloudaccount/{{ .Values.backup.gcp.cloudAccountName }} + scopes: + - https://www.googleapis.com/auth/cloud-platform +{{- end }} \ No newline at end of file diff --git a/redis/versions/3.3.0/templates/identity-sentinel.yaml b/redis/versions/3.3.0/templates/identity-sentinel.yaml new file mode 100644 index 00000000..c7311c98 --- /dev/null +++ b/redis/versions/3.3.0/templates/identity-sentinel.yaml @@ -0,0 +1,4 @@ +kind: identity +name: {{ include "redis.sentinelIdentity.name" . 
}} +description: Redis Sentinel Identity +gvc: {{ .Values.global.cpln.gvc }} \ No newline at end of file diff --git a/redis/versions/3.3.0/templates/policy-backup.yaml b/redis/versions/3.3.0/templates/policy-backup.yaml new file mode 100644 index 00000000..3185c454 --- /dev/null +++ b/redis/versions/3.3.0/templates/policy-backup.yaml @@ -0,0 +1,17 @@ +{{- if .Values.backup.enabled }} +kind: policy +name: {{ include "redis.backupPolicy.name" . }} +bindings: + - permissions: + - reveal + principalLinks: + - //gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "redis.identity.name" . }} +targetKind: secret +targetLinks: + - //secret/{{ include "redis.secretBackup.name" . }} + {{- if and (hasKey .Values.redis "auth") (hasKey .Values.redis.auth "fromSecret") .Values.redis.auth.fromSecret.enabled }} + - //secret/{{ .Values.redis.auth.fromSecret.name }} + {{- else if and (hasKey .Values.redis "auth") (hasKey .Values.redis.auth "password") .Values.redis.auth.password.enabled }} + - //secret/{{ include "redis.secretPassword.name" . }} + {{- end }} +{{- end }} diff --git a/redis/versions/3.3.0/templates/policy-redis.yaml b/redis/versions/3.3.0/templates/policy-redis.yaml new file mode 100644 index 00000000..322d26a1 --- /dev/null +++ b/redis/versions/3.3.0/templates/policy-redis.yaml @@ -0,0 +1,20 @@ +kind: policy +name: {{ include "redis.policy.name" . }} +bindings: + - permissions: + - reveal + principalLinks: + - //gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "redis.identity.name" . }} +targetKind: secret +targetLinks: + - //secret/{{ include "redis.secretConfig.name" . }} + {{- if and (hasKey .Values.redis "auth") (hasKey .Values.redis.auth "fromSecret") .Values.redis.auth.fromSecret.enabled }} + - //secret/{{ .Values.redis.auth.fromSecret.name }} + {{- else if and (hasKey .Values.redis "auth") (hasKey .Values.redis.auth "password") .Values.redis.auth.password.enabled }} + - //secret/{{ include "redis.secretPassword.name" . 
}} + {{- end }} + {{- if and (hasKey .Values.sentinel "auth") (hasKey .Values.sentinel.auth "fromSecret") .Values.sentinel.auth.fromSecret.enabled }} + - //secret/{{ .Values.sentinel.auth.fromSecret.name }} + {{- else if and (hasKey .Values.sentinel "auth") (hasKey .Values.sentinel.auth "password") .Values.sentinel.auth.password.enabled }} + - //secret/{{ include "redis.sentinelSecretPassword.name" . }} + {{- end }} diff --git a/redis/versions/3.3.0/templates/policy-sentinel.yaml b/redis/versions/3.3.0/templates/policy-sentinel.yaml new file mode 100644 index 00000000..f7aff9f5 --- /dev/null +++ b/redis/versions/3.3.0/templates/policy-sentinel.yaml @@ -0,0 +1,20 @@ +kind: policy +name: {{ include "redis.sentinelPolicy.name" . }} +bindings: + - permissions: + - reveal + principalLinks: + - //gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "redis.sentinelIdentity.name" . }} +targetKind: secret +targetLinks: + - //secret/{{ include "redis.sentinelSecretConfig.name" . }} + {{- if and (hasKey .Values.sentinel "auth") (hasKey .Values.sentinel.auth "fromSecret") .Values.sentinel.auth.fromSecret.enabled }} + - //secret/{{ .Values.sentinel.auth.fromSecret.name }} + {{- else if and (hasKey .Values.sentinel "auth") (hasKey .Values.sentinel.auth "password") .Values.sentinel.auth.password.enabled }} + - //secret/{{ include "redis.sentinelSecretPassword.name" . }} + {{- end }} + {{- if and (hasKey .Values.redis "auth") (hasKey .Values.redis.auth "fromSecret") .Values.redis.auth.fromSecret.enabled }} + - //secret/{{ .Values.redis.auth.fromSecret.name }} + {{- else if and (hasKey .Values.redis "auth") (hasKey .Values.redis.auth "password") .Values.redis.auth.password.enabled }} + - //secret/{{ include "redis.secretPassword.name" . 
}} + {{- end }} \ No newline at end of file diff --git a/redis/versions/3.3.0/templates/secret-backup.yaml b/redis/versions/3.3.0/templates/secret-backup.yaml new file mode 100644 index 00000000..d244ca02 --- /dev/null +++ b/redis/versions/3.3.0/templates/secret-backup.yaml @@ -0,0 +1,17 @@ +{{- include "redis.validateBackupConfig" . }} +{{- if .Values.backup.enabled }} +kind: secret +name: {{ include "redis.secretBackup.name" . }} +description: Redis backup configuration +tags: + {{- include "redis.tags" . | nindent 4 }} +type: dictionary +data: +{{- if eq .Values.backup.provider "aws" }} + backup-bucket: {{ .Values.backup.aws.bucket | quote }} + aws-region: {{ .Values.backup.aws.region | quote }} +{{- end }} +{{- if eq .Values.backup.provider "gcp" }} + backup-bucket: {{ .Values.backup.gcp.bucket | quote }} +{{- end }} +{{- end }} diff --git a/redis/versions/3.3.0/templates/secret-redis-config.yaml b/redis/versions/3.3.0/templates/secret-redis-config.yaml new file mode 100644 index 00000000..6f894191 --- /dev/null +++ b/redis/versions/3.3.0/templates/secret-redis-config.yaml @@ -0,0 +1,15 @@ +kind: secret +name: {{ include "redis.secretConfig.name" . 
}} +type: opaque +data: + encoding: plain + payload: |- + bind 0.0.0.0 + protected-mode no + save 900 1 + save 300 10 + save 60 10000 + appendonly yes + repl-backlog-size {{ .Values.redis.replication.backlogSize }} + repl-timeout {{ .Values.redis.replication.timeout }} + client-output-buffer-limit slave {{ .Values.redis.replication.slaveOutputBufferLimit }} \ No newline at end of file diff --git a/redis/versions/3.3.0/templates/secret-redis-password.yaml b/redis/versions/3.3.0/templates/secret-redis-password.yaml new file mode 100644 index 00000000..51b90dc7 --- /dev/null +++ b/redis/versions/3.3.0/templates/secret-redis-password.yaml @@ -0,0 +1,7 @@ +{{ if and (hasKey .Values.redis "auth") (hasKey .Values.redis.auth "password") .Values.redis.auth.password.enabled }} +kind: secret +name: {{ include "redis.secretPassword.name" . }} +type: dictionary +data: + password: {{ .Values.redis.auth.password.value | quote }} +{{- end }} diff --git a/redis/versions/3.3.0/templates/secret-sentinel-config.yaml b/redis/versions/3.3.0/templates/secret-sentinel-config.yaml new file mode 100644 index 00000000..6465128d --- /dev/null +++ b/redis/versions/3.3.0/templates/secret-sentinel-config.yaml @@ -0,0 +1,16 @@ +kind: secret +name: {{ include "redis.sentinelSecretConfig.name" . 
}} +type: opaque +data: + encoding: plain + payload: |- + {{- if and (hasKey .Values.sentinel "persistence") .Values.sentinel.persistence.enabled }} + dir /etc/sentinel/data + {{- else }} + dir /tmp + {{- end }} + sentinel announce-hostnames yes + sentinel resolve-hostnames yes + sentinel down-after-milliseconds mymaster 5000 + sentinel failover-timeout mymaster 10000 + sentinel parallel-syncs mymaster 1 \ No newline at end of file diff --git a/redis/versions/3.3.0/templates/secret-sentinel-password.yaml b/redis/versions/3.3.0/templates/secret-sentinel-password.yaml new file mode 100644 index 00000000..1e752fc0 --- /dev/null +++ b/redis/versions/3.3.0/templates/secret-sentinel-password.yaml @@ -0,0 +1,7 @@ +{{ if and (hasKey .Values.sentinel "auth") (hasKey .Values.sentinel.auth "password") .Values.sentinel.auth.password.enabled }} +kind: secret +name: {{ include "redis.sentinelSecretPassword.name" . }} +type: dictionary +data: + password: {{ .Values.sentinel.auth.password.value | quote}} +{{- end }} diff --git a/redis/versions/3.3.0/templates/volumeset-redis.yaml b/redis/versions/3.3.0/templates/volumeset-redis.yaml new file mode 100644 index 00000000..5b2eb0ef --- /dev/null +++ b/redis/versions/3.3.0/templates/volumeset-redis.yaml @@ -0,0 +1,26 @@ +{{- if and (hasKey .Values.redis "persistence") .Values.redis.persistence.enabled }} +kind: volumeset +name: {{ include "redis.volume.name" . 
}} +gvc: {{ .Values.global.cpln.gvc }} +spec: + fileSystemType: {{ .Values.redis.persistence.volumes.data.fileSystemType }} + initialCapacity: {{ .Values.redis.persistence.volumes.data.initialCapacity }} + performanceClass: {{ .Values.redis.persistence.volumes.data.performanceClass }} + {{- if and .Values.redis.persistence.volumes.data.customEncryption .Values.redis.persistence.volumes.data.customEncryption.enabled }} + customEncryption: + regions: + {{ .Values.redis.persistence.volumes.data.customEncryption.region }}: + keyId: '{{ .Values.redis.persistence.volumes.data.customEncryption.keyId }}' + {{- end }} + {{- if .Values.redis.persistence.volumes.data.snapshots }} + snapshots: + retentionDuration: {{ .Values.redis.persistence.volumes.data.snapshots.retentionDuration }} + schedule: {{ .Values.redis.persistence.volumes.data.snapshots.schedule }} + {{- end }} + {{- if .Values.redis.persistence.volumes.data.autoscaling }} + autoscaling: + maxCapacity: {{ .Values.redis.persistence.volumes.data.autoscaling.maxCapacity }} + minFreePercentage: {{ .Values.redis.persistence.volumes.data.autoscaling.minFreePercentage }} + scalingFactor: {{ .Values.redis.persistence.volumes.data.autoscaling.scalingFactor }} + {{- end }} +{{- end }} \ No newline at end of file diff --git a/redis/versions/3.3.0/templates/volumeset-sentinel.yaml b/redis/versions/3.3.0/templates/volumeset-sentinel.yaml new file mode 100644 index 00000000..976c315d --- /dev/null +++ b/redis/versions/3.3.0/templates/volumeset-sentinel.yaml @@ -0,0 +1,26 @@ +{{- if and (hasKey .Values.sentinel "persistence") .Values.sentinel.persistence.enabled }} +kind: volumeset +name: {{ include "redis.sentinelVolume.name" . 
}} +gvc: {{ .Values.global.cpln.gvc }} +spec: + fileSystemType: {{ .Values.sentinel.persistence.volumes.data.fileSystemType }} + initialCapacity: {{ .Values.sentinel.persistence.volumes.data.initialCapacity }} + performanceClass: {{ .Values.sentinel.persistence.volumes.data.performanceClass }} + {{- if and .Values.sentinel.persistence.volumes.data.customEncryption .Values.sentinel.persistence.volumes.data.customEncryption.enabled }} + customEncryption: + regions: + {{ .Values.sentinel.persistence.volumes.data.customEncryption.region }}: + keyId: '{{ .Values.sentinel.persistence.volumes.data.customEncryption.keyId }}' + {{- end }} + {{- if .Values.sentinel.persistence.volumes.data.snapshots }} + snapshots: + retentionDuration: {{ .Values.sentinel.persistence.volumes.data.snapshots.retentionDuration }} + schedule: {{ .Values.sentinel.persistence.volumes.data.snapshots.schedule }} + {{- end }} + {{- if .Values.sentinel.persistence.volumes.data.autoscaling }} + autoscaling: + maxCapacity: {{ .Values.sentinel.persistence.volumes.data.autoscaling.maxCapacity }} + minFreePercentage: {{ .Values.sentinel.persistence.volumes.data.autoscaling.minFreePercentage }} + scalingFactor: {{ .Values.sentinel.persistence.volumes.data.autoscaling.scalingFactor }} + {{- end }} +{{- end }} \ No newline at end of file diff --git a/redis/versions/3.3.0/templates/workload-backup.yaml b/redis/versions/3.3.0/templates/workload-backup.yaml new file mode 100644 index 00000000..56106fcb --- /dev/null +++ b/redis/versions/3.3.0/templates/workload-backup.yaml @@ -0,0 +1,75 @@ +{{- include "redis.validateBackupConfig" . }} +{{- if .Values.backup.enabled }} +kind: workload +name: {{ include "redis.backup.name" . }} +description: Redis Backup +tags: + {{- include "redis.tags" . 
| nindent 2 }} +spec: + type: cron + containers: + - name: backup-redis + cpu: {{ .Values.backup.resources.cpu | quote }} + memory: {{ .Values.backup.resources.memory | quote }} + env: + {{- if eq .Values.backup.provider "aws" }} + - name: AWS_REGION + value: cpln://secret/{{ include "redis.secretBackup.name" . }}.aws-region + - name: BACKUP_PROVIDER + value: aws + - name: BACKUP_BUCKET + value: cpln://secret/{{ include "redis.secretBackup.name" . }}.backup-bucket + - name: BACKUP_PREFIX + value: {{ .Values.backup.aws.prefix | quote }} + {{- end }} + {{- if eq .Values.backup.provider "gcp" }} + - name: BACKUP_PROVIDER + value: gcp + - name: BACKUP_BUCKET + value: cpln://secret/{{ include "redis.secretBackup.name" . }}.backup-bucket + - name: BACKUP_PREFIX + value: {{ .Values.backup.gcp.prefix | quote }} + {{- end }} + - name: REDIS_HOST + value: {{ include "redis.name" . }}.{{ .Values.global.cpln.gvc }}.cpln.local + {{- if and (hasKey .Values.redis "auth") (hasKey .Values.redis.auth "fromSecret") .Values.redis.auth.fromSecret.enabled }} + - name: REDIS_PASSWORD + value: cpln://secret/{{ .Values.redis.auth.fromSecret.name }}.{{ .Values.redis.auth.fromSecret.passwordKey }} + {{- else if and (hasKey .Values.redis "auth") (hasKey .Values.redis.auth "password") .Values.redis.auth.password.enabled }} + - name: REDIS_PASSWORD + value: cpln://secret/{{ include "redis.secretPassword.name" . 
}}.password + {{- end }} + image: {{ .Values.backup.image }} + inheritEnv: false + defaultOptions: + autoscaling: + maxConcurrency: 0 + maxScale: 1 + metric: disabled + minScale: 1 + scaleToZeroDelay: 300 + target: 95 + capacityAI: false + debug: false + suspend: false + timeoutSeconds: 5 + firewallConfig: + external: + inboundAllowCIDR: [] + inboundBlockedCIDR: [] + outboundAllowCIDR: + - 0.0.0.0/0 + outboundAllowHostname: [] + outboundAllowPort: [] + outboundBlockedCIDR: [] + internal: + inboundAllowType: none + inboundAllowWorkload: [] + identityLink: //identity/{{ include "redis.identity.name" . }} + job: + concurrencyPolicy: Forbid + historyLimit: 5 + restartPolicy: Never + schedule: {{ .Values.backup.schedule }} + supportDynamicTags: false +{{- end }} diff --git a/redis/versions/3.3.0/templates/workload-redis.yaml b/redis/versions/3.3.0/templates/workload-redis.yaml new file mode 100644 index 00000000..d7302624 --- /dev/null +++ b/redis/versions/3.3.0/templates/workload-redis.yaml @@ -0,0 +1,301 @@ +{{ include "validateAuth" (dict "auth" .Values.redis.auth) }} +kind: workload +name: {{ include "redis.name" . }} +description: Redis +tags: + {{- if .Values.redis.tags }} +{{ toYaml .Values.redis.tags | indent 2 }} + {{- end }} + # Sentinel discovery and replica startup must resolve `.:6379` even + # while the pod is still resyncing or marked NotReady by the replication-aware + # readiness probe. This tag exposes not-yet-Ready pods on the headless service so + # peers can reach them for replication and Sentinel can monitor them. + cpln/publishNotReadyAddresses: "true" + {{- include "redis.tags" . 
| nindent 2 }} +spec: + type: stateful + containers: + - name: redis + env: + {{- if .Values.redis.env }} +{{ toYaml .Values.redis.env | indent 8 }} + {{- end }} + {{- if and (hasKey .Values.redis "replicaDirect") .Values.redis.replicaDirect }} + - name: REPLICA_DIRECT + value: "true" + {{- end }} + {{- if and (hasKey .Values.redis "auth") (hasKey .Values.redis.auth "fromSecret") .Values.redis.auth.fromSecret.enabled }} + - name: CUSTOM_REDIS_PASSWORD + value: cpln://secret/{{ .Values.redis.auth.fromSecret.name }}.{{ .Values.redis.auth.fromSecret.passwordKey }} + {{- else if and (hasKey .Values.redis "auth") (hasKey .Values.redis.auth "password") .Values.redis.auth.password.enabled }} + - name: CUSTOM_REDIS_PASSWORD + value: cpln://secret/{{ include "redis.secretPassword.name" . }}.password + {{- end }} + {{- if and (hasKey .Values.sentinel "auth") (hasKey .Values.sentinel.auth "fromSecret") .Values.sentinel.auth.fromSecret.enabled }} + - name: CUSTOM_SENTINEL_PASSWORD + value: cpln://secret/{{ .Values.sentinel.auth.fromSecret.name }}.{{ .Values.sentinel.auth.fromSecret.passwordKey }} + {{- else if and (hasKey .Values.sentinel "auth") (hasKey .Values.sentinel.auth "password") .Values.sentinel.auth.password.enabled }} + - name: CUSTOM_SENTINEL_PASSWORD + value: cpln://secret/{{ include "redis.sentinelSecretPassword.name" . 
}}.password + {{- end }} + {{- if not (or .Values.redis.env .Values.redis.replicaDirect (and (hasKey .Values.redis "auth") (or (and (hasKey .Values.redis.auth "fromSecret") .Values.redis.auth.fromSecret.enabled) (and (hasKey .Values.redis.auth "password") .Values.redis.auth.password.enabled))) (and (hasKey .Values.sentinel "auth") (or (and (hasKey .Values.sentinel.auth "fromSecret") .Values.sentinel.auth.fromSecret.enabled) (and (hasKey .Values.sentinel.auth "password") .Values.sentinel.auth.password.enabled)))) }} + [] + {{- end }} + args: + - '-c' + - |- + mkdir /etc/redis + + cp /config/redis.conf /etc/redis/redis.conf + + if [ -n "$CUSTOM_REDIS_PASSWORD" ]; then + echo "\nrequirepass $CUSTOM_REDIS_PASSWORD" >> /etc/redis/redis.conf + echo "\nmasterauth $CUSTOM_REDIS_PASSWORD" >> /etc/redis/redis.conf + fi + {{- if and (hasKey .Values.redis "publicAccess") .Values.redis.publicAccess.enabled }} + POD_ID=$(echo "$POD_NAME" | rev | cut -d'-' -f 1 | rev) + PORT=$((6380 + POD_ID)) + echo "\nport $PORT" >> /etc/redis/redis.conf + echo "\nreplica-announce-ip {{ .Values.redis.publicAccess.address }}" >> /etc/redis/redis.conf + echo "\nreplica-announce-port $PORT" >> /etc/redis/redis.conf + SELF_HOST="{{ .Values.redis.publicAccess.address }}" + SELF_PORT=$PORT + {{ else }} + PORT=6379 + echo "\nport 6379" >> /etc/redis/redis.conf + POD_ID=$(echo "$POD_NAME" | rev | cut -d'-' -f 1 | rev) + LOCATION=${CPLN_LOCATION##*/} + CPLN_WORKLOAD_NAME="${CPLN_WORKLOAD##*/}" + if [ -n "$REPLICA_DIRECT" ]; then + echo "\nreplica-announce-ip replica-${POD_ID}.${CPLN_WORKLOAD_NAME}.${LOCATION}.${CPLN_GVC}.cpln.local" >> /etc/redis/redis.conf + SELF_HOST="replica-${POD_ID}.${CPLN_WORKLOAD_NAME}.${LOCATION}.${CPLN_GVC}.cpln.local" + else + echo "\nreplica-announce-ip ${HOSTNAME}.{{ include "redis.name" . }}" >> /etc/redis/redis.conf + SELF_HOST="${HOSTNAME}.{{ include "redis.name" . 
}}" + fi + echo "\nreplica-announce-port 6379" >> /etc/redis/redis.conf + SELF_PORT=6379 + {{ end }} + + # Discovery: ask sentinel who the master is, retrying forever. A + # redis cluster without reachable sentinels isn't manageable + # regardless, so blocking startup until they respond is acceptable. + # On cold start, sentinels respond from their static config + # (mymaster -> pod-0); after any failover with sentinel persistence + # enabled, they respond from CONFIG REWRITE state (the real master). + # The pod whose announced identity matches sentinel's answer boots + # as master; everyone else replicaof's it. + SENTINEL_REPLICAS={{ .Values.sentinel.replicas }} + SIDX=0 + MASTER_HOST="" + MASTER_PORT="" + until echo "$MASTER_PORT" | grep -qE '^[0-9]+$'; do + {{- if and (hasKey .Values.sentinel "publicAccess") .Values.sentinel.publicAccess.enabled }} + S_HOST="{{ include "redis.sentinel.name" . }}-${SIDX}.{{ include "redis.sentinel.name" . }}" + S_PORT=$((26380 + SIDX)) + {{- else if and (hasKey .Values.sentinel "replicaDirect") .Values.sentinel.replicaDirect }} + S_HOST="replica-${SIDX}.{{ include "redis.sentinel.name" . }}.${LOCATION}.${CPLN_GVC}.cpln.local" + S_PORT=26379 + {{- else }} + S_HOST="{{ include "redis.sentinel.name" . }}-${SIDX}.{{ include "redis.sentinel.name" . }}" + S_PORT=26379 + {{- end }} + if [ -n "$CUSTOM_SENTINEL_PASSWORD" ]; then + INFO=$(redis-cli -h "$S_HOST" -p "$S_PORT" --no-auth-warning -a "$CUSTOM_SENTINEL_PASSWORD" SENTINEL get-master-addr-by-name mymaster 2>/dev/null) + else + INFO=$(redis-cli -h "$S_HOST" -p "$S_PORT" SENTINEL get-master-addr-by-name mymaster 2>/dev/null) + fi + MASTER_HOST=$(echo "$INFO" | head -1) + MASTER_PORT=$(echo "$INFO" | tail -1) + if ! 
echo "$MASTER_PORT" | grep -qE '^[0-9]+$'; then + echo "Sentinel at $S_HOST:$S_PORT not responding; trying next" + SIDX=$(( (SIDX + 1) % SENTINEL_REPLICAS )) + [ $SIDX -eq 0 ] && sleep 5 + fi + done + echo "Sentinel says master=$MASTER_HOST:$MASTER_PORT (self=$SELF_HOST:$SELF_PORT)" + + # exec replaces the shell with redis-server so SIGTERM from the + # platform routes directly to redis-server instead of being swallowed + # by /bin/sh. Without exec, shutdown waits the full grace period + # before SIGKILL since the shell doesn't forward signals to children. + if [ "$MASTER_HOST" = "$SELF_HOST" ] && [ "$MASTER_PORT" = "$SELF_PORT" ]; then + echo "Booting as master" + exec {{ .Values.redis.serverCommand }} /etc/redis/redis.conf {{ .Values.redis.extraArgs }} --dir {{ .Values.redis.dataDir }} + else + echo "Booting as replica of $MASTER_HOST:$MASTER_PORT" + # Self-heal sentinel's view of the slave set. Without this, a + # scale-up race or a sentinel restart (which can wipe known-replica + # entries on volume loss) can leave this replica invisible to + # sentinel and orphaned at the next failover. SENTINEL RESET only + # refreshes sentinel's bookkeeping — no failover is triggered. + ( + while true; do + if [ -n "$CUSTOM_REDIS_PASSWORD" ]; then + redis-cli -p $PORT --no-auth-warning -a "$CUSTOM_REDIS_PASSWORD" ping >/dev/null 2>&1 && break + else + redis-cli -p $PORT ping >/dev/null 2>&1 && break + fi + sleep 2 + done + sleep 3 + IDX=0 + while [ $IDX -lt $SENTINEL_REPLICAS ]; do + {{- if and (hasKey .Values.sentinel "publicAccess") .Values.sentinel.publicAccess.enabled }} + RS_HOST="{{ include "redis.sentinel.name" . }}-${IDX}.{{ include "redis.sentinel.name" . }}" + RS_PORT=$((26380 + IDX)) + {{- else if and (hasKey .Values.sentinel "replicaDirect") .Values.sentinel.replicaDirect }} + RS_HOST="replica-${IDX}.{{ include "redis.sentinel.name" . }}.${LOCATION}.${CPLN_GVC}.cpln.local" + RS_PORT=26379 + {{- else }} + RS_HOST="{{ include "redis.sentinel.name" . 
}}-${IDX}.{{ include "redis.sentinel.name" . }}" + RS_PORT=26379 + {{- end }} + if [ -n "$CUSTOM_SENTINEL_PASSWORD" ]; then + redis-cli -h $RS_HOST -p $RS_PORT --no-auth-warning -a "$CUSTOM_SENTINEL_PASSWORD" SENTINEL RESET mymaster >/dev/null 2>&1 || true + else + redis-cli -h $RS_HOST -p $RS_PORT SENTINEL RESET mymaster >/dev/null 2>&1 || true + fi + IDX=$((IDX + 1)) + done + ) & + exec {{ .Values.redis.serverCommand }} /etc/redis/redis.conf {{ .Values.redis.extraArgs }} --dir {{ .Values.redis.dataDir }} --replicaof "$MASTER_HOST" "$MASTER_PORT" + fi + command: /bin/sh + cpu: {{ .Values.redis.resources.cpu }} + memory: {{ .Values.redis.resources.memory }} + minCpu: {{ .Values.redis.resources.minCpu }} + minMemory: {{ .Values.redis.resources.minMemory }} + image: {{ .Values.redis.image }} + readinessProbe: + exec: + command: + - /bin/bash + - "-c" + - |- + {{- if and (hasKey .Values.redis "publicAccess") .Values.redis.publicAccess.enabled }} + POD_ID=$(echo "$POD_NAME" | rev | cut -d'-' -f 1 | rev) + PORT=$((6380 + POD_ID)) + {{- else }} + PORT=6379 + {{- end }} + rcli() { + if [ -n "$CUSTOM_REDIS_PASSWORD" ]; then + redis-cli -p $PORT --no-auth-warning -a "$CUSTOM_REDIS_PASSWORD" "$@" + else + redis-cli -p $PORT "$@" + fi + } + rcli ping >/dev/null && \ + if [ "$(rcli role | head -1)" = "slave" ]; then + [ "$(rcli info replication | awk -F: '/master_link_status/{print $2}' | tr -d '\r')" = "up" ] && \ + [ "$(rcli info replication | awk -F: '/master_sync_in_progress/{print $2}' | tr -d '\r')" = "0" ] + fi + failureThreshold: 10 + initialDelaySeconds: 10 + periodSeconds: 5 + successThreshold: 1 + timeoutSeconds: 4 + # Explicit liveness probe — keep it permissive (just "is the process up?") + # so the platform doesn't kill a pod mid-resync. cpln defaults the liveness + # probe to whatever the readiness probe is, and our readiness probe + # intentionally returns failure during full resync (master_link_status:up + # AND master_sync_in_progress:0). 
Without this override, a slave doing a + # full resync would be killed before it could finish. + livenessProbe: + {{- if and (hasKey .Values.redis "publicAccess") .Values.redis.publicAccess.enabled }} + exec: + command: + - /bin/bash + - "-c" + - |- + POD_ID=$(echo "$POD_NAME" | rev | cut -d'-' -f 1 | rev) + PORT=$((6380 + POD_ID)) + exec 3<>/dev/tcp/127.0.0.1/$PORT + {{- else }} + tcpSocket: + port: 6379 + {{- end }} + failureThreshold: 5 + initialDelaySeconds: 30 + periodSeconds: 10 + successThreshold: 1 + timeoutSeconds: 5 + inheritEnv: false + ports: +{{- if and (hasKey .Values.redis "publicAccess") .Values.redis.publicAccess.enabled (gt (.Values.redis.replicas | int) 0) }} + {{- $startPort := 6380 }} + {{- $replicas := $.Values.redis.replicas | int }} + {{- range $replicaIndex := until $replicas }} + - number: {{ add $startPort $replicaIndex }} + protocol: tcp + {{- end }} +{{- else }} + - number: 6379 + protocol: tcp +{{- end }} + volumes: + - path: /config/redis.conf + recoveryPolicy: retain + uri: cpln://secret/{{ include "redis.secretConfig.name" . }} + {{- if and (hasKey .Values.redis "persistence") .Values.redis.persistence.enabled }} + - path: {{ .Values.redis.dataDir }} + recoveryPolicy: retain + uri: cpln://volumeset/{{ include "redis.volume.name" . }} + {{- end }} + defaultOptions: + autoscaling: + maxConcurrency: 0 + metric: disabled + minScale: {{ .Values.redis.replicas }} + maxScale: {{ .Values.redis.replicas }} + scaleToZeroDelay: 300 + target: 100 + capacityAI: false + debug: false + suspend: false + timeoutSeconds: {{ .Values.redis.timeoutSeconds }} + {{- if .Values.redis.multiZone }} + multiZone: + enabled: true + {{- else }} + multiZone: + enabled: false + {{- end }} + # Parallel pod management. Default OrderedReady serializes replica boots, + # which forces sentinels and redis pods to wait for each peer to be Ready + # before the next starts and produces extended cold-start +sdown noise. 
+ # Parallel boots all replicas at once; publishNotReadyAddresses keeps peer + # DNS resolvable during the simultaneous cold start so the cluster can form. + rolloutOptions: + scalingPolicy: Parallel +{{- if .Values.redis.requestRetryPolicy }} + requestRetryPolicy: +{{ toYaml .Values.redis.requestRetryPolicy | indent 4 }} +{{- end }} +{{- if .Values.redis.firewall }} + firewallConfig: + {{- if or (hasKey .Values.redis.firewall "external_inboundAllowCIDR") (hasKey .Values.redis.firewall "external_outboundAllowCIDR") }} + external: + inboundAllowCIDR: {{- if .Values.redis.firewall.external_inboundAllowCIDR }}{{ .Values.redis.firewall.external_inboundAllowCIDR | splitList "," | toYaml | nindent 8 }}{{- else }} []{{- end }} + outboundAllowCIDR: {{- if .Values.redis.firewall.external_outboundAllowCIDR }}{{ .Values.redis.firewall.external_outboundAllowCIDR | splitList "," | toYaml | nindent 8 }}{{- else }} []{{- end }} + {{- end }} + {{- if hasKey .Values.redis.firewall "internal_inboundAllowType" }} + internal: + inboundAllowType: {{ default "[]" .Values.redis.firewall.internal_inboundAllowType }} + {{- if .Values.redis.firewall.inboundAllowWorkload }} + inboundAllowWorkload: {{ .Values.redis.firewall.inboundAllowWorkload | toYaml | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} + identityLink: //gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "redis.identity.name" . 
}} +{{- if and (hasKey .Values.redis "publicAccess") .Values.redis.publicAccess.enabled }} + loadBalancer: + replicaDirect: true +{{- else if and (hasKey .Values.redis "replicaDirect") .Values.redis.replicaDirect }} + loadBalancer: + replicaDirect: true +{{- else }} + loadBalancer: + replicaDirect: false +{{- end }} diff --git a/redis/versions/3.3.0/templates/workload-sentinel.yaml b/redis/versions/3.3.0/templates/workload-sentinel.yaml new file mode 100644 index 00000000..82ab5787 --- /dev/null +++ b/redis/versions/3.3.0/templates/workload-sentinel.yaml @@ -0,0 +1,223 @@ +{{ include "validateAuth" (dict "auth" .Values.sentinel.auth) }} +kind: workload +name: {{ include "redis.sentinel.name" . }} +description: Redis Sentinel +tags: + {{- if .Values.sentinel.tags }} +{{ toYaml .Values.sentinel.tags | indent 2 }} + {{- end }} + # Currently a no-op: the sentinel readiness probe is a plain `redis-cli ping`, so + # pods become Ready as soon as the port answers and the headless service exposes + # them anyway. Set here as a hedge — if the probe is ever tightened (e.g. to + # require quorum visibility or a known master), peers must still resolve each + # other via `.` during cold start to form the quorum, otherwise + # they deadlock: NotReady because no peers, no peers because NotReady. + cpln/publishNotReadyAddresses: "true" + {{- include "redis.tags" . | nindent 2 }} +spec: + type: stateful + containers: + - name: sentinel + args: + - '-c' + - |- + {{- if and (hasKey .Values.sentinel "persistence") .Values.sentinel.persistence.enabled }} + mkdir -p /etc/sentinel/data + {{- else }} + mkdir -p /etc/sentinel + {{- end }} + + # bootstrap_sentinel_conf: write /etc/sentinel/sentinel.conf only if + # sentinel hasn't already taken ownership of the file. The marker we + # check is the `sentinel myid ` directive — sentinel writes this + # via CONFIG REWRITE within milliseconds of its first successful + # startup, and the chart's static template never contains it. 
So: + # - marker present → previous boot got far enough that sentinel + # owns the file. Preserve everything (current master after any + # failover, known replicas, peer sentinels). + # - marker absent → file missing, empty, or written but sentinel + # never reached its first CONFIG REWRITE. Re-run bootstrap; + # idempotent against the static config so safe to re-run. + # When sentinel.persistence is disabled this still runs every boot + # (ephemeral filesystem, no marker survives) — same behavior as the + # pre-marker chart. The check only changes behavior when /etc/sentinel + # is backed by a persistent volume. + bootstrap_sentinel_conf() { + if [ -s /etc/sentinel/sentinel.conf ] && grep -q "^sentinel myid " /etc/sentinel/sentinel.conf; then + echo "sentinel.conf already bootstrapped; preserving sentinel-managed state" + return 0 + fi + echo "Bootstrapping sentinel.conf from static config" + cp /config/sentinel.conf /etc/sentinel/sentinel.conf + + if [ -n "$CUSTOM_REDIS_PASSWORD" ]; then + echo "\nsentinel auth-pass mymaster $CUSTOM_REDIS_PASSWORD" >> /etc/sentinel/sentinel.conf + fi + + if [ -n "$CUSTOM_SENTINEL_PASSWORD" ]; then + echo "\nrequirepass $CUSTOM_SENTINEL_PASSWORD" >> /etc/sentinel/sentinel.conf + fi + + {{- if and (hasKey .Values.sentinel "publicAccess") .Values.sentinel.publicAccess.enabled }} + POD_ID=$(echo "$POD_NAME" | rev | cut -d'-' -f 1 | rev) + PORT=$((26380 + POD_ID)) + echo "\nport $PORT" >> /etc/sentinel/sentinel.conf + echo "\nsentinel announce-ip {{ .Values.sentinel.publicAccess.address }}" >> /etc/sentinel/sentinel.conf + echo "\nsentinel announce-port $PORT" >> /etc/sentinel/sentinel.conf + {{ else }} + echo "\nport 26379" >> /etc/sentinel/sentinel.conf + POD_ID=$(echo "$POD_NAME" | rev | cut -d'-' -f 1 | rev) + LOCATION=${CPLN_LOCATION##*/} + CPLN_WORKLOAD_NAME="${CPLN_WORKLOAD##*/}" + if [ -n "$REPLICA_DIRECT" ]; then + echo "\nsentinel announce-ip replica-${POD_ID}.${CPLN_WORKLOAD_NAME}.${LOCATION}.${CPLN_GVC}.cpln.local" >> 
/etc/sentinel/sentinel.conf + else + echo "\nsentinel announce-ip ${HOSTNAME}.{{ include "redis.sentinel.name" . }}" >> /etc/sentinel/sentinel.conf + fi + echo "\nsentinel announce-port 26379" >> /etc/sentinel/sentinel.conf + {{ end }} + {{- if and (hasKey .Values.redis "publicAccess") .Values.redis.publicAccess.enabled }} + echo "sentinel monitor mymaster {{ .Values.redis.publicAccess.address }} 6380 ${REDIS_SENTINEL_QUORUM}" >> /etc/sentinel/sentinel.conf + {{- else if and (hasKey .Values.redis "replicaDirect") .Values.redis.replicaDirect }} + echo "sentinel monitor mymaster replica-0.{{ include "redis.name" . }}.${LOCATION}.${CPLN_GVC}.cpln.local 6379 ${REDIS_SENTINEL_QUORUM}" >> /etc/sentinel/sentinel.conf + {{- else }} + echo "sentinel monitor mymaster {{ include "redis.name" . }}-0.{{ include "redis.name" . }} 6379 ${REDIS_SENTINEL_QUORUM}" >> /etc/sentinel/sentinel.conf + {{- end }} + } + + bootstrap_sentinel_conf + + # exec replaces the shell with redis-sentinel so SIGTERM from the + # platform routes directly to redis-sentinel for clean shutdown. + # Without exec, the shell swallows the signal and sentinel only dies + # when the platform sends SIGKILL after the grace period expires. 
+ exec redis-sentinel /etc/sentinel/sentinel.conf + command: /bin/sh + cpu: {{ .Values.sentinel.resources.cpu }} + memory: {{ .Values.sentinel.resources.memory }} + minCpu: {{ .Values.sentinel.resources.minCpu }} + minMemory: {{ .Values.sentinel.resources.minMemory }} + env: + - name: REDIS_SENTINEL_QUORUM + value: '{{ if .Values.sentinel.quorumAutoCalculation }}{{ add (div (int .Values.sentinel.replicas) 2) 1 }}{{ else }}{{ .Values.sentinel.quorumOverride }}{{ end }}' + - name: REDIS_SENTINEL_DATA_DIR + value: /etc/sentinel/data + {{- if and (hasKey .Values.redis "auth") (hasKey .Values.redis.auth "fromSecret") .Values.redis.auth.fromSecret.enabled }} + - name: CUSTOM_REDIS_PASSWORD + value: cpln://secret/{{ .Values.redis.auth.fromSecret.name }}.{{ .Values.redis.auth.fromSecret.passwordKey }} + {{- else if and (hasKey .Values.redis "auth") (hasKey .Values.redis.auth "password") .Values.redis.auth.password.enabled }} + - name: CUSTOM_REDIS_PASSWORD + value: cpln://secret/{{ include "redis.secretPassword.name" . }}.password + {{- end }} + {{- if and (hasKey .Values.sentinel "auth") (hasKey .Values.sentinel.auth "fromSecret") .Values.sentinel.auth.fromSecret.enabled }} + - name: CUSTOM_SENTINEL_PASSWORD + value: cpln://secret/{{ .Values.sentinel.auth.fromSecret.name }}.{{ .Values.sentinel.auth.fromSecret.passwordKey }} + {{- else if and (hasKey .Values.sentinel "auth") (hasKey .Values.sentinel.auth "password") .Values.sentinel.auth.password.enabled }} + - name: CUSTOM_SENTINEL_PASSWORD + value: cpln://secret/{{ include "redis.sentinelSecretPassword.name" . 
}}.password + {{- end }} + {{- if and (hasKey .Values.sentinel "replicaDirect") .Values.sentinel.replicaDirect }} + - name: REPLICA_DIRECT + value: "true" + {{- end }} + {{- if .Values.sentinel.env }} +{{ toYaml .Values.sentinel.env | indent 8 }} + {{- end }} + image: {{ .Values.sentinel.image }} + readinessProbe: + exec: + command: + - /bin/bash + - "-c" + - |- + {{- if and (hasKey .Values.sentinel "publicAccess") .Values.sentinel.publicAccess.enabled }} + POD_ID=$(echo "$POD_NAME" | rev | cut -d'-' -f 1 | rev) + PORT=$((26380 + POD_ID)) + {{- else }} + PORT=26379 + {{- end }} + if [ ! -z "$CUSTOM_SENTINEL_PASSWORD" ]; then + redis-cli -p $PORT --no-auth-warning -a "$CUSTOM_SENTINEL_PASSWORD" ping; + else + redis-cli -p $PORT ping; + fi + failureThreshold: 10 + initialDelaySeconds: 10 + periodSeconds: 5 + successThreshold: 1 + timeoutSeconds: 4 + inheritEnv: false + ports: +{{- if and (hasKey .Values.sentinel "publicAccess") .Values.sentinel.publicAccess.enabled (gt (.Values.sentinel.replicas | int) 0) }} + {{- $startPort := 26380 }} + {{- $replicas := $.Values.sentinel.replicas | int }} + {{- range $replicaIndex := until $replicas }} + - number: {{ add $startPort $replicaIndex }} + protocol: tcp + {{- end }} +{{- else }} + - number: 26379 + protocol: tcp +{{- end }} + volumes: + - path: /config/sentinel.conf + recoveryPolicy: retain + uri: cpln://secret/{{ include "redis.sentinelSecretConfig.name" . }} + {{- if and (hasKey .Values.sentinel "persistence") .Values.sentinel.persistence.enabled }} + - path: /etc/sentinel + recoveryPolicy: retain + uri: cpln://volumeset/{{ include "redis.sentinelVolume.name" . 
}} + {{- end }} + defaultOptions: + autoscaling: + maxConcurrency: 0 + minScale: {{ .Values.sentinel.replicas }} + maxScale: {{ .Values.sentinel.replicas }} + metric: disabled + scaleToZeroDelay: 300 + target: 100 + capacityAI: false + debug: false + suspend: false + timeoutSeconds: {{ .Values.sentinel.timeoutSeconds }} + {{- if .Values.sentinel.multiZone }} + multiZone: + enabled: true + {{- else }} + multiZone: + enabled: false + {{- end }} + # See workload-redis.yaml for rationale. Parallel boots all sentinels at + # once instead of waiting for each to be Ready in sequence. + rolloutOptions: + scalingPolicy: Parallel +{{- if .Values.sentinel.requestRetryPolicy }} + requestRetryPolicy: +{{ toYaml .Values.sentinel.requestRetryPolicy | indent 4 }} +{{- end }} +{{- if .Values.sentinel.firewall }} + firewallConfig: + {{- if or (hasKey .Values.sentinel.firewall "external_inboundAllowCIDR") (hasKey .Values.sentinel.firewall "external_outboundAllowCIDR") }} + external: + inboundAllowCIDR: {{- if .Values.sentinel.firewall.external_inboundAllowCIDR }}{{ .Values.sentinel.firewall.external_inboundAllowCIDR | splitList "," | toYaml | nindent 8 }}{{- else }} []{{- end }} + outboundAllowCIDR: {{- if .Values.sentinel.firewall.external_outboundAllowCIDR }}{{ .Values.sentinel.firewall.external_outboundAllowCIDR | splitList "," | toYaml | nindent 8 }}{{- else }} []{{- end }} + {{- end }} + {{- if hasKey .Values.sentinel.firewall "internal_inboundAllowType" }} + internal: + inboundAllowType: {{ default "[]" .Values.sentinel.firewall.internal_inboundAllowType }} + {{- if .Values.sentinel.firewall.inboundAllowWorkload }} + inboundAllowWorkload: {{ .Values.sentinel.firewall.inboundAllowWorkload | toYaml | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} + identityLink: //gvc/{{ .Values.global.cpln.gvc }}/identity/{{ include "redis.sentinelIdentity.name" . 
}} +{{- if and (hasKey .Values.sentinel "publicAccess") .Values.sentinel.publicAccess.enabled }} + loadBalancer: + replicaDirect: true +{{- else if and (hasKey .Values.sentinel "replicaDirect") .Values.sentinel.replicaDirect }} + loadBalancer: + replicaDirect: true +{{- else }} + loadBalancer: + replicaDirect: false +{{- end }} diff --git a/redis/versions/3.3.0/values.yaml b/redis/versions/3.3.0/values.yaml new file mode 100644 index 00000000..6dff2f11 --- /dev/null +++ b/redis/versions/3.3.0/values.yaml @@ -0,0 +1,166 @@ +redis: + image: redis:7.4 + resources: + cpu: 200m + memory: 256Mi + minCpu: 80m + minMemory: 128Mi + replicas: 2 + timeoutSeconds: 15 + multiZone: false + replicaDirect: false # https://docs.controlplane.com/reference/workload/general#internal-endpoint-formatting + auth: + fromSecret: + enabled: false + name: example-redis-auth-password + passwordKey: password + password: + enabled: false + value: your-password + serverCommand: redis-server # Can be overridden based on the version of redis image + # extraArgs: "--maxclients 20000 --maxmemory 200mb --maxmemory-policy allkeys-lru" + publicAccess: + enabled: false + address: redis-test.example-cpln.com + firewall: + internal_inboundAllowType: "same-gvc" # Options: same-org / same-gvc(Recommended) / workload-list + # external_inboundAllowCIDR: 0.0.0.0/0 # Provide a comma-separated list + # # You can specify additional workloads with either same-gvc or workload-list: + # inboundAllowWorkload: + # - //gvc/main-redis/workload/main-redis-sentinel + # - //gvc/client-gvc/workload/client + # external_outboundAllowCIDR: "0.0.0.0/0" # Provide a comma-separated list + env: [] + tags: {} + # requestRetryPolicy: + # attempts: 2 + # retryOn: + # - connect-failure + # - refused-stream + # - unavailable + # - cancelled + # - resource-exhausted + # - retriable-status-codes + requestRetryPolicy: {} + # Replication tuning. See secret-redis-config.yaml for how these are rendered. 
+ # backlogSize: sized for (peak write throughput × tolerable disconnect window). + # 1mb (Redis default) escalates any brief disconnect to a full RDB resync. 1gb + # covers ~5 minutes of disconnect at ~3MB/s of writes. + # timeout (seconds): bound on full-resync transfer + RDB load + heartbeat. + # 60s (Redis default) is too low for multi-GB datasets; master/slave drop the + # link mid-sync. 300s covers ~30GB at typical 1Gbps + load throughput. + # slaveOutputBufferLimit: "<hard limit> <soft limit> <soft seconds>". Default + # "256mb 64mb 60" can't sustain a full resync of a multi-GB dataset at high + # write rate; master kills the replica mid-stream. Bump for production loads. + replication: + backlogSize: 1gb + timeout: 300 + slaveOutputBufferLimit: "2gb 512mb 300" + dataDir: /data + persistence: + enabled: false + volumes: + data: + initialCapacity: 10 # In GB + performanceClass: general-purpose-ssd # general-purpose-ssd / high-throughput-ssd (Min 1000GB) + fileSystemType: ext4 # ext4 / xfs + snapshots: + retentionDuration: 7d + schedule: 0 0 * * * # UTC + autoscaling: + maxCapacity: 100 # In GB + minFreePercentage: 20 + scalingFactor: 1.2 + # customEncryption: + # enabled: true + # region: aws-us-east-1 # Replace with the appropriate region + # keyId: arn:aws:kms:us-east-1:1234567890:key/d411f35a-1d31-4515-9934-4f193e042d80 # Replace with your AWS KMS key ARN + +sentinel: + image: redis:7.4 + resources: + cpu: 200m + memory: 256Mi + minCpu: 80m + minMemory: 128Mi + replicas: 3 + timeoutSeconds: 10 + multiZone: false + replicaDirect: false # https://docs.controlplane.com/reference/workload/general#internal-endpoint-formatting + quorumAutoCalculation: true # Set to false if you want to override quorum. 
Quorum is (replicas/2)+1 + quorumOverride: null # Only used if quorumAutoCalculation is false + auth: + fromSecret: + enabled: false + name: example-redis-auth-password + passwordKey: password + password: + enabled: false + value: your-password + publicAccess: + enabled: false + address: redis-sentinel-test.example-cpln.com + firewall: + internal_inboundAllowType: "same-gvc" # Options: same-org / same-gvc(Recommended) + # external_inboundAllowCIDR: 0.0.0.0/0 # Provide a comma-separated list + # # You can specify additional workloads with either same-gvc or workload-list: + # inboundAllowWorkload: + # - //gvc/main-redis/workload/main-redis-sentinel + # - //gvc/client-gvc/workload/client + # external_outboundAllowCIDR: "0.0.0.0/0" # Provide a comma-separated list + env: [] + tags: {} + # requestRetryPolicy: + # attempts: 2 + # retryOn: + # - connect-failure + # - refused-stream + # - unavailable + # - cancelled + # - resource-exhausted + # - retriable-status-codes + requestRetryPolicy: {} + # Sentinel persistence preserves the post-failover master across restarts via + # CONFIG REWRITE. Replicas query sentinel at startup, so persisted state lets + # them rejoin the real master rather than redis-0. Recommended ON in prod. 
+ persistence: + enabled: false + volumes: + data: + initialCapacity: 10 # In GB + performanceClass: general-purpose-ssd # general-purpose-ssd / high-throughput-ssd (Min 1000GB) + fileSystemType: ext4 # ext4 / xfs + snapshots: + retentionDuration: 7d + schedule: 0 0 * * * # UTC + autoscaling: + maxCapacity: 50 # In GB + minFreePercentage: 20 + scalingFactor: 1.2 + # customEncryption: + # enabled: true + # region: aws-us-east-1 # Replace with the appropriate region + # keyId: arn:aws:kms:us-east-1:1234567890:key/d411f35a-1d31-4515-9934-4f193e042d80 # Replace with your AWS KMS key ARN + +backup: + enabled: false + image: controlplanecorporation/redis-backup:1.0 + schedule: "0 2 * * *" # daily at 2am UTC + + resources: + cpu: 100m + memory: 128Mi + + provider: aws # Options: aws or gcp + + aws: + bucket: my-backup-bucket + region: us-east-1 + cloudAccountName: my-backup-cloudaccount + policyName: my-backup-policy + prefix: redis/backups # folder name where your backups will be stored + + gcp: + bucket: my-backup-bucket + cloudAccountName: my-backup-cloudaccount + prefix: redis/backups # folder name where your backups will be stored
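Note on the sizing rules in the redis 3.3.0 `values.yaml` comments: the automatic quorum and the replication backlog estimate are both simple integer arithmetic. The following is a minimal sketch for sanity-checking values before overriding them. The helper names `quorum` and `backlog_mb` are hypothetical and not part of the chart. The quorum helper is assumed to mirror the template expression `add (div (int .Values.sentinel.replicas) 2) 1`, and the backlog helper applies the "peak write throughput × tolerable disconnect window" rule from the comment.

```shell
#!/bin/sh
# Hypothetical helpers for checking chart values; not part of the chart itself.

# Majority quorum for N sentinels: integer division by 2, plus 1.
# Assumed to match: add (div (int .Values.sentinel.replicas) 2) 1
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Rough replication backlog size in MB:
#   $1 = peak write throughput in MB/s
#   $2 = tolerable disconnect window in seconds
backlog_mb() {
  echo $(( $1 * $2 ))
}

echo "quorum for 3 sentinels: $(quorum 3)"
echo "backlog for 3 MB/s over 300 s: $(backlog_mb 3 300) MB"
```

With the chart defaults (3 sentinels, and the ~3MB/s over ~5 minutes example from the comment), this gives a quorum of 2 and roughly 900 MB of backlog, which is in line with the `backlogSize: 1gb` default.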