Merged
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -9,6 +9,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

- Optional Helm [Config Connector](https://docs.cloud.google.com/config-connector/docs/overview) resources for GKE
- Determine cloud provider from the PV CSI driver name
- Optional Helm [Azure Service Operator](https://azure.github.io/azure-service-operator) (ASO) resources for AKS
- Azure Disk labelling and a testing guide for AKS

## [0.2.1] - 2026-02-26

1 change: 1 addition & 0 deletions Cargo.toml
@@ -23,6 +23,7 @@ serde_json = "1.0.149"
reqwest = { version = "0.13", default-features = false, features = [
"json",
"query",
"form",
"rustls-no-provider",
"http2",
] }
18 changes: 18 additions & 0 deletions README.md
@@ -79,6 +79,8 @@ helm template k8s-cloud-tagger helm/k8s-cloud-tagger/ --set serviceMonitor.enabl

## Label sanitisation

### GCP

Kubernetes label keys and values can contain characters that are not valid in GCP labels.
GCP labels only allow lowercase letters, digits, hyphens, and underscores (`[a-z0-9_-]`),
with keys limited to 63 characters and required to start with a lowercase letter.
@@ -96,6 +98,22 @@ For more detail on GCP label requirements, see the [Google Cloud labeling best p
| `upgrades.dev/managed-by: k8s-cloud-tagger` | `upgrades-dev-managed-by: k8s-cloud-tagger` |
| `Team: Platform` | `team: platform` |

### Azure

Azure resource tag keys may contain any Unicode character except `<`, `>`, `%`, `&`, `\`, `?`, and `/`.
Keys are limited to 512 characters and values to 256 characters.
k8s-cloud-tagger replaces each disallowed character in a key with a hyphen, and truncates keys and values to their respective limits.
Unlike GCP labels, Azure tags are not lowercased: tag names are case-insensitive in Azure but keep the case they were supplied with, and tag values are case-sensitive.
For more detail on Azure tag requirements, see the [Azure tag limitations](https://learn.microsoft.com/en-us/azure/azure-resource-manager/management/tag-resources).

| Kubernetes label | Azure tag |
| --- | --- |
| `app.kubernetes.io/name: frontend` | `app.kubernetes.io-name: frontend` |
| `helm.sh/chart: myapp-1.2.0` | `helm.sh-chart: myapp-1.2.0` |
| `env: production` | `env: production` |
| `upgrades.dev/managed-by: k8s-cloud-tagger` | `upgrades.dev-managed-by: k8s-cloud-tagger` |
| `Team: Platform` | `Team: Platform` |
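
The Azure rules above are small enough to sketch directly. The following Rust snippet is an illustration only, not the project's actual implementation: the function and constant names are hypothetical, and it assumes the 512 and 256 limits are counted in Unicode characters rather than bytes.

```rust
/// Characters the README lists as disallowed in Azure tag keys.
const DISALLOWED_IN_KEY: [char; 7] = ['<', '>', '%', '&', '\\', '?', '/'];
/// Assumption: limits are counted in Unicode characters, not bytes.
const MAX_KEY_CHARS: usize = 512;
const MAX_VALUE_CHARS: usize = 256;

/// Replace each disallowed character with a hyphen, then truncate the key.
/// Case is left untouched.
fn sanitise_azure_key(key: &str) -> String {
    key.chars()
        .map(|c| if DISALLOWED_IN_KEY.contains(&c) { '-' } else { c })
        .take(MAX_KEY_CHARS)
        .collect()
}

/// Values are only truncated; their characters and case are preserved.
fn sanitise_azure_value(value: &str) -> String {
    value.chars().take(MAX_VALUE_CHARS).collect()
}
```

With these helpers, `sanitise_azure_key("helm.sh/chart")` yields `helm.sh-chart`, matching the second row of the table, while keys such as `Team` pass through unchanged.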

## Release

1. Check out a new branch
211 changes: 211 additions & 0 deletions docs/azure.md
@@ -0,0 +1,211 @@
# Azure (AKS)

This document covers deploying `k8s-cloud-tagger` on AKS using
[Workload Identity](https://learn.microsoft.com/en-us/azure/aks/workload-identity-overview) for
authentication and [Azure Service Operator](https://azure.github.io/azure-service-operator/) (ASO)
to manage the required Azure resources.

## How it works

The chart sets the `azure.workload.identity/use: "true"` label on the pod template and the
`azure.workload.identity/client-id` annotation on the ServiceAccount. At pod creation time the AKS
Workload Identity webhook injects `AZURE_CLIENT_ID`, `AZURE_TENANT_ID`, `AZURE_AUTHORITY_HOST`, and
`AZURE_FEDERATED_TOKEN_FILE` into the pod. The controller uses these to obtain an ARM bearer token
and call the Tags API.
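
To make the token step concrete, here is a minimal Rust sketch of how the four injected variables could be turned into an OAuth 2.0 client-credentials request that presents the federated token as a client assertion. It only builds the URL and form fields and sends nothing. The struct and function names are hypothetical, and the ARM scope `https://management.azure.com/.default` is an assumption about what the controller requests.

```rust
use std::env;
use std::fs;

/// URL and form fields for the token exchange (hypothetical type for this sketch).
struct TokenRequest {
    url: String,
    form: Vec<(&'static str, String)>,
}

/// Build the request from already-loaded values.
fn build_token_request(
    authority_host: &str,
    tenant_id: &str,
    client_id: &str,
    federated_token: &str,
) -> TokenRequest {
    // The authority host usually ends in '/', so normalise before joining.
    let url = format!(
        "{}/{}/oauth2/v2.0/token",
        authority_host.trim_end_matches('/'),
        tenant_id
    );
    let form = vec![
        ("grant_type", "client_credentials".to_string()),
        ("client_id", client_id.to_string()),
        (
            "client_assertion_type",
            "urn:ietf:params:oauth:client-assertion-type:jwt-bearer".to_string(),
        ),
        // The projected ServiceAccount token, stripped of any trailing newline.
        ("client_assertion", federated_token.trim().to_string()),
        // Assumption: the controller asks for the ARM default scope.
        ("scope", "https://management.azure.com/.default".to_string()),
    ];
    TokenRequest { url, form }
}

/// Read the variables injected by the Workload Identity webhook.
fn token_request_from_env() -> Result<TokenRequest, Box<dyn std::error::Error>> {
    let authority_host = env::var("AZURE_AUTHORITY_HOST")?;
    let tenant_id = env::var("AZURE_TENANT_ID")?;
    let client_id = env::var("AZURE_CLIENT_ID")?;
    let token = fs::read_to_string(env::var("AZURE_FEDERATED_TOKEN_FILE")?)?;
    Ok(build_token_request(&authority_host, &tenant_id, &client_id, &token))
}
```

The form fields would be posted as `application/x-www-form-urlencoded`, which is plausibly why this change enables the `form` feature of `reqwest`. The `access_token` in the response is then sent as the bearer token on ARM calls.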

When `azure.serviceOperator.enabled=true`, ASO creates and manages:
- A `ResourceGroup` (the existing resource group, which owns the identity)
- A `UserAssignedIdentity` (the managed identity)
- A `FederatedIdentityCredential` (the OIDC trust binding between the identity and the ServiceAccount)
- A `RoleAssignment` granting [Tag Contributor](https://learn.microsoft.com/en-us/azure/role-based-access-control/built-in-roles/management-and-governance#tag-contributor) at subscription scope
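
The `FederatedIdentityCredential` listed above only takes effect when its issuer, subject, and audience match the claims in the pod's projected ServiceAccount token. The Rust sketch below shows how those values line up, using the subject format and audience from the chart's `FederatedIdentityCredential` template. The helper names are hypothetical, and the exact, case-sensitive comparison is a simplification of the check Microsoft Entra performs.

```rust
/// Audience set on the FederatedIdentityCredential in the chart template.
const TOKEN_EXCHANGE_AUDIENCE: &str = "api://AzureADTokenExchange";

/// Subject claim of a Kubernetes ServiceAccount token, as used in the template.
fn federated_subject(namespace: &str, service_account: &str) -> String {
    format!("system:serviceaccount:{}:{}", namespace, service_account)
}

/// Hypothetical check: would a credential with this issuer and subject accept
/// a token carrying these claims? All comparisons are exact and case-sensitive.
fn credential_accepts(
    cred_issuer: &str,
    cred_subject: &str,
    token_issuer: &str,
    token_subject: &str,
    token_audience: &str,
) -> bool {
    cred_issuer == token_issuer
        && cred_subject == token_subject
        && token_audience == TOKEN_EXCHANGE_AUDIENCE
}
```

A practical consequence is that installing the chart into a different namespace, or under a release name that changes the ServiceAccount name, changes the subject and requires the credential to be reconciled before token exchange works.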

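All ASO resources carry the `serviceoperator.azure.com/reconcile-policy: detach-on-delete` annotation, so `helm uninstall` removes the Kubernetes objects but leaves the corresponding resources in Azure.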

> **Note:** The managed identity must be created before `helm install` because its `clientId`
> must be known at install time to annotate the ServiceAccount. ASO then adopts the existing
> identity and manages it going forward.

## 1. Set environment variables

```bash
export RESOURCE_GROUP=
export LOCATION=
export CLUSTER_NAME=
export SUBSCRIPTION_ID=
export TAG= # image tag, e.g. sha-63d1b9b
```

## 2. Create the AKS cluster

```bash
az group create \
--name $RESOURCE_GROUP \
--location $LOCATION

az aks create \
--resource-group $RESOURCE_GROUP \
--name $CLUSTER_NAME \
--location $LOCATION \
--node-count 1 \
--node-vm-size Standard_D2s_v3 \
--enable-oidc-issuer \
--enable-workload-identity \
--generate-ssh-keys

az aks get-credentials \
--resource-group $RESOURCE_GROUP \
--name $CLUSTER_NAME
```

## 3. Install cert-manager

Required by ASO.

```bash
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/latest/download/cert-manager.yaml

kubectl wait --namespace cert-manager \
--for=condition=Ready pod --all --timeout=300s
```

## 4. Install Azure Service Operator

Scoped to only the CRD groups this project needs.

```bash
helm repo add aso2 https://raw.githubusercontent.com/Azure/azure-service-operator/main/v2/charts

helm upgrade --install aso2 aso2/azure-service-operator \
--create-namespace \
--namespace azureserviceoperator-system \
--set crdPattern='resources.azure.com/*;managedidentity.azure.com/*;authorization.azure.com/*'

kubectl wait --namespace azureserviceoperator-system \
--for=condition=Ready pod --all --timeout=300s
```

## 5. Create a service principal for ASO

ASO uses this credential to manage Azure resources on your behalf.

> **Note:** `Owner` is used here for convenience. Fine-grained permissions should be configured
> before production use.

```bash
ASO_SP=$(az ad sp create-for-rbac \
--name "aso-${CLUSTER_NAME}" \
--role Owner \
--scopes "/subscriptions/${SUBSCRIPTION_ID}" \
--output json)

# Quote the JSON so the shell does not word-split or glob it
export ASO_CLIENT_ID=$(echo "$ASO_SP" | jq -r .appId)
export ASO_CLIENT_SECRET=$(echo "$ASO_SP" | jq -r .password)
export TENANT_ID=$(echo "$ASO_SP" | jq -r .tenant)
```

## 6. Configure ASO credentials

This guide uses a namespace-scoped ASO credential: a secret named `aso-credential`, created in the same namespace as the chart resources.

```bash
kubectl create namespace k8s-cloud-tagger

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Secret
metadata:
  name: aso-credential
  namespace: k8s-cloud-tagger
stringData:
  AZURE_SUBSCRIPTION_ID: "${SUBSCRIPTION_ID}"
  AZURE_TENANT_ID: "${TENANT_ID}"
  AZURE_CLIENT_ID: "${ASO_CLIENT_ID}"
  AZURE_CLIENT_SECRET: "${ASO_CLIENT_SECRET}"
EOF
```

## 7. Create the managed identity

The `clientId` must be known before `helm install` to annotate the ServiceAccount.

```bash
az identity create \
--resource-group $RESOURCE_GROUP \
--name k8s-cloud-tagger \
--location $LOCATION

export CLIENT_ID=$(az identity show \
--resource-group $RESOURCE_GROUP \
--name k8s-cloud-tagger \
--query clientId \
--output tsv)

export OIDC_ISSUER_URL=$(az aks show \
--resource-group $RESOURCE_GROUP \
--name $CLUSTER_NAME \
--query oidcIssuerProfile.issuerUrl \
--output tsv)
```

## 8. Install the Helm chart

AKS nodes have unrestricted outbound internet access by default, so images are pulled directly
from `quay.io` without any private registry setup.

```bash
helm install k8s-cloud-tagger helm/k8s-cloud-tagger \
--namespace k8s-cloud-tagger \
--create-namespace \
--set cloudProvider=azure \
--set azure.clientId="$CLIENT_ID" \
--set azure.serviceOperator.enabled=true \
--set azure.serviceOperator.resourceGroup="$RESOURCE_GROUP" \
--set azure.serviceOperator.location="$LOCATION" \
--set azure.serviceOperator.subscriptionId="$SUBSCRIPTION_ID" \
--set azure.serviceOperator.oidcIssuerUrl="$OIDC_ISSUER_URL" \
--set deployment.env.RUST_BACKTRACE=1 \
--set deployment.env.RUST_LOG="debug" \
--set image.repository="quay.io/upgrades/k8s-cloud-tagger-dev" \
--set image.tag="${TAG}"
```

ASO will reconcile the `FederatedIdentityCredential` and `RoleAssignment` automatically. The
controller pod will start tagging once the role assignment propagates (typically within a minute).

## 9. Verify

Check ASO resources are ready:

```bash
kubectl get userassignedidentity,federatedidentitycredential,roleassignment \
-n k8s-cloud-tagger
```

Check the controller is tagging:

```bash
kubectl logs -n k8s-cloud-tagger \
-l app.kubernetes.io/name=k8s-cloud-tagger \
--tail=20
```

Look for `Azure: tags merged` in the output.
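
The `Azure: tags merged` message suggests the controller merges tags into the existing set rather than replacing it. As a rough illustration only, the Rust sketch below builds the URL and JSON body for an ARM Tags API "update at scope" call with a `Merge` operation. The function names, the `2021-04-01` API version, and the hand-rolled JSON encoding are assumptions made to keep the example self-contained; the controller itself presumably relies on `reqwest` and `serde_json`.

```rust
/// URL of the tags resource for a given ARM resource ID (assumed API version).
fn tags_url(resource_id: &str) -> String {
    format!(
        "https://management.azure.com/{}/providers/Microsoft.Resources/tags/default?api-version=2021-04-01",
        resource_id.trim_matches('/')
    )
}

/// Minimal JSON string escaping for the sketch; real code would use serde_json.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Body of a PATCH request that merges `tags` into the resource's existing tags.
fn merge_body(tags: &[(&str, &str)]) -> String {
    let entries: Vec<String> = tags
        .iter()
        .map(|(k, v)| format!("\"{}\":\"{}\"", json_escape(k), json_escape(v)))
        .collect();
    format!(
        r#"{{"operation":"Merge","properties":{{"tags":{{{}}}}}}}"#,
        entries.join(",")
    )
}
```

A `Merge` operation leaves tags that the controller does not know about untouched, which fits a role limited to Tag Contributor.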

## Cluster management

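Scale down to zero (stop paying for compute). AKS requires at least one node in a system node pool, so on a cluster whose only pool is the system pool this command may be rejected; `az aks stop` (and later `az aks start`) is the alternative in that case: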

```bash
az aks scale \
--resource-group $RESOURCE_GROUP \
--name $CLUSTER_NAME \
--node-count 0
```

Scale back up:

```bash
az aks scale \
--resource-group $RESOURCE_GROUP \
--name $CLUSTER_NAME \
--node-count 1
```
3 changes: 3 additions & 0 deletions helm/k8s-cloud-tagger/templates/deployment.yaml
@@ -18,6 +18,9 @@ spec:
      labels:
        {{- include "k8s-cloud-tagger.selectorLabels" . | nindent 8 }}
        app.kubernetes.io/component: controller
        {{- if eq .Values.cloudProvider "azure" }}
        azure.workload.identity/use: "true"
        {{- end }}
        {{- with .Values.podLabels }}
        {{- toYaml . | nindent 8 }}
        {{- end }}
8 changes: 8 additions & 0 deletions helm/k8s-cloud-tagger/templates/serviceaccount.yaml
@@ -6,15 +6,23 @@ metadata:
  labels:
    {{- include "k8s-cloud-tagger.labels" . | nindent 4 }}
    app.kubernetes.io/component: controller
    {{- if eq .Values.cloudProvider "azure" }}
    azure.workload.identity/use: "true"
    {{- end }}
    {{- with .Values.serviceAccount.labels }}
    {{- toYaml . | nindent 4 }}
    {{- end }}
  {{- if or .Values.serviceAccount.annotations (eq .Values.cloudProvider "gcp") (eq .Values.cloudProvider "azure") }}
  annotations:
    {{- with .Values.serviceAccount.annotations }}
    {{- toYaml . | nindent 4 }}
    {{- end }}
    {{- if eq .Values.cloudProvider "gcp" }}
    iam.gke.io/gcp-service-account: {{ .Values.gcp.serviceAccount }}@{{ .Values.gcp.projectId }}.iam.gserviceaccount.com
    {{- end }}
    {{- if eq .Values.cloudProvider "azure" }}
    azure.workload.identity/client-id: {{ required "azure.clientId is required when cloudProvider=azure" .Values.azure.clientId }}
    {{- end }}
  {{- end }}
automountServiceAccountToken: {{ .Values.serviceAccount.automountServiceAccountToken }}
{{- end }}
@@ -0,0 +1,15 @@
{{- if and (eq .Values.cloudProvider "azure") (.Values.azure.serviceOperator.enabled) }}
apiVersion: managedidentity.azure.com/v1api20230131
kind: FederatedIdentityCredential
metadata:
  name: {{ include "k8s-cloud-tagger.fullname" . }}-wi-binding
  annotations:
    serviceoperator.azure.com/reconcile-policy: detach-on-delete
spec:
  owner:
    name: {{ include "k8s-cloud-tagger.fullname" . }}
  audiences:
    - api://AzureADTokenExchange
  issuer: {{ required "azure.serviceOperator.oidcIssuerUrl is required" .Values.azure.serviceOperator.oidcIssuerUrl }}
  subject: system:serviceaccount:{{ .Release.Namespace }}:{{ include "k8s-cloud-tagger.fullname" . }}
{{- end }}
@@ -0,0 +1,20 @@
{{- if and (eq .Values.cloudProvider "azure") (.Values.azure.serviceOperator.enabled) }}
apiVersion: managedidentity.azure.com/v1api20230131
kind: UserAssignedIdentity
metadata:
  name: {{ include "k8s-cloud-tagger.fullname" . }}
  annotations:
    serviceoperator.azure.com/reconcile-policy: detach-on-delete
spec:
  location: {{ required "azure.serviceOperator.location is required" .Values.azure.serviceOperator.location }}
  owner:
    name: {{ required "azure.serviceOperator.resourceGroup is required" .Values.azure.serviceOperator.resourceGroup }}
  operatorSpec:
    configMaps:
      principalId:
        name: {{ .Values.azure.identityName }}
        key: principalId
      clientId:
        name: {{ .Values.azure.identityName }}
        key: clientId
{{- end }}
@@ -0,0 +1,10 @@
{{- if and (eq .Values.cloudProvider "azure") (.Values.azure.serviceOperator.enabled) }}
apiVersion: resources.azure.com/v1api20200601
kind: ResourceGroup
metadata:
  name: {{ required "azure.serviceOperator.resourceGroup is required" .Values.azure.serviceOperator.resourceGroup }}
  annotations:
    serviceoperator.azure.com/reconcile-policy: detach-on-delete
spec:
  location: {{ required "azure.serviceOperator.location is required" .Values.azure.serviceOperator.location }}
{{- end }}
@@ -0,0 +1,20 @@
{{- if and (eq .Values.cloudProvider "azure") (.Values.azure.serviceOperator.enabled) }}
apiVersion: authorization.azure.com/v1api20220401
kind: RoleAssignment
metadata:
  name: {{ include "k8s-cloud-tagger.fullname" . }}-disk-tagger
  annotations:
    serviceoperator.azure.com/reconcile-policy: detach-on-delete
spec:
  owner:
    armId: /subscriptions/{{ required "azure.serviceOperator.subscriptionId is required" .Values.azure.serviceOperator.subscriptionId }}
  # principalId is read from the UserAssignedIdentity status via a ConfigMap written by ASO
  principalIdFromConfig:
    name: {{ .Values.azure.identityName }}
    key: principalId
  principalType: ServicePrincipal
  # Tag Contributor built-in role: lets you manage tags on any resource without
  # granting access to the resources themselves.
  roleDefinitionReference:
    armId: /providers/Microsoft.Authorization/roleDefinitions/4a9ae827-6dc8-4573-8ac7-8239d42aa03f
{{- end }}