Amazon EKS Upgrade Utility (Karpenter-aware fork)

🌐 Language: English · 한국어

eksupgrade is a CLI utility that automates upgrades of Amazon EKS clusters: the control plane, managed add-ons, and worker nodes in managed node groups, self-managed ASGs, and Karpenter NodePools, with Cluster Autoscaler paused and resumed around node replacement.

About this fork. This repository began as a fork of aws-samples/eks-cluster-upgrade, which is no longer actively maintained. It is now an independent project. The headline change is Karpenter-native, drift-based node upgrades (instead of terminating instances), alongside restored Cluster Autoscaler support and safer node draining. See Differences from upstream.

Differences from upstream

Karpenter nodes upgrade via Drift, not termination. The Karpenter controller is left running; after the control plane is upgraded, alias-based EC2NodeClass selectors re-resolve to the AMI for the new Kubernetes version and Karpenter replaces nodes capacity-first, honoring PodDisruptionBudgets, disruption budgets, and karpenter.sh/do-not-disrupt. The old approach (pause controller → terminate EC2) caused a capacity gap, bypassed PDBs, and never actually updated the AMI.
Both autoscalers supported side by side. Cluster Autoscaler is detected (including Helm-named installs, via label selector) and paused/resumed; the Karpenter path is separate and never applies CA semantics.
Owner-aware node draining. Draining skips DaemonSet and static/mirror pods, and refuses to start if an unmanaged pod would be lost (unless --force), so a node is never left half-drained.
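
The owner-aware drain rule above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: the function names `classify_pod` and `plan_drain` are hypothetical, and pods are assumed to be plain dicts shaped like Kubernetes pod manifests. The idea is to decide the fate of every pod before evicting any, so a refusal never leaves a node half-drained.

```python
from typing import Iterable

# Annotation the kubelet sets on mirror pods that represent static pods.
MIRROR_ANNOTATION = "kubernetes.io/config.mirror"


def classify_pod(pod: dict) -> str:
    """Return 'skip', 'evict', or 'unmanaged' for a pod manifest dict."""
    meta = pod.get("metadata", {})
    if MIRROR_ANNOTATION in (meta.get("annotations") or {}):
        return "skip"  # static/mirror pod: owned by the kubelet, not evictable
    owners = meta.get("ownerReferences") or []
    if any(owner.get("kind") == "DaemonSet" for owner in owners):
        return "skip"  # DaemonSet pods would be recreated on the same node
    if not owners:
        return "unmanaged"  # no controller would recreate this pod
    return "evict"


def plan_drain(pods: Iterable[dict], force: bool = False) -> list[str]:
    """Return the pod names to remove, or raise before anything is touched."""
    to_evict, unmanaged = [], []
    for pod in pods:
        kind = classify_pod(pod)
        name = pod["metadata"]["name"]
        if kind == "evict":
            to_evict.append(name)
        elif kind == "unmanaged":
            unmanaged.append(name)
    if unmanaged and not force:
        raise RuntimeError(f"refusing to drain, unmanaged pods: {unmanaged}")
    return to_evict + unmanaged
```

Because the plan is built in full before any eviction, the refusal happens up front rather than partway through a drain.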

Cluster Upgrade

The upgrade runs in the following order to keep the cluster stable at each step:

┌─────────────────────────────────────────────────────────────────┐
│                     EKS Cluster Upgrade Flow                      │
└─────────────────────────────────────────────────────────────────┘

  [1] Control Plane Upgrade
      └─ AWS manages the upgrade; eksupgrade waits until ACTIVE.
         (Karpenter drift begins here for alias-based EC2NodeClasses.)

  [2] Add-on Upgrades  (vpc-cni → kube-proxy → coredns)
      └─ Versions resolved live from the EKS API per cluster version;
         vpc-cni is upgraded step-by-step per minor version.

  [3] Cluster Autoscaler → PAUSE   (Karpenter is left RUNNING)
      └─ CA deployment scaled to 0 so it can't fight node replacement.
         Karpenter must keep running for its drift to replace nodes.

  [4] Managed Node Group Upgrade
      └─ EKS performs a rolling AMI replacement (--parallel optional).

  [5] Self-managed Node Group Upgrade  (ASG-based)
      └─ For each Auto Scaling Group:
           a. Detect AMI type (AL2023 / AL2 / Bottlerocket / Windows / Ubuntu)
           b. Fetch the latest EKS-optimised AMI from SSM
           c. Roll each outdated instance: launch replacement → cordon →
              owner-aware drain (respects PDB unless --force) → terminate

  [6] Karpenter Node Upgrade — via DRIFT  (if Karpenter is detected)
      └─ Inspect each EC2NodeClass:
           • alias selector (e.g. al2023@latest / al2023@vX) → auto-drifts to
             the new Kubernetes version's AMI; the tool only observes + waits
           • id / name / tags selector → will NOT auto-drift; the tool warns
             that those nodes need a manual amiSelectorTerms update
         Then wait (bounded) until the drifting NodePools' nodes are on the
         target version. The controller is never paused or forced.

  [7] Cluster Autoscaler → RESUME
      └─ Restored to its original replica count (also on failure).
         Karpenter needs no resume — it was never paused.
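
Steps [3] and [7] amount to a pause that is always undone, including when a later step fails. A minimal sketch of that pattern, assuming hypothetical `get_replicas` and `set_replicas` callbacks that stand in for whatever Kubernetes API calls the tool actually makes:

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def paused_cluster_autoscaler(
    get_replicas: Callable[[], int],
    set_replicas: Callable[[int], None],
) -> Iterator[int]:
    """Scale the CA deployment to 0, then restore the original count on exit."""
    original = get_replicas()
    set_replicas(0)
    try:
        yield original
    finally:
        # Runs on success and on failure, so CA is never left at 0 replicas.
        set_replicas(original)
```

Node group upgrades would run inside the `with` block. Karpenter has no equivalent wrapper in this sketch because, as the flow states, it is never paused.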

Supported Node Types

Node Type	Managed Node Group	Self-managed (ASG)	Karpenter
Amazon Linux 2023 (AL2023)	✅	✅	✅ (drift)
Bottlerocket	✅	✅	✅ (drift)
Windows Server	✅	✅	✅ (drift)
Ubuntu	✅	✅	✅ (drift)
Amazon Linux 2 (AL2)	✅	✅	n/a¹

¹ AL2 EKS-optimised AMIs reached end of support; EKS 1.32 was the last version to ship them. Use AL2023 for 1.33+.

Supported Kubernetes Versions

Sequential single-minor upgrades only (e.g. 1.34 → 1.35). EKS does not allow skipping minor versions in a single update, so to go from 1.34 to 1.36, run the tool twice (1.34 → 1.35, then 1.35 → 1.36).
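
To plan a multi-minor upgrade as a series of runs, the hops can be computed up front. This is an illustrative helper only; `upgrade_path` is a hypothetical name and is not part of eksupgrade.

```python
def upgrade_path(current: str, target: str) -> list[tuple[str, str]]:
    """List the single-minor hops needed to get from current to target."""
    cur_major, cur_minor = (int(part) for part in current.split("."))
    tgt_major, tgt_minor = (int(part) for part in target.split("."))
    if cur_major != tgt_major or tgt_minor <= cur_minor:
        raise ValueError(f"cannot upgrade from {current} to {target}")
    # One (from, to) pair per minor version step.
    return [
        (f"{cur_major}.{minor}", f"{cur_major}.{minor + 1}")
        for minor in range(cur_minor, tgt_minor)
    ]
```

Each pair corresponds to one invocation of the tool, with the second element as the CLUSTER_VERSION argument.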

EKS standard support currently spans roughly 1.33–1.36 (and a few older versions under extended support). The tool resolves add-on versions live from the EKS API, so it tracks whatever EKS currently offers rather than a baked-in table.
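
As a rough illustration of resolving an add-on version from the EKS API rather than a table, the sketch below picks the default version for a cluster version out of a `DescribeAddonVersions` response. It assumes the response dict came from boto3's `describe_addon_versions`; the helper name `default_addon_version` is hypothetical, and the tool's own selection logic (for example with --latest-addons) may differ.

```python
from typing import Optional


def default_addon_version(response: dict, cluster_version: str) -> Optional[str]:
    """Return the add-on version EKS marks as default for cluster_version."""
    for addon in response.get("addons", []):
        for version in addon.get("addonVersions", []):
            for compat in version.get("compatibilities", []):
                if (
                    compat.get("clusterVersion") == cluster_version
                    and compat.get("defaultVersion")
                ):
                    return version["addonVersion"]
    return None  # no default advertised for this cluster version


# Example of obtaining the response (requires AWS credentials, not run here):
#   import boto3
#   eks = boto3.client("eks", region_name="ap-northeast-2")
#   response = eks.describe_addon_versions(
#       addonName="vpc-cni", kubernetesVersion="1.35"
#   )
```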

Pre-Requisites

You need permissions for both AWS and the Kubernetes cluster.

Install from source (this fork is not published to PyPI).

Recommended — install the eksupgrade command with pipx (isolated, and puts eksupgrade on your PATH):

git clone https://github.com/namejsjeongkr/eksupgrade.git
pipx install ./eksupgrade
eksupgrade --help

To upgrade later after pulling new changes: pipx install --force ./eksupgrade.

Alternative — for development, use Poetry inside the cloned repo. Commands then run as poetry run eksupgrade ...:

git clone https://github.com/namejsjeongkr/eksupgrade.git
cd eksupgrade
poetry install
poetry run eksupgrade --help

AWS permissions — an example minimum policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "iam",
      "Effect": "Allow",
      "Action": [
        "iam:GetRole",
        "sts:GetAccessKeyInfo",
        "sts:GetCallerIdentity",
        "sts:GetSessionToken"
      ],
      "Resource": "*"
    },
    {
      "Sid": "ec2",
      "Effect": "Allow",
      "Action": [
        "autoscaling:CreateLaunchConfiguration",
        "autoscaling:Describe*",
        "autoscaling:SetDesiredCapacity",
        "autoscaling:TerminateInstanceInAutoScalingGroup",
        "autoscaling:UpdateAutoScalingGroup",
        "ec2:Describe*",
        "ec2:TerminateInstances",
        "ssm:GetParameter"
      ],
      "Resource": "*"
    },
    {
      "Sid": "eks",
      "Effect": "Allow",
      "Action": [
        "eks:Describe*",
        "eks:List*",
        "eks:UpdateAddon",
        "eks:UpdateClusterVersion",
        "eks:UpdateNodegroupVersion"
      ],
      "Resource": "*"
    }
  ]
}

Update your local kubeconfig to authenticate to the cluster:

aws eks update-kubeconfig --name <CLUSTER-NAME> --region <REGION>

Usage

Examples below use the eksupgrade command (pipx install). If you installed with Poetry for development, prefix each command with poetry run.

eksupgrade --help

 Usage: eksupgrade [OPTIONS] CLUSTER_NAME CLUSTER_VERSION REGION

 Run eksupgrade against a target cluster.

 Arguments:
   cluster_name      The name of the cluster to be upgraded   [required]
   cluster_version   The target Kubernetes version            [required]
   region            The AWS region of the target cluster     [required]

 Options:
   --max-retry        INTEGER  Retries per upgrade            [default: 2]
   --force                     Force pod eviction (ignores PDB / unmanaged pods)
   --parallel                  Upgrade node groups in parallel
   --latest-addons             Use the latest eligible add-on versions
   --interactive               Prompt for confirmation        [default: on]
   --version                   Show the version and exit
   --help                      Show this message and exit

Example:

eksupgrade my-cluster 1.35 ap-northeast-2

Read-only preflight

Run a read-only assessment without changing anything:

eksupgrade <cluster> <target-version> <region> --preflight --no-interactive

It checks the control plane, add-ons, managed node groups, Karpenter, and PodDisruptionBudget coverage (warning on replicas≥2 workloads with no PDB), prints a summary report, and exits without performing any upgrade. Exit codes: 0 safe (warnings allowed), 1 blocking issues found, 2 the checks could not run.
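
In a CI pipeline the exit codes can gate the real upgrade. The wrapper below is a sketch under the assumption that `eksupgrade` is on PATH; the function names and outcome labels are made up for illustration.

```python
import subprocess

# Exit codes as documented for --preflight.
PREFLIGHT_OUTCOMES = {0: "safe", 1: "blocking", 2: "checks-failed"}


def interpret_preflight(returncode: int) -> str:
    """Map a preflight exit code to a short outcome label."""
    return PREFLIGHT_OUTCOMES.get(returncode, "unknown")


def run_preflight(cluster: str, version: str, region: str) -> str:
    """Run the read-only preflight and return its outcome label."""
    result = subprocess.run(
        ["eksupgrade", cluster, version, region, "--preflight", "--no-interactive"],
        check=False,  # non-zero exit codes are expected and handled below
    )
    return interpret_preflight(result.returncode)
```

A pipeline would typically proceed only on "safe", and treat "checks-failed" as an infrastructure problem rather than a cluster problem.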

Known limitations

Karpenter logic is covered by unit tests against mocked CRDs. The Karpenter v1 CRD coordinates and amiSelectorTerms classification are documentation-derived — verify against a real EC2NodeClass before relying on it in production.
Pinned Karpenter selectors (id / name / tags) are warned about, not rewritten — update their amiSelectorTerms manually to upgrade those nodes.
--force removes pods without honoring PodDisruptionBudgets and proceeds even when unmanaged pods would be lost; this is inherent to the flag.
The self-managed ASG path still uses the deprecated CreateLaunchConfiguration API (migration to Launch Templates is planned).
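
To check which EC2NodeClasses fall under the pinned-selector limitation above, the amiSelectorTerms classification can be approximated as follows. This is a sketch based on the Karpenter v1 EC2NodeClass field names; `classify_node_class` is a hypothetical name and the tool's own logic may differ.

```python
def classify_node_class(node_class: dict) -> str:
    """Classify an EC2NodeClass dict as 'auto-drift', 'pinned', or 'unknown'."""
    terms = node_class.get("spec", {}).get("amiSelectorTerms") or []
    if not terms:
        return "unknown"  # nothing to classify
    if any("alias" in term for term in terms):
        # Alias selectors re-resolve to the AMI for the new Kubernetes version.
        return "auto-drift"
    # id / name / tags selectors keep pointing at the same AMIs.
    return "pinned"
```

The input could be the parsed output of `kubectl get ec2nodeclass <name> -o json`, which is also a convenient way to verify the classification against a real cluster.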

License

Licensed under the MIT-0 License (inherited from the upstream project).

This project is a community fork of aws-samples/eks-cluster-upgrade and is not an AWS service. Support is best-effort via the Issues section.
