Nap provisioning best practice blog by wdarko1 · Pull Request #5665 · Azure/AKS

wdarko1 · 2026-03-18T16:10:47Z

Adds a best-practices blog post on managing node selection and availability.

Add a new blog post on controlling node provisioning outcomes in AKS, covering PDBs, affinity, and topology spread constraints.

Updated language for clarity and precision in Kubernetes provisioning guidance. Enhanced explanations of key concepts and best practices for AKS Node Auto-Provisioning.

Copilot

Pull request overview

Adds a new AKS blog post describing best practices for influencing node provisioning outcomes (PDBs, affinity/anti-affinity, and topology spread constraints) and how Node Auto-Provisioning (NAP) interprets those signals.

Changes:

Adds a new blog post markdown file for NAP/node provisioning best practices.
Includes examples and guidance for topology spread constraints, affinity, and Pod Disruption Budgets.
Describes NAP behavior for node selection, disruption, and topology spread.

website/blog/2026-03-20-node-provisioning-best-practice/index.md

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Adds a new AKS blog post describing best practices for controlling node provisioning outcomes (PDBs, affinity/anti-affinity, topology spread constraints) and how AKS Node Auto-Provisioning (NAP) interprets those signals.

Changes:

Introduces a new best-practices post covering scheduling intent primitives (topology spread, affinity, PDBs).
Adds guidance on how NAP uses these constraints for node selection, scaling, and disruption/consolidation behaviors.

website/blog/2026-03-20-node-provisioning-best-practice/index.md

Updated article to refine title and description, adjust publication date, and enhance clarity on Node Auto-Provisioning (NAP) concepts and best practices.

Expanded on the benefits of Node Auto-Provisioning for compute efficiency and added a section on next steps for users to get started.

Copilot

Pull request overview

Adds a new AKS blog post describing best practices for shaping Node Auto-Provisioning (NAP) outcomes via PDBs, affinity/anti-affinity, and topology spread constraints.

Changes:

Adds a long-form best-practices article covering scheduling intent vs. node policy boundaries for AKS NAP.
Provides example manifests and operational guidance for zonal spreading, node affinity, and disruption controls.

website/blog/2026-03-20-node-provisioning-best-practice/index.md

…pology-spread-image.png

Copilot

Pull request overview

Copilot reviewed 1 out of 2 changed files in this pull request and generated 5 comments.

website/blog/2026-03-20-node-provisioning-best-practice/index.md

Added a new tag for 'Scheduler' with relevant details.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Copilot reviewed 2 out of 3 changed files in this pull request and generated 2 comments.

website/blog/tags.yml

website/blog/2026-03-20-node-provisioning-best-practice/index.md

Updated text for clarity and consistency throughout the document, including image references and examples.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Copilot reviewed 2 out of 3 changed files in this pull request and generated 2 comments.

website/blog/2026-03-20-node-provisioning-best-practice/index.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Copilot reviewed 2 out of 3 changed files in this pull request and generated 4 comments.

website/blog/2026-03-20-node-provisioning-best-practice/index.md

Copilot · 2026-03-25T20:24:45Z

website/blog/2026-03-20-node-provisioning-best-practice/index.md

+```yaml
+affinity:
+  nodeAffinity:
+    preferredDuringSchedulingIgnoredDuringExecution:
+    - weight: 100
+      preference:
+        matchExpressions:
+        - key: node.kubernetes.io/instance-type
+          operator: In
+          values: ["Standard_D16ds_v5"]
+```


The nodeAffinity.preferredDuringSchedulingIgnoredDuringExecution YAML example isn’t valid as written: the list item (- weight: 100) should be indented under preferredDuringSchedulingIgnoredDuringExecution, and the matchExpressions list should also be properly indented. Please fix indentation so readers can copy/paste it.
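For reference, here is a sketch of the same preference written with the nested list indentation that most Kubernetes examples use, placed inside a minimal Pod so the full field path is visible. The Pod name and image are placeholders, not taken from the post. Both layouts should parse the same way, since YAML allows a block sequence at its parent key's indentation; the nested form is mainly easier to read and copy.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: affinity-demo                        # hypothetical name
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100                        # each list item needs weight + preference
          preference:
            matchExpressions:
              - key: node.kubernetes.io/instance-type
                operator: In
                values: ["Standard_D16ds_v5"]
  containers:
    - name: app
      image: registry.example.com/app:1.0    # placeholder image
```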

website/blog/2026-03-20-node-provisioning-best-practice/index.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Copilot reviewed 2 out of 3 changed files in this pull request and generated 1 comment.

Copilot · 2026-03-25T20:31:41Z

website/blog/2026-03-20-node-provisioning-best-practice/index.md

+---
+title: "Controlling Node Provisioning Outcomes on AKS: PDBs, Affinity, and Topology Spread"
+description: "Learn AKS best practices for Node Auto-Provisioning, using PDBs, affinity, and topology spread constraints to achieve predictable, resilient pod scheduling."
+date: 2026-03-30


The front matter date: 2026-03-30 doesn't match the post folder prefix 2026-03-20-..., but the repo authoring guide requires them to match (controls chronology). Please update either the folder name or the front matter date so they align. Also note the date is in the future relative to today (2026-03-25); if the intent is to prevent early publishing, consider using draft: true/unlisted: true or keeping the post off the main branch until the publish date (depending on site config).

Suggested change

date: 2026-03-30

date: 2026-03-20
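A sketch of what the aligned front matter could look like if the author also wants to hold the post back until publication. It assumes the site honors Docusaurus' `draft` front matter flag, which the comment above raises only as a possibility depending on site config.

```yaml
---
title: "Controlling Node Provisioning Outcomes on AKS: PDBs, Affinity, and Topology Spread"
date: 2026-03-20   # matches the 2026-03-20-... folder prefix
draft: true        # assumption: excluded from production builds until removed
---
```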

kaarthis

Left a few comments, but one critical recommendation: a PDB is about pod criticality with respect to disruption/drain, not scheduling intent. It doesn't belong with nodeSelector, taints, and topology. Recommendation: split the blog. Keep Parts 1-3 (Topology, Affinity, Scheduling Intent) as the scheduling blog. Move Part 4 (PDB + NAP disruption) to a separate "Controlling Disruption with NAP: PDBs, Consolidation Policies, and Node Disruption Budgets" blog. Cross-link between them.

kaarthis · 2026-03-25T21:31:55Z

website/blog/2026-03-20-node-provisioning-best-practice/index.md

+- How do I express node preferences without accidentally blocking scheduling?
+- If I’m using Node Auto-Provisioning (NAP), how does it interpret the rules I set?
+
+This post will connect NAP with the three most important workload-level tools for shaping predictable node provisioning outcomes on AKS:


What data backs "three most important"? Support cases? Community signal? Or judgment call?

kaarthis · 2026-03-25T21:32:38Z

website/blog/2026-03-20-node-provisioning-best-practice/index.md

+- If I’m using Node Auto-Provisioning (NAP), how does it interpret the rules I set?
+
+This post will connect NAP with the three most important workload-level tools for shaping predictable node provisioning outcomes on AKS:
+


Is this blog about scheduling or lifecycle? Title = provisioning, Part 4 = eviction. Split?

Updated the blog post to focus on scheduling, and split out the disruption topic into a separate blog post

kaarthis · 2026-03-25T21:32:55Z

website/blog/2026-03-20-node-provisioning-best-practice/index.md

+          matchExpressions:
+            - key: node.kubernetes.io/instance-type
+              operator: In
+              values: ["Standard_D16ds_v5"]


weight: 100 is unexplained — what's the scale, how does it interact with multiple preferences?
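To make the question concrete: in upstream Kubernetes, `weight` is an integer from 1 to 100, and for each candidate node the scheduler adds up the weights of all preferred terms that node matches and folds that sum into the node's score. A minimal sketch with two preferences follows; the zone value is a hypothetical label value, not one from the post.

```yaml
affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 80                      # stronger preference
        preference:
          matchExpressions:
            - key: node.kubernetes.io/instance-type
              operator: In
              values: ["Standard_D16ds_v5"]
      - weight: 20                      # weaker preference
        preference:
          matchExpressions:
            - key: topology.kubernetes.io/zone
              operator: In
              values: ["eastus-1"]      # hypothetical zone label value
```

Under these weights, a node that matches both terms contributes 100 to the affinity score, a node matching only the instance type contributes 80, and a node matching only the zone contributes 20. Neither term can block scheduling.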

kaarthis · 2026-03-25T21:33:50Z

website/blog/2026-03-20-node-provisioning-best-practice/index.md

+
+**Practical guidance:**
+
+- For critical workloads that you do not want to be disrupted at all, strictness of "zero eviction" may be intentional — but be deliberate. When you're ready to allow disruption to these workloads, you may have to change the PDBs in the workload deployment file.


`maxUnavailable: 0` blocks security patching. Reframe it as an anti-pattern. We should never advocate zero eviction anywhere in the doc.
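As an illustration of the alternative, the following sketch shows a PDB that still protects availability but lets node drains proceed one pod at a time. The PDB name is hypothetical; the `app: web` label matches the label used in the post's Deployment example.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb            # hypothetical name
spec:
  maxUnavailable: 1        # allows one voluntary eviction at a time, so drains can progress
  selector:
    matchLabels:
      app: web
```

With `maxUnavailable: 0`, every voluntary eviction of a matching pod is refused, which is what stalls node upgrades and patching.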

kaarthis · 2026-03-25T21:34:13Z

website/blog/2026-03-20-node-provisioning-best-practice/index.md

+- How do I express node preferences without accidentally blocking scheduling?
+- If I’m using Node Auto-Provisioning (NAP), how does it interpret the rules I set?
+
+This post will connect NAP with the three most important workload-level tools for shaping predictable node provisioning outcomes on AKS:


Why are taints/tolerations not in the top 3 for NAP? They're arguably more scheduling-relevant than PDBs.

kaarthis · 2026-03-25T21:36:09Z

website/blog/2026-03-20-node-provisioning-best-practice/index.md

+
+Kubernetes describes minAvailable / maxUnavailable as the two key availability knobs, and notes you can only specify one per PDB.
+
+### How NAP handles disruption


How does NAP's consolidation engine interact with PDBs specifically? One line isn't enough.
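One way the post could answer this is by showing the node-side disruption controls next to the PDB. The sketch below assumes NAP exposes the upstream Karpenter v1 NodePool `disruption` fields; the values are illustrative, not recommendations from the post.

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 5m        # illustrative wait before a node becomes a consolidation candidate
    budgets:
      - nodes: "10%"            # illustrative cap on nodes disrupted at the same time
  # template (nodeClassRef, requirements) omitted for brevity
```

In upstream Karpenter, the NodePool budget limits how many nodes are disrupted at once, while PDBs are respected during the drain itself because pods are removed through the Eviction API. The post would need to confirm that NAP behaves the same way.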

kaarthis · 2026-03-25T21:36:52Z

website/blog/2026-03-20-node-provisioning-best-practice/index.md

+      labels:
+        app: web
+    spec:
+      topologySpreadConstraints:


What happens when topology spread and affinity conflict? (I don't see this combination in the doc.)

Added a matrix describing this behavior with recommendations for users
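For readers of this thread, here is a sketch of the kind of contradiction being asked about: the spread constraint demands at least two zones while the required node affinity allows only one. The zone value is a hypothetical label value; the `app: web` label comes from the post's example.

```yaml
# Pod template fragment (for example, under a Deployment's spec.template)
spec:
  topologySpreadConstraints:
    - maxSkew: 1
      minDomains: 2                           # spread expects at least two zones
      topologyKey: topology.kubernetes.io/zone
      whenUnsatisfiable: DoNotSchedule        # hard constraint
      labelSelector:
        matchLabels:
          app: web
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:   # hard constraint
        nodeSelectorTerms:
          - matchExpressions:
              - key: topology.kubernetes.io/zone
                operator: In
                values: ["eastus-1"]          # hypothetical: only one zone allowed
```

Because both rules are hard and cannot be satisfied together, some replicas are likely to stay Pending until one of the constraints is relaxed or the zones are aligned.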

kaarthis · 2026-03-25T21:37:48Z

website/blog/2026-03-20-node-provisioning-best-practice/index.md

+What these fields mean (in plain language):
+
+- topologyKey: topology.kubernetes.io/zone → spread across zones (not just nodes).
+- maxSkew: 1 → keep zone counts close (difference between most/least loaded domains can’t exceed 1 when DoNotSchedule).


Link "A good default" section to NAP NodePool zone configuration.
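A sketch of the NodePool side of that link: a zone spread constraint can only balance across zones that the NodePool is allowed to provision into. The zone values below are hypothetical label values and would need to match the cluster's region.

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      requirements:
        - key: topology.kubernetes.io/zone
          operator: In
          values: ["eastus-1", "eastus-2", "eastus-3"]   # hypothetical zone label values
      # nodeClassRef and other fields omitted for brevity
```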

kaarthis · 2026-03-25T21:38:19Z

website/blog/2026-03-20-node-provisioning-best-practice/index.md

+
+## Background
+
+AKS users want to ensure their workloads schedule, scale, and are disrupted only when (or where) desired. The problem here is Kubernetes can feel complex, and its easy to be unclear what settings to use to accomplish this. Node Auto-Provisioning allows amazing benefits for compute efficiency, but to best utilize it - users need to make sure certain best practices are followed for predictable behavior.


Replace "amazing benefits" with specific efficiency claims.

kaarthis · 2026-03-25T21:40:43Z

website/blog/2026-03-20-node-provisioning-best-practice/index.md

+Behavior: NAP consolidates too often or voluntarily disrupts too many nodes at once
+Cause: User has not set any guardrails on node disruption behavior.
+
+- Fix: Add PDBs that regulate disruption pace


A PDB is about pod criticality with respect to disruption/drain, not scheduling intent. It doesn't belong with nodeSelector, taints, and topology. Recommendation: split the blog. Keep Parts 1-3 (Topology, Affinity, Scheduling Intent) as the scheduling blog. Move Part 4 (PDB + NAP disruption) to a separate "Controlling Disruption with NAP: PDBs, Consolidation Policies, and Node Disruption Budgets" blog. Cross-link between them.

Updated the blog post to replace references to Pod Disruption Budgets (PDBs) with taints and tolerations, providing a more accurate description of their role in controlling node provisioning outcomes. The disruption topics (PDBs, consolidation) will be moved to another blog post.

Copilot

Pull request overview

Copilot reviewed 2 out of 3 changed files in this pull request and generated 5 comments.

Copilot · 2026-03-27T20:50:56Z

website/blog/2026-03-20-node-provisioning-best-practice/index.md

+      taints:
+        - key: test.com/custom-taint
+          effect: NoSchedule


The NodePool taints example doesn't match the text above it (which references spec.template.spec.taints). As written, the YAML is missing the spec: under template: and the indentation suggests taints is directly under template, which is not the correct NodePool schema path. Please update the example to use the correct field path and structure.

Suggested change

-      taints:
-        - key: test.com/custom-taint
-          effect: NoSchedule
+    spec:
+      taints:
+        - key: test.com/custom-taint
+          effect: NoSchedule
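Put together, the NodePool snippet with the suggested `spec:` level would look roughly like the sketch below, so that the YAML path matches the `spec.template.spec.taints` path named in the text. Other NodePool fields are left out, as in the post's snippet.

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:                          # the level missing from the original example
      taints:
        - key: test.com/custom-taint
          effect: NoSchedule
      # nodeClassRef and requirements omitted, as in the post's snippet
```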

Copilot · 2026-03-27T20:50:56Z

website/blog/2026-03-20-node-provisioning-best-practice/index.md

+          effect: NoSchedule
+```
+
+> ![NOTE] Taints can prevent pods from being scheduled to these nodes if they are not tolerated by the pods. A proper toleration must be added to your specific pods to allow them to be scheduled to nodes that are based on this NodePool CRD.


This line uses Markdown image syntax (![NOTE]) without a URL, which will render as a broken image rather than a note. Use a Docusaurus admonition (for example :::note) or a blockquote note format instead.

Suggested change

> ![NOTE] Taints can prevent pods from being scheduled to these nodes if they are not tolerated by the pods. A proper toleration must be added to your specific pods to allow them to be scheduled to nodes that are based on this NodePool CRD.

> **Note**: Taints can prevent pods from being scheduled to these nodes if they are not tolerated by the pods. A proper toleration must be added to your specific pods to allow them to be scheduled to nodes that are based on this NodePool CRD.

Copilot · 2026-03-27T20:50:56Z

website/blog/2026-03-20-node-provisioning-best-practice/index.md

+Tolerations are a field you place in your workload deployment file to flag what types of tainted nodes these pods can be scheduled to. There are two general behaviors for tolerations:
+
+- `NoSchedule` - strict toleration. Only pods with the proper toleration can be scheduled to the node with a specific taint.
+- `PreferNoSchedule` - less strict toleration. AKS will _try_ to avoid placing pods that don't tolerate this node's taint, but it's not gauranteed.


Spelling: "gauranteed" should be "guaranteed".

Suggested change

- `PreferNoSchedule` - less strict toleration. AKS will _try_ to avoid placing pods that don't tolerate this node's taint, but it's not gauranteed.

- `PreferNoSchedule` - less strict toleration. AKS will _try_ to avoid placing pods that don't tolerate this node's taint, but it's not guaranteed.
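A short sketch may help the post here. In upstream Kubernetes, `NoSchedule` and `PreferNoSchedule` are effects set on the node's taint, and the pod's toleration names the taint key and effect it is willing to accept. The Pod name and image below are placeholders; the taint key is the one from the post's NodePool example.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: tolerant-pod                       # hypothetical name
spec:
  tolerations:
    - key: test.com/custom-taint           # taint key from the post's NodePool example
      operator: Exists                     # tolerate the taint regardless of its value
      effect: NoSchedule                   # must match the taint's effect
  containers:
    - name: app
      image: registry.example.com/app:1.0  # placeholder image
```

A toleration only permits scheduling onto the tainted nodes; it does not by itself steer the pod there, which is where node affinity or a nodeSelector comes in.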

Copilot · 2026-03-27T20:50:57Z

website/blog/2026-03-20-node-provisioning-best-practice/index.md

+---
+title: "Controlling Node Provisioning Outcomes on AKS: PDBs, Affinity, and Topology Spread"
+description: "Learn AKS best practices for Node Auto-Provisioning, using taints and tolerations, affinity, and topology spread constraints to achieve predictable, resilient pod scheduling."
+date: 2026-03-30


The post folder is 2026-03-20-... but the front matter date: is 2026-03-30. The website authoring guide requires the folder date prefix to match the front matter date; please align them by renaming the folder or updating the date: value (and keep the displayed publish date consistent with what you want readers to see).

Suggested change

date: 2026-03-30

date: 2026-03-20

Copilot · 2026-03-27T20:50:57Z

website/blog/2026-03-20-node-provisioning-best-practice/index.md

+Node auto-provisioning provisions, scales, and manages nodes. NAP senses pending pod pressure, chooses/provisions nodes that satisfy workload specs and NodePool allowed options — and then schedules pods onto those nodes.
+


This description of NAP suggests it "schedules pods onto those nodes". NAP provisions/manages nodes in response to pending pods, but the Kubernetes scheduler is still what schedules pods; consider rewording to avoid implying NAP is the scheduler.

Updated the content for clarity and added links to relevant documentation. Improved explanations of Node Auto-Provisioning, Topology Spread Constraints, and Taints and Tolerations.

Updated content for clarity and added FAQ section.

Copilot

Pull request overview

Copilot reviewed 2 out of 3 changed files in this pull request and generated 9 comments.

Copilot · 2026-03-28T04:28:32Z

website/blog/tags.yml

+scheduler:
+  label: Scheduler
+  permalink: /scheduler
+  description: Scheduling workloads and managing scheduling constraints on AKS.


scheduler tag entry is missing a blank line separator before the next top-level tag (security:), which breaks the consistent spacing pattern used throughout tags.yml and makes the file harder to scan. Add a blank line after the scheduler tag block.

Suggested change

description: Scheduling workloads and managing scheduling constraints on AKS.

description: Scheduling workloads and managing scheduling constraints on AKS.
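The two suggestion lines above look identical because the blank line was lost in this view. A sketch of the intended result in `tags.yml` follows; the contents of the `security:` entry are not shown in this review and are left as a placeholder.

```yaml
scheduler:
  label: Scheduler
  permalink: /scheduler
  description: Scheduling workloads and managing scheduling constraints on AKS.

security:
  # existing entry unchanged (fields not shown in this review)
```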

Copilot · 2026-03-28T04:28:33Z

website/blog/2026-03-20-node-provisioning-best-practice/index.md

+title: "Controlling Node Provisioning Outcomes on AKS: PDBs, Affinity, and Topology Spread"
+description: "Learn AKS best practices for Node Auto-Provisioning, using taints and tolerations, affinity, and topology spread constraints to achieve predictable, resilient pod scheduling."
+date: 2026-03-30
+authors: ["wilson-darko"]
+tags:
+  - node-auto-provisioning
+  - scheduler
+---
+
+## Background
+
+AKS users want to ensure their workloads schedule, scale, and are disrupted only when (or where) desired. The problem here is Kubernetes can feel complex, and its easy to be unclear what settings to use to accomplish this. Node Auto-Provisioning optimizes bin-packing your compute, but to best utilize it - users need to make sure certain best practices are followed for predictable behavior.
+
+When adopting Kubernetes at scale, the hardest operational questions often aren’t “How do I scale nodes (or VMs)?” — they’re:
+
+- Where will my workload replicas land (zones / nodes)?
+- How do I express node preferences without accidentally blocking scheduling?
+- If I’m using Node Auto-Provisioning (NAP), how does it interpret the rules I set?
+
+This post will connect NAP with three most important workload-level tools for shaping predictable node provisioning outcomes on AKS:
+
+1. **Taints and Tolerations** – control which pods can go to which nodes
+2. **Affinity/Anti-Affinity** – control where workloads can (or should not) run
+3. **Topology Spread Constraints** – control replica distribution across failure domains
+
+Then we’ll connect the dots to explain what AKS Node Auto-Provisioning (NAP) does with those signals to manage your workloads.
+
+If you’re new to these Kubernetes features, this post will give you “good defaults” as a starting point. If you’re already deep into scheduling, treat it as a checklist for the behaviors AKS users most commonly ask about.
+


Title references “PDBs”, but the article body doesn't discuss PodDisruptionBudgets anywhere. Either add a section explaining how PDBs interact with NAP/scheduling/disruptions, or remove PDBs from the title to avoid misleading readers.

Copilot · 2026-03-28T04:28:33Z

website/blog/2026-03-20-node-provisioning-best-practice/index.md

+
+## Part 2 — Topology Spread Constraints: tool for zone-aware replicas
+
+**Topology Spread Constraints** let you tell the scheduler: “Keep these replicas balanced across domains like zones or nodes.” The Kubernetes documentation describe it as a way to spread pods across failure domains such as regions, zones, nodes, and custom topology keys.


Grammar issue: “The Kubernetes documentation describe it” should be “The Kubernetes documentation describes it”.

Suggested change

**Topology Spread Constraints** let you tell the scheduler: “Keep these replicas balanced across domains like zones or nodes.” The Kubernetes documentation describe it as a way to spread pods across failure domains such as regions, zones, nodes, and custom topology keys.

**Topology Spread Constraints** let you tell the scheduler: “Keep these replicas balanced across domains like zones or nodes.” The Kubernetes documentation describes them as a way to spread pods across failure domains such as regions, zones, nodes, and custom topology keys.

Copilot · 2026-03-28T04:28:33Z

website/blog/2026-03-20-node-provisioning-best-practice/index.md

+          effect: NoSchedule
+```
+
+> ![NOTE] Taints can prevent pods from being scheduled to these nodes if they are not tolerated by the pods. A proper toleration must be added to your specific pods to allow them to be scheduled to nodes that are based on this NodePool CRD.


The callout uses > ![NOTE], which isn't a supported admonition/alert syntax in this repo (other posts use :::note or > [!NOTE]). Update this block to a supported format so it renders correctly.

Suggested change

> ![NOTE] Taints can prevent pods from being scheduled to these nodes if they are not tolerated by the pods. A proper toleration must be added to your specific pods to allow them to be scheduled to nodes that are based on this NodePool CRD.

> [!NOTE]

> Taints can prevent pods from being scheduled to these nodes if they are not tolerated by the pods. A proper toleration must be added to your specific pods to allow them to be scheduled to nodes that are based on this NodePool CRD.

Copilot · 2026-03-28T04:28:33Z

website/blog/2026-03-20-node-provisioning-best-practice/index.md

+You can consider enabling features such as [Artifact Stream](https://learn.microsoft.com/en-us/azure/aks/artifact-streaming) which can decrease pod readiness time. 
+
+For more visit our documentation on [performance and scaling best practices](https://learn.microsoft.com/en-us/azure/aks/best-practices-performance-scale).


These Learn links use locale-specific URLs (/en-us/). Repo guidance prefers generic Learn links without locale segments to avoid redirects and keep links consistent. Update them to https://learn.microsoft.com/azure/....

Suggested change

You can consider enabling features such as [Artifact Stream](https://learn.microsoft.com/en-us/azure/aks/artifact-streaming) which can decrease pod readiness time.

For more visit our documentation on [performance and scaling best practices](https://learn.microsoft.com/en-us/azure/aks/best-practices-performance-scale).

You can consider enabling features such as [Artifact Stream](https://learn.microsoft.com/azure/aks/artifact-streaming) which can decrease pod readiness time.

For more visit our documentation on [performance and scaling best practices](https://learn.microsoft.com/azure/aks/best-practices-performance-scale).

Copilot · 2026-03-28T04:28:33Z

website/blog/2026-03-20-node-provisioning-best-practice/index.md

+        preference:
+          matchExpressions:
+            - key: node.kubernetes.io/instance-type
+              operator: In
+              values: ["Standard_D16ds_v5"]


The preferredDuringSchedulingIgnoredDuringExecution node affinity example is not valid Kubernetes syntax: this field must be a list of terms and each term requires a weight and a preference. As written, readers copying this will get schema/validation errors.

Suggested change

-        preference:
-          matchExpressions:
-            - key: node.kubernetes.io/instance-type
-              operator: In
-              values: ["Standard_D16ds_v5"]
+      - weight: 100
+        preference:
+          matchExpressions:
+            - key: node.kubernetes.io/instance-type
+              operator: In
+              values:
+                - Standard_D16ds_v5

Copilot · 2026-03-28T04:28:34Z

website/blog/2026-03-20-node-provisioning-best-practice/index.md

+| Topology Spread Configuration | Affinity Configuration | Observed Scheduling Behavior | Recommendation |
+|------------------------------|------------------------|------------------------------|----------------|
+| **Hard** (`whenUnsatisfiable: DoNotSchedule`) | **Hard Node Affinity** (`requiredDuringSchedulingIgnoredDuringExecution`) | Pod remains **Pending** if no node satisfies *both* constraints. The scheduler filters out all nodes that violate either rule. | Use only when you are certain the constraints are always compatible (for example, multi‑zone node affinity plus multi‑zone spread). Avoid mixing single‑zone affinity with multi‑zone spread. |
+| **Soft** (`whenUnsatisfiable: ScheduleAnyway`) | **Hard Node Affinity** (`requiredDuringSchedulingIgnoredDuringExecution`) | Pod schedules only on nodes matching affinity. Topology spread is applied as **best‑effort**, and distribution may be uneven. | ✅ **Recommended default** for most workloads. Enforce strict placement requirements while keeping high availability best‑effort. |
+| **Hard** (`whenUnsatisfiable: DoNotSchedule`) | **Soft Node Affinity** (`preferredDuringSchedulingIgnoredDuringExecution`) | Pod schedules only if topology spread constraints are met. Affinity acts only as a preference among valid nodes. | Use when even distribution across zones or nodes is more important than node‑level preferences. |
+| **Soft** (`whenUnsatisfiable: ScheduleAnyway`) | **Soft Node Affinity** | Pod always schedules. Both constraints only influence scoring; placement is flexible and may be imbalanced. | Suitable for dev/test, batch, or low‑criticality workloads. |
+| **Hard multi‑zone spread** (`whenUnsatisfiable: DoNotSchedule` and `minDomains` >= 2) | **Single‑zone hard affinity** | Pod enters a permanent **Pending** state due to a logical contradiction between constraints. | Align affinity and spread to the same topology domains, or relax one of the constraints. |
+


The markdown table under “The following table lists…” is written with leading || on each row, which creates an empty first column and typically renders incorrectly. Rewrite it using standard markdown table syntax with a single leading | per row.
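To ground the row the table marks as the recommended default (soft spread plus hard node affinity), here is a minimal pod template sketch. It reuses the `app: web` label and instance type from the post's examples and is an illustration, not the post's own manifest.

```yaml
# Pod template fragment (for example, under a Deployment's spec.template)
spec:
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: topology.kubernetes.io/zone
      whenUnsatisfiable: ScheduleAnyway       # soft: spread is best-effort
      labelSelector:
        matchLabels:
          app: web
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:   # hard: placement is enforced
        nodeSelectorTerms:
          - matchExpressions:
              - key: node.kubernetes.io/instance-type
                operator: In
                values: ["Standard_D16ds_v5"]
```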

Copilot · 2026-03-28T04:28:34Z

website/blog/2026-03-20-node-provisioning-best-practice/index.md

+In the following example shows a taint called `test.com/custom-taint` that is added in the `spec.template.spec.taints` field in a [NodePool CRD](https://learn.microsoft.com/azure/aks/node-auto-provisioning-node-pools):
+
+```yaml
+apiVersion: karpenter.sh/v1
+kind: NodePool
+metadata:
+  name: default
+spec:
+  template:
+      taints:
+        - key: test.com/custom-taint
+          effect: NoSchedule
+```


The NodePool taints YAML snippet doesn't match the field path described in the text (spec.template.spec.taints). The example is missing the spec: level under template and the indentation suggests taints is directly under template, which is inconsistent and likely to confuse readers.

Copilot · 2026-03-28T04:28:34Z

website/blog/2026-03-20-node-provisioning-best-practice/index.md

+Tolerations are a field you place in your workload deployment file to flag what types of tainted nodes these pods can be scheduled to. There are two general behaviors for tolerations:
+
+- `NoSchedule` - strict toleration. Only pods with the proper toleration can be scheduled to the node with a specific taint.
+- `PreferNoSchedule` - less strict toleration. AKS will _try_ to avoid placing pods that don't tolerate this node's taint, but it's not gauranteed.


Spelling: “gauranteed” should be “guaranteed”.

Suggested change

- `PreferNoSchedule` - less strict toleration. AKS will _try_ to avoid placing pods that don't tolerate this node's taint, but it's not gauranteed.

- `PreferNoSchedule` - less strict toleration. AKS will _try_ to avoid placing pods that don't tolerate this node's taint, but it's not guaranteed.

wdarko1 added 3 commits March 17, 2026 22:50

Create blog post on AKS node provisioning best practices

ab03ec4

Add a new blog post on controlling node provisioning outcomes in AKS, covering PDBs, affinity, and topology spread constraints.

Remove unnecessary links and improve clarity

5375382

Refine Kubernetes provisioning guidance and terminology

5d2a855

Updated language for clarity and precision in Kubernetes provisioning guidance. Enhanced explanations of key concepts and best practices for AKS Node Auto-Provisioning.

wdarko1 requested review from a team, alvinli222 and Copilot March 18, 2026 16:10

Copilot started reviewing on behalf of wdarko1 March 18, 2026 16:11

Copilot AI reviewed Mar 18, 2026

Apply suggestions from code review

f2d2f36

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Copilot AI review requested due to automatic review settings March 19, 2026 03:04

Copilot AI reviewed Mar 19, 2026

colinmixonn requested changes Mar 23, 2026

colinmixonn reviewed Mar 23, 2026

website/blog/2026-03-20-node-provisioning-best-practice/index.md (resolved)

wdarko1 added 2 commits March 24, 2026 08:16

Revise blog post on AKS node provisioning best practices

386dd88

Updated article to refine title and description, adjust publication date, and enhance clarity on Node Auto-Provisioning (NAP) concepts and best practices.

Enhance Node Provisioning best practices section

00ea6b1

Expanded on the benefits of Node Auto-Provisioning for compute efficiency and added a section on next steps for users to get started.

Copilot AI review requested due to automatic review settings March 25, 2026 18:52

Copilot AI reviewed Mar 25, 2026

Copilot started reviewing on behalf of wdarko1 March 25, 2026 19:02

wdarko1 added 2 commits March 25, 2026 12:12

Add files via upload

1cc5a2b

Merge branch 'Azure:master' into nap-provisioning-best-practice-blog

ee10b6a

Copilot AI review requested due to automatic review settings March 25, 2026 19:12

Add diagram to node provisioning blog post

f22f655

Copilot started reviewing on behalf of wdarko1 March 25, 2026 19:16

wdarko1 added 3 commits March 25, 2026 12:17

Add files via upload

ce04104

Update image for NAP topology spread behavior

0955a20

Delete website/blog/2026-03-20-node-provisioning-best-practice/nap-to…

a4473d7

…pology-spread-image.png

wdarko1 requested a review from colinmixonn March 25, 2026 19:18

Copilot AI reviewed Mar 25, 2026

wdarko1 and others added 2 commits March 25, 2026 12:52

Add 'Scheduler' tag to blog tags configuration

88667fc

Added a new tag for 'Scheduler' with relevant details.

Apply suggestions from code review

4e3a665

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot AI review requested due to automatic review settings March 25, 2026 20:00

Copilot started reviewing on behalf of wdarko1 March 25, 2026 20:00

Rename nap-topology-spread-image-1.png to hero-image.png

c237d7c

Copilot AI reviewed Mar 25, 2026

website/blog/tags.yml (resolved)

website/blog/2026-03-20-node-provisioning-best-practice/index.md (outdated, resolved)

wdarko1 and others added 2 commits March 25, 2026 13:08

Refine content and update image references

6ecab86

Updated text for clarity and consistency throughout the document, including image references and examples.

Update website/blog/2026-03-20-node-provisioning-best-practice/index.md

5f2426e

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot AI review requested due to automatic review settings March 25, 2026 20:10

Copilot started reviewing on behalf of wdarko1 March 25, 2026 20:11

Copilot AI reviewed Mar 25, 2026

website/blog/2026-03-20-node-provisioning-best-practice/index.md (outdated, resolved)

website/blog/2026-03-20-node-provisioning-best-practice/index.md (outdated, resolved)

wdarko1 and others added 2 commits March 25, 2026 13:19

Fix formatting and update node affinity examples

d056e3a

Apply suggestions from code review

251b24c

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot AI review requested due to automatic review settings March 25, 2026 20:20

Copilot started reviewing on behalf of wdarko1 March 25, 2026 20:20

Copilot AI reviewed Mar 25, 2026

wdarko1 and others added 2 commits March 25, 2026 13:26

Fix PDB recommendation for NAP node upgrades

fda06d2

Apply suggestions from code review

04b5d11

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot AI review requested due to automatic review settings March 25, 2026 20:28

Copilot started reviewing on behalf of wdarko1 March 25, 2026 20:28

Copilot AI reviewed Mar 25, 2026

Fix YAML formatting for node affinity example

0b65776

kaarthis reviewed Mar 25, 2026

Copilot AI review requested due to automatic review settings March 27, 2026 20:45

Copilot started reviewing on behalf of wdarko1 March 27, 2026 20:47

Copilot AI reviewed Mar 27, 2026

wdarko1 added 2 commits March 27, 2026 20:54

Enhance clarity and add documentation links in blog

fb12441

Updated the content for clarity and added links to relevant documentation. Improved explanations of Node Auto-Provisioning, Topology Spread Constraints, and Taints and Tolerations.

Refine node provisioning best practices and add FAQ

341adb9

Updated content for clarity and added FAQ section.

Copilot AI review requested due to automatic review settings March 28, 2026 04:20

Copilot started reviewing on behalf of wdarko1 March 28, 2026 04:25

Copilot AI reviewed Mar 28, 2026

		- If I’m using Node Auto-Provisioning (NAP), how does it interpret the rules I set?

		This post will connect NAP with the three most important workload-level tools for shaping predictable node provisioning outcomes on AKS:


		Practical guidance:

		- For critical workloads that you do not want disrupted at all, a strict "zero eviction" policy may be intentional, but be deliberate about it. When you're ready to allow disruption to these workloads, you may have to change the PDBs in the workload deployment file.


		Kubernetes describes minAvailable / maxUnavailable as the two key availability knobs, and notes you can only specify one per PDB.
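
As a minimal sketch of the one-knob-per-PDB rule described above, the manifest below sets only `minAvailable`. The name `web-pdb`, the `app: web` label, and the value `2` are hypothetical placeholders, not values taken from the blog post.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb              # hypothetical name
spec:
  # Set exactly one of minAvailable or maxUnavailable, never both.
  minAvailable: 2            # illustrative value: keep at least 2 matching pods running
  selector:
    matchLabels:
      app: web               # hypothetical label; must match the workload's pod labels
```

Swapping `minAvailable: 2` for `maxUnavailable: 1` expresses the budget as a cap on simultaneous voluntary evictions instead; which form fits better depends on how the workload scales.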

		### How NAP handles disruption


		## Background

		AKS users want their workloads to schedule, scale, and be disrupted only when (or where) desired. Kubernetes can feel complex, and it's easy to be unsure which settings accomplish this. Node Auto-Provisioning offers significant compute-efficiency benefits, but to get predictable behavior from it, users need to follow certain best practices.

-	> ![NOTE] Taints can prevent pods from being scheduled to these nodes if they are not tolerated by the pods. A proper toleration must be added to your specific pods to allow them to be scheduled to nodes that are based on this NodePool CRD.
+	> Note: Taints can prevent pods from being scheduled to these nodes if they are not tolerated by the pods. A proper toleration must be added to your specific pods to allow them to be scheduled to nodes that are based on this NodePool CRD.

-	- `PreferNoSchedule` - less strict toleration. AKS will _try_ to avoid placing pods that don't tolerate this node's taint, but it's not gauranteed.
+	- `PreferNoSchedule` - less strict toleration. AKS will _try_ to avoid placing pods that don't tolerate this node's taint, but it's not guaranteed.

		Node Auto-Provisioning (NAP) provisions, scales, and manages nodes. It detects pending pod pressure, selects and provisions nodes that satisfy both the workload specs and the options the NodePool allows, and then schedules pods onto those nodes.

	description: Scheduling workloads and managing scheduling constraints on AKS.
	description: Scheduling workloads and managing scheduling constraints on AKS.


		## Part 2 — Topology Spread Constraints: tool for zone-aware replicas

		Topology Spread Constraints let you tell the scheduler: “Keep these replicas balanced across domains like zones or nodes.” The Kubernetes documentation describes them as a way to spread pods across failure domains such as regions, zones, nodes, and custom topology keys.
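
To illustrate the zone-balancing idea above, here is a hedged sketch of a Deployment that spreads replicas across zones. The workload name, label, replica count, and image are hypothetical placeholders. `topology.kubernetes.io/zone` is the well-known Kubernetes zone label.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                          # hypothetical workload name
spec:
  replicas: 6                        # illustrative replica count
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      topologySpreadConstraints:
        - maxSkew: 1                 # zones may differ by at most one matching pod
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: DoNotSchedule   # hard rule; ScheduleAnyway makes it a preference
          labelSelector:
            matchLabels:
              app: web               # counts only this workload's pods
      containers:
        - name: web
          image: nginx:1.27          # placeholder image
```

With `DoNotSchedule`, a pod that would break the skew stays pending rather than landing in an already crowded zone. `ScheduleAnyway` relaxes that when availability of capacity matters more than strict balance.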

-	> ![NOTE] Taints can prevent pods from being scheduled to these nodes if they are not tolerated by the pods. A proper toleration must be added to your specific pods to allow them to be scheduled to nodes that are based on this NodePool CRD.
+	> [!NOTE]
+	> Taints can prevent pods from being scheduled to these nodes if they are not tolerated by the pods. A proper toleration must be added to your specific pods to allow them to be scheduled to nodes that are based on this NodePool CRD.
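
As a sketch of the toleration the note above calls for, the pod below tolerates an assumed taint `workload=batch:NoSchedule`. That taint key and value, the pod name, and the image are hypothetical; the real toleration has to mirror whatever taint the NodePool actually defines.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: batch-worker           # hypothetical pod name
spec:
  tolerations:
    - key: "workload"          # assumed taint key on the NodePool's nodes
      operator: "Equal"
      value: "batch"           # assumed taint value
      effect: "NoSchedule"     # must match the taint's effect
  containers:
    - name: worker
      image: nginx:1.27        # placeholder image
```

A toleration only permits scheduling onto the tainted nodes. It does not steer the pod there, so it is often paired with a node selector or node affinity when the pod should run only on those nodes.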

		Consider enabling features such as [Artifact Streaming](https://learn.microsoft.com/en-us/azure/aks/artifact-streaming), which can decrease pod readiness time.

		For more, see our documentation on [performance and scaling best practices](https://learn.microsoft.com/en-us/azure/aks/best-practices-performance-scale).

Conversation

wdarko1 commented Mar 18, 2026

Copilot AI left a comment