Phase B — Karpenter-native provider for Scaleway Elastic Metal (no-image + rescue + dd)

## Context

Phase A (Kamaji + CAPI + Karpenter on Scaleway VMs) is shipping. The next step is **bare-metal autoscaling** so that sustained workloads (>12h/day) can migrate automatically from cloud VMs to Scaleway Elastic Metal via Karpenter consolidation.

No OSS CAPI provider supports Scaleway Elastic Metal today (CAPS v0.2.1 is VM-only). Matchbox cannot reach Scaleway EM's managed network. The cleanest path is a **native Karpenter CloudProvider** that wraps the Scaleway EM API directly — no CAPI indirection.

See [ADR-025 §3.5](../blob/main/docs/adr/025-kamaji-detailed-architecture.md) and [ADR-024](../blob/main/docs/adr/024-hybrid-autoscaling-architecture.md) for the architectural rationale.

## Scope

New repo (or subfolder): `providers/karpenter-scaleway-em/`

```
providers/karpenter-scaleway-em/
├── cmd/karpenter-scaleway-em/       # main.go — registers with karpenter core
├── pkg/
│   ├── apis/v1alpha1/               # ScalewayElasticMetalNodeClass CRD
│   ├── cloudprovider/               # 7 methods of pkg/cloudprovider/types.go
│   │   ├── create.go
│   │   ├── delete.go
│   │   ├── get_list.go
│   │   ├── instance_types.go        # EM offer catalog → InstanceType map
│   │   ├── drift.go
│   │   └── repair.go
│   └── talos/                       # rescue + dd orchestration (reusable package)
│       ├── install.go               # SSH rescue → curl | xz | dd
│       ├── config.go                # render machine config per NodeClaim
│       └── wait.go                  # poll Scaleway + Talos API states
├── charts/karpenter-scaleway-em/    # Helm chart
└── Makefile + go.mod + Dockerfile
```

## Implementation approach

**Chosen path**: `no-preinstalled-image + rescue + dd Talos RAW`

Rationale (vs cloud-init + kexec and BMC + ISO): fastest (~10-12 min, vs ~14 min for cloud-init + kexec and ~30 min for BMC + ISO), fully API-driven (SSH key auto-injected in rescue), atomic disk write, debuggable via SSH fallback.

### Create(NodeClaim) flow

1. `POST /baremetal/v1/zones/{zone}/servers` with `install: null`
2. Poll `GET /servers/{id}` until `status: ready` (~3-5 min)
3. `POST /servers/{id}/reboot` with `boot_type: rescue` (~3-5 min)
4. SSH `rescue@<ip>`:
   - Auto-detect disk: `lsblk -dno NAME,TYPE | awk '$2=="disk"{print "/dev/"$1; exit}'`
   - `curl -fsSL <talos.raw.xz> | xz -d | dd of=$DISK bs=4M oflag=direct`
5. `POST /servers/{id}/reboot` with `boot_type: normal`
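6. Poll the Talos API on port 50000 until it answers in maintenance mode (~3 min)
7. `talosctl apply-config --insecure` with the rendered config (kubelet join token, labels, taints, hostname); maintenance mode has no client certificates yet, hence `--insecure`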
8. Return `NodeClaim` hydrated with `providerID: scaleway-em://{zone}/{server_id}`

Karpenter core then takes over: it waits for the Node to become Ready and binds it to the NodeClaim.

### Delete/List/Drift/InstanceTypes

- `Delete`: `DELETE /servers/{id}` + optional cooldown (EM billed monthly — `disruption.consolidateAfter: 1h` minimum)
- `List/Get`: filter by tag `karpenter.sh/nodepool=<name>`
- `IsDrifted`: compare live server image hash (stored in tags) vs NodeClass
- `GetInstanceTypes`: static catalog mapping Scaleway EM offers (`EM-I220E`, `EM-L520E`, etc.) to Karpenter InstanceType resources (CPU, memory, zones, price)
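The tag-based `List` filter and the `IsDrifted` comparison could look roughly like the sketch below. It assumes server tags are plain `key=value` strings. The `Server` struct is a trimmed-down stand-in, not the SDK type, and the `talos-image-sha` tag key is a hypothetical name for the stored image hash.

```go
package main

import "strings"

// Server is a hypothetical, minimal view of an Elastic Metal server.
type Server struct {
	ID   string
	Tags []string
}

const (
	nodePoolTag  = "karpenter.sh/nodepool" // tag key named in the List/Get bullet
	imageHashTag = "talos-image-sha"       // hypothetical key for the stored image hash
)

// tagValue returns the value of the first "key=value" tag whose key matches.
func tagValue(tags []string, key string) (string, bool) {
	for _, t := range tags {
		if k, v, ok := strings.Cut(t, "="); ok && k == key {
			return v, true
		}
	}
	return "", false
}

// FilterByNodePool keeps only the servers tagged karpenter.sh/nodepool=<pool>.
func FilterByNodePool(servers []Server, pool string) []Server {
	var out []Server
	for _, s := range servers {
		if v, ok := tagValue(s.Tags, nodePoolTag); ok && v == pool {
			out = append(out, s)
		}
	}
	return out
}

// IsDrifted reports whether the image hash recorded on the server differs
// from the one the NodeClass wants. A missing tag is treated as drifted,
// which is a design choice of this sketch, not a Karpenter requirement.
func IsDrifted(s Server, wantHash string) bool {
	got, ok := tagValue(s.Tags, imageHashTag)
	return !ok || got != wantHash
}
```

Treating a missing hash tag as drift is the aggressive option. The conservative alternative is to report no drift for untagged servers and log a warning instead.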

## Key gotchas

- [ ] **Disk name variability** — `nvme0n1` (most offers), `sda` (legacy A-series). Auto-detect via `lsblk`.
- [ ] **Interface name mismatch** between rescue (`eth0`) and the Talos kernel (`enpXsY`). Detect the real name post-boot via `talosctl get links --insecure`, or match the interface by MAC address in the machine config.
- [ ] **Out-of-stock** — surface as `UnavailableOffering` so Karpenter falls back to VM pools.
- [ ] **Quota** — default 5 servers/project, needs ticket to raise.
- [ ] **IPv6** — Scaleway gives /128, not /64. Use static config.
- [ ] **Monthly billing** — no refund on Delete. Enforce min-lifetime via NodePool `disruption.consolidateAfter: 1h`.
- [ ] **Talos image hosting** — use Garage S3 bucket (already in `stacks/storage`) for the `.raw.xz` files. One per schematic-sha7.
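For the disk-name gotcha, one option is to run `lsblk -dno NAME,TYPE` over SSH and pick the disk in Go rather than in `awk`, so an empty result becomes an explicit error. The sketch below also builds the remote install command. `FirstDisk`, `InstallCommand` and `shellQuote` are hypothetical helper names. The `set -euo pipefail` prefix is an addition not present in the original command, and it assumes the rescue login shell is bash.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// FirstDisk parses the output of `lsblk -dno NAME,TYPE` and returns the
// device path of the first entry whose type is "disk" (e.g. /dev/nvme0n1).
func FirstDisk(lsblkOut string) (string, error) {
	for _, line := range strings.Split(lsblkOut, "\n") {
		f := strings.Fields(line)
		if len(f) == 2 && f[1] == "disk" {
			return "/dev/" + f[0], nil
		}
	}
	return "", errors.New("no device of type disk in lsblk output")
}

// shellQuote wraps s in single quotes so the remote shell treats it literally.
func shellQuote(s string) string {
	return "'" + strings.ReplaceAll(s, "'", `'\''`) + "'"
}

// InstallCommand returns the command to run in rescue mode. With pipefail,
// a failed download or decompression fails the whole pipeline instead of
// letting dd exit 0 on a truncated image.
func InstallCommand(imageURL, disk string) string {
	return fmt.Sprintf(
		"set -euo pipefail; curl -fsSL %s | xz -d | dd of=%s bs=4M oflag=direct",
		shellQuote(imageURL), shellQuote(disk),
	)
}
```

Like the `awk` one-liner, `FirstDisk` takes the first disk reported by `lsblk`. On offers with several disks, the NodeClass would need a way to select one explicitly.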

## Estimated effort

| Task | Days |
|---|---|
| Skeleton (copy from `kwok` provider) + CRD + `main.go` | 2-3 |
| Scaleway SDK wrappers (7 methods) | 3-4 |
| Rescue+dd orchestration (SSH robust, retry, timeouts) | 3-5 |
| Talos machine config rendering per-node | 2 |
| Out-of-stock + drift + pricing catalog | 2 |
| E2E tests on fr-par-2 + doc + OSS release | 3-4 |
| **Total** | **15-20 days (3-4 weeks)** |

## Prerequisites (to do before starting)

- [ ] **Decision trigger**: Phase A stable in prod for 2-3 months
- [ ] **Cost justification**: measure ≥5 nodes saturated >12h/day, OR large I/O-heavy workloads (DB, Spark, ML training) that justify bare metal
- [ ] **Stock / quota check**: raise EM quota via Scaleway support ticket
- [ ] **1-2 test servers** pre-provisioned manually to validate rescue+dd flow before coding

## Alternatives considered

| Approach | Effort | Verdict |
|---|---|---|
| Upstream PR to CAPS adding `ScalewayElasticMetalMachine` | 25-35 days + Scaleway review cycle | Slower, uncertain timeline, more CAPI ceremony for no functional gain |
| Siderolabs Omni BMIP (commercial) | Negligible | License cost, vendor lock-in |
| Matchbox | N/A | Incompatible — Scaleway EM doesn't expose DHCP/PXE control |
| Cloud-init + kexec on Debian | Similar effort | Slower (~14 min), riskier (kexec fails = rescue mandatory) |
| BMC + ISO manual | N/A | Not scriptable (HTML5/Java KVM) |
| **Native Karpenter + rescue + dd** | **15-20 days** | **✅ Winner** |

## References

- ADR-024: Hybrid autoscaling architecture
- ADR-025: Kamaji detailed architecture
- [Scaleway Elastic Metal API](https://www.scaleway.com/en/developers/api/elastic-metal/)
- [scaleway-sdk-go/api/baremetal/v1](https://github.com/scaleway/scaleway-sdk-go/tree/master/api/baremetal/v1)
- [Karpenter CloudProvider interface](https://github.com/kubernetes-sigs/karpenter/blob/main/pkg/cloudprovider/types.go#L72)
- [Karpenter kwok reference provider](https://github.com/kubernetes-sigs/karpenter/tree/main/kwok/cloudprovider)
- [Talos Image Factory](https://factory.talos.dev)

---

**Related tasks**: supersedes task #20 (bare-metal CAPI path decision) — outcome: path D (custom Karpenter provider) chosen over A (CAPS upstream) and B (Matchbox, which is incompatible with Scaleway EM).
