Use real LIS CSI EKS addon for instance store metrics integration test#675

Merged
nathalapooja merged 4 commits into main from liscsi-integ-test-real-addon
Apr 30, 2026
Conversation

@nathalapooja
Contributor

Summary

Replace mock CSI driver with real aws-ec2-local-instance-store-csi-driver EKS addon for NVMe instance store metrics integration testing.

Changes

  • terraform: Install LIS CSI addon with metrics enabled, deploy IO workload with ephemeral volume on ec2-instance-store-sc StorageClass
  • terraform: Use i7i.xlarge (NVMe instance store), K8s 1.33, AL2023 AMI
  • terraform: Add rollout wait and debug output for CWAgent image patch
  • terraform: Align providers.tf and resource ordering with EBS CSI test pattern
  • generator: Add k8sVersion override support to testConfig struct
  • generator: Set liscsi test overrides (k8sVersion=1.33, i7i.xlarge, AL2023)
  • test: Increase agent run duration to 5 minutes for metric propagation
  • test: Remove mock-lis-csi.yaml (no longer needed)

Testing

All 9 metrics validated across 3 dimension sets (27 metric series) + EMF logs ✅

EKS_LIS_CSI Successful
ClusterName Successful (9/9 metrics)
ClusterName-InstanceId-NodeName Successful (9/9 metrics)
ClusterName-InstanceId-NodeName-VolumeId Successful (9/9 metrics)
emf-logs Successful
PASS TestLISCSISuite (305.02s)

Successful run: https://github.com/aws/amazon-cloudwatch-agent/actions/runs/25131500439/job/73658793882

Dependencies

  • Requires aws/amazon-cloudwatch-agent#2104 (contrib bump with LIS scraper label fix)
  • Requires aws-ec2-local-instance-store-csi-driver addon allowlisted for the CI account

@nathalapooja nathalapooja requested a review from a team as a code owner April 29, 2026 20:36
Comment thread generator/test_case_generator.go Outdated
targets: map[string]map[string]struct{}{"arc": {"amd64": {}}},
instanceType: "i7i.xlarge",
ami: "AL2023_x86_64_STANDARD",
k8sVersion: "1.33",
Contributor

Do we need an override for k8sVersion at this level, given that we already have to maintain the default in variables.tf?

Also, would you mind using 1.35, given that's the latest?

Contributor Author

Unfortunately, no: removing k8sVersion from the test case generator would break CI. Here's why:

The CI workflow always passes the k8s version from the test matrix to Terraform explicitly:

terraform apply -var="k8s_version=${{ matrix.arrays.k8sVersion }}"

If k8sVersion is missing from the Go test config, the generated test matrix row will inherit the default "1.31" from eks_daemon_test_matrix.json (the shared matrix for all EKS daemon tests). That means liscsi would get k8s_version=1.31 passed to Terraform, overriding the 1.35 default in variables.tf.

Contributor

That's fine, right? We would then just update the test matrix to 1.35 for all our tests.

Replace mock CSI driver with real aws-ec2-local-instance-store-csi-driver
EKS addon for NVMe instance store metrics integration testing.

Changes:
- terraform: Install LIS CSI addon with metrics enabled, deploy IO workload
  with ephemeral volume on ec2-instance-store-sc StorageClass
- terraform: Use i7i.xlarge (NVMe instance store), K8s 1.33, AL2023 AMI
- terraform: Add rollout wait and debug output for CWAgent image patch
- terraform: Align providers.tf and resource ordering with EBS CSI test
- generator: Add k8sVersion override support to testConfig struct
- generator: Set liscsi test overrides (k8sVersion=1.33, i7i.xlarge, AL2023)
- test: Increase agent run duration to 5 minutes for metric propagation
- test: Remove mock-lis-csi.yaml (no longer needed)

Tested: All 9 metrics validated across 3 dimension sets (27 series) + EMF logs
Run: https://github.com/aws/amazon-cloudwatch-agent/actions/runs/25131500439/job/73658793882
@nathalapooja nathalapooja force-pushed the liscsi-integ-test-real-addon branch from e495a51 to 73318ec Compare April 30, 2026 17:11
@nathalapooja nathalapooja merged commit 9bdc034 into main Apr 30, 2026
6 checks passed
@nathalapooja nathalapooja deleted the liscsi-integ-test-real-addon branch April 30, 2026 17:52