Feat/RW-1123 by stewartshea · Pull Request #800 · runwhen-contrib/runwhen-local

stewartshea · 2026-05-28T14:49:44Z

Updated the Azure resource type registry to include new resource types such as Azure Cosmos SQL databases, MySQL flexible servers, PostgreSQL databases, and Redis caches, improving the breadth of resource discovery.
Refactored the Azure API indexer to support selective indexing, ensuring that only relevant resources are processed based on workspace configuration.
Improved error handling for Azure Service Principal configuration, providing clearer feedback when required fields are missing.
Enhanced documentation to clarify the new resource types and indexing logic, ensuring developers have up-to-date information on the indexing process and available resources.
Updated Dockerfile and Python dependencies to support the new Azure SDK features, ensuring compatibility and stability in resource management.

Note

Low Risk
Changes are limited to workflows, test harnesses, and Helm values; no production application logic in this diff.

Overview
This PR aligns RunWhen Local integration tests and CI with the Workspace Builder on port 8000 (replacing 8081), drops MkDocs cheat-sheet stamping from image build workflows, and adds live Azure azureapi indexer validation.

Port and packaging: Docker/Helm/Taskfile fixtures map host ports to 8000; chart comments describe the REST service instead of a separate cheat sheet. EKS Helm values drop cheatSheet.disabled.

CI: ado-ci-test path filters include azureapi, azure_common, and resource_writer sources. merge_to_main and pr_open no longer run sed on mkdocs.yml for version/date.

Azure backend tests: Adds diff_resource_dump.py to compare resource-dump.yaml from cloudquery vs azureapi (ignoring _cq_* metadata). multi-subscription-aks gains run-backend-equivalence-test. New .test/azure/no-aks-resources fixture (cheap Terraform RGs/storage/KV) with Taskfile CI for baseline discovery, selective per-RG LOD, excludeTags, and cross-backend equivalence via resources.sqlite and indexer log counters.
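The cross-backend comparison described above can be sketched roughly as follows. This is an illustration only, not the contents of diff_resource_dump.py: it assumes each dump has already been loaded into a dict keyed by a resource identifier (a hypothetical shape), and `strip_cq_metadata` and `diff_dumps` are made-up names. It drops every `_cq_*` key recursively and collects differences into a list rather than raising on the first mismatch.

```python
from typing import Any

CQ_PREFIX = "_cq_"  # assumption: CloudQuery metadata keys share this prefix


def strip_cq_metadata(value: Any) -> Any:
    """Return a copy of value with every dict key starting with _cq_ removed."""
    if isinstance(value, dict):
        return {
            key: strip_cq_metadata(item)
            for key, item in value.items()
            if not (isinstance(key, str) and key.startswith(CQ_PREFIX))
        }
    if isinstance(value, list):
        return [strip_cq_metadata(item) for item in value]
    return value


def diff_dumps(cloudquery: dict, azureapi: dict) -> list[str]:
    """Collect human-readable differences between two already-loaded dumps."""
    differences = []
    left = strip_cq_metadata(cloudquery)
    right = strip_cq_metadata(azureapi)
    for key in sorted(set(left) | set(right)):
        if key not in left:
            differences.append(f"{key}: only in azureapi")
        elif key not in right:
            differences.append(f"{key}: only in cloudquery")
        elif left[key] != right[key]:
            differences.append(f"{key}: payload differs")
    return differences
```

An empty list from `diff_dumps` would mean the two backends produced equivalent output once the metadata is ignored.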

Reviewed by Cursor Bugbot for commit 39ca6e5.

…sage and enhance documentation

- Updated CI workflows to include additional indexers for Azure API resources.
- Changed port mappings from 8081 to 8000 across various Taskfiles and documentation for consistency.
- Revised documentation to reflect the new architecture and clarify the role of the Workspace Builder and REST API.
- Removed references to the Cheat Sheet in favor of Discovery Output in documentation for improved clarity.

- Added functionality to persist workspace artifacts in the SQLite database, including a new `workspace_artifacts` table to store various rendered outputs.
- Implemented checks for the presence of workspace artifacts and SLX files during validation, improving error handling and user feedback.
- Updated documentation to include details on the new `ResourceWriter` and `Resource store query API`, enhancing clarity for developers.
- Enhanced the Workspace Explorer UI to better display indexed resources and artifacts, improving user experience.
- Refactored related code to streamline the handling of Skill overlays and artifact rendering, ensuring consistency across the application.

- Updated the `AzureResourceTypeSpec` class to support two collector methods: `collector_all` for subscription-wide listings and `collector_in_rg` for resource-group scoped listings.
- Enhanced documentation to clarify the purpose and usage of the new collector methods and the process for adding new Azure resource types.
- Implemented selective indexing logic to drop resources with an effective Level of Detail (LOD) of NONE before reaching the writer, improving efficiency and clarity in resource management.
- Added functions to extract resource-group and subscription IDs from ARM IDs, facilitating better resource scoping and management.
- Improved overall code structure and readability, ensuring compatibility with existing callers while introducing new functionality.
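As a rough illustration of the ARM ID helpers and the LOD filter mentioned in this commit, the sketch below parses the standard `/subscriptions/{id}/resourceGroups/{name}/...` layout and drops anything whose effective LOD is NONE. The function names, the `rg_lods` mapping, and the default of `"none"` are assumptions for the example and are not taken from the PR's code.

```python
import re
from typing import Optional

# ARM IDs are compared case-insensitively, so the patterns ignore case.
_SUBSCRIPTION_RE = re.compile(r"^/subscriptions/([^/]+)", re.IGNORECASE)
_RESOURCE_GROUP_RE = re.compile(
    r"^/subscriptions/[^/]+/resourceGroups/([^/]+)", re.IGNORECASE
)


def subscription_id_from_arm_id(arm_id: str) -> Optional[str]:
    """Return the subscription segment of an ARM ID, or None if absent."""
    match = _SUBSCRIPTION_RE.match(arm_id or "")
    return match.group(1) if match else None


def resource_group_from_arm_id(arm_id: str) -> Optional[str]:
    """Return the resource-group segment of an ARM ID, or None if absent."""
    match = _RESOURCE_GROUP_RE.match(arm_id or "")
    return match.group(1) if match else None


def keep_resource(arm_id: str, rg_lods: dict, default_lod: str = "none") -> bool:
    """Hypothetical filter: keep a resource unless its effective LOD is NONE.

    rg_lods maps lower-cased resource-group names to an LOD string; resources
    outside any listed group fall back to default_lod.
    """
    resource_group = resource_group_from_arm_id(arm_id)
    lod = rg_lods.get(resource_group.lower(), default_lod) if resource_group else default_lod
    return lod.lower() != "none"
```

Filtering before the writer, as the commit describes, means dropped resources never need to be serialized at all.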

- Deleted the GitHub Issues documentation for requesting commands, reporting bugs, and requesting features.
- Removed the roadmap documentation that outlined project plans.
- Cleared the SUMMARY file which contained the table of contents for the documentation.
- Eliminated the introduction documentation for the User Guide.
- Deleted various image assets and diagrams that were no longer in use.
- Added new documentation for container development and high-level architecture, enhancing clarity on the project's structure and usage.
- Introduced workspace generation statistics documentation to provide insights into the resource discovery and generation process.
- Implemented support for namespace-level LODs in AKS clusters, allowing for more granular control over resource management.

- Updated the Azure resource type registry to include new resource types such as Azure Cosmos SQL databases, MySQL flexible servers, PostgreSQL databases, and Redis caches, improving the breadth of resource discovery.
- Refactored the Azure API indexer to support selective indexing, ensuring that only relevant resources are processed based on workspace configuration.
- Improved error handling for Azure Service Principal configuration, providing clearer feedback when required fields are missing.
- Enhanced documentation to clarify the new resource types and indexing logic, ensuring developers have up-to-date information on the indexing process and available resources.
- Updated Dockerfile and Python dependencies to support the new Azure SDK features, ensuring compatibility and stability in resource management.

- Updated Python version badge from 3.10 to 3.14.
- Reworked the project description to emphasize its role as a discovery tool for cloud and Kubernetes infrastructure, introducing the concept of "Skills."
- Restructured the Table of Contents for better navigation and clarity.
- Added detailed sections on discovery, skill tailoring, and the local explorer UI, enhancing user understanding of functionality.
- Removed outdated sections and streamlined content to focus on current features and usage.

cursor · 2026-05-28T19:02:33Z

+
+
+class DiffError(Exception):
+    pass


Unused DiffError exception class never raised

Low Severity

DiffError is defined but never raised or referenced anywhere in diff_resource_dump.py. It appears to be leftover scaffolding from an earlier design where errors were raised rather than collected into the differences list.

Reviewed by Cursor Bugbot for commit 0ed9cf5.


cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 2 total unresolved issues (including 1 from previous review).


cursor · 2026-05-28T19:17:10Z

+          fi
+        done
+        # Verify the indexer logged that selective discovery actually ran.
+        if grep -qE "selective discovery, in-scope RGs" run_sh_output.log container_logs.log 2>/dev/null; then


Assertions grep missing container_logs.log file

Medium Severity

The assert-selective and assert-tag-filter tasks grep both run_sh_output.log and container_logs.log for required indexer log patterns (e.g. selective discovery, skipped_tag_filter, skipped_lod_filter). However, the run-rwl-discovery task never creates container_logs.log — unlike the ADO fixture which explicitly runs docker logs "$CONTAINER_NAME" > container_logs.log 2>&1. If the indexer summary line is emitted by the container's main process rather than the exec'd run.sh, it won't appear in run_sh_output.log, and the assertion will incorrectly fail because container_logs.log doesn't exist.

Additional Locations (2)

.test/azure/no-aks-resources/Taskfile.yaml#L346-L347

.test/azure/no-aks-resources/Taskfile.yaml#L387-L388

Reviewed by Cursor Bugbot for commit 39ca6e5.
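The failure mode in this comment comes from `grep ... 2>/dev/null` silently skipping a file that was never written. A small Python sketch of the same check, which reports missing files separately instead of hiding them, might look like the following. `find_pattern` is a hypothetical helper and is not part of the Taskfile; the real assertions are shell.

```python
import re
from pathlib import Path


def find_pattern(pattern: str, paths: list) -> tuple:
    """Search each file for pattern; return (found, missing_files).

    Missing files are skipped, as grep with stderr suppressed would do,
    but they are also returned so the caller can surface them.
    """
    regex = re.compile(pattern)
    missing = []
    found = False
    for path in map(Path, paths):
        if not path.is_file():
            missing.append(str(path))
            continue
        with path.open(errors="replace") as handle:
            if any(regex.search(line) for line in handle):
                found = True
    return found, missing
```

Called with the pattern from the diff above and the two log names, a result of `(False, ["container_logs.log"])` would distinguish "marker never logged" from "log file never captured".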

…S smoke workflow

- Have every workspaceInfo.yaml emitted by .test/azure/aks-and-k8s/, .test/azure/aks-helm-installed-mi/, and .test/azure/aks-helm-installed-sp/ set 'azureIndexerBackend: azureapi' so the existing AKS/helm matrices exercise the native indexer instead of the legacy CloudQuery path.
- Make .test/azure/no-aks-resources/'s build-rwl tasks use a relative Dockerfile path so they work in CI runners as well as the dev container.
- Add .github/workflows/test-azure-indexer.yaml: a single-job, AKS-free smoke test that provisions the no-aks-resources terraform fixture, runs the existing ci-test-azureapi-baseline / -selective / -tag-filter tasks, then tears the infra down. Triggers on src/** + fixture changes and is also workflow_dispatch-able for ad-hoc verification.

Co-authored-by: Cursor <cursoragent@cursor.com>

… invocation in no-aks asserts

- src/indexers/azureapi_resource_types.py: the subscription-wide `_collect_redis_caches_all` collector was calling `RedisManagementClient.redis.list()`, which doesn't exist on `RedisOperations`. Switch to `list_by_subscription()`, the actual pager exposed by current azure-mgmt-redis. This was surfacing as a non-fatal "no attribute 'list'" warning during full-coverage runs.
- .test/azure/no-aks-resources/Taskfile.yaml: the `assert-baseline`, `assert-selective`, `assert-tag-filter`, and `generate-selective-config` tasks were running `terraform show -json terraform/terraform.tfstate` from the test root, but `terraform init` only ran inside ./terraform/. Without the provider plugin cache in the working dir, terraform aborts with "Failed to load plugin schemas" and jq errors out parsing the empty stdout. Switch to `terraform -chdir=terraform show -json terraform.tfstate` and capture the JSON once per task instead of re-shelling out per output.

Discovery itself was already passing end-to-end (76 SLXs against the new azureapi indexer); this fixes the assertion step in the new Discovery Azure Indexer Tests workflow.

Co-authored-by: Cursor <cursoragent@cursor.com>

The Azure SDK indexer was deciding which typed (rich-payload) collectors to invoke by walking AZURE_RESOURCE_TYPE_SPECS and checking whether each spec's resource_type_name or cloudquery_table_name appeared in the set of names referenced by loaded gen rules. That misses any gen rule that references a registered *alias* of a typed spec.

Concretely: contrib gen rules reference `azure_keyvault_keyvault`, but the Key Vault typed spec is canonicalized as `azure_keyvault_vaults` (rwl alias) with `azure_keyvault_keyvaults` as its CQ table. Neither matches, so the typed Key Vault collector was being skipped, and Key Vault resources never landed in the resource store. The same pattern silently affected any other aliased typed type.

Refactor the selection loop to dispatch every accessed name through ``find_spec`` (which already canonicalizes aliases) and bucket the result as either typed or generic. Mandatory typed specs (today just resource_group) are still seeded up-front. Behaviour for non-aliased gen-rule references is unchanged.

Also captures docker-side indexer stdout into container_logs.log in .test/azure/no-aks-resources/Taskfile.yaml so the assert-* tasks have a fresh, deterministic file to grep for "selective discovery" / "skipped_*_filter" markers (the FastAPI process emits those to docker stdout, not to run_sh_output.log).

Locally, all three end-to-end scenarios now pass against live infra:

task ci-test-azureapi-baseline -> 5/5 resources (incl. KeyVault)
task ci-test-azureapi-selective -> keep present, drop absent, mode logged
task ci-test-azureapi-tag-filter -> keep present, drop absent, skipped>0

Co-authored-by: Cursor <cursoragent@cursor.com>
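The alias problem described in this commit can be illustrated with a toy registry. Everything below is a simplified sketch under stated assumptions: the `Spec` dataclass, the `SPECS` list, the resource-group table name, and the rule that a name with no matching spec goes to the generic bucket are stand-ins for illustration, not the indexer's real data structures. Only the Key Vault names and the idea of resolving through `find_spec` come from the commit message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Spec:
    resource_type_name: str
    cloudquery_table_name: str
    aliases: tuple = ()
    mandatory: bool = False


# Hypothetical registry; the resource-group table name is a placeholder.
SPECS = [
    Spec("resource_group", "azure_resources_resource_groups", mandatory=True),
    Spec(
        "azure_keyvault_vaults",
        "azure_keyvault_keyvaults",
        aliases=("azure_keyvault_keyvault",),
    ),
]


def find_spec(name: str) -> Optional[Spec]:
    """Resolve a canonical name, CQ table name, or alias to its spec."""
    for spec in SPECS:
        if name in (spec.resource_type_name, spec.cloudquery_table_name, *spec.aliases):
            return spec
    return None


def select_collectors(accessed_names: set) -> tuple:
    """Bucket every name referenced by gen rules as typed or generic."""
    typed = {spec.resource_type_name for spec in SPECS if spec.mandatory}
    generic = set()
    for name in accessed_names:
        spec = find_spec(name)
        if spec is not None:
            typed.add(spec.resource_type_name)  # aliases collapse to the canonical name
        else:
            generic.add(name)
    return typed, generic
```

With this shape, a gen rule that references `azure_keyvault_keyvault` selects the typed `azure_keyvault_vaults` collector, which the earlier name-membership check would have missed.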

The check-and-cleanup-terraform wrapper used `tee /dev/tty` to mirror the infra-check output to the human running the task. GitHub Actions runners have no controlling tty, so the pipe failed with `tee: /dev/tty: No such device or address`, the wrapper exited non-zero, and `terraform destroy` never ran. The job was left red despite all three azureapi assertions passing, and Azure resources from the run were leaked.

Replace `| tee /dev/tty` with `echo "$out"` after capturing the check output, which works in both interactive shells and CI.

Co-authored-by: Cursor <cursoragent@cursor.com>

- Added a section in the README to introduce the built-in Model Context Protocol (MCP) server, detailing its functionality and usage for AI agents.
- Updated the documentation structure to include references to the MCP server in relevant sections, improving discoverability.
- Introduced lifespan management for the MCP server in the FastAPI application, allowing for better integration and control over its lifecycle.
- Added new dependencies related to the MCP server in the poetry.lock and pyproject.toml files, ensuring compatibility with the latest features.

- Introduced documentation for the GCP indexer, detailing its functionality and configuration options, including the new `gcpIndexerBackend` settings.
- Updated the `README.md` to include a link to the GCP indexer internals.
- Modified the component initialization in `component.py` to include the `gcpapi` indexer.
- Enhanced the `run.py` and `run.sh` scripts to support the new GCP indexer backend configuration.
- Updated dependencies in `pyproject.toml` and `poetry.lock` to include necessary Google Cloud libraries.

- Introduced the native AWS indexer (`awsapi`) alongside the existing CloudQuery-backed indexer, allowing for more direct resource discovery using the AWS Cloud Control API and `boto3` SDK.
- Updated the `aws.md` documentation to detail the new indexing options and configuration in `workspaceInfo.yaml`.
- Enhanced the `Taskfile.yaml` with new tasks for generating baseline configurations and asserting AWS resource discovery.
- Modified component initialization to include the `awsapi` indexer and updated relevant scripts to support the new backend configuration.
- Added links to AWS indexer internals in the architecture documentation for better discoverability.
- Improved handling of AWS credentials in the indexing process to support both file paths and inline content.

stewartshea added 5 commits May 26, 2026 02:26

stewartshea marked this pull request as ready for review May 28, 2026 18:49

cursor Bot reviewed May 28, 2026


Merge remote-tracking branch 'origin/main' into feat/RW-1123

39ca6e5

Co-authored-by: Cursor <cursoragent@cursor.com> # Conflicts: # src/VERSION

stewartshea marked this pull request as draft May 28, 2026 19:12

cursor Bot reviewed May 28, 2026


stewartshea and others added 8 commits May 28, 2026 19:37

remove upload test until it is refactored after crdless

e10baa9


Feat/RW-1123 #800
stewartshea wants to merge 15 commits into main from feat/RW-1123
