Skip to content

infra PR2: add GCP Terraform baseline modules#17

Merged
hudsonaikins merged 12 commits into
mainfrom
codex/infra-pr2-terraform-gcp-modules
Feb 26, 2026
Merged

infra PR2: add GCP Terraform baseline modules#17
hudsonaikins merged 12 commits into
mainfrom
codex/infra-pr2-terraform-gcp-modules

Conversation

@hudsonaikins
Copy link
Copy Markdown
Contributor

@hudsonaikins hudsonaikins commented Feb 26, 2026

Summary

  • add infra/terraform baseline for GCP with pinned provider/core versions
  • add reusable modules:
    • network (VPC, subnet, baseline firewall)
    • runner_vm (Compute Engine runner + service account support)
    • secrets (Secret Manager containers + IAM access grants)
    • observability (log metric + optional alert policy)
  • commit root .terraform.lock.hcl
  • add module contract documentation in infra/terraform/README.md

Validation

  • terraform init -backend=false (in infra/terraform)
  • terraform fmt -check -recursive (in infra/terraform)
  • terraform validate (root + each module)

Greptile Summary

This PR establishes a solid Terraform foundation for GCP infrastructure with four modular components:

Core Changes:

  • network module: Creates VPC with custom subnet, private Google access, and firewall rules for internal traffic and SSH
  • runner_vm module: Provisions Compute Engine instances with optional service accounts, Docker bootstrap script, shielded instance config, and optional public IPs
  • secrets module: Manages Secret Manager containers and grants runner service account access via IAM bindings
  • observability module: Sets up log-based metrics for error tracking with optional alert policies

Notable Security Features:

  • Shielded instance configuration enabled (secure boot, vTPM, integrity monitoring)
  • Restrictive default service account scopes (logging, monitoring, storage read-only)
  • Hardened precondition validating service account email when using existing accounts
  • Startup script adds debian/ubuntu users to docker group for proper permissions

Documentation:

  • Comprehensive README with module contracts, input/output specifications, and usage examples
  • Provider lock file committed for reproducible deployments
  • Version constraints pinned (Terraform >= 1.5.0, Google provider ~> 5.0)

The code addresses many concerns from previous review threads - the secrets module exists, docker permissions are handled, public IP is optional, and service account scopes are appropriately restrictive.

Confidence Score: 4/5

  • This PR is safe to merge with minimal risk - well-structured infrastructure code with good security practices
  • Score reflects solid implementation with comprehensive documentation and security best practices. Deducted one point for lack of input validations and because this is baseline infrastructure requiring environment-specific configuration (backend, actual deployment) in follow-up PRs. The code has addressed previous review concerns about docker permissions, service account handling, and public IP assignment.
  • No files require special attention - all modules are well-implemented with appropriate defaults and documentation

Important Files Changed

Filename Overview
infra/terraform/README.md Comprehensive documentation for all four modules with input/output contracts, usage examples, and architectural intent
infra/terraform/versions.tf Root module version constraints requiring Terraform >= 1.5.0 and Google provider ~> 5.0
infra/terraform/modules/network/main.tf Creates VPC, subnet with private Google access, and firewall rules for internal traffic and SSH. Internal firewall allows all ICMP from subnet CIDR which could be more restrictive.
infra/terraform/modules/runner_vm/main.tf Creates compute instance with optional service account, shielded instance config, and hardened precondition that validates service_account_email when not creating a new account
infra/terraform/modules/secrets/main.tf Creates Secret Manager secret containers with auto-replication and grants secretAccessor role to runner service account
infra/terraform/modules/observability/main.tf Creates log-based metric for runner errors with optional alert policy that triggers on threshold breach

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[VPC Network] --> B[Subnet with Private Google Access]
    B --> C[Runner VM Instance]
    C --> D[Service Account]
    D --> E[IAM Scopes: logging, monitoring, storage]
    C --> F[Shielded Instance Config]
    C --> G[Optional Public IP]
    H[Secret Manager Secrets] --> I[IAM Binding]
    I --> D
    J[Log Metric] --> K{Error Logs?}
    K -->|Yes| L[Alert Policy]
    L --> M[Notification Channels]
    N[Firewall Rules] --> B
    N --> O[SSH Access]
    N --> P[Internal ICMP + Ports]
    C --> Q[Docker Bootstrap Script]
    Q --> R[Add users to docker group]
Loading

Last reviewed commit: 77ce7ee

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Feb 26, 2026

📚 Documentation Status

ℹ️ No documentation-related changes detected


This comment is automatically generated by the documentation workflow.

Copy link
Copy Markdown

@greptile-apps greptile-apps Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

16 files reviewed, 14 comments

Edit Code Review Agent Settings | Greptile

Comment thread infra/terraform/README.md
Comment on lines +51 to +57
- `project_id` (string)
- `secret_ids` (set(string))
- `runner_service_account_email` (string)

Outputs:
- `secret_ids`
- `secret_resource_ids`
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The secrets module is documented but wasn't actually added in this PR

Suggested change
- `project_id` (string)
- `secret_ids` (set(string))
- `runner_service_account_email` (string)
Outputs:
- `secret_ids`
- `secret_resource_ids`
Prompt To Fix With AI
This is a comment left during a code review.
Path: infra/terraform/README.md
Line: 51-57

Comment:
The `secrets` module is documented but wasn't actually added in this PR

```suggestion
```

How can I resolve this? If you propose a fix, please make it concise.

Comment thread infra/terraform/README.md
Comment on lines +14 to +29
## Module contracts

### `network`
Inputs:
- `project_id` (string)
- `region` (string)
- `network_name` (string)
- `subnet_name` (string)
- `subnet_cidr` (string)
- `allow_ssh_cidrs` (list(string), default `[]`)

Outputs:
- `network_name`
- `network_self_link`
- `subnetwork_name`
- `subnetwork_self_link`
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

enable_private_google_access is missing from the documented inputs (used in modules/network/variables.tf and modules/network/main.tf:14)

Prompt To Fix With AI
This is a comment left during a code review.
Path: infra/terraform/README.md
Line: 14-29

Comment:
`enable_private_google_access` is missing from the documented inputs (used in `modules/network/variables.tf` and `modules/network/main.tf:14`)

How can I resolve this? If you propose a fix, please make it concise.

Comment thread infra/terraform/README.md
Comment on lines +31 to +46
### `runner_vm`
Inputs:
- `project_id` (string)
- `zone` (string)
- `instance_name` (string)
- `machine_type` (string, default `e2-standard-2`)
- `network_self_link` (string)
- `subnetwork_self_link` (string)
- `create_service_account` (bool, default `true`)
- `service_account_email` (string, optional when `create_service_account=true`)
- `startup_script` (string, optional)

Outputs:
- `instance_name`
- `instance_self_link`
- `instance_external_ip`
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing several variables from documentation: service_account_id, metadata, tags, boot_image, boot_disk_size_gb, boot_disk_type (all defined in modules/runner_vm/variables.tf)

Prompt To Fix With AI
This is a comment left during a code review.
Path: infra/terraform/README.md
Line: 31-46

Comment:
Missing several variables from documentation: `service_account_id`, `metadata`, `tags`, `boot_image`, `boot_disk_size_gb`, `boot_disk_type` (all defined in `modules/runner_vm/variables.tf`)

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown

@greptile-apps greptile-apps Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

12 files reviewed, 5 comments

Edit Code Review Agent Settings | Greptile

Comment thread infra/terraform/README.md

- `modules/network`: VPC, subnet, and baseline firewall rules.
- `modules/runner_vm`: Compute Engine runner VM with Docker-friendly bootstrap hooks.
- `modules/secrets`: Secret Manager secret containers and runner service account access grants.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The modules/secrets module is documented here but doesn't exist in this PR. Either remove this documentation or add the missing module.

Suggested change
- `modules/secrets`: Secret Manager secret containers and runner service account access grants.
- `modules/observability`: Log-based metric and optional alert policy for runtime errors.
Prompt To Fix With AI
This is a comment left during a code review.
Path: infra/terraform/README.md
Line: 11

Comment:
The `modules/secrets` module is documented here but doesn't exist in this PR. Either remove this documentation or add the missing module.

```suggestion
- `modules/observability`: Log-based metric and optional alert policy for runtime errors.
```

How can I resolve this? If you propose a fix, please make it concise.

Comment thread infra/terraform/README.md
Comment on lines +49 to +58
### `secrets`
Inputs:
- `project_id` (string)
- `secret_ids` (set(string))
- `runner_service_account_email` (string)

Outputs:
- `secret_ids`
- `secret_resource_ids`

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documentation for secrets module exists but the module itself is missing from this PR. Remove this entire section or implement the module.

Suggested change
### `secrets`
Inputs:
- `project_id` (string)
- `secret_ids` (set(string))
- `runner_service_account_email` (string)
Outputs:
- `secret_ids`
- `secret_resource_ids`
### `observability`
Prompt To Fix With AI
This is a comment left during a code review.
Path: infra/terraform/README.md
Line: 49-58

Comment:
Documentation for `secrets` module exists but the module itself is missing from this PR. Remove this entire section or implement the module.

```suggestion
### `observability`
```

How can I resolve this? If you propose a fix, please make it concise.

Comment thread infra/terraform/README.md
- `network_name` (string)
- `subnet_name` (string)
- `subnet_cidr` (string)
- `allow_ssh_cidrs` (list(string), default `[]`)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing enable_private_google_access input (bool, default true) from the module contract

Prompt To Fix With AI
This is a comment left during a code review.
Path: infra/terraform/README.md
Line: 23

Comment:
Missing `enable_private_google_access` input (bool, default `true`) from the module contract

How can I resolve this? If you propose a fix, please make it concise.

Comment thread infra/terraform/README.md
Comment on lines +31 to +42
### `runner_vm`
Inputs:
- `project_id` (string)
- `zone` (string)
- `instance_name` (string)
- `machine_type` (string, default `e2-standard-2`)
- `network_self_link` (string)
- `subnetwork_self_link` (string)
- `create_service_account` (bool, default `true`)
- `service_account_email` (string, optional when `create_service_account=true`)
- `startup_script` (string, optional)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing several inputs from the module contract: service_account_id, metadata, tags, boot_image, boot_disk_size_gb, boot_disk_type. These are all optional with defaults but should be documented for completeness.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Prompt To Fix With AI
This is a comment left during a code review.
Path: infra/terraform/README.md
Line: 31-42

Comment:
Missing several inputs from the module contract: `service_account_id`, `metadata`, `tags`, `boot_image`, `boot_disk_size_gb`, `boot_disk_type`. These are all optional with defaults but should be documented for completeness.

<sub>Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!</sub>

How can I resolve this? If you propose a fix, please make it concise.


service_account {
email = local.resolved_service_account_email
scopes = ["https://www.googleapis.com/auth/cloud-platform"]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using cloud-platform scope grants full access to all GCP APIs. Consider using more restrictive scopes based on actual needs (e.g., compute-rw, logging-write, monitoring-write) to follow the principle of least privilege.

Prompt To Fix With AI
This is a comment left during a code review.
Path: infra/terraform/modules/runner_vm/main.tf
Line: 41

Comment:
Using `cloud-platform` scope grants full access to all GCP APIs. Consider using more restrictive scopes based on actual needs (e.g., `compute-rw`, `logging-write`, `monitoring-write`) to follow the principle of least privilege.

How can I resolve this? If you propose a fix, please make it concise.

Comment thread infra/terraform/README.md Outdated
- `create_service_account` (bool, default `true`)
- `service_account_id` (string, default `neural-runner`)
- `service_account_email` (string, required when `create_service_account=false`)
- `service_account_scopes` (list(string), default least-privilege runner scopes)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

documentation claims "least-privilege runner scopes" but the actual default includes full secretmanager scope (line 57 of modules/runner_vm/variables.tf), which grants access to all Secret Manager operations

Prompt To Fix With AI
This is a comment left during a code review.
Path: infra/terraform/README.md
Line: 43

Comment:
documentation claims "least-privilege runner scopes" but the actual default includes full `secretmanager` scope (line 57 of `modules/runner_vm/variables.tf`), which grants access to all Secret Manager operations

How can I resolve this? If you propose a fix, please make it concise.

network = var.network_self_link
subnetwork = var.subnetwork_self_link

access_config {}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

always assigns a public IP - consider making this optional via a variable for deployments that don't require external access

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Prompt To Fix With AI
This is a comment left during a code review.
Path: infra/terraform/modules/runner_vm/main.tf
Line: 32

Comment:
always assigns a public IP - consider making this optional via a variable for deployments that don't require external access

<sub>Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!</sub>

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +61 to +72
variable "startup_script" {
description = "Startup script for VM bootstrap"
type = string
default = <<-EOT
#!/usr/bin/env bash
set -euo pipefail

apt-get update
apt-get install -y docker.io
systemctl enable docker
systemctl start docker
EOT
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

startup script installs docker but doesn't add any users to the docker group - only root can run docker commands without sudo

Prompt To Fix With AI
This is a comment left during a code review.
Path: infra/terraform/modules/runner_vm/variables.tf
Line: 61-72

Comment:
startup script installs docker but doesn't add any users to the `docker` group - only root can run docker commands without sudo

How can I resolve this? If you propose a fix, please make it concise.


output "instance_external_ip" {
description = "External IP address for the runner VM"
value = google_compute_instance.runner.network_interface[0].access_config[0].nat_ip
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

access_config[0] will error when assign_public_ip=false (empty list)

Suggested change
value = google_compute_instance.runner.network_interface[0].access_config[0].nat_ip
value = length(google_compute_instance.runner.network_interface[0].access_config) > 0 ? google_compute_instance.runner.network_interface[0].access_config[0].nat_ip : null
Prompt To Fix With AI
This is a comment left during a code review.
Path: infra/terraform/modules/runner_vm/outputs.tf
Line: 13

Comment:
`access_config[0]` will error when `assign_public_ip=false` (empty list)

```suggestion
  value       = length(google_compute_instance.runner.network_interface[0].access_config) > 0 ? google_compute_instance.runner.network_interface[0].access_config[0].nat_ip : null
```

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +17 to +34
resource "google_compute_firewall" "allow_internal" {
project = var.project_id
name = "${var.network_name}-allow-internal"
network = google_compute_network.this.name

allow {
protocol = "tcp"
}

allow {
protocol = "udp"
}

allow {
protocol = "icmp"
}

source_ranges = [var.subnet_cidr]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

firewall rule allows ALL TCP and UDP ports from subnet CIDR without restrictions - consider limiting to specific ports needed for internal services

Suggested change
resource "google_compute_firewall" "allow_internal" {
project = var.project_id
name = "${var.network_name}-allow-internal"
network = google_compute_network.this.name
allow {
protocol = "tcp"
}
allow {
protocol = "udp"
}
allow {
protocol = "icmp"
}
source_ranges = [var.subnet_cidr]
resource "google_compute_firewall" "allow_internal" {
project = var.project_id
name = "${var.network_name}-allow-internal"
network = google_compute_network.this.name
allow {
protocol = "icmp"
}
# Add specific port ranges as needed, e.g.:
# allow {
# protocol = "tcp"
# ports = ["80", "443", "8080"]
# }
source_ranges = [var.subnet_cidr]
}
Prompt To Fix With AI
This is a comment left during a code review.
Path: infra/terraform/modules/network/main.tf
Line: 17-34

Comment:
firewall rule allows ALL TCP and UDP ports from subnet CIDR without restrictions - consider limiting to specific ports needed for internal services

```suggestion
resource "google_compute_firewall" "allow_internal" {
  project = var.project_id
  name    = "${var.network_name}-allow-internal"
  network = google_compute_network.this.name

  allow {
    protocol = "icmp"
  }

  # Add specific port ranges as needed, e.g.:
  # allow {
  #   protocol = "tcp"
  #   ports    = ["80", "443", "8080"]
  # }

  source_ranges = [var.subnet_cidr]
}
```

How can I resolve this? If you propose a fix, please make it concise.


filter = <<-EOT
resource.type="gce_instance"
resource.labels.instance_id:*
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trailing whitespace after resource.labels.instance_id:*

Suggested change
resource.labels.instance_id:*
resource.labels.instance_id:*

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Prompt To Fix With AI
This is a comment left during a code review.
Path: infra/terraform/modules/observability/main.tf
Line: 7

Comment:
trailing whitespace after `resource.labels.instance_id:*`

```suggestion
    resource.labels.instance_id:*
```

<sub>Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!</sub>

How can I resolve this? If you propose a fix, please make it concise.

@@ -1,3 +1,5 @@
"""Helpers for resolving Polymarket US auth credentials from env/files."""
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

polymarket auth changes (docstrings/formatting) are unrelated to the Terraform infrastructure PR - consider separating into a dedicated commit for cleaner git history

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Prompt To Fix With AI
This is a comment left during a code review.
Path: neural/auth/polymarket_us_env.py
Line: 1

Comment:
polymarket auth changes (docstrings/formatting) are unrelated to the Terraform infrastructure PR - consider separating into a dedicated commit for cleaner git history

<sub>Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!</sub>

How can I resolve this? If you propose a fix, please make it concise.


lifecycle {
precondition {
condition = var.create_service_account || var.service_account_email != null
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

precondition allows empty string for service_account_email (empty string != null), which would cause runtime failure. Add empty string check:

Suggested change
condition = var.create_service_account || var.service_account_email != null
condition = var.create_service_account || (var.service_account_email != null && var.service_account_email != "")
Prompt To Fix With AI
This is a comment left during a code review.
Path: infra/terraform/modules/runner_vm/main.tf
Line: 55

Comment:
precondition allows empty string for `service_account_email` (empty string != null), which would cause runtime failure. Add empty string check:

```suggestion
      condition     = var.create_service_account || (var.service_account_email != null && var.service_account_email != "")
```

How can I resolve this? If you propose a fix, please make it concise.

Comment thread infra/terraform/README.md

- `modules/network`: VPC, subnet, and baseline firewall rules.
- `modules/runner_vm`: Compute Engine runner VM with Docker-friendly bootstrap hooks.
- `modules/secrets`: Secret Manager secret containers and runner service account access grants.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

modules/secrets is documented but doesn't exist in this PR - remove from list or add the module

Prompt To Fix With AI
This is a comment left during a code review.
Path: infra/terraform/README.md
Line: 11

Comment:
`modules/secrets` is documented but doesn't exist in this PR - remove from list or add the module

How can I resolve this? If you propose a fix, please make it concise.

Comment thread infra/terraform/README.md
Comment on lines +60 to +68
### `secrets`
Inputs:
- `project_id` (string)
- `secret_ids` (set(string))
- `runner_service_account_email` (string)

Outputs:
- `secret_ids`
- `secret_resource_ids`
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

entire secrets module section should be removed - the module doesn't exist in this PR

Prompt To Fix With AI
This is a comment left during a code review.
Path: infra/terraform/README.md
Line: 60-68

Comment:
entire `secrets` module section should be removed - the module doesn't exist in this PR

How can I resolve this? If you propose a fix, please make it concise.

@hudsonaikins hudsonaikins merged commit 73bb517 into main Feb 26, 2026
8 checks passed
Comment on lines +45 to +58
resource "google_compute_firewall" "allow_ssh" {
count = length(var.allow_ssh_cidrs) > 0 ? 1 : 0

project = var.project_id
name = "${var.network_name}-allow-ssh"
network = google_compute_network.this.name

allow {
protocol = "tcp"
ports = ["22"]
}

source_ranges = var.allow_ssh_cidrs
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SSH firewall rule applies to all instances in the VPC - add target_tags to restrict SSH access to only tagged instances (e.g., instances with the "neural-runner" tag)

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Prompt To Fix With AI
This is a comment left during a code review.
Path: infra/terraform/modules/network/main.tf
Line: 45-58

Comment:
SSH firewall rule applies to all instances in the VPC - add `target_tags` to restrict SSH access to only tagged instances (e.g., instances with the "neural-runner" tag)

<sub>Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!</sub>

How can I resolve this? If you propose a fix, please make it concise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant