fix: validate repo parameter before cache and GitHub API calls#1767

Merged
Priyanshu-byte-coder merged 1 commit into
Priyanshu-byte-coder:mainfrom
Ridanshi:fix/repo-analytics-input-validation
May 31, 2026

Conversation

@Ridanshi
Contributor

Closes #1700

Problem

GET /api/metrics/repo-analytics accepted arbitrary values for the ?repo= query parameter and interpolated them directly into GitHub API URL paths and cache keys with no format validation:

GET /api/metrics/repo-analytics?repo=octocat/Hello-World/issues

would silently construct and call:

GET https://api.github.com/repos/octocat/Hello-World/issues
GET https://api.github.com/repos/octocat/Hello-World/issues/contributors
GET https://api.github.com/repos/octocat/Hello-World/issues/languages
GET https://api.github.com/repos/octocat/Hello-World/issues/stats/commit_activity

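Only the first of these is a real endpoint, and it is the wrong one (the issue list, not repository metadata). The other three are not the contributors, languages, or commit-activity endpoints of any repository. The only guard was if (!repoParam), which passes for any non-empty string.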

Consequences confirmed by code reading:

  1. Extra path segments reach unintended GitHub API endpoints (the reported case: owner/repo/issues)
  2. Path traversal patterns such as owner/.. are accepted and forwarded
  3. Cache pollution: the raw value lands in the cache key (repo-analytics-${repoParam}), so malformed inputs create permanent cache entries for non-existent resources
  4. Unnecessary rate-limit consumption: GitHub API quota is spent on requests that will always fail or return wrong data

Root cause

Five lines in src/app/api/metrics/repo-analytics/route.ts interpolate the raw parameter:

// Line 23: raw input in cache key
const key = metricsCacheKey(/* … */, `repo-analytics-${repoParam}` as any, /* … */);

// Lines 27, 34, 40, 57: raw input in URL paths
fetch(`${GITHUB_API}/repos/${repoParam}`, /* … */)
fetch(`${GITHUB_API}/repos/${repoParam}/contributors?per_page=10`, /* … */)
fetch(`${GITHUB_API}/repos/${repoParam}/languages`, /* … */)
fetch(`${GITHUB_API}/repos/${repoParam}/stats/commit_activity`, /* … */)

Fix

src/app/api/metrics/repo-analytics/route.ts

Introduces parseRepoParam(raw) — validates the raw string against a strict regex before any other processing:

REPO_IDENTIFIER_RE = /^([a-zA-Z0-9](?:[a-zA-Z0-9-]{0,37}[a-zA-Z0-9])?)\/([a-zA-Z0-9._-]{1,100})$/

Rules enforced:

  • owner: 1–39 chars, alphanumeric + hyphens, cannot start or end with a hyphen (mirrors GitHub's account name rules)
  • repo: 1–100 chars, alphanumeric, dots, hyphens, underscores
  • Exactly one slash between them — any extra segment is rejected
  • . and .. are explicitly excluded as repo names even though they match the character set
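
The rules above can be sketched as follows. This is a minimal illustration, not the merged implementation: the regex is copied from this description, but the RepoIdentifier return shape and the null-on-failure convention are assumptions, since the PR text does not show the real signature.

```typescript
// Regex copied verbatim from the PR description.
const REPO_IDENTIFIER_RE =
  /^([a-zA-Z0-9](?:[a-zA-Z0-9-]{0,37}[a-zA-Z0-9])?)\/([a-zA-Z0-9._-]{1,100})$/;

// Assumed return shape; the real function may differ.
interface RepoIdentifier {
  owner: string;
  repo: string;
}

function parseRepoParam(raw: string | null | undefined): RepoIdentifier | null {
  if (typeof raw !== "string") return null;

  // Trimming is implied by the "whitespace trimming" test category.
  const match = REPO_IDENTIFIER_RE.exec(raw.trim());
  if (match === null) return null;

  const owner = match[1];
  const repo = match[2];
  if (owner === undefined || repo === undefined) return null;

  // "." and ".." satisfy the repo character class, so reject them explicitly.
  if (repo === "." || repo === "..") return null;

  return { owner, repo };
}

console.log(parseRepoParam("octocat/Hello-World"));        // accepted: owner and repo parsed
console.log(parseRepoParam("octocat/Hello-World/issues")); // null: extra path segment
console.log(parseRepoParam("owner/.."));                   // null: traversal pattern
```

Because neither character class contains a slash, the anchored pattern can only match a string with exactly one separator, which is what rejects the reported owner/repo/issues case.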

On validation failure → 400 Bad Request is returned immediately, before the cache key is computed or any fetch() is called.

GitHub API URLs are then built from encodeURIComponent(owner)/encodeURIComponent(repo) so each path segment is individually URL-safe.
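
A rough sketch of the per-segment encoding, assuming GITHUB_API is https://api.github.com (as the URLs in the Problem section suggest). The helper name repoApiUrls is hypothetical and only groups the four paths listed under Root cause.

```typescript
// Assumed value of the route's GITHUB_API constant.
const GITHUB_API = "https://api.github.com";

// Hypothetical helper: builds the four GitHub API URLs from validated parts.
function repoApiUrls(owner: string, repo: string): string[] {
  // Each segment is encoded on its own, so a "/" or "?" inside a segment
  // is escaped instead of changing the shape of the path.
  const base = `${GITHUB_API}/repos/${encodeURIComponent(owner)}/${encodeURIComponent(repo)}`;
  return [
    base,
    `${base}/contributors?per_page=10`,
    `${base}/languages`,
    `${base}/stats/commit_activity`,
  ];
}

console.log(repoApiUrls("octocat", "Hello-World"));
```

Note that encodeURIComponent leaves dots untouched, so encoding alone would not neutralize a ".." segment. That is one reason the explicit "." and ".." rejection is still needed alongside it.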

Request lifecycle (after fix)

Request arrives
  → parseRepoParam()  ← new validation gate
      ├── invalid  →  400 (no cache key, no fetch)
      └── valid    →  metricsCacheKey()  →  withMetricsCache()  →  fetch()
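
The ordering in the diagram can be illustrated with a framework-free sketch. Everything here is an assumption made for illustration: handleRepoAnalytics, the Loader type (standing in for metricsCacheKey(), withMetricsCache() and fetch() together), the result shape, and the error text are hypothetical, and the session check is omitted.

```typescript
const REPO_IDENTIFIER_RE =
  /^([a-zA-Z0-9](?:[a-zA-Z0-9-]{0,37}[a-zA-Z0-9])?)\/([a-zA-Z0-9._-]{1,100})$/;

// Hypothetical stand-in for the cache lookup plus the GitHub requests.
type Loader = (cacheKey: string, owner: string, repo: string) => Promise<unknown>;

interface RouteResult {
  status: number;
  body: unknown;
}

async function handleRepoAnalytics(rawRepo: string | null, load: Loader): Promise<RouteResult> {
  const match = REPO_IDENTIFIER_RE.exec((rawRepo ?? "").trim());
  const owner = match?.[1];
  const repo = match?.[2];

  // Validation gate: return before any cache key exists or any request is made.
  if (owner === undefined || repo === undefined || repo === "." || repo === "..") {
    return { status: 400, body: { error: "Invalid repo parameter" } };
  }

  // The key is derived from the validated, trimmed pair, never the raw input.
  const cacheKey = `repo-analytics-${owner}/${repo}`;
  return { status: 200, body: await load(cacheKey, owner, repo) };
}

// Usage: a recording loader shows that the invalid path never touches the cache.
const seenKeys: string[] = [];
const recordingLoad: Loader = async (cacheKey) => {
  seenKeys.push(cacheKey);
  return { cached: false };
};

handleRepoAnalytics("octocat/Hello-World/issues", recordingLoad).then((res) => {
  console.log(res.status, seenKeys.length); // logs: 400 0
});
```

Injecting the loader keeps the gate testable without network access, which mirrors the "none of the 400 paths call fetch() or withMetricsCache()" assertions described in the tests.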

Tests

test/repo-analytics-validation.test.ts — 37 new tests:

| Category | Cases |
| --- | --- |
| Valid inputs | Standard, single-char, max-length owner (39), max-length repo (100), dots, hyphens, underscores, whitespace trimming |
| Missing segments | No slash, empty string, whitespace-only, trailing slash, leading slash |
| Extra path segments (regression) | Three-segment path (owner/repo/issues), four-segment path |
| Path traversal | owner/.., owner/., ../../../admin, double leading slash |
| Invalid characters | Spaces, @, ?, null bytes |
| Length violations | Owner 40 chars, repo 101 chars |
| Hyphen rules | Owner starts with hyphen, owner ends with hyphen |
| Route 401/400 integration | No session, missing param, each malformed category |
| Fetch/cache not called | All 400 paths confirmed to skip fetch() and withMetricsCache() |
| Valid path proceeds | withMetricsCache is called; cache key uses validated, trimmed form |
| Cache key correctness | Raw whitespace-padded input maps to the same clean cache key |

All 37 pass. The one pre-existing failure in test/dateUtils.test.ts (timezone boundary) is unrelated to this change.
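
As an illustration of the length and segment categories, a table-driven check against the regex alone might look like this. It is a sketch, not an excerpt from test/repo-analytics-validation.test.ts; the BoundaryCase shape and the case labels are made up for the example.

```typescript
const REPO_IDENTIFIER_RE =
  /^([a-zA-Z0-9](?:[a-zA-Z0-9-]{0,37}[a-zA-Z0-9])?)\/([a-zA-Z0-9._-]{1,100})$/;

// Hypothetical case shape for this example.
interface BoundaryCase {
  label: string;
  input: string;
  expected: boolean;
}

const boundaryCases: BoundaryCase[] = [
  { label: "owner at 39 chars", input: `${"a".repeat(39)}/repo`, expected: true },
  { label: "owner at 40 chars", input: `${"a".repeat(40)}/repo`, expected: false },
  { label: "repo at 100 chars", input: `owner/${"r".repeat(100)}`, expected: true },
  { label: "repo at 101 chars", input: `owner/${"r".repeat(101)}`, expected: false },
  { label: "three segments", input: "owner/repo/issues", expected: false },
  { label: "null byte", input: "owner/re\0po", expected: false },
];

// Collect every case whose regex result disagrees with the expectation.
const failures = boundaryCases.filter(
  (c) => REPO_IDENTIFIER_RE.test(c.input) !== c.expected,
);

console.log(failures.length === 0 ? "all boundary cases match" : failures);
```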

The repo-analytics endpoint accepted arbitrary values for the ?repo=
query parameter and inserted them directly into GitHub API URLs and
cache keys without any format validation:

  GET /api/metrics/repo-analytics?repo=octocat/Hello-World/issues

would produce requests to:

  GET https://api.github.com/repos/octocat/Hello-World/issues
  GET https://api.github.com/repos/octocat/Hello-World/issues/contributors
  ...

constructing entirely different GitHub API endpoints than intended.
Extra path segments, path traversal patterns ("owner/.."), and other
malformed values were also accepted, consuming GitHub API rate-limit
quota and polluting the metrics cache.

Changes to src/app/api/metrics/repo-analytics/route.ts:
- Export parseRepoParam() — validates the raw string against a strict
  regex that accepts exactly "owner/repo" where:
    * owner: 1–39 chars, alphanumeric + hyphens, not starting/ending
      with a hyphen (mirrors GitHub's username rules)
    * repo: 1–100 chars, alphanumeric + dots + hyphens + underscores
    * exactly one slash separator, no additional segments
  Additionally rejects "." and ".." as repo names.
- Return 400 immediately when validation fails — before computing the
  cache key or calling fetch().
- Build GitHub API URLs from encodeURIComponent(owner)/encodeURIComponent(repo)
  so each path segment is individually safe even if the regex is ever
  relaxed in the future.
- Use the validated, trimmed owner/repo pair in the cache key rather than
  the raw query-string value.

test/repo-analytics-validation.test.ts — 37 new tests:
  parseRepoParam unit tests:
  - valid standard/edge cases (single chars, hyphens, dots, max lengths)
  - whitespace trimming
  - missing/empty segments
  - extra path segments (the reported issue, regression for Priyanshu-byte-coder#1700)
  - path traversal: "..", ".", "../../admin"
  - invalid characters: spaces, @, ?, null bytes
  - length violations: owner >39, repo >100
  - owner hyphen rules: leading/trailing hyphens

  GET route integration tests:
  - 401 without session
  - 400 for missing parameter (no fetch or cache calls)
  - 400 for three-segment path — regression for Priyanshu-byte-coder#1700
  - 400 for bare name, path traversal, hyphen violations, whitespace
  - none of the 400 paths call fetch() or withMetricsCache()
  - valid owner/repo reaches withMetricsCache
  - cache key uses validated form, not raw input
  - invalid input never generates a cache key

Closes Priyanshu-byte-coder#1700
@vercel

vercel Bot commented May 31, 2026

@Ridanshi is attempting to deploy a commit to the PRIYANSHU DOSHI's projects Team on Vercel.

A member of the Team first needs to authorize it.

@github-actions github-actions Bot added gssoc26 GSSoC 2026 contribution type:bug GSSoC type bonus: bug fix type:testing GSSoC type bonus: tests (+10 pts) labels May 31, 2026
@github-actions

GSSoC Label Checklist 🏷️

@Priyanshu-byte-coder — please apply the appropriate labels before merging:

Difficulty (pick one):

  • level:beginner — 20 pts
  • level:intermediate — 35 pts
  • level:advanced — 55 pts
  • level:critical — 80 pts

Quality (optional):

  • quality:clean — ×1.2 multiplier
  • quality:exceptional — ×1.5 multiplier

Validation (required to score):

  • gssoc:approved — counts for points
  • gssoc:invalid / gssoc:spam / gssoc:ai-slop — does not score

Type labels (type:*) are auto-detected from files and title. Review and adjust if needed.
Points formula: (difficulty × quality_multiplier) + type_bonus

@Priyanshu-byte-coder Priyanshu-byte-coder added level2 GSSoC Level 2 - Medium complexity (25 points) gssoc:approved GSSoC: PR approved for scoring labels May 31, 2026
@Priyanshu-byte-coder Priyanshu-byte-coder merged commit 9878cb9 into Priyanshu-byte-coder:main May 31, 2026
4 of 5 checks passed
@github-actions

🎉 Merged! Thanks for contributing to DevTrack.

If the project has been useful to you, a ⭐ star on the repo is the easiest way to support it — it helps DevTrack get discovered by more developers.

Keep an eye on open issues for your next contribution!

Labels

gssoc:approved GSSoC: PR approved for scoring gssoc26 GSSoC 2026 contribution level2 GSSoC Level 2 - Medium complexity (25 points) type:bug GSSoC type bonus: bug fix type:testing GSSoC type bonus: tests (+10 pts)

Development

Successfully merging this pull request may close these issues.

[Bug] Repo analytics accepts unvalidated repo query values

2 participants