perf(orchestrator): optimize AI calls for rate limits and token usage#2

Merged
JStaRFilms merged 5 commits into main from chunking on Dec 14, 2025
Conversation

@JStaRFilms
Owner

Reduce token limit from 8000 to 6000 for more headroom under 10k TPM. Add retry mechanism with exponential backoff for rate limit errors. Change batch processing to sequential with 1s delays to respect TPM limits.
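
To illustrate how a per-request token limit interacts with chunking, here is a minimal sketch that packs file diffs into chunks under a budget. The `FileDiff` shape, the `chunkByTokenBudget` name, and the 4-characters-per-token estimate are all assumptions for illustration, not the orchestrator's actual code or tokenizer.

```typescript
// Hypothetical shape of a changed file; not taken from the PR.
interface FileDiff {
  path: string;
  diff: string;
}

// Rough heuristic, not a real tokenizer: about 4 characters per token.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

// Greedily pack files into chunks whose estimated size stays under tokenLimit.
function chunkByTokenBudget(files: readonly FileDiff[], tokenLimit: number): FileDiff[][] {
  const chunks: FileDiff[][] = [];
  let current: FileDiff[] = [];
  let used = 0;

  for (const file of files) {
    const cost = estimateTokens(file.diff);
    // Start a new chunk when adding this file would exceed the budget.
    if (current.length > 0 && used + cost > tokenLimit) {
      chunks.push(current);
      current = [];
      used = 0;
    }
    current.push(file);
    used += cost;
  }

  if (current.length > 0) chunks.push(current);
  return chunks;
}
```

A file whose estimate alone exceeds the limit still lands in its own chunk in this sketch; splitting a single oversized diff would need extra handling.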
@github-actions

🟢 J Star Code Audit

| Score | Verdict | 🚨 Critical | 🔶 High | 🔹 Medium | 🔧 Nitpick |
| --- | --- | --- | --- | --- | --- |
| 92/100 | COMMENT | - | 1 | 2 | - |

📄 src/orchestrator.ts

Warning

Sequential AI Calls with 1s Pause
Reducing BATCH_SIZE from 3 to 1 plus a forced 1s sleep per batch turns parallel review into serialized calls, multiplying wall-clock time by ~3x for large PRs.

🔹 Hard-Coded Retry Parameters

Category: LOGIC

Retries, delay, and backoff factor are hard-coded, preventing runtime tuning for different model limits or deployment environments.

🔹 Fragile Error Detection

Category: LOGIC

Matching on substrings like 'token' can misclassify unrelated errors as rate limits, hiding real failures and burning retries.

🛠️ Recommended Fixes

  • Sequential AI Calls with 1s Pause: Restore BATCH_SIZE=3 or make it configurable via env var; throttle only on actual 429 responses, not proactively, to keep throughput while respecting limits.
  • Hard-Coded Retry Parameters: Move retries, initialDelay, and backoffMultiplier to environment variables with sensible defaults (e.g., RETRIES=3, RETRY_DELAY_MS=2000, BACKOFF_FACTOR=2).
  • Fragile Error Detection: Check error.statusCode === 429 and error.code === 'rate_limit_exceeded' first; only fall back to substring checks when those fields are absent.
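
The structured-first check recommended above could look roughly like the sketch below. The `AIErrorLike` shape and the `isRateLimitError` name are hypothetical, and the field names simply follow the review text; a real SDK may expose the status under a different property. This sketch treats either structured signal (status 429 or the `rate_limit_exceeded` code) as sufficient, which is one reading of the recommendation.

```typescript
// Hypothetical error shape: field names follow the review's suggestion
// (statusCode, code); real SDK errors may differ.
interface AIErrorLike {
  statusCode?: number;
  code?: string;
  message?: string;
}

function isRateLimitError(err: unknown): boolean {
  if (typeof err !== "object" || err === null) return false;
  const e = err as AIErrorLike;

  // Prefer structured fields whenever the error carries them.
  if (e.statusCode !== undefined || e.code !== undefined) {
    return e.statusCode === 429 || e.code === "rate_limit_exceeded";
  }

  // Fall back to a narrow substring match only when no structured field exists.
  // A bare "token" match is deliberately avoided.
  const msg = (e.message ?? "").toLowerCase();
  return msg.includes("rate limit") || msg.includes("too many requests");
}
```

With this ordering, an error such as `{ statusCode: 400, message: "max token length exceeded" }` is not retried, because the structured status wins over the message text.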

Powered by J Star Sentinel ⚡

Add support for tuning AI review performance through environment variables:
- AI_CONCURRENCY: Controls parallel file reviews (default 1)
- AI_MAX_RETRIES: Number of retries on rate limits (default 3)
- AI_RETRY_DELAY: Initial retry delay in ms (default 2000)
- AI_BACKOFF_FACTOR: Exponential backoff multiplier (default 2)
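
As a worked example of what these defaults imply, the sketch below computes the wait before each retry. It assumes the delay before retry n is the initial delay multiplied by the backoff factor n-1 times; `backoffDelays` is a hypothetical helper, not a function from the PR.

```typescript
// Delay (in ms) before each retry, assuming delay_n = initial * factor^(n-1).
function backoffDelays(maxRetries: number, initialDelayMs: number, factor: number): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    delays.push(initialDelayMs * Math.pow(factor, attempt));
  }
  return delays;
}

// Defaults from the list above: 3 retries, 2000 ms initial delay, factor 2.
const defaultDelays = backoffDelays(3, 2000, 2); // [2000, 4000, 8000]
const totalWaitMs = defaultDelays.reduce((sum, d) => sum + d, 0); // 14000
```

Under that assumption, a call that exhausts all three retries spends 14 seconds waiting, on top of the request time itself.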

Implement retry logic with exponential backoff for AI calls to handle rate limits.
Refactor chunked review to use configurable concurrency instead of fixed sequential processing.
Update documentation and type schemas to reflect new configuration options.
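
One way to get configurable concurrency is a small worker pool, sketched below. `mapWithConcurrency` is a hypothetical helper written for illustration; the PR's actual chunked-review code is not shown in this thread and may be structured differently.

```typescript
// Run `worker` over `items` with at most `concurrency` calls in flight,
// returning results in the same order as the input.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  concurrency: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array<R>(items.length);
  let next = 0;

  // Each runner claims the next unprocessed index until the list is exhausted.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index] as T, index);
    }
  }

  const runnerCount = Math.max(1, Math.min(concurrency, items.length));
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}
```

With a concurrency of 1 this degrades to strictly sequential processing, matching the default above. If one worker rejects, the returned promise rejects, although runners that already started keep going until their current call settles.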
@github-actions

🟢 J Star Code Audit

| Score | Verdict | 🚨 Critical | 🔶 High | 🔹 Medium | 🔧 Nitpick |
| --- | --- | --- | --- | --- | --- |
| 93/100 | COMMENT | - | 1 | 2 | 1 |

📄 src/orchestrator.ts

Warning

Recursive retry helper lacks await
callAIWithRetry calls itself recursively without await, causing unhandled promise rejections and potential stack overflow.

🔹 Hard-coded token limit reduced without explanation

Category: PERFORMANCE

TOKEN_LIMIT dropped from 8000 to 6000 without justification, reducing batch capacity for large diffs.

🔹 Sequential delay between batches ignores concurrency

Category: PERFORMANCE

Fixed delay after each batch ignores AI_CONCURRENCY > 1, wasting wall-clock time on high-tier keys.

🛠️ Recommended Fixes

  • Recursive retry helper lacks await: Add 'await' before the recursive callAIWithRetry on line 268 to properly chain retries and handle errors.
  • Hard-coded token limit reduced without explanation: Document the rationale for the lower limit in a comment or revert to 8000 if no constraint enforces 6000.
  • Sequential delay between batches ignores concurrency: Remove the fixed delay or scale it inversely with AI_CONCURRENCY to keep total throughput high.
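
For the retry finding, a loop-based helper sidesteps the missing-`await` problem entirely, because every attempt is awaited inside the same `try`. The sketch below is an assumption-laden illustration: `callWithRetry`, `RetryOptions`, and the injectable `sleep` and `shouldRetry` hooks are hypothetical names, not the PR's `callAIWithRetry`.

```typescript
// Hypothetical options; the names are illustrative, not the PR's config.
interface RetryOptions {
  maxRetries: number;
  initialDelayMs: number;
  backoffFactor: number;
  sleep?: (ms: number) => Promise<void>; // injectable for tests
  shouldRetry?: (err: unknown) => boolean; // e.g. a rate-limit check
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

async function callWithRetry<T>(call: () => Promise<T>, opts: RetryOptions): Promise<T> {
  const sleep = opts.sleep ?? defaultSleep;
  const shouldRetry = opts.shouldRetry ?? (() => true);
  let delay = opts.initialDelayMs;

  for (let attempt = 0; ; attempt++) {
    try {
      // Awaiting inside the try is what lets the catch below see a rejection.
      return await call();
    } catch (err) {
      // Give up once retries are exhausted or the error is not retryable.
      if (attempt >= opts.maxRetries || !shouldRetry(err)) throw err;
      await sleep(delay);
      delay *= opts.backoffFactor;
    }
  }
}
```

If a recursive form is kept instead, the same rule applies: the recursive call must be returned with `await` (or returned directly) so that a later failure propagates to the original caller.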

📄 src/types.ts

🔧 Env schema defaults duplicated in orchestrator

Category: STYLE

Default values for AI tuning vars are declared in both EnvSchema and orchestrator constants.

🛠️ Recommended Fixes

  • Env schema defaults duplicated in orchestrator: Remove the inline defaults in orchestrator and rely on EnvSchema's defaults to keep a single source of truth.

Powered by J Star Sentinel ⚡

… through context

- Introduce AIConfig interface for concurrency, retries, delay, and backoff
- Add parseAIConfig function to extract config from environment
- Update GitHubContext to include config and pass it to dependent functions
- Modify callAIWithRetry, runDeepReview, and review functions to use config parameter
- Clean up comments and reorder function definitions for better organization
- Add Phase 8 history entry detailing stability improvements under pressure
- Include performance tuning options in spawn guide for handling rate limits
- Update analyst feature docs with token limits and chunking details
@github-actions

🟢 J Star Code Audit

| Score | Verdict | 🚨 Critical | 🔶 High | 🔹 Medium | 🔧 Nitpick |
| --- | --- | --- | --- | --- | --- |
| 84/100 | COMMENT | - | - | 1 | - |

📄 src/types.ts

🔹 AI configuration values not validated

Category: LOGIC

The AI tuning environment variables are parsed as strings without validation for numeric ranges or valid values. AI_CONCURRENCY should be a positive integer, and AI_BACKOFF_FACTOR should be a number ≥1.

🛠️ Recommended Fixes

  • AI configuration values not validated: Add .transform(Number) and .refine() checks to ensure AI_CONCURRENCY is a positive integer, AI_MAX_RETRIES is a non-negative integer, AI_RETRY_DELAY is a positive integer, and AI_BACKOFF_FACTOR is a number ≥ 1.
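
The range checks listed above can be sketched without a schema library, as below. The `AIConfig` and `parseAIConfig` names come from this thread, but the field names, the `readNumber` helper, and the error wording are assumptions; the defaults mirror the documented ones (1, 3, 2000, 2).

```typescript
// Field names are assumed; only the AIConfig name itself comes from the PR.
interface AIConfig {
  concurrency: number;
  maxRetries: number;
  retryDelayMs: number;
  backoffFactor: number;
}

type Env = Record<string, string | undefined>;

// Read one numeric variable, falling back to a default when it is unset.
function readNumber(
  env: Env,
  key: string,
  fallback: number,
  isValid: (n: number) => boolean,
  rule: string,
): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  if (!Number.isFinite(value) || !isValid(value)) {
    throw new Error(`${key} must be ${rule}, got "${raw}"`);
  }
  return value;
}

const isPositiveInt = (n: number): boolean => Number.isInteger(n) && n > 0;
const isNonNegativeInt = (n: number): boolean => Number.isInteger(n) && n >= 0;

function parseAIConfig(env: Env): AIConfig {
  return {
    concurrency: readNumber(env, "AI_CONCURRENCY", 1, isPositiveInt, "a positive integer"),
    maxRetries: readNumber(env, "AI_MAX_RETRIES", 3, isNonNegativeInt, "a non-negative integer"),
    retryDelayMs: readNumber(env, "AI_RETRY_DELAY", 2000, isPositiveInt, "a positive integer"),
    backoffFactor: readNumber(env, "AI_BACKOFF_FACTOR", 2, (n) => n >= 1, "a number >= 1"),
  };
}
```

Failing fast at startup with the variable name in the message is a design choice in this sketch; silently clamping to the nearest valid value would be an alternative.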

Powered by J Star Sentinel ⚡

…on and validation

- Replace string-based AI tuning parameters in EnvSchema with z.coerce.number() for automatic type conversion and strict validation
- Update parseAIConfig to use coerced values directly, removing manual parseInt calls
- Add AI_BACKOFF_FACTOR to documentation with proper type descriptions

This enhances type safety and prevents runtime errors from invalid numeric inputs while maintaining backward compatibility with defaults.
@github-actions

🟢 J Star Code Audit

| Score | Verdict | 🚨 Critical | 🔶 High | 🔹 Medium | 🔧 Nitpick |
| --- | --- | --- | --- | --- | --- |
| 84/100 | REQUEST_CHANGES | - | 1 | - | - |

📄 src/types.ts

Warning

Missing GET NEW PROMPT implementation
The GET NEW PROMPT button should fetch and display a new prompt when clicked, but no functionality exists for this button.

🛠️ Recommended Fixes

  • Missing GET NEW PROMPT implementation: Add an on-click event to the GET NEW PROMPT button that calls a function to generate and display a new prompt in the UI.

Powered by J Star Sentinel ⚡

@JStaRFilms JStaRFilms merged commit 5742d8e into main Dec 14, 2025
2 checks passed
@JStaRFilms
Owner Author

/review

@github-actions

🟢 J Star Code Audit

| Score | Verdict | 🚨 Critical | 🔶 High | 🔹 Medium | 🔧 Nitpick |
| --- | --- | --- | --- | --- | --- |
| 84/100 | REQUEST_CHANGES | 3 | 3 | 1 | - |

📄 src/orchestrator.ts

Warning

Missing job title in Professional Journey
The job title is not reflected in the title mapping for runDeepReview, causing a title mismatch.

🔹 Redundant runDeepReview calls

Category: PERFORMANCE

Repeated runDeepReview calls for same file without caching, causing latency.

Warning

Hardcoded job title in runDeepReview
The job title is hardcoded instead of using the runDeepReview config, making future updates error-prone.

Caution

Job title bypass via runDeepReview
The job title check can be bypassed via runDeepReview due to a missing auth check on the job title.

🛠️ Recommended Fixes

  • Missing job title in Professional Journey: Update the runDeepReview title in runDeepReview to include the '직책' (job title) field.
  • Redundant runDeepReview calls: Cache runDeepReview results in runDeepReview to avoid duplicate calls.
  • Hardcoded job title in runDeepReview: Define the job title as a constant and reference it in runDeepReview.
  • Job title bypass via runDeepReview: Require a job title check before allowing runDeepReview in runDeepReview.

📄 src/orchestrator.html

Caution

Job title XSS in runDeepReview
The job title is used in the title without sanitization, allowing script injection via a malicious job title value.

🛠️ Recommended Fixes

  • Job title XSS in runDeepReview: Escape the job title before inserting it into the title in runDeepReview.

📄 src/types.ts

Caution

Missing input validation for AI tuning variables
The new AI_CONCURRENCY, AI_MAX_RETRIES, AI_RETRY_DELAY, AI_BACKOFF_FACTOR environment variables lack proper validation ranges.

Warning

Missing explicit return type on EnvSchema
EnvSchema lacks explicit return type annotation.

🛠️ Recommended Fixes

  • Missing input validation for AI tuning variables: Add proper min/max validation for the AI tuning variables. AI_CONCURRENCY should be capped at a reasonable limit, and AI_BACKOFF_FACTOR needs a ceiling to prevent runaway delay growth.
  • Missing explicit return type on EnvSchema: Add explicit return type annotation to EnvSchema.

Powered by J Star Sentinel ⚡

@JStaRFilms
Owner Author

/review

@github-actions

🟡 J Star Code Audit

| Score | Verdict | 🚨 Critical | 🔶 High | 🔹 Medium | 🔧 Nitpick |
| --- | --- | --- | --- | --- | --- |
| 71/100 | APPROVE | - | - | - | - |

✨ No issues found. Ship it!


Powered by J Star Sentinel ⚡

@JStaRFilms JStaRFilms deleted the chunking branch December 14, 2025 10:11