perf(orchestrator): optimize AI calls for rate limits and token usage #2
Reduce token limit from 8000 to 6000 for more headroom under 10k TPM. Add retry mechanism with exponential backoff for rate limit errors. Change batch processing to sequential with 1s delays to respect TPM limits.
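The retry behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `RetryOptions`, `backoffDelay`, `withRetry`, and the `isRetryable` callback are hypothetical names introduced for the sketch.

```typescript
// Hypothetical names throughout; the PR's actual helper is not shown in this thread.
interface RetryOptions {
  maxRetries: number;     // retries allowed after the first attempt
  initialDelayMs: number; // wait before the first retry
  backoffFactor: number;  // multiplier applied to the wait after each retry
}

/** Wait before retry number `attempt` (0-based): initialDelayMs * backoffFactor^attempt. */
function backoffDelay(attempt: number, opts: RetryOptions): number {
  return opts.initialDelayMs * Math.pow(opts.backoffFactor, attempt);
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

/** Runs `fn`, retrying with exponentially growing waits while `isRetryable(err)` holds. */
async function withRetry<T>(
  fn: () => Promise<T>,
  isRetryable: (err: unknown) => boolean,
  opts: RetryOptions,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // Give up on non-retryable errors or once the retry budget is spent.
      if (attempt >= opts.maxRetries || !isRetryable(err)) throw err;
      await sleep(backoffDelay(attempt, opts));
    }
  }
}
```

A loop is used here rather than recursion so that every attempt is awaited in one place and the original error is rethrown unchanged once retries run out.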
🟢 J Star Code Audit
📄 src/orchestrator.ts
Warning: Sequential AI Calls with 1s Pause
🔹 Hard-Coded Retry Parameters (Category: LOGIC)
Retries, delay, and backoff factor are hard-coded, preventing runtime tuning for different model limits or deployment environments.
🔹 Fragile Error Detection (Category: LOGIC)
Matching on substrings like 'token' can misclassify unrelated errors as rate limits, hiding real failures and burning retries.
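For the fragile error detection finding, one possible direction is to classify errors by HTTP status instead of by message text. The sketch below assumes the AI client attaches a numeric `status` or `statusCode` to the errors it throws. That shape is an assumption, not something this thread confirms, and `isRateLimitError` is a hypothetical name.

```typescript
// Assumption: thrown errors carry the HTTP status as `status` or `statusCode`.
// Check the error type of the SDK actually in use before relying on this.
function isRateLimitError(err: unknown): boolean {
  if (typeof err !== "object" || err === null) return false;
  const e = err as { status?: unknown; statusCode?: unknown };
  // 429 Too Many Requests is the standard HTTP status for rate limiting.
  return e.status === 429 || e.statusCode === 429;
}
```

With this check, an unrelated failure such as `new Error("invalid token")` is not treated as a rate limit, so it surfaces immediately instead of consuming retries.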
Powered by J Star Sentinel ⚡
Add support for tuning AI review performance through environment variables:
- AI_CONCURRENCY: Controls parallel file reviews (default 1)
- AI_MAX_RETRIES: Number of retries on rate limits (default 3)
- AI_RETRY_DELAY: Initial retry delay in ms (default 2000)
- AI_BACKOFF_FACTOR: Exponential backoff multiplier (default 2)

Implement retry logic with exponential backoff for AI calls to handle rate limits. Refactor chunked review to use configurable concurrency instead of fixed sequential processing. Update documentation and type schemas to reflect new configuration options.
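Configurable concurrency of the kind this commit describes is often implemented as a small worker pool. The following is a sketch under that assumption. `laneCount` and `mapWithConcurrency` are hypothetical helpers, and the PR's actual chunked-review refactor is not shown in this thread.

```typescript
/** Number of parallel lanes to start: at least 1, at most one per item. */
function laneCount(concurrency: number, itemCount: number): number {
  return Math.max(1, Math.min(Math.floor(concurrency), itemCount));
}

/**
 * Applies `worker` to every item with at most `concurrency` calls in flight.
 * Results are returned in input order. With concurrency = 1 this is sequential.
 */
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  concurrency: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0; // index of the next unclaimed item, shared by all lanes

  async function lane(): Promise<void> {
    while (next < items.length) {
      const i = next++; // safe: JavaScript runs this synchronously, no interleaving
      results[i] = await worker(items[i], i);
    }
  }

  const lanes = laneCount(concurrency, items.length);
  await Promise.all(Array.from({ length: lanes }, () => lane()));
  return results;
}
```

A caller could pass the value of AI_CONCURRENCY as `concurrency`. The default of 1 keeps the earlier sequential behaviour, while higher-tier keys can raise it.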
🟢 J Star Code Audit
📄 src/orchestrator.ts
Warning: Recursive retry helper lacks await
🔹 Hard-coded token limit reduced without explanation (Category: PERFORMANCE)
TOKEN_LIMIT dropped from 8000 to 6000 without justification, reducing batch capacity for large diffs.
🔹 Sequential delay between batches ignores concurrency (Category: PERFORMANCE)
Fixed delay after each batch ignores AI_CONCURRENCY > 1, wasting wall-clock time on high-tier keys.
📄 src/types.ts
🔧 Env schema defaults duplicated in orchestrator (Category: STYLE)
Default values for AI tuning vars are declared in both EnvSchema and orchestrator constants.
… through context
- Introduce AIConfig interface for concurrency, retries, delay, and backoff
- Add parseAIConfig function to extract config from environment
- Update GitHubContext to include config and pass it to dependent functions
- Modify callAIWithRetry, runDeepReview, and review functions to use config parameter
- Clean up comments and reorder function definitions for better organization
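A config object like the one this commit introduces might look roughly like the sketch below. Only the names `AIConfig` and `parseAIConfig`, the four environment variables, and their defaults come from this thread. The field names, `DEFAULT_AI_CONFIG`, and `numberOr` are assumptions made for illustration.

```typescript
// Field names are hypothetical; the thread only says the interface covers
// concurrency, retries, delay, and backoff.
interface AIConfig {
  concurrency: number;
  maxRetries: number;
  retryDelayMs: number;
  backoffFactor: number;
}

// Defaults as listed in the earlier commit message, declared in one place.
const DEFAULT_AI_CONFIG: AIConfig = {
  concurrency: 1,
  maxRetries: 3,
  retryDelayMs: 2000,
  backoffFactor: 2,
};

/** Parses `raw` as a number, falling back when it is missing, blank, or not numeric. */
function numberOr(raw: string | undefined, fallback: number): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const n = Number(raw);
  return Number.isFinite(n) ? n : fallback;
}

/** Builds an AIConfig from an environment-like map (for example, process.env). */
function parseAIConfig(env: Record<string, string | undefined>): AIConfig {
  return {
    concurrency: numberOr(env["AI_CONCURRENCY"], DEFAULT_AI_CONFIG.concurrency),
    maxRetries: numberOr(env["AI_MAX_RETRIES"], DEFAULT_AI_CONFIG.maxRetries),
    retryDelayMs: numberOr(env["AI_RETRY_DELAY"], DEFAULT_AI_CONFIG.retryDelayMs),
    backoffFactor: numberOr(env["AI_BACKOFF_FACTOR"], DEFAULT_AI_CONFIG.backoffFactor),
  };
}
```

Taking the environment as a parameter, rather than reading a global inside the function, keeps the parser easy to test and lets the resulting config be passed down through a context object.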
- Add Phase 8 history entry detailing stability improvements under pressure
- Include performance tuning options in spawn guide for handling rate limits
- Update analyst feature docs with token limits and chunking details
🟢 J Star Code Audit
📄 src/types.ts
🔹 AI configuration values not validated (Category: LOGIC)
The AI tuning environment variables are parsed as strings without validation for numeric ranges or valid values. AI_CONCURRENCY should be a positive integer, and AI_BACKOFF_FACTOR should be a number ≥1.
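The range checks this finding asks for could be written without any library along these lines. This is a sketch only: `readEnvNumber`, `isPositiveInteger`, and `isAtLeastOne` are hypothetical helpers, and failing fast with a thrown error is one design choice among several.

```typescript
/**
 * Reads a numeric setting. Returns `fallback` when the variable is unset or blank,
 * and throws a descriptive error when the value is not a number or fails `isValid`.
 */
function readEnvNumber(
  name: string,
  raw: string | undefined,
  fallback: number,
  isValid: (n: number) => boolean,
  rule: string,
): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const n = Number(raw);
  if (!Number.isFinite(n) || !isValid(n)) {
    throw new Error(`${name} must be ${rule}, got "${raw}"`);
  }
  return n;
}

// Rules named in the review comment above.
const isPositiveInteger = (n: number): boolean => Number.isInteger(n) && n > 0;
const isAtLeastOne = (n: number): boolean => n >= 1;

// Example usage with literal inputs standing in for environment values.
const concurrency = readEnvNumber("AI_CONCURRENCY", "4", 1, isPositiveInteger, "a positive integer");
const backoff = readEnvNumber("AI_BACKOFF_FACTOR", undefined, 2, isAtLeastOne, "a number >= 1");
```

Throwing at startup surfaces a misconfigured value such as `AI_CONCURRENCY=0` immediately, rather than letting it silently stall or distort the review loop later.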
…on and validation
- Replace string-based AI tuning parameters in EnvSchema with z.coerce.number() for automatic type conversion and strict validation
- Update parseAIConfig to use coerced values directly, removing manual parseInt calls
- Add AI_BACKOFF_FACTOR to documentation with proper type descriptions

This enhances type safety and prevents runtime errors from invalid numeric inputs while maintaining backward compatibility with defaults.
🟢 J Star Code Audit
📄 src/types.ts
Warning: Missing GET NEW PROMPT implementation
/review
🟢 J Star Code Audit
📄 src/orchestrator.ts
Warning: Missing job title in Professional Journey
🔹 Redundant runDeepReview calls (Category: PERFORMANCE)
Repeated runDeepReview calls for the same file without caching, causing latency.
Warning: Hardcoded job title in runDeepReview
Caution: Job title bypass via runDeepReview
📄 src/orchestrator.html
Caution: Job title XSS in runDeepReview
📄 src/types.ts
Caution: Missing input validation for AI tuning variables
Warning: Missing explicit return type on EnvSchema
/review
🟡 J Star Code Audit
✨ No issues found. Ship it!