From ba79b7c4f06d8149a305d849eab592b8d69224dd Mon Sep 17 00:00:00 2001
From: Daniel Bisgrove
Date: Wed, 25 Feb 2026 13:44:53 -0500
Subject: [PATCH 1/5] Adding PR Risk Assessment to the review so we have a
 quick way of telling if the PR needs a senior review. I will say that if it
 says low-risk and you want a senior review, nothing is stopping you from
 asking for one. This is just a guide.

---
 .claude/commands/pr-review.md | 156 ++++++++++++++++++++++++++++++++++
 1 file changed, 156 insertions(+)

diff --git a/.claude/commands/pr-review.md b/.claude/commands/pr-review.md
index 4e5ef64fbc..7a632c164e 100644
--- a/.claude/commands/pr-review.md
+++ b/.claude/commands/pr-review.md
@@ -71,6 +71,162 @@ List EVERY file changed in this PR (relative path). For each file, include:
 Do not skip any file. If any file can't be read, state it and continue.
 
+### Stage 1.5 — PR Risk Assessment & Review Recommendation
+
+Analyze the PR changes to determine the appropriate reviewer level and display a clear recommendation.
+ +#### Step 1: Calculate Risk Score + +Start with a base score of 0, then add points based on these criteria: + +**Critical File Patterns (High Risk: +3 points each)** + +- `pages/api/auth/[...nextauth].page.ts` - NextAuth configuration +- `pages/api/auth/helpers.ts` - JWT validation, session management +- `pages/api/auth/impersonate/**/*` - User impersonation system +- `pages/api/graphql-rest.page.ts` - REST to GraphQL proxy layer +- `pages/api/Schema/index.ts` - Schema registry +- `src/lib/apollo/client.ts` - Apollo client setup +- `src/lib/apollo/link.ts` - GraphQL routing logic +- `src/lib/apollo/cache.ts` - Cache policies +- `next.config.ts` - Next.js/build configuration +- `.env*` - Environment files (if changed, automatic senior review) + +**High-Risk File Patterns (+2 points each)** + +- `pages/api/Schema/**/resolvers.ts` - GraphQL resolvers +- `**/*.graphql` (excluding `**/*.test.*` and `__tests__/**`) - Schema definitions +- `pages/api/Schema/Settings/Organizations/**/*` - Organization management +- `pages/api/Schema/Settings/Integrations/**/*` - Third-party integrations +- `pages/api/Schema/donations/**/*` - Donation processing +- `pages/api/Schema/reports/financialAccounts/**/*` - Financial reporting +- `pages/api/Schema/Settings/Preferences/ExportData/**/*` - Data export +- `src/components/Shared/MultiPageLayout/**/*` - Main app layout +- `src/components/Shared/Header/**/*` - Global navigation +- `src/components/Shared/Filters/**/*` - Shared filtering logic +- `src/components/Shared/Forms/**/*` - Shared form components +- `src/components/Settings/Admin/**/*` - Admin functionality +- Any file with `Context` in the name under core features (not report-specific) + +**Medium-Risk File Patterns (+1 point each)** + +- `pages/accountLists/**/*` - Main application pages +- `pages/api/**/*` (not already counted) - API endpoints +- `src/components/Settings/integrations/**/*` - Integration UI +- `src/components/Reports/**/Context/**/*` - Report state management 
+- `src/hooks/**/*` - Custom hooks +- `src/lib/**/*.ts` - Utility functions + +**Low-Risk Files (0 points)** + +- `**/*.test.tsx` or `**/*.test.ts` - Test files only +- `*.md` - Documentation +- `public/locales/**/*` - Translation files +- Style-only changes with no logic + +**Change Volume Modifier** + +- <50 lines total: +0 points +- 50-200 lines: +1 point +- 200-500 lines: +2 points +- 500+ lines: +3 points + +**Scope Multiplier** +Apply after calculating base score: + +- Single domain (e.g., only tests): 1.0x +- Multiple domains (e.g., components + API): 1.3x +- Cross-cutting (e.g., auth + GraphQL + build): 1.7x + +**Special Pattern Detection (additional points)** + +- New npm package in `package.json`: +2 points +- Updated critical package (@apollo/\*, next, react, next-auth): +3 points +- New file in `src/hooks/`: +1 point (sets pattern) +- New file in `src/components/Shared/`: +1 point (sets pattern) +- Changes to `src/graphql/rootFields.generated.ts`: +3 points + +#### Step 2: Determine Day of Week + +Run: `date +%A` + +#### Step 3: Calculate Final Recommendation + +Based on the risk score and day of week, determine the required reviewer level: + +**Monday-Thursday:** + +- Score 1-3: Junior or Mid-level can review +- Score 4-6: Mid-level recommended, Senior optional +- Score 7-8: Senior recommended +- Score 9-10: Senior (Caleb Cox) required + +**Friday:** + +- Score 1-3: Junior/Mid can review BUT suggest waiting until Monday +- Score 4-6: Senior recommended for Friday merge +- Score 7-10: Senior (Caleb Cox) required, strongly suggest waiting until Monday + +**Saturday/Sunday:** + +- All scores: Treat as Friday + add extra weekend deployment warning + +#### Step 4: Display Risk Assessment Report + +Print the following report at the beginning of your review (before the deep review): + +``` +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ +📊 PR RISK ASSESSMENT +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ + +Risk Score: [X]/10 
+Risk Level: [LOW | MEDIUM | HIGH | CRITICAL] + +Files Changed: [N] +Lines Changed: +[X] -[Y] + +Risk Factors Detected: +[List each risk factor found with specific file references] +• [e.g., "Authentication logic (pages/api/auth/helpers.ts)"] +• [e.g., "GraphQL schema changes (3 .graphql files)"] +• [e.g., "Large changeset (350+ lines)"] + +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ +👥 REVIEW RECOMMENDATION +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ + +Required Reviewer Level: [JUNIOR/MID-LEVEL | MID-LEVEL/SENIOR | SENIOR (Caleb Cox)] + +Reasoning: [1-2 sentence explanation] + +[IF FRIDAY AND SCORE <= 6] +⚠️ FRIDAY DEPLOYMENT NOTICE +This PR is being reviewed on Friday. Options: + 1. Proceed with review and merge (approved for Friday deployment) + 2. Wait until Monday for safer deployment window + +[IF FRIDAY AND SCORE >= 7] +⚠️ HIGH-RISK FRIDAY DEPLOYMENT WARNING +This PR contains high-risk changes. Recommendations: + • Senior (Caleb Cox) review required + • Strongly consider waiting until Monday to merge + • If urgent, ensure monitoring plan is in place + +[IF WEEKEND] +⚠️ WEEKEND DEPLOYMENT WARNING +Consider waiting until Monday for deployment unless this is an urgent hotfix. + +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ +``` + +#### Important Notes + +- If any `.env` file is changed, immediately flag as CRITICAL and require senior review +- If `package.json` dependencies change, list which packages and their risk level +- Be specific about which high-risk files triggered the assessment +- The risk score helps guide the decision but use judgment for edge cases + ### Stage 2 — Deep Review (File-by-File) IMPORTANT: Only review files that appear in the git diff from Stage 1. Do not review files that are not part of this PR. 
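The score-and-weekday thresholds in Step 3 above can be sketched as a small helper. This is a hypothetical TypeScript illustration, not part of the command itself; the function name and return labels are invented for clarity, and weekends deliberately fall through to the Friday thresholds as Step 3 specifies.

```typescript
// Illustrative sketch of the Step 3 mapping from risk score + weekday
// to the recommended reviewer level. Labels are shorthand for the
// recommendations spelled out in Step 3.
type ReviewerLevel =
  | "junior/mid"
  | "mid (senior optional)"
  | "senior recommended"
  | "senior required";

function reviewerLevel(score: number, day: string): ReviewerLevel {
  // Saturday/Sunday are treated as Friday (plus a weekend warning).
  const fridayRules = ["Friday", "Saturday", "Sunday"].includes(day);
  if (fridayRules) {
    if (score <= 3) return "junior/mid"; // but suggest waiting until Monday
    if (score <= 6) return "senior recommended";
    return "senior required"; // strongly suggest waiting until Monday
  }
  if (score <= 3) return "junior/mid";
  if (score <= 6) return "mid (senior optional)";
  if (score <= 8) return "senior recommended";
  return "senior required";
}
```

Note how the same score of 5 escalates from "mid-level" on a Tuesday to "senior recommended" on a Friday, which is the whole point of folding the day of week into the recommendation.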
From 1ad43960c9e6007c877d5a9935b9ed6314d9a8b2 Mon Sep 17 00:00:00 2001 From: Daniel Bisgrove Date: Wed, 25 Feb 2026 13:46:10 -0500 Subject: [PATCH 2/5] Multi-agent review with debate and consensus. This costs $0.80 - $2.50 each time to run, but will catch a lot of things. --- .claude/commands/agent-review.md | 1098 ++++++++++++++++++++++++++++++ 1 file changed, 1098 insertions(+) create mode 100644 .claude/commands/agent-review.md diff --git a/.claude/commands/agent-review.md b/.claude/commands/agent-review.md new file mode 100644 index 0000000000..685a8c9cbe --- /dev/null +++ b/.claude/commands/agent-review.md @@ -0,0 +1,1098 @@ +--- +name: agent-review +description: Multi-agent PR review with debate and consensus +approve_tools: + - Bash(gh:*) +--- + +# Multi-Agent PR Code Review + +This command spawns 5 specialized review agents that independently analyze the PR, debate their findings, and post a consensus review to GitHub. + +**Operating in multi-agent review mode with debate rounds.** + +--- + +## Stage 0 — Context Gathering & Risk Assessment + +### Gather PR Context + +First, get all the PR information we need: + +```bash +# Check if we're in a PR branch +gh pr view --json number,title,baseRefName,headRefName,additions,deletions,changedFiles 2>/dev/null || echo "Not in a PR branch, using main as base" + +# Get the day of week for Friday warnings +DAY_OF_WEEK=$(date +%A) +echo "Today is: $DAY_OF_WEEK" +``` + +If GitHub CLI works, get the diff using PR refs: + +```bash +BASE_REF=$(gh pr view --json baseRefOid -q .baseRefOid 2>/dev/null) +HEAD_REF=$(gh pr view --json headRefOid -q .headRefOid 2>/dev/null) + +if [ -n "$BASE_REF" ] && [ -n "$HEAD_REF" ]; then + git diff $BASE_REF..$HEAD_REF --name-only > /tmp/changed_files.txt + git diff $BASE_REF..$HEAD_REF --stat + git diff $BASE_REF..$HEAD_REF > /tmp/pr_diff.txt +else + # Fallback to comparing against main + BASE_COMMIT=$(git merge-base HEAD main) + git diff $BASE_COMMIT..HEAD --name-only > 
/tmp/changed_files.txt + git diff $BASE_COMMIT..HEAD --stat + git diff $BASE_COMMIT..HEAD > /tmp/pr_diff.txt +fi +``` + +### Read Project Standards + +Read `.claude/CLAUDE.md` to understand the project's coding standards and conventions. This context will be shared with all agents. + +### Calculate Risk Score + +Now calculate the initial risk score using the algorithm from the existing `/pr-review` command: + +**Process:** + +1. Read the list of changed files from `/tmp/changed_files.txt` +2. Count lines changed from the diff stat +3. Apply the risk scoring algorithm: + +**Critical File Patterns (+3 points each):** + +- `pages/api/auth/[...nextauth].page.ts` +- `pages/api/auth/helpers.ts` +- `pages/api/auth/impersonate/` +- `pages/api/graphql-rest.page.ts` +- `pages/api/Schema/index.ts` +- `src/lib/apollo/client.ts` +- `src/lib/apollo/link.ts` +- `src/lib/apollo/cache.ts` +- `next.config.ts` +- `.env` files + +**High-Risk Patterns (+2 points each):** + +- `pages/api/Schema/**/resolvers.ts` +- `**/*.graphql` (excluding tests) +- Financial/donation code +- Organization management +- Shared components + +**Medium-Risk (+1 point each):** + +- Main app pages +- Custom hooks +- Utility functions + +**Change Volume:** + +- <50 lines: +0 +- 50-200 lines: +1 +- 200-500 lines: +2 +- 500+ lines: +3 + +**Scope Multiplier:** + +- Single domain: 1.0x +- Multiple domains: 1.3x +- Cross-cutting: 1.7x + +Calculate the final risk score (0-10) and determine the required reviewer level based on day of week. 
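The scoring procedure above can be sketched as follows. This is a minimal, hypothetical TypeScript sketch with an abbreviated pattern list: the regexes stand in for the full file-pattern tables, and the special-pattern bonuses (new packages, generated GraphQL files) are omitted for brevity.

```typescript
// Abbreviated stand-ins for the critical/high/medium pattern tables above.
const CRITICAL_PATTERNS = [
  /pages\/api\/auth\//,      // NextAuth config, JWT helpers, impersonation
  /graphql-rest\.page\.ts$/, // REST-to-GraphQL proxy layer
  /next\.config\.ts$/,       // build configuration
  /\.env/,                   // environment files
];
const HIGH_PATTERNS = [/resolvers\.ts$/, /\.graphql$/];
const MEDIUM_PATTERNS = [/^src\/hooks\//, /^src\/lib\/.+\.ts$/];

function volumePoints(linesChanged: number): number {
  if (linesChanged < 50) return 0;
  if (linesChanged < 200) return 1;
  if (linesChanged < 500) return 2;
  return 3;
}

// scopeMultiplier: 1.0 single domain, 1.3 multiple domains, 1.7 cross-cutting
function riskScore(
  files: string[],
  linesChanged: number,
  scopeMultiplier: number,
): number {
  let base = 0;
  for (const file of files) {
    // Each file counts once, at its highest-risk match.
    if (CRITICAL_PATTERNS.some((p) => p.test(file))) base += 3;
    else if (HIGH_PATTERNS.some((p) => p.test(file))) base += 2;
    else if (MEDIUM_PATTERNS.some((p) => p.test(file))) base += 1;
  }
  base += volumePoints(linesChanged);
  return Math.min(10, Math.round(base * scopeMultiplier));
}
```

For example, an auth helper plus a donations resolver changed across 250 lines in two domains scores (3 + 2 + 2) × 1.3 ≈ 9, which lands in senior-required territory regardless of the day.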
+ +Display a summary: + +``` +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ +📊 PR RISK ASSESSMENT +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ + +Risk Score: [X]/10 +Risk Level: [LOW | MEDIUM | HIGH | CRITICAL] +Day: [DAY_OF_WEEK] + +Files Changed: [N] +Lines Changed: +[X] -[Y] + +Risk Factors: +• [List detected risk factors] + +Required Reviewer: [JUNIOR/MID | MID/SENIOR | SENIOR (Caleb Cox)] + +[IF FRIDAY: Display Friday warning based on risk level] + +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ +``` + +--- + +## Stage 1 — Launch Specialized Review Agents (Parallel) + +Now launch all 5 specialized review agents in parallel using the Task tool. + +**IMPORTANT:** Use a SINGLE message with multiple Task tool invocations to run them in parallel. + +Display: "🚀 Launching 5 specialized review agents in parallel..." + +### Agent 1: Security Review 🔒 + +Use the Task tool with: + +- **description**: "Security code review" +- **subagent_type**: "general-purpose" +- **model**: "sonnet" +- **prompt**: + +``` +You are the Security Review Agent for MPDX code review. + +EXPERTISE: Authentication, authorization, data protection, vulnerability detection, secure coding. + +MISSION: Review this PR for security vulnerabilities. 
+ +CONTEXT: +- Risk Score: [calculated score]/10 +- Day: [day of week] +- Changed Files: [N] + +CRITICAL FOCUS: +- Authentication (pages/api/auth/, session handling) +- JWT validation, user impersonation security +- API authorization, secrets exposure +- Input validation, XSS, SQL injection, CSRF +- Cookie security, CORS configuration + +Read the git diff from /tmp/pr_diff.txt and the file list from /tmp/changed_files.txt + +OUTPUT FORMAT: + +## 🔒 Security Agent Review + +### Critical Security Issues (BLOCKING) +[Issues that MUST be fixed - be specific with file:line] +- **File:Line** - Issue description + - Risk: High/Critical + - Impact: What could happen + - Fix: How to fix + +### Security Concerns (IMPORTANT) +[Issues that should be fixed] +- **File:Line** - Concern + - Risk: Medium + - Recommendation: Action + +### Security Suggestions +[Nice-to-have improvements] + +### Questions for Other Agents +- **To [Agent]**: Question + +### Confidence +- Overall: High/Medium/Low +- Areas needing deeper analysis: [list] + +GUIDELINES: +- Be specific with file:line references +- Explain WHY it's a risk, not just WHAT +- Consider MPDX context (donor data, financial info) +- Don't flag if clearly handled elsewhere +- Focus on practical risks, not theoretical +``` + +### Agent 2: Architecture Review 🏗️ + +Use the Task tool with: + +- **description**: "Architecture code review" +- **subagent_type**: "general-purpose" +- **model**: "sonnet" +- **prompt**: + +``` +You are the Architecture Review Agent for MPDX code review. + +EXPERTISE: System design, patterns, technical debt, maintainability, scalability. + +MISSION: Review this PR for architectural concerns and design issues. 
+ +CONTEXT: +- Risk Score: [calculated score]/10 +- Day: [day of week] +- Changed Files: [N] + +CRITICAL FOCUS: +- GraphQL schema design (pages/api/Schema/, .graphql files) +- Apollo Client cache (src/lib/apollo/) +- Next.js configuration (next.config.ts) +- Component architecture, state management +- API design, pattern consistency +- Technical debt creation/reduction + +Read the git diff from /tmp/pr_diff.txt and the file list from /tmp/changed_files.txt +Also read .claude/CLAUDE.md for project patterns. + +OUTPUT FORMAT: + +## 🏗️ Architecture Agent Review + +### Critical Architecture Issues (BLOCKING) +- **File:Line** - Issue + - Problem: What's architecturally wrong + - Impact: Long-term consequences + - Alternative: Better approach + +### Architecture Concerns (IMPORTANT) +- **File:Line** - Concern + - Issue: Description + - Recommendation: How to improve + +### Architecture Suggestions +[Better patterns] + +### Technical Debt +- Debt Added: [what new debt] +- Debt Removed: [what debt fixed] +- Net Impact: Better/Worse/Neutral + +### Questions for Other Agents +- **To [Agent]**: Question + +### Confidence +- Overall: High/Medium/Low + +GUIDELINES: +- Focus on long-term maintainability +- Identify pattern inconsistencies +- Consider scalability +- Balance pragmatism vs purity +- Reference CLAUDE.md standards +``` + +### Agent 3: Data Integrity Review 💾 + +Use the Task tool with: + +- **description**: "Data integrity review" +- **subagent_type**: "general-purpose" +- **model**: "sonnet" +- **prompt**: + +``` +You are the Data Integrity Review Agent for MPDX code review. + +EXPERTISE: GraphQL, data flow, caching, type safety, financial accuracy, data consistency. + +MISSION: Review this PR for data correctness and integrity. 
+ +CONTEXT: +- Risk Score: [calculated score]/10 +- Day: [day of week] +- Changed Files: [N] + +CRITICAL FOCUS: +- GraphQL queries/mutations +- Apollo cache normalization (must have `id` fields) +- Data fetching patterns, pagination +- Financial calculations (donations, pledges, amounts) +- Data consistency across updates +- Optimistic responses, type safety +- Dual GraphQL server architecture + +Read the git diff from /tmp/pr_diff.txt and the file list from /tmp/changed_files.txt + +OUTPUT FORMAT: + +## 💾 Data Integrity Agent Review + +### Critical Data Issues (BLOCKING) +- **File:Line** - Issue + - Problem: Data integrity concern + - Impact: What could go wrong + - Fix: Required action + +### Data Concerns (IMPORTANT) +- **File:Line** - Concern + - Issue: Description + - Recommendation: Fix + +### Data Suggestions +[Better data handling] + +### GraphQL Specific +- Missing `id` fields: [list] +- Cache policy issues: [concerns] +- Fragment reuse: [opportunities] + +### Questions for Other Agents +- **To [Agent]**: Question + +### Confidence +- Overall: High/Medium/Low +- Financial review: Reviewed/Not Applicable + +GUIDELINES: +- Financial accuracy is CRITICAL +- Check cache updates +- Verify pagination +- Ensure type safety +- Consider data consistency +``` + +### Agent 4: Testing & Quality Review 🧪 + +Use the Task tool with: + +- **description**: "Testing and quality review" +- **subagent_type**: "general-purpose" +- **model**: "haiku" +- **prompt**: + +``` +You are the Testing & Quality Review Agent for MPDX code review. + +EXPERTISE: Test coverage, test quality, edge cases, error handling, code quality. + +MISSION: Review this PR for testing adequacy and code quality. 
+ +CONTEXT: +- Risk Score: [calculated score]/10 +- Day: [day of week] +- Changed Files: [N] + +CRITICAL FOCUS: +- Test coverage for new code +- Test quality and maintainability +- Edge case handling, error states +- Integration test needs +- Mock data usage (prefer shared mockData) +- Type safety (avoid `any` types) +- Code quality (unused imports, console.logs) + +Read the git diff from /tmp/pr_diff.txt and the file list from /tmp/changed_files.txt + +OUTPUT FORMAT: + +## 🧪 Testing & Quality Agent Review + +### Critical Testing Gaps (BLOCKING) +- **File:Line** - Gap + - Missing: What's not tested + - Risk: Why critical + - Required: What tests to add + +### Testing Concerns (IMPORTANT) +- **File:Line** - Concern + - Issue: Description + - Recommendation: Improvement + +### Code Quality Issues +- Unused imports: [list] +- Console.logs: [list] +- Type safety: [any types] +- Other: [issues] + +### Testing Suggestions +[Improvements] + +### Coverage Assessment +- New code tested: Yes/Partial/No +- Edge cases: [what's covered] +- Missing tests: [critical gaps] + +### Questions for Other Agents +- **To [Agent]**: Question + +### Confidence +- Overall: High/Medium/Low + +GUIDELINES: +- Critical paths MUST have tests +- Don't require tests for trivial code +- Focus on edge cases and errors +- Check test quality, not just existence +- Verify mocks are realistic +``` + +### Agent 5: UX Review 👤 + +Use the Task tool with: + +- **description**: "UX and accessibility review" +- **subagent_type**: "general-purpose" +- **model**: "haiku" +- **prompt**: + +``` +You are the User Experience Review Agent for MPDX code review. + +EXPERTISE: UI/UX, accessibility, performance, localization, user-facing concerns. + +MISSION: Review this PR for user experience and usability. 
+ +CONTEXT: +- Risk Score: [calculated score]/10 +- Day: [day of week] +- Changed Files: [N] + +CRITICAL FOCUS: +- Component usability, intuitiveness +- Loading states (must show for async) +- Error messages (user-friendly, localized) +- Accessibility (ARIA, keyboard nav) +- Performance (re-renders, heavy calculations) +- Localization (all text uses `t()` function) +- Responsive design +- Form validation, error display + +Read the git diff from /tmp/pr_diff.txt and the file list from /tmp/changed_files.txt + +OUTPUT FORMAT: + +## 👤 UX Agent Review + +### Critical UX Issues (BLOCKING) +- **File:Line** - Issue + - Problem: UX concern + - User Impact: How affects users + - Fix: Required action + +### UX Concerns (IMPORTANT) +- **File:Line** - Concern + - Issue: Description + - Recommendation: Improvement + +### Accessibility Issues +- Missing ARIA: [list] +- Keyboard nav: [issues] +- Screen reader: [concerns] + +### Localization Issues +- Hardcoded strings: [not using t()] +- Missing translations: [keys] + +### Performance Concerns +- Re-render issues: [list] +- Heavy calculations: [list] + +### UX Suggestions +[Improvements] + +### Questions for Other Agents +- **To [Agent]**: Question + +### Confidence +- Overall: High/Medium/Low + +GUIDELINES: +- Put yourself in user's shoes +- Consider error scenarios +- Check text is localized +- Verify loading states exist +- Consider accessibility +``` + +After launching all 5 agents, display: + +``` +✅ All 5 agents launched in parallel +⏳ Waiting for agents to complete their reviews... 
+``` + +--- + +## Stage 2 — Collect Agent Reports + +Wait for all agents to complete and display their progress: + +``` +Agent Reviews Complete: +✅ 🔒 Security Agent - Found [X] critical, [Y] concerns +✅ 🏗️ Architecture Agent - Found [X] critical, [Y] concerns +✅ 💾 Data Integrity Agent - Found [X] critical, [Y] concerns +✅ 🧪 Testing Agent - Found [X] critical, [Y] concerns +✅ 👤 UX Agent - Found [X] critical, [Y] concerns +``` + +Parse each agent's output and extract: + +- Critical issues (BLOCKING) +- Important concerns +- Suggestions +- Questions for other agents +- Confidence level + +Store these in structured format for the debate rounds. + +--- + +## Stage 3 — Cross-Examination Debate (Round 1) + +Now facilitate the first debate round where agents challenge each other. + +Display: "🗣️ Starting cross-examination debate round..." + +For each of the 5 agents, launch a new Task with their original findings plus all other agents' findings: + +### Debate Prompt Template + +Use the Task tool for each agent with: + +- **description**: "[Agent name] cross-examination" +- **subagent_type**: "general-purpose" +- **model**: (same as original agent) +- **prompt**: + +``` +You are the [Agent Name] in the cross-examination debate phase. + +YOUR ORIGINAL FINDINGS: +[Paste that agent's original review output] + +OTHER AGENTS' FINDINGS: + +🔒 SECURITY AGENT FOUND: +[Security agent's findings] + +🏗️ ARCHITECTURE AGENT FOUND: +[Architecture agent's findings] + +💾 DATA INTEGRITY AGENT FOUND: +[Data Integrity agent's findings] + +🧪 TESTING AGENT FOUND: +[Testing agent's findings] + +👤 UX AGENT FOUND: +[UX agent's findings] + +MISSION: Review other agents' findings from your specialized perspective. + +DEBATE ACTIONS: +1. **CHALLENGE** - Disagree with a finding (max 3 challenges) +2. **SUPPORT** - Strongly agree and add context +3. **EXPAND** - Build on a finding with additional concerns +4. 
**QUESTION** - Ask for clarification + +RULES: +- Maximum 3 challenges (focus on important disagreements) +- Provide specific reasoning and evidence +- Reference file:line when possible +- Be constructive, not combative + +OUTPUT FORMAT: + +## [Agent Name] - Cross-Examination + +### Challenges +- **Challenge to [Agent X] re: [finding]** + - Why I disagree: [reasoning] + - Evidence: [supporting evidence] + - Revised view: [your assessment] + +### Strong Support +- **Support for [Agent X] re: [finding]** + - Additional context: [your perspective] + - Added concerns: [related issues] + +### Expansions +- **Building on [Agent X]'s [topic]**: + - [Your additional concerns] + +### Questions +- **To [Agent X]**: [question] + - Why asking: [reason] + +### Summary +- Challenges: [N] +- Supports: [N] +- Key disagreements: [main contentions] +``` + +Launch all 5 debate agents in parallel. + +Display progress: + +``` +✅ All agents engaged in cross-examination +⏳ Waiting for debate round 1 to complete... +``` + +--- + +## Stage 4 — Rebuttals (Debate Round 2) + +Collect all challenges from Stage 3 and give each original agent a chance to respond. + +Display: "🔄 Starting rebuttal round..." + +For each agent that received challenges: + +Use the Task tool with: + +- **description**: "[Agent name] rebuttal" +- **subagent_type**: "general-purpose" +- **model**: (same as original) +- **prompt**: + +``` +You are the [Agent Name] responding to challenges from debate round 1. + +YOUR ORIGINAL FINDINGS: +[Their original findings] + +CHALLENGES RAISED AGAINST YOU: + +[List each challenge with the challenging agent's name and reasoning] + +MISSION: Respond to each challenge. + +RESPONSE OPTIONS: +1. **DEFEND** - Additional evidence supports your finding +2. **CONCEDE** - Acknowledge challenge, downgrade/remove finding +3. **REVISE** - Update finding based on new perspective +4. 
**ESCALATE** - Flag as unresolved, needs human senior review + +OUTPUT FORMAT: + +## [Agent Name] - Rebuttals + +### Response to Challenge #1 from [Agent] +- Decision: DEFEND/CONCEDE/REVISE/ESCALATE +- Reasoning: [explanation] +- Updated Finding (if revised): + - Severity: Critical/Important/Suggestion + - Description: [updated] + +### Response to Challenge #2 +[Same format] + +### Summary +- Defended: [N] +- Conceded: [N] +- Revised: [N] +- Escalated: [N] +``` + +Launch rebuttal tasks for all challenged agents. + +Display: + +``` +✅ Rebuttal round complete +📊 Synthesizing consensus... +``` + +--- + +## Stage 5 — Consensus Synthesis + +Now analyze all findings, debates, and resolutions to build consensus. + +**Process:** + +1. Collect all final findings (original + revised from rebuttals) +2. Group by similarity (same file:line or same general issue) +3. Count agent agreement for each finding +4. Classify by consensus level + +**Consensus Levels:** + +- **Unanimous** (4-5 agents agree) → BLOCKING / HIGH PRIORITY +- **Majority** (3 agents agree) → IMPORTANT / MEDIUM PRIORITY +- **Minority** (1-2 agents) → SUGGESTION / LOW PRIORITY +- **Unresolved Debate** (agents couldn't agree) → NEEDS HUMAN REVIEW + +For each grouped finding, determine: + +- Final severity: BLOCKING / IMPORTANT / SUGGESTION +- Which agents flagged it +- Debate summary (if there was disagreement) +- Consensus strength + +Display a summary: + +``` +📊 Consensus Analysis: +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ + +Blocking Issues (Unanimous): [N] +Important Concerns (Majority): [N] +Suggestions (Minority): [N] +Unresolved Debates: [N] + +Total Findings: [N] +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ +``` + +--- + +## Stage 6 — Generate Review Report + +Create the comprehensive review report in markdown format: + +```markdown +# 🤖 Multi-Agent Code Review Report + +**Generated**: [timestamp] +**Agents**: 5 specialized reviewers with debate rounds + 
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ + +## 📊 RISK ASSESSMENT + +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ + +**Risk Score**: [X]/10 - [LOW/MEDIUM/HIGH/CRITICAL] +**Day**: [day of week] +**Files Changed**: [N] (+[X] -[Y] lines) + +**Risk Factors**: +[List specific factors detected] + +**Required Reviewer**: [JUNIOR/MID | MID/SENIOR | SENIOR (Caleb Cox)] + +[IF FRIDAY/WEEKEND] +⚠️ **[DAY] DEPLOYMENT WARNING** +[Appropriate warning based on risk score] + +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ + +## 🚫 BLOCKING ISSUES + +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ + +**Must be fixed before merge** (Unanimous: 4-5 agents) + +[FOR EACH BLOCKING ISSUE:] + +### [Issue Title] + +**File**: `[file:line]` +**Flagged by**: [Agent 1, Agent 2, Agent 3, ...] + +**Problem**: +[Detailed description from consensus] + +**Agent Perspectives**: + +- **[Agent 1]**: [Their specific concern] +- **[Agent 2]**: [Their specific concern] + +**Debate Summary**: + +- [Summary of any challenges and resolutions] +- Final consensus: BLOCKING + +**Required Action**: +[Specific steps to fix] + +--- + +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ + +## ⚠️ IMPORTANT CONCERNS + +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ + +**Should be addressed before merge** (Majority: 3 agents) + +[FOR EACH IMPORTANT CONCERN:] + +### [Concern Title] + +**File**: `[file:line]` +**Flagged by**: [Agents] + +**Issue**: +[Description] + +**Debate Summary**: + +- [Summary of debate if any] +- Recommendation: Fix before merge + +**Suggested Action**: +[How to address] + +--- + +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ + +## 💡 SUGGESTIONS + +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ + +**Nice-to-have improvements** (Minority: 1-2 agents) + +[List suggestions by category] + +--- + +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ + +## 🤔 UNRESOLVED DEBATES + 
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ + +**Requires senior developer judgment** + +[FOR EACH UNRESOLVED DEBATE:] + +### [Debate Topic] + +**Context**: [What the debate is about] + +**Positions**: + +**[Agent 1 argues]**: +[Their position with reasoning] + +**[Agent 2 counters]**: +[Their counter-position] + +**Other agents**: + +- [Agent 3]: [Position] +- [Agent 4]: [Position] + +**Why needs human review**: +[Explanation] + +**Recommendation**: +Senior developer (Caleb Cox) should decide based on [considerations] + +--- + +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ + +## 📝 REVIEW SUMMARY + +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ + +| Agent | Critical | Important | Suggestions | Confidence | +| ----------------- | -------- | --------- | ----------- | ---------- | +| 🔒 Security | [N] | [N] | [N] | [H/M/L] | +| 🏗️ Architecture | [N] | [N] | [N] | [H/M/L] | +| 💾 Data Integrity | [N] | [N] | [N] | [H/M/L] | +| 🧪 Testing | [N] | [N] | [N] | [H/M/L] | +| 👤 UX | [N] | [N] | [N] | [H/M/L] | +| **Total** | **[N]** | **[N]** | **[N]** | - | + +**Debate Statistics**: + +- Total challenges raised: [N] +- Challenges defended: [N] +- Challenges conceded: [N] +- Findings revised: [N] +- Escalated to human: [N] + +--- + +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ + +## 🎯 RECOMMENDED NEXT STEPS + +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ + +**Immediate Actions** (Blockers): +[FOR EACH BLOCKING ISSUE:] + +- [ ] Fix [issue] at [file:line] + +**Important Actions** (Before merge): +[FOR EACH IMPORTANT CONCERN:] + +- [ ] Address [concern] at [file:line] + +**Human Review Needed**: +[FOR EACH UNRESOLVED DEBATE:] + +- [ ] Senior developer to resolve: [debate topic] + +**Optional Improvements**: +[FOR EACH SUGGESTION:] + +- Consider [suggestion] + +--- + +
+📋 Full Agent Reports (click to expand) + +## 🔒 Security Agent - Complete Report + +[Full original report] + +## 🏗️ Architecture Agent - Complete Report + +[Full original report] + +## 💾 Data Integrity Agent - Complete Report + +[Full original report] + +## 🧪 Testing & Quality Agent - Complete Report + +[Full original report] + +## 👤 UX Agent - Complete Report + +[Full original report] + +
+ +--- + +_🤖 Generated by MPDX Multi-Agent Review System_ +_Review time: [X] minutes | Agents: Security, Architecture, Data, Testing, UX_ +``` + +Save this to `/tmp/agent_review_report.md` + +--- + +## Stage 7 — Post to GitHub (Optional) + +Ask the user: + +``` +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ +✅ MULTI-AGENT REVIEW COMPLETE +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ + +Found: +• [N] BLOCKING issues (unanimous) +• [N] IMPORTANT concerns (majority) +• [N] Suggestions (minority) +• [N] Unresolved debates (needs senior review) + +Review report saved to: /tmp/agent_review_report.md + +Would you like to post this review to GitHub? + +1. Post full review (all findings + debates + agent reports) +2. Post summary only (blocking + important + recommendations) +3. Don't post (review locally only) +4. Show me the report first + +Please respond: 1, 2, 3, or 4 +``` + +If user chooses 1 (full review): + +```bash +PR_NUM=$(gh pr view --json number -q .number 2>/dev/null) +if [ -n "$PR_NUM" ]; then + gh pr comment $PR_NUM --body-file /tmp/agent_review_report.md + echo "✅ Full review posted to PR #$PR_NUM" +else + echo "❌ No PR found. Run this from a PR branch or use 'gh pr view' to check." +fi +``` + +If user chooses 2 (summary only): +Create a condensed version with just: + +- Risk assessment +- Blocking issues +- Important concerns +- Unresolved debates +- Recommended next steps + +```bash +# Extract summary sections and post +PR_NUM=$(gh pr view --json number -q .number 2>/dev/null) +if [ -n "$PR_NUM" ]; then + # Create summary version + gh pr comment $PR_NUM --body-file /tmp/agent_review_summary.md + echo "✅ Summary posted to PR #$PR_NUM" +fi +``` + +If user chooses 3 (don't post): + +``` +Review complete! 
Report available at: /tmp/agent_review_report.md + +You can: +- Read it with: cat /tmp/agent_review_report.md +- Post later with: gh pr comment [PR#] --body-file /tmp/agent_review_report.md +``` + +If user chooses 4 (show report): +Display the full report in the terminal, then re-ask the posting question. + +--- + +## Summary + +Display final summary: + +``` +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ +🎉 MULTI-AGENT REVIEW SESSION COMPLETE +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ + +**Review Statistics**: +- Agents deployed: 5 +- Debate rounds: 2 +- Total findings: [N] +- Consensus rate: [X]% +- Review time: [X] minutes + +**Key Outcomes**: +✅ [N] blocking issues identified +✅ [N] important concerns flagged +✅ [N] suggestions for improvement +⚠️ [N] unresolved debates (need senior review) + +**Next Steps**: +1. Address blocking issues before merge +2. Review important concerns +3. Get senior input on unresolved debates +4. Consider suggestions for code quality + +Thank you for using Multi-Agent Code Review! 
🤖 +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ +``` + +--- + +## Notes + +**Performance**: + +- Stage 1 agents run in parallel (~2-3 min) +- Stage 3 debate runs in parallel (~1-2 min) +- Stage 4 rebuttals run in parallel (~1 min) +- Total time: 4-6 minutes typical PR + +**Cost Estimation**: + +- 5 agents × 2 rounds = 10 major LLM calls +- Plus orchestrator synthesis +- Estimated: $0.80-$2.50 per review +- Cost varies with PR size and model selection + +**Model Configuration**: + +- Security: Sonnet (deep reasoning needed) +- Architecture: Sonnet (system thinking needed) +- Data: Sonnet (precision needed) +- Testing: Haiku (faster, cost-effective) +- UX: Haiku (faster, cost-effective) + +**Limitations**: + +- Agents can't see full codebase (only diff) +- May miss context from related files +- Debate reduces but doesn't eliminate false positives +- Unresolved debates still need human judgment + +--- + +_Multi-Agent Code Review System v1.0_ +_See `.claude/AGENT_BASED_CODE_REVIEW.md` for full documentation_ From ce37c250a601d71069a5122186d176c409b67427 Mon Sep 17 00:00:00 2001 From: Daniel Bisgrove Date: Thu, 26 Feb 2026 15:58:10 -0500 Subject: [PATCH 3/5] use opus --- .claude/commands/agent-review.md | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/.claude/commands/agent-review.md b/.claude/commands/agent-review.md index 685a8c9cbe..8d5289b0d8 100644 --- a/.claude/commands/agent-review.md +++ b/.claude/commands/agent-review.md @@ -143,7 +143,7 @@ Use the Task tool with: - **description**: "Security code review" - **subagent_type**: "general-purpose" -- **model**: "sonnet" +- **model**: "opus" - **prompt**: ``` @@ -208,7 +208,7 @@ Use the Task tool with: - **description**: "Architecture code review" - **subagent_type**: "general-purpose" -- **model**: "sonnet" +- **model**: "opus" - **prompt**: ``` @@ -277,7 +277,7 @@ Use the Task tool with: - **description**: "Data integrity review" - **subagent_type**: 
"general-purpose" -- **model**: "sonnet" +- **model**: "opus" - **prompt**: ``` @@ -347,7 +347,7 @@ Use the Task tool with: - **description**: "Testing and quality review" - **subagent_type**: "general-purpose" -- **model**: "haiku" +- **model**: "opus" - **prompt**: ``` @@ -422,7 +422,7 @@ Use the Task tool with: - **description**: "UX and accessibility review" - **subagent_type**: "general-purpose" -- **model**: "haiku" +- **model**: "opus" - **prompt**: ``` @@ -1079,11 +1079,11 @@ Thank you for using Multi-Agent Code Review! 🤖 **Model Configuration**: -- Security: Sonnet (deep reasoning needed) -- Architecture: Sonnet (system thinking needed) -- Data: Sonnet (precision needed) -- Testing: Haiku (faster, cost-effective) -- UX: Haiku (faster, cost-effective) +- Security: Opus (highest quality reasoning needed) +- Architecture: Opus (deepest system thinking needed) +- Data: Opus (maximum precision needed) +- Testing: Opus (comprehensive analysis) +- UX: Opus (thorough accessibility review) **Limitations**: From 2819eb108a742029ae99f4acb291c292cd67465b Mon Sep 17 00:00:00 2001 From: Daniel Bisgrove Date: Thu, 26 Feb 2026 17:45:00 -0500 Subject: [PATCH 4/5] better reviews --- .claude/commands/agent-review.md | 1694 ++++++++++++++++++++++++------ 1 file changed, 1387 insertions(+), 307 deletions(-) diff --git a/.claude/commands/agent-review.md b/.claude/commands/agent-review.md index 8d5289b0d8..5b264c2a63 100644 --- a/.claude/commands/agent-review.md +++ b/.claude/commands/agent-review.md @@ -1,15 +1,82 @@ --- name: agent-review -description: Multi-agent PR review with debate and consensus +description: Multi-agent PR review with smart selection and automated fixes approve_tools: - Bash(gh:*) --- -# Multi-Agent PR Code Review +# Multi-Agent PR Code Review v3.0 🚀 -This command spawns 5 specialized review agents that independently analyze the PR, debate their findings, and post a consensus review to GitHub. 
+AI-powered code review with smart agent selection, automated fixes, and quality metrics. -**Operating in multi-agent review mode with debate rounds.** +**💰 COST**: +- `/agent-review quick` - $0.50 (2 min, 3 agents, Haiku) +- `/agent-review` or `/agent-review standard` - $2-4 (5 min, smart selection) +- `/agent-review deep` - $6-10 (10 min, all 7 agents, Opus) + +**Usage**: +```bash +/agent-review # Standard mode (smart selection, recommended) +/agent-review quick # Quick feedback for simple PRs +/agent-review deep # Comprehensive analysis for critical changes +``` + +--- + +## Stage 0A — Parse Review Mode & Initialize + +### Determine Review Mode + +Check command argument to determine mode: + +```bash +# Get mode from argument (default to standard) +MODE="${1:-standard}" + +echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━" +case "$MODE" in + quick) + echo "🏃 QUICK REVIEW MODE" + echo "• 3 agents (Testing, Standards, UX)" + echo "• Model: Haiku (fast, cost-effective)" + echo "• Time: ~2 minutes" + echo "• Cost: ~$0.50" + MODEL="haiku" + AGENT_MODE="quick" + ;; + deep) + echo "🔬 DEEP REVIEW MODE" + echo "• All 7 agents" + echo "• Model: Opus (maximum quality)" + echo "• Time: ~10 minutes" + echo "• Cost: ~$6-10" + MODEL="opus" + AGENT_MODE="deep" + ;; + standard|*) + echo "⚡ STANDARD REVIEW MODE (Recommended)" + echo "• Smart agent selection based on changes" + echo "• Model: Sonnet/Opus (balanced)" + echo "• Time: ~5 minutes" + echo "• Cost: ~$2-4" + MODEL="smart" + AGENT_MODE="standard" + ;; +esac +echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━" +echo "" +``` + +### Initialize Directories + +```bash +# Create directories for metrics and fixes +mkdir -p .claude/review-history +mkdir -p .claude/review-metrics +mkdir -p .claude/pr-metrics +mkdir -p /tmp/automated_fixes +mkdir -p /tmp/dependency_analysis +``` --- @@ -23,7 +90,7 @@ First, get all the PR information we need: # Check if we're in a PR branch gh pr view --json 
number,title,baseRefName,headRefName,additions,deletions,changedFiles 2>/dev/null || echo "Not in a PR branch, using main as base" -# Get the day of week for Friday warnings +# Get the day of week for reviewer recommendations DAY_OF_WEEK=$(date +%A) echo "Today is: $DAY_OF_WEEK" ``` @@ -36,32 +103,40 @@ HEAD_REF=$(gh pr view --json headRefOid -q .headRefOid 2>/dev/null) if [ -n "$BASE_REF" ] && [ -n "$HEAD_REF" ]; then git diff $BASE_REF..$HEAD_REF --name-only > /tmp/changed_files.txt - git diff $BASE_REF..$HEAD_REF --stat + git diff $BASE_REF..$HEAD_REF --stat > /tmp/diff_stat.txt git diff $BASE_REF..$HEAD_REF > /tmp/pr_diff.txt else # Fallback to comparing against main BASE_COMMIT=$(git merge-base HEAD main) git diff $BASE_COMMIT..HEAD --name-only > /tmp/changed_files.txt - git diff $BASE_COMMIT..HEAD --stat + git diff $BASE_COMMIT..HEAD --stat > /tmp/diff_stat.txt git diff $BASE_COMMIT..HEAD > /tmp/pr_diff.txt fi + +# Get full content of changed files for agents to read +mkdir -p /tmp/changed_file_contents +while IFS= read -r file; do + if [ -f "$file" ]; then + cp "$file" "/tmp/changed_file_contents/$(basename "$file")" + fi +done < /tmp/changed_files.txt ``` ### Read Project Standards -Read `.claude/CLAUDE.md` to understand the project's coding standards and conventions. This context will be shared with all agents. +Read `CLAUDE.md` to understand the project's coding standards and conventions. This context will be shared with all agents. ### Calculate Risk Score -Now calculate the initial risk score using the algorithm from the existing `/pr-review` command: +Now calculate the risk score with improved algorithm: **Process:** 1. Read the list of changed files from `/tmp/changed_files.txt` -2. Count lines changed from the diff stat +2. Count lines changed from `/tmp/diff_stat.txt` 3. 
Apply the risk scoring algorithm: -**Critical File Patterns (+3 points each):** +**Critical File Patterns (+4 points each):** - `pages/api/auth/[...nextauth].page.ts` - `pages/api/auth/helpers.ts` @@ -73,69 +148,191 @@ Now calculate the initial risk score using the algorithm from the existing `/pr- - `src/lib/apollo/cache.ts` - `next.config.ts` - `.env` files +- Database migrations +- Payment processing code -**High-Risk Patterns (+2 points each):** +**High-Risk Patterns (+3 points each):** - `pages/api/Schema/**/resolvers.ts` - `**/*.graphql` (excluding tests) -- Financial/donation code +- Financial/donation code (`**/Donation**`, `**/Pledge**`, `**/Gift**`) - Organization management -- Shared components +- Shared components (`src/components/Shared/**`) +- Authentication flows +- Data synchronization code -**Medium-Risk (+1 point each):** +**Medium-Risk (+2 points each):** - Main app pages - Custom hooks -- Utility functions +- Utility functions with business logic +- Report generation +- Export/import features + +**Low-Risk (+1 point each):** -**Change Volume:** +- UI-only components +- Styling changes +- Test files +- Documentation + +**Change Volume Multiplier:** - <50 lines: +0 - 50-200 lines: +1 - 200-500 lines: +2 -- 500+ lines: +3 +- 500-1000 lines: +3 +- 1000+ lines: +4 **Scope Multiplier:** -- Single domain: 1.0x -- Multiple domains: 1.3x -- Cross-cutting: 1.7x +- Single file: 1.0x +- Single feature area: 1.0x +- Multiple related features: 1.3x +- Cross-cutting changes: 1.7x +- Core infrastructure: 2.0x -Calculate the final risk score (0-10) and determine the required reviewer level based on day of week. 
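The point tiers, volume bonus, and scope multiplier above can be combined as in this sketch (the hit counts and scope value are illustrative placeholders, not the real implementation; the command derives them by grepping `/tmp/changed_files.txt` and `/tmp/diff_stat.txt`):

```bash
#!/usr/bin/env bash
# Illustrative scoring sketch: pattern-tier points + volume bonus, then scope multiplier.
# Hit counts below are placeholders; the real command counts matches per pattern tier.
critical_hits=2  # +4 each (auth config, .env, migrations, ...)
high_hits=1      # +3 each (resolvers, shared components, ...)
medium_hits=3    # +2 each (app pages, hooks, ...)
low_hits=0       # +1 each (styling, docs, tests)
lines_changed=260

score=$(( critical_hits * 4 + high_hits * 3 + medium_hits * 2 + low_hits ))

# Change-volume bonus
if   [ "$lines_changed" -ge 1000 ]; then score=$(( score + 4 ))
elif [ "$lines_changed" -ge 500 ];  then score=$(( score + 3 ))
elif [ "$lines_changed" -ge 200 ];  then score=$(( score + 2 ))
elif [ "$lines_changed" -ge 50 ];   then score=$(( score + 1 ))
fi

# Scope multiplier, scaled by 10 to stay in integer arithmetic (1.7x -> 17)
scope_x10=17  # cross-cutting changes
score=$(( score * scope_x10 / 10 ))

if   [ "$score" -ge 10 ]; then level="CRITICAL"
elif [ "$score" -ge 7 ];  then level="HIGH"
elif [ "$score" -ge 4 ];  then level="MEDIUM"
else                            level="LOW"
fi
echo "Risk score: $score -> $level"
```

The multiplier is applied with integer math (scope scaled by 10) so the sketch runs in plain bash without `bc`.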
+**Final Risk Level Classification:** -Display a summary: +- 0-3 points: **LOW** → Entry-level+ can review +- 4-6 points: **MEDIUM** → Entry-level+ can review +- 7-9 points: **HIGH** → Experienced dev+ should review +- 10+ points: **CRITICAL** → Senior dev (Caleb Cox) must review + +Calculate and display the summary: ``` ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 📊 PR RISK ASSESSMENT ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ -Risk Score: [X]/10 +Risk Score: [X]/[max] Risk Level: [LOW | MEDIUM | HIGH | CRITICAL] Day: [DAY_OF_WEEK] Files Changed: [N] Lines Changed: +[X] -[Y] -Risk Factors: -• [List detected risk factors] +Risk Factors Detected: +• [List specific risk factors found] -Required Reviewer: [JUNIOR/MID | MID/SENIOR | SENIOR (Caleb Cox)] +Required Reviewer Level: +[LOW/MEDIUM]: ✅ Entry-level or above can review +[HIGH]: ⚠️ Experienced developer or above should review +[CRITICAL]: 🚨 Senior developer (Caleb Cox) must review -[IF FRIDAY: Display Friday warning based on risk level] +💰 Estimated Review Cost: $[X.XX] (using Opus for all agents) + +[IF FRIDAY/WEEKEND: Display appropriate warning based on risk level] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ ``` --- +## Stage 0B — Smart Agent Selection (Standard Mode Only) + +If `AGENT_MODE="standard"`, analyze which agents are actually needed: + +```bash +if [ "$AGENT_MODE" = "standard" ]; then + echo "🤖 Analyzing PR to select relevant agents..." 
+ echo "" + + # Initialize agent list + SELECTED_AGENTS=() + + # Always include these + SELECTED_AGENTS+=("Architecture" "Testing" "Standards") + echo "✅ Architecture Agent - Always included" + echo "✅ Testing Agent - Always included" + echo "✅ Standards Agent - Always included" + + # Security Agent - if auth/API code changed + if grep -q -E "(pages/api/auth|session|jwt|impersonate|authentication)" /tmp/changed_files.txt 2>/dev/null; then + SELECTED_AGENTS+=("Security") + echo "✅ Security Agent - Auth/API code detected" + SECURITY_NEEDED=true + else + echo "❌ Security Agent - No auth/API changes (saved ~\$1.50)" + SECURITY_NEEDED=false + fi + + # Data Integrity Agent - if GraphQL or Apollo changes + if grep -q -E "(\.graphql|apollo|src/lib/apollo)" /tmp/changed_files.txt 2>/dev/null; then + SELECTED_AGENTS+=("Data") + echo "✅ Data Integrity Agent - GraphQL/Apollo changes detected" + DATA_NEEDED=true + else + echo "❌ Data Integrity Agent - No GraphQL changes (saved ~\$1.00)" + DATA_NEEDED=false + fi + + # UX Agent - if UI components changed + if grep -q -E "(\.tsx|components/.*\.tsx)" /tmp/changed_files.txt 2>/dev/null; then + SELECTED_AGENTS+=("UX") + echo "✅ UX Agent - UI components modified" + UX_NEEDED=true + else + echo "❌ UX Agent - No UI changes (saved ~\$1.00)" + UX_NEEDED=false + fi + + # Financial Agent - if financial code changed + if grep -q -iE "(donation|pledge|gift|amount|currency|balance|financial)" /tmp/pr_diff.txt 2>/dev/null; then + SELECTED_AGENTS+=("Financial") + echo "✅ Financial Agent - Financial code detected" + FINANCIAL_NEEDED=true + else + echo "❌ Financial Agent - No financial code (saved ~\$1.50)" + FINANCIAL_NEEDED=false + fi + + echo "" + echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━" + echo "Selected: ${#SELECTED_AGENTS[@]} of 7 agents" + SAVED_COST=$(( (7 - ${#SELECTED_AGENTS[@]}) * 1 )) + echo "Estimated savings: ~\$$SAVED_COST" + echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━" + echo "" + + 
# Save for later stages + echo "${SELECTED_AGENTS[@]}" > /tmp/selected_agents.txt +elif [ "$AGENT_MODE" = "quick" ]; then + # Quick mode: only 3 agents + echo "Testing UX Standards" > /tmp/selected_agents.txt + SECURITY_NEEDED=false + DATA_NEEDED=false + UX_NEEDED=true + FINANCIAL_NEEDED=false +elif [ "$AGENT_MODE" = "deep" ]; then + # Deep mode: all 7 agents + echo "Security Architecture Data Testing UX Financial Standards" > /tmp/selected_agents.txt + SECURITY_NEEDED=true + DATA_NEEDED=true + UX_NEEDED=true + FINANCIAL_NEEDED=true +fi +``` + +--- + ## Stage 1 — Launch Specialized Review Agents (Parallel) -Now launch all 5 specialized review agents in parallel using the Task tool. +Now launch the selected review agents in parallel using the Task tool. **IMPORTANT:** Use a SINGLE message with multiple Task tool invocations to run them in parallel. -Display: "🚀 Launching 5 specialized review agents in parallel..." +Read `/tmp/selected_agents.txt` to determine which agents to launch. + +Display: "🚀 Launching [N] specialized review agents in parallel..." + +**Note**: Only launch agents that are needed based on the mode and smart selection. Check the variables: +- `$SECURITY_NEEDED` - Launch Security Agent if true +- `$DATA_NEEDED` - Launch Data Integrity Agent if true +- `$UX_NEEDED` - Launch UX Agent if true +- `$FINANCIAL_NEEDED` - Launch Financial Agent if true +- Always launch: Architecture, Testing, Standards (in all modes except quick which uses Testing, UX, Standards) ### Agent 1: Security Review 🔒 @@ -154,9 +351,17 @@ EXPERTISE: Authentication, authorization, data protection, vulnerability detecti MISSION: Review this PR for security vulnerabilities. CONTEXT: -- Risk Score: [calculated score]/10 +- Risk Score: [calculated score]/[max] +- Risk Level: [LOW/MEDIUM/HIGH/CRITICAL] - Day: [day of week] - Changed Files: [N] +- Lines Changed: +[X] -[Y] + +INSTRUCTIONS: +1. Read /tmp/pr_diff.txt for the diff +2. 
Read /tmp/changed_files.txt for the list of changed files +3. For EACH changed file, read the FULL file content (not just the diff) to understand context +4. Search for related security-critical files (auth, API, permissions) CRITICAL FOCUS: - Authentication (pages/api/auth/, session handling) @@ -164,28 +369,32 @@ CRITICAL FOCUS: - API authorization, secrets exposure - Input validation, XSS, SQL injection, CSRF - Cookie security, CORS configuration - -Read the git diff from /tmp/pr_diff.txt and the file list from /tmp/changed_files.txt +- Data access controls (ensure users can only access their own data) OUTPUT FORMAT: ## 🔒 Security Agent Review -### Critical Security Issues (BLOCKING) +### Critical Security Issues (BLOCKING) - Severity: 10/10 [Issues that MUST be fixed - be specific with file:line] - **File:Line** - Issue description - - Risk: High/Critical + - Severity: 10/10 + - Risk: What attack vector this enables - Impact: What could happen - - Fix: How to fix + - Fix: Specific code change needed -### Security Concerns (IMPORTANT) +### Security Concerns (IMPORTANT) - Severity: 6-9/10 [Issues that should be fixed] - **File:Line** - Concern - - Risk: Medium - - Recommendation: Action + - Severity: [6-9]/10 + - Risk: Potential vulnerability + - Recommendation: How to fix -### Security Suggestions +### Security Suggestions - Severity: 3-5/10 [Nice-to-have improvements] +- Improvement suggestion + - Severity: [3-5]/10 + - Benefit: Why this matters ### Questions for Other Agents - **To [Agent]**: Question @@ -194,12 +403,64 @@ OUTPUT FORMAT: - Overall: High/Medium/Low - Areas needing deeper analysis: [list] +CODEBASE CONTEXT SEARCH: +Before flagging an issue, search for how similar code is handled in the codebase: +1. Use Grep tool to find similar patterns +2. Check if this pattern is used consistently +3. Reference existing good examples +4. 
Don't flag patterns used across the codebase + +Example: +- Found: Potential auth bypass +- Search: grep -r "requireAuth" src/ +- Result: Pattern used consistently +- Decision: Check if this file also uses it + +AUTOMATED FIX GENERATION: +When you find fixable security issues, provide automated fixes: + +Format: +### Automated Fix #N: [Issue Title] +**File**: `path/to/file.tsx:42` +**Issue**: [Brief description] +**Fix Type**: auto-fixable +**Confidence**: High/Medium/Low +**Category**: security + +```diff +- [old code with vulnerability] ++ [new code with fix] +``` + +**Apply command**: +```bash +cat > /tmp/automated_fixes/fix_N_security.sh << 'EOF' +#!/bin/bash +# Fix: [description] +# File: path/to/file.tsx + +# [Bash commands to apply fix using sed or other tools] +sed -i '' 's/vulnerable_pattern/secure_pattern/g' path/to/file.tsx +EOF +chmod +x /tmp/automated_fixes/fix_N_security.sh +``` + +FIXABLE SECURITY ISSUES: +- Missing authentication checks +- Exposed sensitive data +- Missing input validation +- Insecure session handling + GUIDELINES: - Be specific with file:line references +- Rate severity on 1-10 scale for consensus - Explain WHY it's a risk, not just WHAT -- Consider MPDX context (donor data, financial info) +- Consider MPDX context (donor data, financial info, PII) - Don't flag if clearly handled elsewhere - Focus on practical risks, not theoretical +- READ THE FULL FILES for context, not just the diff +- Search codebase before flagging to avoid false positives +- Generate automated fixes for simple security improvements ``` ### Agent 2: Architecture Review 🏗️ @@ -219,10 +480,16 @@ EXPERTISE: System design, patterns, technical debt, maintainability, scalability MISSION: Review this PR for architectural concerns and design issues. CONTEXT: -- Risk Score: [calculated score]/10 -- Day: [day of week] +- Risk Score: [calculated score]/[max] +- Risk Level: [LOW/MEDIUM/HIGH/CRITICAL] - Changed Files: [N] +INSTRUCTIONS: +1. 
Read /tmp/pr_diff.txt and /tmp/changed_files.txt +2. Read FULL content of changed files for context +3. Read CLAUDE.md for project patterns +4. Search for usage patterns of modified components/functions + CRITICAL FOCUS: - GraphQL schema design (pages/api/Schema/, .graphql files) - Apollo Client cache (src/lib/apollo/) @@ -230,45 +497,67 @@ CRITICAL FOCUS: - Component architecture, state management - API design, pattern consistency - Technical debt creation/reduction - -Read the git diff from /tmp/pr_diff.txt and the file list from /tmp/changed_files.txt -Also read .claude/CLAUDE.md for project patterns. +- Scalability concerns OUTPUT FORMAT: ## 🏗️ Architecture Agent Review -### Critical Architecture Issues (BLOCKING) +### Critical Architecture Issues (BLOCKING) - Severity: 10/10 - **File:Line** - Issue + - Severity: 10/10 - Problem: What's architecturally wrong - Impact: Long-term consequences - Alternative: Better approach -### Architecture Concerns (IMPORTANT) +### Architecture Concerns (IMPORTANT) - Severity: 6-9/10 - **File:Line** - Concern + - Severity: [6-9]/10 - Issue: Description - Recommendation: How to improve -### Architecture Suggestions -[Better patterns] +### Architecture Suggestions - Severity: 3-5/10 +[Better patterns and approaches] +- Severity: [3-5]/10 -### Technical Debt +### Technical Debt Analysis - Debt Added: [what new debt] - Debt Removed: [what debt fixed] - Net Impact: Better/Worse/Neutral +### Pattern Compliance +- Follows CLAUDE.md standards: Yes/No/Partial +- Violations: [list] + ### Questions for Other Agents - **To [Agent]**: Question ### Confidence - Overall: High/Medium/Low +CODEBASE CONTEXT SEARCH: +Before flagging architectural issues, search for existing patterns: +1. Use Grep to find how similar problems are solved +2. Check if pattern is used consistently across codebase +3. Reference good examples to suggest +4. 
Don't flag patterns that match established architecture + +AUTOMATED FIX GENERATION: +Generate fixes for common architectural issues: +- Inconsistent file naming +- Missing exports +- Improper component structure + +Format same as Security Agent above with category: architecture + GUIDELINES: +- Rate severity on 1-10 scale - Focus on long-term maintainability -- Identify pattern inconsistencies +- Identify pattern inconsistencies vs CLAUDE.md - Consider scalability - Balance pragmatism vs purity -- Reference CLAUDE.md standards +- Search codebase before flagging inconsistencies +- Generate fixes for structural issues ``` ### Agent 3: Data Integrity Review 💾 @@ -288,57 +577,87 @@ EXPERTISE: GraphQL, data flow, caching, type safety, financial accuracy, data co MISSION: Review this PR for data correctness and integrity. CONTEXT: -- Risk Score: [calculated score]/10 -- Day: [day of week] +- Risk Score: [calculated score]/[max] - Changed Files: [N] +INSTRUCTIONS: +1. Read diff and changed files +2. Read FULL files for data flow context +3. Search for related GraphQL operations +4. Check for financial calculation changes + CRITICAL FOCUS: -- GraphQL queries/mutations -- Apollo cache normalization (must have `id` fields) +- GraphQL queries/mutations (check for `id` fields!) +- Apollo cache normalization - Data fetching patterns, pagination -- Financial calculations (donations, pledges, amounts) +- Financial calculations (donations, pledges, amounts) - CRITICAL! 
- Data consistency across updates - Optimistic responses, type safety - Dual GraphQL server architecture - -Read the git diff from /tmp/pr_diff.txt and the file list from /tmp/changed_files.txt +- Currency handling, rounding OUTPUT FORMAT: ## 💾 Data Integrity Agent Review -### Critical Data Issues (BLOCKING) +### Critical Data Issues (BLOCKING) - Severity: 10/10 - **File:Line** - Issue + - Severity: 10/10 - Problem: Data integrity concern - Impact: What could go wrong - Fix: Required action -### Data Concerns (IMPORTANT) +### Data Concerns (IMPORTANT) - Severity: 6-9/10 - **File:Line** - Concern + - Severity: [6-9]/10 - Issue: Description - Recommendation: Fix -### Data Suggestions +### Data Suggestions - Severity: 3-5/10 [Better data handling] -### GraphQL Specific +### GraphQL Specific Checks - Missing `id` fields: [list] - Cache policy issues: [concerns] -- Fragment reuse: [opportunities] +- Fragment reuse opportunities: [list] +- Pagination properly handled: Yes/No + +### Financial Accuracy Review +- Financial calculations reviewed: Yes/No/N/A +- Currency handling correct: Yes/No/N/A +- Rounding issues: None/[issues] ### Questions for Other Agents - **To [Agent]**: Question ### Confidence - Overall: High/Medium/Low -- Financial review: Reviewed/Not Applicable +- Financial review confidence: High/Medium/Low/N/A + +CODEBASE CONTEXT SEARCH: +Search for GraphQL patterns before flagging: +1. Find similar queries/mutations +2. Check if id fields are consistently used +3. Look for established cache update patterns +4. 
Reference good examples + +AUTOMATED FIX GENERATION: +Generate fixes for data integrity issues: +- Missing id fields in GraphQL queries +- Missing __typename fields +- Incorrect cache updates +- Type inconsistencies + +Category: graphql or data-integrity GUIDELINES: -- Financial accuracy is CRITICAL -- Check cache updates -- Verify pagination +- Financial accuracy is CRITICAL - flag ANY doubts +- Check cache updates match data changes +- Verify pagination logic - Ensure type safety - Consider data consistency +- Search for similar GraphQL patterns before flagging +- Generate fixes for missing fields ``` ### Agent 4: Testing & Quality Review 🧪 @@ -358,10 +677,15 @@ EXPERTISE: Test coverage, test quality, edge cases, error handling, code quality MISSION: Review this PR for testing adequacy and code quality. CONTEXT: -- Risk Score: [calculated score]/10 -- Day: [day of week] +- Risk Score: [calculated score]/[max] - Changed Files: [N] +INSTRUCTIONS: +1. Read diff and changed files +2. For each modified component/function, check if tests exist +3. Read test files to assess quality +4. 
Search for error handling patterns + CRITICAL FOCUS: - Test coverage for new code - Test quality and maintainability @@ -370,37 +694,39 @@ CRITICAL FOCUS: - Mock data usage (prefer shared mockData) - Type safety (avoid `any` types) - Code quality (unused imports, console.logs) - -Read the git diff from /tmp/pr_diff.txt and the file list from /tmp/changed_files.txt +- Error boundaries and fallbacks OUTPUT FORMAT: ## 🧪 Testing & Quality Agent Review -### Critical Testing Gaps (BLOCKING) +### Critical Testing Gaps (BLOCKING) - Severity: 10/10 - **File:Line** - Gap + - Severity: 10/10 - Missing: What's not tested - Risk: Why critical - Required: What tests to add -### Testing Concerns (IMPORTANT) +### Testing Concerns (IMPORTANT) - Severity: 6-9/10 - **File:Line** - Concern + - Severity: [6-9]/10 - Issue: Description - Recommendation: Improvement -### Code Quality Issues -- Unused imports: [list] -- Console.logs: [list] -- Type safety: [any types] -- Other: [issues] +### Code Quality Issues - Severity: varies +- Unused imports: [list with file:line] +- Console.logs: [list with file:line] +- Type safety issues (`any` types): [list] +- Other issues: [list] -### Testing Suggestions +### Testing Suggestions - Severity: 3-5/10 [Improvements] ### Coverage Assessment - New code tested: Yes/Partial/No -- Edge cases: [what's covered] -- Missing tests: [critical gaps] +- Edge cases covered: [list what's covered] +- Error handling tested: Yes/Partial/No +- Missing critical tests: [list] ### Questions for Other Agents - **To [Agent]**: Question @@ -408,12 +734,31 @@ OUTPUT FORMAT: ### Confidence - Overall: High/Medium/Low +CODEBASE CONTEXT SEARCH: +Search for testing patterns: +1. Find similar component tests +2. Check how mocks are typically structured +3. Look for established test patterns +4. 
Reference good test examples + +AUTOMATED FIX GENERATION: +Generate fixes for common testing issues: +- Unused imports (can be auto-removed) +- Console.logs in production code +- Missing test skeletons (generate basic structure) +- Type issues (add explicit types) + +Categories: imports, tests, types, code-quality + GUIDELINES: -- Critical paths MUST have tests -- Don't require tests for trivial code -- Focus on edge cases and errors +- Critical business logic MUST have tests +- Don't require tests for trivial UI-only changes +- Focus on edge cases and error paths - Check test quality, not just existence - Verify mocks are realistic +- Flag console.logs in non-test code +- Generate fixes for code quality issues +- Provide test skeleton templates ``` ### Agent 5: UX Review 👤 @@ -433,51 +778,59 @@ EXPERTISE: UI/UX, accessibility, performance, localization, user-facing concerns MISSION: Review this PR for user experience and usability. CONTEXT: -- Risk Score: [calculated score]/10 -- Day: [day of week] +- Risk Score: [calculated score]/[max] - Changed Files: [N] +INSTRUCTIONS: +1. Read diff and full changed files +2. Look for user-facing changes +3. Check for localization compliance +4. 
Review loading/error states + CRITICAL FOCUS: - Component usability, intuitiveness -- Loading states (must show for async) +- Loading states (MUST show for async operations) - Error messages (user-friendly, localized) -- Accessibility (ARIA, keyboard nav) -- Performance (re-renders, heavy calculations) -- Localization (all text uses `t()` function) +- Accessibility (ARIA, keyboard nav, screen readers) +- Performance (re-renders, heavy calculations, useMemo) +- Localization (ALL user-visible text uses `t()` function) - Responsive design - Form validation, error display - -Read the git diff from /tmp/pr_diff.txt and the file list from /tmp/changed_files.txt +- Empty states, null handling OUTPUT FORMAT: ## 👤 UX Agent Review -### Critical UX Issues (BLOCKING) +### Critical UX Issues (BLOCKING) - Severity: 10/10 - **File:Line** - Issue + - Severity: 10/10 - Problem: UX concern - User Impact: How affects users - Fix: Required action -### UX Concerns (IMPORTANT) +### UX Concerns (IMPORTANT) - Severity: 6-9/10 - **File:Line** - Concern + - Severity: [6-9]/10 - Issue: Description - Recommendation: Improvement ### Accessibility Issues -- Missing ARIA: [list] -- Keyboard nav: [issues] -- Screen reader: [concerns] +- Missing ARIA labels: [list with file:line] +- Keyboard navigation: [issues] +- Screen reader support: [concerns] +- Color contrast: [issues] ### Localization Issues -- Hardcoded strings: [not using t()] -- Missing translations: [keys] +- Hardcoded strings (not using t()): [list with file:line] +- Missing translation keys: [list] ### Performance Concerns -- Re-render issues: [list] -- Heavy calculations: [list] +- Unnecessary re-renders: [list] +- Missing useMemo/useCallback: [list] +- Heavy calculations in render: [list] -### UX Suggestions +### UX Suggestions - Severity: 3-5/10 [Improvements] ### Questions for Other Agents @@ -486,19 +839,318 @@ OUTPUT FORMAT: ### Confidence - Overall: High/Medium/Low +CODEBASE CONTEXT SEARCH: +Search for UX patterns: +1. 
Find similar components for UX patterns +2. Check how loading states are typically shown +3. Look for localization patterns +4. Reference accessible components + +AUTOMATED FIX GENERATION: +Generate fixes for UX issues: +- Missing localization (wrap in t()) +- Hardcoded strings +- Missing ARIA labels +- Simple accessibility improvements + +Category: localization, accessibility, ux + GUIDELINES: - Put yourself in user's shoes - Consider error scenarios -- Check text is localized -- Verify loading states exist +- ALL user-visible text MUST use t() +- Verify loading states exist for async - Consider accessibility +- Think about mobile users +- Generate automated localization fixes +- Provide ARIA attribute additions +``` + +### Agent 6: Financial Accuracy Review 💰 + +Use the Task tool with: + +- **description**: "Financial accuracy review" +- **subagent_type**: "general-purpose" +- **model**: "opus" +- **prompt**: + +``` +You are the Financial Accuracy Review Agent for MPDX code review. + +EXPERTISE: Financial calculations, currency handling, donation tracking, pledge management, monetary accuracy. + +MISSION: Review this PR for financial calculation accuracy and monetary data integrity. + +CONTEXT: +- Risk Score: [calculated score]/[max] +- Changed Files: [N] + +INSTRUCTIONS: +1. Read diff and full changed files +2. Search for financial-related terms: donation, pledge, gift, amount, currency, balance, total, payment +3. Look for arithmetic operations on monetary values +4. Check for currency conversion + +CRITICAL FOCUS: +- Donation amount calculations +- Pledge tracking accuracy +- Gift processing +- Currency handling (USD, CAD, etc.) 
+- Rounding (financial rounding to 2 decimals) +- Sum/aggregate calculations +- Balance calculations +- Tax calculations +- Report totals +- Data type precision (use Decimal, not Float) + +OUTPUT FORMAT: + +## 💰 Financial Accuracy Agent Review + +### Critical Financial Issues (BLOCKING) - Severity: 10/10 +[These MUST be fixed - money errors are unacceptable] +- **File:Line** - Issue + - Severity: 10/10 + - Problem: Financial calculation error + - Impact: Incorrect donor data / financial reports + - Fix: Required correction + +### Financial Concerns (IMPORTANT) - Severity: 6-9/10 +- **File:Line** - Concern + - Severity: [6-9]/10 + - Issue: Potential accuracy problem + - Recommendation: How to fix + +### Financial Suggestions - Severity: 3-5/10 +[Better financial handling practices] + +### Financial Checklist +- Currency handling correct: Yes/No/N/A +- Rounding to 2 decimals: Yes/No/N/A +- Using appropriate numeric types: Yes/No/N/A +- Aggregations accurate: Yes/No/N/A +- Financial tests present: Yes/No/N/A + +### Questions for Other Agents +- **To Data Integrity**: [questions about data flow] +- **To Testing**: [questions about financial test coverage] + +### Confidence +- Overall: High/Medium/Low +- Calculations reviewed: [list what was checked] + +CODEBASE CONTEXT SEARCH: +Search for financial patterns: +1. Find similar financial calculations +2. Check established rounding practices +3. Look for currency handling patterns +4. 
Reference correct implementations + +AUTOMATED FIX GENERATION: +Generate fixes for financial issues: +- Incorrect rounding (fix to 2 decimals) +- Missing currency validation +- Type precision issues +- Calculation errors + +Category: financial + +GUIDELINES: +- Financial errors are CRITICAL - be thorough +- Even small rounding errors matter +- Check ALL arithmetic on money +- Verify currency is consistent +- Flag any uncertainty - better safe than sorry +- Consider tax implications +- Think about edge cases (negative amounts, zero, very large numbers) +- Search for similar calculations before flagging +- Generate fixes for rounding and precision issues + +IMPORTANT: If you don't see any financial code, just note "No financial code in this PR" and skip to confidence section. ``` -After launching all 5 agents, display: +### Agent 7: MPDX Standards Compliance Review 📋 + +Use the Task tool with: + +- **description**: "MPDX standards compliance" +- **subagent_type**: "general-purpose" +- **model**: "opus" +- **prompt**: ``` -✅ All 5 agents launched in parallel +You are the MPDX Standards Compliance Review Agent. + +EXPERTISE: MPDX-specific coding standards, patterns, and conventions from CLAUDE.md. + +MISSION: Verify this PR follows MPDX project standards and conventions. + +CONTEXT: +- Risk Score: [calculated score]/[max] +- Changed Files: [N] + +INSTRUCTIONS: +1. Read CLAUDE.md thoroughly +2. Read diff and changed files +3. 
Check each standard against the code + +MPDX STANDARDS CHECKLIST: + +**File Naming:** +- [ ] Components use PascalCase (e.g., ContactDetails.tsx) +- [ ] Pages use kebab-case with .page.tsx +- [ ] Tests use same name as file + .test.tsx +- [ ] GraphQL files use PascalCase .graphql + +**Exports:** +- [ ] Uses named exports (NO default exports) +- [ ] Component exports: `export const ComponentName: React.FC = () => {}` + +**GraphQL:** +- [ ] All queries/mutations have `id` fields for cache normalization +- [ ] Operation names are descriptive (not starting with "Get" or "Load") +- [ ] `yarn gql` was run (check for .generated.ts files) +- [ ] Pagination handled for `nodes` fields + +**Localization:** +- [ ] All user-visible text uses `t()` from useTranslation +- [ ] No hardcoded English strings + +**Testing:** +- [ ] Uses GqlMockedProvider for GraphQL mocking +- [ ] Test describes what it tests clearly +- [ ] Proper async handling with findBy/waitFor + +**Code Quality:** +- [ ] No console.logs (except in error handlers) +- [ ] No unused imports +- [ ] TypeScript types (no `any` unless justified) +- [ ] Proper error handling + +**Package Management:** +- [ ] Uses yarn (not npm) + +OUTPUT FORMAT: + +## 📋 MPDX Standards Compliance Review + +### Standards Violations (BLOCKING) - Severity: 8-10/10 +[Clear violations of project standards] +- **File:Line** - Violation + - Severity: [8-10]/10 + - Standard: What standard violated + - Issue: What's wrong + - Fix: How to fix + +### Standards Concerns (IMPORTANT) - Severity: 5-7/10 +- **File:Line** - Concern + - Severity: [5-7]/10 + - Issue: Doesn't quite follow standards + - Recommendation: How to align + +### Standards Checklist Results +**File Naming**: ✅/⚠️/❌ +**Exports**: ✅/⚠️/❌ +**GraphQL**: ✅/⚠️/❌ (or N/A) +**Localization**: ✅/⚠️/❌ +**Testing**: ✅/⚠️/❌ +**Code Quality**: ✅/⚠️/❌ +**Package Management**: ✅/⚠️/❌ + +### Pattern Deviations +[List any deviations from CLAUDE.md patterns] + +### Suggestions for Better Alignment 
+[How to better follow MPDX patterns]
+
+### Questions for Other Agents
+- **To [Agent]**: Question
+
+### Confidence
+- Overall: High/Medium/Low
+- Standards knowledge: Complete/Partial
+
+CODEBASE CONTEXT SEARCH:
+Search for standard patterns:
+1. Check how similar files are structured
+2. Look for naming conventions
+3. Find export patterns
+4. Reference compliant examples
+
+AUTOMATED FIX GENERATION:
+Generate fixes for standards violations:
+- Incorrect file naming
+- Default exports (convert to named)
+- Missing yarn usage
+- Import/export inconsistencies
+
+Category: standards
+
+GUIDELINES:
+- Reference specific sections of CLAUDE.md
+- Distinguish between violations and preferences
+- Be constructive, not pedantic
+- Explain WHY standards matter
+- Search codebase for patterns before flagging
+- Generate fixes for simple standards violations
+```
+
+After launching selected agents, display:
+
+```
+✅ All [N] agents launched in parallel
 ⏳ Waiting for agents to complete their reviews...
+💰 Estimated cost: $[X.XX]
+```
+
+---
+
+## Stage 1B — Dependency Impact Analysis (Parallel)
+
+While agents are running, analyze dependency impact in parallel:
+
+```bash
+echo "🔍 Analyzing dependency impact..."
+echo ""
+
+# For each changed TypeScript/TSX file, find dependents
+while IFS= read -r changed_file; do
+  # Skip non-code files
+  [[ ! "$changed_file" =~ \.(ts|tsx|js|jsx)$ ]] && continue
+
+  # Extract filename without extension
+  filename=$(basename "$changed_file" | sed 's/\.[^.]*$//')
+
+  # Search for imports of this file
+  grep -r "from.*['\"].*$filename['\"]" src/ \
+    --include="*.ts" --include="*.tsx" \
+    2>/dev/null | cut -d: -f1 | sort -u > "/tmp/dependents_${filename}.txt"
+
+  dependent_count=$(wc -l < "/tmp/dependents_${filename}.txt" 2>/dev/null || echo 0)
+
+  if [ "$dependent_count" -gt 15 ]; then
+    echo "🚨 CRITICAL IMPACT: $changed_file has $dependent_count dependents" | tee -a /tmp/dependency_impact.txt
+  elif [ "$dependent_count" -gt 10 ]; then
+    echo "⚠️ HIGH IMPACT: $changed_file has $dependent_count dependents" | tee -a /tmp/dependency_impact.txt
+  elif [ "$dependent_count" -gt 5 ]; then
+    echo "📊 MEDIUM IMPACT: $changed_file has $dependent_count dependents" | tee -a /tmp/dependency_impact.txt
+  fi
+done < /tmp/changed_files.txt
+
+echo "" | tee -a /tmp/dependency_impact.txt
+
+# Check for breaking changes (removed exports)
+echo "Checking for breaking changes..." | tee -a /tmp/dependency_impact.txt
+git diff $BASE_REF..$HEAD_REF 2>/dev/null | grep "^-export" | grep -v "^---" > /tmp/breaking_changes.txt 2>/dev/null || true
+
+if [ -s /tmp/breaking_changes.txt ]; then
+  echo "⚠️ BREAKING CHANGES DETECTED:" | tee -a /tmp/dependency_impact.txt
+  cat /tmp/breaking_changes.txt | tee -a /tmp/dependency_impact.txt
+fi
+
+echo "✅ Dependency analysis complete"
+echo ""
 ```
 
 ---
 
@@ -514,12 +1166,14 @@ Agent Reviews Complete:
 ✅ 💾 Data Integrity Agent - Found [X] critical, [Y] concerns
 ✅ 🧪 Testing Agent - Found [X] critical, [Y] concerns
 ✅ 👤 UX Agent - Found [X] critical, [Y] concerns
+✅ 💰 Financial Agent - Found [X] critical, [Y] concerns
+✅ 📋 Standards Agent - Found [X] violations, [Y] concerns
 ```
 
 Parse each agent's output and extract:
 
-- Critical issues (BLOCKING)
-- Important concerns
+- Critical issues with severity scores
+- Important concerns with severity scores
 - Suggestions
 - Questions for other agents
 - Confidence level
@@ -528,13 +1182,64 @@ Store these in structured format for the debate rounds.
 
 ---
 
+## Stage 2B — Extract & Organize Automated Fixes
+
+Parse agent outputs for automated fixes:
+
+```bash
+echo "🔧 Extracting automated fixes from agent reports..."
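+# Assumed layout: agents write their fix scripts to /tmp/automated_fixes;
+# create it up front so the find/ls calls below never error out on a clean run
+mkdir -p /tmp/automated_fixes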
+echo ""
+
+# Count fixes found
+FIX_COUNT=$(find /tmp/automated_fixes -name "*.sh" 2>/dev/null | wc -l | tr -d ' ')
+
+if [ "$FIX_COUNT" -gt 0 ]; then
+  echo "Found $FIX_COUNT automated fixes"
+  echo ""
+
+  # Organize by category and confidence (write to the summary file only,
+  # then print it once below to avoid duplicated output)
+  echo "By Category:" > /tmp/fix_summary.txt
+  for category in localization types imports graphql tests security architecture data-integrity ux accessibility financial standards code-quality; do
+    count=$(find /tmp/automated_fixes -name "*_${category}.sh" 2>/dev/null | wc -l | tr -d ' ')
+    if [ "$count" -gt 0 ]; then
+      echo "  • $category: $count fixes" >> /tmp/fix_summary.txt
+    fi
+  done
+
+  echo "" >> /tmp/fix_summary.txt
+  cat /tmp/fix_summary.txt
+
+  # Create master apply script
+  cat > /tmp/automated_fixes/apply_all.sh << 'EOF'
+#!/bin/bash
+echo "Applying all automated fixes..."
+for fix in /tmp/automated_fixes/fix_*.sh; do
+  if [ -f "$fix" ]; then
+    echo "Applying: $(basename "$fix")"
+    bash "$fix"
+  fi
+done
+echo "✅ All fixes applied"
+echo "Review changes with: git diff"
+echo "To undo: git checkout ."
+EOF
+  chmod +x /tmp/automated_fixes/apply_all.sh
+else
+  echo "No automated fixes available"
+fi
+
+echo ""
+```
+
+---
+
 ## Stage 3 — Cross-Examination Debate (Round 1)
 
 Now facilitate the first debate round where agents challenge each other.
 
 Display: "🗣️ Starting cross-examination debate round..."
 
-For each of the 5 agents, launch a new Task with their original findings plus all other agents' findings:
+For each of the 7 agents, launch a new Task with their original findings plus all other agents' findings.
 
 ### Debate Prompt Template
 
@@ -542,37 +1247,25 @@ Use the Task tool for each agent with:
 
 - **description**: "[Agent name] cross-examination"
 - **subagent_type**: "general-purpose"
-- **model**: (same as original agent)
+- **model**: "opus"
 - **prompt**:
 
 ```
 You are the [Agent Name] in the cross-examination debate phase.
YOUR ORIGINAL FINDINGS: -[Paste that agent's original review output] +[Paste that agent's original review output with severity scores] OTHER AGENTS' FINDINGS: - -🔒 SECURITY AGENT FOUND: -[Security agent's findings] - -🏗️ ARCHITECTURE AGENT FOUND: -[Architecture agent's findings] - -💾 DATA INTEGRITY AGENT FOUND: -[Data Integrity agent's findings] - -🧪 TESTING AGENT FOUND: -[Testing agent's findings] - -👤 UX AGENT FOUND: -[UX agent's findings] +[All other agents' findings with severity scores] MISSION: Review other agents' findings from your specialized perspective. -DEBATE ACTIONS: -1. **CHALLENGE** - Disagree with a finding (max 3 challenges) -2. **SUPPORT** - Strongly agree and add context +DEBATE ACTIONS (use severity scores to prioritize): +1. **CHALLENGE** - Disagree with a finding (max 3 challenges, focus on severity 7+) + - Cite your reasoning with evidence + - Suggest revised severity score +2. **SUPPORT** - Strongly agree and add context (for severity 8+) 3. **EXPAND** - Build on a finding with additional concerns 4. 
**QUESTION** - Ask for clarification @@ -580,6 +1273,7 @@ RULES: - Maximum 3 challenges (focus on important disagreements) - Provide specific reasoning and evidence - Reference file:line when possible +- Suggest severity score adjustments (1-10) - Be constructive, not combative OUTPUT FORMAT: @@ -588,18 +1282,22 @@ OUTPUT FORMAT: ### Challenges - **Challenge to [Agent X] re: [finding]** + - Original severity: [X]/10 - Why I disagree: [reasoning] - Evidence: [supporting evidence] + - Revised severity: [Y]/10 - Revised view: [your assessment] ### Strong Support -- **Support for [Agent X] re: [finding]** +- **Support for [Agent X] re: [finding at severity [X]/10]** - Additional context: [your perspective] - Added concerns: [related issues] + - Severity agreement: [X]/10 is correct ### Expansions - **Building on [Agent X]'s [topic]**: - - [Your additional concerns] + - Additional severity: [+N] points + - Reasoning: [why more severe] ### Questions - **To [Agent X]**: [question] @@ -611,7 +1309,7 @@ OUTPUT FORMAT: - Key disagreements: [main contentions] ``` -Launch all 5 debate agents in parallel. +Launch all 7 debate agents in parallel. Display progress: @@ -634,36 +1332,41 @@ Use the Task tool with: - **description**: "[Agent name] rebuttal" - **subagent_type**: "general-purpose" -- **model**: (same as original) +- **model**: "opus" - **prompt**: ``` You are the [Agent Name] responding to challenges from debate round 1. YOUR ORIGINAL FINDINGS: -[Their original findings] +[Their original findings with severity scores] CHALLENGES RAISED AGAINST YOU: +[List each challenge with severity score adjustments] -[List each challenge with the challenging agent's name and reasoning] - -MISSION: Respond to each challenge. +MISSION: Respond to each challenge, adjusting severity scores based on evidence. RESPONSE OPTIONS: 1. **DEFEND** - Additional evidence supports your finding + - Maintain original severity score 2. 
**CONCEDE** - Acknowledge challenge, downgrade/remove finding + - Lower severity score or remove 3. **REVISE** - Update finding based on new perspective + - Adjust severity score 4. **ESCALATE** - Flag as unresolved, needs human senior review + - Mark for human decision OUTPUT FORMAT: ## [Agent Name] - Rebuttals ### Response to Challenge #1 from [Agent] +- Original Severity: [X]/10 - Decision: DEFEND/CONCEDE/REVISE/ESCALATE - Reasoning: [explanation] +- Final Severity: [Y]/10 - Updated Finding (if revised): - - Severity: Critical/Important/Suggestion + - Severity: [Y]/10 - Description: [updated] ### Response to Challenge #2 @@ -674,6 +1377,7 @@ OUTPUT FORMAT: - Conceded: [N] - Revised: [N] - Escalated: [N] +- Average severity adjustment: [+/-X] ``` Launch rebuttal tasks for all challenged agents. @@ -689,27 +1393,30 @@ Display: ## Stage 5 — Consensus Synthesis -Now analyze all findings, debates, and resolutions to build consensus. +Now analyze all findings, debates, and final severity scores to build consensus. **Process:** -1. Collect all final findings (original + revised from rebuttals) +1. Collect all final findings with severity scores 2. Group by similarity (same file:line or same general issue) -3. Count agent agreement for each finding -4. Classify by consensus level +3. Calculate average severity score for each finding +4. 
Count agent agreement -**Consensus Levels:** +**Consensus Levels (using severity scores):** -- **Unanimous** (4-5 agents agree) → BLOCKING / HIGH PRIORITY -- **Majority** (3 agents agree) → IMPORTANT / MEDIUM PRIORITY -- **Minority** (1-2 agents) → SUGGESTION / LOW PRIORITY -- **Unresolved Debate** (agents couldn't agree) → NEEDS HUMAN REVIEW +- **Average 9-10, 4+ agents**: CRITICAL BLOCKER +- **Average 8-9, 3+ agents**: HIGH PRIORITY BLOCKER +- **Average 7-8, 3+ agents**: IMPORTANT (should fix before merge) +- **Average 5-7, 2+ agents**: MEDIUM PRIORITY +- **Average 3-5, 1-2 agents**: SUGGESTION +- **Unresolved Debate** (agents couldn't agree, severity differs by 4+): NEEDS HUMAN REVIEW For each grouped finding, determine: -- Final severity: BLOCKING / IMPORTANT / SUGGESTION +- Final severity: Average of all agent severity scores +- Classification: BLOCKING / IMPORTANT / SUGGESTION - Which agents flagged it -- Debate summary (if there was disagreement) +- Debate summary - Consensus strength Display a summary: @@ -718,17 +1425,157 @@ Display a summary: 📊 Consensus Analysis: ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ -Blocking Issues (Unanimous): [N] -Important Concerns (Majority): [N] -Suggestions (Minority): [N] +Critical Blockers (Severity 9-10): [N] +High Priority Blockers (Severity 8-9): [N] +Important Issues (Severity 7-8): [N] +Medium Priority (Severity 5-7): [N] +Suggestions (Severity 3-5): [N] Unresolved Debates: [N] Total Findings: [N] +Average Confidence: [High/Medium/Low] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ ``` --- +## Stage 5B — Generate Historical Metrics Dashboard + +Create quality dashboard to commit to PR: + +```bash +echo "📊 Generating quality metrics dashboard..." 
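+
+# Assumed repo layout: create the metrics directories used below so the
+# redirects and heredocs don't fail on the first run
+mkdir -p .claude/pr-metrics .claude/review-metrics .claude/review-history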
+ +# Get PR info +PR_NUM=$(gh pr view --json number -q .number 2>/dev/null || echo "local") +CURRENT_DATE=$(date +%Y-%m-%d) +GITHUB_USER=$(git config user.name || echo "Developer") + +# Calculate current severity (from consensus) +# This should be calculated from the actual consensus findings +CURRENT_SEVERITY="[X.X]" # Replace with actual average severity + +# Calculate averages from history +if [ -f .claude/review-metrics/severity_history.txt ]; then + AVG_SEVERITY=$(awk '{sum+=$2; count++} END {printf "%.1f", sum/count}' .claude/review-metrics/severity_history.txt 2>/dev/null || echo "N/A") + LAST_10=$(tail -10 .claude/review-metrics/severity_history.txt | awk '{print $2}') +else + AVG_SEVERITY="N/A" + LAST_10="" +fi + +# Generate dashboard +cat > .claude/pr-metrics/PR_${PR_NUM}_metrics.md << EOF +# 📊 Code Quality Metrics Dashboard + +**PR**: #${PR_NUM} +**Date**: ${CURRENT_DATE} +**Developer**: @${GITHUB_USER} +**Review Mode**: ${AGENT_MODE} + +--- + +## 📈 Quality Trend + +### Current PR +- **Quality Score**: ${CURRENT_SEVERITY}/10 +- **Risk Level**: [from Stage 0] +- **Findings**: [N] blockers, [N] important, [N] suggestions + +### Historical Comparison +- **Your Average**: ${AVG_SEVERITY}/10 (last 10 PRs) +- **Trend**: [↗️ Improving / → Stable / ↘️ Declining] + +\`\`\` +Last 10 PRs: +${LAST_10} +\`\`\` + +--- + +## 🔍 This Review + +### Agents Used +- **Mode**: ${AGENT_MODE} +- **Agents**: [list selected agents] +- **Cost**: \$[X.XX] +- **Time**: [X] minutes + +### Key Findings +1. [Top issue category] - [count] issues +2. [Second category] - [count] issues +3. [Third category] - [count] issues + +### Automated Fixes Available +- **Total Fixes**: ${FIX_COUNT} +- **High Confidence**: [count] +- **Categories**: [list] + +--- + +## 💡 Improvement Recommendations + +Based on patterns in your recent PRs: +1. [Recommendation based on common issues] +2. [Recommendation based on trends] +3. 
[Recommendation for quality improvement] + +--- + +## 📦 Dependency Impact + +[Include high-impact changes from dependency analysis] + +--- + +## 💰 Review ROI + +- **AI Review Cost**: \$[X.XX] +- **Estimated Human Review Time**: 30-45 minutes +- **Estimated Human Review Cost**: ~\$150 +- **Savings**: ~\$[Y] +- **Issues Caught Pre-Review**: [N] + +--- + +_Generated by AI Code Review v3.0 | [Full Report](../tmp/agent_review_report.md)_ +EOF + +echo "✅ Metrics dashboard created: .claude/pr-metrics/PR_${PR_NUM}_metrics.md" +``` + +### Update Review History + +```bash +# Append to history +echo "$PR_NUM $CURRENT_SEVERITY $CURRENT_DATE" >> .claude/review-metrics/severity_history.txt + +# Save detailed review +cat > .claude/review-history/${CURRENT_DATE}_${PR_NUM}.json << EOF +{ + "date": "$CURRENT_DATE", + "pr_number": "$PR_NUM", + "severity": $CURRENT_SEVERITY, + "mode": "$AGENT_MODE", + "agents_used": ${#SELECTED_AGENTS[@]}, + "cost": "[actual cost]", + "time_minutes": "[actual time]", + "findings": { + "critical": [N], + "high": [N], + "important": [N], + "suggestions": [N] + }, + "fixes_available": $FIX_COUNT +} +EOF + +echo "✅ Review history updated" +echo "" +``` + +--- + ## Stage 6 — Generate Review Report Create the comprehensive review report in markdown format: @@ -737,7 +1584,8 @@ Create the comprehensive review report in markdown format: # 🤖 Multi-Agent Code Review Report **Generated**: [timestamp] -**Agents**: 5 specialized reviewers with debate rounds +**Agents**: 7 specialized reviewers with debate rounds +**💰 Review Cost**: $[X.XX] (Opus model) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ @@ -745,14 +1593,20 @@ Create the comprehensive review report in markdown format: ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ -**Risk Score**: [X]/10 - [LOW/MEDIUM/HIGH/CRITICAL] +**Risk Score**: [X]/[max] - [LOW/MEDIUM/HIGH/CRITICAL] **Day**: [day of week] **Files Changed**: [N] (+[X] -[Y] lines) -**Risk Factors**: -[List specific factors 
detected] +**Risk Level Meaning**: +- **LOW** (0-3): ✅ Entry-level or above can review +- **MEDIUM** (4-6): ✅ Entry-level or above can review +- **HIGH** (7-9): ⚠️ Experienced developer or above should review +- **CRITICAL** (10+): 🚨 Senior developer (Caleb Cox) must review + +**Required Reviewer**: [Based on risk level] -**Required Reviewer**: [JUNIOR/MID | MID/SENIOR | SENIOR (Caleb Cox)] +**Risk Factors Detected**: +[List specific factors] [IF FRIDAY/WEEKEND] ⚠️ **[DAY] DEPLOYMENT WARNING** @@ -760,31 +1614,112 @@ Create the comprehensive review report in markdown format: ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ -## 🚫 BLOCKING ISSUES +## 🔧 AUTOMATED FIXES AVAILABLE + +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ + +**${FIX_COUNT} automated fixes available** + +Review and apply these fixes to address common issues quickly. + +[IF FIX_COUNT > 0, FOR EACH FIX:] + +### Fix #N: [Title] ([Confidence] Confidence) + +**File**: \`path/to/file:line\` +**Category**: [category] +**Estimated Time**: 30 seconds + +
+<details>
+<summary>📝 View Fix Details</summary>
+
+**Issue**: [description]
+
+**Current Code**:
+\`\`\`typescript
+[old code]
+\`\`\`
+
+**Fixed Code**:
+\`\`\`typescript
+[new code]
+\`\`\`
+
+**Apply This Fix**:
+\`\`\`bash
+bash /tmp/automated_fixes/fix_N_category.sh
+\`\`\`
+
+</details>
+ +--- + +**To apply all fixes**: +\`\`\`bash +# Review all fixes first +ls -la /tmp/automated_fixes/ + +# Apply all (REVIEW FIRST!) +bash /tmp/automated_fixes/apply_all.sh + +# Then review changes +git diff + +# If good, commit +git add . && git commit -m "Apply AI-suggested fixes" + +# To undo +git checkout . +\`\`\` + +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ + +## 📦 DEPENDENCY IMPACT ANALYSIS + +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ + +[Include contents of /tmp/dependency_impact.txt] + +### High-Impact Changes +Files with 10+ dependents - test thoroughly: + +[List high-impact files with dependent counts] + +### Breaking Changes +[List any removed exports or breaking changes from /tmp/breaking_changes.txt] + +### Recommendations +- Review all dependents before merging +- Add integration tests for high-impact changes +- Update documentation for breaking changes +- Consider deprecation warnings for removed exports + +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ + +## 🚫 CRITICAL BLOCKERS (Severity 9-10) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ -**Must be fixed before merge** (Unanimous: 4-5 agents) +**Must be fixed before merge** (Average severity 9-10 from multiple agents) -[FOR EACH BLOCKING ISSUE:] +[FOR EACH CRITICAL BLOCKER:] ### [Issue Title] +**Severity**: [X.X]/10 (Consensus from [N] agents) **File**: `[file:line]` -**Flagged by**: [Agent 1, Agent 2, Agent 3, ...] +**Flagged by**: [Agent 1 ([score]/10), Agent 2 ([score]/10), ...] 
**Problem**: [Detailed description from consensus] **Agent Perspectives**: - -- **[Agent 1]**: [Their specific concern] -- **[Agent 2]**: [Their specific concern] +- **[Agent 1]** (Severity: [X]/10): [Their specific concern] +- **[Agent 2]** (Severity: [X]/10): [Their specific concern] **Debate Summary**: - - [Summary of any challenges and resolutions] -- Final consensus: BLOCKING +- Final consensus: CRITICAL BLOCKER (Average: [X.X]/10) **Required Action**: [Specific steps to fix] @@ -793,41 +1728,58 @@ Create the comprehensive review report in markdown format: ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ -## ⚠️ IMPORTANT CONCERNS +## 🔴 HIGH PRIORITY BLOCKERS (Severity 8-9) + +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ + +**Must be fixed before merge** (Average severity 8-9) + +[FOR EACH HIGH PRIORITY BLOCKER - same format as above] + +--- + +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ + +## ⚠️ IMPORTANT ISSUES (Severity 7-8) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ -**Should be addressed before merge** (Majority: 3 agents) +**Should be addressed before merge** (Average severity 7-8) -[FOR EACH IMPORTANT CONCERN:] +[FOR EACH IMPORTANT ISSUE - condensed format] -### [Concern Title] +### [Issue Title] +**Severity**: [X.X]/10 **File**: `[file:line]` **Flagged by**: [Agents] -**Issue**: -[Description] +**Issue**: [Description] +**Recommended Fix**: [How to address] -**Debate Summary**: +--- -- [Summary of debate if any] -- Recommendation: Fix before merge +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ -**Suggested Action**: -[How to address] +## 💡 MEDIUM PRIORITY (Severity 5-7) + +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ + +**Consider addressing** (Average severity 5-7) + +[Bulleted list of issues with file:line and brief description] --- ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ -## 💡 SUGGESTIONS +## 💭 SUGGESTIONS (Severity 3-5) 
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ -**Nice-to-have improvements** (Minority: 1-2 agents) +**Nice-to-have improvements** (Average severity 3-5) -[List suggestions by category] +[Grouped by category, bulleted list] --- @@ -844,22 +1796,22 @@ Create the comprehensive review report in markdown format: ### [Debate Topic] **Context**: [What the debate is about] +**Severity Range**: [Low]-[High]/10 (agents disagree by [X] points) **Positions**: -**[Agent 1 argues]**: +**[Agent 1]** argues (Severity: [X]/10): [Their position with reasoning] -**[Agent 2 counters]**: +**[Agent 2]** counters (Severity: [Y]/10): [Their counter-position] **Other agents**: - -- [Agent 3]: [Position] -- [Agent 4]: [Position] +- [Agent 3]: [Position] (Severity: [Z]/10) +- [Agent 4]: [Position] (Severity: [W]/10) **Why needs human review**: -[Explanation] +[Explanation of why agents couldn't reach consensus] **Recommendation**: Senior developer (Caleb Cox) should decide based on [considerations] @@ -872,23 +1824,31 @@ Senior developer (Caleb Cox) should decide based on [considerations] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ -| Agent | Critical | Important | Suggestions | Confidence | -| ----------------- | -------- | --------- | ----------- | ---------- | -| 🔒 Security | [N] | [N] | [N] | [H/M/L] | -| 🏗️ Architecture | [N] | [N] | [N] | [H/M/L] | -| 💾 Data Integrity | [N] | [N] | [N] | [H/M/L] | -| 🧪 Testing | [N] | [N] | [N] | [H/M/L] | -| 👤 UX | [N] | [N] | [N] | [H/M/L] | -| **Total** | **[N]** | **[N]** | **[N]** | - | +| Agent | Critical | High | Important | Suggestions | Confidence | +| ------------------------ | -------- | ---- | --------- | ----------- | ---------- | +| 🔒 Security | [N] | [N] | [N] | [N] | [H/M/L] | +| 🏗️ Architecture | [N] | [N] | [N] | [N] | [H/M/L] | +| 💾 Data Integrity | [N] | [N] | [N] | [N] | [H/M/L] | +| 🧪 Testing | [N] | [N] | [N] | [N] | [H/M/L] | +| 👤 UX | [N] | [N] | [N] | [N] | [H/M/L] | +| 💰 Financial | [N] | [N] 
| [N] | [N] | [H/M/L] | +| 📋 Standards Compliance | [N] | [N] | [N] | [N] | [H/M/L] | +| **Total** | **[N]** | **[N]** | **[N]** | **[N]** | - | **Debate Statistics**: - - Total challenges raised: [N] - Challenges defended: [N] - Challenges conceded: [N] - Findings revised: [N] +- Severity adjustments: [+/-X] average - Escalated to human: [N] +**Review Quality**: +- Average agent confidence: [High/Medium/Low] +- Consensus rate: [X]% +- Debate rounds: 2 +- Total review time: [X] minutes + --- ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ @@ -897,25 +1857,28 @@ Senior developer (Caleb Cox) should decide based on [considerations] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ -**Immediate Actions** (Blockers): -[FOR EACH BLOCKING ISSUE:] - -- [ ] Fix [issue] at [file:line] - -**Important Actions** (Before merge): -[FOR EACH IMPORTANT CONCERN:] +**Critical Actions** (MUST fix before merge): +[FOR EACH CRITICAL/HIGH PRIORITY BLOCKER:] +- [ ] Fix [issue] at [file:line] (Severity: [X]/10) -- [ ] Address [concern] at [file:line] +**Important Actions** (Should fix before merge): +[FOR EACH IMPORTANT ISSUE:] +- [ ] Address [concern] at [file:line] (Severity: [X]/10) **Human Review Needed**: [FOR EACH UNRESOLVED DEBATE:] +- [ ] Senior developer to resolve: [debate topic] (Severity range: [X]-[Y]/10) -- [ ] Senior developer to resolve: [debate topic] +**Medium Priority** (Consider addressing): +- [List with severity scores] **Optional Improvements**: [FOR EACH SUGGESTION:] +- Consider [suggestion] (Severity: [X]/10) -- Consider [suggestion] +--- + +💰 **Review Cost**: $[X.XX] | ⏱️ **Review Time**: [X] minutes | 🤖 **Agents**: 7 (Opus) --- @@ -923,139 +1886,223 @@ Senior developer (Caleb Cox) should decide based on [considerations] 📋 Full Agent Reports (click to expand) ## 🔒 Security Agent - Complete Report - [Full original report] ## 🏗️ Architecture Agent - Complete Report - [Full original report] ## 💾 Data Integrity Agent - Complete Report - [Full 
original report] ## 🧪 Testing & Quality Agent - Complete Report - [Full original report] ## 👤 UX Agent - Complete Report +[Full original report] +## 💰 Financial Accuracy Agent - Complete Report +[Full original report] + +## 📋 MPDX Standards Agent - Complete Report [Full original report] --- -_🤖 Generated by MPDX Multi-Agent Review System_ -_Review time: [X] minutes | Agents: Security, Architecture, Data, Testing, UX_ +
+<details>
+<summary>🗣️ Debate Transcript (click to expand)</summary>
+
+## Round 1: Cross-Examination
+[Full debate round 1 transcripts]
+
+## Round 2: Rebuttals
+[Full rebuttal transcripts]
+
+</details>
+ +--- + +_🤖 Generated by MPDX Multi-Agent Review System v2.0_ +_Review time: [X] minutes | Cost: $[X.XX] | Agents: Security, Architecture, Data, Testing, UX, Financial, Standards_ ``` Save this to `/tmp/agent_review_report.md` --- -## Stage 7 — Post to GitHub (Optional) +## Stage 7 — Commit Metrics & Interactive Actions + +### Commit Metrics Dashboard + +```bash +if [ -f .claude/pr-metrics/PR_${PR_NUM}_metrics.md ]; then + echo "📊 Committing quality metrics dashboard..." + + git add .claude/pr-metrics/PR_${PR_NUM}_metrics.md + git add .claude/review-metrics/severity_history.txt + git add .claude/review-history/${CURRENT_DATE}_${PR_NUM}.json + + git commit -m "Add AI code review metrics dashboard + +📊 Quality Score: ${CURRENT_SEVERITY}/10 +🤖 Agents Used: ${#SELECTED_AGENTS[@]} of 7 +💰 Review Cost: \$[X.XX] +⏱️ Review Time: [X] minutes +🔧 Fixes Available: ${FIX_COUNT} + +Generated by AI Code Review v3.0" || echo "Nothing to commit" + + git push || echo "Failed to push, push manually later" + + echo "✅ Metrics dashboard committed and pushed" +else + echo "⚠️ No metrics dashboard to commit" +fi +``` + +### Interactive Menu Ask the user: ``` ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ -✅ MULTI-AGENT REVIEW COMPLETE +✅ REVIEW COMPLETE ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Found: -• [N] BLOCKING issues (unanimous) -• [N] IMPORTANT concerns (majority) -• [N] Suggestions (minority) +• [N] CRITICAL BLOCKERS (severity 9-10) +• [N] HIGH PRIORITY BLOCKERS (severity 8-9) +• [N] IMPORTANT issues (severity 7-8) +• [N] MEDIUM priority (severity 5-7) +• [N] Suggestions (severity 3-5) • [N] Unresolved debates (needs senior review) -Review report saved to: /tmp/agent_review_report.md +💰 Review Cost: $[X.XX] +⏱️ Review Time: [X] minutes +🔧 Automated Fixes: ${FIX_COUNT} available -Would you like to post this review to GitHub? +Risk Level: [LOW/MEDIUM/HIGH/CRITICAL] +Required Reviewer: [Level based on risk] -1. 
Post full review (all findings + debates + agent reports) -2. Post summary only (blocking + important + recommendations) -3. Don't post (review locally only) -4. Show me the report first +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ -Please respond: 1, 2, 3, or 4 -``` +What would you like to do? -If user chooses 1 (full review): +1. 📊 View metrics dashboard +2. 📝 Post review to GitHub +3. 🔧 Apply automated fixes (review first!) +4. 📦 View dependency impact +5. 💾 Save report locally only +6. ❌ Exit -```bash -PR_NUM=$(gh pr view --json number -q .number 2>/dev/null) -if [ -n "$PR_NUM" ]; then - gh pr comment $PR_NUM --body-file /tmp/agent_review_report.md - echo "✅ Full review posted to PR #$PR_NUM" -else - echo "❌ No PR found. Run this from a PR branch or use 'gh pr view' to check." -fi +Please respond: 1, 2, 3, 4, 5, or 6 ``` -If user chooses 2 (summary only): -Create a condensed version with just: - -- Risk assessment -- Blocking issues -- Important concerns -- Unresolved debates -- Recommended next steps +Handle user's choice: ```bash -# Extract summary sections and post -PR_NUM=$(gh pr view --json number -q .number 2>/dev/null) -if [ -n "$PR_NUM" ]; then - # Create summary version - gh pr comment $PR_NUM --body-file /tmp/agent_review_summary.md - echo "✅ Summary posted to PR #$PR_NUM" -fi +case "$choice" in + 1) + cat .claude/pr-metrics/PR_${PR_NUM}_metrics.md | less + ;; + 2) + echo "Posting review to GitHub..." + gh pr comment $PR_NUM --body-file /tmp/agent_review_report.md + echo "✅ Review posted" + ;; + 3) + echo "" + echo "🔧 Automated Fixes Available: $FIX_COUNT" + echo "" + if [ "$FIX_COUNT" -gt 0 ]; then + cat /tmp/fix_summary.txt + echo "" + read -p "Apply all fixes? (y/n) " apply_choice + if [ "$apply_choice" = "y" ]; then + bash /tmp/automated_fixes/apply_all.sh + echo "" + echo "Review changes:" + git diff --stat + echo "" + echo "To see full diff: git diff" + echo "To commit: git add . 
&& git commit -m 'Apply AI fixes'" + echo "To undo: git checkout ." + fi + else + echo "No automated fixes available" + fi + ;; + 4) + cat /tmp/dependency_impact.txt | less + ;; + 5) + echo "Report saved to: /tmp/agent_review_report.md" + echo "Metrics saved to: .claude/pr-metrics/PR_${PR_NUM}_metrics.md" + ;; + *) + echo "Exiting..." + ;; +esac ``` -If user chooses 3 (don't post): - -``` -Review complete! Report available at: /tmp/agent_review_report.md +--- -You can: -- Read it with: cat /tmp/agent_review_report.md -- Post later with: gh pr comment [PR#] --body-file /tmp/agent_review_report.md -``` +## Stage 8 — Final Summary -If user chooses 4 (show report): -Display the full report in the terminal, then re-ask the posting question. +Display comprehensive summary: ---- +``` +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ +🎉 AI CODE REVIEW COMPLETE v3.0 +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ -## Summary +**Review Mode**: ${AGENT_MODE} +**Agents Used**: ${#SELECTED_AGENTS[@]} of 7 +**Model**: [Haiku/Sonnet/Opus mix based on mode] +**Review Time**: [X] minutes +**💰 Cost**: $[X.XX] -Display final summary: +**Quality Metrics**: +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ +Quality Score: ${CURRENT_SEVERITY}/10 +Your Average: ${AVG_SEVERITY}/10 +Trend: [↗️/→/↘️] -``` +**Findings**: ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ -🎉 MULTI-AGENT REVIEW SESSION COMPLETE +🚫 [N] Critical Blockers +🔴 [N] High Priority Issues +⚠️ [N] Important Issues +💡 [N] Suggestions + +**Automated Fixes**: ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ +🔧 ${FIX_COUNT} fixes available + • High confidence: [N] + • Medium confidence: [N] + Apply with option 3 in menu -**Review Statistics**: -- Agents deployed: 5 -- Debate rounds: 2 -- Total findings: [N] -- Consensus rate: [X]% -- Review time: [X] minutes +**Dependency Impact**: +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ +📦 [N] high-impact 
changes +⚠️ [N] potential breaking changes -**Key Outcomes**: -✅ [N] blocking issues identified -✅ [N] important concerns flagged -✅ [N] suggestions for improvement -⚠️ [N] unresolved debates (need senior review) +**Saved**: +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ +💰 $${SAVED_COST} with smart agent selection +⏱️ ~30-45 min of human review time +📊 Metrics committed to PR **Next Steps**: -1. Address blocking issues before merge -2. Review important concerns -3. Get senior input on unresolved debates -4. Consider suggestions for code quality +━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ +1. Review quality dashboard (.claude/pr-metrics/) +2. Address [N] critical/high priority issues +3. Consider applying ${FIX_COUNT} automated fixes +4. Review [N] high-impact dependency changes +5. Post review to GitHub when ready -Thank you for using Multi-Agent Code Review! 🤖 +Thank you for using AI Code Review v3.0! 🤖 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ ``` @@ -1063,36 +2110,69 @@ Thank you for using Multi-Agent Code Review! 🤖 ## Notes -**Performance**: - -- Stage 1 agents run in parallel (~2-3 min) -- Stage 3 debate runs in parallel (~1-2 min) -- Stage 4 rebuttals run in parallel (~1 min) -- Total time: 4-6 minutes typical PR - -**Cost Estimation**: - -- 5 agents × 2 rounds = 10 major LLM calls -- Plus orchestrator synthesis -- Estimated: $0.80-$2.50 per review -- Cost varies with PR size and model selection +**New in v3.0 (Phase 1 & 2)**: + +1. ✅ **Smart Agent Selection**: Analyzes PR to determine which agents are needed (saves $1-3) +2. ✅ **Progressive Review Modes**: Quick ($0.50), Standard ($2-4), Deep ($6-10) +3. ✅ **Automated Fix Generation**: Agents generate bash scripts to auto-fix common issues +4. ✅ **Codebase Context Search**: Agents search before flagging to reduce false positives +5. ✅ **Dependency Impact Analysis**: Tracks high-impact files and breaking changes +6. 
✅ **Historical Metrics Dashboard**: Quality trends committed to each PR +7. ✅ **Interactive Menu**: View metrics, apply fixes, post review, check dependencies + +**Improvements from v2.0**: + +1. ✅ **More Context**: Agents now read full files, not just diffs +2. ✅ **Domain-Specific Agents**: Added Financial Accuracy and MPDX Standards agents +3. ✅ **Improved Risk Scoring**: Four-level system (LOW/MEDIUM/HIGH/CRITICAL) +4. ✅ **Better Consensus**: Numeric severity scores (1-10) for precise consensus +5. ✅ **MPDX-Specific Checks**: Dedicated agent for CLAUDE.md standards +6. ✅ **Cost Transparency**: Shows estimated and actual costs throughout + +**Performance by Mode**: + +**Quick Mode**: +- Time: ~2 minutes +- Agents: 3 (Testing, UX, Standards) +- Model: Haiku +- Cost: ~$0.50 +- Use for: Minor UI tweaks, documentation + +**Standard Mode** (Recommended): +- Time: ~5 minutes +- Agents: 3-6 (smart selection) +- Model: Sonnet/Opus +- Cost: ~$2-4 +- Use for: Normal feature development + +**Deep Mode**: +- Time: ~10 minutes +- Agents: All 7 +- Model: Opus +- Cost: ~$6-10 +- Use for: Critical security/financial changes + +**Cost Savings**: +- Smart agent selection: $1-3 per review (20-40% reduction) +- Monthly savings (20 PRs): $20-60 +- Time saved per review: 30-45 minutes of human time **Model Configuration**: +- Quick: Haiku for all agents +- Standard: Smart mix (Sonnet/Opus based on agent) +- Deep: Opus for all agents (maximum quality) -- Security: Opus (highest quality reasoning needed) -- Architecture: Opus (deepest system thinking needed) -- Data: Opus (maximum precision needed) -- Testing: Opus (comprehensive analysis) -- UX: Opus (thorough accessibility review) - -**Limitations**: +**Risk Levels**: +- **LOW** (0-3): Entry-level+ can review +- **MEDIUM** (4-6): Entry-level+ can review +- **HIGH** (7-9): Experienced developer+ should review +- **CRITICAL** (10+): Senior developer (Caleb Cox) must review -- Agents can't see full codebase (only diff) -- May miss 
context from related files -- Debate reduces but doesn't eliminate false positives -- Unresolved debates still need human judgment +**Phase 3 Features** (Not Yet Implemented): +- AI Learning from Past Reviews +- Team Knowledge Base Integration --- -_Multi-Agent Code Review System v1.0_ -_See `.claude/AGENT_BASED_CODE_REVIEW.md` for full documentation_ +_Multi-Agent Code Review System v3.0_ +_Phase 1 & 2 Complete | Phase 3 Deferred_ From 9168d8ec2c3e72194ff4e1f5c74b0888c0bf65ee Mon Sep 17 00:00:00 2001 From: Daniel Bisgrove Date: Fri, 27 Feb 2026 08:02:43 -0500 Subject: [PATCH 5/5] adding tmp files to git ignore prettier --- .claude/commands/agent-review.md | 174 ++++++++++++++++++++++++++++--- .gitignore | 18 ++++ 2 files changed, 177 insertions(+), 15 deletions(-) diff --git a/.claude/commands/agent-review.md b/.claude/commands/agent-review.md index 5b264c2a63..e6a4ef59cc 100644 --- a/.claude/commands/agent-review.md +++ b/.claude/commands/agent-review.md @@ -10,11 +10,13 @@ approve_tools: AI-powered code review with smart agent selection, automated fixes, and quality metrics. **💰 COST**: + - `/agent-review quick` - $0.50 (2 min, 3 agents, Haiku) - `/agent-review` or `/agent-review standard` - $2-4 (5 min, smart selection) - `/agent-review deep` - $6-10 (10 min, all 7 agents, Opus) **Usage**: + ```bash /agent-review # Standard mode (smart selection, recommended) /agent-review quick # Quick feedback for simple PRs @@ -328,6 +330,7 @@ Read `/tmp/selected_agents.txt` to determine which agents to launch. Display: "🚀 Launching [N] specialized review agents in parallel..." **Note**: Only launch agents that are needed based on the mode and smart selection. Check the variables: + - `$SECURITY_NEEDED` - Launch Security Agent if true - `$DATA_NEEDED` - Launch Data Integrity Agent if true - `$UX_NEEDED` - Launch UX Agent if true @@ -343,7 +346,7 @@ Use the Task tool with: - **model**: "opus" - **prompt**: -``` +```` You are the Security Review Agent for MPDX code review. 
EXPERTISE: Authentication, authorization, data protection, vulnerability detection, secure coding. @@ -430,9 +433,10 @@ Format: ```diff - [old code with vulnerability] + [new code with fix] -``` +```` **Apply command**: + ```bash cat > /tmp/automated_fixes/fix_N_security.sh << 'EOF' #!/bin/bash @@ -446,12 +450,14 @@ chmod +x /tmp/automated_fixes/fix_N_security.sh ``` FIXABLE SECURITY ISSUES: + - Missing authentication checks - Exposed sensitive data - Missing input validation - Insecure session handling GUIDELINES: + - Be specific with file:line references - Rate severity on 1-10 scale for consensus - Explain WHY it's a risk, not just WHAT @@ -461,6 +467,7 @@ GUIDELINES: - READ THE FULL FILES for context, not just the diff - Search codebase before flagging to avoid false positives - Generate automated fixes for simple security improvements + ``` ### Agent 2: Architecture Review 🏗️ @@ -473,6 +480,7 @@ Use the Task tool with: - **prompt**: ``` + You are the Architecture Review Agent for MPDX code review. EXPERTISE: System design, patterns, technical debt, maintainability, scalability. @@ -480,17 +488,20 @@ EXPERTISE: System design, patterns, technical debt, maintainability, scalability MISSION: Review this PR for architectural concerns and design issues. CONTEXT: + - Risk Score: [calculated score]/[max] - Risk Level: [LOW/MEDIUM/HIGH/CRITICAL] - Changed Files: [N] INSTRUCTIONS: + 1. Read /tmp/pr_diff.txt and /tmp/changed_files.txt 2. Read FULL content of changed files for context 3. Read CLAUDE.md for project patterns 4. 
Search for usage patterns of modified components/functions CRITICAL FOCUS: + - GraphQL schema design (pages/api/Schema/, .graphql files) - Apollo Client cache (src/lib/apollo/) - Next.js configuration (next.config.ts) @@ -504,6 +515,7 @@ OUTPUT FORMAT: ## 🏗️ Architecture Agent Review ### Critical Architecture Issues (BLOCKING) - Severity: 10/10 + - **File:Line** - Issue - Severity: 10/10 - Problem: What's architecturally wrong @@ -511,32 +523,40 @@ OUTPUT FORMAT: - Alternative: Better approach ### Architecture Concerns (IMPORTANT) - Severity: 6-9/10 + - **File:Line** - Concern - Severity: [6-9]/10 - Issue: Description - Recommendation: How to improve ### Architecture Suggestions - Severity: 3-5/10 + [Better patterns and approaches] + - Severity: [3-5]/10 ### Technical Debt Analysis + - Debt Added: [what new debt] - Debt Removed: [what debt fixed] - Net Impact: Better/Worse/Neutral ### Pattern Compliance + - Follows CLAUDE.md standards: Yes/No/Partial - Violations: [list] ### Questions for Other Agents + - **To [Agent]**: Question ### Confidence + - Overall: High/Medium/Low CODEBASE CONTEXT SEARCH: Before flagging architectural issues, search for existing patterns: + 1. Use Grep to find how similar problems are solved 2. Check if pattern is used consistently across codebase 3. 
Reference good examples to suggest @@ -544,6 +564,7 @@ Before flagging architectural issues, search for existing patterns: AUTOMATED FIX GENERATION: Generate fixes for common architectural issues: + - Inconsistent file naming - Missing exports - Improper component structure @@ -551,6 +572,7 @@ Generate fixes for common architectural issues: Format same as Security Agent above with category: architecture GUIDELINES: + - Rate severity on 1-10 scale - Focus on long-term maintainability - Identify pattern inconsistencies vs CLAUDE.md @@ -558,6 +580,7 @@ GUIDELINES: - Balance pragmatism vs purity - Search codebase before flagging inconsistencies - Generate fixes for structural issues + ``` ### Agent 3: Data Integrity Review 💾 @@ -570,6 +593,7 @@ Use the Task tool with: - **prompt**: ``` + You are the Data Integrity Review Agent for MPDX code review. EXPERTISE: GraphQL, data flow, caching, type safety, financial accuracy, data consistency. @@ -577,16 +601,19 @@ EXPERTISE: GraphQL, data flow, caching, type safety, financial accuracy, data co MISSION: Review this PR for data correctness and integrity. CONTEXT: + - Risk Score: [calculated score]/[max] - Changed Files: [N] INSTRUCTIONS: + 1. Read diff and changed files 2. Read FULL files for data flow context 3. Search for related GraphQL operations 4. Check for financial calculation changes CRITICAL FOCUS: + - GraphQL queries/mutations (check for `id` fields!) 
- Apollo cache normalization - Data fetching patterns, pagination @@ -601,6 +628,7 @@ OUTPUT FORMAT: ## 💾 Data Integrity Agent Review ### Critical Data Issues (BLOCKING) - Severity: 10/10 + - **File:Line** - Issue - Severity: 10/10 - Problem: Data integrity concern @@ -608,34 +636,41 @@ OUTPUT FORMAT: - Fix: Required action ### Data Concerns (IMPORTANT) - Severity: 6-9/10 + - **File:Line** - Concern - Severity: [6-9]/10 - Issue: Description - Recommendation: Fix ### Data Suggestions - Severity: 3-5/10 + [Better data handling] ### GraphQL Specific Checks + - Missing `id` fields: [list] - Cache policy issues: [concerns] - Fragment reuse opportunities: [list] - Pagination properly handled: Yes/No ### Financial Accuracy Review + - Financial calculations reviewed: Yes/No/N/A - Currency handling correct: Yes/No/N/A - Rounding issues: None/[issues] ### Questions for Other Agents + - **To [Agent]**: Question ### Confidence + - Overall: High/Medium/Low - Financial review confidence: High/Medium/Low/N/A CODEBASE CONTEXT SEARCH: Search for GraphQL patterns before flagging: + 1. Find similar queries/mutations 2. Check if id fields are consistently used 3. Look for established cache update patterns @@ -643,14 +678,16 @@ Search for GraphQL patterns before flagging: AUTOMATED FIX GENERATION: Generate fixes for data integrity issues: + - Missing id fields in GraphQL queries -- Missing __typename fields +- Missing \_\_typename fields - Incorrect cache updates - Type inconsistencies Category: graphql or data-integrity GUIDELINES: + - Financial accuracy is CRITICAL - flag ANY doubts - Check cache updates match data changes - Verify pagination logic @@ -658,6 +695,7 @@ GUIDELINES: - Consider data consistency - Search for similar GraphQL patterns before flagging - Generate fixes for missing fields + ``` ### Agent 4: Testing & Quality Review 🧪 @@ -670,6 +708,7 @@ Use the Task tool with: - **prompt**: ``` + You are the Testing & Quality Review Agent for MPDX code review. 
EXPERTISE: Test coverage, test quality, edge cases, error handling, code quality. @@ -677,16 +716,19 @@ EXPERTISE: Test coverage, test quality, edge cases, error handling, code quality MISSION: Review this PR for testing adequacy and code quality. CONTEXT: + - Risk Score: [calculated score]/[max] - Changed Files: [N] INSTRUCTIONS: + 1. Read diff and changed files 2. For each modified component/function, check if tests exist 3. Read test files to assess quality 4. Search for error handling patterns CRITICAL FOCUS: + - Test coverage for new code - Test quality and maintainability - Edge case handling, error states @@ -701,6 +743,7 @@ OUTPUT FORMAT: ## 🧪 Testing & Quality Agent Review ### Critical Testing Gaps (BLOCKING) - Severity: 10/10 + - **File:Line** - Gap - Severity: 10/10 - Missing: What's not tested @@ -708,34 +751,41 @@ OUTPUT FORMAT: - Required: What tests to add ### Testing Concerns (IMPORTANT) - Severity: 6-9/10 + - **File:Line** - Concern - Severity: [6-9]/10 - Issue: Description - Recommendation: Improvement ### Code Quality Issues - Severity: varies + - Unused imports: [list with file:line] - Console.logs: [list with file:line] - Type safety issues (`any` types): [list] - Other issues: [list] ### Testing Suggestions - Severity: 3-5/10 + [Improvements] ### Coverage Assessment + - New code tested: Yes/Partial/No - Edge cases covered: [list what's covered] - Error handling tested: Yes/Partial/No - Missing critical tests: [list] ### Questions for Other Agents + - **To [Agent]**: Question ### Confidence + - Overall: High/Medium/Low CODEBASE CONTEXT SEARCH: Search for testing patterns: + 1. Find similar component tests 2. Check how mocks are typically structured 3. 
Look for established test patterns @@ -743,6 +793,7 @@ Search for testing patterns: AUTOMATED FIX GENERATION: Generate fixes for common testing issues: + - Unused imports (can be auto-removed) - Console.logs in production code - Missing test skeletons (generate basic structure) @@ -751,6 +802,7 @@ Generate fixes for common testing issues: Categories: imports, tests, types, code-quality GUIDELINES: + - Critical business logic MUST have tests - Don't require tests for trivial UI-only changes - Focus on edge cases and error paths @@ -759,6 +811,7 @@ GUIDELINES: - Flag console.logs in non-test code - Generate fixes for code quality issues - Provide test skeleton templates + ``` ### Agent 5: UX Review 👤 @@ -771,6 +824,7 @@ Use the Task tool with: - **prompt**: ``` + You are the User Experience Review Agent for MPDX code review. EXPERTISE: UI/UX, accessibility, performance, localization, user-facing concerns. @@ -778,16 +832,19 @@ EXPERTISE: UI/UX, accessibility, performance, localization, user-facing concerns MISSION: Review this PR for user experience and usability. CONTEXT: + - Risk Score: [calculated score]/[max] - Changed Files: [N] INSTRUCTIONS: + 1. Read diff and full changed files 2. Look for user-facing changes 3. Check for localization compliance 4. 
Review loading/error states CRITICAL FOCUS: + - Component usability, intuitiveness - Loading states (MUST show for async operations) - Error messages (user-friendly, localized) @@ -803,6 +860,7 @@ OUTPUT FORMAT: ## 👤 UX Agent Review ### Critical UX Issues (BLOCKING) - Severity: 10/10 + - **File:Line** - Issue - Severity: 10/10 - Problem: UX concern @@ -810,37 +868,45 @@ OUTPUT FORMAT: - Fix: Required action ### UX Concerns (IMPORTANT) - Severity: 6-9/10 + - **File:Line** - Concern - Severity: [6-9]/10 - Issue: Description - Recommendation: Improvement ### Accessibility Issues + - Missing ARIA labels: [list with file:line] - Keyboard navigation: [issues] - Screen reader support: [concerns] - Color contrast: [issues] ### Localization Issues + - Hardcoded strings (not using t()): [list with file:line] - Missing translation keys: [list] ### Performance Concerns + - Unnecessary re-renders: [list] - Missing useMemo/useCallback: [list] - Heavy calculations in render: [list] ### UX Suggestions - Severity: 3-5/10 + [Improvements] ### Questions for Other Agents + - **To [Agent]**: Question ### Confidence + - Overall: High/Medium/Low CODEBASE CONTEXT SEARCH: Search for UX patterns: + 1. Find similar components for UX patterns 2. Check how loading states are typically shown 3. 
Look for localization patterns @@ -848,6 +914,7 @@ Search for UX patterns: AUTOMATED FIX GENERATION: Generate fixes for UX issues: + - Missing localization (wrap in t()) - Hardcoded strings - Missing ARIA labels @@ -856,6 +923,7 @@ Generate fixes for UX issues: Category: localization, accessibility, ux GUIDELINES: + - Put yourself in user's shoes - Consider error scenarios - ALL user-visible text MUST use t() @@ -864,6 +932,7 @@ GUIDELINES: - Think about mobile users - Generate automated localization fixes - Provide ARIA attribute additions + ``` ### Agent 6: Financial Accuracy Review 💰 @@ -876,6 +945,7 @@ Use the Task tool with: - **prompt**: ``` + You are the Financial Accuracy Review Agent for MPDX code review. EXPERTISE: Financial calculations, currency handling, donation tracking, pledge management, monetary accuracy. @@ -883,16 +953,19 @@ EXPERTISE: Financial calculations, currency handling, donation tracking, pledge MISSION: Review this PR for financial calculation accuracy and monetary data integrity. CONTEXT: + - Risk Score: [calculated score]/[max] - Changed Files: [N] INSTRUCTIONS: + 1. Read diff and full changed files 2. Search for financial-related terms: donation, pledge, gift, amount, currency, balance, total, payment 3. Look for arithmetic operations on monetary values 4. 
Check for currency conversion CRITICAL FOCUS: + - Donation amount calculations - Pledge tracking accuracy - Gift processing @@ -909,7 +982,9 @@ OUTPUT FORMAT: ## 💰 Financial Accuracy Agent Review ### Critical Financial Issues (BLOCKING) - Severity: 10/10 + [These MUST be fixed - money errors are unacceptable] + - **File:Line** - Issue - Severity: 10/10 - Problem: Financial calculation error @@ -917,15 +992,18 @@ OUTPUT FORMAT: - Fix: Required correction ### Financial Concerns (IMPORTANT) - Severity: 6-9/10 + - **File:Line** - Concern - Severity: [6-9]/10 - Issue: Potential accuracy problem - Recommendation: How to fix ### Financial Suggestions - Severity: 3-5/10 + [Better financial handling practices] ### Financial Checklist + - Currency handling correct: Yes/No/N/A - Rounding to 2 decimals: Yes/No/N/A - Using appropriate numeric types: Yes/No/N/A @@ -933,15 +1011,18 @@ OUTPUT FORMAT: - Financial tests present: Yes/No/N/A ### Questions for Other Agents + - **To Data Integrity**: [questions about data flow] - **To Testing**: [questions about financial test coverage] ### Confidence + - Overall: High/Medium/Low - Calculations reviewed: [list what was checked] CODEBASE CONTEXT SEARCH: Search for financial patterns: + 1. Find similar financial calculations 2. Check established rounding practices 3. Look for currency handling patterns @@ -949,6 +1030,7 @@ Search for financial patterns: AUTOMATED FIX GENERATION: Generate fixes for financial issues: + - Incorrect rounding (fix to 2 decimals) - Missing currency validation - Type precision issues @@ -957,6 +1039,7 @@ Generate fixes for financial issues: Category: financial GUIDELINES: + - Financial errors are CRITICAL - be thorough - Even small rounding errors matter - Check ALL arithmetic on money @@ -968,6 +1051,7 @@ GUIDELINES: - Generate fixes for rounding and precision issues IMPORTANT: If you don't see any financial code, just note "No financial code in this PR" and skip to confidence section. 
+ ``` ### Agent 7: MPDX Standards Compliance Review 📋 @@ -980,6 +1064,7 @@ Use the Task tool with: - **prompt**: ``` + You are the MPDX Standards Compliance Review Agent. EXPERTISE: MPDX-specific coding standards, patterns, and conventions from CLAUDE.md. @@ -987,10 +1072,12 @@ EXPERTISE: MPDX-specific coding standards, patterns, and conventions from CLAUDE MISSION: Verify this PR follows MPDX project standards and conventions. CONTEXT: + - Risk Score: [calculated score]/[max] - Changed Files: [N] INSTRUCTIONS: + 1. Read CLAUDE.md thoroughly 2. Read diff and changed files 3. Check each standard against the code @@ -998,37 +1085,44 @@ INSTRUCTIONS: MPDX STANDARDS CHECKLIST: **File Naming:** + - [ ] Components use PascalCase (e.g., ContactDetails.tsx) - [ ] Pages use kebab-case with .page.tsx - [ ] Tests use same name as file + .test.tsx - [ ] GraphQL files use PascalCase .graphql **Exports:** + - [ ] Uses named exports (NO default exports) - [ ] Component exports: `export const ComponentName: React.FC = () => {}` **GraphQL:** + - [ ] All queries/mutations have `id` fields for cache normalization - [ ] Operation names are descriptive (not starting with "Get" or "Load") - [ ] `yarn gql` was run (check for .generated.ts files) - [ ] Pagination handled for `nodes` fields **Localization:** + - [ ] All user-visible text uses `t()` from useTranslation - [ ] No hardcoded English strings **Testing:** + - [ ] Uses GqlMockedProvider for GraphQL mocking - [ ] Test describes what it tests clearly - [ ] Proper async handling with findBy/waitFor **Code Quality:** + - [ ] No console.logs (except in error handlers) - [ ] No unused imports - [ ] TypeScript types (no `any` unless justified) - [ ] Proper error handling **Package Management:** + - [ ] Uses yarn (not npm) OUTPUT FORMAT: @@ -1036,7 +1130,9 @@ OUTPUT FORMAT: ## 📋 MPDX Standards Compliance Review ### Standards Violations (BLOCKING) - Severity: 8-10/10 + [Clear violations of project standards] + - **File:Line** - Violation - 
Severity: [8-10]/10 - Standard: What standard violated @@ -1044,12 +1140,14 @@ OUTPUT FORMAT: - Fix: How to fix ### Standards Concerns (IMPORTANT) - Severity: 5-7/10 + - **File:Line** - Concern - Severity: [5-7]/10 - Issue: Doesn't quite follow standards - Recommendation: How to align ### Standards Checklist Results + **File Naming**: ✅/⚠️/❌ **Exports**: ✅/⚠️/❌ **GraphQL**: ✅/⚠️/❌ (or N/A) @@ -1059,20 +1157,25 @@ OUTPUT FORMAT: **Package Management**: ✅/⚠️/❌ ### Pattern Deviations + [List any deviations from CLAUDE.md patterns] ### Suggestions for Better Alignment + [How to better follow MPDX patterns] ### Questions for Other Agents + - **To [Agent]**: Question ### Confidence + - Overall: High/Medium/Low - Standards knowledge: Complete/Partial CODEBASE CONTEXT SEARCH: Search for standard patterns: + 1. Check how similar files are structured 2. Look for naming conventions 3. Find export patterns @@ -1080,6 +1183,7 @@ Search for standard patterns: AUTOMATED FIX GENERATION: Generate fixes for standards violations: + - Incorrect file naming - Default exports (convert to named) - Missing yarn usage @@ -1088,21 +1192,25 @@ Generate fixes for standards violations: Category: standards GUIDELINES: + - Reference specific sections of CLAUDE.md - Distinguish between violations and preferences - Be constructive, not pedantic - Explain WHY standards matter - Search codebase for patterns before flagging - Generate fixes for simple standards violations + ``` After launching selected agents, display: ``` + ✅ All [N] agents launched in parallel ⏳ Waiting for agents to complete their reviews... 
💰 Estimated cost: $[X.XX] -``` + +```` --- @@ -1151,7 +1259,7 @@ fi echo "✅ Dependency analysis complete" echo "" -``` +```` --- @@ -1598,6 +1706,7 @@ Create the comprehensive review report in markdown format: **Files Changed**: [N] (+[X] -[Y] lines) **Risk Level Meaning**: + - **LOW** (0-3): ✅ Entry-level or above can review - **MEDIUM** (4-6): ✅ Entry-level or above can review - **HIGH** (7-9): ⚠️ Experienced developer or above should review @@ -1656,19 +1765,25 @@ bash /tmp/automated_fixes/fix_N_category.sh **To apply all fixes**: \`\`\`bash + # Review all fixes first + ls -la /tmp/automated_fixes/ # Apply all (REVIEW FIRST!) + bash /tmp/automated_fixes/apply_all.sh # Then review changes + git diff # If good, commit + git add . && git commit -m "Apply AI-suggested fixes" # To undo + git checkout . \`\`\` @@ -1681,14 +1796,17 @@ git checkout . [Include contents of /tmp/dependency_impact.txt] ### High-Impact Changes + Files with 10+ dependents - test thoroughly: [List high-impact files with dependent counts] ### Breaking Changes + [List any removed exports or breaking changes from /tmp/breaking_changes.txt] ### Recommendations + - Review all dependents before merging - Add integration tests for high-impact changes - Update documentation for breaking changes @@ -1714,10 +1832,12 @@ Files with 10+ dependents - test thoroughly: [Detailed description from consensus] **Agent Perspectives**: + - **[Agent 1]** (Severity: [X]/10): [Their specific concern] - **[Agent 2]** (Severity: [X]/10): [Their specific concern] **Debate Summary**: + - [Summary of any challenges and resolutions] - Final consensus: CRITICAL BLOCKER (Average: [X.X]/10) @@ -1807,6 +1927,7 @@ Files with 10+ dependents - test thoroughly: [Their counter-position] **Other agents**: + - [Agent 3]: [Position] (Severity: [Z]/10) - [Agent 4]: [Position] (Severity: [W]/10) @@ -1824,18 +1945,19 @@ Senior developer (Caleb Cox) should decide based on [considerations] 
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ -| Agent | Critical | High | Important | Suggestions | Confidence | -| ------------------------ | -------- | ---- | --------- | ----------- | ---------- | -| 🔒 Security | [N] | [N] | [N] | [N] | [H/M/L] | -| 🏗️ Architecture | [N] | [N] | [N] | [N] | [H/M/L] | -| 💾 Data Integrity | [N] | [N] | [N] | [N] | [H/M/L] | -| 🧪 Testing | [N] | [N] | [N] | [N] | [H/M/L] | -| 👤 UX | [N] | [N] | [N] | [N] | [H/M/L] | -| 💰 Financial | [N] | [N] | [N] | [N] | [H/M/L] | -| 📋 Standards Compliance | [N] | [N] | [N] | [N] | [H/M/L] | -| **Total** | **[N]** | **[N]** | **[N]** | **[N]** | - | +| Agent | Critical | High | Important | Suggestions | Confidence | +| ----------------------- | -------- | ------- | --------- | ----------- | ---------- | +| 🔒 Security | [N] | [N] | [N] | [N] | [H/M/L] | +| 🏗️ Architecture | [N] | [N] | [N] | [N] | [H/M/L] | +| 💾 Data Integrity | [N] | [N] | [N] | [N] | [H/M/L] | +| 🧪 Testing | [N] | [N] | [N] | [N] | [H/M/L] | +| 👤 UX | [N] | [N] | [N] | [N] | [H/M/L] | +| 💰 Financial | [N] | [N] | [N] | [N] | [H/M/L] | +| 📋 Standards Compliance | [N] | [N] | [N] | [N] | [H/M/L] | +| **Total** | **[N]** | **[N]** | **[N]** | **[N]** | - | **Debate Statistics**: + - Total challenges raised: [N] - Challenges defended: [N] - Challenges conceded: [N] @@ -1844,6 +1966,7 @@ Senior developer (Caleb Cox) should decide based on [considerations] - Escalated to human: [N] **Review Quality**: + - Average agent confidence: [High/Medium/Low] - Consensus rate: [X]% - Debate rounds: 2 @@ -1859,21 +1982,26 @@ Senior developer (Caleb Cox) should decide based on [considerations] **Critical Actions** (MUST fix before merge): [FOR EACH CRITICAL/HIGH PRIORITY BLOCKER:] + - [ ] Fix [issue] at [file:line] (Severity: [X]/10) **Important Actions** (Should fix before merge): [FOR EACH IMPORTANT ISSUE:] + - [ ] Address [concern] at [file:line] (Severity: [X]/10) **Human Review Needed**: [FOR EACH UNRESOLVED DEBATE:] + - [ ] 
Senior developer to resolve: [debate topic] (Severity range: [X]-[Y]/10) **Medium Priority** (Consider addressing): + - [List with severity scores] **Optional Improvements**: [FOR EACH SUGGESTION:] + - Consider [suggestion] (Severity: [X]/10) --- @@ -1886,24 +2014,31 @@ Senior developer (Caleb Cox) should decide based on [considerations] 📋 Full Agent Reports (click to expand) ## 🔒 Security Agent - Complete Report + [Full original report] ## 🏗️ Architecture Agent - Complete Report + [Full original report] ## 💾 Data Integrity Agent - Complete Report + [Full original report] ## 🧪 Testing & Quality Agent - Complete Report + [Full original report] ## 👤 UX Agent - Complete Report + [Full original report] ## 💰 Financial Accuracy Agent - Complete Report + [Full original report] ## 📋 MPDX Standards Agent - Complete Report + [Full original report] @@ -1914,9 +2049,11 @@ Senior developer (Caleb Cox) should decide based on [considerations] 🗣️ Debate Transcript (click to expand) ## Round 1: Cross-Examination + [Full debate round 1 transcripts] ## Round 2: Rebuttals + [Full rebuttal transcripts] @@ -2132,6 +2269,7 @@ Thank you for using AI Code Review v3.0! 🤖 **Performance by Mode**: **Quick Mode**: + - Time: ~2 minutes - Agents: 3 (Testing, UX, Standards) - Model: Haiku @@ -2139,6 +2277,7 @@ Thank you for using AI Code Review v3.0! 🤖 - Use for: Minor UI tweaks, documentation **Standard Mode** (Recommended): + - Time: ~5 minutes - Agents: 3-6 (smart selection) - Model: Sonnet/Opus @@ -2146,6 +2285,7 @@ Thank you for using AI Code Review v3.0! 🤖 - Use for: Normal feature development **Deep Mode**: + - Time: ~10 minutes - Agents: All 7 - Model: Opus @@ -2153,22 +2293,26 @@ Thank you for using AI Code Review v3.0! 
🤖 - Use for: Critical security/financial changes **Cost Savings**: + - Smart agent selection: $1-3 per review (20-40% reduction) - Monthly savings (20 PRs): $20-60 - Time saved per review: 30-45 minutes of human time **Model Configuration**: + - Quick: Haiku for all agents - Standard: Smart mix (Sonnet/Opus based on agent) - Deep: Opus for all agents (maximum quality) **Risk Levels**: + - **LOW** (0-3): Entry-level+ can review - **MEDIUM** (4-6): Entry-level+ can review - **HIGH** (7-9): Experienced developer+ should review - **CRITICAL** (10+): Senior developer (Caleb Cox) must review **Phase 3 Features** (Not Yet Implemented): + - AI Learning from Past Reviews - Team Knowledge Base Integration diff --git a/.gitignore b/.gitignore index 1fb3058877..3979ce7b90 100644 --- a/.gitignore +++ b/.gitignore @@ -54,3 +54,21 @@ node_modules # Lighthouse .lighthouseci lighthouse-results.md + +# AI Code Review - Temporary Files +# These are working files that get regenerated each review +/tmp/automated_fixes/ +/tmp/dependency_analysis/ +/tmp/dependency_impact.txt +/tmp/breaking_changes.txt +/tmp/changed_files.txt +/tmp/pr_diff.txt +/tmp/diff_stat.txt +/tmp/selected_agents.txt +/tmp/fix_summary.txt +/tmp/agent_review_report.md +/tmp/dependents_*.txt +/tmp/changed_file_contents/ + +# Note: .claude/review-history/, .claude/review-metrics/, and .claude/pr-metrics/ +# are intentionally NOT ignored - these track quality trends and should be committed
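
The new ignore rules above can be sanity-checked in a throwaway repository before committing. This is a minimal sketch (pattern list abbreviated, sample paths hypothetical), assuming `git` is on PATH; note that a leading `/` in `.gitignore` anchors a pattern to the repository root, so these entries cover a repo-local `tmp/` directory:

```bash
# Sketch only: verify the review scratch-file ignore rules in a scratch repo.
# The three patterns below are a subset of the entries added above; the
# candidate paths (fix_1.sh, dependents_pages.txt) are illustrative.
set -eu
scratch=$(mktemp -d)
git -C "$scratch" init -q
cat > "$scratch/.gitignore" << 'EOF'
/tmp/automated_fixes/
/tmp/pr_diff.txt
/tmp/dependents_*.txt
EOF
# git check-ignore prints each pathname that some ignore rule matches
matched=$(git -C "$scratch" check-ignore \
  tmp/pr_diff.txt tmp/dependents_pages.txt tmp/automated_fixes/fix_1.sh \
  | wc -l)
echo "ignored: $matched of 3 candidate paths"
rm -rf "$scratch"
```

All three candidates should be reported as ignored; anything under `.claude/pr-metrics/` or `.claude/review-history/` should not be, since those dashboards are meant to be committed with the PR.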