Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
File renamed without changes.
34 changes: 17 additions & 17 deletions dry-run-output.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,44 +2,44 @@

| Score | Verdict | 🚨 Critical | 🔶 High | 🔹 Medium | 🔧 Nitpick |
| :--- | :--- | :--- | :--- | :--- | :--- |
| **15/100** | **REQUEST_CHANGES** | 4 | 1 | - | - |
| **15/100** | **REQUEST_CHANGES** | 3 | 2 | - | - |

## 📄 src/auth/login.ts

> [!CAUTION]
> **SQL Injection via String Interpolation**
> Directly interpolating user input into the SQL query allows attackers to inject arbitrary SQL.
> **SQL Injection via string interpolation**
> Email is interpolated directly into SQL, allowing injection. This will leak or destroy data.

> [!CAUTION]
> **Plain-Text Password Comparison**
> Storing and comparing passwords in plain text leaks credentials if the database is compromised.
> [!WARNING]
> **Plaintext password comparison**
> Passwords stored and compared in plaintext; a breach reveals every credential.

> [!WARNING]
> **No Rate Limiting or Session Handling**
> Login endpoint lacks rate limiting, account lockout, and secure session management.
> **No rate limiting or session handling**
> Brute-force attacks are trivial and users receive no secure session.

**🛠️ Recommended Fixes**

- **SQL Injection via String Interpolation**: Replace the string-interpolated query with a parameterized query using the db driver's placeholder syntax (e.g., db.query('SELECT * FROM users WHERE email = ?', [email])).
- **Plain-Text Password Comparison**: Hash the provided password with a slow hash like bcrypt and compare the hash against the stored hash in the database.
- **No Rate Limiting or Session Handling**: Implement rate limiting middleware and return a signed JWT or session cookie instead of the raw user id.
- **SQL Injection via string interpolation**: Replace the raw string interpolation with a parameterized query: `await db.query('SELECT * FROM users WHERE email = ?', [email])`
- **Plaintext password comparison**: Hash passwords with bcrypt (10+ rounds) on registration and compare hashes here.
- **No rate limiting or session handling**: Implement rate-limiting middleware and issue signed JWT or session cookie with expiry.

---

## 📄 src/api/users/route.ts

> [!CAUTION]
> **Password Leak in User List**
> Returning raw user rows exposes password hashes to any caller.
> **Passwords exposed in GET /users**
> Endpoint returns all columns including plaintext passwords to any caller.

> [!CAUTION]
> **Missing Authorization on Delete**
> Any request can delete any user; add authentication and permission checks before executing the delete.
> **Missing authorization on DELETE**
> Any request can delete any user; no auth, no ownership check.

**🛠️ Recommended Fixes**

- **Password Leak in User List**: Map the result to exclude sensitive fields (password, reset tokens) before returning JSON.
- **Missing Authorization on Delete**: Require a valid session token and verify the caller has admin rights before calling db.query; use parameterized DELETE query to prevent SQL injection.
- **Passwords exposed in GET /users**: Select only safe columns: `SELECT id, name, email FROM users` and never return passwords.
- **Missing authorization on DELETE**: Add middleware that verifies the caller is authenticated and has admin role before executing the DELETE.

---

Expand Down
88 changes: 65 additions & 23 deletions src/orchestrator.ts
Original file line number Diff line number Diff line change
Expand Up @@ -53,6 +53,11 @@ interface GitHubContext {
octokit: Octokit;
}

interface PrFile {
filename: string;
status: 'added' | 'modified' | 'removed' | 'renamed' | 'changed' | 'copied' | 'unchanged';
}

function initGitHub(env: ReturnType<typeof validateEnv>): GitHubContext {
const [owner, repo] = env.GITHUB_REPOSITORY.split('/');
return {
Expand Down Expand Up @@ -165,17 +170,30 @@ async function fetchPRDiff(ctx: GitHubContext): Promise<string> {
pull_number: ctx.prNumber,
mediaType: { format: 'diff' },
});
return response.data as unknown as string;

// Runtime check directly without casting immediately
const data = response.data;
if (typeof data === 'string') {
return data;
}

// In some Octokit versions/configurations, diffs might return as objects if mediaType isn't respected
// but typically strict 'diff' format returns string.
console.warn('⚠️ Unexpected diff format:', typeof data);
return String(data || '');
}

async function fetchPRFiles(ctx: GitHubContext): Promise<string[]> {
async function fetchPRFiles(ctx: GitHubContext): Promise<PrFile[]> {
const response = await ctx.octokit.pulls.listFiles({
owner: ctx.owner,
repo: ctx.repo,
pull_number: ctx.prNumber,
per_page: 100,
});
return response.data.map((file) => file.filename);
return response.data.map((file) => ({
filename: file.filename,
status: file.status as PrFile['status']
}));
}

async function postComment(ctx: GitHubContext, body: string): Promise<void> {
Expand All @@ -188,39 +206,51 @@ async function postComment(ctx: GitHubContext, body: string): Promise<void> {
console.log('💬 Comment posted to PR.');
}

async function addReaction(ctx: GitHubContext, reaction: 'eyes' | 'rocket') {
if (!ctx.commentId) return;
async function addReaction(ctx: GitHubContext, reaction: 'eyes' | 'rocket' | 'confused') {
try {
await ctx.octokit.reactions.createForIssueComment({
owner: ctx.owner,
repo: ctx.repo,
comment_id: ctx.commentId,
content: reaction,
});
console.log(`👀 Reacted with ${reaction}`);
if (ctx.commentId) {
await ctx.octokit.reactions.createForIssueComment({
owner: ctx.owner,
repo: ctx.repo,
comment_id: ctx.commentId,
content: reaction,
});
console.log(`👀 Reacted to comment with ${reaction}`);
} else {
// Fallback: React to the PR itself (Issue)
await ctx.octokit.reactions.createForIssue({
owner: ctx.owner,
repo: ctx.repo,
issue_number: ctx.prNumber,
content: reaction,
});
console.log(`👀 Reacted to PR with ${reaction}`);
}
} catch (e) {
console.log("⚠️ Could not react");
console.error("⚠️ Could not react:", e);
}
}

// ============================================================
// AI STAGES
// ============================================================

async function runTriage(files: string[], diffLength: number): Promise<TriageResult> {
async function runTriage(files: PrFile[], diffLength: number): Promise<TriageResult> {
console.log(`🔍 Running Triage with ${TRIAGE_MODEL}...`);

const fileList = files.map(f => `${f.filename} [${f.status}]`).join('\n');

const { object } = await generateObject({
model: groq(TRIAGE_MODEL),
schema: TriageSchema,
system: TRIAGE_SYSTEM_PROMPT,
prompt: `PR contains ${files.length} files. Diff length: ${diffLength} chars.\n\nFiles:\n${files.join('\n')}`,
prompt: `PR contains ${files.length} files. Diff length: ${diffLength} chars.\n\nFiles:\n${fileList}`,
});

return object;
}

async function runDeepReview(filesToAudit: string[], allFiles: string[], diff: string, architectureContext: string, existingDocs: string[]): Promise<JStarReviewResult> {
async function runDeepReview(filesToAudit: string[], allFiles: PrFile[], diff: string, architectureContext: string, existingDocs: string[]): Promise<JStarReviewResult> {
console.log(`🧠 Running Deep Review with ${ANALYST_MODEL}...`);

// Estimate tokens (rough: 4 chars per token)
Expand All @@ -235,7 +265,7 @@ async function runDeepReview(filesToAudit: string[], allFiles: string[], diff: s

// Large diff: use chunked map-reduce
console.log(`📦 Diff too large (${estimatedTokens} est. tokens), using chunked review`);
const rawResult = await runChunkedReview(filesToAudit, diff, architectureContext, existingDocs);
const rawResult = await runChunkedReview(filesToAudit, allFiles, diff, architectureContext, existingDocs);
return adjustScoreForSkippedFiles(rawResult, filesToAudit.length, allFiles.length);
}

Expand Down Expand Up @@ -275,16 +305,18 @@ function adjustScoreForSkippedFiles(result: JStarReviewResult, auditedCount: num
/**
* Original single-shot review for small diffs.
*/
async function runSingleShotReview(filesToAudit: string[], allFiles: string[], diff: string, architectureContext: string, existingDocs: string[]): Promise<JStarReviewResult> {
async function runSingleShotReview(filesToAudit: string[], allFiles: PrFile[], diff: string, architectureContext: string, existingDocs: string[]): Promise<JStarReviewResult> {
const enhancedSystemPrompt = architectureContext
? `${ANALYST_SYSTEM_PROMPT}\n\n--- PROJECT CONTEXT ---\n${architectureContext}`
: ANALYST_SYSTEM_PROMPT;

const formattedFiles = allFiles.map(f => `${f.filename} [${f.status}]`);

const { object } = await generateObject({
model: groq(ANALYST_MODEL),
schema: JStarReviewSchema,
system: enhancedSystemPrompt,
prompt: buildAnalystUserPrompt(filesToAudit, allFiles, diff, existingDocs),
prompt: buildAnalystUserPrompt(filesToAudit, formattedFiles, diff, existingDocs),
});

return object;
Expand Down Expand Up @@ -330,7 +362,7 @@ function splitDiffByFile(diff: string): FileDiff[] {
/**
* Review large diffs by chunking per-file and aggregating.
*/
async function runChunkedReview(filesToAudit: string[], diff: string, architectureContext: string, existingDocs: string[]): Promise<JStarReviewResult> {
async function runChunkedReview(filesToAudit: string[], allFiles: PrFile[], diff: string, architectureContext: string, existingDocs: string[]): Promise<JStarReviewResult> {
const fileDiffs = splitDiffByFile(diff);
console.log(`🔪 Split into ${fileDiffs.length} file chunks`);

Expand All @@ -348,7 +380,11 @@ async function runChunkedReview(filesToAudit: string[], diff: string, architectu
for (let i = 0; i < relevantDiffs.length; i += BATCH_SIZE) {
const batch = relevantDiffs.slice(i, i + BATCH_SIZE);
const batchResults = await Promise.all(
batch.map(fd => reviewFileChunk(fd.filename, fd.diff, architectureContext, existingDocs))
batch.map(fd => {
const fileInfo = allFiles.find(f => f.filename === fd.filename);
const status = fileInfo?.status || 'modified';
return reviewFileChunk(fd.filename, fd.diff, status, architectureContext, existingDocs);
})
);
allChunkResults.push(...batchResults);

Expand All @@ -364,13 +400,13 @@ async function runChunkedReview(filesToAudit: string[], diff: string, architectu
/**
* Review a single file chunk.
*/
async function reviewFileChunk(filename: string, fileDiff: string, architectureContext: string, existingDocs: string[]): Promise<ChunkReviewResult> {
async function reviewFileChunk(filename: string, fileDiff: string, status: string, architectureContext: string, existingDocs: string[]): Promise<ChunkReviewResult> {
try {
const { object } = await generateObject({
model: groq(ANALYST_MODEL),
schema: ChunkReviewSchema,
system: CHUNK_REVIEW_SYSTEM_PROMPT,
prompt: buildChunkReviewPrompt(filename, fileDiff, architectureContext, existingDocs),
prompt: buildChunkReviewPrompt(filename, fileDiff, status, architectureContext, existingDocs),
});
return object;
} catch (error) {
Expand Down Expand Up @@ -568,6 +604,7 @@ async function main() {

if (triage.files_to_audit.length === 0) {
await postComment(ctx, formatTriageSkipComment(triage));
await addReaction(ctx, 'rocket');
return;
}

Expand All @@ -576,6 +613,11 @@ async function main() {
console.log(`\n🔬 Review:`, JSON.stringify(review, null, 2), '\n');

await postComment(ctx, formatReviewComment(review));

// Final Reaction based on Verdict
const finalReaction = review?.summary?.verdict === 'REQUEST_CHANGES' ? 'confused' : 'rocket';
await addReaction(ctx, finalReaction);

console.log('\n🏁 J Star Review Complete!');
}

Expand Down
16 changes: 12 additions & 4 deletions src/prompts.ts
Original file line number Diff line number Diff line change
Expand Up @@ -41,7 +41,13 @@ Your goal is to find BUGS, SECURITY RISKS, LOGIC ERRORS, and **DOCUMENTATION GAP
- **Fix Prompt for Docs:** If truly missing, generate the actual markdown stub they should create.
- Categorize documentation issues as **DOCUMENTATION** or **MAINTAINABILITY**.

6. **PLACEHOLDER AUTH (DEV MODE):**
6. **DELETED FILES/CODE (CRITICAL):**
- Files marked [removed] and lines starting with \`-\` in the diff are being DELETED.
- **DO NOT** flag bugs *inside* deleted code (e.g. "Unused variable", "Typo"). IT IS GONE.
- **DO** flag if the **deletion itself** creates a problem (e.g. "Removed auth check", "Deleted function used elsewhere").
- If a file is fully deleted, only comment if it seems dangerous.

7. **PLACEHOLDER AUTH (DEV MODE):**
- If you see hardcoded usernames like "johndoe" or "demo" with a TODO comment, this is intentional dev scaffolding
- Only flag these are security issues if there's NO indication it's a dev placeholder

Expand All @@ -68,7 +74,7 @@ You MUST include both "summary" and "findings" fields.Do not omit the summary.
JSON Structure:
{
"summary": { "quality_score": 0 - 100, "verdict": "APPROVE" | "REQUEST_CHANGES" | "COMMENT", "tone": "encouraging" | "critical" | "neutral" },
"findings": [{ "file": "...", "severity": "...", "category": "...", "title": "...", "message": "...", "fix_prompt": "..." }]
"findings": [{ "file": "...", "severity": "...", "category": "...", "title": "...", "message": "...", "fix_prompt": string | null }]
}
`;

Expand Down Expand Up @@ -131,10 +137,10 @@ CATEGORIES: SECURITY, PERFORMANCE, LOGIC, MAINTAINABILITY, STYLE, DOCUMENTATION

RULES:
1. Max 2 sentences per finding. Be direct.
2. ⚠️ MANDATORY: You MUST provide "title" for every finding (short, explicit).
3. ⚠️ MANDATORY: You MUST provide "fix_prompt" for ALL CRITICAL, HIGH, and MEDIUM findings.
4. Documentation is per-FEATURE not per-file. Check if feature doc exists before flagging.
5. Hardcoded test usernames (johndoe, demo) with TODO comments are dev placeholders, not security issues.
5. **DELETED FILES:** If status is [removed], DO NOT report bugs in the code. ONLY report if the deletion is dangerous (e.g. missing auth replacement).
6. Hardcoded test usernames (johndoe, demo) with TODO comments are dev placeholders, not security issues.

Output strict JSON only.
`;
Expand All @@ -146,6 +152,7 @@ Output strict JSON only.
export function buildChunkReviewPrompt(
filename: string,
fileDiff: string,
status: string,
architectureContext: string,
existingDocs: string[] = []
): string {
Expand All @@ -165,6 +172,7 @@ export function buildChunkReviewPrompt(

return `${contextSection}${docsSection}${featureHint}
FILE: ${filename}
STATUS: ${status}

=== DIFF ===
${fileDiff}
Expand Down
13 changes: 13 additions & 0 deletions src/test-local.ts
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,7 @@ const MOCK_FILES = [
'src/components/Button.tsx',
'styles/globals.css',
'README.md',
'src/deprecated.ts [removed]', // Explicitly marked for test scenario
];

// Simulated existing docs (to test false positive fix)
Expand All @@ -59,6 +60,18 @@ index 1234567..abcdefg 100644
+
+export type Theme = z.infer<typeof ThemeSchema>;

diff --git a/src/deprecated.ts b/src/deprecated.ts
deleted file mode 100644
index abcdefg..0000000
--- a/src/deprecated.ts
+++ /dev/null
@@ -1,5 +1,0 @@
- export const API_KEY = "sk-1234567890"; // Hardcoded secret!
-
- function oldLogic() {
- console.log("This is old");
- }

diff --git a/src/auth/login.ts b/src/auth/login.ts
index 1234567..abcdefg 100644
--- a/src/auth/login.ts
Expand Down