feat(index): add FTS5 full-text search implementation #82

galligan · 2026-01-23T05:42:18Z

Implement SQLite FTS5 indexing with bun:sqlite:

- createIndex() factory with WAL mode for concurrency
- BM25 ranking for relevance scoring
- Three tokenizers: unicode61, porter, trigram
- CRUD operations: add, search, update, remove
- Metadata support with JSON serialization
- Highlight extraction from search results
- Comprehensive test coverage (39 tests)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Resolves #55

galligan · 2026-01-23T05:42:40Z

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.

This stack of pull requests is managed by Graphite. Learn more about stacking.

greptile-apps · 2026-01-23T21:06:49Z

Greptile Summary

- Implements SQLite FTS5 full-text search functionality for packages/index with BM25 ranking, WAL mode, and CRUD operations
- Adds comprehensive test coverage with 39 tests covering index creation, document operations, search functionality, and error handling
- Provides a clean public API through the createIndex() factory function with proper TypeScript types and Result-based error handling

Important Files Changed

| Filename | Overview |
| --- | --- |
| packages/index/src/fts5.ts | New FTS5 implementation with SQLite virtual tables, BM25 ranking, and transaction safety |
| packages/index/src/__tests__/index.test.ts | Comprehensive test suite with 39 tests validating all search functionality and edge cases |

Confidence score: 4/5

- This PR is safe to merge, with only minor risks related to the complexity of the delete-then-insert pattern
- The score reflects a solid implementation with comprehensive testing; one point was deducted for complex transaction handling and potential race conditions in the delete-then-insert pattern used for document updates
- Pay close attention to packages/index/src/fts5.ts for transaction handling and concurrent access patterns

Sequence Diagram

sequenceDiagram
    participant User
    participant Index
    participant Database
    participant FTS5

    User->>Index: "createIndex(options)"
    Index->>Database: "new Database(path)"
    Index->>Database: "PRAGMA journal_mode=WAL"
    Index->>Database: "CREATE VIRTUAL TABLE ... USING fts5"
    Database->>FTS5: "Initialize FTS5 table"
    Index-->>User: "return index instance"

    User->>Index: "add(document)"
    Index->>Database: "DELETE FROM table WHERE id = ?"
    Index->>Database: "INSERT INTO table VALUES (id, content, metadata)"
    Database->>FTS5: "Index content for search"
    Index-->>User: "Result.ok()"

    User->>Index: "search(query)"
    Index->>Database: "SELECT ... FROM table WHERE MATCH ?"
    Database->>FTS5: "Execute FTS5 search with BM25"
    FTS5-->>Database: "Return ranked results"
    Database-->>Index: "Search results with scores"
    Index-->>User: "Result.ok(searchResults)"

    User->>Index: "close()"
    Index->>Database: "close()"

greptile-apps

3 files reviewed, 3 comments

greptile-apps · 2026-01-23T21:06:47Z

packages/index/src/fts5.ts

+			id UNINDEXED,
+			content,
+			metadata UNINDEXED,
+			tokenize='${tokenizer}'


style: String interpolation in SQL could be vulnerable to injection if tokenizer comes from user input. Is the tokenizer value always controlled by the application code?
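
One way to rule out injection here is to check the value against an allowlist before it reaches the SQL string, since SQLite cannot bind identifiers or tokenizer names as parameters. The following is a minimal sketch, not the code in this PR: the helper names (`assertTokenizer`, `assertTableName`, `buildCreateTableSql`) are hypothetical, and only the three tokenizer names and the column layout come from the PR itself.

```typescript
// Tokenizer names taken from the PR description; everything else is illustrative.
const ALLOWED_TOKENIZERS = ["unicode61", "porter", "trigram"] as const;
type Tokenizer = (typeof ALLOWED_TOKENIZERS)[number];

// Reject anything that is not one of the known tokenizer names.
function assertTokenizer(value: string): Tokenizer {
  if ((ALLOWED_TOKENIZERS as readonly string[]).indexOf(value) === -1) {
    throw new Error(`Unsupported tokenizer: ${JSON.stringify(value)}`);
  }
  return value as Tokenizer;
}

// Restrict table names to a plain identifier (assumed policy, not from the PR).
function assertTableName(value: string): string {
  if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(value)) {
    throw new Error(`Invalid table name: ${JSON.stringify(value)}`);
  }
  return value;
}

// Interpolation is only reached after both values have been validated.
function buildCreateTableSql(tableName: string, tokenizer: string): string {
  const table = assertTableName(tableName);
  const tok = assertTokenizer(tokenizer);
  return `CREATE VIRTUAL TABLE IF NOT EXISTS ${table} USING fts5(
  id UNINDEXED,
  content,
  metadata UNINDEXED,
  tokenize='${tok}'
)`;
}
```

With this shape, a value such as `porter'); DROP TABLE documents; --` fails validation and never appears in the generated statement.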

greptile-apps · 2026-01-23T21:06:48Z

packages/index/src/__tests__/index.test.ts

+	it("addMany() is atomic (all or nothing)", async () => {
+		// First add some valid docs
+		await index.addMany([{ id: "valid-1", content: "Valid content" }]);
+
+		// Verify initial state
+		const db = new Database(dbPath);
+		const initialCount = db.query("SELECT COUNT(*) as count FROM documents").get() as {
+			count: number;
+		};
+		db.close();
+
+		expect(initialCount.count).toBe(1);
+	});


logic: Test for addMany() atomicity is incomplete - it only verifies initial state but doesn't test the actual atomic behavior (rollback on failure)
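
A fuller test would force a failure partway through a batch and then assert that nothing from that batch was persisted. The sketch below illustrates that shape against a hypothetical in-memory stand-in (`FakeIndex`), not the real SQLite-backed index, and it is synchronous for brevity. It assumes the failure is triggered by metadata that `JSON.stringify` cannot serialize, such as a circular object.

```typescript
// Hypothetical stand-in for the index; the real implementation uses SQLite.
interface Doc {
  id: string;
  content: string;
  metadata?: unknown;
}

interface Row {
  content: string;
  metadata: string | null;
}

class FakeIndex {
  private rows: Record<string, Row> = {};

  // Stage the whole batch on a copy; swap it in only if every doc serializes.
  addMany(docs: Doc[]): { ok: true } | { ok: false; error: Error } {
    const staged: Record<string, Row> = { ...this.rows };
    try {
      for (const doc of docs) {
        const metadata =
          doc.metadata === undefined ? null : JSON.stringify(doc.metadata);
        staged[doc.id] = { content: doc.content, metadata };
      }
    } catch (error) {
      return { ok: false, error: error as Error }; // staged copy is discarded
    }
    this.rows = staged;
    return { ok: true };
  }

  count(): number {
    return Object.keys(this.rows).length;
  }
}

function runAtomicityCheck(): { before: number; after: number; failed: boolean } {
  const index = new FakeIndex();
  index.addMany([{ id: "valid-1", content: "Valid content" }]);
  const before = index.count();

  // JSON.stringify throws a TypeError on circular structures.
  const circular: Record<string, unknown> = {};
  circular.self = circular;

  const result = index.addMany([
    { id: "valid-2", content: "Serializes fine, but must not persist" },
    { id: "bad", content: "Fails to serialize", metadata: circular },
  ]);

  return { before, after: index.count(), failed: !result.ok };
}
```

The key assertion is that the count after the failed batch equals the count before it, even though the first document in the batch was valid.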

greptile-apps · 2026-01-23T21:06:48Z

packages/index/src/__tests__/index.test.ts

+		const walPath = `${dbPath}-wal`;
+
+		// Write something to trigger WAL file creation
+		index.add({ id: "wal-test", content: "WAL test" });


logic: Test calls index.add() without awaiting the result, which could cause race conditions in the WAL file existence check

Suggested change

-		index.add({ id: "wal-test", content: "WAL test" });
+		// Write something to trigger WAL file creation
+		await index.add({ id: "wal-test", content: "WAL test" });
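
To illustrate why the missing `await` matters, here is a small sketch with a hypothetical `SlowStore` class standing in for the index. It assumes only that `add()` is asynchronous and that its effect becomes visible after the returned promise settles; it does not model SQLite or WAL files.

```typescript
// Hypothetical stand-in: the write becomes visible only after an async step.
class SlowStore {
  written = false;

  async add(): Promise<void> {
    await Promise.resolve(); // the rest of the method runs in a later microtask
    this.written = true;
  }
}

// Without await, the check runs before the write has completed.
async function checkWithoutAwait(): Promise<boolean> {
  const store = new SlowStore();
  void store.add(); // fire and forget
  return store.written;
}

// With await, the check runs after the write has completed.
async function checkWithAwait(): Promise<boolean> {
  const store = new SlowStore();
  await store.add();
  return store.written;
}
```

In this sketch the unawaited variant observes `written` as `false`, while the awaited variant observes `true`, which is the same ordering hazard the WAL file existence check faces.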

galligan · 2026-01-23T21:54:49Z

Verified BM25 ordering: search() orders by bm25 ASC (lower scores first) and the relevance-order test asserts ascending scores. Tests passed in the pre-push run.
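
For context, FTS5's bm25() returns numerically smaller values for better matches, so an ascending sort puts the most relevant row first. A minimal sketch of that ordering on plain objects, assuming a hypothetical `Hit` shape and made-up scores rather than the PR's actual result type:

```typescript
// Hypothetical result shape; scores are invented for illustration.
interface Hit {
  id: string;
  score: number;
}

// Ascending numeric sort: the most negative (most relevant) score comes first.
function rankByBm25(hits: Hit[]): Hit[] {
  return hits.slice().sort((a, b) => a.score - b.score);
}

const ranked = rankByBm25([
  { id: "weak", score: -0.4 },
  { id: "strong", score: -2.1 },
  { id: "medium", score: -1.3 },
]);
// ranked ids in order: strong, medium, weak
```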

galligan · 2026-01-23T23:10:50Z

Restacked after downstack update (formatRelative test stabilization); no additional changes in this PR.

galligan · 2026-01-24T03:07:57Z

Validated tableName/tokenizer to avoid SQL interpolation issues, strengthened the addMany atomicity test (rollback on JSON stringify failure), and awaited the WAL creation write. Kept BM25 ordering ASC: tests and documentation treat lower (more negative) scores as more relevant.

galligan force-pushed the p3-19/index/types-interfaces branch from b130dcc to d3d9843 Compare January 23, 2026 11:45

galligan force-pushed the p3-20/index/fts5-impl branch from b50f9a7 to aff813d Compare January 23, 2026 11:45

galligan force-pushed the p3-19/index/types-interfaces branch from d3d9843 to d6e91c4 Compare January 23, 2026 11:58

galligan force-pushed the p3-20/index/fts5-impl branch from aff813d to 097c3aa Compare January 23, 2026 11:58

galligan added the enhancement New feature or request label Jan 23, 2026

galligan force-pushed the p3-19/index/types-interfaces branch from d6e91c4 to 01b6488 Compare January 23, 2026 16:03

galligan force-pushed the p3-20/index/fts5-impl branch from 608a423 to b8d09c1 Compare January 23, 2026 19:46

This was referenced Jan 23, 2026

fix(cli): enforce cursor TTL #114

Open

fix(tests): remove placeholder tests #113

Open

greptile-apps bot reviewed Jan 23, 2026

galligan mentioned this pull request Jan 23, 2026

chore(gitattributes): prefer ours for bun lockfiles #115

Open

galligan changed the base branch from p3-19/index/types-interfaces to graphite-base/82 January 23, 2026 21:47

galligan force-pushed the p3-20/index/fts5-impl branch from b8d09c1 to 86b969f Compare January 23, 2026 21:54

galligan force-pushed the graphite-base/82 branch from 1659fa4 to e52caea Compare January 23, 2026 21:54

galligan changed the base branch from graphite-base/82 to p3-19/index/types-interfaces January 23, 2026 21:54

galligan force-pushed the p3-20/index/fts5-impl branch from 86b969f to c319824 Compare January 23, 2026 23:09

galligan force-pushed the p3-19/index/types-interfaces branch from e52caea to 3bc02a6 Compare January 23, 2026 23:09

galligan and others added 3 commits January 23, 2026 22:01

fix(index): order bm25 results ascending

debd3d0

fix(index): export createIndex from entrypoint

5589b6a

galligan changed the base branch from p3-19/index/types-interfaces to graphite-base/82 January 24, 2026 03:02

fix(index): validate tokenizer and tests

1ed1245

galligan force-pushed the graphite-base/82 branch from 3bc02a6 to d0d9e68 Compare January 24, 2026 03:07

galligan force-pushed the p3-20/index/fts5-impl branch from c319824 to 1ed1245 Compare January 24, 2026 03:07

galligan changed the base branch from graphite-base/82 to p3-19/index/types-interfaces January 24, 2026 03:07
