feat(thread): Implement main thread crate to unify crate exposure #75

Open
bashandbone wants to merge 18 commits into main from feat/thread-main

@bashandbone (Contributor) commented Feb 25, 2026

This minor update:

  • Implements a main 'thread' crate for unified 'use thread' library use.
  • Fixes outdated dependencies in two benchmarks
  • Significantly updates planning documents for feat-001 to reflect current codebase state, clearer commercial boundaries, and integration of improved port of sister project CodeWeaver's semantic analysis capabilities

Summary by Sourcery

Introduce a unified thread facade crate as the main entrypoint for the ecosystem, update realtime code graph specifications to clarify IDs, overlays, OSS vs commercial boundaries, and protocols, and simplify rule-engine benchmarks to focus solely on internal implementations.

New Features:

  • Add a top-level thread crate that re-exports core AST, language, rule engine, services, and utilities for ergonomic 'use thread' consumption.
  • Document a Delta overlay model and reachability index behavior for the realtime code graph, including WebSocket replay and health/observability contracts.

Bug Fixes:

  • Remove stale ast-grep benchmark dependencies and comparisons in rule-engine benches to keep benchmarks aligned with current internal rule engine usage.

Enhancements:

  • Refine data model types with strong ID newtypes, credential indirection, and richer semantic metadata for the realtime code graph.
  • Clarify architecture decisions, API transport (prost over HTTP POST), crate ownership boundaries, and OSS vs commercial scope in the planning and spec documents.
  • Extend WebSocket protocol docs with timeout semantics, missed-update replay, and authentication nuances.
  • Tighten task breakdown and success criteria, adding TDD gates, backend deprecation notes, and clearer responsibilities for crates like thread-graph, thread-conflict, and storage backends.

Tests:

  • Add integration tests to validate thread crate re-exports and basic end-to-end usage.
  • Restructure classification and graph invariants tasks around explicit failing tests and property tests for IDs and reachability.

@bashandbone bashandbone added the enhancement New feature or request label Feb 25, 2026
Copilot AI review requested due to automatic review settings February 25, 2026 01:29
@sourcery-ai bot (Contributor) commented Feb 25, 2026

Reviewer's Guide

Introduces a new top-level thread crate that unifies access to core Thread ecosystem crates via feature-gated re-exports, updates realtime code graph specs/plans to clarify OSS vs commercial boundaries and technical contracts, and simplifies rule-engine benchmarks by removing ast-grep comparison paths and dependencies.

Class diagram for the new thread facade crate modules and reexports

classDiagram
    class ThreadCrate {
        <<crate>>
        +mod ast
        +mod language
        +mod rule
        +mod flow
        +mod services
        +mod utils
        +AstGrep
        +Language
        +Node
        +Root
        +SupportLang
        +CodeAnalyzer
        +CodeParser
        +ParsedDocument
        +ServiceError
        +ServiceResult
    }

    class AstModule {
        <<module>>
        +use thread_ast_engine
    }

    class LanguageModule {
        <<module>>
        +use thread_language
    }

    class RuleModule {
        <<module>>
        +use thread_rule_engine
    }

    class FlowModule {
        <<module>>
        +use thread_flow
    }

    class ServicesModule {
        <<module>>
        +use thread_services
    }

    class UtilsModule {
        <<module>>
        +use thread_utils
    }

    class ThreadAstEngineExports {
        <<external_crate>>
        +AstGrep
        +Language
        +Node
        +Root
    }

    class ThreadLanguageExports {
        <<external_crate>>
        +SupportLang
    }

    class ThreadServicesExports {
        <<external_crate>>
        +CodeAnalyzer
        +CodeParser
        +ParsedDocument
        +ServiceError
        +ServiceResult
    }

    ThreadCrate *-- AstModule
    ThreadCrate *-- LanguageModule
    ThreadCrate *-- RuleModule
    ThreadCrate *-- FlowModule
    ThreadCrate *-- ServicesModule
    ThreadCrate *-- UtilsModule

    AstModule ..> ThreadAstEngineExports : reexports
    LanguageModule ..> ThreadLanguageExports : reexports
    ServicesModule ..> ThreadServicesExports : reexports

    ThreadCrate ..> ThreadAstEngineExports : top_level_reexports
    ThreadCrate ..> ThreadLanguageExports : top_level_reexports
    ThreadCrate ..> ThreadServicesExports : top_level_reexports

Flow diagram for simplified rule engine benchmarks without ast_grep_config

graph TD
    BenchEntry[criterion benchmark group] --> PrepareThreadRules[prepare ThreadGlobalRules and YAML rules]
    PrepareThreadRules --> ThreadCombinedScan[create ThreadCombinedScan]
    ThreadCombinedScan --> ThreadAstGrep[create ThreadSupportLang TypeScript ast_grep instance]
    ThreadAstGrep --> ThreadScanBenchmark[run thread_rule_engine scan and collect matches]
    ThreadScanBenchmark --> Metrics[record timing and memory metrics]

    RemovedAstGrepPath[removed ast_grep_config comparison path] -. "no_longer_used" .-> BenchEntry

File-Level Changes

Change: Add main thread crate as unified entry point with feature-gated re-exports for core ecosystem crates.
Details:
  • Create crates/thread crate with library-style lib.rs that re-exports ast engine, language support, rule engine, flow, services, and utils under feature flags.
  • Define feature matrix (ast, language, rule, flow, services, utils, full, worker) wiring through to underlying crates and their features, including WASM/worker-specific configuration.
  • Add workspace wiring by including the new crate in root Cargo workspace members and as a dependency in the root dependencies table.
  • Add basic integration tests to verify top-level re-exports compile and work for a simple AST query and service type usage.
Files:
  • crates/thread/Cargo.toml
  • crates/thread/src/lib.rs
  • crates/thread/tests/integration.rs
  • Cargo.toml
  • Cargo.lock

Change: Refine realtime code graph specifications to formalize IDs/types, overlay graph behavior, OSS vs commercial boundaries, observability, and transport protocols.
Details:
  • Update data model to introduce strong-typed newtype IDs (FileId, NodeId, EdgeId, ConflictId, UserId, SessionId, EngineId), centralized credentials via CredentialRef/CredentialStore, enriched SemanticMetadata, GraphEdge.id, and a Delta overlay structure.
  • Clarify reachability index design (k-hop bounded, baseline vs live state), overlay graph default query semantics (include_local_delta), and conflict-related type ownership in thread-api vs commercial crates.
  • Rework functional requirements and success criteria to distinguish OSS vs commercial scope, specify degraded modes (semantic search fallback, partial results, circuit breaker behavior), health/metrics endpoints, observability (structured logs), and authentication behavior.
  • Adjust RPC/websocket contracts to move conflict detection into an extension trait, add include_local_delta on queries, swap node_type for semantic_class/node_kind, define ConflictUpdateStatus and RequestMissedUpdates, and document binary vs Protobuf transport roles.
  • Update websocket protocol and architectural plan docs to describe replay behavior, ping/pong timeouts, API protocol choice (prost+HTTP POST, postcard internal), crate ownership boundaries, and task annotations that mark conflict detection and AI resolution as deferred/commercial scope.
Files:
  • specs/001-realtime-code-graph/data-model.md
  • specs/001-realtime-code-graph/spec.md
  • specs/001-realtime-code-graph/tasks.md
  • specs/001-realtime-code-graph/contracts/websocket-protocol.md
  • specs/001-realtime-code-graph/contracts/rpc-types.rs
  • specs/001-realtime-code-graph/plan.md

Change: Simplify rule-engine benchmarks by removing direct ast-grep comparison code paths and dependencies.
Details:
  • Strip ast-grep imports, rule parsing, scanning, and memory benchmarks from rule-engine comparison benches, leaving only thread-rule-engine measurements.
  • Clean up benchmark setup to only construct Thread rules, scanners, and ASTs, and remove now-unused globals and variables.
Files:
  • crates/rule-engine/benches/comparison_benchmarks.rs
  • crates/rule-engine/benches/ast_grep_comparison.rs
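
For readers unfamiliar with the "strong-typed newtype IDs" mentioned in the data-model change, a minimal sketch follows. The names FileId and NodeId come from the summary above; the inner u64 representation, the derives, and the file_of helper are assumptions for illustration and are not taken from data-model.md.

```rust
use std::collections::HashMap;

/// Identifies a source file. The inner `u64` is an assumed representation.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct FileId(pub u64);

/// Identifies a graph node. Distinct from `FileId` even though both wrap `u64`,
/// so the compiler rejects passing one where the other is expected.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct NodeId(pub u64);

/// Hypothetical lookup: which file owns a given node?
pub fn file_of(index: &HashMap<NodeId, FileId>, node: NodeId) -> Option<FileId> {
    index.get(&node).copied()
}
```

With this shape, a call such as file_of(&index, FileId(1)) fails to compile, which is the kind of mix-up the newtype pattern is meant to catch.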

@sourcery-ai bot (Contributor) left a comment

Hey - I've found 2 issues, and left some high level feedback:

  • In the new thread facade crate, the default feature set pulls in services (and via flow potentially heavy dependencies like storage backends); consider narrowing the default to the lightweight AST/language/rule stack so consumers can opt into service/flow features explicitly rather than getting them transitively.
  • The rule-engine benches (comparison_benchmarks.rs, ast_grep_comparison.rs) still carry a "comparison" naming and structure even though the ast-grep side has been removed; it may be clearer to rename/simplify these benches to match their new purpose and avoid confusion about missing baselines.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In the new `thread` facade crate, the `default` feature set pulls in `services` (and via `flow` potentially heavy dependencies like storage backends); consider narrowing the default to the lightweight AST/language/rule stack so consumers can opt into service/flow features explicitly rather than getting them transitively.
- The rule-engine benches (`comparison_benchmarks.rs`, `ast_grep_comparison.rs`) still carry a "comparison" naming and structure even though the ast-grep side has been removed; it may be clearer to rename/simplify these benches to match their new purpose and avoid confusion about missing baselines.

## Individual Comments

### Comment 1
<location path="crates/rule-engine/benches/ast_grep_comparison.rs" line_range="102" />
<code_context>
 fn bench_rule_parsing_comparison(c: &mut Criterion) {
</code_context>
<issue_to_address>
**suggestion:** The `ast_grep_comparison` benchmarks no longer exercise ast-grep but still carry ast-grep-specific naming.

This is effectively a duplicate of the thread-only benchmarks, but its name and benchmark IDs still imply ast-grep coverage. Please either remove it, merge the remaining thread benchmarks into the existing benchmark module, or rename the file and benchmark groups to match their actual scope.
</issue_to_address>

### Comment 2
<location path="crates/thread/Cargo.toml" line_range="40" />
<code_context>
+full = ["ast", "language", "rule", "flow", "services", "utils", "thread-language/html-embedded", "thread-services/tower-services"]
+
+# Special feature for WASM/Edge deployment
+worker = ["ast", "language", "rule", "services", "thread-rule-engine/worker", "thread-flow/worker", "thread-services/ast-grep-backend"]
+
+
</code_context>
<issue_to_address>
**issue (bug_risk):** The `worker` feature enables `thread-flow/worker` but not the crate’s `flow` feature, so the `flow` module stays disabled.

As configured, `worker` only propagates to `thread-flow/worker` and never enables this crate’s `flow` feature (which controls the `flow` module via `cfg(feature = "flow")`). If `thread::flow` APIs should be available when `worker` is enabled, add `"flow"` to the `worker` feature. If `worker` is meant to exclude the flow layer, consider removing `thread-flow/worker` to avoid partial or misleading exposure.
</issue_to_address>

Copilot AI (Contributor) left a comment

Pull request overview

Introduces a new top-level thread crate to provide a single ergonomic entry point for the workspace, updates benchmarks to remove outdated ast-grep-* comparisons, and substantially refreshes the feat-001 planning/spec documents (including clearer OSS vs commercial boundaries and updated protocol decisions).

Changes:

  • Add crates/thread as the main façade crate that re-exports core sub-crates behind feature flags.
  • Update rule-engine benchmark sources to drop ast-grep-config/ast-grep-language comparisons.
  • Expand/modernize feat-001 spec/plan/contracts with revised API transport, conflict-scope boundaries, and additional requirements/criteria.

Reviewed changes

Copilot reviewed 12 out of 13 changed files in this pull request and generated 4 comments.

Summary per file (File: Description)
specs/001-realtime-code-graph/tasks.md: Updates task breakdown/gates (TDD split, stub-hardening requirement, commercial deferrals).
specs/001-realtime-code-graph/spec.md: Updates requirements/SCs and clarifies protocol/operational behavior (e.g., HTTP POST + protobuf).
specs/001-realtime-code-graph/plan.md: Updates architectural plan and decisions (protocol choice, edge deployment, crate boundaries).
specs/001-realtime-code-graph/data-model.md: Refines conceptual data model (newtyped IDs, credential references, reachability/index details).
specs/001-realtime-code-graph/contracts/websocket-protocol.md: Extends WebSocket contract (timeout semantics, replay request, updated heartbeat).
specs/001-realtime-code-graph/contracts/rpc-types.rs: Updates draft RPC types to reflect new protocol decisions and commercial conflict extension trait.
crates/thread/tests/integration.rs: Adds integration tests validating the new thread crate re-exports.
crates/thread/src/lib.rs: Implements the thread façade module structure and top-level re-exports.
crates/thread/Cargo.toml: Defines the new thread crate package, features, and optional deps.
crates/rule-engine/benches/comparison_benchmarks.rs: Removes ast-grep-* benchmark paths, leaving thread-only benchmarks.
crates/rule-engine/benches/ast_grep_comparison.rs: Removes ast-grep-* benchmark paths, leaving thread-only benchmarks.
Cargo.toml: Adds crates/thread to the workspace and as a workspace dependency.
Cargo.lock: Adds the new thread package to the lockfile.

use thread::language::{Tsx, LanguageExt};
use thread::Root;

let ast: Root<_> = Tsx.ast_grep("const x = 1;");
Copilot AI commented Feb 25, 2026

LanguageExt::ast_grep returns an AstGrep<...>, not a Root, so let ast: Root<_> = Tsx.ast_grep(...) will not type-check and ast.root() will fail to compile. Drop the Root<_> annotation (or change it to AstGrep<_>), and keep calling .root() on the returned AstGrep if you want a Root for matching.

Suggested change
- let ast: Root<_> = Tsx.ast_grep("const x = 1;");
+ let ast = Tsx.ast_grep("const x = 1;");

@socket-security bot commented Feb 25, 2026

fix(thread): expose flow module under worker feature; fix test type annotation

- Change `#[cfg(feature = "flow")]` to `#[cfg(any(feature = "flow", feature = "worker"))]`
  on `pub mod flow` in lib.rs. The `worker` feature implicitly enables `dep:thread-flow`
  via `thread-flow/worker`, but the flow module was never exposed because the crate's own
  `flow` feature remained false. Simply adding "flow" to worker was not viable since it
  would also pull in `thread-flow/parallel` and `thread-flow/postgres-backend`, which are
  incompatible with WASM/edge deployment.
- Remove incorrect `Root<_>` type annotation in integration test; `LanguageExt::ast_grep`
  returns `AstGrep<...>`, not `Root`. Drop the now-unused `use thread::Root` import.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
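
A minimal sketch of the gating this commit describes, assuming a stand-in module body. The real `thread::flow` module re-exports `thread_flow`, which is not reproduced here, and `flow_module_exposed` is a hypothetical helper added only so the gate can be observed.

```rust
/// Stand-in for `pub mod flow` in the facade crate's lib.rs.
/// It is compiled when EITHER the `flow` or the `worker` feature is enabled,
/// so `worker` builds see the module without also pulling in the rest of `flow`.
#[cfg(any(feature = "flow", feature = "worker"))]
pub mod flow {
    /// Illustrative item only; not part of the real crate.
    pub fn describe() -> &'static str {
        "flow module is exposed"
    }
}

/// Hypothetical helper: evaluates the same predicate at compile time.
pub fn flow_module_exposed() -> bool {
    cfg!(any(feature = "flow", feature = "worker"))
}
```

Compiled with neither feature, the module is absent and the helper returns false. Enabling either feature flips both together, which is the behavior the commit message is after.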

fix(ci): resolve three clippy errors causing CI failures

- comparison_benchmarks.rs: use ThreadSupportLang (already imported) instead
  of thread_language::TypeScript for from_yaml_string type parameter. TypeScript
  only implements Deserialize when the `typescript` feature is enabled; SupportLang
  is the correct type here and is already used consistently everywhere else in the
  same bench file.
- d1.rs: remove redundant closure in filter_map; PathBuf::from can be passed
  directly as the function item.
- performance.rs: replace manual checked-division pattern (if x > 0 { a / x })
  with checked_div().unwrap_or(0) in fingerprint_stats and query_stats.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
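
The two std-only fixes in this commit can be sketched as follows. The function names and signatures are hypothetical, since the actual code in d1.rs and performance.rs is not shown in this thread.

```rust
use std::path::PathBuf;

/// Average per item; yields 0 when `count` is 0 instead of panicking.
/// Replaces the manual `if count > 0 { total / count } else { 0 }` pattern.
pub fn average(total: u64, count: u64) -> u64 {
    total.checked_div(count).unwrap_or(0)
}

/// Converts optional path strings to `PathBuf`s, skipping `None` entries.
/// `PathBuf::from` is passed directly as a function item rather than being
/// wrapped in a redundant closure such as `|s| PathBuf::from(s)`.
pub fn collect_paths(raw: &[Option<&str>]) -> Vec<PathBuf> {
    raw.iter()
        .copied()
        .filter_map(|entry| entry.map(PathBuf::from))
        .collect()
}
```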

fix(ci): resolve additional clippy lints in flow crate

- types.rs: fix inconsistent digit grouping in test timestamp literal
  (1706400000_000_000 → 1_706_400_000_000_000)
- graph.rs: replace PathBuf::from("A/B") with Path::new("A/B") in assertion
  comparison to avoid allocating owned values (cmp_owned lint); clippy
  suggested bare &str but PathBuf doesn't impl PartialEq<&str>, so use Path
- observability_metrics_tests.rs: replace &[x.clone()] with
  std::slice::from_ref(&x) for four single-element slice args
  (cloned_ref_to_slice_refs lint)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
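
A small sketch of the three lint fixes named in this commit, using only the standard library. The constant and function names are hypothetical; the real test code in the flow crate is not shown in this thread.

```rust
use std::path::{Path, PathBuf};
use std::slice;

/// Same value as 1706400000_000_000, with consistent three-digit grouping.
pub const TEST_TIMESTAMP: i64 = 1_706_400_000_000_000;

/// Compares against a borrowed `Path` rather than building a second `PathBuf`.
/// `PathBuf` implements `PartialEq<&Path>`, so no allocation is needed.
pub fn is_expected_path(actual: PathBuf) -> bool {
    actual == Path::new("A/B")
}

/// Hypothetical callee that takes a slice of needles.
pub fn count_matching(needles: &[String], haystack: &str) -> usize {
    needles.iter().filter(|n| haystack.contains(n.as_str())).count()
}

/// Passes one value as a one-element slice without cloning it.
/// Before the fix this would have been `count_matching(&[needle.clone()], ...)`.
pub fn count_single_needle() -> usize {
    let needle = String::from("flow");
    count_matching(slice::from_ref(&needle), "thread-flow worker")
}
```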