Move definition ID computation (namespace_definition_id, method_definition_id) from inline format strings in each definition type into model::ids so they can be reused by the operation applier.
Introduce the Operation enum with 18 struct variants (EnterClass, EnterModule, EnterMethod, DefineConstant, etc.) plus ExitScope. Each struct carries its own data; scope context is provided by surrounding Enter/Exit operations.
Implement OperationVisitor to produce a human-readable, indented text representation of operations. Used by builder tests to assert on output.
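As a rough illustration of the two commits above, here is a minimal sketch of an operation list and an indented text rendering of it. The variant names come from this PR, but the fields, the inline struct-like variants, and the helper names `label`, `print_operations`, and `example_operations` are assumptions made for the example; the real enum wraps dedicated structs and the real printer goes through `OperationVisitor`.

```rust
// Hypothetical, trimmed-down stand-in for the real Operation enum.
#[derive(Debug, Clone, PartialEq)]
enum Operation {
    EnterClass { name: String },
    EnterModule { name: String },
    EnterMethod { name: String },
    DefineConstant { name: String },
    ExitScope,
}

/// One-line label for a single operation (assumed format, not the real printer's).
fn label(op: &Operation) -> String {
    match op {
        Operation::EnterClass { name } => format!("EnterClass({})", name),
        Operation::EnterModule { name } => format!("EnterModule({})", name),
        Operation::EnterMethod { name } => format!("EnterMethod({})", name),
        Operation::DefineConstant { name } => format!("DefineConstant({})", name),
        Operation::ExitScope => "ExitScope".to_string(),
    }
}

/// Renders one operation per line, indented by two spaces per open scope.
fn print_operations(ops: &[Operation]) -> String {
    let mut out = String::new();
    let mut depth: usize = 0;
    for op in ops {
        // ExitScope closes the innermost scope, so dedent before printing it.
        if matches!(op, Operation::ExitScope) {
            depth = depth.saturating_sub(1);
        }
        out.push_str(&"  ".repeat(depth));
        out.push_str(&label(op));
        out.push('\n');
        // Every Enter* operation opens a scope for the operations that follow.
        if matches!(
            op,
            Operation::EnterClass { .. }
                | Operation::EnterModule { .. }
                | Operation::EnterMethod { .. }
        ) {
            depth += 1;
        }
    }
    out
}

/// Sample list: a class holding one constant and one (empty) method.
fn example_operations() -> Vec<Operation> {
    vec![
        Operation::EnterClass { name: "Foo".to_string() },
        Operation::DefineConstant { name: "BAR".to_string() },
        Operation::EnterMethod { name: "baz".to_string() },
        Operation::ExitScope,
        Operation::ExitScope,
    ]
}
```

Because scope is implied by the surrounding Enter/Exit pairs, the printer only needs a depth counter rather than any per-operation scope data.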
Walk the Ruby AST (via ruby_prism) to produce a Vec<Operation>. This is the first phase of the two-phase indexing pipeline: source code to operations.
Implement OperationVisitor to convert operations into definitions in a LocalGraph. This is the second phase of the pipeline: operations to indexed definitions. LocalGraph::from_parts allows constructing a graph from pre-built parts (strings, names, document) so the applier can build graphs without going through the indexer.
Introduce IndexerBackend to switch between RubyIndexer and the operation pipeline. Add build_local_graph to dispatch indexing based on the backend.
Both RubyIndexer and OperationApplier now run the same tests via a shared ruby_indexer_tests.rs included with #[path]. Each parent provides a backend() function to select which pipeline to use.
Wire up the IndexerBackend enum to the CLI so users can choose between the original RubyIndexer and the operation pipeline.
What
Introduces an intermediate representation (IR) between Ruby parsing and graph construction. Instead of the indexer writing directly to a LocalGraph, a new pipeline produces an ordered Vec<Operation> (an explicit list of what each file contributes to the graph), then applies those operations to build the graph.
Current pipeline:
graph LR
  AST --> RubyIndexer --> LocalGraph --> Graph --> Resolution
New pipeline:
graph LR
  AST --> RubyOperationBuilder --> Operations["Vec<Operation>"] --> OperationApplier --> LocalGraph --> Graph --> Resolution
(This is an intermediate step to ensure parity with the current indexer / local graph and reduce the scope of this PR.)
Target pipeline:
graph LR
  AST --> Operations["Vec<Operation>"] --> Graph["Graph (resolution applies operations directly)"]
Once we commit to this direction, we can remove the LocalGraph and feed the operations directly to the resolver to let it create both the definitions and the declarations.
Why
Four motivations that reinforce each other:
DSL plugin API
Plugins need to contribute definitions to the graph. Without operations, plugins mutate LocalGraph internals: fragile, tightly coupled, and bad for parallelism (Ruby plugins calling into the graph would require synchronization).
With operations, a plugin just produces Vec<Operation>, a pure data structure that can be built in parallel and applied later. Operations become the public contract between Rubydex and plugins.
An ActiveRecord plugin sees has_many :comments and emits operations. It doesn't touch graph internals; it speaks the same operation protocol as the built-in indexer.
Unloading
Each file's contribution is its operation list. Negating the operations (walk in reverse, opposite effect) cleanly unloads the file without graph diffing.
Because DSL plugins produce the same IR primitives as the built-in indexer, we know how to undo their contributions for free: no plugin-specific cleanup logic, no tracking which definitions in the graph came from which plugin.
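A minimal sketch of what "walk in reverse, opposite effect" could look like, assuming a toy graph that only counts how many definitions contribute to each fully qualified name. `ToyGraph`, `defined_names`, `load`, and `unload` are hypothetical names for this example and not part of Rubydex; real negation would have to cover every operation kind.

```rust
use std::collections::HashMap;

// Hypothetical subset of the Operation enum, with assumed fields.
#[derive(Debug, Clone, PartialEq)]
enum Operation {
    EnterClass { name: String },
    DefineConstant { name: String },
    ExitScope,
}

/// Toy graph: fully qualified name -> number of definitions contributing to it.
#[derive(Debug, Default, PartialEq)]
struct ToyGraph {
    definitions: HashMap<String, usize>,
}

/// Forward pass that resolves each defining operation to the name it touches.
fn defined_names(ops: &[Operation]) -> Vec<String> {
    let mut scope: Vec<String> = Vec::new();
    let mut names = Vec::new();
    for op in ops {
        match op {
            Operation::EnterClass { name } => {
                scope.push(name.clone());
                names.push(scope.join("::"));
            }
            Operation::DefineConstant { name } => {
                let mut path = scope.clone();
                path.push(name.clone());
                names.push(path.join("::"));
            }
            Operation::ExitScope => {
                scope.pop();
            }
        }
    }
    names
}

impl ToyGraph {
    /// Applies a file's operations: each definition bumps its name's count.
    fn load(&mut self, ops: &[Operation]) {
        for name in defined_names(ops) {
            *self.definitions.entry(name).or_insert(0) += 1;
        }
    }

    /// Negates a file's operations: same effects, reverse order, opposite sign.
    fn unload(&mut self, ops: &[Operation]) {
        for name in defined_names(ops).into_iter().rev() {
            let remaining = match self.definitions.get_mut(&name) {
                Some(count) => {
                    *count -= 1;
                    *count
                }
                None => continue,
            };
            // Drop the entry once no file contributes to it any more.
            if remaining == 0 {
                self.definitions.remove(&name);
            }
        }
    }
}
```

In this sketch, unloading never inspects where a definition came from. Whether the list was produced by the built-in builder or by a plugin makes no difference, which is the property the paragraph above relies on.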
Incremental re-indexing
Because operations are a flat list, we can diff the old and new operation lists when a file changes. Only the delta gets applied: remove operations that disappeared, add new ones. Most edits change very few operations, so the diff is small.
No full rebuild, no negating the entire file, just the minimal set of changes.
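The diff itself could be an ordinary sequence diff over the two lists. The sketch below uses a textbook longest-common-subsequence table; the `Change` type and `diff_operations` function are hypothetical, and the sketch compares operations purely by equality, ignoring that an operation's meaning depends on its enclosing Enter/Exit scope, which a real implementation would need to account for.

```rust
// Hypothetical subset of the Operation enum, with assumed fields.
#[derive(Debug, Clone, PartialEq)]
enum Operation {
    EnterClass { name: String },
    DefineConstant { name: String },
    ExitScope,
}

/// One entry of the delta between two operation lists.
#[derive(Debug, Clone, PartialEq)]
enum Change {
    Removed(Operation),
    Added(Operation),
}

/// Computes the delta from `old` to `new` using a longest-common-subsequence table.
fn diff_operations(old: &[Operation], new: &[Operation]) -> Vec<Change> {
    let (n, m) = (old.len(), new.len());
    // lcs[i][j] = length of the longest common subsequence of old[i..] and new[j..].
    let mut lcs = vec![vec![0usize; m + 1]; n + 1];
    for i in (0..n).rev() {
        for j in (0..m).rev() {
            lcs[i][j] = if old[i] == new[j] {
                lcs[i + 1][j + 1] + 1
            } else {
                lcs[i + 1][j].max(lcs[i][j + 1])
            };
        }
    }

    // Walk both lists, keeping shared operations and recording everything else.
    let (mut i, mut j) = (0, 0);
    let mut changes = Vec::new();
    while i < n && j < m {
        if old[i] == new[j] {
            i += 1;
            j += 1;
        } else if lcs[i + 1][j] >= lcs[i][j + 1] {
            changes.push(Change::Removed(old[i].clone()));
            i += 1;
        } else {
            changes.push(Change::Added(new[j].clone()));
            j += 1;
        }
    }
    // Whatever is left on either side is a pure removal or a pure addition.
    changes.extend(old[i..].iter().cloned().map(Change::Removed));
    changes.extend(new[j..].iter().cloned().map(Change::Added));
    changes
}
```

The table is quadratic in list length, which is fine for a sketch; a production diff would more likely trim the common prefix and suffix first, since most edits touch few operations.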
Order-dependent diagnostics
Operations preserve source order, which the current graph doesn't. Consider a namespace body that calls private_constant :BAR before assigning BAR.
Ruby raises NameError because private_constant :BAR runs before BAR exists. Today Rubydex can't catch this because both the constant and the visibility change become unordered definitions in the graph, and resolution wires them together regardless of order. With operations, the ordering is explicit (SetConstantVisibility before DefineConstant), and a diagnostic pass can detect the forward reference.
How
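Before the component breakdown, here is a sketch of how a diagnostic pass over the ordered list might flag the private_constant forward reference described above. The variant fields and the `forward_visibility_references` function are assumptions for illustration. The sketch tracks constants per scope opening only, so it would wrongly flag a constant defined in an earlier reopening of the same class or in another file.

```rust
use std::collections::HashSet;

// Hypothetical subset of the Operation enum, with assumed fields.
#[derive(Debug, Clone, PartialEq)]
enum Operation {
    EnterClass { name: String },
    DefineConstant { name: String },
    SetConstantVisibility { name: String },
    ExitScope,
}

/// Returns the qualified names of constants whose visibility is set
/// before the constant has been defined in the same scope.
fn forward_visibility_references(ops: &[Operation]) -> Vec<String> {
    // One set of already-defined constants per open scope; index 0 is top level.
    let mut defined: Vec<HashSet<&str>> = vec![HashSet::new()];
    let mut scope: Vec<&str> = Vec::new();
    let mut diagnostics = Vec::new();

    for op in ops {
        match op {
            Operation::EnterClass { name } => {
                scope.push(name.as_str());
                defined.push(HashSet::new());
            }
            Operation::DefineConstant { name } => {
                if let Some(current) = defined.last_mut() {
                    current.insert(name.as_str());
                }
            }
            Operation::SetConstantVisibility { name } => {
                // Source order matters: only definitions seen so far count.
                let seen = defined
                    .last()
                    .map_or(false, |set| set.contains(name.as_str()));
                if !seen {
                    let mut path = scope.clone();
                    path.push(name.as_str());
                    diagnostics.push(path.join("::"));
                }
            }
            Operation::ExitScope => {
                scope.pop();
                // Keep the top-level set even if ExitScope calls are unbalanced.
                if defined.len() > 1 {
                    defined.pop();
                }
            }
        }
    }
    diagnostics
}
```

The pass needs nothing beyond the flat list and a scope stack, which is exactly the information an unordered set of definitions cannot provide.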
Operation enum (operation/mod.rs): ~20 variants covering namespace scopes (EnterClass, EnterModule, EnterSingletonClass, ExitScope), method operations (EnterMethod, DefineAttribute, AliasMethod, SetMethodVisibility, SetDefaultVisibility), constants (DefineConstant, AliasConstant, SetConstantVisibility), variables (DefineInstanceVariable, DefineClassVariable, DefineGlobalVariable), mixins (Mixin), and references (ReferenceConstant, ReferenceMethod). Each variant is its own struct so consumers can take fields by value or by reference.
OperationPrinter (operation/printer.rs): Debug formatter that pretty-prints Vec<Operation> with indentation reflecting scope nesting. Used for debugging and testing the builder output.
RubyOperationBuilder (operation/ruby_builder.rs): Walks the Ruby AST and produces Vec<Operation>. Same visitor pattern as RubyIndexer, but emits operations instead of mutating a graph. ~3100 lines, 1:1 correspondence with every RubyIndexer behavior.
OperationApplier (operation/applier.rs): Converts Vec<Operation> → LocalGraph. Consumes operations by value (zero clones). This is scaffolding that exists only because the current merge pipeline expects a LocalGraph. In the target architecture, resolution consumes operations directly and both OperationApplier and LocalGraph go away.
IndexerBackend enum (indexing.rs, main.rs): --indexer ruby_indexer|operation_builder CLI flag to switch backends. Both backends run the same shared test suite (166 test functions × 2 backends).
Parity verification
Both the indexer test suite (166 tests) and the resolution test suite (208 tests) run with both backends using a #[path] module trick: each parent module provides a backend() function, which the shared test file calls as super::backend(), so the same test file exercises both RubyIndexer and the operation pipeline (RubyOperationBuilder plus OperationApplier).
Both backends produce identical output on Shopify core:
Performance
Essentially identical performance. The intermediate Vec<Operation> per file adds a very small overhead.