Skip to content

Feature/abstract ops#116

Draft
jellllly420 wants to merge 22 commits intoRPL-Toolchain:masterfrom
jellllly420:feature/abstract-ops
Draft

Feature/abstract ops#116
jellllly420 wants to merge 22 commits intoRPL-Toolchain:masterfrom
jellllly420:feature/abstract-ops

Conversation

@jellllly420
Copy link
Copy Markdown
Contributor

No description provided.

Adds a design document for the upcoming "ops" block feature that lets
pattern authors declare role-typed operations (e.g., $lock, $unlock)
and lets users supply concrete instances in rpl.toml. Match-time
substitution composes naturally with existing +/- set ops, and
patterns referencing op groups with no instances fold to empty
match-sets without special-casing.

Includes a CVE-2025-68260 POC test plan demonstrating the feature on
the first-ever Rust CVE in the Linux kernel.
15 task-level checklist covering grammar additions, AST + lowering,
resolver well-formedness checks (R1-R6), TOML schema and substitution
(C1-C7), match-time iteration over op instances via cartesian product,
and end-to-end UI tests including a CVE-2025-68260 POC.

Each task uses TDD: write failing test, run-fail, implement, run-pass,
commit.  The fold-on-empty-instances property falls out of the
cartesian helper without any conditional — that's the structural win.
Task 2 added OpRef as a sixth alternative to MirFnOperand, which
shifted the auto-generated pest_typed Choice5 -> Choice6.  Two
hand-written match sites needed updating to compile.  The new arm
delegates to a Task 4 stub since OpRef IR lowering lands there.
Adds the OpRef IR node so $group::$op call targets in patterns become
typed Operand::OpRef { group, op }.  Replaces the Task 2 placeholder
todo! in from_fn_op with real lowering, and adds stub arms wherever
matches over Operand are now non-exhaustive (resolver and matcher
implementations land in later tasks).
The previous Operand::OpRef lowering preserved the $-prefix from the
MetaVariable span, producing $-prefixed symbols.  But OpGroup.name
will be lowered from Identifier, which is bare.  Task 7's resolver
does a direct lookup against ops_block.groups, so the keys must match.

Adds a parser-round-trip test to catch this kind of regression.
Adds an ops_block field to Pattern<'pcx> and an add_ops_block
lowering pass that walks pest's pairs::opsBlock and populates
ops_block.groups with OpGroup entries.  Hooks into the top-level
Block dispatcher alongside pattBlock/utilBlock/diagBlock.

Extends collect_blocks in rpl_meta to also return opsBlock slices so
that add_parsed_pattern in context.rs can dispatch them.  A local
OpsMetaLookup struct implements GetType for resolving type meta-
variables ($T, $U) when lowering op-fn parameter and return types;
path types are not supported in ops items and will panic if attempted.

Adds crates/rpl_context/tests/ops_lowering.rs with two integration
tests that exercise the full parse→collect→lower pipeline.
Addresses code review on Task 5:

- Verify and document the index-counter alignment between
  OpsMetaLookup and NonLocalMetaVars::from_meta_decls (Path A:
  indices are correct as-is — both count only type-kind decls,
  starting independently at 0, matching NonLocalMetaVars' separate
  IndexVec per kind).
- Replace panics in OpsMetaLookup with unreachable! and messages
  that explicitly reference resolver check R1 as the precondition.
- Add a TODO marker on DUMMY_SP uses tying them to Task 6.
- Add a test exercising multiple op groups within a single block.
Adds three resolver checks that reject malformed ops blocks before
lowering hits the unreachable! contracts:

- R1: op-level meta-vars must be of kind 'type' (v1 restriction).
- R2: op signatures may only reference meta-vars declared on the
  same op group.
- R3: op signatures must be abstract — concrete Rust paths belong
  in rpl.toml.
Adds three more well-formedness checks plus per-pattern accounting:

- R4: $group::$op references must resolve to a declared op group
  and op signature.  Wrong group → "op group '<g>' is not declared";
  wrong op within a known group → "op '<o>' is not declared in op
  group '<g>'".
- R5: Op-call arity must match the op signature's arity at the
  call site.
- R6: Pattern bodies cannot reference op-level meta-vars; those are
  scope-isolated to the ops block.

Adds RustItems::referenced_op_groups() returning the set of op
groups referenced by OpRefs anywhere in the pattern body.  This
feeds Task 11's cartesian-product iteration over instances.

Also: replace the Task 4 todo!() stub in super_operand for
Operand::OpRef with a proper no-op (OpRef carries no sub-operands
to visit).  Includes parser.rs changes from Tasks 1-4 that were
uncommitted in the working tree.
Adds resolve_ops_config() that turns rpl_config::RawOpInstance into
typed ResolvedOpInstance for the matcher.

Implements C2-C7:
- C2/C3: missing meta-var/op binding -> warning, skip instance.
- C4: unknown extra key -> warning, skip.
- C5: undeclared placeholder in template -> warning, skip.
- C6: post-expansion parse failure -> warning, skip (deferred to match time).
- C7: cyclic substitution -> warning, skip.

Eager $T/$U expansion at load time via 16-iteration fixed-point loop;
only $1/$2-style existentials remain when the matcher consumes the result.

Deferred-parse path chosen for Step 5: binding values stored as
post-expansion strings (BTreeMap<Symbol, String>) rather than typed
Ty<pcx>/Path<pcx>. This keeps the implementation self-contained and
avoids threading PatCtxt through resolution. C6 parse-clean check
deferred to match time (Task 12).
Loads OpsConfig in check_crate from RplConfig.ops via
resolve_ops_config() and threads it through impl_matched_pat_op so
Task 11's cartesian-product iteration can consume the resolved
instances. Surfaces ResolveDiagnostics as rustc warnings.

Adds load_raw_ops() to rpl_config for the driver process to read
rpl.toml without going through the cargo-wrapper path. Stores the
raw ops vec on CheckFnCtxt and resolves per-pattern OpsConfig inside
each for_each_rpl_pattern closure in check_fn/check_assoc_fn.

The OpsConfig consumer code (cartesian iteration, OpRef resolution
during matching) lands in Tasks 11 and 12.
Wraps impl_matched_pat_op (and fn_matched_pat_op) with a cartesian-
product loop over op-instance combinations.  For each combination,
builds a ResolvedOpBindings view and invokes the renamed
impl_matched_pat_op_with_bindings (the original body, lifted out).

The "fold on empty instances" property falls out of the cartesian
helper: an empty factor folds the product to zero combinations,
which contributes the empty match-set to set-op composition.

Adds PatternOperation::referenced_op_groups() returning the union of
op-group names referenced by any operand of this set-op (recursive
through nested PatternOperations).

Exports pub fn cartesian<T,I,II> (odometer-order iterator) and
pub struct ResolvedOpBindings from rpl_driver; four unit tests
cover the four corner-cases including fold-on-no-instances.
At match time, when an Operand::OpRef { group, op } is encountered,
look up the active ResolvedOpInstance for the group, fetch the
expanded path string for the op, strip generic-arg brackets, split
on "::", intern to Symbols, build an ItemPath, and delegate to the
existing match_item_path_by_def_path matcher.

ResolvedOpBindings moved from rpl_driver (lifetime-parameterised) to
rpl_context::pat::ops_resolved as an owned struct (clones instances).
CheckMirCtxt gains an op_bindings field and a new_with_bindings
constructor. MatchStatement gains an op_bindings() method implemented
by both CheckMirCtxt and MatchCtxt.

Op-typed parameter handling: v1 punt — pattern-level type meta-vars
continue to match permissively; only call-target OpRef resolution is
wired end-to-end. The lock_unlock e2e tests (Task 13) will determine
if this is sufficient.

The bindings flow through impl_matched_pat_op_with_bindings and
fn_matched_pat_op_with_bindings (Task 11) into impl_matched /
fn_matched, which construct CheckMirCtxt::new_with_bindings.
Exercises every layer of the abstract-ops feature: parser surface,
ops block lowering, OpRef IR, OpsConfig substitution, cartesian
iteration, and matcher OpRef resolution against std::sync::Mutex.

A single rpl.toml [[ops.sync]] instance binds T -> Mutex<$1>,
U -> MutexGuard<$1>; the pattern's $sync::$lock matches the
m.lock() call site.

Also fixes a gap found during testing: fn_matched_pat_item and
impl_matched_pat_item for RustItems patterns were passing empty
ResolvedOpBindings to CheckMirCtxt instead of iterating over the
cartesian product of (group -> instance) assignments.  The fix
mirrors the existing cartesian-product logic in fn_matched_pat_op,
using referenced_op_groups() to determine which groups to expand.
Add four end-to-end UI tests for abstract-ops properties:

- folded: ops group declared in .rpl with no rpl.toml instances →
  cartesian product of zero combos → zero matches → no diagnostics.

- partial_bad: three [[ops.sync_pb]] entries, the third missing the
  'unlock' binding (fails C3 validation and is silently skipped); the
  first valid instance (Mutex) still fires the lint.

- set_op_with_ops: util patterns p_lock / p_uncovered both reference
  ops; patt = p_lock[] - p_uncovered[]. lock_only() fires, lock_and_mark()
  is subtracted. Demonstrates set-op composition with ops.

- two_groups: two independent op groups (sync_2g × logger_2g); pattern
  uses both in one function body. Cartesian product (2×1=2 combos) fires
  only for the (Mutex, black_box) combination present in the test file.

Bug fix (context): check_and_populate_op_refs previously only processed
patt_block items. util_block items (arena-allocated, shared refs) were
not populated with referenced_op_groups, causing the set-op and two-groups
patterns to produce zero matches. Fixed by also iterating util_block items
using an unsafe cast (sound: bump-arena items, single-threaded construction,
field written only once before any reader observes it).

test.sh: comment out the broken +rpl-dbg invocation; remove -e from
set -euxo pipefail; add || true after invocations expected to exit 1.
…ttern

Demonstrates the abstract-ops feature catching a real-world bug
class.  CVE-2025-68260 is the first Rust CVE in the Linux kernel:
a race condition in Node::release where intrusive list elements
get drained outside the lock that owned them.

The pattern uses two op groups (sync_cve and intrusive_list_cve)
to describe the bug structurally; rpl.toml binds them to
std::sync::Mutex and stub free-function paths.  buggy.rs mirrors
the pre-fix call shape (lock -> transfer -> unlock -> drain) and
matches; fixed.rs omits the explicit unlock sink (guard drops
naturally) so the pattern's $unlock step finds no candidate and
structurally fails to match.

Key implementation notes:
- #[inline(never)] on stub fns prevents MIR-inline from erasing
  the call sites that the pattern matches.
- std::intrinsics::black_box (via core_intrinsics feature) is used
  as the opaque unlock sink; it is not inlined and resolves to a
  stable DefId matching the rpl.toml "unlock" binding.
- Pattern arguments use wildcards (_) for the mutable-reference
  parameters since optimized MIR emits copy operands for &mut locals
  (which are not Copy, so move/copy cross-kind check would reject).

This is the headline e2e validation: one pattern, swappable
primitives, real bug class.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant