-
Notifications
You must be signed in to change notification settings - Fork 218
Open
Description
Context
The GFQL executor codebase currently uses intermediate materializations throughout:
# Example from forward WHERE pruning
left_values = series_values(left_frame[left_col])
right_values = series_values(right_frame[right_col])
common = left_values & right_values
new_left = left_frame[left_frame[left_col].isin(common)]This pattern appears throughout the hop, pruning, and filtering logic. It creates multiple intermediate objects and, for cuDF, triggers multiple kernel launches with potential CPU roundtrips.
Research Questions
-
Expression pushdown: Can we use
.query()or expression APIs to push filtering to the engine?- pandas:
.query(f"{col} in @other_df['{col}']") - cuDF: Similar but may have syntax differences
- pandas:
-
Merge-based filtering: Would merge-based joins be more efficient?
filtered = left_frame.merge( right_frame[[right_col]].drop_duplicates().rename(columns={right_col: left_col}), on=left_col, how='inner' )
-
cuDF-specific optimizations:
- Fused kernels for GPU
- Avoiding CPU roundtrips (e.g.,
series_values()converts to Python set) - Leveraging cuDF's native join/filter primitives
Scope
This applies broadly across the GFQL codebase:
graphistry/compute/gfql/df_executor.py: Forward/backward pruning, materializationgraphistry/compute/gfql/same_path/where_filter.py: WHERE clause filteringgraphistry/compute/gfql/same_path/post_prune.py: Post-prune constraint propagationgraphistry/compute/gfql/same_path/multihop.py: Multi-hop edge filteringgraphistry/compute/gfql/same_path/bfs.py: BFS reachabilitygraphistry/compute/hop.py: Core hop operations
Tradeoffs to Evaluate
| Factor | Intermediate Materialization | Query/Expression | Merge-based |
|---|---|---|---|
| Memory | Creates intermediate sets | May avoid copies | Single join |
| GPU perf | Multiple kernels | Single fused kernel? | Native join |
| CPU perf | Fine for small data | Parsing overhead | Join overhead |
| pandas/cuDF parity | Works identically | Syntax may differ | Identical API |
| Debuggability | Explicit steps | String-based | Implicit |
Acceptance Criteria
- Audit codebase for intermediate materialization patterns
- Benchmark current approach vs alternatives on representative graphs
- Measure GPU memory and kernel launch overhead for cuDF path
- Document recommended approach with rationale
- Implement if significant improvement found (>20% speedup or memory reduction)
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
No labels