46 changes: 46 additions & 0 deletions docs/measures-sql-paper-parity.md
@@ -0,0 +1,46 @@
# Measures in SQL Paper Parity Matrix

Source paper: `arXiv:2406.00251v2` ("Measures in SQL", Jan 10, 2025), local copy at `.context/measures_in_sql.txt`.

## Scope

This matrix tracks parity for the core language semantics described in sections 3-5 of the paper.

- `Covered`: behavior is validated by existing automated tests.
- `Partial`: behavior is exercised, but not with an explicit parity assertion.
- `Gap`: no direct automated validation yet.

## Matrix

| Paper ref | Requirement | Status | Evidence |
|---|---|---|---|
| §3.2, Listing 3 | `AS MEASURE` in `CREATE VIEW`; no `GROUP BY` keeps base row cardinality | Covered | `test/sql/measures.test:20`, `test/sql/measures.test:1409` |
| §3.3 | `AGGREGATE(measure)` expansion/evaluation in grouped queries | Covered | `test/sql/measures.test:30`, `test/sql/measures.test:459` |
| Table 3 (`ALL`) | `AT (ALL)` removes all filters (grand total) | Covered | `test/sql/measures.test:152`, `test/sql/measures.test:444` |
| Table 3 (`ALL dim`) | `AT (ALL dim)` removes one dimension from context | Covered | `test/sql/measures.test:83`, `test/sql/measures.test:292` |
| Table 3 (`ALL dim1 dim2`) | single-clause multi-dimension `ALL` semantics | Covered | `test/sql/measures.test:1443` |
| Table 3 (modifier sequence) | chained modifiers execute right-to-left | Covered | `test/sql/measures.test:233`, `test/sql/measures.test:548` |
| Table 3 (`SET`) | `AT (SET dim = expr)` changes one dimension, correlates on others | Covered | `test/sql/measures.test:189`, `test/sql/measures.test:975` |
| Table 3 (`SET` + lost rows) | `SET` can reach rows removed by outer `WHERE` | Covered | `test/sql/measures.test:962` |
| Table 3 (`CURRENT`) | `CURRENT` resolves from single-valued context and returns `NULL` otherwise | Covered | `test/sql/measures.test:1665`, `test/sql/measures.test:1675` |
| Table 3 (`WHERE`) | `AT (WHERE predicate)` sets evaluation predicate | Covered | `test/sql/measures.test:165`, `test/sql/measures.test:341` |
| Table 3 (`WHERE`) | qualified refs and nested function predicates in `AT (WHERE ...)` | Covered | `test/sql/measures.test:178`, `test/sql/measures.test:1487` |
| Table 3 (`VISIBLE`) | `AT (VISIBLE)` respects current query visibility | Covered | `test/sql/measures.test:218`, `test/sql/measures.test:329` |
| §3.5 (ad hoc dims) | expression dimensions in `ALL`/`SET` (`MONTH(order_date)`, etc.) | Covered | `test/sql/measures.test:818`, `test/sql/measures.test:824`, `test/sql/measures.test:834` |
| Listing 8 | rollup query with `AGGREGATE`, plain measure ref, and `AT (VISIBLE)` | Covered | `test/sql/measures.test:1530` |
| Listing 9 | joins: weighted aggregate vs measure semantics vs `VISIBLE` | Covered | `test/sql/measures.test:1582`, `test/sql/measures.test:1458` |
| Listing 12 (queries 1-4) | correlated subquery, self-join, window, and measure forms return same rows | Covered | `test/sql/measures.test:1614`, `test/sql/measures.test:1624`, `test/sql/measures.test:1637`, `test/sql/measures.test:1652` |
| §5.1 claim | `AT` can access rows excluded by outer `WHERE` (more expressive than `OVER`) | Covered | `test/sql/measures.test:962` |
| §5.4 composability | derived measures referencing measures in same `SELECT` | Covered | `test/sql/measures.test:772`, `test/sql/measures.test:1499` |
| §5.3 wide-table safety direction | joins with measures avoid double counting in tested cases | Partial | `test/sql/measures.test:889`, `test/sql/measures.test:1473` |
| §5.5 security model | measure views preserve SQL security boundaries | Gap | no privilege-based test in suite |
| §3.4 call-site breadth | explicit use in `HAVING` parity path | Covered | `test/sql/measures.test:1548` |

## Current Verdict

- Core semantics used by the paper’s main language examples are covered, including listings `8`, `9`, and `12` (all four forms), `CURRENT`, rollup behavior, and modifier semantics.
- A strict "100% paper parity" claim is still not justified because of the remaining `Gap` item above (§5.5 security model).

## Minimal Remaining Work for a 100% Claim

1. Add a security-behavior test plan (or explicit out-of-scope declaration if privileges are not testable in this harness).
46 changes: 46 additions & 0 deletions src/yardstick_extension.cpp
@@ -453,6 +453,52 @@ BoundStatement yardstick_bind(ClientContext &context, Binder &binder,
            }
            throw BinderException("Registered state not found");
        }

        // Non-yardstick extension statements should not be rewritten by yardstick.
        return {};
    }
    case StatementType::SELECT_STATEMENT: {
        auto sql_to_check = context.GetCurrentQuery();
Comment on lines +460 to +461

P2: Add break after `EXTENSION_STATEMENT` handling

Because the `EXTENSION_STATEMENT` case has no terminating `break`, control now falls through into the new `SELECT_STATEMENT` rewrite path whenever `parse_function != yardstick_parse`. As a result, non-yardstick extension statements are unexpectedly passed through `yardstick_has_aggregate`/`yardstick_expand_aggregate`, and statements containing measure-like SQL text can be rebound as `SELECT * FROM yardstick(...)` instead of being handled by their own extension binder.
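
To make the fallthrough concrete, here is a minimal standalone sketch of the control flow described in this comment. It is not the extension's code: `StmtKind`, `route_with_fallthrough`, and `route_with_return` are hypothetical names invented for illustration, and the returned strings only label which path was taken. The second function shows how an explicit `return` at the end of the extension case (like the `return {};` that precedes `case StatementType::SELECT_STATEMENT` in the hunk) keeps non-yardstick statements out of the rewrite path.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the statement type; not the DuckDB enum.
enum class StmtKind { Extension, Select, Other };

// Problem shape: the Extension case does not terminate when the statement
// is not a yardstick one, so control continues into the Select case.
std::string route_with_fallthrough(StmtKind kind, bool is_yardstick) {
    switch (kind) {
    case StmtKind::Extension:
        if (is_yardstick) {
            return "yardstick-bind";
        }
        // falls through
    case StmtKind::Select:
        return "select-rewrite";
    default:
        return "passthrough";
    }
}

// Fixed shape: the Extension case always terminates, so non-yardstick
// extension statements are left to their own binder.
std::string route_with_return(StmtKind kind, bool is_yardstick) {
    switch (kind) {
    case StmtKind::Extension:
        if (is_yardstick) {
            return "yardstick-bind";
        }
        return "passthrough";
    case StmtKind::Select:
        return "select-rewrite";
    default:
        return "passthrough";
    }
}
```

With these definitions, `route_with_fallthrough(StmtKind::Extension, false)` takes the select-rewrite path, while `route_with_return(StmtKind::Extension, false)` passes the statement through untouched.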


        if (yardstick_has_aggregate(sql_to_check.c_str())) {
            YardstickAggregateResult result = yardstick_expand_aggregate(sql_to_check.c_str());
            if (result.error) {
                string error_msg(result.error);
                yardstick_free_aggregate_result(result);
                throw BinderException("Failed to expand AGGREGATE: %s", error_msg);
            }

            if (result.had_aggregate) {
                string expanded_sql(result.expanded_sql);
                yardstick_free_aggregate_result(result);

                // Escape single quotes for embedding in string literal
                string escaped_sql;
                for (char c : expanded_sql) {
                    if (c == '\'') {
                        escaped_sql += "''";
                    } else {
                        escaped_sql += c;
                    }
                }

                // Rebind through table function so rewritten SQL executes with normal planning
                string wrapper_sql = "SELECT * FROM yardstick('" + escaped_sql + "')";
                Parser parser;
                parser.ParseQuery(wrapper_sql);
                auto statements = std::move(parser.statements);

                if (statements.empty()) {
                    throw BinderException("Table function wrapper produced no statements");
                }

                auto yardstick_binder = Binder::CreateBinder(context);
                return yardstick_binder->Bind(*statements[0]);
            }

            yardstick_free_aggregate_result(result);
        }
        return {};
    }
    default:
        return {};
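
The quote-escaping and wrapping step in the hunk above can be looked at in isolation. The sketch below pulls it into two standalone helpers using only the standard library. The names `escape_single_quotes` and `build_wrapper_sql` are hypothetical and do not appear in the PR; the only assumption carried over from the diff is the wrapper shape `SELECT * FROM yardstick('...')`.

```cpp
#include <cassert>
#include <string>

// Double every single quote so the text can be embedded in a SQL string
// literal (hypothetical helper mirroring the loop in the hunk).
std::string escape_single_quotes(const std::string &sql) {
    std::string escaped;
    escaped.reserve(sql.size());
    for (char c : sql) {
        if (c == '\'') {
            escaped += "''";
        } else {
            escaped += c;
        }
    }
    return escaped;
}

// Build the wrapper query in the same shape the hunk constructs.
std::string build_wrapper_sql(const std::string &expanded_sql) {
    return "SELECT * FROM yardstick('" + escape_single_quotes(expanded_sql) + "')";
}
```

For example, `build_wrapper_sql("SELECT 'x'")` yields `SELECT * FROM yardstick('SELECT ''x''')`. Keeping the escaping in one small function would also make it straightforward to unit-test independently of the binder, though whether that refactor is wanted here is a judgment call for the author.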