Skip to content
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
11 changes: 8 additions & 3 deletions CLAUDE.md
Original file line number Diff line number Diff line change
Expand Up @@ -245,12 +245,17 @@ When LTJ cannot decompose a join (unions, repetitions, any-direction edges), the

### Null semantics

`Value::Null` is a first-class variant. Properties that are absent from a node/edge map are treated as null at query time, and explicit nulls round-trip through the on-disk format.
`Value::Null` is a first-class variant. For a **bound** graph variable, **missing property keys** are read as `Success(Value::Null)` in `AttrLookup`; **missing record keys** in `Expr::FieldAccess` behave the same, and **field access with a null base** (`null.city`) yields **`Success(Null)`** — FPPC-style `ok null` / ISO default-nullable missing-property behavior. Explicit nulls round-trip through the on-disk format.

- **3VL in `cmp_values`** (`runtime/mod.rs`): null on either side yields `false`, so a predicate involving null is dropped from the result. Used by both the LTJ filter loop (`NodeAttrCmp`) and the standard scan (`filter_node`/`filter_edge`).
- **Residual `WHERE` and general expressions** (`engine.rs` `run_expr`, `eval_binop`): SQL/GQL-style **three-valued logic** — e.g. null comparisons yield unknown (success value `Null`), `AND`/`OR`/`NOT` follow SQL truth tables, `WHERE` keeps a binding only when the condition is definite `Bool(true)`. `BinOp::As` passes `Null` through so casts do not turn missing reads back into `Failure`.
- **Pushed-down value predicates** (`cmp_values` in `runtime/mod.rs`, `check_value_preds` in `engine.rs`): null on either side yields **`false`** (not full 3VL). Used by LTJ `NodeAttrCmp` and standard `filter_node`/`filter_edge`; arbitrary residual `WHERE` uses the path above.
- **Aggregate null elimination** (`engine.rs` `collect_aggregate_values`): both `ExprResult::Failure` and `Success(Value::Null)` are dropped before the reducer runs. Empty aggregates emit `Value::Null`.
- **Wire format**: `PropValue::Null` carries tag byte 6 (no payload). Nested nulls inside lists / records survive the round-trip. Top-level nulls are encoded as key absence — the property is omitted from the on-disk record.
- **Surface syntax**: the lexer accepts `null` / `NULL`. The parser emits `Expr::Const(Value::Null)`. The typechecker maps the literal to `SimpleType::Star` so `WHERE x = null` does not collapse the surrounding type derivation. `IS NULL` and `IS NOT NULL` (parsed via `try_is_null` lookahead) produce an `Expr::IsNull { operand, negated }` that returns `Value::Bool` regardless of operand type; missing-attribute and unbound-variable failures are treated as null.
- **Surface syntax**: the lexer accepts `null` / `NULL`. The parser emits `Expr::Const(Value::Null)`. The typechecker maps the literal to `SimpleType::Star` so `WHERE x = null` does not collapse the surrounding type derivation. `IS NULL` / `IS NOT NULL` → `Expr::IsNull`; operand `Failure` is still treated as null for the test (unchecked queries).

**Follow-up (separate workstream):** pushed-down predicates (`cmp_values`, node scans, LTJ `NodeAttrCmp`) treat null comparisons as **false**, while residual `WHERE` uses full **3VL**. That split can change observable row sets for the same logical filter depending on optimization. Next step is to unify semantics or document proveably ISO-safe shortcuts (cf. ISO/IEC 39075:2024 subclause 5.3.2.4 observable effect).

**Not modeled yet:** `<property exists>` as a value predicate (ISO 19.13 — distinct syntax / feature). Descriptor typing covers some “must have shape” cases at match time — see `docs/iso-gql-gaps.md`.

### CSV loader

Expand Down
12 changes: 6 additions & 6 deletions docs/iso-gql-gaps.md
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,7 @@ gqlite ya implementa:
- Operadores: `+`, `-`, comparaciones, `=`, `!=`, `AND`, `OR`, `NOT` (sobre booleanos), `IS`, `AS`, `IN`.
- Storage: formato `.gdb` de una sola página-file, embedded.
- Optimizer: predicate pushdown, label index selection, Leapfrog Triejoin.
- `Value::Null`, lecturas de propiedad ausente como null en `AttrLookup`, y lógica trivalente en expresiones generales de `WHERE` (`run_expr` / `eval_binop`). Los predicados empujados al scan/LTJ siguen usando `cmp_values` (null → false), no la misma 3VL que el `WHERE` residual.

## Tier 1: expresividad que rompe cosas si falta

Expand Down Expand Up @@ -70,15 +71,14 @@ Cambios necesarios:

### 1.4 NULL y lógica trivalente

El `Value` actual no tiene NULL. Atributos faltantes producen `ExprResult::Failure` y la fila se descarta. GQL y SQL usan lógica trivalente (`TRUE`/`FALSE`/`UNKNOWN`).
**Parcialmente cubierto:** hay `Value::Null`, propiedades ausentes en entidades ligadas leen como null (no `Failure`), y `AND`/`OR`/`NOT`/comparaciones aritméticas siguen 3VL tipo SQL en el evaluador de expresiones del `WHERE` residual.

Sin NULL, OPTIONAL MATCH y las comparaciones con datos faltantes quedan inconsistentes.
Siguen fuera (features / alineación ISO futura):

Cambios necesarios:
- Predicado `<property exists>` (ISO 19.13): sintaxis y runtime propios; no es el mismo tubo que `x.p`.
- **Follow-up de corrección:** unificar o justificar pushdown (`cmp_values`) vs 3VL del `WHERE` residual cuando el efecto observable del filtro pueda diferir.

- `Value::Null`.
- Reescribir `eval_binop` para propagar NULL (cualquier op con NULL da NULL, excepto `IS NULL`/`IS NOT NULL`).
- `Value::Null` se trata como FALSE en contexto booleano (WHERE).
OPTIONAL MATCH sigue dependiendo de semántica de variables opcionales además del null en expresiones.

## Tier 2: útiles, hay workaround

Expand Down
6 changes: 3 additions & 3 deletions src/model/value.rs
Original file line number Diff line number Diff line change
Expand Up @@ -2,9 +2,9 @@ use std::fmt;

/// A property value. Scalars (Int/Float/Str/Bool/Null) plus the GQL
/// constructed types List (ordered collection) and Record (named fields, can
/// nest). `Null` is the SQL-style absent value: comparisons with it produce
/// false (predicate-is-null → row drops), aggregators skip it, and it is
/// distinct from any string, including `"NULL"`.
/// nest). `Null` is the SQL-style absent value; in general `WHERE` expressions,
/// comparisons involving null follow SQL three-valued logic. Aggregators skip null,
/// and it is distinct from any string, including `"NULL"`.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
Null,
Expand Down
Loading
Loading