diff --git a/docs/byte-layout-spec.md b/docs/byte-layout-spec.md
new file mode 100644
index 0000000..0fd7791
--- /dev/null
+++ b/docs/byte-layout-spec.md
@@ -0,0 +1,1019 @@
+# Starfix Byte Layout Specification
+
+This document describes the **exact byte-level serialization** used by Starfix to compute deterministic hashes of Apache Arrow schemas and record batches. Every byte fed into SHA-256 is specified here, making it possible to implement a compatible hasher in any language.
+
+All multi-byte integers use **little-endian** byte order unless explicitly stated otherwise.
+
+---
+
+## 1. Output Format
+
+Every Starfix hash is **35 bytes**:
+
+```
+[version: 3 bytes] [SHA-256 digest: 32 bytes]
+```
+
+The version prefix is currently `0x00 0x00 0x01` (version 0.0.1).
+
+When displayed as hex, a hash looks like:
+
+```
+000001 <64 hex chars of SHA-256>
+```
+
+---
+
+## 2. Schema Serialization
+
+### 2.1 Canonical JSON String
+
+The schema is serialized as a **compact JSON string** (no whitespace) of an object where:
+
+- **Keys** are field names, sorted alphabetically (via `BTreeMap`).
+- **Values** are objects with keys `"data_type"` and `"nullable"`, with JSON keys sorted alphabetically within every nested object (recursively).
+
+Because all JSON object keys are sorted recursively, the key order is always `"data_type"` before `"nullable"` (and `"data_type"` before `"name"` before `"nullable"` for struct children).
+
+#### Type Canonicalization
+
+Before serialization, these logical equivalence classes are collapsed:
+
+| Arrow type(s)              | Canonical JSON form           |
+|----------------------------|-------------------------------|
+| `Binary`, `LargeBinary`    | `"LargeBinary"`               |
+| `Utf8`, `LargeUtf8`        | `"LargeUtf8"`                 |
+| `List(f)`, `LargeList(f)`  | `{"LargeList": <element>}`    |
+| `Dictionary(k, v)`         | canonical form of `v`         |
+
+#### Nested Type Serialization
+
+**Struct fields** are serialized as:
+```json
+{"Struct": [<child objects>]}
+```
+Each child object: `{"data_type": ..., "name": "<field name>", "nullable": <bool>}`.
+
+**List / LargeList elements** are serialized as:
+```json
+{"LargeList": {"data_type": ..., "nullable": <bool>}}
+```
+Note: the Arrow-internal field name (typically `"item"`) is **omitted** — only `data_type` and `nullable` are included.
+
+**Primitive types** use Arrow's built-in serde:
+- `"Int32"`, `"Boolean"`, `"Float64"`, `"LargeBinary"`, `"LargeUtf8"`, etc.
+- `{"Decimal128": [38, 5]}`, `{"Time32": "Second"}`, etc.
+
+### 2.2 Schema Digest
+
+```
+schema_digest = SHA-256(canonical_json_string_bytes)
+```
+
+The UTF-8 bytes of the JSON string are fed directly into SHA-256. The result is 32 bytes.
+
+### 2.3 Concrete Example
+
+Schema: `{name: LargeUtf8 nullable, age: Int32 non-nullable}`
+
+Canonical JSON string (compact, keys sorted):
+```
+{"age":{"data_type":"Int32","nullable":false},"name":{"data_type":"LargeUtf8","nullable":true}}
+```
+
+Note: `"age"` comes before `"name"` alphabetically, and `"data_type"` comes before `"nullable"`.
+
+```
+schema_digest = SHA-256(b'{"age":{"data_type":"Int32","nullable":false},"name":{"data_type":"LargeUtf8","nullable":true}}')
+```
+
+---
+
+## 3. Field Data Serialization
+
+Each leaf field in the schema is hashed independently into its own SHA-256 digest. Struct fields are flattened: a struct field `address` with children `city` and `zip` becomes two leaf fields `address/city` and `address/zip`.
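+
+As a sketch of this flattening (a hypothetical standalone helper, not the crate's internal API), leaf paths can be collected recursively into an alphabetically ordered set:
+
+```rust
+use std::collections::BTreeSet;
+
+use arrow_schema::{DataType, Field, Fields};
+
+/// Collect leaf-field paths, flattening structs with a `/` delimiter.
+/// A `BTreeSet` keeps the paths in the alphabetical order used later
+/// when field digests are combined.
+fn leaf_paths(parent: Option<&str>, field: &Field, out: &mut BTreeSet<String>) {
+    let path = match parent {
+        Some(p) => format!("{p}/{}", field.name()),
+        None => field.name().to_string(),
+    };
+    if let DataType::Struct(children) = field.data_type() {
+        for child in children {
+            leaf_paths(Some(&path), child, out);
+        }
+    } else {
+        out.insert(path);
+    }
+}
+
+fn main() {
+    let address = Field::new(
+        "address",
+        DataType::Struct(Fields::from(vec![
+            Field::new("zip", DataType::Utf8, false),
+            Field::new("city", DataType::Utf8, false),
+        ])),
+        false,
+    );
+    let mut paths = BTreeSet::new();
+    leaf_paths(None, &address, &mut paths);
+    assert_eq!(
+        paths.into_iter().collect::<Vec<_>>(),
+        ["address/city", "address/zip"]
+    );
+}
+```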
+ +Each leaf field has a **digest buffer** containing up to three components: + +| Component | Present when | Purpose | +|-----------|-------------|---------| +| `null_bits` (BitVec) | field is nullable | Tracks which elements are valid vs null | +| `structural` (SHA-256) | field is a list type (`List` or `LargeList`) | Accumulates element counts (structure) | +| `data` (SHA-256) | always | Accumulates leaf data bytes | + +A field is nullable if the Arrow field's `nullable` flag is `true`. A field is "structured" if its (canonical) data type is `List` or `LargeList`. + +This separation of structural information from leaf data ensures that list element boundaries are hashed independently from the values they contain. For example, `[[1,2],[3]]` and `[[1],[2,3]]` differ in their structural digest (element counts `[2,1]` vs `[1,2]`) even though their leaf data digest is identical (`[1,2,3]`). + +### 3.1 Fixed-Size Types + +**Types**: `Int8`, `UInt8`, `Int16`, `UInt16`, `Int32`, `UInt32`, `Int64`, `UInt64`, `Float16`, `Float32`, `Float64`, `Date32`, `Date64`, `Time32(*)`, `Time64(*)`, `Decimal32`, `Decimal64`, `Decimal128`, `Decimal256`, `FixedSizeBinary(n)`. + +| Type | Bytes per element | +|------|-------------------| +| Int8 / UInt8 | 1 | +| Int16 / UInt16 / Float16 | 2 | +| Int32 / UInt32 / Float32 / Date32 / Decimal32 / Time32 | 4 | +| Int64 / UInt64 / Float64 / Date64 / Decimal64 / Time64 | 8 | +| Decimal128 | 16 | +| Decimal256 | 32 | +| FixedSizeBinary(n) | n | + +**Non-nullable path**: The entire contiguous byte buffer (all elements concatenated, little-endian) is fed into the data digest in a single update. + +**Nullable path**: +1. For each element `i`, push `is_valid(i)` (true=1, false=0) into the validity `BitVec`. +2. For each **valid** element, feed its little-endian bytes into the data digest. +3. **Null elements are skipped entirely** — no data bytes are fed. + +If a nullable field has no actual nulls (null buffer absent), all elements are marked valid and the entire buffer is fed in one update (same as non-nullable data path). + +### 3.2 Boolean Type + +Boolean values are **bit-packed** using **MSB-first** (`Msb0`) ordering into bytes. + +**Non-nullable**: All values are packed sequentially into a `BitVec`, then the raw bytes are fed into the data digest. + +**Nullable**: +1. Extend the validity `BitVec` as usual. +2. Only **valid** values are packed (nulls are skipped). +3. The packed bytes are fed into the data digest. + +**Example**: `[true, NULL, false, true]` (nullable, 4 elements) +- Validity bits: `[1, 0, 1, 1]` +- Data bits (valid only): `[true, false, true]` → Msb0 packed: `1_0_1_00000` = `0xA0` +- Bytes fed to data digest: `[0xA0]` + +### 3.3 Variable-Length Types (Binary, String) + +**Types**: `Binary`, `LargeBinary`, `Utf8`, `LargeUtf8`. + +Each element is serialized as: +``` +[length as u64 little-endian: 8 bytes] [raw bytes: length bytes] +``` + +The length prefix is **always `u64`** (8 bytes, little-endian) regardless of the Arrow offset type. + +**Non-nullable**: For each element, feed `(len as u64).to_le_bytes()` then the raw bytes. + +**Nullable**: +1. Extend the validity `BitVec`. +2. For valid elements: feed length prefix + raw bytes. +3. For null elements: **skip entirely** — no bytes fed to data digest. + +### 3.4 List Types + +**Types**: `List(field)`, `LargeList(field)`. + +List types use **structural hashing**: element counts are written to a separate `structural` SHA-256 digest, while leaf data from sub-arrays flows into the `data` digest. 
This separation prevents collisions between differently-grouped lists (e.g., `[[1,2],[3]]` vs `[[1],[2,3]]`).
+
+For each valid list element (a sub-array):
+
+1. **Structural digest** receives: `[sub-array element count as u64 little-endian: 8 bytes]`
+2. **Data digest** receives: recursive serialization of the sub-array's leaf values
+
+**Nullable**: Extend validity `BitVec`; skip null list entries entirely (no bytes to either digest).
+
+Sub-array elements are hashed recursively using the same rules. If a list contains nested lists (e.g., `List<List<Int32>>`), each nesting level writes its element counts to the same structural digest, and only the innermost leaf values reach the data digest.
+
+#### Concrete Example: Structural vs Leaf Separation
+
+For `LargeList<Int32>` with data `[[1,2],[3]]`:
+
+```
+structural digest receives:
+  02 00 00 00 00 00 00 00   (element 0: 2 items, u64 LE)
+  01 00 00 00 00 00 00 00   (element 1: 1 item, u64 LE)
+
+data digest receives:
+  01 00 00 00   (1 as i32 LE)
+  02 00 00 00   (2 as i32 LE)
+  03 00 00 00   (3 as i32 LE)
+```
+
+Compare with `[[1],[2,3]]`:
+
+```
+structural digest receives:
+  01 00 00 00 00 00 00 00   (element 0: 1 item)
+  02 00 00 00 00 00 00 00   (element 1: 2 items)
+
+data digest receives:
+  01 00 00 00   (same leaf bytes)
+  02 00 00 00
+  03 00 00 00
+```
+
+The data digests are identical, but the structural digests differ — so the final hashes differ.
+
+### 3.5 Struct Types
+
+Struct fields are handled differently depending on context:
+
+#### Record-Batch Path (field decomposition)
+
+In the record-batch path (`hash_record_batch`, streaming `update`/`finalize`), struct fields are **decomposed into leaf fields**. Each leaf field within the struct is extracted and hashed independently under its own path key (e.g., `address/city`, `address/zip`). These paths live in a `BTreeMap`, so they are always processed in alphabetical order. The struct itself does not appear as a separate entry.
+
+#### Composite Path (`hash_array`, list sub-arrays)
+
+When a struct appears as a standalone array (`hash_array`) or as a sub-array within a list, it is hashed **compositely**:
+
+1. **Struct-level nulls**: If the parent digest buffer is nullable, push struct-level validity into the parent's `BitVec` (same as all other types via `handle_null_bits`).
+
+2. **Children sorted alphabetically** by field name.
+
+3. **For each child** (in sorted order):
+   - Create a fresh digest buffer for the child. The child is **effectively nullable** if either the child field is nullable OR the struct has null rows. The child gets a **structural digest** if it is a list type.
+   - If the struct has null rows, **propagate struct nulls** to the child: `combined_valid(i) = struct_valid(i) AND child_valid(i)`. This ensures undefined data at null struct positions is never hashed.
+   - Hash the child recursively via `array_digest_update`.
+ - **Finalize the child digest** and write the resulting bytes into the parent's data stream (in the order: null_bits, structural, data): + - Non-nullable, non-list child: `SHA-256(child_data).finalize()` (32 bytes) + - Nullable, non-list child: `bit_count LE (8B) || validity_words BE (8B each) || SHA-256(child_data).finalize() (32B)` + - Non-nullable list child: `SHA-256(child_structural).finalize() (32B) || SHA-256(child_data).finalize() (32B)` + - Nullable list child: `bit_count LE (8B) || validity_words BE (8B each) || SHA-256(child_structural).finalize() (32B) || SHA-256(child_data).finalize() (32B)` + +The parent's data stream thus contains the concatenation of all children's finalized bytes (in alphabetical order). + +### 3.6 Dictionary-Encoded Arrays + +Dictionary arrays are **resolved to their plain equivalent** before hashing. The dictionary is unpacked so that the data stream is identical to a non-dictionary array with the same logical values. + +--- + +## 4. Field Digest Finalization + +After all record batches have been fed, each field's digest buffer is finalized and fed into the **final combining digest**. The three components are written in this fixed order: + +``` +1. null_bits (if present — nullable fields only) +2. structural (if present — list fields only) +3. data (always present) +``` + +### 4.1 Non-Nullable, Non-List Field + +``` +final_digest.update( SHA-256(data_bytes).finalize() ) // 32 bytes +``` + +Only the data digest is finalized (32 bytes). + +### 4.2 Nullable, Non-List Field + +``` +final_digest.update( bit_count.to_le_bytes() ) // 8 bytes (usize LE = u64 LE on 64-bit) +for each word in validity_bitvec.as_raw_slice(): // each word is usize (8 bytes on 64-bit) + final_digest.update( word.to_be_bytes() ) // 8 bytes big-endian per word +final_digest.update( SHA-256(data_bytes).finalize() ) // 32 bytes +``` + +### 4.3 Non-Nullable List Field + +``` +final_digest.update( SHA-256(structural_bytes).finalize() ) // 32 bytes (element counts) +final_digest.update( SHA-256(data_bytes).finalize() ) // 32 bytes (leaf values) +``` + +### 4.4 Nullable List Field + +``` +final_digest.update( bit_count.to_le_bytes() ) // 8 bytes +for each word in validity_bitvec.as_raw_slice(): + final_digest.update( word.to_be_bytes() ) // 8 bytes per word +final_digest.update( SHA-256(structural_bytes).finalize() ) // 32 bytes (element counts) +final_digest.update( SHA-256(data_bytes).finalize() ) // 32 bytes (leaf values) +``` + +**Validity BitVec details** (applies to all nullable variants): +- Storage type: `usize` (8 bytes on 64-bit platforms). +- Bit order: `Lsb0` (least significant bit first within each word). +- `bit_count` = total number of elements (valid + null), serialized as `usize` little-endian. +- Each storage word is serialized as `usize` big-endian. +- The last word may have unused high bits (zero-padded). + +--- + +## 5. Final Combining Digest + +The final hash is computed by feeding into a fresh SHA-256: + +``` +final_digest = SHA-256() + +// 1. Schema digest (32 bytes) +final_digest.update( schema_digest ) + +// 2. Field digests in alphabetical order of field path +for field_path in sorted(field_paths): + finalize field's DigestBufferType into final_digest (see Section 4) + +raw_hash = final_digest.finalize() // 32 bytes +output = [0x00, 0x00, 0x01] ++ raw_hash // 35 bytes +``` + +--- + +## 6. `hash_array` API + +The `hash_array` function hashes a single array (without a schema context). 
It works slightly differently from the record-batch path:
+
+```
+final_digest = SHA-256()
+
+// 1. Type metadata (canonical JSON string)
+canonical_type = data_type_to_value(effective_data_type)
+json_string = JSON.serialize(canonical_type)   // compact, keys sorted
+final_digest.update( json_string.as_bytes() )
+
+// 2. Data (with structural separation for list types)
+digest_buffer = {
+  null_bits:  BitVec if nullable, else absent
+  structural: SHA-256() if list type, else absent
+  data:       SHA-256()
+}
+array_digest_update(effective_data_type, effective_array, digest_buffer)
+finalize digest_buffer into final_digest (see Section 4)
+
+raw_hash = final_digest.finalize()        // 32 bytes
+output = [0x00, 0x00, 0x01] ++ raw_hash   // 35 bytes
+```
+
+Dictionary arrays are resolved to their value type before hashing.
+
+---
+
+## 7. Worked Examples
+
+### Example A: Simple Two-Column Table
+
+**Schema**: `{age: Int32 non-nullable, name: LargeUtf8 nullable}`
+
+**Data** (1 record batch, 2 rows):
+
+| age | name    |
+|-----|---------|
+| 25  | "Alice" |
+| 30  | NULL    |
+
+#### Step 1: Schema Digest
+
+Canonical JSON (compact):
+```
+{"age":{"data_type":"Int32","nullable":false},"name":{"data_type":"LargeUtf8","nullable":true}}
+```
+
+```
+schema_digest = SHA-256(b'{"age":{"data_type":"Int32","nullable":false},"name":{"data_type":"LargeUtf8","nullable":true}}')
+```
+
+#### Step 2: Field "age" (Int32, non-nullable)
+
+Values: `[25, 30]`
+
+Little-endian bytes:
+- 25 as i32 LE: `19 00 00 00`
+- 30 as i32 LE: `1e 00 00 00`
+
+Data fed to digest: `19 00 00 00 1e 00 00 00` (8 bytes, one contiguous slice)
+
+```
+age_data_digest = SHA-256(0x19000000_1e000000)
+```
+
+Finalization into final_digest (non-nullable):
+```
+final_digest.update( age_data_digest.finalize() )   // 32 bytes
+```
+
+#### Step 3: Field "name" (LargeUtf8, nullable)
+
+Values: `["Alice", NULL]`
+
+**Validity bits** (Lsb0 in usize words):
+- Element 0 ("Alice"): valid → bit = 1
+- Element 1 (NULL): null → bit = 0
+- BitVec contents: bits `[1, 0]`, bit_count = 2
+- As usize (Lsb0): bit 0 = 1, bit 1 = 0 → binary `...0000_0001` = 1
+- `as_raw_slice()` = `[1_usize]`
+
+Validity serialization:
+```
+bit_count LE: 02 00 00 00 00 00 00 00   (2 as usize little-endian)
+word 0 BE:    00 00 00 00 00 00 00 01   (1 as usize big-endian)
+```
+
+**Data bytes** (only valid elements):
+- "Alice": length 5 as u64 LE = `05 00 00 00 00 00 00 00`, then UTF-8 bytes `41 6c 69 63 65`
+- NULL: skipped entirely
+
+```
+name_data_digest = SHA-256(0x0500000000000000_416c696365)
+```
+
+Finalization into final_digest (nullable):
+```
+final_digest.update( 0x0200000000000000 )            // bit count
+final_digest.update( 0x0000000000000001 )            // word 0 BE
+final_digest.update( name_data_digest.finalize() )   // 32 bytes
+```
+
+#### Step 4: Final Combination
+
+Fields in alphabetical order: `age`, then `name`.
+
+```
+final_digest = SHA-256()
+final_digest.update( schema_digest )                 // 32 bytes
+final_digest.update( age_data_digest.finalize() )    // 32 bytes (non-nullable)
+final_digest.update( 0x0200000000000000 )            // name bit count
+final_digest.update( 0x0000000000000001 )            // name validity word
+final_digest.update( name_data_digest.finalize() )   // 32 bytes
+raw_hash = final_digest.finalize()
+output = 0x000001 ++ raw_hash
+```
+
+---
+
+### Example B: Boolean Array with Nulls (hash_array API)
+
+**Array**: `BooleanArray [true, NULL, false, true]` (nullable)
+
+#### Step 1: Type Metadata
+
+Canonical type JSON: `"Boolean"` (9 bytes as UTF-8, including the surrounding quotes)
+
+```
+final_digest.update(b'"Boolean"')
+```
+
+Note: `serde_json::to_string` of a JSON string value includes the surrounding quotes.
+
+#### Step 2: Data
+
+**Validity bits** (Lsb0 in usize):
+- `[1, 0, 1, 1]` → bits: b0=1, b1=0, b2=1, b3=1
+- As usize (Lsb0): binary `...0000_1101` = 13
+- `as_raw_slice()` = `[13_usize]`
+
+**Data bits** (Msb0 packed, valid values only):
+- Valid values: `[true, false, true]` (3 values)
+- Msb0 packing: bit7=true(1), bit6=false(0), bit5=true(1), bits4-0=0
+- Byte: `10100000` = `0xA0`
+
+```
+data_digest = SHA-256(0xA0)
+```
+
+#### Step 3: Finalization
+
+```
+final_digest = SHA-256()
+final_digest.update(b'"Boolean"')               // type metadata
+final_digest.update( 0x0400000000000000 )       // 4 bits (bit count LE)
+final_digest.update( 0x000000000000000D )       // 13 as usize BE
+final_digest.update( data_digest.finalize() )   // 32 bytes
+raw_hash = final_digest.finalize()
+output = 0x000001 ++ raw_hash
+```
+
+---
+
+### Example C: Non-Nullable Int32 Array (hash_array API)
+
+**Array**: `Int32Array [1, 2, 3]` (non-nullable)
+
+#### Step 1: Type Metadata
+
+`data_type_to_value` for Int32 produces the JSON value `"Int32"` (a JSON string). Serializing that value with `serde_json::to_string` keeps the surrounding quotes, so the type metadata is the 7-byte sequence `22 49 6e 74 33 32 22` (`"Int32"`).
+
+```
+final_digest.update(b'"Int32"')   // 7 bytes: 22 49 6e 74 33 32 22
+```
+
+#### Step 2: Data
+
+Values as i32 LE bytes:
+- 1: `01 00 00 00`
+- 2: `02 00 00 00`
+- 3: `03 00 00 00`
+
+Entire buffer fed as one slice: `01 00 00 00 02 00 00 00 03 00 00 00` (12 bytes)
+
+```
+data_digest = SHA-256(0x010000000200000003000000)
+```
+
+#### Step 3: Finalization (non-nullable)
+
+```
+final_digest = SHA-256()
+final_digest.update(b'"Int32"')                 // 7 bytes
+final_digest.update( data_digest.finalize() )   // 32 bytes
+raw_hash = final_digest.finalize()
+output = 0x000001 ++ raw_hash
+```
+
+---
+
+### Example D: Binary Array (hash_array API)
+
+**Array**: `BinaryArray [b"hi", b""]` (non-nullable)
+
+#### Step 1: Type Metadata
+
+`Binary` is canonicalized to `LargeBinary`.
+ +``` +final_digest.update(b'"LargeBinary"') // 13 bytes +``` + +#### Step 2: Data + +Each element: `[u64 LE length] [raw bytes]` + +- `b"hi"`: length 2 → `02 00 00 00 00 00 00 00` + `68 69` +- `b""`: length 0 → `00 00 00 00 00 00 00 00` (no raw bytes) + +``` +data_digest = SHA-256(0x0200000000000000_6869_0000000000000000) +``` + +#### Step 3: Finalization (non-nullable) + +``` +final_digest = SHA-256() +final_digest.update(b'"LargeBinary"') +final_digest.update( data_digest.finalize() ) +raw_hash = final_digest.finalize() +output = 0x000001 ++ raw_hash +``` + +--- + +### Example E: Column-Order Independence + +Two record batches with the same logical data but different column orders must produce identical hashes. + +**Batch 1** (columns: x, y): +``` +Schema: {x: Int32 non-nullable, y: Boolean nullable} +x: [10] +y: [true] +``` + +**Batch 2** (columns: y, x): +``` +Schema: {y: Boolean nullable, x: Int32 non-nullable} +y: [true] +x: [10] +``` + +Both produce the same canonical schema JSON: +``` +{"x":{"data_type":"Int32","nullable":false},"y":{"data_type":"Boolean","nullable":true}} +``` + +Both produce the same field digests (fields processed alphabetically: `x` then `y`): +- Field `x`: `SHA-256(0x0a000000)` (10 as i32 LE) +- Field `y`: validity `[1]` (1 bit, 1 word), data `0x80` (true packed Msb0) + +Therefore `hash_record_batch(batch1) == hash_record_batch(batch2)`. + +--- + +### Example F: Type Equivalence (Utf8 vs LargeUtf8) + +**Array 1**: `StringArray ["ab"]` (non-nullable, Arrow type `Utf8`) +**Array 2**: `LargeStringArray ["ab"]` (non-nullable, Arrow type `LargeUtf8`) + +Both produce the same type metadata: `"LargeUtf8"` (after canonicalization). + +Both produce the same data bytes: +``` +02 00 00 00 00 00 00 00 (length 2 as u64 LE) +61 62 ("ab" as UTF-8) +``` + +Therefore `hash_array(array1) == hash_array(array2)`. + +--- + +### Example G: Nullable Int32 Array with Nulls (hash_array API) + +**Array**: `Int32Array [Some(42), None, Some(-7), Some(0)]` (nullable) + +#### Step 1: Type Metadata + +``` +final_digest.update(b'"Int32"') // 7 bytes +``` + +#### Step 2: Data + +**Validity bits** (Lsb0 in usize): +- `[1, 0, 1, 1]` → bits: b0=1, b1=0, b2=1, b3=1 +- As usize (Lsb0): binary `...0000_1101` = 13 +- bit_count = 4 + +**Data bytes** (only valid elements): +- 42 as i32 LE: `2a 00 00 00` +- -7 as i32 LE: `f9 ff ff ff` +- 0 as i32 LE: `00 00 00 00` + +``` +data_digest = SHA-256(0x2a000000_f9ffffff_00000000) +``` + +#### Step 3: Finalization (nullable) + +``` +final_digest = SHA-256() +final_digest.update(b'"Int32"') // type metadata +final_digest.update( 0x0400000000000000 ) // 4 bits (bit count LE) +final_digest.update( 0x000000000000000D ) // 13 as usize BE +final_digest.update( data_digest.finalize() ) // 32 bytes +raw_hash = final_digest.finalize() +output = 0x000001 ++ raw_hash +``` + +--- + +### Example H: Nullable String Array with Nulls (hash_array API) + +**Array**: `StringArray [Some("hello"), None, Some("world"), Some("")]` (nullable, Arrow type `Utf8`) + +#### Step 1: Type Metadata + +`Utf8` is canonicalized to `LargeUtf8`. 
+
+```
+final_digest.update(b'"LargeUtf8"')   // 11 bytes
+```
+
+#### Step 2: Data
+
+**Validity bits** (Lsb0 in usize):
+- `[1, 0, 1, 1]` → 0b1101 = 13
+- bit_count = 4
+
+**Data bytes** (only valid elements, null skipped entirely):
+- `"hello"`: `05 00 00 00 00 00 00 00` (len=5 as u64 LE) + `68 65 6c 6c 6f`
+- `"world"`: `05 00 00 00 00 00 00 00` (len=5 as u64 LE) + `77 6f 72 6c 64`
+- `""`: `00 00 00 00 00 00 00 00` (len=0 as u64 LE, no raw bytes)
+
+```
+data_digest = SHA-256(len+"hello" + len+"world" + len+"")
+```
+
+#### Step 3: Finalization (nullable)
+
+```
+final_digest = SHA-256()
+final_digest.update(b'"LargeUtf8"')
+final_digest.update( 0x0400000000000000 )       // bit_count=4 LE
+final_digest.update( 0x000000000000000D )       // validity=13 BE
+final_digest.update( data_digest.finalize() )   // 32 bytes
+raw_hash = final_digest.finalize()
+output = 0x000001 ++ raw_hash
+```
+
+---
+
+### Example I: Empty Table (no data, schema only)
+
+**Schema**: `{a: Int32 non-nullable, b: Boolean nullable}`
+
+When no record batches are fed (i.e., `finalize()` is called immediately after construction), the field digests still exist — they just contain no data.
+
+#### Schema Digest
+
+```
+schema_json = '{"a":{"data_type":"Int32","nullable":false},"b":{"data_type":"Boolean","nullable":true}}'
+schema_digest = SHA-256(schema_json)
+```
+
+#### Field "a" (Int32, non-nullable)
+
+No data was fed, so:
+```
+a_data_digest = SHA-256("")   // SHA-256 of empty input
+```
+
+#### Field "b" (Boolean, nullable)
+
+No data was fed:
+- `bit_count` = 0 (no elements, BitVec is empty)
+- `as_raw_slice()` = `[]` (no words)
+- Data digest = SHA-256 of empty input
+
+#### Final Combination
+
+```
+final_digest = SHA-256()
+final_digest.update( schema_digest )            // 32 bytes
+final_digest.update( SHA-256("").finalize() )   // field "a" (non-nullable, 32 bytes)
+final_digest.update( 0x0000000000000000 )       // field "b" bit_count=0 LE
+// no validity words (raw_slice is empty for 0-length BitVec)
+final_digest.update( SHA-256("").finalize() )   // field "b" data (32 bytes)
+output = 0x000001 ++ final_digest.finalize()
+```
+
+---
+
+### Example J: Multi-Batch Streaming (batch-split independence)
+
+**Schema**: `{v: Int32 non-nullable}`
+
+Feeding two batches must produce the same hash as feeding one combined batch:
+
+- **Batch 1**: `v = [1, 2]`
+- **Batch 2**: `v = [3]`
+- **Combined**: `v = [1, 2, 3]`
+
+Because the internal SHA-256 state is incremental:
+```
+update(01 00 00 00 02 00 00 00)   // from batch 1
+update(03 00 00 00)               // from batch 2
+```
+is identical to:
+```
+update(01 00 00 00 02 00 00 00 03 00 00 00)   // single combined batch
+```
+
+#### Manual Computation
+
+```
+schema_json = '{"v":{"data_type":"Int32","nullable":false}}'
+schema_digest = SHA-256(schema_json)
+
+v_data_digest = SHA-256(0x010000000200000003000000)
+
+final_digest = SHA-256()
+final_digest.update( schema_digest )
+final_digest.update( v_data_digest.finalize() )
+output = 0x000001 ++ final_digest.finalize()
+```
+
+Therefore `hash(batch1 + batch2) == hash(combined)`.
+
+---
+
+### Example K: Struct Column in a Record Batch
+
+**Schema**: `{person: Struct<age: Int32, name: LargeUtf8> non-nullable}`
+
+**Data** (2 rows):
+
+| person.age | person.name |
+|------------|-------------|
+| 25         | "Alice"     |
+| 30         | "Bob"       |
+
+In the record-batch path, the struct is **decomposed into leaf fields**: `person/age` and `person/name`. Each is hashed independently.
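+
+For reference, a minimal sketch of building this batch with the `arrow` crate (the builder calls are illustrative; `Utf8` canonicalizes to `LargeUtf8` during hashing):
+
+```rust
+use std::sync::Arc;
+
+use arrow::array::{ArrayRef, Int32Array, RecordBatch, StringArray, StructArray};
+use arrow_schema::{DataType, Field, Fields, Schema};
+
+fn main() {
+    let age_field = Arc::new(Field::new("age", DataType::Int32, false));
+    let name_field = Arc::new(Field::new("name", DataType::Utf8, false));
+
+    // A StructArray pairs each child field with its column of values.
+    let person = StructArray::from(vec![
+        (
+            age_field.clone(),
+            Arc::new(Int32Array::from(vec![25, 30])) as ArrayRef,
+        ),
+        (
+            name_field.clone(),
+            Arc::new(StringArray::from(vec!["Alice", "Bob"])) as ArrayRef,
+        ),
+    ]);
+
+    let schema = Arc::new(Schema::new(vec![Field::new(
+        "person",
+        DataType::Struct(Fields::from(vec![age_field, name_field])),
+        false,
+    )]));
+    let batch = RecordBatch::try_new(schema, vec![Arc::new(person)]).unwrap();
+    assert_eq!(batch.num_rows(), 2);
+}
+```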
+ +#### Step 1: Schema Digest + +Canonical JSON: +``` +{"person":{"data_type":{"Struct":[{"data_type":"Int32","name":"age","nullable":false},{"data_type":"LargeUtf8","name":"name","nullable":false}]},"nullable":false}} +``` + +#### Step 2: Leaf field "person/age" (Int32, non-nullable) + +``` +age_data_digest = SHA-256(0x19000000_1e000000) // [25, 30] as i32 LE +``` + +#### Step 3: Leaf field "person/name" (LargeUtf8, non-nullable) + +``` +name_data_digest = SHA-256( + 0x0500000000000000 "Alice" // len=5 u64 LE + UTF-8 + 0x0300000000000000 "Bob" // len=3 u64 LE + UTF-8 +) +``` + +#### Step 4: Final Combination + +Fields alphabetically: `person/age`, `person/name`. + +``` +final_digest = SHA-256() +final_digest.update( schema_digest ) // 32 bytes +final_digest.update( age_data_digest.finalize() ) // 32 bytes (non-nullable) +final_digest.update( name_data_digest.finalize() ) // 32 bytes (non-nullable) +output = 0x000001 ++ final_digest.finalize() +``` + +--- + +### Example L: Struct Array via hash_array (non-nullable) + +**Array**: `StructArray [{a: 1, b: true}, {a: 2, b: false}]` + +Children: `a: Int32 non-null`, `b: Boolean non-null`. Struct is non-nullable. + +#### Step 1: Type Metadata + +Canonical type JSON (struct fields sorted alphabetically, keys sorted): +``` +{"Struct":[{"data_type":"Int32","name":"a","nullable":false},{"data_type":"Boolean","name":"b","nullable":false}]} +``` + +#### Step 2: Composite Data + +Children sorted by name: `a`, then `b`. + +**Child "a"** (Int32, non-nullable): +``` +child_a_data_digest = SHA-256(0x01000000_02000000) // [1, 2] as i32 LE +child_a_finalized = child_a_data_digest.finalize() // 32 bytes (non-nullable) +``` + +**Child "b"** (Boolean, non-nullable): +``` +// [true, false] → Msb0: bit7=1, bit6=0 → 0x80 +child_b_data_digest = SHA-256(0x80) +child_b_finalized = child_b_data_digest.finalize() // 32 bytes +``` + +**Parent data stream**: `child_a_finalized || child_b_finalized` + +``` +parent_data_digest = SHA-256( child_a_finalized || child_b_finalized ) +``` + +#### Step 3: Finalization (non-nullable) + +``` +final_digest = SHA-256() +final_digest.update( type_json_bytes ) // type metadata +final_digest.update( parent_data_digest.finalize() ) // 32 bytes +output = 0x000001 ++ final_digest.finalize() +``` + +--- + +### Example M: Nullable Struct Array via hash_array (struct-level nulls) + +**Array**: `StructArray [Some({a: 10, b: "x"}), None, Some({a: 30, b: "z"})]` + +Children: `a: Int32 non-null`, `b: LargeUtf8 non-null`. Struct is **nullable**. + +Row 1 is a null struct — children's data at row 1 is undefined and must be skipped. + +#### Step 1: Type Metadata + +Same struct type JSON as above (with appropriate fields): +``` +{"Struct":[{"data_type":"Int32","name":"a","nullable":false},{"data_type":"LargeUtf8","name":"b","nullable":false}]} +``` + +#### Step 2: Struct-Level Validity + +Struct validity: `[valid, null, valid]` → bits `[1, 0, 1]` +- bit_count = 3 +- usize word (Lsb0): `0b101` = 5 + +This goes into the parent's BitVec (the top-level digest for `hash_array`). 
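+
+A quick sketch checking the validity-word arithmetic with the `bitvec` crate (the same storage type and bit order the hasher uses):
+
+```rust
+use bitvec::prelude::*;
+
+fn main() {
+    // Struct validity [valid, null, valid] → bits [1, 0, 1],
+    // stored Lsb0 in usize words.
+    let mut v: BitVec<usize, Lsb0> = BitVec::new();
+    v.extend([true, false, true]);
+    assert_eq!(v.len(), 3);                      // bit_count = 3, serialized LE
+    assert_eq!(v.as_raw_slice(), &[0b101usize]); // word = 5, serialized BE
+}
+```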
+
+#### Step 3: Composite Data (children with struct-null propagation)
+
+**Child "a"** (Int32, effectively nullable due to struct nulls):
+- Combined validity: struct AND child = `[1, 0, 1]` (child has no nulls)
+- Valid data: `[10, 30]` (row 1 skipped)
+- bit_count = 3, validity_word = 5
+
+```
+child_a_data_digest = SHA-256(0x0a000000_1e000000)   // [10, 30] as i32 LE
+child_a_finalized = 0x0300000000000000               // bit_count=3 LE
+                 || 0x0000000000000005               // validity word=5 BE
+                 || child_a_data_digest.finalize()   // 32 bytes
+```
+
+**Child "b"** (LargeUtf8, effectively nullable):
+- Combined validity: `[1, 0, 1]`
+- Valid data: `"x"`, `"z"` (row 1 skipped)
+
+```
+child_b_data_digest = SHA-256(
+    0x0100000000000000 "x"   // len=1 + "x"
+    0x0100000000000000 "z"   // len=1 + "z"
+)
+child_b_finalized = 0x0300000000000000               // bit_count=3 LE
+                 || 0x0000000000000005               // validity word=5 BE
+                 || child_b_data_digest.finalize()   // 32 bytes
+```
+
+**Parent data stream**: `child_a_finalized || child_b_finalized`
+
+```
+parent_data_digest = SHA-256( child_a_finalized || child_b_finalized )
+```
+
+#### Step 4: Finalization (nullable)
+
+```
+final_digest = SHA-256()
+final_digest.update( type_json_bytes )                 // type metadata
+final_digest.update( 0x0300000000000000 )              // struct bit_count=3 LE
+final_digest.update( 0x0000000000000005 )              // struct validity word=5 BE
+final_digest.update( parent_data_digest.finalize() )   // 32 bytes
+output = 0x000001 ++ final_digest.finalize()
+```
+
+---
+
+### Example N: List-of-Struct in a Record Batch
+
+**Schema**: `{items: LargeList<Struct<id: Int32, label: LargeUtf8>> nullable}`
+
+**Data** (2 rows):
+
+| items |
+|-------|
+| `[{id: 1, label: "a"}, {id: 2, label: "b"}]` |
+| `[{id: 3, label: "c"}]` |
+
+The list column is a single field "items" in the BTreeMap. Its sub-arrays are struct arrays, hashed compositely via `array_digest_update(Struct)`.
+
+#### Step 1: Schema Digest
+
+Canonical JSON (element type omits Arrow-internal field name "item"):
+```
+{"items":{"data_type":{"LargeList":{"data_type":{"Struct":[{"data_type":"Int32","name":"id","nullable":false},{"data_type":"LargeUtf8","name":"label","nullable":false}]},"nullable":false}},"nullable":true}}
+```
+
+#### Step 2: Field "items" (nullable list — has null_bits, structural, and data)
+
+**Validity BitVec** (`null_bits`) — accumulates null bits from the list **and** all recursive sub-arrays that share this digest:
+
+1. List-level: `handle_null_bits(list)` → `[1, 1]` (both list elements valid)
+2. Element 0 struct (2 rows, no nulls): `handle_null_bits(struct)` → `[1, 1]`
+3. Element 1 struct (1 row, no nulls): `handle_null_bits(struct)` → `[1]`
+
+Total BitVec: `[1, 1, 1, 1, 1]` — 5 bits, all valid.
+- bit_count = 5 +- usize word (Lsb0): `0b11111` = 31 + +**Structural digest** — receives element counts for each valid list element: + +``` +items_structural receives: + 0x0200000000000000 // element 0: 2 struct rows (u64 LE) + 0x0100000000000000 // element 1: 1 struct row (u64 LE) +``` + +**Data digest** — receives composite struct data (no element count prefixes): + +For each list element, the struct children are sorted alphabetically and their finalized digests are written into the data stream: + +**Element 0** (2 struct rows): + +Struct children (sorted: "id", "label"): +- Child "id" (Int32, non-nullable): `SHA-256(0x01000000_02000000).finalize()` — 32 bytes +- Child "label" (LargeUtf8, non-nullable): `SHA-256(0x0100000000000000 "a" 0x0100000000000000 "b").finalize()` — 32 bytes + +**Element 1** (1 struct row): + +- Child "id": `SHA-256(0x03000000).finalize()` — 32 bytes +- Child "label": `SHA-256(0x0100000000000000 "c").finalize()` — 32 bytes + +``` +items_data_digest = SHA-256( + SHA-256([1,2] as i32 LE).finalize() // element 0 child "id" + || SHA-256(len+"a"+len+"b").finalize() // element 0 child "label" + || SHA-256([3] as i32 LE).finalize() // element 1 child "id" + || SHA-256(len+"c").finalize() // element 1 child "label" +) +``` + +Note: element counts are **not** in the data digest — they are in the structural digest. + +#### Step 3: Final Combination + +Finalization order: null_bits → structural → data (see Section 4.4). + +``` +final_digest = SHA-256() +final_digest.update( schema_digest ) // 32 bytes + +// items field finalization (nullable list = null_bits + structural + data) +final_digest.update( 0x0500000000000000 ) // bit_count=5 LE +final_digest.update( 0x000000000000001F ) // validity word=31 BE +final_digest.update( items_structural_digest.finalize() ) // 32 bytes (element counts) +final_digest.update( items_data_digest.finalize() ) // 32 bytes (leaf data) + +output = 0x000001 ++ final_digest.finalize() +``` + +--- + +## 8. Platform Considerations + +- **Integer sizes**: All length prefixes use `u64` (8 bytes). Validity bit counts and validity words use `usize`, which is 8 bytes on 64-bit platforms. This means hashes are **platform-dependent** if `usize` differs (32-bit vs 64-bit). +- **Byte order**: Data values use little-endian. Validity words use big-endian. Bit counts use little-endian. +- **Floating point**: IEEE 754 representation is hashed directly. `NaN` values with different bit patterns produce different hashes. `+0.0` and `-0.0` produce different hashes. 
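+
+As a final cross-check of the framing in Section 1, a minimal sketch with the `sha2` crate (the input here is an arbitrary stand-in for the combined digest stream):
+
+```rust
+use sha2::{Digest, Sha256};
+
+fn main() {
+    // A 3-byte version prefix (0.0.1) followed by the 32-byte SHA-256
+    // digest gives the 35-byte Starfix hash described in Section 1.
+    let raw: [u8; 32] = Sha256::digest(b"combined digest stream").into();
+    let mut out = Vec::with_capacity(35);
+    out.extend_from_slice(&[0x00, 0x00, 0x01]); // version prefix
+    out.extend_from_slice(&raw);
+    assert_eq!(out.len(), 35);
+}
+```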
diff --git a/src/arrow_digester_core.rs b/src/arrow_digester_core.rs
index 5dde5a6..112bdbe 100644
--- a/src/arrow_digester_core.rs
+++ b/src/arrow_digester_core.rs
@@ -7,24 +7,39 @@ use std::{collections::BTreeMap, iter::repeat_n};
 use arrow::{
     array::{
-        Array, BinaryArray, BooleanArray, GenericBinaryArray, GenericListArray, GenericStringArray,
-        LargeBinaryArray, LargeListArray, LargeStringArray, ListArray, OffsetSizeTrait,
-        RecordBatch, StringArray, StructArray,
+        make_array, Array, BinaryArray, BooleanArray, GenericBinaryArray, GenericListArray,
+        GenericStringArray, LargeBinaryArray, LargeListArray, LargeStringArray, ListArray,
+        OffsetSizeTrait, RecordBatch, StringArray, StructArray,
     },
+    buffer::NullBuffer,
+    compute::cast,
     datatypes::{DataType, Schema},
 };
 use arrow_schema::Field;
 use bitvec::prelude::*;
 use digest::Digest;
 
-const NULL_BYTES: &[u8] = b"NULL";
-
 const DELIMITER_FOR_NESTED_FIELD: &str = "/";
 
 #[derive(Clone)]
-enum DigestBufferType<D> {
-    NonNullable(D),
-    Nullable(BitVec, D), // Where first digest is for the bull bits, while the second is for the actual data
+struct DigestBufferType<D> {
+    null_bits: Option<BitVec>,
+    structural: Option<D>,
+    data: D,
+}
+
+impl<D: Digest> DigestBufferType<D> {
+    fn new(nullable: bool, structured: bool) -> Self {
+        Self {
+            null_bits: nullable.then(BitVec::new),
+            structural: structured.then(D::new),
+            data: D::new(),
+        }
+    }
+}
+
+const fn is_list_type(data_type: &DataType) -> bool {
+    matches!(data_type, DataType::List(_) | DataType::LargeList(_))
 }
 
 #[derive(Clone)]
@@ -56,9 +71,10 @@ impl<D: Digest> ArrowDigesterCore<D> {
 
     /// Hash a record batch and update the internal digests.
     pub fn update(&mut self, record_batch: &RecordBatch) {
-        // Verify schema matches
+        // Verify schema matches logically (same fields regardless of order, with type canonicalization)
        assert!(
-            *record_batch.schema() == self.schema,
+            Self::serialized_schema(record_batch.schema().as_ref())
+                == Self::serialized_schema(&self.schema),
             "Record batch schema does not match ArrowDigester schema"
         );
 
@@ -112,21 +128,33 @@ impl<D: Digest> ArrowDigesterCore<D> {
     /// This function will panic if JSON serialization of the data type fails.
     ///
     pub fn hash_array(array: &dyn Array) -> Vec<u8> {
+        // Resolve dictionary arrays to their plain value type
+        let (effective_type, resolved_array);
+        let effective_array: &dyn Array =
+            if let DataType::Dictionary(_, value_type) = array.data_type() {
+                resolved_array = cast(array, value_type.as_ref())
+                    .expect("Failed to cast dictionary to plain array");
+                effective_type = value_type.as_ref().clone();
+                resolved_array.as_ref()
+            } else {
+                effective_type = array.data_type().clone();
+                array
+            };
+
         let mut final_digest = D::new();
-        let data_type_serialized = serde_json::to_string(&array.data_type())
+        // Use canonical type serialization for metadata
+        let canonical_type = Self::data_type_to_value(&effective_type);
+        let data_type_serialized = serde_json::to_string(&canonical_type)
             .expect("Failed to serialize data type to string");
 
         // Update the digest buffer with the array metadata and field data
         final_digest.update(data_type_serialized);
 
         // Now we update it with the actual array data
-        let mut digest_buffer = if array.is_nullable() {
-            DigestBufferType::Nullable(BitVec::new(), D::new())
-        } else {
-            DigestBufferType::NonNullable(D::new())
-        };
-        Self::array_digest_update(array.data_type(), array, &mut digest_buffer);
+        let mut digest_buffer =
+            DigestBufferType::new(effective_array.is_nullable(), is_list_type(&effective_type));
+        Self::array_digest_update(&effective_type, effective_array, &mut digest_buffer);
 
         Self::finalize_digest(&mut final_digest, digest_buffer);
 
         // Finalize and return the digest
@@ -164,18 +192,19 @@ impl<D: Digest> ArrowDigesterCore<D> {
     /// Finalize a single field digest into the final digest.
     /// Helpers to reduce code duplication.
     fn finalize_digest(final_digest: &mut D, digest: DigestBufferType<D>) {
-        match digest {
-            DigestBufferType::NonNullable(data_digest) => {
-                final_digest.update(data_digest.finalize());
-            }
-            DigestBufferType::Nullable(null_bit_digest, data_digest) => {
-                final_digest.update(null_bit_digest.len().to_le_bytes());
-                for &word in null_bit_digest.as_raw_slice() {
-                    final_digest.update(word.to_be_bytes());
-                }
-                final_digest.update(data_digest.finalize());
+        // Null bits first (if nullable)
+        if let Some(null_bit_vec) = &digest.null_bits {
+            final_digest.update(null_bit_vec.len().to_le_bytes());
+            for &word in null_bit_vec.as_raw_slice() {
+                final_digest.update(word.to_be_bytes());
             }
         }
+        // Structural digest (if list type) — sizes separated from leaf data
+        if let Some(structural) = digest.structural {
+            final_digest.update(structural.finalize());
+        }
+        // Data/leaf digest
+        final_digest.update(digest.data.finalize());
     }
 
     /// Serialize the schema into a `BTreeMap` for field name and its digest.
@@ -201,33 +230,44 @@ impl<D: Digest> ArrowDigesterCore<D> {
     /// Convert a `DataType` to a JSON value, recursively converting any inner `Field`
     /// references to only include `name`, `data_type`, and `nullable`.
    fn data_type_to_value(data_type: &DataType) -> serde_json::Value {
-        match data_type {
+        let value = match data_type {
             DataType::Struct(fields) => {
-                let fields_json: Vec<serde_json::Value> = fields
+                let mut sorted_fields: Vec<_> = fields.iter().collect();
+                sorted_fields.sort_by_key(|f| f.name().clone());
+                let fields_json: Vec<serde_json::Value> = sorted_fields
                     .iter()
                     .map(|f| Self::inner_field_to_value(f))
                     .collect();
                 serde_json::json!({ "Struct": fields_json })
             }
-            DataType::List(field) => {
-                serde_json::json!({ "List": Self::inner_field_to_value(field) })
-            }
-            DataType::LargeList(field) => {
-                serde_json::json!({ "LargeList": Self::inner_field_to_value(field) })
+            // Canonicalize List → LargeList; drop Arrow-internal field name ("item")
+            DataType::List(field) | DataType::LargeList(field) => {
+                serde_json::json!({ "LargeList": Self::element_type_to_value(field) })
             }
             DataType::FixedSizeList(field, size) => {
-                serde_json::json!({ "FixedSizeList": [Self::inner_field_to_value(field), size] })
+                serde_json::json!({ "FixedSizeList": [Self::element_type_to_value(field), size] })
             }
             DataType::Map(field, sorted) => {
                 serde_json::json!({ "Map": [Self::inner_field_to_value(field), sorted] })
             }
+            // Canonicalize Binary → LargeBinary
+            DataType::Binary => {
+                serde_json::to_value(&DataType::LargeBinary).expect("Failed to serialize data type")
+            }
+            // Canonicalize Utf8 → LargeUtf8
+            DataType::Utf8 => {
+                serde_json::to_value(&DataType::LargeUtf8).expect("Failed to serialize data type")
+            }
+            // Canonicalize Dictionary → value type
+            DataType::Dictionary(_, value_type) => Self::data_type_to_value(value_type.as_ref()),
             // For all non-nested types, Arrow's default serde is sufficient
             other => serde_json::to_value(other).expect("Failed to serialize data type"),
-        }
+        };
+        Self::sort_json_value(value)
     }
 
-    /// Convert an inner field (e.g., list item, struct child) to a JSON value
-    /// with only `name`, `data_type`, and `nullable`.
+    /// Convert an inner field (e.g., struct child) to a JSON value
+    /// with `name`, `data_type`, and `nullable`.
     fn inner_field_to_value(field: &Field) -> serde_json::Value {
         serde_json::json!({
             "name": field.name(),
@@ -236,6 +276,15 @@ impl<D: Digest> ArrowDigesterCore<D> {
         })
     }
 
+    /// Convert a container element field (e.g., list item) to a JSON value
+    /// with only `data_type` and `nullable`, omitting the Arrow-internal field name.
+    fn element_type_to_value(field: &Field) -> serde_json::Value {
+        serde_json::json!({
+            "data_type": Self::data_type_to_value(field.data_type()),
+            "nullable": field.is_nullable(),
+        })
+    }
+
     /// Recursively sort all JSON object keys for deterministic serialization.
    fn sort_json_value(value: serde_json::Value) -> serde_json::Value {
         match value {
@@ -327,30 +376,25 @@ impl<D: Digest> ArrowDigesterCore<D> {
                     .downcast_ref::<BooleanArray>()
                     .expect("Failed to downcast to BooleanArray");
 
-                match digest {
-                    DigestBufferType::NonNullable(data_digest) => {
-                        // We want to bit pack the boolean values into bytes for hashing
-                        let mut bit_vec = BitVec::<u8, Msb0>::with_capacity(bool_array.len());
-                        for i in 0..bool_array.len() {
+                if let Some(ref mut null_bits) = digest.null_bits {
+                    // Handle null bits first
+                    Self::handle_null_bits(bool_array, null_bits);
+
+                    // Handle the data — only valid bits
+                    let mut bit_vec = BitVec::<u8, Msb0>::with_capacity(bool_array.len());
+                    for i in 0..bool_array.len() {
+                        if bool_array.is_valid(i) {
                             bit_vec.push(bool_array.value(i));
                         }
-
-                        data_digest.update(bit_vec.as_raw_slice());
                     }
-                    DigestBufferType::Nullable(null_bit_vec, data_digest) => {
-                        // Handle null bits first
-                        Self::handle_null_bits(bool_array, null_bit_vec);
-
-                        // Handle the data
-                        let mut bit_vec = BitVec::<u8, Msb0>::with_capacity(bool_array.len());
-                        for i in 0..bool_array.len() {
-                            // We only want the valid bits, for null we will discard from the hash since that is already capture by null_bits
-                            if bool_array.is_valid(i) {
-                                bit_vec.push(bool_array.value(i));
-                            }
-                        }
-                        data_digest.update(bit_vec.as_raw_slice());
+                    digest.data.update(bit_vec.as_raw_slice());
+                } else {
+                    // Non-nullable: pack all boolean values
+                    let mut bit_vec = BitVec::<u8, Msb0>::with_capacity(bool_array.len());
+                    for i in 0..bool_array.len() {
+                        bit_vec.push(bool_array.value(i));
                     }
+                    digest.data.update(bit_vec.as_raw_slice());
                 }
             }
             DataType::Int8 | DataType::UInt8 => Self::hash_fixed_size_array(array, digest, 1),
@@ -432,9 +476,75 @@ impl<D: Digest> ArrowDigesterCore<D> {
                 );
             }
             DataType::LargeListView(_) => todo!(),
-            DataType::Struct(_) => todo!(),
+            DataType::Struct(fields) => {
+                let struct_array = array
+                    .as_any()
+                    .downcast_ref::<StructArray>()
+                    .expect("Failed to downcast to StructArray");
+
+                // Push struct-level nulls to parent's BitVec (same pattern as other types)
+                if let Some(ref mut null_bits) = digest.null_bits {
+                    Self::handle_null_bits(struct_array, null_bits);
+                }
+
+                // Sort children alphabetically by field name
+                let mut sorted_fields: Vec<_> = fields.iter().enumerate().collect();
+                sorted_fields.sort_by_key(|(_, f)| f.name().clone());
+
+                for (idx, child_field) in &sorted_fields {
+                    let child_array = struct_array.column(*idx);
+
+                    // Child is effectively nullable if the child field is nullable
+                    // OR the struct itself has nulls (struct-level nulls propagate down)
+                    let effectively_nullable =
+                        child_field.is_nullable() || struct_array.nulls().is_some();
+
+                    let mut child_digest = DigestBufferType::new(
+                        effectively_nullable,
+                        is_list_type(child_field.data_type()),
+                    );
+
+                    if let Some(struct_nulls) = struct_array.nulls() {
+                        // Propagate struct-level nulls into the child array by combining
+                        // struct validity with child validity: combined = struct AND child
+                        let combined_nulls = child_array.nulls().map_or_else(
+                            || struct_nulls.clone(),
+                            |child_nulls| {
+                                NullBuffer::new(struct_nulls.inner() & child_nulls.inner())
+                            },
+                        );
+                        let child_data = child_array.to_data();
+                        let null_count = combined_nulls.null_count();
+                        let new_data = child_data
+                            .into_builder()
+                            .null_count(null_count)
+                            .null_bit_buffer(Some(combined_nulls.into_inner().into_inner()))
+                            .build()
+                            .expect("Failed to rebuild child array with combined null buffer");
+                        let combined_child = make_array(new_data);
+                        Self::array_digest_update(
+                            child_field.data_type(),
+                            combined_child.as_ref(),
+                            &mut child_digest,
+                        );
+                    } else {
+                        Self::array_digest_update(
+                            child_field.data_type(),
+                            child_array.as_ref(),
+                            &mut child_digest,
+                        );
+                    }
+
+                    // Finalize child digest into parent's data stream
+                    Self::finalize_child_into_data(digest, child_digest);
+                }
+            }
             DataType::Union(_, _) => todo!(),
-            DataType::Dictionary(_, _) => todo!(),
+            DataType::Dictionary(_, value_type) => {
+                let resolved = cast(array, value_type.as_ref())
+                    .expect("Failed to cast dictionary to plain array");
+                Self::array_digest_update(value_type.as_ref(), resolved.as_ref(), digest);
+            }
             DataType::Decimal128(_, _) => {
                 Self::hash_fixed_size_array(array, digest, 16);
             }
@@ -469,41 +579,38 @@
             )
             .expect("Failed to get buffer slice for FixedSizeBinaryArray");
 
-        match digest_buffer {
-            DigestBufferType::NonNullable(data_digest) => {
-                // No nulls, we can hash the entire buffer directly
-                data_digest.update(slice);
-            }
-            DigestBufferType::Nullable(null_bits, data_digest) => {
-                // Handle null bits first
-                Self::handle_null_bits(array, null_bits);
-
-                match array_data.nulls() {
-                    Some(null_buffer) => {
-                        // There are nulls, so we need to incrementally hash each value
-                        for i in 0..array_data.len() {
-                            if null_buffer.is_valid(i) {
-                                let data_pos = i
-                                    .checked_mul(element_size_usize)
-                                    .expect("Data position multiplication overflow");
-                                let end_pos = data_pos
-                                    .checked_add(element_size_usize)
-                                    .expect("End position addition overflow");
-
-                                data_digest.update(
-                                    slice
-                                        .get(data_pos..end_pos)
-                                        .expect("Failed to get data_slice"),
-                                );
-                            }
+        if let Some(ref mut null_bits) = digest_buffer.null_bits {
+            // Handle null bits first
+            Self::handle_null_bits(array, null_bits);
+
+            match array_data.nulls() {
+                Some(null_buffer) => {
+                    // There are nulls, so we need to incrementally hash each value
+                    for i in 0..array_data.len() {
+                        if null_buffer.is_valid(i) {
+                            let data_pos = i
+                                .checked_mul(element_size_usize)
+                                .expect("Data position multiplication overflow");
+                            let end_pos = data_pos
+                                .checked_add(element_size_usize)
+                                .expect("End position addition overflow");
+
+                            digest_buffer.data.update(
+                                slice
+                                    .get(data_pos..end_pos)
+                                    .expect("Failed to get data_slice"),
+                            );
                         }
                     }
-                    None => {
-                        // No nulls, we can hash the entire buffer directly
-                        data_digest.update(slice);
-                    }
+                }
+                None => {
+                    // No nulls, we can hash the entire buffer directly
+                    digest_buffer.data.update(slice);
                 }
             }
+        } else {
+            // No nulls, we can hash the entire buffer directly
+            digest_buffer.data.update(slice);
         }
     }
 
@@ -511,42 +618,16 @@
         array: &GenericBinaryArray<O>,
         digest: &mut DigestBufferType<D>,
     ) {
-        match digest {
-            DigestBufferType::NonNullable(data_digest) => {
-                for i in 0..array.len() {
-                    let value = array.value(i);
-                    data_digest.update(value.len().to_le_bytes());
-                    data_digest.update(value);
-                }
-            }
-            DigestBufferType::Nullable(null_bit_vec, data_digest) => {
-                // Deal with the null bits first
-                if let Some(null_buf) = array.nulls() {
-                    // We would need to iterate through the null buffer and push it into the null_bit_vec
-                    for i in 0..array.len() {
-                        null_bit_vec.push(null_buf.is_valid(i));
-                    }
+        if let Some(ref mut null_bits) = digest.null_bits {
+            Self::handle_null_bits(array, null_bits);
+        }
 
-                    for i in 0..array.len() {
-                        if null_buf.is_valid(i) {
-                            let value = array.value(i);
-                            data_digest.update(value.len().to_le_bytes());
-                            data_digest.update(value);
-                        } else {
-                            data_digest.update(NULL_BYTES);
-                        }
-                    }
-                } else {
-                    // All valid, therefore we can extend the bit vector with all true values
-                    null_bit_vec.extend(repeat_n(true, array.len()));
-
-                    // Deal with the data
-                    for i in 0..array.len() {
-                        let value = array.value(i);
-                        data_digest.update(value.len().to_le_bytes());
-                        data_digest.update(value);
-                    }
-                }
+        let null_buf = array.nulls();
+        for i in 0..array.len() {
+            if null_buf.is_none_or(|nb| nb.is_valid(i)) {
+                let value = array.value(i);
+                digest.data.update((value.len() as u64).to_le_bytes());
+                digest.data.update(value);
             }
         }
     }
 
@@ -555,38 +636,16 @@
         array: &GenericStringArray<O>,
         digest: &mut DigestBufferType<D>,
     ) {
-        match digest {
-            DigestBufferType::NonNullable(data_digest) => {
-                for i in 0..array.len() {
-                    let value = array.value(i);
-                    data_digest.update((value.len() as u64).to_le_bytes());
-                    data_digest.update(value.as_bytes());
-                }
-            }
-            DigestBufferType::Nullable(null_bit_vec, data_digest) => {
-                // Deal with the null bits first
-                Self::handle_null_bits(array, null_bit_vec);
-
-                match array.nulls() {
-                    Some(null_buf) => {
-                        for i in 0..array.len() {
-                            if null_buf.is_valid(i) {
-                                let value = array.value(i);
-                                data_digest.update((value.len() as u64).to_le_bytes());
-                                data_digest.update(value.as_bytes());
-                            } else {
-                                data_digest.update(NULL_BYTES);
-                            }
-                        }
-                    }
-                    None => {
-                        for i in 0..array.len() {
-                            let value = array.value(i);
-                            data_digest.update((value.len() as u64).to_le_bytes());
-                            data_digest.update(value.as_bytes());
-                        }
-                    }
-                }
+        if let Some(ref mut null_bits) = digest.null_bits {
+            Self::handle_null_bits(array, null_bits);
+        }
+
+        let null_buf = array.nulls();
+        for i in 0..array.len() {
+            if null_buf.is_none_or(|nb| nb.is_valid(i)) {
+                let value = array.value(i);
+                digest.data.update((value.len() as u64).to_le_bytes());
+                digest.data.update(value.as_bytes());
             }
         }
     }
@@ -596,40 +655,27 @@
         field_data_type: &DataType,
         digest: &mut DigestBufferType<D>,
     ) {
-        match digest {
-            // Wildcard `_` avoids binding so `digest` remains usable below
-            DigestBufferType::NonNullable(_) => {
-                for i in 0..array.len() {
-                    let sub = array.value(i);
-                    // Prefix sub-array element count to prevent cross-boundary collisions.
-                    // Without this [[1,2],[3]] and [[1],[2,3]] produce identical byte streams.
-                    // sub.len() returns usize, avoiding the non-primitive OffsetSizeTrait cast.
-                    Self::update_data_digest(digest, (sub.len() as u64).to_le_bytes());
-                    Self::array_digest_update(field_data_type, sub.as_ref(), digest);
-                }
-            }
-            DigestBufferType::Nullable(bit_vec, _) => {
-                // Deal with null bits first; NLL ends bit_vec borrow after this call
-                Self::handle_null_bits(array, bit_vec);
-
-                match array.nulls() {
-                    Some(null_buf) => {
-                        for i in 0..array.len() {
-                            if null_buf.is_valid(i) {
-                                let sub = array.value(i);
-                                Self::update_data_digest(digest, (sub.len() as u64).to_le_bytes());
-                                Self::array_digest_update(field_data_type, sub.as_ref(), digest);
-                            }
-                        }
-                    }
-                    None => {
-                        for i in 0..array.len() {
-                            let sub = array.value(i);
-                            Self::update_data_digest(digest, (sub.len() as u64).to_le_bytes());
-                            Self::array_digest_update(field_data_type, sub.as_ref(), digest);
-                        }
-                    }
+        // Handle null bits first (if nullable)
+        if let Some(ref mut null_bits) = digest.null_bits {
+            Self::handle_null_bits(array, null_bits);
+        }
+
+        let null_buf = array.nulls();
+        for i in 0..array.len() {
+            if null_buf.is_none_or(|nb| nb.is_valid(i)) {
+                let sub = array.value(i);
+                let size_bytes = (sub.len() as u64).to_le_bytes();
+
+                // Write element count to structural digest (separating structure from leaf data).
+                // If no structural digest exists, fall back to data digest for backward compat.
+                if let Some(ref mut structural) = digest.structural {
+                    structural.update(size_bytes);
+                } else {
+                    digest.data.update(size_bytes);
                 }
+
+                // Recurse into sub-array — leaf data goes to data digest
+                Self::array_digest_update(field_data_type, sub.as_ref(), digest);
            }
         }
     }
@@ -655,11 +701,7 @@
             // Base case, just add the the combine field name to the map
             fields_digest_buffer.insert(
                 Self::construct_field_name_hierarchy(parent_field_name, field.name()),
-                if field.is_nullable() {
-                    DigestBufferType::Nullable(BitVec::new(), D::new())
-                } else {
-                    DigestBufferType::NonNullable(D::new())
-                },
+                DigestBufferType::new(field.is_nullable(), is_list_type(field.data_type())),
             );
         }
     }
@@ -672,12 +714,33 @@
         }
     }
 
-    /// Write bytes directly into the data digest portion of the buffer, bypassing null-bit tracking.
+    /// Write bytes directly into the data/leaf digest portion of the buffer, bypassing null-bit tracking.
     /// Used to write length prefixes that sit in the data stream but are not nullable values.
     fn update_data_digest(digest: &mut DigestBufferType<D>, data: impl AsRef<[u8]>) {
-        match digest {
-            DigestBufferType::NonNullable(d) | DigestBufferType::Nullable(_, d) => d.update(data),
+        digest.data.update(data);
+    }
+
+    /// Finalize a child's digest and write the resulting bytes into the parent's data stream.
+    /// Used for composite types (structs) where each child is independently hashed and then
+    /// its finalized representation is fed into the parent digest.
+    #[expect(
+        clippy::big_endian_bytes,
+        reason = "Use for bit packing the null_bit_values"
+    )]
+    fn finalize_child_into_data(parent: &mut DigestBufferType<D>, child: DigestBufferType<D>) {
+        // Null bits first (if nullable child)
+        if let Some(null_bit_vec) = &child.null_bits {
+            Self::update_data_digest(parent, null_bit_vec.len().to_le_bytes());
+            for &word in null_bit_vec.as_raw_slice() {
+                Self::update_data_digest(parent, word.to_be_bytes());
+            }
+        }
+        // Structural digest (if list child)
+        if let Some(structural) = child.structural {
+            Self::update_data_digest(parent, structural.finalize());
        }
+        // Data/leaf digest
+        Self::update_data_digest(parent, child.data.finalize());
     }
 
     fn handle_null_bits(array: &dyn Array, null_bit_vec: &mut BitVec) {
@@ -727,7 +790,7 @@ mod tests {
     use pretty_assertions::assert_eq;
     use sha2::{Digest as _, Sha256};
 
-    use crate::arrow_digester_core::{ArrowDigesterCore, DigestBufferType};
+    use crate::arrow_digester_core::ArrowDigesterCore;
     use arrow::array::{Decimal256Array, Decimal64Array};
     use arrow_buffer::i256;
 
@@ -920,7 +983,7 @@
         // Check the digest
         assert_eq!(
             encode(digester.finalize()),
-            "9841aab2dfeb637872d41422d33fca1e939f06b8fa0dcec66ff3782592cf9565"
+            "e13ce8a993a636f70e30bc2f4c0667fa6a42aeef94d1a32e78e8fd8dbc59b0a0"
         );
     }
 
@@ -944,11 +1007,9 @@
             .unwrap(),
         );
 
-        let DigestBufferType::Nullable(null_bit_vec, data_digest) =
-            &digester.fields_digest_buffer["col"]
-        else {
-            panic!("Expected Nullable buffer");
-        };
+        let buf = &digester.fields_digest_buffer["col"];
+        let null_bit_vec = buf.null_bits.as_ref().expect("Expected nullable");
+        let data_digest = &buf.data;
 
         assert_eq!(null_bit_vec.len(), 4);
         assert!(null_bit_vec[0], "index 0 (true) should be valid");
@@ -981,10 +1042,9 @@
             .unwrap(),
         );
 
-        let DigestBufferType::NonNullable(data_digest) = &digester.fields_digest_buffer["col"]
-        else {
-            panic!("Expected NonNullable buffer");
-        };
+        let buf =
&digester.fields_digest_buffer["col"]; + assert!(buf.null_bits.is_none(), "Expected non-nullable"); + let data_digest = &buf.data; // [false, true, false] packed Msb0: bit0=0, bit1=1, bit2=0 → 0100_0000 = 0x40 let mut manual = Sha256::new(); @@ -1008,11 +1068,9 @@ mod tests { .unwrap(), ); - let DigestBufferType::Nullable(null_bit_vec, data_digest) = - &digester.fields_digest_buffer["col"] - else { - panic!("Expected Nullable buffer"); - }; + let buf = &digester.fields_digest_buffer["col"]; + let null_bit_vec = buf.null_bits.as_ref().expect("Expected nullable"); + let data_digest = &buf.data; assert_eq!(null_bit_vec.len(), 3); assert!(null_bit_vec[0]); @@ -1039,10 +1097,9 @@ mod tests { .unwrap(), ); - let DigestBufferType::NonNullable(data_digest) = &digester.fields_digest_buffer["col"] - else { - panic!("Expected NonNullable buffer"); - }; + let buf = &digester.fields_digest_buffer["col"]; + assert!(buf.null_bits.is_none(), "Expected non-nullable"); + let data_digest = &buf.data; let mut manual = Sha256::new(); manual.update([0x01_u8, 0x02_u8, 0xFF_u8]); @@ -1067,11 +1124,9 @@ mod tests { .unwrap(), ); - let DigestBufferType::Nullable(null_bit_vec, data_digest) = - &digester.fields_digest_buffer["col"] - else { - panic!("Expected Nullable buffer"); - }; + let buf = &digester.fields_digest_buffer["col"]; + let null_bit_vec = buf.null_bits.as_ref().expect("Expected nullable"); + let data_digest = &buf.data; assert_eq!(null_bit_vec.len(), 3); assert!(null_bit_vec[0]); @@ -1102,10 +1157,9 @@ mod tests { .unwrap(), ); - let DigestBufferType::NonNullable(data_digest) = &digester.fields_digest_buffer["col"] - else { - panic!("Expected NonNullable buffer"); - }; + let buf = &digester.fields_digest_buffer["col"]; + assert!(buf.null_bits.is_none(), "Expected non-nullable"); + let data_digest = &buf.data; let mut manual = Sha256::new(); manual.update(100_u16.to_le_bytes()); @@ -1138,10 +1192,9 @@ mod tests { .unwrap(), ); - let DigestBufferType::NonNullable(data_digest) = &digester.fields_digest_buffer["col"] - else { - panic!("Expected NonNullable buffer"); - }; + let buf = &digester.fields_digest_buffer["col"]; + assert!(buf.null_bits.is_none(), "Expected non-nullable"); + let data_digest = &buf.data; let mut manual = Sha256::new(); manual.update(half::f16::from_f32(1.0).to_le_bytes()); @@ -1176,13 +1229,12 @@ mod tests { .unwrap(), ); - let DigestBufferType::Nullable(null_bit_vec, data_digest) = digester + let buf = digester .fields_digest_buffer .get("int32_col") - .expect("int32_col field should exist in digest buffer") - else { - panic!("Expected a Nullable digest buffer for int32_col"); - }; + .expect("int32_col field should exist in digest buffer"); + let null_bit_vec = buf.null_bits.as_ref().expect("Expected nullable"); + let data_digest = &buf.data; // The null bit vector should be [true, false, true, true] for [Some(42), None, Some(-7), Some(0)] assert_eq!(null_bit_vec.len(), 4); @@ -1217,11 +1269,9 @@ mod tests { .unwrap(), ); - let DigestBufferType::Nullable(null_bit_vec, data_digest) = - &digester.fields_digest_buffer["col"] - else { - panic!("Expected Nullable buffer"); - }; + let buf = &digester.fields_digest_buffer["col"]; + let null_bit_vec = buf.null_bits.as_ref().expect("Expected nullable"); + let data_digest = &buf.data; assert_eq!(null_bit_vec.len(), 3); assert!(null_bit_vec[0]); @@ -1256,11 +1306,9 @@ mod tests { .unwrap(), ); - let DigestBufferType::Nullable(null_bit_vec, data_digest) = - &digester.fields_digest_buffer["col"] - else { - panic!("Expected Nullable buffer"); - 
}; + let buf = &digester.fields_digest_buffer["col"]; + let null_bit_vec = buf.null_bits.as_ref().expect("Expected nullable"); + let data_digest = &buf.data; assert_eq!(null_bit_vec.len(), 3); assert!(null_bit_vec[0]); @@ -1296,11 +1344,9 @@ mod tests { .unwrap(), ); - let DigestBufferType::Nullable(null_bit_vec, data_digest) = - &digester.fields_digest_buffer["col"] - else { - panic!("Expected Nullable buffer"); - }; + let buf = &digester.fields_digest_buffer["col"]; + let null_bit_vec = buf.null_bits.as_ref().expect("Expected nullable"); + let data_digest = &buf.data; assert_eq!(null_bit_vec.len(), 3); assert!(null_bit_vec[0]); @@ -1333,10 +1379,9 @@ mod tests { .unwrap(), ); - let DigestBufferType::NonNullable(data_digest) = &digester.fields_digest_buffer["col"] - else { - panic!("Expected NonNullable buffer"); - }; + let buf = &digester.fields_digest_buffer["col"]; + assert!(buf.null_bits.is_none(), "Expected non-nullable"); + let data_digest = &buf.data; let mut manual = Sha256::new(); manual.update(0_i32.to_le_bytes()); @@ -1361,11 +1406,9 @@ mod tests { .unwrap(), ); - let DigestBufferType::Nullable(null_bit_vec, data_digest) = - &digester.fields_digest_buffer["col"] - else { - panic!("Expected Nullable buffer"); - }; + let buf = &digester.fields_digest_buffer["col"]; + let null_bit_vec = buf.null_bits.as_ref().expect("Expected nullable"); + let data_digest = &buf.data; assert_eq!(null_bit_vec.len(), 3); assert!(null_bit_vec[0]); @@ -1392,11 +1435,9 @@ mod tests { .unwrap(), ); - let DigestBufferType::Nullable(null_bit_vec, data_digest) = - &digester.fields_digest_buffer["col"] - else { - panic!("Expected Nullable buffer"); - }; + let buf = &digester.fields_digest_buffer["col"]; + let null_bit_vec = buf.null_bits.as_ref().expect("Expected nullable"); + let data_digest = &buf.data; assert_eq!(null_bit_vec.len(), 3); assert!(null_bit_vec[0]); @@ -1429,10 +1470,9 @@ mod tests { .unwrap(), ); - let DigestBufferType::NonNullable(data_digest) = &digester.fields_digest_buffer["col"] - else { - panic!("Expected NonNullable buffer"); - }; + let buf = &digester.fields_digest_buffer["col"]; + assert!(buf.null_bits.is_none(), "Expected non-nullable"); + let data_digest = &buf.data; let mut manual = Sha256::new(); manual.update(1.0_f64.to_le_bytes()); @@ -1464,11 +1504,9 @@ mod tests { .unwrap(), ); - let DigestBufferType::Nullable(null_bit_vec, data_digest) = - &digester.fields_digest_buffer["col"] - else { - panic!("Expected Nullable buffer"); - }; + let buf = &digester.fields_digest_buffer["col"]; + let null_bit_vec = buf.null_bits.as_ref().expect("Expected nullable"); + let data_digest = &buf.data; assert_eq!(null_bit_vec.len(), 3); assert!(null_bit_vec[0]); @@ -1501,10 +1539,9 @@ mod tests { .unwrap(), ); - let DigestBufferType::NonNullable(data_digest) = &digester.fields_digest_buffer["col"] - else { - panic!("Expected NonNullable buffer"); - }; + let buf = &digester.fields_digest_buffer["col"]; + assert!(buf.null_bits.is_none(), "Expected non-nullable"); + let data_digest = &buf.data; let mut manual = Sha256::new(); manual.update(0_i64.to_le_bytes()); @@ -1529,11 +1566,9 @@ mod tests { .unwrap(), ); - let DigestBufferType::Nullable(null_bit_vec, data_digest) = - &digester.fields_digest_buffer["col"] - else { - panic!("Expected Nullable buffer"); - }; + let buf = &digester.fields_digest_buffer["col"]; + let null_bit_vec = buf.null_bits.as_ref().expect("Expected nullable"); + let data_digest = &buf.data; assert_eq!(null_bit_vec.len(), 3); assert!(null_bit_vec[0]); @@ -1560,11 +1595,9 @@ mod 
tests { .unwrap(), ); - let DigestBufferType::Nullable(null_bit_vec, data_digest) = - &digester.fields_digest_buffer["col"] - else { - panic!("Expected Nullable buffer"); - }; + let buf = &digester.fields_digest_buffer["col"]; + let null_bit_vec = buf.null_bits.as_ref().expect("Expected nullable"); + let data_digest = &buf.data; assert_eq!(null_bit_vec.len(), 3); assert!(null_bit_vec[0]); @@ -1601,11 +1634,9 @@ mod tests { .unwrap(), ); - let DigestBufferType::Nullable(null_bit_vec, data_digest) = - &digester.fields_digest_buffer["col"] - else { - panic!("Expected Nullable buffer"); - }; + let buf = &digester.fields_digest_buffer["col"]; + let null_bit_vec = buf.null_bits.as_ref().expect("Expected nullable"); + let data_digest = &buf.data; assert_eq!(null_bit_vec.len(), 3); assert!(null_bit_vec[0]); @@ -1640,11 +1671,9 @@ mod tests { .unwrap(), ); - let DigestBufferType::Nullable(null_bit_vec, data_digest) = - &digester.fields_digest_buffer["col"] - else { - panic!("Expected Nullable buffer"); - }; + let buf = &digester.fields_digest_buffer["col"]; + let null_bit_vec = buf.null_bits.as_ref().expect("Expected nullable"); + let data_digest = &buf.data; assert_eq!(null_bit_vec.len(), 3); assert!(null_bit_vec[0]); @@ -1680,11 +1709,9 @@ mod tests { .unwrap(), ); - let DigestBufferType::Nullable(null_bit_vec, data_digest) = - &digester.fields_digest_buffer["col"] - else { - panic!("Expected Nullable buffer"); - }; + let buf = &digester.fields_digest_buffer["col"]; + let null_bit_vec = buf.null_bits.as_ref().expect("Expected nullable"); + let data_digest = &buf.data; assert_eq!(null_bit_vec.len(), 3); assert!(null_bit_vec[0]); @@ -1724,11 +1751,9 @@ mod tests { .unwrap(), ); - let DigestBufferType::Nullable(null_bit_vec, data_digest) = - &digester.fields_digest_buffer["col"] - else { - panic!("Expected Nullable buffer"); - }; + let buf = &digester.fields_digest_buffer["col"]; + let null_bit_vec = buf.null_bits.as_ref().expect("Expected nullable"); + let data_digest = &buf.data; assert_eq!(null_bit_vec.len(), 3); assert!(null_bit_vec[0]); @@ -1766,11 +1791,9 @@ mod tests { .unwrap(), ); - let DigestBufferType::Nullable(null_bit_vec, data_digest) = - &digester.fields_digest_buffer["col"] - else { - panic!("Expected Nullable buffer"); - }; + let buf = &digester.fields_digest_buffer["col"]; + let null_bit_vec = buf.null_bits.as_ref().expect("Expected nullable"); + let data_digest = &buf.data; assert_eq!(null_bit_vec.len(), 3); assert!(null_bit_vec[0]); @@ -1789,8 +1812,8 @@ mod tests { #[test] fn digest_binary_nullable_bytes() { // [b"hello", None, b"world"] - // Valid entries: (length as usize LE) ++ bytes. - // Null entries contribute the sentinel b"NULL" to the data digest. + // Valid entries: (length as u64 LE) ++ bytes. + // Null entries are skipped entirely in the data digest. 
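+ // (Skipping is safe: null positions are still recorded in the validity BitVec and folded into the final combination.)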
let array = BinaryArray::from(vec![Some(b"hello".as_ref()), None, Some(b"world".as_ref())]); let schema = Schema::new(vec![Field::new("col", DataType::Binary, true)]); let mut digester = ArrowDigesterCore::<Sha256>::new(schema); @@ -1802,11 +1825,9 @@ mod tests { .unwrap(), ); - let DigestBufferType::Nullable(null_bit_vec, data_digest) = - &digester.fields_digest_buffer["col"] - else { - panic!("Expected Nullable buffer"); - }; + let buf = &digester.fields_digest_buffer["col"]; + let null_bit_vec = buf.null_bits.as_ref().expect("Expected nullable"); + let data_digest = &buf.data; assert_eq!(null_bit_vec.len(), 3); assert!(null_bit_vec[0]); @@ -1814,10 +1835,10 @@ mod tests { assert!(null_bit_vec[2]); let mut manual = Sha256::new(); - manual.update(5_usize.to_le_bytes()); // len("hello") + manual.update(5_u64.to_le_bytes()); // len("hello") manual.update(b"hello"); - manual.update(b"NULL"); // null sentinel - manual.update(5_usize.to_le_bytes()); // len("world") + // null entry skipped — no sentinel bytes + manual.update(5_u64.to_le_bytes()); // len("world") manual.update(b"world"); assert_eq!(data_digest.clone().finalize(), manual.finalize()); } @@ -1840,15 +1861,14 @@ mod tests { .unwrap(), ); - let DigestBufferType::NonNullable(data_digest) = &digester.fields_digest_buffer["col"] - else { - panic!("Expected NonNullable buffer"); - }; + let buf = &digester.fields_digest_buffer["col"]; + assert!(buf.null_bits.is_none(), "Expected non-nullable"); + let data_digest = &buf.data; let mut manual = Sha256::new(); - manual.update(2_usize.to_le_bytes()); + manual.update(2_u64.to_le_bytes()); manual.update(b"ab"); - manual.update(3_usize.to_le_bytes()); + manual.update(3_u64.to_le_bytes()); manual.update(b"cde"); assert_eq!(data_digest.clone().finalize(), manual.finalize()); } @@ -1859,7 +1879,7 @@ mod tests { fn digest_utf8_nullable_bytes() { // ["foo", None, "ba"] // Valid entries: (length as u64 LE) ++ UTF-8 bytes. - // Null entries contribute the sentinel b"NULL" to the data digest. + // Null entries are skipped entirely in the data digest. 
let array = StringArray::from(vec![Some("foo"), None, Some("ba")]); let schema = Schema::new(vec![Field::new("col", DataType::Utf8, true)]); let mut digester = ArrowDigesterCore::<Sha256>::new(schema); @@ -1871,11 +1891,9 @@ mod tests { .unwrap(), ); - let DigestBufferType::Nullable(null_bit_vec, data_digest) = - &digester.fields_digest_buffer["col"] - else { - panic!("Expected Nullable buffer"); - }; + let buf = &digester.fields_digest_buffer["col"]; + let null_bit_vec = buf.null_bits.as_ref().expect("Expected nullable"); + let data_digest = &buf.data; assert_eq!(null_bit_vec.len(), 3); assert!(null_bit_vec[0]); @@ -1885,7 +1903,7 @@ mod tests { let mut manual = Sha256::new(); manual.update(3_u64.to_le_bytes()); // len("foo") manual.update(b"foo"); - manual.update(b"NULL"); // null sentinel + // null entry skipped — no sentinel bytes manual.update(2_u64.to_le_bytes()); // len("ba") manual.update(b"ba"); assert_eq!(data_digest.clone().finalize(), manual.finalize()); @@ -1909,10 +1927,9 @@ mod tests { .unwrap(), ); - let DigestBufferType::NonNullable(data_digest) = &digester.fields_digest_buffer["col"] - else { - panic!("Expected NonNullable buffer"); - }; + let buf = &digester.fields_digest_buffer["col"]; + assert!(buf.null_bits.is_none(), "Expected non-nullable"); + let data_digest = &buf.data; let mut manual = Sha256::new(); manual.update(1_u64.to_le_bytes()); @@ -1958,18 +1975,28 @@ mod tests { .unwrap(), ); - let DigestBufferType::NonNullable(data_digest) = &digester.fields_digest_buffer["col"] - else { - panic!("Expected NonNullable buffer"); - }; + let buf = &digester.fields_digest_buffer["col"]; + assert!(buf.null_bits.is_none(), "Expected non-nullable"); + let structural_digest = buf + .structural + .as_ref() + .expect("Expected structural digest for list"); + let data_digest = &buf.data; + + // Structural digest: element count (sizes separated from leaf data) + let mut manual_structural = Sha256::new(); + manual_structural.update(3_u64.to_le_bytes()); // element count prefix + assert_eq!( + structural_digest.clone().finalize(), + manual_structural.finalize() + ); - // sub-array has 3 elements at offset 0 → raw buffer slice from byte 0 - let mut manual = Sha256::new(); - manual.update(3_u64.to_le_bytes()); // element count prefix - manual.update(10_i32.to_le_bytes()); - manual.update(20_i32.to_le_bytes()); - manual.update(30_i32.to_le_bytes()); - assert_eq!(data_digest.clone().finalize(), manual.finalize()); + // Data/leaf digest: only the raw leaf values + let mut manual_data = Sha256::new(); + manual_data.update(10_i32.to_le_bytes()); + manual_data.update(20_i32.to_le_bytes()); + manual_data.update(30_i32.to_le_bytes()); + assert_eq!(data_digest.clone().finalize(), manual_data.finalize()); } #[test] @@ -2001,16 +2028,27 @@ mod tests { .unwrap(), ); - let DigestBufferType::NonNullable(data_digest) = &digester.fields_digest_buffer["col"] - else { - panic!("Expected NonNullable buffer"); - }; + let buf = &digester.fields_digest_buffer["col"]; + assert!(buf.null_bits.is_none(), "Expected non-nullable"); + let structural_digest = buf + .structural + .as_ref() + .expect("Expected structural digest for list"); + let data_digest = &buf.data; + + // Structural digest: element count (sizes separated from leaf data) + let mut manual_structural = Sha256::new(); + manual_structural.update(3_u64.to_le_bytes()); + assert_eq!( + structural_digest.clone().finalize(), + manual_structural.finalize() + ); - let mut manual = Sha256::new(); - manual.update(3_u64.to_le_bytes()); - 
manual.update(1_i32.to_le_bytes()); - manual.update(2_i32.to_le_bytes()); - manual.update(3_i32.to_le_bytes()); - assert_eq!(data_digest.clone().finalize(), manual.finalize()); + // Data/leaf digest: only the raw leaf values + let mut manual_data = Sha256::new(); + manual_data.update(1_i32.to_le_bytes()); + manual_data.update(2_i32.to_le_bytes()); + manual_data.update(3_i32.to_le_bytes()); + assert_eq!(data_digest.clone().finalize(), manual_data.finalize()); } } diff --git a/tests/arrow_digester.rs b/tests/arrow_digester.rs index 303e258..45d9581 100644 --- a/tests/arrow_digester.rs +++ b/tests/arrow_digester.rs @@ -73,7 +73,7 @@ mod tests { assert_eq!( encode(ArrowDigester::new(schema.clone()).finalize()), - "0000019c75bd0c40bd2fb15e878418c151c0b792c966476b35ded7d0f6fd1922cf5a00" + "0000016a44e0dc5c25d5ca0c53312a6afcffa6e07168afc7f16f5e16c8ca052f09f1bb" ); let batch = RecordBatch::try_new( @@ -129,7 +129,7 @@ mod tests { // Hash the record batch assert_eq!( encode(ArrowDigester::hash_record_batch(&batch)), - "00000199f7ba7f6c7ec30ad487996c2b3eb6f0e1c750c318a32b09afcdfdce7de8c08e" + "0000010bc624523e362eb2377c47ccfaf9399a5631404bc20821fdd4e09ca25ea49fde" ); } @@ -199,10 +199,10 @@ mod tests { let hash = hex::encode(ArrowDigester::hash_array(&binary_array)); assert_eq!( hash, - "000001466801efd880d2acecd6c78915b5c2a51476870f9116912834d79de43a000071" + "000001fd0b85d56d72f59c5981c0b54cea148d3a737db10b696e3e3d1d444aed764893" ); - // Test large binary array with same data to ensure consistency + // Large binary array with same data should produce identical hash (type canonicalization) let large_binary_array = LargeBinaryArray::from(vec![ Some(b"hello".as_ref()), None, @@ -210,7 +210,7 @@ mod tests { Some(b"".as_ref()), ]); - assert_ne!( + assert_eq!( hex::encode(ArrowDigester::hash_array(&large_binary_array)), hash ); @@ -263,14 +263,14 @@ mod tests { let hash = hex::encode(ArrowDigester::hash_array(&string_array)); assert_eq!( hash, - "000001811f2407a0d2e90ef9688514d37cd92225242e7614f02ef5ef36abcae73ca374" + "000001088e379f978a8f8ed7148e118bfbcdda99f5bc28c203cdb793da765c76987a9b" ); - // Test large string array with same data to ensure consistency + // Large string array with same data should produce identical hash (type canonicalization) let large_string_array = LargeStringArray::from(vec![Some("hello"), None, Some("world"), Some("")]); - assert_ne!( + assert_eq!( hex::encode(ArrowDigester::hash_array(&large_string_array)), hash ); @@ -289,7 +289,7 @@ mod tests { let hash = hex::encode(ArrowDigester::hash_array(&list_array)); assert_eq!( hash, - "00000114b8faee7c56d2a94d77095db599152df41aaf4d11e485035eebc94e8981f769" + "00000125939ebc0815ab1fb13b19fd7c0f36a1b27c09ec33d8100f5ba9f0e0032442ae" ); // Collision test: [[1, 2], [3]] vs [[1], [2, 3]] @@ -603,7 +603,7 @@ mod tests { /// Two schemas with the same struct fields in different order should produce identical schema hashes. /// Bug: `data_type_to_value()` preserves struct field insertion order in the JSON Vec. #[test] - #[ignore = "Bug: struct fields not sorted in data_type_to_value (Issue 1)"] + fn struct_field_order_in_schema_should_not_affect_hash() { let schema1 = Schema::new(vec![Field::new( "my_struct", @@ -640,7 +640,7 @@ mod tests { /// Record batches with struct columns whose inner fields are reordered should produce identical hashes. 
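+ /// (Struct children are flattened into leaf fields and combined in sorted-name order, so declaration order never reaches the digest.)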
#[test] - #[ignore = "Bug: struct fields not sorted in data_type_to_value (Issue 1)"] + fn struct_field_order_in_record_batch_should_not_affect_hash() { let schema1 = Arc::new(Schema::new(vec![Field::new( "s", @@ -667,8 +667,7 @@ mod tests { )])); let ints = Arc::new(Int32Array::from(vec![1, 2, 3])) as ArrayRef; - let bools = - Arc::new(BooleanArray::from(vec![Some(true), Some(false), None])) as ArrayRef; + let bools = Arc::new(BooleanArray::from(vec![Some(true), Some(false), None])) as ArrayRef; let struct1 = StructArray::from(vec![ ( @@ -692,10 +691,8 @@ mod tests { ), ]); - let batch1 = - RecordBatch::try_new(schema1, vec![Arc::new(struct1) as ArrayRef]).unwrap(); - let batch2 = - RecordBatch::try_new(schema2, vec![Arc::new(struct2) as ArrayRef]).unwrap(); + let batch1 = RecordBatch::try_new(schema1, vec![Arc::new(struct1) as ArrayRef]).unwrap(); + let batch2 = RecordBatch::try_new(schema2, vec![Arc::new(struct2) as ArrayRef]).unwrap(); assert_eq!( encode(ArrowDigester::hash_record_batch(&batch1)), @@ -707,7 +704,7 @@ mod tests { // ── Issue 5: Type canonicalization (Binary/LargeBinary, Utf8/LargeUtf8, List/LargeList) ── #[test] - #[ignore = "Bug: no type canonicalization for Binary vs LargeBinary (Issue 5)"] + fn binary_and_large_binary_schema_should_hash_equal() { let schema1 = Schema::new(vec![Field::new("col", DataType::Binary, true)]); let schema2 = Schema::new(vec![Field::new("col", DataType::LargeBinary, true)]); @@ -720,7 +717,7 @@ mod tests { } #[test] - #[ignore = "Bug: no type canonicalization for Utf8 vs LargeUtf8 (Issue 5)"] + fn utf8_and_large_utf8_schema_should_hash_equal() { let schema1 = Schema::new(vec![Field::new("col", DataType::Utf8, true)]); let schema2 = Schema::new(vec![Field::new("col", DataType::LargeUtf8, true)]); @@ -733,7 +730,7 @@ mod tests { } #[test] - #[ignore = "Bug: no type canonicalization for List vs LargeList (Issue 5)"] + fn list_and_large_list_schema_should_hash_equal() { let list_field = Field::new("item", DataType::Int32, true); let schema1 = Schema::new(vec![Field::new( @@ -755,18 +752,11 @@ mod tests { } #[test] - #[ignore = "Bug: no type canonicalization for Binary vs LargeBinary in hash_array (Issue 5)"] + fn binary_and_large_binary_array_should_hash_equal() { - let bin = BinaryArray::from(vec![ - Some(b"hello".as_ref()), - None, - Some(b"world".as_ref()), - ]); - let large_bin = LargeBinaryArray::from(vec![ - Some(b"hello".as_ref()), - None, - Some(b"world".as_ref()), - ]); + let bin = BinaryArray::from(vec![Some(b"hello".as_ref()), None, Some(b"world".as_ref())]); + let large_bin = + LargeBinaryArray::from(vec![Some(b"hello".as_ref()), None, Some(b"world".as_ref())]); assert_eq!( encode(ArrowDigester::hash_array(&bin)), @@ -776,7 +766,7 @@ mod tests { } #[test] - #[ignore = "Bug: no type canonicalization for Utf8 vs LargeUtf8 in hash_array (Issue 5)"] + fn utf8_and_large_utf8_array_should_hash_equal() { let arr = StringArray::from(vec![Some("hello"), None, Some("world")]); let large_arr = LargeStringArray::from(vec![Some("hello"), None, Some("world")]); @@ -789,7 +779,7 @@ mod tests { } #[test] - #[ignore = "Bug: no type canonicalization for Binary vs LargeBinary in hash_record_batch (Issue 5)"] + fn binary_and_large_binary_record_batch_should_hash_equal() { let schema1 = Arc::new(Schema::new(vec![Field::new("col", DataType::Binary, true)])); let schema2 = Arc::new(Schema::new(vec![Field::new( @@ -800,19 +790,13 @@ mod tests { let batch1 = RecordBatch::try_new( schema1, - vec![Arc::new(BinaryArray::from(vec![ - Some(b"abc".as_ref()), - 
None, - ])) as ArrayRef], + vec![Arc::new(BinaryArray::from(vec![Some(b"abc".as_ref()), None])) as ArrayRef], ) .unwrap(); let batch2 = RecordBatch::try_new( schema2, - vec![Arc::new(LargeBinaryArray::from(vec![ - Some(b"abc".as_ref()), - None, - ])) as ArrayRef], + vec![Arc::new(LargeBinaryArray::from(vec![Some(b"abc".as_ref()), None])) as ArrayRef], ) .unwrap(); @@ -826,7 +810,7 @@ mod tests { // ── Issue 6: Dictionary-encoded array equivalence ─────────────────── #[test] - #[ignore = "Bug: Dictionary arrays hit todo!() panic (Issue 6)"] + fn dictionary_utf8_should_hash_same_as_plain_string() { let plain = StringArray::from(vec![Some("apple"), Some("banana"), Some("apple")]); @@ -842,13 +826,12 @@ mod tests { } #[test] - #[ignore = "Bug: Dictionary arrays hit todo!() panic (Issue 6)"] + fn dictionary_int_values_should_hash_same_as_plain() { let plain = StringArray::from(vec![Some("x"), Some("y"), Some("x")]); - let dict: DictionaryArray = vec![Some("x"), Some("y"), Some("x")] - .into_iter() - .collect(); + let dict: DictionaryArray = + vec![Some("x"), Some("y"), Some("x")].into_iter().collect(); assert_eq!( encode(ArrowDigester::hash_array(&plain)), @@ -858,13 +841,12 @@ mod tests { } #[test] - #[ignore = "Bug: Dictionary arrays hit todo!() panic (Issue 6)"] + fn dictionary_with_nulls_should_hash_same_as_plain() { let plain = StringArray::from(vec![Some("a"), None, Some("b"), None]); - let dict: DictionaryArray = vec![Some("a"), None, Some("b"), None] - .into_iter() - .collect(); + let dict: DictionaryArray = + vec![Some("a"), None, Some("b"), None].into_iter().collect(); assert_eq!( encode(ArrowDigester::hash_array(&plain)), @@ -877,7 +859,7 @@ mod tests { /// Feeding a batch with reordered columns into a digester should not panic. #[test] - #[ignore = "Bug: update() uses strict schema equality including column order (Issue 7)"] + fn streaming_update_with_reordered_columns_should_succeed() { let schema = Schema::new(vec![ Field::new("a", DataType::Int32, false), @@ -908,7 +890,7 @@ mod tests { /// A digester fed batches with different column orders should produce the same hash /// as one fed batches in the original order. #[test] - #[ignore = "Bug: update() uses strict schema equality including column order (Issue 7)"] + fn streaming_reordered_columns_produce_same_hash() { let schema_ab = Schema::new(vec![ Field::new("a", DataType::Int32, false), diff --git a/tests/digest_bytes.rs b/tests/digest_bytes.rs index 5c6016f..f1df3c3 100644 --- a/tests/digest_bytes.rs +++ b/tests/digest_bytes.rs @@ -1,2 +1,977 @@ +/// Manual byte-level verification tests for the Starfix hashing specification. +/// +/// Each test in this module manually computes the expected SHA-256 hash by +/// feeding the exact bytes described in `docs/byte-layout-spec.md` into a +/// fresh SHA-256 hasher, then asserts that the library produces the identical +/// result. This serves as both a conformance check and a reference +/// implementation for anyone porting Starfix to another language. 
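+/// +/// As a quick orientation (a sketch of the byte flow; Example C below is the executable version): hashing the non-nullable array Int32Array [1, 2, 3] reduces to version ++ SHA-256(b"\"Int32\"" ++ SHA-256(le(1) ++ le(2) ++ le(3))), where le(x) is the 4-byte little-endian encoding and ++ is byte concatenation.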
#[cfg(test)] -mod tests {} +mod tests { + #![expect(clippy::unwrap_used, reason = "Okay in test")] + #![expect( + clippy::similar_names, + reason = "child_a/child_b naming is clear in test context" + )] + #![expect(clippy::redundant_clone, reason = "Clones for clarity in test setup")] + #![expect(clippy::absolute_paths, reason = "One-off use in test")] + #![expect( + clippy::big_endian_bytes, + reason = "Starfix spec requires BE serialization of validity words" + )] + + use std::sync::Arc; + + use arrow::array::{ + ArrayRef, BinaryArray, BooleanArray, Int32Array, LargeListArray, LargeStringArray, + RecordBatch, StringArray, StructArray, + }; + use arrow::buffer::NullBuffer; + use arrow_schema::{DataType, Field, Schema}; + use sha2::{Digest as _, Sha256}; + use starfix::ArrowDigester; + + const VERSION: [u8; 3] = [0x00, 0x00, 0x01]; + + // ── Helper ─────────────────────────────────────────────────────────── + + /// Prepend the 3-byte version prefix to a 32-byte SHA-256 digest, + /// returning the full 35-byte Starfix hash. + fn with_version(digest: Vec<u8>) -> Vec<u8> { + let mut out = VERSION.to_vec(); + out.extend(digest); + out + } + + // ══════════════════════════════════════════════════════════════════════ + // Example A: Simple Two-Column Table (record batch) + // Schema: {age: Int32 non-nullable, name: LargeUtf8 nullable} + // Row 0: age=25, name="Alice" + // Row 1: age=30, name=NULL + // ══════════════════════════════════════════════════════════════════════ + + #[test] + fn example_a_two_column_table() { + // ── Build the table ────────────────────────────────────────────── + let schema = Schema::new(vec![ + Field::new("age", DataType::Int32, false), + Field::new("name", DataType::LargeUtf8, true), + ]); + let batch = RecordBatch::try_new( + Arc::new(schema.clone()), + vec![ + Arc::new(Int32Array::from(vec![25_i32, 30])) as ArrayRef, + Arc::new(LargeStringArray::from(vec![Some("Alice"), None])) as ArrayRef, + ], + ) + .unwrap(); + + // ── Step 1: Schema digest ──────────────────────────────────────── + let schema_json = r#"{"age":{"data_type":"Int32","nullable":false},"name":{"data_type":"LargeUtf8","nullable":true}}"#; + let schema_digest = Sha256::digest(schema_json.as_bytes()); + + // Verify the library agrees on schema hash + assert_eq!( + ArrowDigester::hash_schema(&schema), + with_version(schema_digest.to_vec()), + "Schema hash mismatch — canonical JSON may differ" + ); + + // ── Step 2: Field "age" (Int32, non-nullable) ──────────────────── + // Values: [25, 30] → little-endian bytes + let mut age_data = Sha256::new(); + age_data.update(25_i32.to_le_bytes()); // 19 00 00 00 + age_data.update(30_i32.to_le_bytes()); // 1e 00 00 00 + let age_data_finalized = age_data.finalize(); + + // ── Step 3: Field "name" (LargeUtf8, nullable) ─────────────────── + // Values: ["Alice", NULL] + // + // Validity BitVec (Lsb0, usize storage): + // bit 0 = 1 (valid), bit 1 = 0 (null) + // → usize word = 0b01 = 1 + // bit_count = 2 + let bit_count: usize = 2; + let validity_word: usize = 1; // bits: [1, 0] in Lsb0 + + // Data bytes (only valid elements): + // "Alice" → len=5 as u64 LE, then UTF-8 bytes + // NULL → skipped + let mut name_data = Sha256::new(); + name_data.update(5_u64.to_le_bytes()); // length prefix + name_data.update(b"Alice"); // raw UTF-8 bytes + // NULL element: nothing fed + let name_data_finalized = name_data.finalize(); + + // ── Step 4: Final combination ──────────────────────────────────── + // Fields in alphabetical order: "age", "name" + let mut final_digest = Sha256::new(); + 
+ // Schema + final_digest.update(schema_digest); + + // Field "age" (non-nullable → just the data digest) + final_digest.update(age_data_finalized); + + // Field "name" (nullable → bit_count + validity words + data digest) + final_digest.update(bit_count.to_le_bytes()); // 02 00 00 00 00 00 00 00 + final_digest.update(validity_word.to_be_bytes()); // 00 00 00 00 00 00 00 01 + final_digest.update(name_data_finalized); + + let expected = with_version(final_digest.finalize().to_vec()); + + // ── Verify ─────────────────────────────────────────────────────── + assert_eq!( + ArrowDigester::hash_record_batch(&batch), + expected, + "Example A: two-column table hash mismatch" + ); + } + + // ══════════════════════════════════════════════════════════════════════ + // Example B: Boolean Array with Nulls (hash_array API) + // BooleanArray [true, NULL, false, true] (nullable) + // ══════════════════════════════════════════════════════════════════════ + + #[test] + fn example_b_boolean_array_with_nulls() { + let array = BooleanArray::from(vec![Some(true), None, Some(false), Some(true)]); + + // ── Type metadata ──────────────────────────────────────────────── + // data_type_to_value(Boolean) → JSON value "Boolean" + // serde_json::to_string(json!("Boolean")) → "\"Boolean\"" + let type_json = b"\"Boolean\""; + + // ── Validity bits (Lsb0, usize storage) ───────────────────────── + // [valid, null, valid, valid] → bits [1, 0, 1, 1] + // Lsb0 in usize: bit0=1, bit1=0, bit2=1, bit3=1 → 0b1101 = 13 + let bit_count: usize = 4; + let validity_word: usize = 0b1101; // = 13 + + // ── Data bits (Msb0 packed, valid values only) ─────────────────── + // Valid values: [true, false, true] → 3 bits + // Msb0: bit7=1(true), bit6=0(false), bit5=1(true), bits4-0=0 + // Byte: 0b1010_0000 = 0xA0 + let mut data_digest = Sha256::new(); + data_digest.update([0xA0_u8]); + let data_finalized = data_digest.finalize(); + + // ── Final combination ──────────────────────────────────────────── + let mut final_digest = Sha256::new(); + final_digest.update(type_json); + // Nullable finalization + final_digest.update(bit_count.to_le_bytes()); + final_digest.update(validity_word.to_be_bytes()); + final_digest.update(data_finalized); + + let expected = with_version(final_digest.finalize().to_vec()); + + assert_eq!( + ArrowDigester::hash_array(&array), + expected, + "Example B: boolean array hash mismatch" + ); + } + + // ══════════════════════════════════════════════════════════════════════ + // Example C: Non-Nullable Int32 Array (hash_array API) + // Int32Array [1, 2, 3] (non-nullable) + // ══════════════════════════════════════════════════════════════════════ + + #[test] + fn example_c_non_nullable_int32_array() { + let array = Int32Array::from(vec![1_i32, 2, 3]); + + // ── Type metadata ──────────────────────────────────────────────── + let type_json = b"\"Int32\""; + + // ── Data (contiguous LE buffer) ────────────────────────────────── + // [1, 2, 3] as i32 LE: + // 01 00 00 00 02 00 00 00 03 00 00 00 + let mut data_digest = Sha256::new(); + data_digest.update(1_i32.to_le_bytes()); + data_digest.update(2_i32.to_le_bytes()); + data_digest.update(3_i32.to_le_bytes()); + let data_finalized = data_digest.finalize(); + + // ── Final (non-nullable) ───────────────────────────────────────── + let mut final_digest = Sha256::new(); + final_digest.update(type_json); + final_digest.update(data_finalized); + + let expected = with_version(final_digest.finalize().to_vec()); + + assert_eq!( + ArrowDigester::hash_array(&array), + expected, + 
"Example C: non-nullable int32 array hash mismatch" + ); + } + + // ══════════════════════════════════════════════════════════════════════ + // Example D: Non-Nullable Binary Array (hash_array API) + // BinaryArray [b"hi", b""] (non-nullable) + // Tests type canonicalization: Binary → LargeBinary + // ══════════════════════════════════════════════════════════════════════ + + #[test] + fn example_d_non_nullable_binary_array() { + let array = BinaryArray::from(vec![b"hi".as_ref(), b"".as_ref()]); + + // ── Type metadata (canonicalized) ──────────────────────────────── + // Binary → LargeBinary in canonical form + let type_json = b"\"LargeBinary\""; + + // ── Data ───────────────────────────────────────────────────────── + // b"hi": len=2 as u64 LE + raw bytes + // b"": len=0 as u64 LE + (no bytes) + let mut data_digest = Sha256::new(); + data_digest.update(2_u64.to_le_bytes()); // 02 00 00 00 00 00 00 00 + data_digest.update(b"hi"); // 68 69 + data_digest.update(0_u64.to_le_bytes()); // 00 00 00 00 00 00 00 00 + let data_finalized = data_digest.finalize(); + + // ── Final (non-nullable) ───────────────────────────────────────── + let mut final_digest = Sha256::new(); + final_digest.update(type_json); + final_digest.update(data_finalized); + + let expected = with_version(final_digest.finalize().to_vec()); + + assert_eq!( + ArrowDigester::hash_array(&array), + expected, + "Example D: non-nullable binary array hash mismatch" + ); + } + + // ══════════════════════════════════════════════════════════════════════ + // Example E: Column-Order Independence + // Batch 1: columns [x: Int32, y: Boolean nullable] → x=10, y=true + // Batch 2: columns [y: Boolean nullable, x: Int32] → y=true, x=10 + // Both must produce the same hash. + // ══════════════════════════════════════════════════════════════════════ + + #[test] + fn example_e_column_order_independence() { + let ints = Arc::new(Int32Array::from(vec![10_i32])) as ArrayRef; + let bools = Arc::new(BooleanArray::from(vec![Some(true)])) as ArrayRef; + + let batch_xy = RecordBatch::try_new( + Arc::new(Schema::new(vec![ + Field::new("x", DataType::Int32, false), + Field::new("y", DataType::Boolean, true), + ])), + vec![Arc::clone(&ints), Arc::clone(&bools)], + ) + .unwrap(); + + let batch_yx = RecordBatch::try_new( + Arc::new(Schema::new(vec![ + Field::new("y", DataType::Boolean, true), + Field::new("x", DataType::Int32, false), + ])), + vec![Arc::clone(&bools), Arc::clone(&ints)], + ) + .unwrap(); + + // ── Manual computation ─────────────────────────────────────────── + let schema_json = r#"{"x":{"data_type":"Int32","nullable":false},"y":{"data_type":"Boolean","nullable":true}}"#; + let schema_digest = Sha256::digest(schema_json.as_bytes()); + + // Field "x" (Int32, non-nullable): value 10 + let mut x_data = Sha256::new(); + x_data.update(10_i32.to_le_bytes()); // 0a 00 00 00 + let x_finalized = x_data.finalize(); + + // Field "y" (Boolean, nullable): value true (valid) + // Validity: [1] → bit_count=1, word=1 (Lsb0) + // Data: [true] Msb0 → bit7=1 → 0x80 + let bit_count: usize = 1; + let validity_word: usize = 1; + + let mut y_data = Sha256::new(); + y_data.update([0x80_u8]); // true in Msb0 = 1000_0000 + let y_finalized = y_data.finalize(); + + // Final combination: schema, then fields alphabetically (x, y) + let mut final_digest = Sha256::new(); + final_digest.update(schema_digest); + // x (non-nullable) + final_digest.update(x_finalized); + // y (nullable) + final_digest.update(bit_count.to_le_bytes()); + 
final_digest.update(validity_word.to_be_bytes()); + final_digest.update(y_finalized); + + let expected = with_version(final_digest.finalize().to_vec()); + + // ── Verify both column orderings produce the same hash ─────────── + let hash_xy = ArrowDigester::hash_record_batch(&batch_xy); + let hash_yx = ArrowDigester::hash_record_batch(&batch_yx); + + assert_eq!(hash_xy, hash_yx, "Column order should not affect hash"); + assert_eq!( + hash_xy, expected, + "Example E: column-order independence hash mismatch" + ); + } + + // ══════════════════════════════════════════════════════════════════════ + // Example F: Type Equivalence (Utf8 vs LargeUtf8, hash_array API) + // StringArray ["ab"] (Utf8, non-nullable) + // LargeStringArray ["ab"] (LargeUtf8, non-nullable) + // Both must produce the same hash. + // ══════════════════════════════════════════════════════════════════════ + + #[test] + fn example_f_utf8_large_utf8_equivalence() { + let small = StringArray::from(vec!["ab"]); + let large = LargeStringArray::from(vec!["ab"]); + + // ── Manual computation ─────────────────────────────────────────── + // Type metadata: both canonicalize to "LargeUtf8" + let type_json = b"\"LargeUtf8\""; + + // Data: "ab" → len=2 as u64 LE + UTF-8 bytes + let mut data_digest = Sha256::new(); + data_digest.update(2_u64.to_le_bytes()); + data_digest.update(b"ab"); + let data_finalized = data_digest.finalize(); + + let mut final_digest = Sha256::new(); + final_digest.update(type_json); + final_digest.update(data_finalized); + + let expected = with_version(final_digest.finalize().to_vec()); + + assert_eq!( + ArrowDigester::hash_array(&small), + expected, + "Example F: Utf8 hash mismatch" + ); + assert_eq!( + ArrowDigester::hash_array(&large), + expected, + "Example F: LargeUtf8 hash mismatch" + ); + } + + // ══════════════════════════════════════════════════════════════════════ + // Example G: Nullable Int32 Array with Nulls (hash_array API) + // Int32Array [Some(42), None, Some(-7), Some(0)] + // Tests nullable fixed-size path with actual nulls. 
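+ // Null slots contribute a 0 validity bit but no data bytes (see section 3.1 of the spec).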
+ // ══════════════════════════════════════════════════════════════════════ + + #[test] + fn example_g_nullable_int32_with_nulls() { + let array = Int32Array::from(vec![Some(42), None, Some(-7), Some(0)]); + + // ── Type metadata ──────────────────────────────────────────────── + let type_json = b"\"Int32\""; + + // ── Validity bits (Lsb0, usize) ───────────────────────────────── + // [valid, null, valid, valid] → bits [1, 0, 1, 1] → 0b1101 = 13 + let bit_count: usize = 4; + let validity_word: usize = 0b1101; // 13 + + // ── Data (only valid elements, in order) ───────────────────────── + // 42 as i32 LE: 2a 00 00 00 + // -7 as i32 LE: f9 ff ff ff + // 0 as i32 LE: 00 00 00 00 + let mut data_digest = Sha256::new(); + data_digest.update(42_i32.to_le_bytes()); + data_digest.update((-7_i32).to_le_bytes()); + data_digest.update(0_i32.to_le_bytes()); + let data_finalized = data_digest.finalize(); + + // ── Final (nullable) ───────────────────────────────────────────── + let mut final_digest = Sha256::new(); + final_digest.update(type_json); + final_digest.update(bit_count.to_le_bytes()); + final_digest.update(validity_word.to_be_bytes()); + final_digest.update(data_finalized); + + let expected = with_version(final_digest.finalize().to_vec()); + + assert_eq!( + ArrowDigester::hash_array(&array), + expected, + "Example G: nullable int32 array hash mismatch" + ); + } + + // ══════════════════════════════════════════════════════════════════════ + // Example H: Nullable String Array with Nulls (hash_array API) + // StringArray [Some("hello"), None, Some("world"), Some("")] + // Tests nullable variable-length path with type canonicalization. + // ══════════════════════════════════════════════════════════════════════ + + #[test] + fn example_h_nullable_string_array_with_nulls() { + let array = StringArray::from(vec![Some("hello"), None, Some("world"), Some("")]); + + // ── Type metadata (canonicalized) ──────────────────────────────── + // Utf8 → LargeUtf8 + let type_json = b"\"LargeUtf8\""; + + // ── Validity bits (Lsb0, usize) ───────────────────────────────── + // [valid, null, valid, valid] → bits [1, 0, 1, 1] → 0b1101 = 13 + let bit_count: usize = 4; + let validity_word: usize = 0b1101; + + // ── Data (only valid elements) ─────────────────────────────────── + // "hello" → len=5 u64 LE + "hello" + // "world" → len=5 u64 LE + "world" + // "" → len=0 u64 LE + let mut data_digest = Sha256::new(); + data_digest.update(5_u64.to_le_bytes()); + data_digest.update(b"hello"); + // NULL: skipped + data_digest.update(5_u64.to_le_bytes()); + data_digest.update(b"world"); + data_digest.update(0_u64.to_le_bytes()); + let data_finalized = data_digest.finalize(); + + // ── Final (nullable) ───────────────────────────────────────────── + let mut final_digest = Sha256::new(); + final_digest.update(type_json); + final_digest.update(bit_count.to_le_bytes()); + final_digest.update(validity_word.to_be_bytes()); + final_digest.update(data_finalized); + + let expected = with_version(final_digest.finalize().to_vec()); + + assert_eq!( + ArrowDigester::hash_array(&array), + expected, + "Example H: nullable string array hash mismatch" + ); + } + + // ══════════════════════════════════════════════════════════════════════ + // Example I: Empty Table (schema only, no data) + // Tests that finalize() on a fresh digester with no update() calls + // produces schema_digest + empty field digests. 
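+ // (SHA-256 of empty input is the well-known e3b0c442... constant.)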
+ // ══════════════════════════════════════════════════════════════════════ + + #[test] + fn example_i_empty_table() { + let schema = Schema::new(vec![ + Field::new("a", DataType::Int32, false), + Field::new("b", DataType::Boolean, true), + ]); + + // ── Schema digest ──────────────────────────────────────────────── + let schema_json = r#"{"a":{"data_type":"Int32","nullable":false},"b":{"data_type":"Boolean","nullable":true}}"#; + let schema_digest = Sha256::digest(schema_json.as_bytes()); + + // ── Field "a" (Int32, non-nullable): no data fed ───────────────── + // data_digest = SHA-256() with no updates → SHA-256 of empty input + let a_data_finalized = Sha256::digest(b""); + + // ── Field "b" (Boolean, nullable): no data fed ─────────────────── + // bit_count = 0 (no elements) + // as_raw_slice() = [] (no words) + // data_digest = SHA-256 of empty input + let bit_count: usize = 0; + let b_data_finalized = Sha256::digest(b""); + + // ── Final ──────────────────────────────────────────────────────── + let mut final_digest = Sha256::new(); + final_digest.update(schema_digest); + // Field "a" (non-nullable) + final_digest.update(a_data_finalized); + // Field "b" (nullable) — bit_count=0, no words, empty data digest + final_digest.update(bit_count.to_le_bytes()); + // no validity words (raw_slice is empty for 0-length BitVec) + final_digest.update(b_data_finalized); + + let expected = with_version(final_digest.finalize().to_vec()); + + let digester = ArrowDigester::new(schema); + assert_eq!( + digester.finalize(), + expected, + "Example I: empty table hash mismatch" + ); + } + + // ══════════════════════════════════════════════════════════════════════ + // Example J: Multi-Batch Streaming + // Feeding two small batches must produce the same hash as feeding + // one combined batch (batch-split independence). 
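+ // (SHA-256 state carries across update() calls, so no per-batch framing bytes exist to distinguish the two paths.)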
+ // Schema: {v: Int32 non-nullable} + // Batch 1: [1, 2] + // Batch 2: [3] + // Combined: [1, 2, 3] + // ══════════════════════════════════════════════════════════════════════ + + #[test] + fn example_j_multi_batch_streaming() { + let schema = Schema::new(vec![Field::new("v", DataType::Int32, false)]); + + // ── Two-batch path ─────────────────────────────────────────────── + let batch1 = RecordBatch::try_new( + Arc::new(schema.clone()), + vec![Arc::new(Int32Array::from(vec![1_i32, 2])) as ArrayRef], + ) + .unwrap(); + let batch2 = RecordBatch::try_new( + Arc::new(schema.clone()), + vec![Arc::new(Int32Array::from(vec![3_i32])) as ArrayRef], + ) + .unwrap(); + + let mut digester_stream = ArrowDigester::new(schema.clone()); + digester_stream.update(&batch1); + digester_stream.update(&batch2); + let hash_stream = digester_stream.finalize(); + + // ── Single-batch path ──────────────────────────────────────────── + let combined = RecordBatch::try_new( + Arc::new(schema), + vec![Arc::new(Int32Array::from(vec![1_i32, 2, 3])) as ArrayRef], + ) + .unwrap(); + let hash_combined = ArrowDigester::hash_record_batch(&combined); + + assert_eq!( + hash_stream, hash_combined, + "Streaming two batches should equal single combined batch" + ); + + // ── Manual computation ─────────────────────────────────────────── + let schema_json = r#"{"v":{"data_type":"Int32","nullable":false}}"#; + let schema_digest = Sha256::digest(schema_json.as_bytes()); + + // Field "v": data is [1, 2, 3] as i32 LE — accumulated across batches + // The digester is streaming, so it updates the same SHA-256 state: + // update(01 00 00 00 02 00 00 00) from batch 1 + // update(03 00 00 00) from batch 2 + // SHA-256 is incremental, so this is identical to hashing all 12 bytes at once. + let mut v_data = Sha256::new(); + v_data.update(1_i32.to_le_bytes()); + v_data.update(2_i32.to_le_bytes()); + v_data.update(3_i32.to_le_bytes()); + let v_finalized = v_data.finalize(); + + let mut final_digest = Sha256::new(); + final_digest.update(schema_digest); + final_digest.update(v_finalized); + + let expected = with_version(final_digest.finalize().to_vec()); + + assert_eq!( + hash_stream, expected, + "Example J: multi-batch streaming hash mismatch" + ); + } + + // ══════════════════════════════════════════════════════════════════════ + // Example K: Struct Column in a Record Batch + // Schema: {person: Struct non-nullable} + // Row 0: {age: 25, name: "Alice"} + // Row 1: {age: 30, name: "Bob"} + // + // In the record-batch path, struct fields are decomposed into leaf + // fields: "person/age" and "person/name", each hashed independently. 
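+ // Leaf names join parent and child with "/", so they sort as "person/age" < "person/name".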
+ // ══════════════════════════════════════════════════════════════════════ + + #[test] + fn example_k_struct_column_in_record_batch() { + // ── Build the table ────────────────────────────────────────────── + let age = Arc::new(Int32Array::from(vec![25_i32, 30])) as ArrayRef; + let name = Arc::new(LargeStringArray::from(vec!["Alice", "Bob"])) as ArrayRef; + let struct_array = StructArray::from(vec![ + ( + Arc::new(Field::new("age", DataType::Int32, false)), + Arc::clone(&age), + ), + ( + Arc::new(Field::new("name", DataType::LargeUtf8, false)), + Arc::clone(&name), + ), + ]); + + let schema = Schema::new(vec![Field::new( + "person", + DataType::Struct( + vec![ + Field::new("age", DataType::Int32, false), + Field::new("name", DataType::LargeUtf8, false), + ] + .into(), + ), + false, + )]); + let batch = RecordBatch::try_new( + Arc::new(schema.clone()), + vec![Arc::new(struct_array) as ArrayRef], + ) + .unwrap(); + + // ── Step 1: Schema digest ──────────────────────────────────────── + // Canonical JSON: struct fields sorted by name, keys sorted recursively + // "person" has data_type: {"Struct": [{"data_type": "Int32", "name": "age", "nullable": false}, + // {"data_type": "LargeUtf8", "name": "name", "nullable": false}]} + let schema_json = r#"{"person":{"data_type":{"Struct":[{"data_type":"Int32","name":"age","nullable":false},{"data_type":"LargeUtf8","name":"name","nullable":false}]},"nullable":false}}"#; + let schema_digest = Sha256::digest(schema_json.as_bytes()); + + assert_eq!( + ArrowDigester::hash_schema(&schema), + with_version(schema_digest.to_vec()), + "Example K: schema hash mismatch" + ); + + // ── Step 2: Leaf field "person/age" (Int32, non-nullable) ──────── + // Values: [25, 30] as i32 LE + let mut age_data = Sha256::new(); + age_data.update(25_i32.to_le_bytes()); + age_data.update(30_i32.to_le_bytes()); + let age_data_finalized = age_data.finalize(); + + // ── Step 3: Leaf field "person/name" (LargeUtf8, non-nullable) ─── + // Values: ["Alice", "Bob"] + let mut name_data = Sha256::new(); + name_data.update(5_u64.to_le_bytes()); // "Alice" length + name_data.update(b"Alice"); + name_data.update(3_u64.to_le_bytes()); // "Bob" length + name_data.update(b"Bob"); + let name_data_finalized = name_data.finalize(); + + // ── Step 4: Final combination ──────────────────────────────────── + // Fields alphabetically: "person/age", "person/name" + let mut final_digest = Sha256::new(); + final_digest.update(schema_digest); + // "person/age" (non-nullable): just data digest + final_digest.update(age_data_finalized); + // "person/name" (non-nullable): just data digest + final_digest.update(name_data_finalized); + + let expected = with_version(final_digest.finalize().to_vec()); + + assert_eq!( + ArrowDigester::hash_record_batch(&batch), + expected, + "Example K: struct column record batch hash mismatch" + ); + } + + // ══════════════════════════════════════════════════════════════════════ + // Example L: Struct Array via hash_array (non-nullable struct) + // StructArray [{a: 1, b: true}, {a: 2, b: false}] + // Children: a: Int32 non-null, b: Boolean non-null + // + // In hash_array, the struct is hashed compositely: + // type_json + data where data = finalized(child_a) || finalized(child_b) + // ══════════════════════════════════════════════════════════════════════ + + #[test] + fn example_l_struct_array_hash_array() { + let a = Arc::new(Int32Array::from(vec![1_i32, 2])) as ArrayRef; + let b = Arc::new(BooleanArray::from(vec![true, false])) as ArrayRef; + let struct_array = 
StructArray::from(vec![ + ( + Arc::new(Field::new("a", DataType::Int32, false)), + Arc::clone(&a), + ), + ( + Arc::new(Field::new("b", DataType::Boolean, false)), + Arc::clone(&b), + ), + ]); + + // ── Type metadata ──────────────────────────────────────────────── + // Canonical: {"Struct":[{"data_type":"Int32","name":"a","nullable":false}, + // {"data_type":"Boolean","name":"b","nullable":false}]} + let type_json = r#"{"Struct":[{"data_type":"Int32","name":"a","nullable":false},{"data_type":"Boolean","name":"b","nullable":false}]}"#; + + // ── Child "a" (Int32, non-nullable) ────────────────────────────── + // Values: [1, 2] + let mut child_a_data = Sha256::new(); + child_a_data.update(1_i32.to_le_bytes()); + child_a_data.update(2_i32.to_le_bytes()); + let child_a_finalized = child_a_data.finalize(); + + // ── Child "b" (Boolean, non-nullable) ──────────────────────────── + // Values: [true, false] → Msb0: bit7=1(true), bit6=0(false) → 0x80 + let mut child_b_data = Sha256::new(); + child_b_data.update([0x80_u8]); + let child_b_finalized = child_b_data.finalize(); + + // ── Parent data digest ─────────────────────────────────────────── + // Children sorted by name: "a" then "b" + // Each child is non-nullable, so finalized = SHA256(data).finalize() (32 bytes) + let mut parent_data = Sha256::new(); + // Child "a" finalized (non-nullable → just data digest) + parent_data.update(child_a_finalized); + // Child "b" finalized (non-nullable → just data digest) + parent_data.update(child_b_finalized); + let parent_data_finalized = parent_data.finalize(); + + // ── Final combination ──────────────────────────────────────────── + // Struct is non-nullable → NonNullable finalization + let mut final_digest = Sha256::new(); + final_digest.update(type_json.as_bytes()); + final_digest.update(parent_data_finalized); + + let expected = with_version(final_digest.finalize().to_vec()); + + assert_eq!( + ArrowDigester::hash_array(&struct_array), + expected, + "Example L: struct array hash_array mismatch" + ); + } + + // ══════════════════════════════════════════════════════════════════════ + // Example M: Nullable Struct Array via hash_array (struct-level nulls) + // StructArray [Some({a: 10, b: "x"}), None, Some({a: 30, b: "z"})] + // Struct is nullable. Children: a: Int32 non-null, b: LargeUtf8 non-null + // + // Struct-level nulls propagate to children: at row 1 (null struct), + // children's data is undefined and must be skipped. 
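+ // Each child is therefore hashed with the combined (struct AND child) validity, as computed below.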
+ // ══════════════════════════════════════════════════════════════════════ + + #[test] + fn example_m_nullable_struct_array_hash_array() { + // Build a nullable struct array with a null at row 1 + let a = Int32Array::from(vec![10_i32, 0, 30]); // row 1 value is undefined (0 placeholder) + let b = LargeStringArray::from(vec!["x", "", "z"]); // row 1 value is undefined + let struct_array = StructArray::from(( + vec![ + ( + Arc::new(Field::new("a", DataType::Int32, false)), + Arc::new(a) as ArrayRef, + ), + ( + Arc::new(Field::new("b", DataType::LargeUtf8, false)), + Arc::new(b) as ArrayRef, + ), + ], + // Struct-level validity: [valid, null, valid] + // Buffer from NullBuffer: true=valid, false=null + NullBuffer::from(vec![true, false, true]) + .into_inner() + .into_inner(), + )); + + // ── Type metadata ──────────────────────────────────────────────── + let type_json = r#"{"Struct":[{"data_type":"Int32","name":"a","nullable":false},{"data_type":"LargeUtf8","name":"b","nullable":false}]}"#; + + // ── Struct-level validity (Lsb0, usize) ───────────────────────── + // [valid, null, valid] → bits [1, 0, 1] → 0b101 = 5 + let struct_bit_count: usize = 3; + let struct_validity_word: usize = 0b101; // 5 + + // ── Child "a" (Int32, effectively nullable due to struct nulls) ── + // Combined validity: struct AND child = [1, 0, 1] (child has no nulls of its own) + // Valid data: [10, 30] (row 1 skipped) + let child_a_bit_count: usize = 3; + let child_a_validity_word: usize = 0b101; + + let mut child_a_data = Sha256::new(); + child_a_data.update(10_i32.to_le_bytes()); + // row 1: skipped (null) + child_a_data.update(30_i32.to_le_bytes()); + let child_a_data_finalized = child_a_data.finalize(); + + // ── Child "b" (LargeUtf8, effectively nullable due to struct nulls) + let child_b_bit_count: usize = 3; + let child_b_validity_word: usize = 0b101; + + let mut child_b_data = Sha256::new(); + child_b_data.update(1_u64.to_le_bytes()); // "x" len + child_b_data.update(b"x"); + // row 1: skipped (null) + child_b_data.update(1_u64.to_le_bytes()); // "z" len + child_b_data.update(b"z"); + let child_b_data_finalized = child_b_data.finalize(); + + // ── Parent data digest ─────────────────────────────────────────── + // Children sorted by name: "a", "b" + // Each child is effectively nullable → finalized as: + // bit_count LE + validity_words BE + data_digest.finalize() + let mut parent_data = Sha256::new(); + // Child "a" finalized (nullable) + parent_data.update(child_a_bit_count.to_le_bytes()); + parent_data.update(child_a_validity_word.to_be_bytes()); + parent_data.update(child_a_data_finalized); + // Child "b" finalized (nullable) + parent_data.update(child_b_bit_count.to_le_bytes()); + parent_data.update(child_b_validity_word.to_be_bytes()); + parent_data.update(child_b_data_finalized); + let parent_data_finalized = parent_data.finalize(); + + // ── Final combination ──────────────────────────────────────────── + // Struct is nullable → parent finalization includes struct validity + let mut final_digest = Sha256::new(); + final_digest.update(type_json.as_bytes()); + // Struct-level nullable finalization + final_digest.update(struct_bit_count.to_le_bytes()); + final_digest.update(struct_validity_word.to_be_bytes()); + final_digest.update(parent_data_finalized); + + let expected = with_version(final_digest.finalize().to_vec()); + + assert_eq!( + ArrowDigester::hash_array(&struct_array), + expected, + "Example M: nullable struct array hash_array mismatch" + ); + } + + // 
══════════════════════════════════════════════════════════════════════ + // Example N: List-of-Struct in a Record Batch + // Schema: {items: LargeList<Struct> nullable} + // Row 0: [{id: 1, label: "a"}, {id: 2, label: "b"}] (2 elements) + // Row 1: [{id: 3, label: "c"}] (1 element) + // + // The list column is decomposed into leaf fields: + // "items" in the BTreeMap (the list field itself, not its inner struct fields). + // But the list's sub-arrays ARE struct arrays, which are now hashed + // compositely via array_digest_update(Struct). + // ══════════════════════════════════════════════════════════════════════ + + #[test] + fn example_n_list_of_struct_record_batch() { + // ── Build the table ────────────────────────────────────────────── + let struct_fields = vec![ + Field::new("id", DataType::Int32, false), + Field::new("label", DataType::LargeUtf8, false), + ]; + let inner_struct_field = Field::new( + "item", + DataType::Struct(struct_fields.clone().into()), + false, + ); + let list_field = Field::new( + "items", + DataType::LargeList(Arc::new(inner_struct_field.clone())), + true, + ); + let schema = Schema::new(vec![list_field.clone()]); + + // Build struct sub-arrays + // Row 0: [{id:1, label:"a"}, {id:2, label:"b"}], Row 1: [{id:3, label:"c"}] + // Total struct rows: 3 (ids: [1,2,3], labels: ["a","b","c"]) + let ids = Int32Array::from(vec![1_i32, 2, 3]); + let labels = LargeStringArray::from(vec!["a", "b", "c"]); + let struct_array = StructArray::from(vec![ + ( + Arc::new(Field::new("id", DataType::Int32, false)), + Arc::new(ids) as ArrayRef, + ), + ( + Arc::new(Field::new("label", DataType::LargeUtf8, false)), + Arc::new(labels) as ArrayRef, + ), + ]); + + // Build large list array with offsets [0, 2, 3] + let list_array = LargeListArray::new( + Arc::new(inner_struct_field), + arrow::buffer::OffsetBuffer::new(vec![0_i64, 2, 3].into()), + Arc::new(struct_array) as ArrayRef, + None, // all list elements valid + ); + + let batch = RecordBatch::try_new( + Arc::new(schema.clone()), + vec![Arc::new(list_array) as ArrayRef], + ) + .unwrap(); + + // ── Step 1: Schema digest ──────────────────────────────────────── + // Canonical: element type has no name (element_type_to_value drops "item") + // The inner struct's data_type is {"Struct": [sorted children]} + let schema_json = r#"{"items":{"data_type":{"LargeList":{"data_type":{"Struct":[{"data_type":"Int32","name":"id","nullable":false},{"data_type":"LargeUtf8","name":"label","nullable":false}]},"nullable":false}},"nullable":true}}"#; + let schema_digest = Sha256::digest(schema_json.as_bytes()); + + assert_eq!( + ArrowDigester::hash_schema(&schema), + with_version(schema_digest.to_vec()), + "Example N: schema hash mismatch" + ); + + // ── Step 2: Field "items" (LargeList, nullable) ────────── + // + // With structural hashing, list sizes go to a separate structural digest, + // while leaf data (struct composites) goes to the data/leaf digest. + // + // The BitVec accumulates ALL null bits from the list AND its sub-arrays. 
+ // List-level: handle_null_bits(list) → [1, 1] (both list elements valid) + // Then for each list element, the struct sub-array also pushes its validity: + // Element 0 struct (2 rows, no nulls): → [1, 1] + // Element 1 struct (1 row, no nulls): → [1] + // Total BitVec: [1, 1, 1, 1, 1] → 5 bits, all valid + let items_bit_count: usize = 5; + let items_validity_word: usize = 0b11111; // 31 + + // ── Structural digest: element counts (sizes) ──────────────────── + let mut items_structural = Sha256::new(); + items_structural.update(2_u64.to_le_bytes()); // element 0 has 2 struct rows + items_structural.update(1_u64.to_le_bytes()); // element 1 has 1 struct row + let items_structural_finalized = items_structural.finalize(); + + // ── Data/leaf digest: struct composites (no size prefixes) ──────── + // + // --- List element 0: [{id:1,label:"a"}, {id:2,label:"b"}] (2 rows) --- + // Struct composite: children sorted by name: "id" then "label" + // No struct-level nulls, children are non-nullable + // + // Child "id" (Int32, non-null): values [1, 2] + let mut e0_child_id_data = Sha256::new(); + e0_child_id_data.update(1_i32.to_le_bytes()); + e0_child_id_data.update(2_i32.to_le_bytes()); + let e0_child_id_finalized = e0_child_id_data.finalize(); + + // Child "label" (LargeUtf8, non-null): values ["a", "b"] + let mut e0_child_label_data = Sha256::new(); + e0_child_label_data.update(1_u64.to_le_bytes()); // "a" len + e0_child_label_data.update(b"a"); + e0_child_label_data.update(1_u64.to_le_bytes()); // "b" len + e0_child_label_data.update(b"b"); + let e0_child_label_finalized = e0_child_label_data.finalize(); + + // --- List element 1: [{id:3,label:"c"}] (1 row) --- + // Child "id": values [3] + let mut e1_child_id_data = Sha256::new(); + e1_child_id_data.update(3_i32.to_le_bytes()); + let e1_child_id_finalized = e1_child_id_data.finalize(); + + // Child "label": values ["c"] + let mut e1_child_label_data = Sha256::new(); + e1_child_label_data.update(1_u64.to_le_bytes()); // "c" len + e1_child_label_data.update(b"c"); + let e1_child_label_finalized = e1_child_label_data.finalize(); + + // Build leaf digest: struct composites for each list element + let mut items_data = Sha256::new(); + // List element 0: struct children finalized into data (no size prefix here) + items_data.update(e0_child_id_finalized); // non-nullable child: 32 bytes + items_data.update(e0_child_label_finalized); // non-nullable child: 32 bytes + // List element 1: struct children finalized into data + items_data.update(e1_child_id_finalized); + items_data.update(e1_child_label_finalized); + let items_data_finalized = items_data.finalize(); + + // ── Step 3: Final combination ──────────────────────────────────── + // For list fields (nullable): bit_count + validity_words + structural_digest + data_digest + let mut final_digest = Sha256::new(); + final_digest.update(schema_digest); + // "items" (nullable, structured): null bits + structural + leaf + final_digest.update(items_bit_count.to_le_bytes()); + final_digest.update(items_validity_word.to_be_bytes()); + final_digest.update(items_structural_finalized); + final_digest.update(items_data_finalized); + + let expected = with_version(final_digest.finalize().to_vec()); + + assert_eq!( + ArrowDigester::hash_record_batch(&batch), + expected, + "Example N: list-of-struct record batch hash mismatch" + ); + } +} diff --git a/tests/golden_files/schema_serialization_pretty.json b/tests/golden_files/schema_serialization_pretty.json index 70cb27d..f2ec2db 100644 --- 
a/tests/golden_files/schema_serialization_pretty.json +++ b/tests/golden_files/schema_serialization_pretty.json @@ -1,6 +1,6 @@ { "binary_name": { - "data_type": "Binary", + "data_type": "LargeBinary", "nullable": true }, "bool_name": { @@ -45,19 +45,9 @@ "doubly_nested_struct_name": { "data_type": { "Struct": [ - { - "data_type": "Int32", - "name": "outer_field", - "nullable": false - }, { "data_type": { "Struct": [ - { - "data_type": "Utf8", - "name": "middle_field", - "nullable": true - }, { "data_type": { "Struct": [ @@ -75,11 +65,21 @@ }, "name": "inner", "nullable": false + }, + { + "data_type": "LargeUtf8", + "name": "middle_field", + "nullable": true } ] }, "name": "middle", "nullable": false + }, + { + "data_type": "Int32", + "name": "outer_field", + "nullable": false } ] }, @@ -117,7 +117,6 @@ "data_type": { "LargeList": { "data_type": "Int32", - "name": "item", "nullable": true } }, @@ -129,9 +128,8 @@ }, "list_name": { "data_type": { - "List": { + "LargeList": { "data_type": "Int32", - "name": "item", "nullable": true } }, @@ -146,7 +144,7 @@ "nullable": false }, { - "data_type": "Utf8", + "data_type": "LargeUtf8", "name": "struct_field2", "nullable": true } @@ -195,7 +193,7 @@ "nullable": false }, "utf8_name": { - "data_type": "Utf8", + "data_type": "LargeUtf8", "nullable": true } }