[DataOriented] Fastcache, perf, pruning #705
Open
hughperkins wants to merge 126 commits into main from hp/data-oriented-qd-func-dataclass
Commits
06b7c6a
[Test] Pin behaviour for @qd.data_oriented with raw qd.ndarray members
hughperkins d4350ef
[Fix] Recurse through nested data_oriented / dataclass children when …
hughperkins 97afa6d
[Fix] Launch-context stale guard fires for @qd.data_oriented containe…
hughperkins 49a723b
[Test] Extend @qd.data_oriented + ndarray coverage: cross-container n…
hughperkins 9bdeca5
[Doc] @qd.data_oriented can contain ndarrays
hughperkins dc7997b
[Fix] Gap A: template-mapper spec key descends into data_oriented nda…
hughperkins 906ce19
[Test] Gap A: spec-key descent into data_oriented ndarray members
hughperkins a0db648
[Fix] Template-mapper args_hash invalidates when data_oriented ndarra…
hughperkins c9598ad
[Fix] Clear error for @qd.data_oriented field type inside typed-datac…
hughperkins 93893e5
[Perf] Per-class cache of data_oriented ndarray attribute paths for G…
hughperkins ce769a7
[Doc] Nesting compatibility matrix for compound types + spot tests
hughperkins dd4de40
[Doc] Fix @qd.struct ghost reference in compound_types
hughperkins 46825ab
[Test] Pin fastcache + @qd.data_oriented + ndarray end-to-end behavior
hughperkins ee5fbbb
[Doc] Fastcache with @qd.data_oriented: worked example, semantics, fo…
hughperkins b132b81
[Doc] Restructure fastcache.md: simple main body, Advanced subsection…
hughperkins 6d1c820
[Doc] Use 'member' consistently for compound-type members; drop ambig…
hughperkins 1de65b9
[Doc] Mirror qd.Template wording for @qd.data_oriented primitive memb…
hughperkins a648c3f
[Doc] @qd.data_oriented row: 'types and values' to mirror qd.Template…
hughperkins a55d360
[Doc] Tighten path-cache stability restriction: actual failure modes …
hughperkins 6667ba6
[Test] Fix fastcache cross-init tests: filter captured launches by ke…
hughperkins e9c50b4
[Style] pre-commit auto-fixes: black wrap + ruff import-sort
hughperkins abf242b
[Doc] Move @qd.kernel inside @qd.data_oriented class in the ndarray-m…
hughperkins 4c27e2e
[Doc] Document primitive members on @qd.data_oriented self as templat…
hughperkins 1f539e6
[Doc] State ndarray-member subscript behaviour directly instead of cr…
hughperkins 730cbcb
[Doc] Drop 'as with dataclasses.dataclass' cross-reference in ndarray…
hughperkins 57e1b95
[Doc] Simplify fastcache cross-link in @qd.data_oriented section: dro…
hughperkins d4ca211
[Doc] Drop ndarray-reassign note and tighten fastcache cross-link in …
hughperkins b72a7a7
[Doc] Drop ndarray subscript-access description in @qd.data_oriented …
hughperkins 18ff7bd
[Doc] Promote fastcache cross-link to its own ### Fastcache subsectio…
hughperkins 33f4744
[Doc] Rename '### ndarray members' to '### Tensor members'; cover qd.…
hughperkins 883243e
[Doc] @qd.data_oriented Fastcache subsection: spell out 'disabled for…
hughperkins 3504250
[Doc] Tensor members: shorten qd.tensor description to 'or qd.Tensor'
hughperkins cc01339
[Doc] Tensor members: simplify nested-container sentence to 'Nested @…
hughperkins df3113e
[Doc] Fastcache subsection: 'methods of @qd.data_oriented classes'
hughperkins 7f5fd12
[Doc] Tensor members: drop qd.Vector.ndarray / qd.Matrix.ndarray pare…
hughperkins e7fafeb
[Doc] Tensor members: drop the mixing-backends + nesting trailer sent…
hughperkins f9a35df
[Doc] Restrictions: drop redundant 'A few combinations are still unsu…
hughperkins d336dcd
[Doc] @qd.dataclass section opener: cut to the constraint
hughperkins 4c5f622
[Doc] Remove top-level Recommendation section
hughperkins 56a4399
[Doc] Expand @qd.dataclass section: what it does, when to use it, con…
hughperkins ef5f8a6
[Doc] @qd.dataclass section: drop use-cases / constraints / cross-ref…
hughperkins 06580f1
[Doc] @qd.dataclass section opener: explain the kernel-side vs python…
hughperkins 8899357
[Doc] Restore verbatim prose for the @qd.struct vs other-compound-typ…
hughperkins 8fef507
[Doc] Replace @qd.struct with @qd.dataclass in opener prose (actual A…
hughperkins 92f5fe1
[Doc] @qd.dataclass: 'element type of fields' not 'tensors'
hughperkins 9ea8e5b
[Doc] @qd.dataclass: add sentences about @qd.func methods and qd.type…
hughperkins 6ff0848
[Doc] @qd.dataclass methods sentence: 'Methods can be added to ... an…
hughperkins fd8cd0a
[Doc] @qd.dataclass section: move qd.types.struct paragraph to end wi…
hughperkins 004cd9a
[Doc] qd.types.struct sentence: drop 'useful when members are compute…
hughperkins bf85e4e
[Doc] @qd.dataclass: split into bare-struct example, then methods + @…
hughperkins ccaae54
[Doc] First @qd.dataclass example uses AOS layout (the unique-to-Stru…
hughperkins 820c01a
[Doc] Move 'Nesting compatibility' section to end of compound_types.md
hughperkins 06d2e86
[Doc] Overview table: dataclasses.dataclass supports differentiation …
hughperkins f7dd090
[Test] AD through dataclasses.dataclass with ndarray, field, and qd.t…
hughperkins 8c0377c
[Doc] compound_types: rephrase intro bullets to describe each type's …
hughperkins 71a53da
[Doc] compound_types: prefix dataclasses.dataclass with @ in intro/ta…
hughperkins 46fef24
[Test] AD dataclass: tensor(FIELD) member works when annotated as qd.…
hughperkins 18f995b
[Doc] tensor: note qd.Tensor is also the dataclass-member annotation
hughperkins 3ce0ab0
[Doc] compound_types: add 'Under the hood' subsection for each type
hughperkins 35be370
[Doc] compound_types: rewrite 'Under the hood' subsections at a highe…
hughperkins 94e455a
[Doc] compound_types: drop 'once' from compile-time capture phrasing
hughperkins 31b27d7
[Doc] compound_types: replace overview table with differentiating one
hughperkins 36dc933
[Doc] compound_types: drop 'historical reasons' line
hughperkins 07dc486
[Fix] _build_struct_nd_paths: handle NamedTuple via _asdict() fallback
hughperkins c7d6737
[wip] preserved baseline: stable_members mitigations + new failing test
93f597e
[fix] Option A: expand dataclass-instance args in @qd.func calls from…
c25f49c
[test] nested dataclasses + chained @qd.func calls from data_oriented…
8f64016
[Perf] Prune unused @qd.data_oriented ndarrays via existing pruning m…
hughperkins aa9a88f
[Fix] Fastcache hasher: skip QuadrantsCallable/BoundQuadrantsCallable…
hughperkins fd8c440
[Perf] Don't over-mark ndarrays during @qd.func dataclass-arg expansion
hughperkins e3a3d88
[Perf] TemplateMapper.lookup: only walk template-slot args, cache per…
hughperkins 067a471
[Fix] Walker robustness: cycle-safe + Pydantic-metaclass-safe is_data…
hughperkins cc1e380
[Style] Apply pre-commit (black + ruff): import order, single-line co…
hughperkins 34f8532
[Fix] stable_members: tolerate opaque members in fastcache hasher + c…
hughperkins 5e54902
[Fix] stable_members fastcache: only tolerate truly-opaque members, f…
hughperkins 55ecf95
[Fix] Metaclass-safe is_dataclass for walker over user objects
hughperkins 6d9c307
[Style] pre-commit: import formatting
hughperkins 49ffb3b
[Fix] Fastcache: skip opaque-typed members silently by default
hughperkins 7757907
[Doc] Fastcache: opaque-member silencing is the default; clarify stab…
hughperkins fb38fec
Revert "[Doc] Fastcache: opaque-member silencing is the default; clar…
hughperkins 7cabaa0
Revert "[Fix] Fastcache: skip opaque-typed members silently by default"
hughperkins b5b360a
[Fix] Fastcache: replace PARAM_INVALID / silent-skip with qualname fa…
hughperkins 3aa4fe1
[Fix] test_ad_dataclass: require data64 extension for f64 tests
hughperkins dce1305
[Refactor] Fastcache: two-level cache + pruning-driven narrow args walk
hughperkins 984ac40
[Doc] Fastcache: pruning-driven semantics; stable_members is launch-p…
hughperkins 12fb215
[Fix] Fastcache: full pruning coverage for data_oriented; remove qual…
hughperkins 356394e
[Test] Pin pruning-driven fastcache behaviour for @qd.data_oriented args
hughperkins 45129bc
[Doc] data_oriented(stable_members=...) docstring: correct the failur…
hughperkins 1f25d9c
[Fix] record_after_call: propagate chain paths through Attribute args
hughperkins 5fc9b4c
[Fix] Track @qd.func params in fn_param_names for chain-path seeding
hughperkins 89bb005
[Style] test docstrings: reflow at 120c per repo line-width
hughperkins 710ee47
[Fix] Fastcache: prune _predeclare_struct_ndarrays by flat-name on ca…
hughperkins 090f1a8
[CI] Fix linters, pyright, MockContext test, deleted-comment, line-wrap
hughperkins be4b030
[Refactor] Move fold_*_into_pruning from Kernel to Pruning
hughperkins 75c08f6
[Style] Reflow 3 docstring paragraphs to 120c (Check line wrapping)
hughperkins 29dd841
[Style] Reflow 3 more comment/docstring lines to 120c
hughperkins 4bd2d10
[Style] Reflow 3 more comment lines to 120c
hughperkins aef1a26
[Style] Manually reflow underwrapped prose to 120c
hughperkins 197d150
[Style] Reflow more underwrapped prose to 120c (round 2)
hughperkins 173b051
Merge remote-tracking branch 'origin/hp/data-oriented-ndarray-fix' in…
hughperkins a47a5ab
[Doc] fastcache.md: restore prose phrasing in unsupported-type + arg-…
hughperkins 5debfe4
[Doc] fastcache.md: drop redundant 'every child is subject to pruning…
hughperkins 4e714c7
[Doc] fastcache.md: revert @qd.data_oriented child-rule bullets to or…
hughperkins 39602c6
[Doc] fastcache.md: tighten recognised-but-unsupported sentence
hughperkins bd37c94
[Doc] fastcache.md: restore nested-dataclass + qd.field bullets in da…
hughperkins 59ce5ff
[Style] args_hasher: restore original 'field offset' comments on Scal…
hughperkins a63b834
[Docs] src_hasher: remove pre-refactor background paragraph from modu…
hughperkins f6c68d8
[Docs] src_hasher: correct safety-implication paragraph
hughperkins ae36b11
[Fix] Per-instance ndarray-path cache for @qd.data_oriented args
hughperkins c61d32c
[Test] Strengthen polymorphism + add cache-hit predeclare ndarray test
hughperkins 706f9b5
[Test] Add bug reproducer: needs_grad not folded into fastcache args_…
hughperkins 4398af7
[Fix] Fold needs_grad into fastcache narrow args_hash for ndarray leaves
hughperkins 8a7ead4
[Lint] Reorder imports in needs_grad reproducer test
hughperkins 8861dc0
Merge remote-tracking branch 'origin/main' into hp/data-oriented-qd-f…
hughperkins e8bcd18
[Doc] Fix tile16 -> tile link rename; reflow under-wrapped CI flags
hughperkins 44a535f
Merge origin/main into hp/data-oriented-qd-func-dataclass (post-#723)
hughperkins 6805ce3
Merge branch 'main' into hp/data-oriented-qd-func-dataclass
hughperkins e4bf125
[Doc] fastcache.md: simplify per Hugh review #r3399480151
hughperkins 2719dca
[Doc] fastcache.md: narrowing applies to all compound types, not just…
hughperkins 4a97b91
[Doc] fastcache.md: address Hugh review r3403946029
hughperkins cfeac8b
[Doc] compound_types.md: prominent stable_members recommendation
hughperkins ed3b6c2
Merge branch 'main' into hp/data-oriented-qd-func-dataclass
hughperkins 0415c05
Merge branch 'main' into hp/data-oriented-qd-func-dataclass
hughperkins 9fb8287
remove internal stuff
hughperkins 66d095f
[Refactor] Extract fastcache L1/L2 persistence orchestration to src_h…
hughperkins 0401c7a
[Style] test_data_oriented_ndarray: reflow under-wrapped comments to …
hughperkins
@@ -95,14 +95,17 @@ Fastcache supports the following parameter types:
 | `qd.types.NDArray` (scalar, vector, matrix) | Yes | dtype, ndim, layout |
 | `torch.Tensor` | Yes | dtype, ndim |
 | `numpy.ndarray` | Yes | dtype, ndim |
-| `dataclasses.dataclass` | Yes | member types recursively; member values if annotated with `FIELD_METADATA_CACHE_VALUE` (see [Appendix — compound-type cache keying](#compound-type-cache-keying)) |
-| `@qd.data_oriented` objects | Yes | member types recursively; primitive member types and values baked into kernel (see [Appendix — compound-type cache keying](#compound-type-cache-keying)) |
+| `dataclasses.dataclass` | Yes | member types recursively (narrowed to members the kernel reads or writes); member values if annotated with `FIELD_METADATA_CACHE_VALUE` (see [Advanced — compound-type cache keying](#compound-type-cache-keying)) |
+| `@qd.data_oriented` objects | Yes | member types recursively (narrowed to members the kernel reads or writes); primitive member types and values baked into kernel (see [Advanced — compound-type cache keying](#compound-type-cache-keying)) |
 | `qd.Template` primitives (int, float, bool) | Yes | type and value (baked into kernel) |
 | Non-template primitives (int, float, bool) | Yes | type only |
 | `enum.Enum` | Yes | name and value |
-| `qd.field` / `ScalarField` / `MatrixField` | **No** | — |
+| `qd.field` / `ScalarField` / `MatrixField` at a kernel-read path | **No** | — |
+| Anything else at a kernel-read path | **No** | — |
 
-If any parameter is of an unsupported type, fastcache is disabled for that call and the kernel falls back to normal compilation. For `qd.field` / `ScalarField` / `MatrixField` arriving through a `qd.Tensor`-annotated parameter, this is silent — no warning is emitted. For other unsupported types, a warning is logged at the `warn` level identifying the offending parameter.
+If any kernel-used parameter is of an unsupported type, fastcache is disabled for that call and the kernel falls back to normal compilation. For `qd.field` / `ScalarField` / `MatrixField` arriving through a `qd.Tensor`-annotated parameter, this is silent — no warning is emitted. For other unsupported types, a warning is logged at the `warn` level identifying the offending parameter.
 
+Kernel-unused members of any type — including unrecognised ones — do **not** disable fastcache. Fastcache skips them entirely, so opaque metadata (UUIDs, Pydantic configs, parent back-pointers) attached to a `@qd.data_oriented` or `dataclasses.dataclass` instance is harmless as long as the kernel doesn't read it.
+
 ### 3. Source code must be available
 
@@ -120,6 +123,12 @@ Each compiled artifact is stored under a key derived from all of the following:
 
 When any of these change, the resulting key is different, so a new compilation occurs and a new entry is stored. Previous entries remain on disk — multiple cached versions coexist. You do not need to manually clear the cache when making code changes — the hash mismatch causes a transparent recompilation.
 
+### Two strict invariants
+
+1. **If the kernel does not read or write a variable, it is entirely ignored by fastcache.** It will not cause fastcache to fail, nor emit a warning, nor emit an error.
+
+2. **Unrecognised types at variables the kernel reads or writes must not be silently dropped or hashed by type-name.** If the value of such a variable has a type fastcache doesn't explicitly handle (Pydantic models, UUIDs, third-party tensor wrappers, …), fastcache is disabled for the call with a one-shot `[FASTCACHE][UNKNOWN_TYPE]` warning identifying the offending type plus an `[INVALID_FUNC]` log line confirming the cache is off.
+
 ## Advanced
 
 ### Diagnostics
@@ -143,32 +152,25 @@ print(obs.cache_stored) # True if the compiled kernel was stored to cach
 
 On the first run you'll see `cache_stored=True` but `cache_loaded=False`. On the second run (after `qd.init`), `cache_loaded=True`.
 
-## Appendix
-
 ### Compound-type cache keying
 
-The args hasher walks compound-type kernel parameters recursively. For each leaf member it decides what (if anything) contributes to the cache key. The headline rules:
+For `@qd.data_oriented` and `dataclasses.dataclass` kernel parameters, fastcache walks members recursively. Any members that are not themselves read or written by the kernel, nor contain members read or written by the kernel, are skipped during the walk (per the [strict invariants](#two-strict-invariants) above). Member-by-member behavior:

Contributor
This is redundant with:

 
-**`@qd.data_oriented`:** the walker descends into `vars(obj)`. For each child:
+- **`qd.ndarray` member** — `(dtype, ndim, layout)` is included in the cache key. Element values are not.
+- **Primitive (`int` / `float` / `bool` / `enum.Enum`) member.** The handling depends on the enclosing container:
+  - In a `@qd.data_oriented` instance — value is baked into the kernel, same as a `qd.Template` primitive. Two instances of the same class with different primitive member values get different cache entries.
+  - In a `dataclasses.dataclass` instance — only the type is included by default. To include the value too, annotate the field with `FIELD_METADATA_CACHE_VALUE`:
 
-- `qd.ndarray` member — `(dtype, ndim, layout)` is included in the cache key. Element values are not.
-- Primitive (`int` / `float` / `bool` / `enum.Enum`) member — value is baked into the kernel (same semantics as a `qd.Template` primitive). Two instances of the same class with different primitive member values get different cache entries.
-- Nested `@qd.data_oriented` member — recurses.
-- Nested `dataclasses.dataclass` member — recurses (with the dataclass rules below).
-- `qd.field` member — fastcache is disabled for the entire kernel call. The kernel still runs via normal compilation; a warn-level log line is emitted.
-
-**`dataclasses.dataclass`:** the walker descends into the declared members. For each member, only the *type* is included in the cache key by default — **not** the value. To include a member's value, annotate it:
-
-```python
-import dataclasses
-from quadrants.lang._fast_caching import FIELD_METADATA_CACHE_VALUE
-
-@dataclasses.dataclass
-class SimConfig:
-    num_layers: int = dataclasses.field(metadata={FIELD_METADATA_CACHE_VALUE: True})
-    dt: float = dataclasses.field(metadata={FIELD_METADATA_CACHE_VALUE: True})
-```
+    ```python
+    import dataclasses
+    from quadrants.lang._fast_caching import FIELD_METADATA_CACHE_VALUE
 
-This is necessary whenever the compiled kernel depends on the member's *value* rather than just its type (for example, when the value is used as a loop bound that the compiler bakes into the generated code). Without the annotation, two `SimConfig` instances with different `num_layers` values would share a fastcache key, and the second instance would silently load a kernel compiled for the wrong value.
+    @dataclasses.dataclass
+    class SimConfig:
+        num_layers: int = dataclasses.field(metadata={FIELD_METADATA_CACHE_VALUE: True})
+        dt: float = dataclasses.field(metadata={FIELD_METADATA_CACHE_VALUE: True})
+    ```
 
-Note the asymmetry: `@qd.data_oriented` primitive members are baked into the kernel automatically (same semantics as `qd.Template`); `dataclasses.dataclass` members contribute only their *type* to the cache key unless you opt in per-member.
+    Annotate any member whose *value* (not just type) affects the compiled kernel. Primarily this means any variable used inside [`qd.static`](static.md).
+- **Nested `@qd.data_oriented` or `dataclasses.dataclass` member** — recurses with the same rules (so an `int` inside a nested `@qd.data_oriented` is still baked into the kernel; an `int` inside a nested `dataclasses.dataclass` still needs `FIELD_METADATA_CACHE_VALUE` to bake its value).
+- **`qd.field` member** — fastcache is disabled for the entire kernel call. The kernel still runs via normal compilation; a warn-level log line is emitted.
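
The `dataclasses.dataclass` keying rules in this diff (skip members the kernel never touches, key on type by default, key on value only when the field opts in, give up on unrecognised types at used members) can be illustrated with a small plain-Python sketch. This is not the quadrants implementation: `CACHE_VALUE`, `SUPPORTED_LEAVES`, `toy_cache_key` and the `used_members` argument are hypothetical names introduced here, and the real hasher also handles ndarrays, nesting and enums.

```python
import dataclasses
import uuid

# Hypothetical stand-in for quadrants' FIELD_METADATA_CACHE_VALUE; the real
# constant lives in quadrants.lang._fast_caching and may hold a different value.
CACHE_VALUE = "cache_value"

# Leaf types this toy hasher knows how to key (an assumption, not the real list).
SUPPORTED_LEAVES = (int, float, bool)


def toy_cache_key(obj, used_members):
    """Return a tuple key, or None when caching must be disabled for this call."""
    parts = []
    for f in dataclasses.fields(obj):
        if f.name not in used_members:
            continue  # kernel-unused member: ignored entirely, whatever its type
        value = getattr(obj, f.name)
        if not isinstance(value, SUPPORTED_LEAVES):
            return None  # unrecognised type at a kernel-used member: no caching
        part = (f.name, type(value).__name__)
        if f.metadata.get(CACHE_VALUE, False):
            part += (value,)  # opted in: the value also feeds the key
        parts.append(part)
    return tuple(parts)


@dataclasses.dataclass
class SimConfig:
    num_layers: int = dataclasses.field(metadata={CACHE_VALUE: True})
    dt: float
    run_id: uuid.UUID  # opaque metadata the kernel is assumed never to read


a = SimConfig(num_layers=2, dt=0.1, run_id=uuid.uuid4())
b = SimConfig(num_layers=3, dt=0.5, run_id=uuid.uuid4())
```

In this sketch, `toy_cache_key(a, {"dt"})` and `toy_cache_key(b, {"dt"})` are equal because `dt` contributes only its type. Adding `"num_layers"` to the used set separates the two keys, and adding `"run_id"` makes the function return `None`, mirroring the "fastcache is disabled for the call" outcome.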
Maybe this behaviour should be enforced via post-init freeze. But I think it is tricky to get right without annoying boilerplate on user-side and/or performance penalty.
Even Python dataclass does not offer a clean way to customise init when frozen.
https://docs.python.org/3/library/dataclasses.html#dataclasses
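
For readers unfamiliar with the friction mentioned in this comment, here is a minimal standalone sketch using only the standard library. `FrozenConfig` and its `hidden` member are hypothetical and unrelated to quadrants; the sketch only shows why customising initialisation of a frozen dataclass needs a workaround.

```python
import dataclasses


@dataclasses.dataclass(frozen=True)
class FrozenConfig:
    num_layers: int
    # Derived member, computed in __post_init__ rather than passed by the caller.
    hidden: int = dataclasses.field(init=False)

    def __post_init__(self):
        # A plain `self.hidden = ...` would raise FrozenInstanceError here, so
        # the assignment has to bypass the frozen __setattr__ explicitly.
        object.__setattr__(self, "hidden", self.num_layers * 4)


cfg = FrozenConfig(num_layers=3)

try:
    cfg.num_layers = 5  # mutation after construction is rejected
    mutated = True
except dataclasses.FrozenInstanceError:
    mutated = False
```

The `object.__setattr__` call is the kind of user-side boilerplate the comment refers to. Whether a similar post-init freeze would suit `@qd.data_oriented` is left open here.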