graph=True: checkpoint containing nested graph_do_while should not fall back to non-graph launch

## Summary

`qd.checkpoint(yield_on=flag)` containing a nested `qd.graph_do_while` triggers a non-graph fallback:

```
[I] graph=True: a qd.checkpoint() block containing a nested qd.graph_do_while is not yet
supported on the CUDA graph path; falling back to the non-graph launch.
```

The fallback produces correct results but loses the CUDA graph speedup (10 ms/step instead of <1 ms).

## Use case

IPC Newton solver with checkpoint-based overflow handling. The natural structure is:

```python
@qd.kernel(graph=True)
def step(self):
    while qd.graph_do_while(self.newton_cond):  # Newton outer loop
        # assembly writes triplets...

        with qd.checkpoint(yield_on=self.triplet_overflow):
            # sort + reduce
            self.sort_radix()
            ...
            # PCG inner loop (nested graph_do_while)
            while qd.graph_do_while(self.pcg_cond):
                self.pcg_iteration()
```

The checkpoint must wrap everything after assembly so that an overflow can yield before the sort touches invalid memory. However, the PCG solve is a nested `graph_do_while` inside the checkpoint body.

cgq (the C++ reference implementation) supports this pattern: a `CheckpointScope` containing a PCG `create_solve_graph`, which is a conditional WHILE subgraph. See `sim_engine_pipeline.cu` lines 187-352, "Checkpoint 1b: Assemble + PCG".

## Root cause

`graph_manager.cpp` lines 796-813:

```cpp
// Unsupported combined case: a `qd.checkpoint()` block whose body contains a nested
// `qd.graph_do_while` (one cp_id spanning more than one loop level). build_level's per-level IF
// grouping assumes a checkpoint's tasks are flat within a single level, so fall back to the
// non-graph launch path (correct results, just no on-device gating) rather than build a wrong graph.
```

The check rejects any checkpoint where tasks have different `graph_do_while_level_id` values.
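
The condition can be pictured with a small pure-Python model. This is only a sketch of the rule described above, not the code in `graph_manager.cpp`: the `Task` record, the `checkpoints_spanning_levels` helper, the task names, and the level numbering are all hypothetical. Only the field names `cp_id` and `graph_do_while_level_id` are taken from this issue.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Task:
    """Hypothetical stand-in for one offloaded task in a graph=True kernel."""
    name: str
    cp_id: Optional[int]          # checkpoint the task belongs to, None if outside any
    graph_do_while_level_id: int  # loop level of the task (0 = top level, assumed)


def checkpoints_spanning_levels(tasks):
    """Return the sorted cp_ids whose tasks sit on more than one loop level."""
    levels_by_cp = defaultdict(set)
    for task in tasks:
        if task.cp_id is not None:
            levels_by_cp[task.cp_id].add(task.graph_do_while_level_id)
    return sorted(cp for cp, levels in levels_by_cp.items() if len(levels) > 1)


# Task layout loosely modelled on the reproducer in this issue (assumed, not dumped
# from the compiler): checkpoint 0 owns tasks on both level 0 and level 1.
repro_tasks = [
    Task("init_data", None, 0),
    Task("add_one", 0, 0),
    Task("reset_cond", 0, 0),
    Task("scale", 0, 1),
    Task("update_cond", 0, 1),
]

print(checkpoints_spanning_levels(repro_tasks))
```

Under this model, any non-empty result corresponds to the case that currently takes the non-graph fallback.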

## Minimal reproducer

```python
import numpy as np
import quadrants as qd
qd.init(qd.cuda)

@qd.data_oriented
class Repro:
    def __init__(self):
        self.data = qd.ndarray(qd.f64, shape=(64,))
        self.cond = qd.ndarray(qd.i32, shape=())
        self.overflow = qd.ndarray(qd.i32, shape=())
        self.iter_count = qd.ndarray(qd.i32, shape=())

    @qd.kernel(graph=True)
    def run(self):
        for i in range(64):
            self.data[i] = qd.f64(i)

        with qd.checkpoint(yield_on=self.overflow):
            for i in range(64):
                self.data[i] = self.data[i] + 1.0

            for _ in range(1):
                self.cond[()] = 1
                self.iter_count[()] = 0
            while qd.graph_do_while(self.cond):
                for i in range(64):
                    self.data[i] = self.data[i] * 1.001
                for _ in range(1):
                    self.iter_count[()] = self.iter_count[()] + 1
                    if self.iter_count[()] >= 3:
                        self.cond[()] = 0

r = Repro()
r.overflow.from_numpy(np.array(0, dtype=np.int32))
r.run()  # triggers fallback warning
```

## Expected behavior

The CUDA graph should be built with the nested `graph_do_while` as a conditional WHILE node inside the checkpoint's conditional IF body (matching cgq's architecture).
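
As a rough illustration of the requested nesting, the sketch below models the graph as a plain Python tree for the reproducer. The `Node` class, the helper functions, and the task labels are hypothetical and exist only to show the shape. They are not quadrants or CUDA APIs.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical graph node: kind is 'kernel', 'if', or 'while'."""
    kind: str
    label: str
    body: List["Node"] = field(default_factory=list)


def render(node, depth=0):
    """Return indented lines describing the node and its nested body."""
    lines = ["  " * depth + f"{node.kind}: {node.label}"]
    for child in node.body:
        lines.extend(render(child, depth + 1))
    return lines


def max_conditional_depth(node):
    """Count how deeply conditional ('if'/'while') nodes are nested."""
    own = 1 if node.kind in ("if", "while") else 0
    return own + max((max_conditional_depth(c) for c in node.body), default=0)


# Requested shape: the checkpoint is an IF node whose body holds the WHILE node.
expected = Node("if", "checkpoint(yield_on=overflow)", [
    Node("kernel", "add_one"),
    Node("kernel", "reset_cond"),
    Node("while", "graph_do_while(cond)", [
        Node("kernel", "scale"),
        Node("kernel", "update_cond"),
    ]),
])

print("\n".join(render(expected)))
```

In this model the checkpoint subtree has a conditional depth of two (IF containing WHILE), which is the case the current per-level grouping does not handle.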

## Current workaround

Move the `qd.graph_do_while` outside the `qd.checkpoint` block. This is functionally correct but means the checkpoint cannot protect the nested loop from running on stale/overflow data.

## Environment

- quadrants branch `hp/qipc-integration`, commit `b3ba47e6a`
- CUDA SM 12.0 (Blackwell), Python 3.13.9, Windows 11

