fix: harden async scheduling admission follow-ups#680
Conversation
|
MkDocs preview: https://2d46e2e7.dd-docs-preview.pages.dev Fern preview: https://nvidia-preview-pr-680.docs.buildwithfern.com/nemo/datadesigner
|
Greptile SummaryThis PR hardens the async scheduling and request-admission stack with several targeted follow-up fixes: retry accounting is moved to the
|
| Filename | Overview |
|---|---|
| packages/data-designer-engine/src/data_designer/engine/models/clients/model_request_executor.py | Adds per-attempt acquire/release retry loop with _should_retry correctly guarding against admission errors, rate limits, and TIMEOUT-without-status-code; all exception paths release the lease before propagating. |
| packages/data-designer-engine/src/data_designer/engine/models/request_admission/controller.py | Introduces terminal-denial early exit in both sync and async blocking-wait loops, and bounds the _released set to 8192 entries via a parallel deque; lock is correctly released before propagating RequestAdmissionError. |
| packages/data-designer-engine/src/data_designer/engine/dataset_builders/scheduling/task_admission.py | Mirrors the controller's bounded-history pattern for task lease release tracking; _remember_released correctly evicts the front of the deque from the set before appending. |
| packages/data-designer-engine/src/data_designer/engine/models/clients/factory.py | Disables transport-layer retries on underlying adapters when ModelRequestExecutor wraps them, forwarding the full retry_config so every retry gets its own admission lease cycle. |
| packages/data-designer-engine/src/data_designer/engine/dataset_builders/async_scheduler.py | Populates AsyncCapacityPlan with request resources, provider/model static caps, domain limits, runtime pressure snapshots, and observed maxima from the wired-in RequestPressureSnapshotProvider. |
| tests_e2e/tests/test_mcp_demo.py | Changes ordering assertion to use (msg_index, tool_call_index) tuple so multiple tool calls within the same assistant message are ordered correctly via lexicographic tuple comparison. |
Sequence Diagram
sequenceDiagram
participant Caller
participant ModelRequestExecutor
participant RequestAdmissionController
participant InnerClient
loop For each attempt (1..max_retries+1)
Caller->>ModelRequestExecutor: completion(request)
ModelRequestExecutor->>RequestAdmissionController: acquire_sync(item)
alt Hard/shutdown denial (terminal)
RequestAdmissionController-->>ModelRequestExecutor: raise RequestAdmissionError
ModelRequestExecutor-->>Caller: raise ProviderError(TIMEOUT) [no retry]
else Lease granted
RequestAdmissionController-->>ModelRequestExecutor: RequestAdmissionLease
ModelRequestExecutor->>InnerClient: call()
alt Success
InnerClient-->>ModelRequestExecutor: result
ModelRequestExecutor->>RequestAdmissionController: release(lease, success)
ModelRequestExecutor-->>Caller: result
else Retryable ProviderError (e.g. 503)
InnerClient-->>ModelRequestExecutor: ProviderError
ModelRequestExecutor->>RequestAdmissionController: release(lease, provider_failure)
Note over ModelRequestExecutor: _should_retry True, sleep, retry
else Non-retryable
InnerClient-->>ModelRequestExecutor: ProviderError
ModelRequestExecutor->>RequestAdmissionController: release(lease, outcome)
ModelRequestExecutor-->>Caller: raise ProviderError [no retry]
end
end
end
Reviews (5): Last reviewed commit: "refactor: bound released lease history w..." | Re-trigger Greptile
Signed-off-by: Nabin Mulepati <nmulepati@nvidia.com>
aaea837 to
6aba2f4
Compare
Signed-off-by: Eric W. Tramel <eric.tramel@gmail.com>
Signed-off-by: Eric W. Tramel <eric.tramel@gmail.com>
📋 Summary
Follow-up fixes for the PR #661 review. This moves retry accounting to the request-admission boundary, fixes hard request-denial waits, fills request-admission capacity-plan fields, bounds release bookkeeping, stabilizes the idle regression harness, restores historical Fern versions, and hardens the MCP e2e ordering assertion. The request-pressure generation-domain normalization is intentionally left to PR #679, which already owns that change.
🔗 Related Issue
N/A. Follow-up to PR #661 and complements PR #679.
🔄 Changes
ModelRequestExecutorso every outbound attempt gets its own acquire/release lease cycle.RequestAdmissionErrorfor terminal hard-policy denials in blocking sync/async acquire paths, and keep task/request duplicate-release tracking bounded.AsyncCapacityPlanwith request resources, provider/model static caps, request limits, controller config snapshots, runtime request pressure, and observed maxima.--skip-runread-only for missing artifacts, add extra measured stress iterations, and tune the quick wide-stress threshold to match observed quick-mode noise.fern/versions/v0.5.8andfern/versions/v0.5.9snapshots back to the base branch content.🧪 Testing
uv run --group dev ruff check ...PYTHONPATH=packages/data-designer-config/src:packages/data-designer-engine/src:packages/data-designer/src uv run --group dev pytest packages/data-designer-config/tests/config/test_scheduling.py packages/data-designer-config/tests/config/test_run_config.py packages/data-designer-engine/tests/engine/dataset_builders/scheduling packages/data-designer-engine/tests/engine/dataset_builders/test_async_scheduler.py packages/data-designer-engine/tests/engine/models/clients/test_factory.py packages/data-designer-engine/tests/engine/models/clients/test_model_request_executor.py packages/data-designer-engine/tests/engine/models/request_admission/test_controller.py packages/data-designer-engine/tests/engine/test_capacity.py packages/data-designer-engine/tests/engine/test_observability.py packages/data-designer-engine/tests/engine/test_async_scheduling_benchmark.py -q— 222 passeduv run --directory tests_e2e pytest -q— 8 passed, 8 warningsPYTHONPATH=packages/data-designer-config/src:packages/data-designer-engine/src:packages/data-designer/src uv run --group dev python scripts/benchmarks/run_async_scheduling_idle_regression.py --quick --artifact-dir .tmp/idle-regression-fixes --report-path .tmp/idle-regression-fixes/report.html --summary-path .tmp/idle-regression-fixes/summary.json --checks-path .tmp/idle-regression-fixes/checks.json— PASS, 0 errors, 0 warnings✅ Checklist