fix: agent_trust schema validation + feat: task pause/resume by The-Vaibhav-Yadav · Pull Request #490 · GetBindu/Bindu

The-Vaibhav-Yadav · 2026-04-20T20:49:14Z

Summary

Problem: agent_trust accepted any arbitrary dict with no type or schema enforcement, allowing silent misconfigurations in production. _handle_pause and
_handle_resume in Worker raised NotImplementedError, blocking any caller that issued pause/resume operations.
Why it matters: Invalid trust configs could reach runtime code unchecked. Unimplemented pause/resume caused unhandled exceptions that marked tasks as
failed instead of suspending them.
What changed: Added AgentTrustConfig Pydantic model and wired schema validation into ConfigValidator. Implemented _handle_pause (→ suspended) and
_handle_resume (→ submitted + re-queue) in the base Worker.
What did NOT change: Wire format (AgentTrust TypedDict), scheduler implementations, storage backends, ManifestWorker execution logic, and all existing
behavior.

Change Type (select all that apply)

Bug fix — agent_trust accepted any dict silently (issue [Bug]: Incomplete Agent Trust Validation in ConfigValidator #382)
Feature — pause/resume were unimplemented stubs (issue [Bug]: Unimplemented Task Pause/Resume Functionality in Base Worker #383)
Tests — new test cases for both

Scope:

Scheduler backends — _handle_pause/_handle_resume interact with the scheduler to re-queue on resume
Authentication / authorization — AgentTrustConfig validates the trust/identity config layer
Tests — new unit tests in both test_config_validator.py and test_manifest_worker.py

Linked Issue/PR

User-Visible / Behavior Changes

Passing an invalid agent_trust dict to bindufy(config=...) now raises ValueError with a descriptive message instead of silently proceeding.
Agents that receive a pause scheduler operation now correctly transition to suspended instead of raising an exception and marking the task failed.
Agents that receive a resume scheduler operation now re-queue the task for execution instead of raising an exception.

Security Impact (required)

New permissions/capabilities? No
Secrets/credentials handling changed? No
New/changed network calls? No
Database schema/migration changes? No
Authentication/authorization changes? No

Verification

Environment

OS: macOS Darwin 25.3.0
Python version: 3.13.7
Storage backend: memory (unit tests)
Scheduler backend: memory (unit tests)

Steps to Test

uv run pytest tests/unit/penguin/test_config_validator.py -v
uv run pytest tests/unit/server/workers/test_manifest_worker.py::TestWorkerPauseResume -v
uv run pytest tests/unit/ -q

Expected Behavior

All 845 unit tests pass, 0 failures.

Actual Behavior

All 845 unit tests pass, 0 failures.

Evidence (attach at least one)

Failing test before + passing after
Test output / logs

tests/unit/penguin/test_config_validator.py::TestAgentTrustValidation::test_valid_agent_trust_hydra PASSED
tests/unit/penguin/test_config_validator.py::TestAgentTrustValidation::test_agent_trust_missing_identity_provider_raises PASSED
tests/unit/penguin/test_config_validator.py::TestAgentTrustValidation::test_agent_trust_invalid_identity_provider_raises PASSED
... (15 total)

tests/unit/server/workers/test_manifest_worker.py::TestWorkerPauseResume::test_pause_working_task_transitions_to_suspended PASSED
tests/unit/server/workers/test_manifest_worker.py::TestWorkerPauseResume::test_resume_suspended_task_resubmits PASSED
... (8 total)

845 passed, 3 skipped, 7 warnings

Human Verification (required)

Verified scenarios: valid trust config passes, invalid identity_provider raises, invalid allowed_operations TrustLevel raises, pause no-ops on terminal
tasks, resume re-queues with correct task_id/context_id, both operations no-op on unknown task IDs.
Edge cases checked: agent_trust: None (default) is still accepted. Pausing an already-suspended task is a no-op. Resuming a non-suspended task is a no-op.
What you did NOT verify: end-to-end pause/resume over a live Redis scheduler or with a real storage backend.

Compatibility / Migration

Backward compatible? Yes — agent_trust: None (the default) is unchanged. All existing valid configs pass.
Config/env changes? No
Database migration needed? No

Failure Recovery (if this breaks)

How to disable/revert: revert bindu/penguin/config_validator.py to remove the _validate_agent_trust_config call; revert bindu/server/workers/base.py to
restore the NotImplementedError stubs.
Known bad symptoms: startup ValueError on a previously-valid config means the user's agent_trust dict was malformed and was previously silently ignored.

Risks and Mitigations

Risk: A user who was passing a malformed agent_trust dict will now get a ValueError at startup.
- Mitigation: The error message includes the exact validation failure from Pydantic, making it actionable. Backward-compatible for all correctly-formed
  configs.

Checklist

Tests pass (uv run pytest)
Pre-commit hooks pass (uv run pre-commit run --all-files)
Documentation updated (if needed)
Security impact assessed
Human verification completed
Backward compatibility considered

Summary by CodeRabbit

New Features
- Enhanced agent trust configuration validation: typed fields, sensible defaults, and stricter rejection of unknown keys.
- Implemented worker pause and resume operations to suspend and re-queue tasks safely.
Tests
- Added unit tests covering agent trust validation scenarios.
- Added unit tests covering worker pause/resume behavior, including requeue failure handling.

Previously agent_trust was accepted as any arbitrary dict with no type or schema checks, creating a silent misconfiguration surface. Adds AgentTrustConfig (Pydantic BaseModel) in types.py to validate identity_provider, allowed_operations TrustLevel values, and optional field types. Wires _validate_agent_trust_config into ConfigValidator so invalid trust configs raise ValueError with a descriptive message. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

_handle_pause transitions a task from any active state (submitted, working, input-required, auth-required) to 'suspended'. _handle_resume transitions it back to 'submitted' and re-queues a run_task operation so the worker re-executes with the full stored message history. Both methods no-op silently when the task is missing or in an incompatible state, matching the defensive pattern used by cancel_task. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-04-20T20:49:30Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8235cb9a-1cba-4237-8c5b-a057e9094fda

📥 Commits

Reviewing files that changed from the base of the PR and between 1314609 and 02253ba.

📒 Files selected for processing (5)

bindu/common/protocol/types.py
bindu/penguin/config_validator.py
bindu/server/workers/base.py
tests/unit/penguin/test_config_validator.py
tests/unit/server/workers/test_manifest_worker.py

✅ Files skipped from review due to trivial changes (1)

tests/unit/penguin/test_config_validator.py

🚧 Files skipped from review as they are similar to previous changes (2)

bindu/common/protocol/types.py
bindu/server/workers/base.py

📝 Walkthrough

Walkthrough

Added a Pydantic model AgentTrustConfig for validating the agent_trust config and wired schema enforcement into ConfigValidator. Implemented concrete _handle_pause and _handle_resume in the base Worker, updating task states in storage and re-queuing suspended tasks via the scheduler. Tests added for both features.

Changes

Cohort / File(s)	Summary
Agent Trust Schema & Validator `bindu/common/protocol/types.py`, `bindu/penguin/config_validator.py`	Added `AgentTrustConfig` (Pydantic) mirroring the trust TypedDict and integrated validation in `ConfigValidator._validate_agent_trust_config`, which raises `ValueError` on invalid `agent_trust` inputs.
Agent Trust Tests `tests/unit/penguin/test_config_validator.py`	New `TestAgentTrustValidation` suite covering valid `agent_trust` cases, required fields, type constraints, and failure modes (non-dict, missing/unknown `identity_provider`, bad types, invalid `allowed_operations`).
Worker Pause/Resume Implementation `bindu/server/workers/base.py`	Implemented `_handle_pause` to validate pauseable states, normalize `task_id`, load the task and set state to `"suspended"` when appropriate; implemented `_handle_resume` to normalize `task_id`, restore a suspended task to `"submitted"`, build `TaskSendParams`, and call `scheduler.run_task`, rolling back to `"suspended"` on enqueue failure.
Pause/Resume Tests `tests/unit/server/workers/test_manifest_worker.py`	New `TestWorkerPauseResume` suite verifying storage updates and scheduler interactions for pausing/resuming, no-ops for terminal/unknown states, and rollback behavior when `scheduler.run_task` raises.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Worker
    participant Storage
    participant Scheduler

    rect rgba(100, 150, 200, 0.5)
    Note over Client,Scheduler: Pause Flow
    Client->>Worker: _handle_pause(task_id)
    Worker->>Storage: load_task(task_id)
    Storage-->>Worker: task (non-terminal state)
    Worker->>Storage: update_task(state="suspended")
    Storage-->>Worker: confirmation
    Worker-->>Client: pause complete (logged)
    end

    rect rgba(200, 150, 100, 0.5)
    Note over Client,Scheduler: Resume Flow
    Client->>Worker: _handle_resume(task_id)
    Worker->>Storage: load_task(task_id)
    Storage-->>Worker: task (state="suspended")
    Worker->>Storage: update_task(state="submitted")
    Storage-->>Worker: confirmation
    Worker->>Scheduler: run_task(task_id, context_id)
    Scheduler-->>Worker: execution queued / error
    alt success
        Worker-->>Client: resume complete (logged)
    else failure
        Worker->>Storage: update_task(state="suspended")  %% rollback
        Storage-->>Worker: confirmation
        Worker-->>Client: error re-raised
    end
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 I nibble schemas in the night,
I guard the trust with gentle bite,
Pause then resume — hops in a line,
Storage, scheduler, all align,
Tests hop by to prove it's right.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 53.57% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately summarizes the two main changes: agent_trust schema validation (bug fix) and task pause/resume implementation (feature).
Description check	✅ Passed	The description comprehensively covers all required template sections: summary, change type, scope, linked issues, behavior changes, security impact, verification, compatibility, and risks.
Linked Issues check	✅ Passed	All coding requirements from `#382` and `#383` are met: AgentTrustConfig model added with schema validation, ConfigValidator updated with _validate_agent_trust_config, Worker._handle_pause and _handle_resume implemented with unit tests.
Out of Scope Changes check	✅ Passed	All changes are directly scoped to `#382` and `#383` requirements: agent_trust validation, pause/resume implementation, and supporting tests. No extraneous modifications detected.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (6)

bindu/penguin/config_validator.py (1)
299-306: LGTM — minor: narrow the exception.

Wrapping pydantic validation into a ValueError with agent_trust in the message is exactly what issue #382 asks for, and from exc preserves the chain. Catching bare Exception is broader than needed; pydantic.ValidationError (plus TypeError for bad kwargs) would be more precise and avoid masking unrelated bugs in future refactors.
Proposed change
-        try:
-            AgentTrustConfig(**trust_config)
-        except Exception as exc:
-            raise ValueError(f"Invalid 'agent_trust' configuration: {exc}") from exc
+        try:
+            AgentTrustConfig(**trust_config)
+        except (pydantic.ValidationError, TypeError) as exc:
+            raise ValueError(f"Invalid 'agent_trust' configuration: {exc}") from exc
(requires import pydantic at top.)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bindu/penguin/config_validator.py` around lines 299 - 306, The except block
in _validate_agent_trust_config currently catches a bare Exception; narrow it to
pydantic.ValidationError and TypeError so only validation/argument errors from
AgentTrustConfig(**trust_config) are converted to the ValueError. Update the
except clause to catch pydantic.ValidationError and TypeError, keep the raise
ValueError(f"Invalid 'agent_trust' configuration: {exc}") from exc, and add the
required pydantic import if not already present.
tests/unit/penguin/test_config_validator.py (2)
100-113: Test would pass even if validation drops fields — assert on a refetch.

result["agent_trust"] == trust currently checks the stored dict, but _process_complex_fields does not replace agent_trust with the validated model (it only validates for side-effects), so this assertion only proves the original dict was preserved, not that AgentTrustConfig accepted every field. Consider also asserting via AgentTrustConfig(**trust) directly (or adding one test that passes an unknown field to confirm it is rejected — which it currently isn't; see related comment on AgentTrustConfig).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/penguin/test_config_validator.py` around lines 100 - 113, The test
test_valid_agent_trust_with_all_fields only compares the original dict to
result["agent_trust"], which doesn't prove validation preserved/accepted all
fields because _process_complex_fields only validates for side-effects; update
the test to assert the validated model by constructing AgentTrustConfig from the
stored data (e.g., call AgentTrustConfig(**result["agent_trust"]) or
AgentTrustConfig(**trust>) to ensure instantiation succeeds) or alternatively
assert that AgentTrustConfig(**trust) equals the stored/returned model;
reference ConfigValidator.validate_and_process, _process_complex_fields and
AgentTrustConfig when making the change.
79-86: Address Ruff RUF012 on BASE_CONFIG.

Ruff flags the mutable class attribute. Annotate as ClassVar or convert to a helper/fixture so it's clear this is immutable shared test data. Also note that _config_with_trust uses {**self.BASE_CONFIG, ...} which is only a shallow copy — the nested deployment dict is shared across tests; fine today since none mutate it, but worth a copy.deepcopy or fixture if tests ever start mutating.
Proposed change
-class TestAgentTrustValidation:
-    """Tests for agent_trust configuration validation."""
-
-    BASE_CONFIG = {
-        "author": "test@example.com",
-        "name": "TestAgent",
-        "deployment": {"url": "http://localhost:3773"},
-    }
+class TestAgentTrustValidation:
+    """Tests for agent_trust configuration validation."""
+
+    BASE_CONFIG: ClassVar[dict] = {
+        "author": "test@example.com",
+        "name": "TestAgent",
+        "deployment": {"url": "http://localhost:3773"},
+    }
(add from typing import ClassVar)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/penguin/test_config_validator.py` around lines 79 - 86,
BASE_CONFIG is a mutable class attribute and should be annotated as a ClassVar
to satisfy Ruff RUF012 and make intent explicit; update the test class to import
ClassVar from typing and declare BASE_CONFIG: ClassVar[dict] = {...}. Also
change _config_with_trust to return a deep copy of BASE_CONFIG before updating
agent_trust (e.g., use copy.deepcopy) to avoid sharing the nested deployment
dict between tests, or alternatively convert BASE_CONFIG into a fixture/helper
that yields a fresh dict per test; update references to BASE_CONFIG and
_config_with_trust accordingly.
bindu/common/protocol/types.py (2)
1684-1700: Consider extra="forbid" to catch typos in user config.

AgentTrustConfig inherits A2A_MODEL_CONFIG which doesn't set extra. Since this model is described as "config-layer validation" (distinct from the wire AgentTrust TypedDict) whose purpose is to reject invalid agent_trust dicts, it should reject unknown keys — otherwise {"identity_provider": "hydra", "identityProvidr": "custom"} or any misspelled optional field silently passes and gets dropped, defeating much of the validation promise of issue #382.

Additionally, because A2A_MODEL_CONFIG sets alias_generator=to_camel, populate_by_name=True, callers can also supply camelCase keys (identityProvider, inheritedRoles, ...) for a field that is otherwise specified in snake_case throughout the config surface. If the intent is config-layer (snake_case only), consider a dedicated ConfigDict(extra="forbid") here instead of reusing the wire-format config.
Proposed change
-    model_config = A2A_MODEL_CONFIG
+    model_config = pydantic.ConfigDict(extra="forbid")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bindu/common/protocol/types.py` around lines 1684 - 1700, AgentTrustConfig
currently inherits A2A_MODEL_CONFIG (which sets
alias_generator=to_camel/populate_by_name) and does not forbid extra fields, so
misspelled or camelCase keys can be accepted silently; update AgentTrustConfig
to use a dedicated Pydantic config that sets extra="forbid" (and remove or
override alias_generator/populate_by_name if you want to restrict to snake_case)
so unknown keys are rejected — modify the model_config used by AgentTrustConfig
(or set model_config = {"extra": "forbid", ...} on the class) to enforce
extra="forbid" while preserving any other needed settings.
1694-1694: inherited_roles is looser than the wire-format type.

Wire-format AgentTrust.inherited_roles is List[KeycloakRole], but this config model accepts List[Dict[str, Any]] with no nested validation. Issue #382 asks for "nested/type validation for trust config fields", so consider validating role entries (e.g. List[KeycloakRole] or a dedicated config role model) rather than untyped dicts. If intentional (to defer role materialization), a comment noting this would help.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bindu/common/protocol/types.py` at line 1694, The inherited_roles field on
AgentTrust is declared too loosely as List[Dict[str, Any]]; change it to
validate against the wire-format type by using List[KeycloakRole] (or introduce
a dedicated config model like KeycloakRoleConfig and use
List[KeycloakRoleConfig]) and add appropriate pydantic/type validation so each
role entry is validated rather than an untyped dict; if deferring
materialization is intentional, add a clear inline comment on
AgentTrust.inherited_roles explaining why dicts are used and where/when they
will be validated or converted.
tests/unit/server/workers/test_manifest_worker.py (1)
804-833: Add coverage for the auth-required pause path.

The implementation allows pausing "auth-required" tasks, but these tests only cover "working", "submitted", and "input-required". Parametrize the pauseable-state test and include "auth-required" so auth-gated tasks don’t regress.
Test refactor sketch
+    `@pytest.mark.parametrize`("state", ["submitted", "input-required", "auth-required"])
     `@pytest.mark.asyncio`
-    async def test_pause_submitted_task_transitions_to_suspended(self):
+    async def test_pause_pauseable_task_transitions_to_suspended(self, state):
         worker, mock_storage, _ = self._make_worker()
         task_id = uuid4()
-        mock_storage.load_task.return_value = self._make_task(task_id, uuid4(), "submitted")
+        mock_storage.load_task.return_value = self._make_task(task_id, uuid4(), state)
 
         await worker._handle_pause({"task_id": task_id})
 
         mock_storage.update_task.assert_called_once_with(task_id, state="suspended")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/server/workers/test_manifest_worker.py` around lines 804 - 833,
The three tests testing pause transitions
(test_pause_working_task_transitions_to_suspended,
test_pause_submitted_task_transitions_to_suspended,
test_pause_input_required_task_transitions_to_suspended) should be combined or
refactored into a single parametrized pytest async test that iterates over
pauseable states including "working", "submitted", "input-required", and
"auth-required"; ensure the test still constructs the worker via
self._make_worker(), sets mock_storage.load_task.return_value =
self._make_task(task_id, uuid4(), state), calls await
worker._handle_pause({"task_id": task_id}), and asserts
mock_storage.update_task.assert_called_once_with(task_id, state="suspended") for
each state.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@bindu/server/workers/base.py`:
- Around line 242-263: The handler _handle_pause currently allows pausing tasks
in "working" via the _PAUSEABLE_STATES set but only updates storage, risking
background execution continuing; either remove "working" from _PAUSEABLE_STATES
to prevent marking actively running tasks suspended until cooperative suspension
is implemented, or implement a cooperative suspension/cancellation handshake in
run_task (e.g., send a cancellation/suspend signal to the running executor and
wait for acknowledgement) and only call storage.update_task(task_id,
state="suspended") after the running task confirms it has stopped; reference
_PAUSEABLE_STATES, _handle_pause, run_task, and storage.update_task when making
the change.
- Around line 274-292: The resume flow currently sets state to "submitted"
before enqueueing which can leave tasks incorrectly transitioned if
scheduler.run_task fails; modify resume in the function that calls
_normalize_uuid/load_task (the resume handler in base.py) so the state
transition is atomic: either perform a storage compare-and-set/transactional
update (e.g., update_task with an expected previous state of "suspended") or
delay changing state until after scheduler.run_task succeeds and on any enqueue
error rollback to "suspended" (or use an outbox write + background enqueue to
ensure durability); reference and update the logic around load_task,
update_task, and scheduler.run_task to implement the CAS/rollback or outbox
pattern so resume never leaves a task in "submitted" unless it is durably
queued.

---

Nitpick comments:
In `@bindu/common/protocol/types.py`:
- Around line 1684-1700: AgentTrustConfig currently inherits A2A_MODEL_CONFIG
(which sets alias_generator=to_camel/populate_by_name) and does not forbid extra
fields, so misspelled or camelCase keys can be accepted silently; update
AgentTrustConfig to use a dedicated Pydantic config that sets extra="forbid"
(and remove or override alias_generator/populate_by_name if you want to restrict
to snake_case) so unknown keys are rejected — modify the model_config used by
AgentTrustConfig (or set model_config = {"extra": "forbid", ...} on the class)
to enforce extra="forbid" while preserving any other needed settings.
- Line 1694: The inherited_roles field on AgentTrust is declared too loosely as
List[Dict[str, Any]]; change it to validate against the wire-format type by
using List[KeycloakRole] (or introduce a dedicated config model like
KeycloakRoleConfig and use List[KeycloakRoleConfig]) and add appropriate
pydantic/type validation so each role entry is validated rather than an untyped
dict; if deferring materialization is intentional, add a clear inline comment on
AgentTrust.inherited_roles explaining why dicts are used and where/when they
will be validated or converted.

In `@bindu/penguin/config_validator.py`:
- Around line 299-306: The except block in _validate_agent_trust_config
currently catches a bare Exception; narrow it to pydantic.ValidationError and
TypeError so only validation/argument errors from
AgentTrustConfig(**trust_config) are converted to the ValueError. Update the
except clause to catch pydantic.ValidationError and TypeError, keep the raise
ValueError(f"Invalid 'agent_trust' configuration: {exc}") from exc, and add the
required pydantic import if not already present.

In `@tests/unit/penguin/test_config_validator.py`:
- Around line 100-113: The test test_valid_agent_trust_with_all_fields only
compares the original dict to result["agent_trust"], which doesn't prove
validation preserved/accepted all fields because _process_complex_fields only
validates for side-effects; update the test to assert the validated model by
constructing AgentTrustConfig from the stored data (e.g., call
AgentTrustConfig(**result["agent_trust"]) or AgentTrustConfig(**trust>) to
ensure instantiation succeeds) or alternatively assert that
AgentTrustConfig(**trust) equals the stored/returned model; reference
ConfigValidator.validate_and_process, _process_complex_fields and
AgentTrustConfig when making the change.
- Around line 79-86: BASE_CONFIG is a mutable class attribute and should be
annotated as a ClassVar to satisfy Ruff RUF012 and make intent explicit; update
the test class to import ClassVar from typing and declare BASE_CONFIG:
ClassVar[dict] = {...}. Also change _config_with_trust to return a deep copy of
BASE_CONFIG before updating agent_trust (e.g., use copy.deepcopy) to avoid
sharing the nested deployment dict between tests, or alternatively convert
BASE_CONFIG into a fixture/helper that yields a fresh dict per test; update
references to BASE_CONFIG and _config_with_trust accordingly.

In `@tests/unit/server/workers/test_manifest_worker.py`:
- Around line 804-833: The three tests testing pause transitions
(test_pause_working_task_transitions_to_suspended,
test_pause_submitted_task_transitions_to_suspended,
test_pause_input_required_task_transitions_to_suspended) should be combined or
refactored into a single parametrized pytest async test that iterates over
pauseable states including "working", "submitted", "input-required", and
"auth-required"; ensure the test still constructs the worker via
self._make_worker(), sets mock_storage.load_task.return_value =
self._make_task(task_id, uuid4(), state), calls await
worker._handle_pause({"task_id": task_id}), and asserts
mock_storage.update_task.assert_called_once_with(task_id, state="suspended") for
each state.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: bb088941-3b37-4927-8c08-f2fcc6009332

📥 Commits

Reviewing files that changed from the base of the PR and between bc1c015 and 1314609.

📒 Files selected for processing (5)

bindu/common/protocol/types.py
bindu/penguin/config_validator.py
bindu/server/workers/base.py
tests/unit/penguin/test_config_validator.py
tests/unit/server/workers/test_manifest_worker.py

coderabbitai · 2026-04-20T20:54:55Z

+    _PAUSEABLE_STATES = frozenset({"submitted", "working", "input-required", "auth-required"})
+
    async def _handle_pause(self, params: TaskIdParams) -> None:
-        """Handle pause operation.
+        """Pause a task by transitioning it to the 'suspended' state.

-        TODO: Implement task pause functionality
-        - Save current execution state
-        - Update task to 'suspended' state
-        - Release resources while preserving context
+        No-ops silently if the task is not found or already in a non-pauseable
+        state (terminal or already suspended), so callers do not need to guard.
        """
-        raise NotImplementedError("Pause operation not yet implemented")
+        task_id = self._normalize_uuid(params["task_id"])
+        task = await self.storage.load_task(task_id)
+        if task is None:
+            logger.warning(f"Pause requested for unknown task {task_id}")
+            return
+
+        current_state = task["status"]["state"]
+        if current_state not in self._PAUSEABLE_STATES:
+            logger.warning(
+                f"Cannot pause task {task_id}: state '{current_state}' is not pauseable"
+            )
+            return
+
+        await self.storage.update_task(task_id, state="suspended")


⚠️ Potential issue | 🟠 Major

Don’t mark a working task suspended without stopping execution.

_handle_pause only rewrites storage state. If the task is actively executing, the worker/agent can continue and later overwrite "suspended" with "completed"/"failed", so pause appears successful while work continues. Either remove "working" until cooperative suspension/checkpointing is wired into run_task, or add that mechanism before setting "suspended".

Minimal safety direction

- _PAUSEABLE_STATES = frozenset({"submitted", "working", "input-required", "auth-required"}) + _PAUSEABLE_STATES = frozenset({"submitted", "input-required", "auth-required"})

If working pause support is required in this PR, the status update should be paired with a cancellation/suspension signal that running task execution observes before it can write a terminal state.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

_PAUSEABLE_STATES = frozenset({"submitted", "working", "input-required", "auth-required"})

async def _handle_pause(self, params: TaskIdParams) -> None:

"""Handle pause operation.

"""Pause a task by transitioning it to the 'suspended' state.

TODO: Implement task pause functionality

- Save current execution state

- Update task to 'suspended' state

- Release resources while preserving context

No-ops silently if the task is not found or already in a non-pauseable

state (terminal or already suspended), so callers do not need to guard.

"""

raise NotImplementedError("Pause operation not yet implemented")

task_id = self._normalize_uuid(params["task_id"])

task = await self.storage.load_task(task_id)

if task is None:

logger.warning(f"Pause requested for unknown task {task_id}")

return

current_state = task["status"]["state"]

if current_state not in self._PAUSEABLE_STATES:

logger.warning(

f"Cannot pause task {task_id}: state '{current_state}' is not pauseable"

)

return

await self.storage.update_task(task_id, state="suspended")

_PAUSEABLE_STATES = frozenset({"submitted", "input-required", "auth-required"})

async def _handle_pause(self, params: TaskIdParams) -> None:

"""Pause a task by transitioning it to the 'suspended' state.

No-ops silently if the task is not found or already in a non-pauseable

state (terminal or already suspended), so callers do not need to guard.

"""

task_id = self._normalize_uuid(params["task_id"])

task = await self.storage.load_task(task_id)

if task is None:

logger.warning(f"Pause requested for unknown task {task_id}")

return

current_state = task["status"]["state"]

if current_state not in self._PAUSEABLE_STATES:

logger.warning(

f"Cannot pause task {task_id}: state '{current_state}' is not pauseable"

)

return

await self.storage.update_task(task_id, state="suspended")

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@bindu/server/workers/base.py` around lines 242 - 263, The handler _handle_pause currently allows pausing tasks in "working" via the _PAUSEABLE_STATES set but only updates storage, risking background execution continuing; either remove "working" from _PAUSEABLE_STATES to prevent marking actively running tasks suspended until cooperative suspension is implemented, or implement a cooperative suspension/cancellation handshake in run_task (e.g., send a cancellation/suspend signal to the running executor and wait for acknowledgement) and only call storage.update_task(task_id, state="suspended") after the running task confirms it has stopped; reference _PAUSEABLE_STATES, _handle_pause, run_task, and storage.update_task when making the change.

- Narrow except clause in _validate_agent_trust_config to pydantic.ValidationError and TypeError (was bare Exception) - Add docstring to _validate_agent_trust_config - Set extra="forbid" on AgentTrustConfig so unknown/camelCase keys are rejected rather than silently dropped - Annotate BASE_CONFIG as ClassVar and use deepcopy in _config_with_trust to prevent test state leaking across runs - Replace all-fields equality check with AgentTrustConfig round-trip to prove validation preserves field values - Collapse three identical pause-state tests into one @parametrize test covering all four pauseable states - Roll back task to "suspended" if scheduler.run_task fails during resume so the task is never stranded in submitted state without a worker Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

The-Vaibhav-Yadav · 2026-04-21T05:00:18Z

Addressed all review feedback in the latest commit — narrowed exception types, added extra="forbid", ClassVar annotation, parametrized tests, and resume rollback.

The-Vaibhav-Yadav and others added 2 commits April 21, 2026 02:08

coderabbitai bot reviewed Apr 20, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: agent_trust schema validation + feat: task pause/resume#490

fix: agent_trust schema validation + feat: task pause/resume#490
The-Vaibhav-Yadav wants to merge 3 commits intoGetBindu:mainfrom
The-Vaibhav-Yadav:main

The-Vaibhav-Yadav commented Apr 20, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Apr 20, 2026 •

edited

Loading

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Apr 20, 2026

Uh oh!

Uh oh!

The-Vaibhav-Yadav commented Apr 21, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

The-Vaibhav-Yadav commented Apr 20, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Apr 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Apr 20, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

The-Vaibhav-Yadav commented Apr 21, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

The-Vaibhav-Yadav commented Apr 20, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Apr 20, 2026 •

edited

Loading