Skip to content

fix: agent_trust schema validation + feat: task pause/resume#490

Open
The-Vaibhav-Yadav wants to merge 3 commits intoGetBindu:mainfrom
The-Vaibhav-Yadav:main
Open

fix: agent_trust schema validation + feat: task pause/resume#490
The-Vaibhav-Yadav wants to merge 3 commits intoGetBindu:mainfrom
The-Vaibhav-Yadav:main

Conversation

@The-Vaibhav-Yadav
Copy link
Copy Markdown

@The-Vaibhav-Yadav The-Vaibhav-Yadav commented Apr 20, 2026

Summary

  • Problem: agent_trust accepted any arbitrary dict with no type or schema enforcement, allowing silent misconfigurations in production. _handle_pause and
    _handle_resume in Worker raised NotImplementedError, blocking any caller that issued pause/resume operations.
  • Why it matters: Invalid trust configs could reach runtime code unchecked. Unimplemented pause/resume caused unhandled exceptions that marked tasks as
    failed instead of suspending them.
  • What changed: Added AgentTrustConfig Pydantic model and wired schema validation into ConfigValidator. Implemented _handle_pause (→ suspended) and
    _handle_resume (→ submitted + re-queue) in the base Worker.
  • What did NOT change: Wire format (AgentTrust TypedDict), scheduler implementations, storage backends, ManifestWorker execution logic, and all existing
    behavior.

Change Type (select all that apply)

Scope:

  • Scheduler backends — _handle_pause/_handle_resume interact with the scheduler to re-queue on resume
  • Authentication / authorization — AgentTrustConfig validates the trust/identity config layer
  • Tests — new unit tests in both test_config_validator.py and test_manifest_worker.py

Linked Issue/PR

User-Visible / Behavior Changes

  • Passing an invalid agent_trust dict to bindufy(config=...) now raises ValueError with a descriptive message instead of silently proceeding.
  • Agents that receive a pause scheduler operation now correctly transition to suspended instead of raising an exception and marking the task failed.
  • Agents that receive a resume scheduler operation now re-queue the task for execution instead of raising an exception.

Security Impact (required)

  • New permissions/capabilities? No
  • Secrets/credentials handling changed? No
  • New/changed network calls? No
  • Database schema/migration changes? No
  • Authentication/authorization changes? No

Verification

Environment

  • OS: macOS Darwin 25.3.0
  • Python version: 3.13.7
  • Storage backend: memory (unit tests)
  • Scheduler backend: memory (unit tests)

Steps to Test

  1. uv run pytest tests/unit/penguin/test_config_validator.py -v
  2. uv run pytest tests/unit/server/workers/test_manifest_worker.py::TestWorkerPauseResume -v
  3. uv run pytest tests/unit/ -q

Expected Behavior

  • All 845 unit tests pass, 0 failures.

Actual Behavior

  • All 845 unit tests pass, 0 failures.

Evidence (attach at least one)

  • Failing test before + passing after
  • Test output / logs

tests/unit/penguin/test_config_validator.py::TestAgentTrustValidation::test_valid_agent_trust_hydra PASSED
tests/unit/penguin/test_config_validator.py::TestAgentTrustValidation::test_agent_trust_missing_identity_provider_raises PASSED
tests/unit/penguin/test_config_validator.py::TestAgentTrustValidation::test_agent_trust_invalid_identity_provider_raises PASSED
... (15 total)

tests/unit/server/workers/test_manifest_worker.py::TestWorkerPauseResume::test_pause_working_task_transitions_to_suspended PASSED
tests/unit/server/workers/test_manifest_worker.py::TestWorkerPauseResume::test_resume_suspended_task_resubmits PASSED
... (8 total)

845 passed, 3 skipped, 7 warnings

Human Verification (required)

  • Verified scenarios: valid trust config passes, invalid identity_provider raises, invalid allowed_operations TrustLevel raises, pause no-ops on terminal
    tasks, resume re-queues with correct task_id/context_id, both operations no-op on unknown task IDs.
  • Edge cases checked: agent_trust: None (default) is still accepted. Pausing an already-suspended task is a no-op. Resuming a non-suspended task is a no-op.
  • What you did NOT verify: end-to-end pause/resume over a live Redis scheduler or with a real storage backend.

Compatibility / Migration

  • Backward compatible? Yes — agent_trust: None (the default) is unchanged. All existing valid configs pass.
  • Config/env changes? No
  • Database migration needed? No

Failure Recovery (if this breaks)

  • How to disable/revert: revert bindu/penguin/config_validator.py to remove the _validate_agent_trust_config call; revert bindu/server/workers/base.py to
    restore the NotImplementedError stubs.
  • Known bad symptoms: startup ValueError on a previously-valid config means the user's agent_trust dict was malformed and was previously silently ignored.

Risks and Mitigations

  • Risk: A user who was passing a malformed agent_trust dict will now get a ValueError at startup.
    • Mitigation: The error message includes the exact validation failure from Pydantic, making it actionable. Backward-compatible for all correctly-formed
      configs.

Checklist

  • Tests pass (uv run pytest)
  • Pre-commit hooks pass (uv run pre-commit run --all-files)
  • Documentation updated (if needed)
  • Security impact assessed
  • Human verification completed
  • Backward compatibility considered

Summary by CodeRabbit

  • New Features

    • Enhanced agent trust configuration validation: typed fields, sensible defaults, and stricter rejection of unknown keys.
    • Implemented worker pause and resume operations to suspend and re-queue tasks safely.
  • Tests

    • Added unit tests covering agent trust validation scenarios.
    • Added unit tests covering worker pause/resume behavior, including requeue failure handling.

The-Vaibhav-Yadav and others added 2 commits April 21, 2026 02:08
Previously agent_trust was accepted as any arbitrary dict with no type
or schema checks, creating a silent misconfiguration surface.

Adds AgentTrustConfig (Pydantic BaseModel) in types.py to validate
identity_provider, allowed_operations TrustLevel values, and optional
field types. Wires _validate_agent_trust_config into ConfigValidator so
invalid trust configs raise ValueError with a descriptive message.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
_handle_pause transitions a task from any active state (submitted,
working, input-required, auth-required) to 'suspended'. _handle_resume
transitions it back to 'submitted' and re-queues a run_task operation
so the worker re-executes with the full stored message history.

Both methods no-op silently when the task is missing or in an
incompatible state, matching the defensive pattern used by cancel_task.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 20, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8235cb9a-1cba-4237-8c5b-a057e9094fda

📥 Commits

Reviewing files that changed from the base of the PR and between 1314609 and 02253ba.

📒 Files selected for processing (5)
  • bindu/common/protocol/types.py
  • bindu/penguin/config_validator.py
  • bindu/server/workers/base.py
  • tests/unit/penguin/test_config_validator.py
  • tests/unit/server/workers/test_manifest_worker.py
✅ Files skipped from review due to trivial changes (1)
  • tests/unit/penguin/test_config_validator.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • bindu/common/protocol/types.py
  • bindu/server/workers/base.py

📝 Walkthrough

Walkthrough

Added a Pydantic model AgentTrustConfig for validating the agent_trust config and wired schema enforcement into ConfigValidator. Implemented concrete _handle_pause and _handle_resume in the base Worker, updating task states in storage and re-queuing suspended tasks via the scheduler. Tests added for both features.

Changes

Cohort / File(s) Summary
Agent Trust Schema & Validator
bindu/common/protocol/types.py, bindu/penguin/config_validator.py
Added AgentTrustConfig (Pydantic) mirroring the trust TypedDict and integrated validation in ConfigValidator._validate_agent_trust_config, which raises ValueError on invalid agent_trust inputs.
Agent Trust Tests
tests/unit/penguin/test_config_validator.py
New TestAgentTrustValidation suite covering valid agent_trust cases, required fields, type constraints, and failure modes (non-dict, missing/unknown identity_provider, bad types, invalid allowed_operations).
Worker Pause/Resume Implementation
bindu/server/workers/base.py
Implemented _handle_pause to validate pauseable states, normalize task_id, load the task and set state to "suspended" when appropriate; implemented _handle_resume to normalize task_id, restore a suspended task to "submitted", build TaskSendParams, and call scheduler.run_task, rolling back to "suspended" on enqueue failure.
Pause/Resume Tests
tests/unit/server/workers/test_manifest_worker.py
New TestWorkerPauseResume suite verifying storage updates and scheduler interactions for pausing/resuming, no-ops for terminal/unknown states, and rollback behavior when scheduler.run_task raises.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Worker
    participant Storage
    participant Scheduler

    rect rgba(100, 150, 200, 0.5)
    Note over Client,Scheduler: Pause Flow
    Client->>Worker: _handle_pause(task_id)
    Worker->>Storage: load_task(task_id)
    Storage-->>Worker: task (non-terminal state)
    Worker->>Storage: update_task(state="suspended")
    Storage-->>Worker: confirmation
    Worker-->>Client: pause complete (logged)
    end

    rect rgba(200, 150, 100, 0.5)
    Note over Client,Scheduler: Resume Flow
    Client->>Worker: _handle_resume(task_id)
    Worker->>Storage: load_task(task_id)
    Storage-->>Worker: task (state="suspended")
    Worker->>Storage: update_task(state="submitted")
    Storage-->>Worker: confirmation
    Worker->>Scheduler: run_task(task_id, context_id)
    Scheduler-->>Worker: execution queued / error
    alt success
        Worker-->>Client: resume complete (logged)
    else failure
        Worker->>Storage: update_task(state="suspended")  %% rollback
        Storage-->>Worker: confirmation
        Worker-->>Client: error re-raised
    end
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 I nibble schemas in the night,
I guard the trust with gentle bite,
Pause then resume — hops in a line,
Storage, scheduler, all align,
Tests hop by to prove it's right.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 53.57% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the two main changes: agent_trust schema validation (bug fix) and task pause/resume implementation (feature).
Description check ✅ Passed The description comprehensively covers all required template sections: summary, change type, scope, linked issues, behavior changes, security impact, verification, compatibility, and risks.
Linked Issues check ✅ Passed All coding requirements from #382 and #383 are met: AgentTrustConfig model added with schema validation, ConfigValidator updated with _validate_agent_trust_config, Worker._handle_pause and _handle_resume implemented with unit tests.
Out of Scope Changes check ✅ Passed All changes are directly scoped to #382 and #383 requirements: agent_trust validation, pause/resume implementation, and supporting tests. No extraneous modifications detected.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (6)
bindu/penguin/config_validator.py (1)

299-306: LGTM — minor: narrow the exception.

Wrapping pydantic validation into a ValueError with agent_trust in the message is exactly what issue #382 asks for, and from exc preserves the chain. Catching bare Exception is broader than needed; pydantic.ValidationError (plus TypeError for bad kwargs) would be more precise and avoid masking unrelated bugs in future refactors.

Proposed change
-        try:
-            AgentTrustConfig(**trust_config)
-        except Exception as exc:
-            raise ValueError(f"Invalid 'agent_trust' configuration: {exc}") from exc
+        try:
+            AgentTrustConfig(**trust_config)
+        except (pydantic.ValidationError, TypeError) as exc:
+            raise ValueError(f"Invalid 'agent_trust' configuration: {exc}") from exc

(requires import pydantic at top.)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bindu/penguin/config_validator.py` around lines 299 - 306, The except block
in _validate_agent_trust_config currently catches a bare Exception; narrow it to
pydantic.ValidationError and TypeError so only validation/argument errors from
AgentTrustConfig(**trust_config) are converted to the ValueError. Update the
except clause to catch pydantic.ValidationError and TypeError, keep the raise
ValueError(f"Invalid 'agent_trust' configuration: {exc}") from exc, and add the
required pydantic import if not already present.
tests/unit/penguin/test_config_validator.py (2)

100-113: Test would pass even if validation drops fields — assert on a refetch.

result["agent_trust"] == trust currently checks the stored dict, but _process_complex_fields does not replace agent_trust with the validated model (it only validates for side-effects), so this assertion only proves the original dict was preserved, not that AgentTrustConfig accepted every field. Consider also asserting via AgentTrustConfig(**trust) directly (or adding one test that passes an unknown field to confirm it is rejected — which it currently isn't; see related comment on AgentTrustConfig).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/penguin/test_config_validator.py` around lines 100 - 113, The test
test_valid_agent_trust_with_all_fields only compares the original dict to
result["agent_trust"], which doesn't prove validation preserved/accepted all
fields because _process_complex_fields only validates for side-effects; update
the test to assert the validated model by constructing AgentTrustConfig from the
stored data (e.g., call AgentTrustConfig(**result["agent_trust"]) or
AgentTrustConfig(**trust>) to ensure instantiation succeeds) or alternatively
assert that AgentTrustConfig(**trust) equals the stored/returned model;
reference ConfigValidator.validate_and_process, _process_complex_fields and
AgentTrustConfig when making the change.

79-86: Address Ruff RUF012 on BASE_CONFIG.

Ruff flags the mutable class attribute. Annotate as ClassVar or convert to a helper/fixture so it's clear this is immutable shared test data. Also note that _config_with_trust uses {**self.BASE_CONFIG, ...} which is only a shallow copy — the nested deployment dict is shared across tests; fine today since none mutate it, but worth a copy.deepcopy or fixture if tests ever start mutating.

Proposed change
-class TestAgentTrustValidation:
-    """Tests for agent_trust configuration validation."""
-
-    BASE_CONFIG = {
-        "author": "test@example.com",
-        "name": "TestAgent",
-        "deployment": {"url": "http://localhost:3773"},
-    }
+class TestAgentTrustValidation:
+    """Tests for agent_trust configuration validation."""
+
+    BASE_CONFIG: ClassVar[dict] = {
+        "author": "test@example.com",
+        "name": "TestAgent",
+        "deployment": {"url": "http://localhost:3773"},
+    }

(add from typing import ClassVar)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/penguin/test_config_validator.py` around lines 79 - 86,
BASE_CONFIG is a mutable class attribute and should be annotated as a ClassVar
to satisfy Ruff RUF012 and make intent explicit; update the test class to import
ClassVar from typing and declare BASE_CONFIG: ClassVar[dict] = {...}. Also
change _config_with_trust to return a deep copy of BASE_CONFIG before updating
agent_trust (e.g., use copy.deepcopy) to avoid sharing the nested deployment
dict between tests, or alternatively convert BASE_CONFIG into a fixture/helper
that yields a fresh dict per test; update references to BASE_CONFIG and
_config_with_trust accordingly.
bindu/common/protocol/types.py (2)

1684-1700: Consider extra="forbid" to catch typos in user config.

AgentTrustConfig inherits A2A_MODEL_CONFIG which doesn't set extra. Since this model is described as "config-layer validation" (distinct from the wire AgentTrust TypedDict) whose purpose is to reject invalid agent_trust dicts, it should reject unknown keys — otherwise {"identity_provider": "hydra", "identityProvidr": "custom"} or any misspelled optional field silently passes and gets dropped, defeating much of the validation promise of issue #382.

Additionally, because A2A_MODEL_CONFIG sets alias_generator=to_camel, populate_by_name=True, callers can also supply camelCase keys (identityProvider, inheritedRoles, ...) for a field that is otherwise specified in snake_case throughout the config surface. If the intent is config-layer (snake_case only), consider a dedicated ConfigDict(extra="forbid") here instead of reusing the wire-format config.

Proposed change
-    model_config = A2A_MODEL_CONFIG
+    model_config = pydantic.ConfigDict(extra="forbid")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bindu/common/protocol/types.py` around lines 1684 - 1700, AgentTrustConfig
currently inherits A2A_MODEL_CONFIG (which sets
alias_generator=to_camel/populate_by_name) and does not forbid extra fields, so
misspelled or camelCase keys can be accepted silently; update AgentTrustConfig
to use a dedicated Pydantic config that sets extra="forbid" (and remove or
override alias_generator/populate_by_name if you want to restrict to snake_case)
so unknown keys are rejected — modify the model_config used by AgentTrustConfig
(or set model_config = {"extra": "forbid", ...} on the class) to enforce
extra="forbid" while preserving any other needed settings.

1694-1694: inherited_roles is looser than the wire-format type.

Wire-format AgentTrust.inherited_roles is List[KeycloakRole], but this config model accepts List[Dict[str, Any]] with no nested validation. Issue #382 asks for "nested/type validation for trust config fields", so consider validating role entries (e.g. List[KeycloakRole] or a dedicated config role model) rather than untyped dicts. If intentional (to defer role materialization), a comment noting this would help.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bindu/common/protocol/types.py` at line 1694, The inherited_roles field on
AgentTrust is declared too loosely as List[Dict[str, Any]]; change it to
validate against the wire-format type by using List[KeycloakRole] (or introduce
a dedicated config model like KeycloakRoleConfig and use
List[KeycloakRoleConfig]) and add appropriate pydantic/type validation so each
role entry is validated rather than an untyped dict; if deferring
materialization is intentional, add a clear inline comment on
AgentTrust.inherited_roles explaining why dicts are used and where/when they
will be validated or converted.
tests/unit/server/workers/test_manifest_worker.py (1)

804-833: Add coverage for the auth-required pause path.

The implementation allows pausing "auth-required" tasks, but these tests only cover "working", "submitted", and "input-required". Parametrize the pauseable-state test and include "auth-required" so auth-gated tasks don’t regress.

Test refactor sketch
+    `@pytest.mark.parametrize`("state", ["submitted", "input-required", "auth-required"])
     `@pytest.mark.asyncio`
-    async def test_pause_submitted_task_transitions_to_suspended(self):
+    async def test_pause_pauseable_task_transitions_to_suspended(self, state):
         worker, mock_storage, _ = self._make_worker()
         task_id = uuid4()
-        mock_storage.load_task.return_value = self._make_task(task_id, uuid4(), "submitted")
+        mock_storage.load_task.return_value = self._make_task(task_id, uuid4(), state)
 
         await worker._handle_pause({"task_id": task_id})
 
         mock_storage.update_task.assert_called_once_with(task_id, state="suspended")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/server/workers/test_manifest_worker.py` around lines 804 - 833,
The three tests testing pause transitions
(test_pause_working_task_transitions_to_suspended,
test_pause_submitted_task_transitions_to_suspended,
test_pause_input_required_task_transitions_to_suspended) should be combined or
refactored into a single parametrized pytest async test that iterates over
pauseable states including "working", "submitted", "input-required", and
"auth-required"; ensure the test still constructs the worker via
self._make_worker(), sets mock_storage.load_task.return_value =
self._make_task(task_id, uuid4(), state), calls await
worker._handle_pause({"task_id": task_id}), and asserts
mock_storage.update_task.assert_called_once_with(task_id, state="suspended") for
each state.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@bindu/server/workers/base.py`:
- Around line 242-263: The handler _handle_pause currently allows pausing tasks
in "working" via the _PAUSEABLE_STATES set but only updates storage, risking
background execution continuing; either remove "working" from _PAUSEABLE_STATES
to prevent marking actively running tasks suspended until cooperative suspension
is implemented, or implement a cooperative suspension/cancellation handshake in
run_task (e.g., send a cancellation/suspend signal to the running executor and
wait for acknowledgement) and only call storage.update_task(task_id,
state="suspended") after the running task confirms it has stopped; reference
_PAUSEABLE_STATES, _handle_pause, run_task, and storage.update_task when making
the change.
- Around line 274-292: The resume flow currently sets state to "submitted"
before enqueueing which can leave tasks incorrectly transitioned if
scheduler.run_task fails; modify resume in the function that calls
_normalize_uuid/load_task (the resume handler in base.py) so the state
transition is atomic: either perform a storage compare-and-set/transactional
update (e.g., update_task with an expected previous state of "suspended") or
delay changing state until after scheduler.run_task succeeds and on any enqueue
error rollback to "suspended" (or use an outbox write + background enqueue to
ensure durability); reference and update the logic around load_task,
update_task, and scheduler.run_task to implement the CAS/rollback or outbox
pattern so resume never leaves a task in "submitted" unless it is durably
queued.

---

Nitpick comments:
In `@bindu/common/protocol/types.py`:
- Around line 1684-1700: AgentTrustConfig currently inherits A2A_MODEL_CONFIG
(which sets alias_generator=to_camel/populate_by_name) and does not forbid extra
fields, so misspelled or camelCase keys can be accepted silently; update
AgentTrustConfig to use a dedicated Pydantic config that sets extra="forbid"
(and remove or override alias_generator/populate_by_name if you want to restrict
to snake_case) so unknown keys are rejected — modify the model_config used by
AgentTrustConfig (or set model_config = {"extra": "forbid", ...} on the class)
to enforce extra="forbid" while preserving any other needed settings.
- Line 1694: The inherited_roles field on AgentTrust is declared too loosely as
List[Dict[str, Any]]; change it to validate against the wire-format type by
using List[KeycloakRole] (or introduce a dedicated config model like
KeycloakRoleConfig and use List[KeycloakRoleConfig]) and add appropriate
pydantic/type validation so each role entry is validated rather than an untyped
dict; if deferring materialization is intentional, add a clear inline comment on
AgentTrust.inherited_roles explaining why dicts are used and where/when they
will be validated or converted.

In `@bindu/penguin/config_validator.py`:
- Around line 299-306: The except block in _validate_agent_trust_config
currently catches a bare Exception; narrow it to pydantic.ValidationError and
TypeError so only validation/argument errors from
AgentTrustConfig(**trust_config) are converted to the ValueError. Update the
except clause to catch pydantic.ValidationError and TypeError, keep the raise
ValueError(f"Invalid 'agent_trust' configuration: {exc}") from exc, and add the
required pydantic import if not already present.

In `@tests/unit/penguin/test_config_validator.py`:
- Around line 100-113: The test test_valid_agent_trust_with_all_fields only
compares the original dict to result["agent_trust"], which doesn't prove
validation preserved/accepted all fields because _process_complex_fields only
validates for side-effects; update the test to assert the validated model by
constructing AgentTrustConfig from the stored data (e.g., call
AgentTrustConfig(**result["agent_trust"]) or AgentTrustConfig(**trust>) to
ensure instantiation succeeds) or alternatively assert that
AgentTrustConfig(**trust) equals the stored/returned model; reference
ConfigValidator.validate_and_process, _process_complex_fields and
AgentTrustConfig when making the change.
- Around line 79-86: BASE_CONFIG is a mutable class attribute and should be
annotated as a ClassVar to satisfy Ruff RUF012 and make intent explicit; update
the test class to import ClassVar from typing and declare BASE_CONFIG:
ClassVar[dict] = {...}. Also change _config_with_trust to return a deep copy of
BASE_CONFIG before updating agent_trust (e.g., use copy.deepcopy) to avoid
sharing the nested deployment dict between tests, or alternatively convert
BASE_CONFIG into a fixture/helper that yields a fresh dict per test; update
references to BASE_CONFIG and _config_with_trust accordingly.

In `@tests/unit/server/workers/test_manifest_worker.py`:
- Around line 804-833: The three tests testing pause transitions
(test_pause_working_task_transitions_to_suspended,
test_pause_submitted_task_transitions_to_suspended,
test_pause_input_required_task_transitions_to_suspended) should be combined or
refactored into a single parametrized pytest async test that iterates over
pauseable states including "working", "submitted", "input-required", and
"auth-required"; ensure the test still constructs the worker via
self._make_worker(), sets mock_storage.load_task.return_value =
self._make_task(task_id, uuid4(), state), calls await
worker._handle_pause({"task_id": task_id}), and asserts
mock_storage.update_task.assert_called_once_with(task_id, state="suspended") for
each state.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: bb088941-3b37-4927-8c08-f2fcc6009332

📥 Commits

Reviewing files that changed from the base of the PR and between bc1c015 and 1314609.

📒 Files selected for processing (5)
  • bindu/common/protocol/types.py
  • bindu/penguin/config_validator.py
  • bindu/server/workers/base.py
  • tests/unit/penguin/test_config_validator.py
  • tests/unit/server/workers/test_manifest_worker.py

Comment on lines +242 to +263
_PAUSEABLE_STATES = frozenset({"submitted", "working", "input-required", "auth-required"})

async def _handle_pause(self, params: TaskIdParams) -> None:
"""Handle pause operation.
"""Pause a task by transitioning it to the 'suspended' state.

TODO: Implement task pause functionality
- Save current execution state
- Update task to 'suspended' state
- Release resources while preserving context
No-ops silently if the task is not found or already in a non-pauseable
state (terminal or already suspended), so callers do not need to guard.
"""
raise NotImplementedError("Pause operation not yet implemented")
task_id = self._normalize_uuid(params["task_id"])
task = await self.storage.load_task(task_id)
if task is None:
logger.warning(f"Pause requested for unknown task {task_id}")
return

current_state = task["status"]["state"]
if current_state not in self._PAUSEABLE_STATES:
logger.warning(
f"Cannot pause task {task_id}: state '{current_state}' is not pauseable"
)
return

await self.storage.update_task(task_id, state="suspended")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Don’t mark a working task suspended without stopping execution.

_handle_pause only rewrites storage state. If the task is actively executing, the worker/agent can continue and later overwrite "suspended" with "completed"/"failed", so pause appears successful while work continues. Either remove "working" until cooperative suspension/checkpointing is wired into run_task, or add that mechanism before setting "suspended".

Minimal safety direction
-    _PAUSEABLE_STATES = frozenset({"submitted", "working", "input-required", "auth-required"})
+    _PAUSEABLE_STATES = frozenset({"submitted", "input-required", "auth-required"})

If working pause support is required in this PR, the status update should be paired with a cancellation/suspension signal that running task execution observes before it can write a terminal state.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
_PAUSEABLE_STATES = frozenset({"submitted", "working", "input-required", "auth-required"})
async def _handle_pause(self, params: TaskIdParams) -> None:
"""Handle pause operation.
"""Pause a task by transitioning it to the 'suspended' state.
TODO: Implement task pause functionality
- Save current execution state
- Update task to 'suspended' state
- Release resources while preserving context
No-ops silently if the task is not found or already in a non-pauseable
state (terminal or already suspended), so callers do not need to guard.
"""
raise NotImplementedError("Pause operation not yet implemented")
task_id = self._normalize_uuid(params["task_id"])
task = await self.storage.load_task(task_id)
if task is None:
logger.warning(f"Pause requested for unknown task {task_id}")
return
current_state = task["status"]["state"]
if current_state not in self._PAUSEABLE_STATES:
logger.warning(
f"Cannot pause task {task_id}: state '{current_state}' is not pauseable"
)
return
await self.storage.update_task(task_id, state="suspended")
_PAUSEABLE_STATES = frozenset({"submitted", "input-required", "auth-required"})
async def _handle_pause(self, params: TaskIdParams) -> None:
"""Pause a task by transitioning it to the 'suspended' state.
No-ops silently if the task is not found or already in a non-pauseable
state (terminal or already suspended), so callers do not need to guard.
"""
task_id = self._normalize_uuid(params["task_id"])
task = await self.storage.load_task(task_id)
if task is None:
logger.warning(f"Pause requested for unknown task {task_id}")
return
current_state = task["status"]["state"]
if current_state not in self._PAUSEABLE_STATES:
logger.warning(
f"Cannot pause task {task_id}: state '{current_state}' is not pauseable"
)
return
await self.storage.update_task(task_id, state="suspended")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bindu/server/workers/base.py` around lines 242 - 263, The handler
_handle_pause currently allows pausing tasks in "working" via the
_PAUSEABLE_STATES set but only updates storage, risking background execution
continuing; either remove "working" from _PAUSEABLE_STATES to prevent marking
actively running tasks suspended until cooperative suspension is implemented, or
implement a cooperative suspension/cancellation handshake in run_task (e.g.,
send a cancellation/suspend signal to the running executor and wait for
acknowledgement) and only call storage.update_task(task_id, state="suspended")
after the running task confirms it has stopped; reference _PAUSEABLE_STATES,
_handle_pause, run_task, and storage.update_task when making the change.

Comment thread bindu/server/workers/base.py Outdated
- Narrow except clause in _validate_agent_trust_config to
  pydantic.ValidationError and TypeError (was bare Exception)
- Add docstring to _validate_agent_trust_config
- Set extra="forbid" on AgentTrustConfig so unknown/camelCase keys
  are rejected rather than silently dropped
- Annotate BASE_CONFIG as ClassVar and use deepcopy in _config_with_trust
  to prevent test state leaking across runs
- Replace all-fields equality check with AgentTrustConfig round-trip
  to prove validation preserves field values
- Collapse three identical pause-state tests into one @parametrize test
  covering all four pauseable states
- Roll back task to "suspended" if scheduler.run_task fails during resume
  so the task is never stranded in submitted state without a worker

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
@The-Vaibhav-Yadav
Copy link
Copy Markdown
Author

Addressed all review feedback in the latest commit — narrowed exception types, added extra="forbid", ClassVar annotation, parametrized tests, and resume rollback.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Unimplemented Task Pause/Resume Functionality in Base Worker [Bug]: Incomplete Agent Trust Validation in ConfigValidator

1 participant