Kass/test cases batch 2 by thrialectics · Pull Request #695 · plastic-labs/honcho

thrialectics · 2026-05-15T22:48:40Z

Three L1 failure-mode reproducers for the unified runner (DEV-1698):

misinterpretation_correction_creates_orphan_fact — synthetic
deriver_no_memory_signal — 8 SWE-chat sessions
dreamer_no_dedup — 3 SWE-chat sessions

Runner additions: AddMessagesFromFixtureAction, SaveArtifactAction, continue_on_failure, and _run_metadata.json for replay (procedure in tests/unified/README.md).

All gates start at pass_if=true per verify-first methodology. SWE-chat sample under data/ is SALT-NLP/SWE-chat (ODC-BY); attribution embedded in the file.

Summary by CodeRabbit

Tests
- Enhanced unified test infrastructure to capture per-test run metadata and artifacts for better test reproducibility and debugging.
- Added support for additional test step types including message fixture loading.
- Added multiple new test fixtures for validating specific failure modes and system behaviors.
Documentation
- Updated test documentation with guidance on failure-mode reproducers, artifact inspection, and manual snapshot replay procedures.
Chores
- Updated configuration to properly manage test artifact directories.

Three orthogonal additions that together let the unified runner host tests over real conversational data: - `AddMessagesFromFixtureAction` -- ingest messages from a fixture JSON under tests/unified/test_cases/data/ rather than inline. Maps fixture role=user / role=assistant to configured peer IDs; preserves per- message timestamps when present. - `continue_on_failure` flag on TestDefinition -- when set, per-step assertion failures collect rather than abort, so suites with N independent scenarios aggregate per-session pass/fail. Infrastructure errors still abort. - `SaveArtifactAction` -- write a query result (get_representation / get_peer_card / get_context) to a named JSON file under a per-test artifact dir. No assertion coupling, so offline analysis is decoupled from the pass/fail gate. Per-test artifact dir is at tests/unified/artifacts/<test_name>_<utc_ts>/ (override with UNIFIED_TEST_ARTIFACTS_DIR; gitignored). Each test run also writes _run_metadata.json capturing honcho git SHA + dirty status, transport + model for each LLM-using agent, SHA-256 hashes of relevant prompt source files, judge model, and timestamp. Manual snapshot-replay procedure documented in tests/unified/README.md; automatic replay deferred.

L1 reproducer for the deriver_no_memory_signal failure mode: the deriver writes observations that don't add signal about the user or their work -- bare acknowledgments paraphrased as facts ("user said yes"), shell-history fragments ("user executed cd ~/foo"), instruction- event-records that look like durable user attributes but aren't. Inputs: 8 sessions sampled from SALT-NLP/SWE-chat (ODC-BY, https://arxiv.org/abs/2604.20779), persona-balanced across Expert Nitpicker / Vague Requester / Mind Changer. Two sessions from the underlying extract are excluded for data quality (one Japanese with a heavily skewed user/assistant role split; one near-monologue); documented in the fixture description. Per session: ingest messages, wait for queue empty, run an llm_judge asking whether the user's representation contains any Tier 1 bare- speech-act observation, then save_artifact-dump the representation for offline analysis. The suite uses continue_on_failure to aggregate per-session pass/fail. Judge prompt is calibrated with annotated example observations across three tiers (Tier 1 / Tier 2 / Signal) to distinguish bare-event records from specifics-anchored observations. Inputs file at tests/unified/test_cases/data/swe_chat_sample_persona_balanced.json -- the committed dataset extract. Metadata embedded in the file documents its provenance (SALT-NLP/SWE-chat, ODC-BY, selection criteria, extraction timestamp).

L1 reproducer for the dreamer's failure to consolidate duplicate observations. The dreamer's documented role is consolidation: rewriting / merging semantically related observations into better- stated ones, and deleting redundant observations. This fixture asserts the consolidation specifically against duplicates -- exact-text duplicates, near-duplicates (varying only in trivial details), and semantic duplicates (the same fact stated from different angles). Inputs: 3 sessions sampled from SALT-NLP/SWE-chat (indices 5, 6, 7 from swe_chat_sample_persona_balanced.json). Each session is ingested through the deriver, then dreamed, then queried for residual duplicates. Two-gate structure per session: - PRE-DREAM SETUP GATE: assert >=2 duplicate observations exist before the dream is scheduled. If the deriver did not produce duplicates, the test errors at the setup gate rather than silently passing the post-dream gate on no inputs. - POST-DREAM MAIN GATE (strict v1): assert ZERO duplicate observations remain after the dream. Strict gate; relative thresholds (post-dream count < pre-dream count) could be added later via runner-level cross-step state. Judge prompts are calibrated with three flavors of duplicate patterns (exact-content duplicates, near-twins of a single shell command differing only by embedded timestamp, and opposite-polarity semantic twins describing the same convention).

…ixture L1 reproducer for the failure mode in the wiki failure-modes catalog: the agent misinterprets the user's statement; the user explicitly corrects; both the wrong claim (from the agent's misread) and the corrected claim end up in storage with no supersession marker. End- state shape resembles time_drift_storage_no_supersession but the trigger is agent misinterpretation rather than user mind-change, and the wrong claim is one the user never literally stated. Setup: user_a clearly scopes their work to extracting a rate-limiter subsystem from a Python payment-service monolith into a standalone Go microservice. Assistant misreads as "full payment service migration to Go." User immediately corrects and reaffirms the narrow scope. Subsequent turns develop substantive technical content (gRPC RPCs, Redis token-bucket, Lua atomicity, circuit breaker trade-offs, testing strategy). The wrong claim never appears in a user turn -- only in the assistant's misread. Two-gate structure: positive control asserts the representation captures the correct-scope work; negative reproducer asserts the representation does NOT contain the assistant's misinterpretation (claims about migrating the whole payment service or rewriting the Python monolith). Both gates start at pass_if=true (invariant); per verify-first methodology, polarity flips to pass_if=false only after observed firing on real code.

coderabbitai · 2026-05-15T22:48:53Z

Walkthrough

This PR introduces end-to-end artifact capture and failure-mode reproducer test infrastructure. It extends the unified test schema with fixture-loading and error-continuation capabilities, implements metadata/artifact serialization in the executor, adds new step handlers, refines .gitignore rules, documents the reproducer pattern and replay procedures, and contributes three production test fixtures that validate agent behavior.

Changes

Unified Test Infrastructure and Fixtures

Layer / File(s)	Summary
Schema contracts for fixture loading and error continuation `tests/unified/schema.py`	`AddMessagesFromFixtureAction` step type enables loading session messages from external JSON fixtures with peer-ID mapping and optional turn truncation. `TestDefinition` gains `continue_on_failure` flag to control whether assertion failures abort or accumulate.
Executor metadata and artifact infrastructure `tests/unified/runner.py`	`UnifiedTestExecutor` gains `artifacts_root` parameter and implements per-test artifact directory creation, `_run_metadata.json` serialization (including git SHA, dirty state, agent models, prompt file hashes), JSON artifact writing with flat-path validation and serialization fallback, and failure classification between assertion errors (respecting `continue_on_failure`) and infrastructure errors.
New step action handlers `tests/unified/runner.py`	`AddMessagesFromFixtureAction` handler loads JSON fixtures, validates session index, maps fixture roles to configured peer IDs, parses optional ISO timestamps, and bulk-adds constructed messages. `SaveArtifactAction` handler fetches target data and writes serialized JSON artifacts to the test's artifacts directory.
Project configuration and documentation `.gitignore`, `tests/unified/README.md`	`.gitignore` updated to ignore `tests/unified/artifacts/` while explicitly including `tests/unified/test_cases/data/` for checked-in fixtures. README documents failure-mode reproducer test pattern, per-test artifact files, and manual replay procedures using recorded metadata.
Failure-mode reproducer test fixtures `tests/unified/test_cases/deriver_no_memory_signal.json`, `tests/unified/test_cases/dreamer_no_dedup.json`, `tests/unified/test_cases/misinterpretation_correction_creates_orphan_fact.json`	Three new test fixtures: `deriver_no_memory_signal` exercises Tier 1/2/Signal observation classification across 8 sessions with `continue_on_failure` aggregation; `dreamer_no_dedup` verifies duplicate consolidation with pre/post representation capture and two `llm_judge` gates per session; `misinterpretation_correction_creates_orphan_fact` captures scripted assistant misinterpretation and user correction, then asserts correct and incorrect observations via two-gate `llm_judge` pattern.

🎯 3 (Moderate) | ⏱️ ~25 minutes

🐰 A rabbit built a test warren so fine,
With artifacts captured, each snapshot divine,
Fixtures loaded from files, reproduction precise,
Failures can gather like carrots in ice,
From dreamer to deriver, the tests now align! ✨

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 20.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check	❓ Inconclusive	The title 'Kass/test cases batch 2' is vague and generic, referring only to author/branch name and batch numbering without describing the actual changes or improvements being made.	Provide a more descriptive title that summarizes the main contribution, such as 'Add failure-mode reproducers and artifact capture to unified test runner' or 'Introduce AddMessagesFromFixtureAction and test run metadata snapshotting'.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch kass/test-cases-batch-2

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

Generate code and open pull requests
Plan features and break down work
Investigate incidents and troubleshoot customer tickets together
Automate recurring tasks and respond to alerts with triggers
Summarize progress and report instantly

Built for teams:

Shared memory across your entire org—no repeating context
Per-thread sandboxes to safely plan and execute work
Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 4

🧹 Nitpick comments (1)

.gitignore (1)

10-12: ⚡ Quick win

Unignore fixture contents explicitly, not just the directory.

Given the broad data ignore rule, add an explicit recursive unignore so new fixture files under this directory are always tracked.

✅ Suggested fix

 !tests/unified/test_cases/data/
+!tests/unified/test_cases/data/**

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.gitignore around lines 10 - 12, The .gitignore currently ignores the
top-level data directory but only unignores the directory path
'!tests/unified/test_cases/data/' which does not unignore files inside; update
the unignore to be recursive so fixture files are tracked (i.e. replace or add a
recursive unignore for tests/unified/test_cases/data such that files under that
directory are not ignored while keeping the broader 'data' and 'redis-data'
ignore rules intact).

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/unified/runner.py`:
- Around line 514-520: The code currently builds fixture_path from FIXTURES_DIR
/ step.fixture_path and opens it directly, allowing path traversal (e.g., "../")
to escape the fixture root; fix by resolving the combined path (use
Path.resolve(strict=False) on FIXTURES_DIR / step.fixture_path) and verify the
resolved path is a child of FIXTURES_DIR.resolve() (e.g., compare parents or
string prefix) and raise FileNotFoundError or ValueError if it is outside the
fixture root before proceeding to open the file (references: FIXTURES_DIR,
step.fixture_path, fixture_path, and the with open(...) call).
- Around line 523-531: The runtime should reject negative fixture indices and
non-positive limits: before using step.fixture_session_index against
sessions_in_fixture, add a check that step.fixture_session_index is >= 0 and
raise an IndexError with a clear message if not (currently only upper bound is
checked); similarly, before slicing fixture_messages using step.limit, validate
that if step.limit is not None then step.limit > 0 and raise a ValueError (or
appropriate exception) for limit <= 0 with a descriptive message; update the
checks around fixture_session = sessions_in_fixture[...] and fixture_messages =
fixture_session.get("messages", []) to enforce these validations.
- Around line 346-347: Three f-strings are implicitly concatenated and must be
joined explicitly: replace the adjacent f-strings in the message built with
test_name and assertion_failures so they are joined with +; similarly join the
two adjacent f-strings that reference fixture_path, step.fixture_path, and
FIXTURES_DIR with +; and join the two adjacent f-strings that reference
step.fixture_session_index and sessions_in_fixture (len(sessions_in_fixture))
with +. Locate these by the symbols test_name, assertion_failures, fixture_path,
FIXTURES_DIR, step.fixture_path, step.fixture_session_index, and
sessions_in_fixture in tests/unified/runner.py and change the implicit
concatenations to use the + operator to form a single string expression.

In `@tests/unified/schema.py`:
- Around line 83-95: The fixture_session_index and limit fields currently allow
negative/zero values; update their Pydantic Field declarations to enforce
bounds: change fixture_session_index to require ge=0 (non-negative integer) and
change limit to require ge=1 (positive integer) by adding the appropriate Field
constraints (e.g., Field(..., ge=0, description=...) for fixture_session_index
and Field(None, ge=1, description=...) for limit) so invalid indices or limits
are rejected at validation time; ensure you keep the existing descriptions on
the Field calls.

---

Nitpick comments:
In @.gitignore:
- Around line 10-12: The .gitignore currently ignores the top-level data
directory but only unignores the directory path
'!tests/unified/test_cases/data/' which does not unignore files inside; update
the unignore to be recursive so fixture files are tracked (i.e. replace or add a
recursive unignore for tests/unified/test_cases/data such that files under that
directory are not ignored while keeping the broader 'data' and 'redis-data'
ignore rules intact).

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 798cb782-e304-4602-90f0-17df0aeb026e

📥 Commits

Reviewing files that changed from the base of the PR and between 8fcbb54 and 2aac580.

📒 Files selected for processing (8)

.gitignore
tests/unified/README.md
tests/unified/runner.py
tests/unified/schema.py
tests/unified/test_cases/data/swe_chat_sample_persona_balanced.json
tests/unified/test_cases/deriver_no_memory_signal.json
tests/unified/test_cases/dreamer_no_dedup.json
tests/unified/test_cases/misinterpretation_correction_creates_orphan_fact.json

coderabbitai · 2026-05-15T22:53:35Z

+                f"Test {test_name} FAILED with {len(assertion_failures)} "
+                f"assertion failure(s) (continue_on_failure=true):"


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Verify the exact warning locations and then recheck after patching. rg -n "assertion failure\\(s\\)|resolved from|out of range" tests/unified/runner.py

Repository: plastic-labs/honcho

Length of output: 304

🏁 Script executed:

sed -n '345,348p' tests/unified/runner.py

Repository: plastic-labs/honcho

Length of output: 246

🏁 Script executed:

sed -n '516,519p' tests/unified/runner.py

Repository: plastic-labs/honcho

Length of output: 283

🏁 Script executed:

sed -n '524,527p' tests/unified/runner.py

Repository: plastic-labs/honcho

Length of output: 276

Fix implicit string concatenation at all three locations to comply with basedpyright's reportImplicitStringConcatenation check.

These lines use adjacent f-strings without explicit operators, which triggers CI warnings:

Lines 346-347: f"Test {test_name} FAILED with {len(assertion_failures)} " f"assertion failure(s) (continue_on_failure=true):"

Lines 517-518: f"Fixture not found: {fixture_path} (resolved from " f"fixture_path={step.fixture_path!r}, FIXTURES_DIR={FIXTURES_DIR})"

Lines 525-526: f"fixture_session_index={step.fixture_session_index} out of range " f"(fixture has {len(sessions_in_fixture)} sessions)"

Use explicit + operator to join the strings.

🧰 Tools

🪛 GitHub Actions: Static Analysis / 0_basedpyright.txt

[warning] 346-346: basedpyright warning: Implicit string concatenation not allowed (reportImplicitStringConcatenation)

🪛 GitHub Actions: Static Analysis / basedpyright

[warning] 346-346: basedpyright: Implicit string concatenation not allowed (reportImplicitStringConcatenation)

🪛 GitHub Check: basedpyright

[warning] 346-346:
Implicit string concatenation not allowed (reportImplicitStringConcatenation)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unified/runner.py` around lines 346 - 347, Three f-strings are implicitly concatenated and must be joined explicitly: replace the adjacent f-strings in the message built with test_name and assertion_failures so they are joined with +; similarly join the two adjacent f-strings that reference fixture_path, step.fixture_path, and FIXTURES_DIR with +; and join the two adjacent f-strings that reference step.fixture_session_index and sessions_in_fixture (len(sessions_in_fixture)) with +. Locate these by the symbols test_name, assertion_failures, fixture_path, FIXTURES_DIR, step.fixture_path, step.fixture_session_index, and sessions_in_fixture in tests/unified/runner.py and change the implicit concatenations to use the + operator to form a single string expression.

coderabbitai · 2026-05-15T22:53:35Z

+            fixture_path = FIXTURES_DIR / step.fixture_path
+            if not fixture_path.is_file():
+                raise FileNotFoundError(
+                    f"Fixture not found: {fixture_path} (resolved from "
+                    f"fixture_path={step.fixture_path!r}, FIXTURES_DIR={FIXTURES_DIR})"
+                )
+            with open(fixture_path) as f:


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Validate fixture paths stay under tests/unified/test_cases/data.

fixture_path is concatenated directly with FIXTURES_DIR, so values like ../... can escape the fixture root and read arbitrary files.

✅ Suggested fix

elif isinstance(step, AddMessagesFromFixtureAction): - fixture_path = FIXTURES_DIR / step.fixture_path + fixture_path = (FIXTURES_DIR / step.fixture_path).resolve() + fixtures_root = FIXTURES_DIR.resolve() + if fixtures_root not in fixture_path.parents: + raise ValueError( + f"Invalid fixture_path {step.fixture_path!r}: must resolve under {fixtures_root}" + ) if not fixture_path.is_file(): raise FileNotFoundError(

🧰 Tools

🪛 GitHub Actions: Static Analysis / 0_basedpyright.txt

[warning] 517-517: basedpyright warning: Implicit string concatenation not allowed (reportImplicitStringConcatenation)

🪛 GitHub Actions: Static Analysis / basedpyright

[warning] 517-517: basedpyright: Implicit string concatenation not allowed (reportImplicitStringConcatenation)

🪛 GitHub Check: basedpyright

[warning] 517-517:
Implicit string concatenation not allowed (reportImplicitStringConcatenation)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unified/runner.py` around lines 514 - 520, The code currently builds fixture_path from FIXTURES_DIR / step.fixture_path and opens it directly, allowing path traversal (e.g., "../") to escape the fixture root; fix by resolving the combined path (use Path.resolve(strict=False) on FIXTURES_DIR / step.fixture_path) and verify the resolved path is a child of FIXTURES_DIR.resolve() (e.g., compare parents or string prefix) and raise FileNotFoundError or ValueError if it is outside the fixture root before proceeding to open the file (references: FIXTURES_DIR, step.fixture_path, fixture_path, and the with open(...) call).

coderabbitai · 2026-05-15T22:53:35Z

+            if step.fixture_session_index >= len(sessions_in_fixture):
+                raise IndexError(
+                    f"fixture_session_index={step.fixture_session_index} out of range "
+                    f"(fixture has {len(sessions_in_fixture)} sessions)"
+                )
+            fixture_session = sessions_in_fixture[step.fixture_session_index]
+            fixture_messages = fixture_session.get("messages", [])
+            if step.limit is not None:
+                fixture_messages = fixture_messages[: step.limit]


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Reject negative fixture indices and non-positive limits at runtime.

Line 523 only validates the upper bound; negative fixture_session_index still loads from the end. Line 530 allows limit <= 0, which yields surprising slices.

✅ Suggested fix

sessions_in_fixture = fixture.get("sessions", []) + if step.fixture_session_index < 0: + raise IndexError( + f"fixture_session_index must be >= 0; got {step.fixture_session_index}" + ) if step.fixture_session_index >= len(sessions_in_fixture): raise IndexError( @@ fixture_messages = fixture_session.get("messages", []) if step.limit is not None: + if step.limit <= 0: + raise ValueError(f"limit must be > 0; got {step.limit}") fixture_messages = fixture_messages[: step.limit]

🧰 Tools

🪛 GitHub Actions: Static Analysis / 0_basedpyright.txt

[warning] 525-525: basedpyright warning: Implicit string concatenation not allowed (reportImplicitStringConcatenation)

🪛 GitHub Actions: Static Analysis / basedpyright

[warning] 525-525: basedpyright: Implicit string concatenation not allowed (reportImplicitStringConcatenation)

🪛 GitHub Check: basedpyright

[warning] 525-525:
Implicit string concatenation not allowed (reportImplicitStringConcatenation)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unified/runner.py` around lines 523 - 531, The runtime should reject negative fixture indices and non-positive limits: before using step.fixture_session_index against sessions_in_fixture, add a check that step.fixture_session_index is >= 0 and raise an IndexError with a clear message if not (currently only upper bound is checked); similarly, before slicing fixture_messages using step.limit, validate that if step.limit is not None then step.limit > 0 and raise a ValueError (or appropriate exception) for limit <= 0 with a descriptive message; update the checks around fixture_session = sessions_in_fixture[...] and fixture_messages = fixture_session.get("messages", []) to enforce these validations.

coderabbitai · 2026-05-15T22:53:35Z

+    fixture_session_index: int = Field(
+        ..., description="Index into fixture['sessions'] selecting which session to load"
+    )
+    user_peer_id: str = Field(
+        ..., description="Honcho peer ID to attribute fixture role='user' messages to"
+    )
+    assistant_peer_id: str | None = Field(
+        None,
+        description="Honcho peer ID for fixture role='assistant' messages. If omitted, assistant messages are skipped.",
+    )
+    limit: int | None = Field(
+        None, description="Cap on number of conversational turns to ingest"
+    )


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add bounds to fixture index and limit fields.

fixture_session_index and limit accept negative/non-positive values right now. At runtime, this can silently select unintended sessions or truncate turns unexpectedly.

✅ Suggested fix

class AddMessagesFromFixtureAction(TestStep): @@ - fixture_session_index: int = Field( - ..., description="Index into fixture['sessions'] selecting which session to load" - ) + fixture_session_index: int = Field( + ..., + ge=0, + description="Index into fixture['sessions'] selecting which session to load", + ) @@ - limit: int | None = Field( - None, description="Cap on number of conversational turns to ingest" - ) + limit: int | None = Field( + None, gt=0, description="Cap on number of conversational turns to ingest" + )

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unified/schema.py` around lines 83 - 95, The fixture_session_index and limit fields currently allow negative/zero values; update their Pydantic Field declarations to enforce bounds: change fixture_session_index to require ge=0 (non-negative integer) and change limit to require ge=1 (positive integer) by adding the appropriate Field constraints (e.g., Field(..., ge=0, description=...) for fixture_session_index and Field(None, ge=1, description=...) for limit) so invalid indices or limits are rejected at validation time; ensure you keep the existing descriptions on the Field calls.

thrialectics added 4 commits May 12, 2026 15:29

coderabbitai Bot reviewed May 15, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Kass/test cases batch 2#695

Kass/test cases batch 2#695
thrialectics wants to merge 4 commits into
mainfrom
kass/test-cases-batch-2

thrialectics commented May 15, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 15, 2026 •

edited

Loading

❌ Failed checks (1 warning, 1 inconclusive)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot May 15, 2026

Uh oh!

coderabbitai Bot May 15, 2026

Uh oh!

coderabbitai Bot May 15, 2026

Uh oh!

coderabbitai Bot May 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		f"Test {test_name} FAILED with {len(assertion_failures)} "
		f"assertion failure(s) (continue_on_failure=true):"

Conversation

thrialectics commented May 15, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

❌ Failed checks (1 warning, 1 inconclusive)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 15, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 15, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 15, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 15, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

thrialectics commented May 15, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 15, 2026 •

edited

Loading