From 304e54c0b4ae02f677b970a64c76003e31b67759 Mon Sep 17 00:00:00 2001 From: Alexander Rubinstein Date: Sun, 8 Mar 2026 12:40:45 +0100 Subject: [PATCH 01/23] [Move DISCO queue to core]: - AnchorPointsTaskQueue moved to core (maseval/core/task.py) - MMLUBenchmark no longer implements agents - Remove silent .get() fallbacks for required fields - Add mmlu = [] extra to pyproject.toml - Add MMLU entry to BENCHMARKS.md - Update documentation with MMLU - Update CHANGELOG.md --- BENCHMARKS.md | 16 +- CHANGELOG.md | 5 +- docs/benchmark/mmlu.md | 127 ++++++++++++++ maseval/__init__.py | 2 + maseval/benchmark/mmlu/__init__.py | 17 +- maseval/benchmark/mmlu/mmlu.py | 267 +++++++++-------------------- maseval/core/task.py | 45 +++++ mkdocs.yml | 3 +- pyproject.toml | 1 + 9 files changed, 280 insertions(+), 203 deletions(-) create mode 100644 docs/benchmark/mmlu.md diff --git a/BENCHMARKS.md b/BENCHMARKS.md index fcbde7d3..0916ef69 100644 --- a/BENCHMARKS.md +++ b/BENCHMARKS.md @@ -79,7 +79,21 @@ CONVERSE evaluates contextual safety in agent-to-agent conversations. It focuses --- -## 6. [Name of Next Benchmark] +## 6. MMLU (Massive Multitask Language Understanding) (Beta) + +MMLU evaluates language models on multiple-choice questions spanning 57 academic subjects. The MASEval integration includes anchor-point-based evaluation for DISCO prediction, allowing efficient estimation of full benchmark performance from a subset of tasks. + +> **Beta:** This benchmark has been implemented carefully, but we have not yet validated the results against the original implementation. Use with caution when comparing with existing results or the original paper's numbers. Contributions and compute donations welcome! 
+ +### Source and License + +- **Original Paper:** [Measuring Massive Multitask Language Understanding](https://arxiv.org/abs/2009.03300) (Hendrycks et al., 2021) +- **DISCO Paper:** [DISCO: DISCOvering key features for accurate prediction of LLM abilities on benchmarks](https://arxiv.org/abs/2407.12890) (Rubinstein et al., 2025) +- **Dataset:** [arubique/flattened-MMLU](https://huggingface.co/datasets/arubique/flattened-MMLU) + +--- + +## 7. [Name of Next Benchmark] (Description for the next benchmark...) diff --git a/CHANGELOG.md b/CHANGELOG.md index c3f11572..c1427ccd 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -11,7 +11,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 **Benchmarks** -- MMLU Benchmark with DISCO support: Integration for evaluating language models on MMLU (Massive Multitask Language Understanding) multiple-choice questions, compatible with DISCO anchor-point methodology. Includes `MMLUBenchmark`, `HuggingFaceMMLUBenchmark`, `MMLUEnvironment`, `MMLUEvaluator`, `MMLUModelAgent`, `MMLUAgentAdapter`, `AnchorPointsTaskQueue`, `load_tasks()`, and `compute_benchmark_metrics()`. Optional extras: `lm-eval` (for `HuggingFaceMMLUBenchmark.precompute_all_logprobs_lmeval`), `disco` (for DISCO prediction in the example). (PR: #34) +- MMLU Benchmark with DISCO support: Integration for evaluating language models on MMLU (Massive Multitask Language Understanding) multiple-choice questions, compatible with DISCO anchor-point methodology. Includes `MMLUBenchmark`, `HuggingFaceMMLUBenchmark`, `MMLUEnvironment`, `MMLUEvaluator`, `MMLUModelAgent`, `MMLUAgentAdapter`, `load_tasks()`, and `compute_benchmark_metrics()`. Install with `pip install maseval[mmlu]`. Optional extras: `lm-eval` (for `HuggingFaceMMLUBenchmark.precompute_all_logprobs_lmeval`), `disco` (for DISCO prediction in the example). 
(PR: #34) - CONVERSE benchmark for contextual safety evaluation in adversarial agent-to-agent conversations, including `ConverseBenchmark`, `DefaultAgentConverseBenchmark`, `ConverseEnvironment`, `ConverseExternalAgent`, `PrivacyEvaluator`, `SecurityEvaluator`, and `load_tasks()` utilities for `travel`, `real_estate`, and `insurance` domains. Benchmark source files are now downloaded on first use via `ensure_data_exists()` instead of being bundled in the package. (PR: #28) @@ -35,11 +35,13 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 **Examples** - MMLU benchmark example at `examples/mmlu_benchmark/` for evaluating HuggingFace models on MMLU with optional DISCO prediction (`--disco_model_path`, `--disco_transform_path`). Supports local data, HuggingFace dataset repos, and DISCO weights from .pkl/.npz or HF repos. (PR: #34) +- MMLU benchmark documentation at `docs/benchmark/mmlu.md` with installation, quick start, and API reference. (PR: #34) - Added a dedicated runnable CONVERSE default benchmark example at `examples/converse_benchmark/default_converse_benchmark.py` for quick start with `DefaultAgentConverseBenchmark`. (PR: #28) - Gaia2 benchmark example with Google GenAI and OpenAI model support (PR: #26) **Core** +- Added `AnchorPointsTaskQueue` to `maseval.core.task` for subset-based evaluation (e.g., anchor-point selection for DISCO). Available via `from maseval import AnchorPointsTaskQueue`. 
(PR: #34) - Added `SeedGenerator` abstract base class and `DefaultSeedGenerator` implementation for reproducible benchmark runs via SHA-256-based seed derivation (PR: #24) - Added `seed` and `seed_generator` parameters to `Benchmark.__init__` for enabling reproducibility (PR: #24) - Added `seed_generator` parameter to all benchmark setup methods (`setup_environment`, `setup_user`, `setup_agents`, `setup_evaluators`) (PR: #24) @@ -86,6 +88,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 **Benchmarks** +- `MMLUBenchmark` no longer implements `setup_agents()` — consistent with other benchmarks, agent creation is left to concrete subclasses (e.g., `HuggingFaceMMLUBenchmark`). Removed silent `.get()` fallbacks for required fields (`gold`, `query`, `model_id`) so missing data surfaces errors immediately instead of failing silently. `AnchorPointsTaskQueue` moved from `maseval.benchmark.mmlu` to `maseval.core.task` and now extends `SequentialTaskQueue` instead of `AdaptiveTaskQueue`. Added `mmlu` optional extra (`pip install maseval[mmlu]`). (PR: #34) - `MACSBenchmark` and `Tau2Benchmark` benchmarks now actively use the seeding system by deriving seeds for model adapters. Seeds are passed to agents, user simulators, tool simulators, and LLM-based evaluators for reproducible runs. (PR: #26) - `Gaia2Benchmark`: Seeds `agents/gaia2_agent`, `evaluators/judge` - `MACSBenchmark`: Seeds `environment/tools/tool_{name}`, `simulators/user`, `evaluators/user_gsr`, `evaluators/system_gsr` diff --git a/docs/benchmark/mmlu.md b/docs/benchmark/mmlu.md new file mode 100644 index 00000000..623cf63d --- /dev/null +++ b/docs/benchmark/mmlu.md @@ -0,0 +1,127 @@ +# MMLU: Massive Multitask Language Understanding (Beta) + +!!! warning "Beta" + This benchmark has been implemented carefully, but we have not yet validated the results against the original implementation. Use with caution when comparing with existing results or the original paper's numbers. 
Contributions and compute donations welcome! + +The **MMLU Benchmark** evaluates language models on multiple-choice questions spanning 57 academic subjects. The MASEval integration supports anchor-point-based evaluation for [DISCO](https://arxiv.org/abs/2407.12890) prediction, enabling efficient estimation of full benchmark performance from a subset of tasks. + +## Overview + +[MMLU](https://arxiv.org/abs/2009.03300) (Hendrycks et al., 2021) is a widely used benchmark for measuring knowledge and reasoning across diverse domains. The MASEval implementation features: + +- **Log-likelihood MCQ evaluation** matching lm-evaluation-harness methodology +- **Anchor-point task selection** via `AnchorPointsTaskQueue` for DISCO-style subset evaluation +- **HuggingFace integration** with batched log-probability computation +- **lm-eval compatibility** mode for exact numerical reproduction + +Check out the [BENCHMARKS.md](https://github.com/parameterlab/MASEval/blob/main/BENCHMARKS.md) file for more information including licenses. 
+ +## Installation + +MMLU has an optional dependency extra (currently empty, as core MMLU requires no additional packages): + +```bash +pip install maseval[mmlu] +``` + +For the HuggingFace implementation, also install transformers: + +```bash +pip install maseval[mmlu,transformers] +``` + +For DISCO prediction support: + +```bash +pip install maseval[disco] +``` + +For exact lm-evaluation-harness reproduction: + +```bash +pip install maseval[lm-eval] +``` + +## Quick Start + +```python +from maseval.benchmark.mmlu import ( + HuggingFaceMMLUBenchmark, + load_tasks, + compute_benchmark_metrics, +) + +# Load tasks (downloads from HuggingFace automatically) +tasks = load_tasks(data_path="/path/to/mmlu_prompts_examples.json") + +# Create benchmark with HuggingFace model +benchmark = HuggingFaceMMLUBenchmark( + model_id="meta-llama/Llama-2-7b-hf", + device="cuda:0", +) + +# Run evaluation +results = benchmark.run( + tasks=tasks, + agent_data={"model_id": "meta-llama/Llama-2-7b-hf"}, +) + +# Compute metrics +metrics = compute_benchmark_metrics(results) +print(f"Accuracy: {metrics['acc']:.4f}") +``` + +### With Anchor Points (DISCO) + +```python +from maseval.benchmark.mmlu import load_tasks + +# Load tasks filtered to anchor points +tasks = load_tasks( + data_path="/path/to/mmlu_prompts_examples.json", + anchor_points_path="/path/to/anchor_points.json", +) + +# tasks is an AnchorPointsTaskQueue — only anchor tasks are evaluated +print(f"Evaluating {len(tasks)} anchor tasks") +``` + +## Custom Benchmark Subclass + +`MMLUBenchmark` is a framework-agnostic base class. 
To use a different model backend, subclass it and implement `setup_agents()` and `get_model_adapter()`: + +```python +from maseval.benchmark.mmlu import MMLUBenchmark, MMLUModelAgent, MMLUAgentAdapter + +class MyMMLUBenchmark(MMLUBenchmark): + def setup_agents(self, agent_data, environment, task, user, seed_generator): + model = self.get_model_adapter(agent_data["model_id"]) + agent = MMLUModelAgent(model, name="mmlu_agent") + adapter = MMLUAgentAdapter(agent, "mmlu_agent") + return [adapter], {"mmlu_agent": adapter} + + def get_model_adapter(self, model_id, **kwargs): + adapter = MyModelAdapter(model_id) + register_name = kwargs.get("register_name") + if register_name: + self.register("models", register_name, adapter) + return adapter +``` + +## API Reference + +::: maseval.benchmark.mmlu.MMLUBenchmark + +::: maseval.benchmark.mmlu.HuggingFaceMMLUBenchmark + +::: maseval.benchmark.mmlu.MMLUEnvironment + +::: maseval.benchmark.mmlu.MMLUEvaluator + +::: maseval.benchmark.mmlu.MMLUModelAgent + +::: maseval.benchmark.mmlu.MMLUAgentAdapter + +::: maseval.benchmark.mmlu.load_tasks + +::: maseval.benchmark.mmlu.compute_benchmark_metrics diff --git a/maseval/__init__.py b/maseval/__init__.py index 90d52cfa..387a3345 100644 --- a/maseval/__init__.py +++ b/maseval/__init__.py @@ -16,6 +16,7 @@ BaseTaskQueue, TaskQueue, SequentialTaskQueue, + AnchorPointsTaskQueue, PriorityTaskQueue, AdaptiveTaskQueue, ) @@ -93,6 +94,7 @@ "BaseTaskQueue", "TaskQueue", "SequentialTaskQueue", + "AnchorPointsTaskQueue", "PriorityTaskQueue", "AdaptiveTaskQueue", # Model adapters diff --git a/maseval/benchmark/mmlu/__init__.py b/maseval/benchmark/mmlu/__init__.py index 19e8fd32..ac5ac154 100644 --- a/maseval/benchmark/mmlu/__init__.py +++ b/maseval/benchmark/mmlu/__init__.py @@ -4,12 +4,10 @@ Usage: from maseval.benchmark.mmlu import ( - MMLUBenchmark, - MMLUEnvironment, - MMLUEvaluator, + HuggingFaceMMLUBenchmark, load_tasks, - AnchorPointsTaskQueue, ) + from maseval import AnchorPointsTaskQueue 
# Load tasks and anchor points tasks = load_tasks( @@ -17,18 +15,19 @@ anchor_points_path="path/to/anchor_points.pkl", # Optional ) - # Create benchmark - benchmark = MMLUBenchmark() - results = benchmark.run(tasks=tasks, agent_data={"model_id": "gpt-4"}) + # Run benchmark + benchmark = HuggingFaceMMLUBenchmark(model_id="meta-llama/Llama-2-7b-hf") + results = benchmark.run(tasks=tasks, agent_data={"model_id": "meta-llama/Llama-2-7b-hf"}) """ +from maseval import AnchorPointsTaskQueue + from .mmlu import ( DEFAULT_AGENT_NAME, DEFAULT_BATCH_SIZE, DEFAULT_CHOICES, DEFAULT_DEVICE, DEFAULT_MODEL_REGISTER_NAME, - FALLBACK_MODEL_ID, MMLU_TASK_NAME, STATUS_SUCCESS, TARGET_DELIMITER, @@ -39,7 +38,6 @@ MMLUEvaluator, MMLUModelAgent, MMLUAgentAdapter, - AnchorPointsTaskQueue, load_tasks, compute_benchmark_metrics, ) @@ -50,7 +48,6 @@ "DEFAULT_CHOICES", "DEFAULT_DEVICE", "DEFAULT_MODEL_REGISTER_NAME", - "FALLBACK_MODEL_ID", "MMLU_TASK_NAME", "STATUS_SUCCESS", "TARGET_DELIMITER", diff --git a/maseval/benchmark/mmlu/mmlu.py b/maseval/benchmark/mmlu/mmlu.py index 6506402c..0b6de68a 100644 --- a/maseval/benchmark/mmlu/mmlu.py +++ b/maseval/benchmark/mmlu/mmlu.py @@ -8,32 +8,25 @@ Usage: from maseval.benchmark.mmlu import ( - MMLUBenchmark, load_tasks, AnchorPointsTaskQueue + HuggingFaceMMLUBenchmark, load_tasks, ) + from maseval import AnchorPointsTaskQueue - # Load tasks filtered to anchor points + # Load tasks (optionally filtered to anchor points) tasks = load_tasks( data_path="/path/to/mmlu_prompts_examples.json", anchor_points_path="/path/to/anchor_points.pkl", ) - # Create benchmark with HuggingFace model - class MyMMLUBenchmark(MMLUBenchmark): - def get_model_adapter(self, model_id, **kwargs): - from transformers import pipeline - from maseval.interface.inference import HuggingFaceModelAdapter - pipe = pipeline("text-generation", model=model_id) - return HuggingFaceModelAdapter(model=pipe, model_id=model_id) - - benchmark = MyMMLUBenchmark() - results = 
benchmark.run(tasks=tasks, agent_data={"model_id": "meta-llama/Llama-2-7b"}) + # Run with the HuggingFace concrete implementation + benchmark = HuggingFaceMMLUBenchmark(model_id="meta-llama/Llama-2-7b-hf") + results = benchmark.run(tasks=tasks, agent_data={"model_id": "meta-llama/Llama-2-7b-hf"}) """ import json import pickle -from abc import abstractmethod from pathlib import Path -from typing import Any, Dict, Iterator, List, Optional, Sequence, Tuple, Union, cast +from typing import Any, Dict, List, Optional, Sequence, Tuple, Union, cast # numpy is optional - only needed for anchor points processing try: @@ -46,6 +39,7 @@ def get_model_adapter(self, model_id, **kwargs): from maseval import ( AgentAdapter, + AnchorPointsTaskQueue, Benchmark, Environment, Evaluator, @@ -55,7 +49,7 @@ def get_model_adapter(self, model_id, **kwargs): User, SeedGenerator, ) -from maseval.core.task import AdaptiveTaskQueue, SequentialTaskQueue +from maseval.core.task import SequentialTaskQueue from maseval.core.tracing import TraceableMixin from maseval.core.config import ConfigurableMixin @@ -72,109 +66,9 @@ def get_model_adapter(self, model_id, **kwargs): TARGET_DELIMITER = " " # lm-eval convention for MCQ MMLU_TASK_NAME = "mmlu_prompts" TASK_TYPE_MMLU = "mmlu" -FALLBACK_MODEL_ID = "unknown" STATUS_SUCCESS = "success" -# ============================================================================= -# Task Queue -# ============================================================================= - - -class AnchorPointsTaskQueue(AdaptiveTaskQueue): - """Task queue that iterates through tasks in anchor points order. - - This queue is used for DISCO-based evaluation where we only evaluate - on a subset of anchor tasks and predict performance on the full dataset. - - The queue iterates through tasks in the order specified by anchor_points, - and stops when all anchor tasks have been processed. 
- """ - - def __init__(self, tasks: List[Task], anchor_points: Optional[List[int]] = None): - """Initialize anchor points task queue. - - Args: - tasks: Full list of tasks (ordered by doc_id). - anchor_points: Optional list of task indices (doc_ids) to evaluate. - If None, evaluates all tasks in order. - """ - # If anchor_points provided, filter tasks to only include anchor tasks - # This dramatically improves performance by avoiding O(n²) iteration - if anchor_points is not None: - # Build index mapping for quick lookup - task_by_doc_id: Dict[int, Task] = {} - for i, task in enumerate(tasks): - doc_id = task.metadata.get("doc_id", i) - task_by_doc_id[doc_id] = task - - # Filter to only anchor tasks, preserving anchor order - anchor_tasks = [] - for doc_id in anchor_points: - task = task_by_doc_id.get(doc_id) - if task is not None: - anchor_tasks.append(task) - - # Store original for reference - self._all_tasks = tasks - self._task_by_doc_id = task_by_doc_id - tasks = anchor_tasks - - super().__init__(tasks) - self._anchor_points = anchor_points - self._anchor_idx = 0 - - # Initialize state immediately (since __iter__ is overridden and skips initial_state()) - self._state = self.initial_state() - - def __iter__(self) -> Iterator[Task]: - """Yield tasks in anchor point order. - - Since tasks are pre-filtered during __init__, we simply iterate - over the stored tasks in order. This avoids the infinite loop - issue in AdaptiveTaskQueue.__iter__ which relies on on_task_repeat_end - to remove tasks from _remaining. - """ - return iter(self._tasks) - - def initial_state(self) -> Dict[str, Any]: - """Initialize state for anchor point iteration.""" - return { - "anchor_idx": 0, - "completed_anchors": [], - } - - def select_next_task(self, remaining: Sequence[Task], state: Dict[str, Any]) -> Optional[Task]: - """Select the next anchor task to execute. - - Args: - remaining: Tasks not yet executed. - state: Current state with anchor_idx. 
- - Returns: - Next anchor task, or None if all anchors processed. - """ - # Simply return the first remaining task since we pre-filtered to anchor tasks only - return remaining[0] if remaining else None - - def update_state(self, task: Task, report: Dict[str, Any], state: Dict[str, Any]) -> Dict[str, Any]: - """Update state after task completion. - - Args: - task: Completed task. - report: Execution report. - state: Current state. - - Returns: - Updated state. - """ - doc_id = task.metadata.get("doc_id") - state["completed_anchors"].append(doc_id) - state["anchor_idx"] += 1 - - return state - - # ============================================================================= # Environment # ============================================================================= @@ -188,12 +82,18 @@ class MMLUEnvironment(Environment): """ def setup_state(self, task_data: Dict[str, Any]) -> Dict[str, Any]: - """Initialize state from task data.""" + """Initialize state from task data. + + Args: + task_data: Must contain ``"query"`` (str) and ``"environment_data"`` + (dict with optional ``"choices"``, ``"full_prompt"``, ``"use_full_prompt"``). + """ + env_data = task_data["environment_data"] return { - "query": task_data.get("query", ""), - "choices": task_data.get("environment_data", {}).get("choices", []), - "full_prompt": task_data.get("environment_data", {}).get("full_prompt", ""), - "use_full_prompt": task_data.get("environment_data", {}).get("use_full_prompt", False), + "query": task_data["query"], + "choices": env_data.get("choices", DEFAULT_CHOICES), + "full_prompt": env_data.get("full_prompt", ""), + "use_full_prompt": env_data.get("use_full_prompt", False), } def create_tools(self) -> Dict[str, Any]: @@ -203,11 +103,11 @@ def create_tools(self) -> Dict[str, Any]: def get_prompt(self) -> str: """Get the prompt to send to the model. - Returns full_prompt if use_full_prompt is True, otherwise query. 
+ Returns ``full_prompt`` if ``use_full_prompt`` is True, otherwise ``query``. """ - if self.state.get("use_full_prompt", False): - return self.state.get("full_prompt", self.state.get("query", "")) - return self.state.get("query", "") + if self.state["use_full_prompt"]: + return self.state["full_prompt"] + return self.state["query"] # ============================================================================= @@ -231,13 +131,14 @@ def __init__( """Initialize MMLU evaluator. Args: - task: Task being evaluated (contains gold answer). + task: Task being evaluated. Must have ``evaluation_data["gold"]`` (int) + with the correct answer index. environment: Environment (provides choices). user: Unused for MMLU. """ self.task = task self.environment = environment - self.gold = task.evaluation_data.get("gold", 0) + self.gold = task.evaluation_data["gold"] self.choices = task.environment_data.get("choices", DEFAULT_CHOICES) def filter_traces(self, traces: Dict[str, Any]) -> Dict[str, Any]: @@ -436,19 +337,12 @@ class MMLUBenchmark(Benchmark): Evaluates language models on MMLU multiple choice questions. Supports anchor point-based evaluation for DISCO prediction. - Users must subclass and implement: - - get_model_adapter() to provide model adapters + Subclasses must implement: - Usage: - class MyMMLUBenchmark(MMLUBenchmark): - def get_model_adapter(self, model_id, **kwargs): - from transformers import pipeline - from maseval.interface.inference import HuggingFaceModelAdapter - pipe = pipeline("text-generation", model=model_id) - return HuggingFaceModelAdapter(model=pipe, model_id=model_id) + - ``setup_agents()`` - create agents for MCQ evaluation + - ``get_model_adapter()`` - provide model adapters - benchmark = MyMMLUBenchmark() - results = benchmark.run(tasks=tasks, agent_data={"model_id": "llama-7b"}) + For a ready-to-use implementation, see ``HuggingFaceMMLUBenchmark``. 
""" def __init__( @@ -480,7 +374,7 @@ def setup_environment( "query": task.query, "environment_data": { **task.environment_data, - "use_full_prompt": self.use_full_prompt or agent_data.get("use_full_prompt", False), + "use_full_prompt": self.use_full_prompt, }, } return MMLUEnvironment(task_data) @@ -495,33 +389,6 @@ def setup_user( """MMLU doesn't use a user simulator.""" return None - def setup_agents( - self, - agent_data: Dict[str, Any], - environment: Environment, - task: Task, - user: Optional[User], - seed_generator: SeedGenerator, - ) -> Tuple[Sequence[AgentAdapter], Dict[str, AgentAdapter]]: - """Create model agent for MCQ evaluation. - - Args: - agent_data: Agent config with model_id. - environment: MMLU environment. - task: Current task. - user: Unused. - - Returns: - Tuple of (agents_to_run, agents_dict). - """ - model_id = agent_data.get("model_id", FALLBACK_MODEL_ID) - model = self.get_model_adapter(model_id, register_name=DEFAULT_MODEL_REGISTER_NAME) - - agent = MMLUModelAgent(model, name=DEFAULT_AGENT_NAME) - adapter = MMLUAgentAdapter(agent, DEFAULT_AGENT_NAME) - - return [adapter], {DEFAULT_AGENT_NAME: adapter} - def setup_evaluators( self, environment: Environment, @@ -548,21 +415,6 @@ def run_agents( agent = agents[0] return agent.run(prompt) - @abstractmethod - def get_model_adapter(self, model_id: str, **kwargs) -> ModelAdapter: - """Provide a ModelAdapter for the model. - - Must be implemented by subclass. - - Args: - model_id: Model identifier. - **kwargs: Additional arguments (e.g., register_name for tracing). - - Returns: - ModelAdapter instance. - """ - pass - def evaluate( self, evaluators: Sequence[Evaluator], @@ -598,7 +450,7 @@ def __init__( trust_remote_code: bool = True, use_full_prompt: bool = True, batch_size: int = DEFAULT_BATCH_SIZE, - **kwargs, + **kwargs: Any, ): """Initialize HuggingFace MMLU benchmark. 
@@ -618,6 +470,34 @@ def __init__( self._model = None self._tokenizer = None + def setup_agents( + self, + agent_data: Dict[str, Any], + environment: Environment, + task: Task, + user: Optional[User], + seed_generator: SeedGenerator, + ) -> Tuple[Sequence[AgentAdapter], Dict[str, AgentAdapter]]: + """Create model agent for MCQ evaluation. + + Args: + agent_data: Agent config. Must contain ``"model_id"`` (str). + environment: MMLU environment. + task: Current task. + user: Unused. + seed_generator: Seed generator (unused for MMLU). + + Returns: + Tuple of (agents_to_run, agents_dict). + """ + model_id = agent_data["model_id"] + model = self.get_model_adapter(model_id, register_name=DEFAULT_MODEL_REGISTER_NAME) + + agent = MMLUModelAgent(model, name=DEFAULT_AGENT_NAME) + adapter = MMLUAgentAdapter(agent, DEFAULT_AGENT_NAME) + + return [adapter], {DEFAULT_AGENT_NAME: adapter} + def _load_model(self): """Lazy load the model and tokenizer for log-likelihood computation.""" if self._model is None: @@ -795,7 +675,7 @@ def _compute_logprobs_batched(self, prompts: list, choices_list: list) -> list: return all_logprobs - def precompute_all_logprobs_lmeval(self, tasks) -> dict: + def precompute_all_logprobs_lmeval(self, tasks: Sequence[Task]) -> Dict[Any, List[float]]: """Precompute log-likelihoods for ALL tasks using lm-eval's batching. CRITICAL: lm-evaluation-harness batches ALL requests together and uses @@ -931,11 +811,11 @@ def _compute_logprobs_multi_token(self, prompt: str, choices: list) -> list: def run_agents( self, - agents, - task, - environment, + agents: Sequence[AgentAdapter], + task: Task, + environment: Environment, query: str = "", - ): + ) -> Any: """Execute log-likelihood based MCQ evaluation. 
Uses precomputed logprobs if available (for exact lm-eval match), @@ -1017,7 +897,7 @@ def run_agents( return answer - def get_model_adapter(self, model_id: str, **kwargs): + def get_model_adapter(self, model_id: str, **kwargs: Any) -> ModelAdapter: """Provide a HuggingFace ModelAdapter. Note: For logprobs-based evaluation, we don't actually use the adapter @@ -1028,7 +908,7 @@ def get_model_adapter(self, model_id: str, **kwargs): **kwargs: Additional arguments (e.g., register_name). Returns: - HuggingFaceModelAdapter instance. + ``HuggingFaceModelAdapter`` instance. """ from maseval.interface.inference import HuggingFaceModelAdapter @@ -1112,8 +992,15 @@ def load_tasks( # Convert to Tasks tasks = [] for i, item in enumerate(data): + query = item.get("query") or item.get("example") + if query is None: + raise ValueError(f"MMLU task at index {i} has neither 'query' nor 'example' field") + + if "gold" not in item: + raise ValueError(f"MMLU task at index {i} missing required 'gold' field (correct answer index)") + task = Task( - query=item.get("query", item.get("example", "")), + query=query, id=f"mmlu_{i}", environment_data={ "choices": item.get("choices", DEFAULT_CHOICES), @@ -1121,7 +1008,7 @@ def load_tasks( "example": item.get("example", ""), }, evaluation_data={ - "gold": item.get("gold", 0), + "gold": item["gold"], }, metadata={ "doc_id": i, diff --git a/maseval/core/task.py b/maseval/core/task.py index ed617943..081bba6b 100644 --- a/maseval/core/task.py +++ b/maseval/core/task.py @@ -273,6 +273,51 @@ def __iter__(self) -> Iterator[Task]: return iter(self._tasks) +class AnchorPointsTaskQueue(SequentialTaskQueue): + """Task queue that evaluates a specified subset of tasks in a given order. + + Used for anchor-point-based evaluation where performance on a full dataset + is predicted from results on a carefully selected subset. Anchor points are + integer indices into the original task list. 
Only tasks at those indices are + yielded, in the order specified by ``anchor_points``. + + When ``anchor_points`` is ``None``, all tasks are yielded in their original order + (equivalent to ``SequentialTaskQueue``). + + Attributes: + _all_tasks: The complete, unfiltered task list. + _anchor_points: The anchor-point indices, or ``None``. + + Example: + ```python + # Evaluate only tasks at indices 0, 5, 12 + queue = AnchorPointsTaskQueue(tasks, anchor_points=[0, 5, 12]) + + for task in queue: + result = execute(task) # Only 3 tasks + ``` + """ + + def __init__(self, tasks: Iterable[Task], anchor_points: Optional[List[int]] = None) -> None: + """Initialize anchor-points task queue. + + Args: + tasks: Full list of tasks (ordered by index). + anchor_points: Indices into ``tasks`` selecting which tasks to evaluate + and in what order. If ``None``, evaluates all tasks in order. + """ + all_tasks = list(tasks) + self._all_tasks: List[Task] = all_tasks + self._anchor_points: Optional[List[int]] = anchor_points + + if anchor_points is not None: + task_by_index: Dict[int, Task] = {i: task for i, task in enumerate(all_tasks)} + filtered = [task_by_index[idx] for idx in anchor_points if idx in task_by_index] + super().__init__(filtered) + else: + super().__init__(all_tasks) + + class PriorityTaskQueue(BaseTaskQueue): """Execute tasks ordered by priority. 
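The subset-selection logic that `AnchorPointsTaskQueue.__init__` applies before delegating to `SequentialTaskQueue` can be sketched in isolation. This is a standalone illustration of the filtering behavior described in the docstring above, not maseval code — `filter_by_anchor_points` is a hypothetical helper, with plain strings standing in for `Task` objects:

```python
from typing import List, Optional


def filter_by_anchor_points(tasks: List[str], anchor_points: Optional[List[int]]) -> List[str]:
    """Select tasks at the given indices, in anchor order.

    Mirrors the queue's behavior: with ``anchor_points=None`` all tasks are
    kept in their original order; out-of-range indices are silently skipped.
    """
    if anchor_points is None:
        return list(tasks)
    # Index map for O(1) lookup, as in the queue's __init__
    task_by_index = {i: task for i, task in enumerate(tasks)}
    return [task_by_index[i] for i in anchor_points if i in task_by_index]


tasks = ["q0", "q1", "q2", "q3", "q4", "q5"]
print(filter_by_anchor_points(tasks, [5, 0, 2]))  # ['q5', 'q0', 'q2'] — anchor order, not dataset order
print(filter_by_anchor_points(tasks, None))       # all six tasks, original order
print(filter_by_anchor_points(tasks, [1, 99]))    # ['q1'] — index 99 out of range, skipped
```

Note that anchor order is preserved, which matters for DISCO: the downstream predictor expects results aligned with its anchor-point list, not with dataset order.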
diff --git a/mkdocs.yml b/mkdocs.yml index 4b489f50..153215e9 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -129,7 +129,8 @@ nav: - OpenAI: interface/inference/openai.md - Benchmarks: - ConVerse: benchmark/converse.md + - GAIA2: benchmark/gaia2.md - MACS: benchmark/macs.md + - MMLU: benchmark/mmlu.md - MultiAgentBench: benchmark/multiagentbench.md - Tau2: benchmark/tau2.md - - GAIA2: benchmark/gaia2.md diff --git a/pyproject.toml b/pyproject.toml index 51227d46..dc644b10 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -82,6 +82,7 @@ multiagentbench = [ ] tau2 = ["docstring-parser>=0.16", "addict>=2.4.0"] converse = [] +mmlu = [] # LM Evaluation Harness (for HuggingFaceMMLUBenchmark.precompute_all_logprobs_lmeval) lm-eval = ["lm-eval @ git+https://github.com/arubique/lm-evaluation-harness.git@main"] From c0f81b9f71c882dbbc8b019ebbc8cd1c485afe5c Mon Sep 17 00:00:00 2001 From: Alexander Rubinstein Date: Mon, 9 Mar 2026 12:03:37 +0100 Subject: [PATCH 02/23] [Move DISCO queue to core]: - Update dependencies --- docs/benchmark/mmlu.md | 12 +++++++++--- pyproject.toml | 16 +++++++++++++--- 2 files changed, 22 insertions(+), 6 deletions(-) diff --git a/docs/benchmark/mmlu.md b/docs/benchmark/mmlu.md index 623cf63d..bcb54c11 100644 --- a/docs/benchmark/mmlu.md +++ b/docs/benchmark/mmlu.md @@ -18,16 +18,22 @@ Check out the [BENCHMARKS.md](https://github.com/parameterlab/MASEval/blob/main/ ## Installation -MMLU has an optional dependency extra (currently empty, as core MMLU requires no additional packages): +Install MMLU with all dependencies needed to run the HuggingFace benchmark and example script: ```bash pip install maseval[mmlu] ``` -For the HuggingFace implementation, also install transformers: +Or with uv: ```bash -pip install maseval[mmlu,transformers] +uv sync --extra mmlu +``` + +This installs `transformers`, `torch`, `numpy`, and `huggingface_hub` (the latter two via `transformers`). 
You can then run the example: + +```bash +python examples/mmlu_benchmark/mmlu_benchmark.py --model_id alignment-handbook/zephyr-7b-sft-full ``` For DISCO prediction support: diff --git a/pyproject.toml b/pyproject.toml index dc644b10..c252adeb 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -82,10 +82,20 @@ multiagentbench = [ ] tau2 = ["docstring-parser>=0.16", "addict>=2.4.0"] converse = [] -mmlu = [] +# HuggingFace model + tokenizer, default dataset download; numpy for example script and anchor-point loading; +# lm-eval for --use_lmeval_batching (exact lm-evaluation-harness reproduction); aiohttp required by lm_eval.models.api_models +mmlu = [ + "transformers>=4.37.0", + "numpy>=1.20.0", + "aiohttp>=3.9.0", + "lm-eval @ git+https://github.com/arubique/lm-evaluation-harness.git@main", +] -# LM Evaluation Harness (for HuggingFaceMMLUBenchmark.precompute_all_logprobs_lmeval) -lm-eval = ["lm-eval @ git+https://github.com/arubique/lm-evaluation-harness.git@main"] +# LM Evaluation Harness (same as in mmlu; aiohttp required by lm_eval.models.api_models) +lm-eval = [ + "aiohttp>=3.9.0", + "lm-eval @ git+https://github.com/arubique/lm-evaluation-harness.git@main", +] # DISCO prediction (for MMLU benchmark example) disco = [ From 6ad80a8d7bde522da002a2aa74bfa7bbb6d0ef7e Mon Sep 17 00:00:00 2001 From: Alexander Rubinstein Date: Mon, 9 Mar 2026 12:36:44 +0100 Subject: [PATCH 03/23] [Move DISCO queue to core]: - Add InformativeSubsetQueue - Rename AnchorPointsTaskQueue to DISCOQueue - Make DISCOQueue a subclass of InformativeSubsetQueue --- CHANGELOG.md | 4 +- docs/benchmark/mmlu.md | 4 +- maseval/__init__.py | 6 ++- maseval/benchmark/mmlu/__init__.py | 7 +-- maseval/benchmark/mmlu/mmlu.py | 10 ++-- maseval/core/task.py | 73 +++++++++++++++++++++++------- 6 files changed, 74 insertions(+), 30 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index c1427ccd..aec9785b 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -41,7 +41,7 @@ and this project adheres to [Semantic 
Versioning](https://semver.org/spec/v2.0.0 **Core** -- Added `AnchorPointsTaskQueue` to `maseval.core.task` for subset-based evaluation (e.g., anchor-point selection for DISCO). Available via `from maseval import AnchorPointsTaskQueue`. (PR: #34) +- Added `DISCOQueue` to `maseval.core.task` for subset-based evaluation (e.g., anchor-point selection for DISCO). Available via `from maseval import DISCOQueue`. (PR: #34) - Added `SeedGenerator` abstract base class and `DefaultSeedGenerator` implementation for reproducible benchmark runs via SHA-256-based seed derivation (PR: #24) - Added `seed` and `seed_generator` parameters to `Benchmark.__init__` for enabling reproducibility (PR: #24) - Added `seed_generator` parameter to all benchmark setup methods (`setup_environment`, `setup_user`, `setup_agents`, `setup_evaluators`) (PR: #24) @@ -88,7 +88,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 **Benchmarks** -- `MMLUBenchmark` no longer implements `setup_agents()` — consistent with other benchmarks, agent creation is left to concrete subclasses (e.g., `HuggingFaceMMLUBenchmark`). Removed silent `.get()` fallbacks for required fields (`gold`, `query`, `model_id`) so missing data surfaces errors immediately instead of failing silently. `AnchorPointsTaskQueue` moved from `maseval.benchmark.mmlu` to `maseval.core.task` and now extends `SequentialTaskQueue` instead of `AdaptiveTaskQueue`. Added `mmlu` optional extra (`pip install maseval[mmlu]`). (PR: #34) +- `MMLUBenchmark` no longer implements `setup_agents()` — consistent with other benchmarks, agent creation is left to concrete subclasses (e.g., `HuggingFaceMMLUBenchmark`). Removed silent `.get()` fallbacks for required fields (`gold`, `query`, `model_id`) so missing data surfaces errors immediately instead of failing silently. `DISCOQueue` moved from `maseval.benchmark.mmlu` to `maseval.core.task` and now extends `SequentialTaskQueue` instead of `AdaptiveTaskQueue`. 
Added `mmlu` optional extra (`pip install maseval[mmlu]`). (PR: #34) - `MACSBenchmark` and `Tau2Benchmark` benchmarks now actively use the seeding system by deriving seeds for model adapters. Seeds are passed to agents, user simulators, tool simulators, and LLM-based evaluators for reproducible runs. (PR: #26) - `Gaia2Benchmark`: Seeds `agents/gaia2_agent`, `evaluators/judge` - `MACSBenchmark`: Seeds `environment/tools/tool_{name}`, `simulators/user`, `evaluators/user_gsr`, `evaluators/system_gsr` diff --git a/docs/benchmark/mmlu.md b/docs/benchmark/mmlu.md index bcb54c11..d3c8c88d 100644 --- a/docs/benchmark/mmlu.md +++ b/docs/benchmark/mmlu.md @@ -10,7 +10,7 @@ The **MMLU Benchmark** evaluates language models on multiple-choice questions sp [MMLU](https://arxiv.org/abs/2009.03300) (Hendrycks et al., 2021) is a widely used benchmark for measuring knowledge and reasoning across diverse domains. The MASEval implementation features: - **Log-likelihood MCQ evaluation** matching lm-evaluation-harness methodology -- **Anchor-point task selection** via `AnchorPointsTaskQueue` for DISCO-style subset evaluation +- **Anchor-point task selection** via `DISCOQueue` for DISCO-style subset evaluation - **HuggingFace integration** with batched log-probability computation - **lm-eval compatibility** mode for exact numerical reproduction @@ -88,7 +88,7 @@ tasks = load_tasks( anchor_points_path="/path/to/anchor_points.json", ) -# tasks is an AnchorPointsTaskQueue — only anchor tasks are evaluated +# tasks is a DISCOQueue — only anchor tasks are evaluated print(f"Evaluating {len(tasks)} anchor tasks") ``` diff --git a/maseval/__init__.py b/maseval/__init__.py index 387a3345..957fee9b 100644 --- a/maseval/__init__.py +++ b/maseval/__init__.py @@ -16,7 +16,8 @@ BaseTaskQueue, TaskQueue, SequentialTaskQueue, - AnchorPointsTaskQueue, + InformativeSubsetQueue, + DISCOQueue, PriorityTaskQueue, AdaptiveTaskQueue, ) @@ -94,7 +95,8 @@ "BaseTaskQueue", "TaskQueue", "SequentialTaskQueue", -
"AnchorPointsTaskQueue", + "InformativeSubsetQueue", + "DISCOQueue", "PriorityTaskQueue", "AdaptiveTaskQueue", # Model adapters diff --git a/maseval/benchmark/mmlu/__init__.py b/maseval/benchmark/mmlu/__init__.py index ac5ac154..bc7b4360 100644 --- a/maseval/benchmark/mmlu/__init__.py +++ b/maseval/benchmark/mmlu/__init__.py @@ -7,7 +7,7 @@ HuggingFaceMMLUBenchmark, load_tasks, ) - from maseval import AnchorPointsTaskQueue + from maseval import DISCOQueue, InformativeSubsetQueue # Load tasks and anchor points tasks = load_tasks( @@ -20,7 +20,7 @@ results = benchmark.run(tasks=tasks, agent_data={"model_id": "meta-llama/Llama-2-7b-hf"}) """ -from maseval import AnchorPointsTaskQueue +from maseval import DISCOQueue from .mmlu import ( DEFAULT_AGENT_NAME, @@ -58,7 +58,8 @@ "MMLUEvaluator", "MMLUModelAgent", "MMLUAgentAdapter", - "AnchorPointsTaskQueue", + "InformativeSubsetQueue", + "DISCOQueue", "load_tasks", "compute_benchmark_metrics", ] diff --git a/maseval/benchmark/mmlu/mmlu.py b/maseval/benchmark/mmlu/mmlu.py index 0b6de68a..11159ebb 100644 --- a/maseval/benchmark/mmlu/mmlu.py +++ b/maseval/benchmark/mmlu/mmlu.py @@ -10,7 +10,7 @@ from maseval.benchmark.mmlu import ( HuggingFaceMMLUBenchmark, load_tasks, ) - from maseval import AnchorPointsTaskQueue + from maseval import DISCOQueue # Load tasks (optionally filtered to anchor points) tasks = load_tasks( @@ -39,7 +39,7 @@ from maseval import ( AgentAdapter, - AnchorPointsTaskQueue, + DISCOQueue, Benchmark, Environment, Evaluator, @@ -963,13 +963,13 @@ def load_tasks( data_path: Union[str, Path], anchor_points_path: Optional[Union[str, Path]] = None, limit: Optional[int] = None, -) -> Union[AnchorPointsTaskQueue, SequentialTaskQueue]: +) -> Union[DISCOQueue, SequentialTaskQueue]: """Load MMLU tasks from JSON file. Args: data_path: Path to MMLU prompts JSON file (mmlu_prompts_examples.json format). anchor_points_path: Optional path to anchor points pickle file. 
- If provided, returns an AnchorPointsTaskQueue that evaluates + If provided, returns a DISCOQueue that evaluates only the anchor tasks in order. limit: Optional limit on number of tasks to load. @@ -1024,7 +1024,7 @@ def load_tasks( # Create appropriate queue if anchor_points is not None: - return AnchorPointsTaskQueue(tasks, anchor_points) + return DISCOQueue(tasks, anchor_points) else: return SequentialTaskQueue(tasks) diff --git a/maseval/core/task.py b/maseval/core/task.py index 081bba6b..22ec5e0f 100644 --- a/maseval/core/task.py +++ b/maseval/core/task.py @@ -273,51 +273,92 @@ def __iter__(self) -> Iterator[Task]: return iter(self._tasks) -class AnchorPointsTaskQueue(SequentialTaskQueue): - """Task queue that evaluates a specified subset of tasks in a given order. +class InformativeSubsetQueue(SequentialTaskQueue): + """Evaluates an informative subset of tasks in a specified order. - Used for anchor-point-based evaluation where performance on a full dataset - is predicted from results on a carefully selected subset. Anchor points are - integer indices into the original task list. Only tasks at those indices are - yielded, in the order specified by ``anchor_points``. + Used for efficient evaluation where a carefully selected subset of tasks + can predict performance on the full dataset. The subset is defined by + ``indices`` — integer positions into the original task list. Only tasks + at those positions are yielded, in the order given by ``indices``. - When ``anchor_points`` is ``None``, all tasks are yielded in their original order - (equivalent to ``SequentialTaskQueue``). + The informativeness criterion (how the indices were chosen) is determined + by the caller or by a subclass. This base class is criterion-agnostic. + + When ``indices`` is ``None``, all tasks are yielded in their original + order (equivalent to ``SequentialTaskQueue``). Attributes: _all_tasks: The complete, unfiltered task list. - _anchor_points: The anchor-point indices, or ``None``.
+ _indices: The subset indices, or ``None``. Example: ```python # Evaluate only tasks at indices 0, 5, 12 - queue = AnchorPointsTaskQueue(tasks, anchor_points=[0, 5, 12]) + queue = InformativeSubsetQueue(tasks, indices=[0, 5, 12]) for task in queue: result = execute(task) # Only 3 tasks ``` """ - def __init__(self, tasks: Iterable[Task], anchor_points: Optional[List[int]] = None) -> None: - """Initialize anchor-points task queue. + def __init__(self, tasks: Iterable[Task], indices: Optional[List[int]] = None) -> None: + """Initialize informative-subset task queue. Args: tasks: Full list of tasks (ordered by index). - anchor_points: Indices into ``tasks`` selecting which tasks to evaluate + indices: Positions into ``tasks`` selecting which tasks to evaluate and in what order. If ``None``, evaluates all tasks in order. """ all_tasks = list(tasks) self._all_tasks: List[Task] = all_tasks - self._anchor_points: Optional[List[int]] = anchor_points + self._indices: Optional[List[int]] = indices - if anchor_points is not None: + if indices is not None: task_by_index: Dict[int, Task] = {i: task for i, task in enumerate(all_tasks)} - filtered = [task_by_index[idx] for idx in anchor_points if idx in task_by_index] + filtered = [task_by_index[idx] for idx in indices if idx in task_by_index] super().__init__(filtered) else: super().__init__(all_tasks) +class DISCOQueue(InformativeSubsetQueue): + """Diversity-based informative subset using DISCO anchor points. + + Selects a diverse subset of tasks (anchor points) for evaluation. Full + benchmark performance is then predicted from results on this subset using + DISCO (DISCOvering key features for accurate prediction of LLM abilities + on benchmarks). + + The informativeness criterion is **diversity**: anchor points are chosen + to maximise disagreement across models, so that a small evaluation set + captures the discriminative structure of the full benchmark. 
+ + Reference: `DISCO: DISCOvering key features for accurate prediction of + LLM abilities on benchmarks <https://arxiv.org/abs/2407.12890>`_ + + Example: + ```python + queue = DISCOQueue(tasks, anchor_points=[0, 5, 12]) + + for task in queue: + result = execute(task) # Only 3 tasks + ``` + """ + + def __init__(self, tasks: Iterable[Task], anchor_points: Optional[List[int]] = None) -> None: + """Initialize DISCO task queue. + + Args: + tasks: Full list of tasks (ordered by index). + anchor_points: Diversity-selected indices into ``tasks``. + Typically loaded from a DISCO anchor-points file or + downloaded from a HuggingFace DISCO model repo. + If ``None``, evaluates all tasks in order. + """ + self._anchor_points: Optional[List[int]] = anchor_points + super().__init__(tasks, indices=anchor_points) + + class PriorityTaskQueue(BaseTaskQueue): """Execute tasks ordered by priority. From b498ce7c08da89156a785218483e2e6ad6ee413b Mon Sep 17 00:00:00 2001 From: Alexander Rubinstein Date: Wed, 11 Mar 2026 17:28:24 +0100 Subject: [PATCH 04/23] [Move DISCO queue to core]: - Rename HuggingFaceMMLUBenchmark to DefaultMMLUBenchmark for consistency with other benchmarks --- CHANGELOG.md | 4 ++-- docs/benchmark/mmlu.md | 6 +++--- examples/mmlu_benchmark/mmlu_benchmark.py | 4 ++-- maseval/benchmark/mmlu/__init__.py | 8 ++++---- maseval/benchmark/mmlu/mmlu.py | 8 ++++---- 5 files changed, 15 insertions(+), 15 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index aec9785b..40d441bb 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -11,7 +11,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 **Benchmarks** -- MMLU Benchmark with DISCO support: Integration for evaluating language models on MMLU (Massive Multitask Language Understanding) multiple-choice questions, compatible with DISCO anchor-point methodology.
Includes `MMLUBenchmark`, `HuggingFaceMMLUBenchmark`, `MMLUEnvironment`, `MMLUEvaluator`, `MMLUModelAgent`, `MMLUAgentAdapter`, `load_tasks()`, and `compute_benchmark_metrics()`. Install with `pip install maseval[mmlu]`. Optional extras: `lm-eval` (for `HuggingFaceMMLUBenchmark.precompute_all_logprobs_lmeval`), `disco` (for DISCO prediction in the example). (PR: #34) +- MMLU Benchmark with DISCO support: Integration for evaluating language models on MMLU (Massive Multitask Language Understanding) multiple-choice questions, compatible with DISCO anchor-point methodology. Includes `MMLUBenchmark`, `DefaultMMLUBenchmark`, `MMLUEnvironment`, `MMLUEvaluator`, `MMLUModelAgent`, `MMLUAgentAdapter`, `load_tasks()`, and `compute_benchmark_metrics()`. Install with `pip install maseval[mmlu]`. Optional extras: `lm-eval` (for `DefaultMMLUBenchmark.precompute_all_logprobs_lmeval`), `disco` (for DISCO prediction in the example). (PR: #34) - CONVERSE benchmark for contextual safety evaluation in adversarial agent-to-agent conversations, including `ConverseBenchmark`, `DefaultAgentConverseBenchmark`, `ConverseEnvironment`, `ConverseExternalAgent`, `PrivacyEvaluator`, `SecurityEvaluator`, and `load_tasks()` utilities for `travel`, `real_estate`, and `insurance` domains. Benchmark source files are now downloaded on first use via `ensure_data_exists()` instead of being bundled in the package. (PR: #28) @@ -88,7 +88,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 **Benchmarks** -- `MMLUBenchmark` no longer implements `setup_agents()` — consistent with other benchmarks, agent creation is left to concrete subclasses (e.g., `HuggingFaceMMLUBenchmark`). Removed silent `.get()` fallbacks for required fields (`gold`, `query`, `model_id`) so missing data surfaces errors immediately instead of failing silently. `DISCOQueue` moved from `maseval.benchmark.mmlu` to `maseval.core.task` and now extends `SequentialTaskQueue` instead of `AdaptiveTaskQueue`. 
Added `mmlu` optional extra (`pip install maseval[mmlu]`). (PR: #34) +- `MMLUBenchmark` no longer implements `setup_agents()` — consistent with other benchmarks, agent creation is left to concrete subclasses (e.g., `DefaultMMLUBenchmark`). Removed silent `.get()` fallbacks for required fields (`gold`, `query`, `model_id`) so missing data surfaces errors immediately instead of failing silently. `DISCOQueue` moved from `maseval.benchmark.mmlu` to `maseval.core.task` and now extends `SequentialTaskQueue` instead of `AdaptiveTaskQueue`. Added `mmlu` optional extra (`pip install maseval[mmlu]`). (PR: #34) - `MACSBenchmark` and `Tau2Benchmark` benchmarks now actively use the seeding system by deriving seeds for model adapters. Seeds are passed to agents, user simulators, tool simulators, and LLM-based evaluators for reproducible runs. (PR: #26) - `Gaia2Benchmark`: Seeds `agents/gaia2_agent`, `evaluators/judge` - `MACSBenchmark`: Seeds `environment/tools/tool_{name}`, `simulators/user`, `evaluators/user_gsr`, `evaluators/system_gsr` diff --git a/docs/benchmark/mmlu.md b/docs/benchmark/mmlu.md index d3c8c88d..965d1a5f 100644 --- a/docs/benchmark/mmlu.md +++ b/docs/benchmark/mmlu.md @@ -52,7 +52,7 @@ pip install maseval[lm-eval] ```python from maseval.benchmark.mmlu import ( - HuggingFaceMMLUBenchmark, + DefaultMMLUBenchmark, load_tasks, compute_benchmark_metrics, ) @@ -61,7 +61,7 @@ from maseval.benchmark.mmlu import ( tasks = load_tasks(data_path="/path/to/mmlu_prompts_examples.json") # Create benchmark with HuggingFace model -benchmark = HuggingFaceMMLUBenchmark( +benchmark = DefaultMMLUBenchmark( model_id="meta-llama/Llama-2-7b-hf", device="cuda:0", ) @@ -118,7 +118,7 @@ class MyMMLUBenchmark(MMLUBenchmark): ::: maseval.benchmark.mmlu.MMLUBenchmark -::: maseval.benchmark.mmlu.HuggingFaceMMLUBenchmark +::: maseval.benchmark.mmlu.DefaultMMLUBenchmark ::: maseval.benchmark.mmlu.MMLUEnvironment diff --git a/examples/mmlu_benchmark/mmlu_benchmark.py 
b/examples/mmlu_benchmark/mmlu_benchmark.py index 023915bd..101aeeba 100644 --- a/examples/mmlu_benchmark/mmlu_benchmark.py +++ b/examples/mmlu_benchmark/mmlu_benchmark.py @@ -52,7 +52,7 @@ # MMLU benchmark imports from maseval.benchmark.mmlu import ( DEFAULT_DEVICE, - HuggingFaceMMLUBenchmark, + DefaultMMLUBenchmark, load_tasks, compute_benchmark_metrics, ) @@ -691,7 +691,7 @@ def main(): ) # Create benchmark - benchmark = HuggingFaceMMLUBenchmark( + benchmark = DefaultMMLUBenchmark( model_id=args.model_id, device=args.device, trust_remote_code=True, diff --git a/maseval/benchmark/mmlu/__init__.py b/maseval/benchmark/mmlu/__init__.py index bc7b4360..dd9fd3dc 100644 --- a/maseval/benchmark/mmlu/__init__.py +++ b/maseval/benchmark/mmlu/__init__.py @@ -4,7 +4,7 @@ Usage: from maseval.benchmark.mmlu import ( - HuggingFaceMMLUBenchmark, + DefaultMMLUBenchmark, load_tasks, ) from maseval import DISCOQueue, InformativeSubsetQueue @@ -16,7 +16,7 @@ ) # Run benchmark - benchmark = HuggingFaceMMLUBenchmark(model_id="meta-llama/Llama-2-7b-hf") + benchmark = DefaultMMLUBenchmark(model_id="meta-llama/Llama-2-7b-hf") results = benchmark.run(tasks=tasks, agent_data={"model_id": "meta-llama/Llama-2-7b-hf"}) """ @@ -33,7 +33,7 @@ TARGET_DELIMITER, TASK_TYPE_MMLU, MMLUBenchmark, - HuggingFaceMMLUBenchmark, + DefaultMMLUBenchmark, MMLUEnvironment, MMLUEvaluator, MMLUModelAgent, @@ -53,7 +53,7 @@ "TARGET_DELIMITER", "TASK_TYPE_MMLU", "MMLUBenchmark", - "HuggingFaceMMLUBenchmark", + "DefaultMMLUBenchmark", "MMLUEnvironment", "MMLUEvaluator", "MMLUModelAgent", diff --git a/maseval/benchmark/mmlu/mmlu.py b/maseval/benchmark/mmlu/mmlu.py index 11159ebb..870781ff 100644 --- a/maseval/benchmark/mmlu/mmlu.py +++ b/maseval/benchmark/mmlu/mmlu.py @@ -8,7 +8,7 @@ Usage: from maseval.benchmark.mmlu import ( - HuggingFaceMMLUBenchmark, load_tasks, + DefaultMMLUBenchmark, load_tasks, ) from maseval import DISCOQueue @@ -19,7 +19,7 @@ ) # Run with the HuggingFace concrete implementation - 
benchmark = HuggingFaceMMLUBenchmark(model_id="meta-llama/Llama-2-7b-hf") + benchmark = DefaultMMLUBenchmark(model_id="meta-llama/Llama-2-7b-hf") results = benchmark.run(tasks=tasks, agent_data={"model_id": "meta-llama/Llama-2-7b-hf"}) """ @@ -342,7 +342,7 @@ class MMLUBenchmark(Benchmark): - ``setup_agents()`` - create agents for MCQ evaluation - ``get_model_adapter()`` - provide model adapters - For a ready-to-use implementation, see ``HuggingFaceMMLUBenchmark``. + For a ready-to-use implementation, see ``DefaultMMLUBenchmark``. """ def __init__( @@ -431,7 +431,7 @@ def evaluate( return results -class HuggingFaceMMLUBenchmark(MMLUBenchmark): +class DefaultMMLUBenchmark(MMLUBenchmark): """MMLU Benchmark using HuggingFace transformers models. This concrete implementation uses log-likelihood based MCQ evaluation From 14bcb3f7d6d4c56d67d739de039b3d7fa8edd3ee Mon Sep 17 00:00:00 2001 From: Alexander Rubinstein Date: Wed, 11 Mar 2026 17:51:22 +0100 Subject: [PATCH 05/23] [Move DISCO queue to core] Add ModelScorer, ModelAgentAdapter, and rename HuggingFaceModelAdapter Introduce two new core abstractions and refactor the HuggingFace inference layer: - ModelScorer (maseval.core.scorer): ABC for log-likelihood scoring, parallel to ModelAdapter for generation. Methods: loglikelihood(), loglikelihood_batch(), loglikelihood_choices(). - ModelAgentAdapter (maseval.core.agent): generic adapter wrapping any ModelAdapter as an AgentAdapter, replacing benchmark-specific wrappers like MMLUModelAgent/MMLUAgentAdapter. - HuggingFaceModelAdapter renamed to HuggingFacePipelineModelAdapter (old name kept as backwards-compatible alias). - HuggingFaceModelScorer (maseval.interface.inference): concrete ModelScorer backed by AutoModelForCausalLM, with single-token optimisation for MCQ evaluation. Extracted from DefaultMMLUBenchmark. - DefaultMMLUBenchmark refactored to delegate scoring to HuggingFaceModelScorer and use ModelAgentAdapter. 
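The scorer abstraction this commit introduces can be sketched in a few lines. The method names (`loglikelihood`, `loglikelihood_batch`, `loglikelihood_choices`) come from the commit message above; the signatures and the `FakeScorer` stand-in are illustrative assumptions, not the MASEval source:

```python
from abc import ABC, abstractmethod
from typing import List, Sequence, Tuple


class ModelScorer(ABC):
    """Scores continuations by log-likelihood instead of generating text."""

    @abstractmethod
    def loglikelihood(self, context: str, continuation: str) -> float:
        """Return log P(continuation | context) under the model."""

    def loglikelihood_batch(self, pairs: Sequence[Tuple[str, str]]) -> List[float]:
        """Score many (context, continuation) pairs; a naive loop as default."""
        return [self.loglikelihood(ctx, cont) for ctx, cont in pairs]

    def loglikelihood_choices(self, context: str, choices: Sequence[str]) -> int:
        """Return the index of the highest-scoring choice (MCQ prediction)."""
        scores = self.loglikelihood_batch([(context, c) for c in choices])
        return max(range(len(scores)), key=scores.__getitem__)


class FakeScorer(ModelScorer):
    """Deterministic stand-in for a real HuggingFace-backed scorer (hypothetical)."""

    def __init__(self, table):
        self.table = table

    def loglikelihood(self, context: str, continuation: str) -> float:
        return self.table[continuation]


scorer = FakeScorer({" A": -1.2, " B": -0.4, " C": -2.0, " D": -3.1})
print(scorer.loglikelihood_choices("Question: ...\nAnswer:", [" A", " B", " C", " D"]))  # 1
```

A concrete backend like `HuggingFaceModelScorer` would override `loglikelihood_batch` with true batched forward passes; the base class only needs the single-pair method.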
--- CHANGELOG.md | 11 +- docs/benchmark/mmlu.md | 10 +- maseval/__init__.py | 7 +- maseval/benchmark/mmlu/__init__.py | 4 - maseval/benchmark/mmlu/mmlu.py | 448 ++---------------- maseval/core/agent.py | 76 ++- maseval/core/model.py | 2 +- maseval/core/scorer.py | 276 +++++++++++ maseval/interface/inference/__init__.py | 39 +- maseval/interface/inference/huggingface.py | 35 +- .../interface/inference/huggingface_scorer.py | 264 +++++++++++ 11 files changed, 722 insertions(+), 450 deletions(-) create mode 100644 maseval/core/scorer.py create mode 100644 maseval/interface/inference/huggingface_scorer.py diff --git a/CHANGELOG.md b/CHANGELOG.md index 40d441bb..c6508428 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -11,7 +11,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 **Benchmarks** -- MMLU Benchmark with DISCO support: Integration for evaluating language models on MMLU (Massive Multitask Language Understanding) multiple-choice questions, compatible with DISCO anchor-point methodology. Includes `MMLUBenchmark`, `DefaultMMLUBenchmark`, `MMLUEnvironment`, `MMLUEvaluator`, `MMLUModelAgent`, `MMLUAgentAdapter`, `load_tasks()`, and `compute_benchmark_metrics()`. Install with `pip install maseval[mmlu]`. Optional extras: `lm-eval` (for `DefaultMMLUBenchmark.precompute_all_logprobs_lmeval`), `disco` (for DISCO prediction in the example). (PR: #34) +- MMLU Benchmark with DISCO support: Integration for evaluating language models on MMLU (Massive Multitask Language Understanding) multiple-choice questions, compatible with DISCO anchor-point methodology. Includes `MMLUBenchmark`, `DefaultMMLUBenchmark`, `MMLUEnvironment`, `MMLUEvaluator`, `load_tasks()`, and `compute_benchmark_metrics()`. Install with `pip install maseval[mmlu]`. Optional extras: `lm-eval` (for `DefaultMMLUBenchmark.precompute_all_logprobs_lmeval`), `disco` (for DISCO prediction in the example). 
(PR: #34) - CONVERSE benchmark for contextual safety evaluation in adversarial agent-to-agent conversations, including `ConverseBenchmark`, `DefaultAgentConverseBenchmark`, `ConverseEnvironment`, `ConverseExternalAgent`, `PrivacyEvaluator`, `SecurityEvaluator`, and `load_tasks()` utilities for `travel`, `real_estate`, and `insurance` domains. Benchmark source files are now downloaded on first use via `ensure_data_exists()` instead of being bundled in the package. (PR: #28) @@ -42,16 +42,21 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 **Core** - Added `DISCOQueue` to `maseval.core.task` for subset-based evaluation (e.g., anchor-point selection for DISCO). Available via `from maseval import DISCOQueue`. (PR: #34) +- Added `ModelScorer` abstract base class in `maseval.core.scorer` for log-likelihood scoring, with `loglikelihood()`, `loglikelihood_batch()`, and `loglikelihood_choices()` methods. (PR: #PR_NUMBER_PLACEHOLDER) +- Added `ModelAgentAdapter` in `maseval.core.agent` — a generic adapter that wraps any `ModelAdapter` as an `AgentAdapter` for direct model evaluation (replaces benchmark-specific agent wrappers). 
(PR: #PR_NUMBER_PLACEHOLDER) - Added `SeedGenerator` abstract base class and `DefaultSeedGenerator` implementation for reproducible benchmark runs via SHA-256-based seed derivation (PR: #24) - Added `seed` and `seed_generator` parameters to `Benchmark.__init__` for enabling reproducibility (PR: #24) - Added `seed_generator` parameter to all benchmark setup methods (`setup_environment`, `setup_user`, `setup_agents`, `setup_evaluators`) (PR: #24) - Added `seed` parameter to `ModelAdapter.__init__` for deterministic model inference (PR: #24) - Added `SeedingError` exception for providers that don't support seeding (Anthropic models raise this if seed is provided) (PR: #24) -- Added seed support to interface adapters: `OpenAIModelAdapter`, `GoogleGenAIModelAdapter`, `LiteLLMModelAdapter`, `HuggingFaceModelAdapter` pass seeds to underlying APIs (PR: #24) +- Added seed support to interface adapters: `OpenAIModelAdapter`, `GoogleGenAIModelAdapter`, `LiteLLMModelAdapter`, `HuggingFacePipelineModelAdapter` pass seeds to underlying APIs (PR: #24) - Added `UserExhaustedError` exception in `maseval.core.exceptions` for flow control when a user's turns are exhausted (PR: #39) **Interface** +- Added `HuggingFaceModelScorer` in `maseval.interface.inference` — log-likelihood scorer backed by a HuggingFace `AutoModelForCausalLM`, with single-token optimisation for MCQ evaluation. Implements the `ModelScorer` interface. (PR: #PR_NUMBER_PLACEHOLDER) +- Renamed `HuggingFaceModelAdapter` → `HuggingFacePipelineModelAdapter` to distinguish it from the new scorer. The old name remains as a backwards-compatible alias. 
(PR: #PR_NUMBER_PLACEHOLDER) + - CAMEL-AI integration: `CamelAgentAdapter` and `CamelLLMUser` for evaluating CAMEL-AI ChatAgent-based systems (PR: #22) - Added `CamelAgentUser` for using a CAMEL ChatAgent as the user in agent-to-agent evaluation (PR: #22) - Added `camel_role_playing_execution_loop()` for benchmarks using CAMEL's RolePlaying semantics (PR: #22) @@ -88,7 +93,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 **Benchmarks** -- `MMLUBenchmark` no longer implements `setup_agents()` — consistent with other benchmarks, agent creation is left to concrete subclasses (e.g., `DefaultMMLUBenchmark`). Removed silent `.get()` fallbacks for required fields (`gold`, `query`, `model_id`) so missing data surfaces errors immediately instead of failing silently. `DISCOQueue` moved from `maseval.benchmark.mmlu` to `maseval.core.task` and now extends `SequentialTaskQueue` instead of `AdaptiveTaskQueue`. Added `mmlu` optional extra (`pip install maseval[mmlu]`). (PR: #34) +- `MMLUBenchmark` no longer implements `setup_agents()` — consistent with other benchmarks, agent creation is left to concrete subclasses (e.g., `DefaultMMLUBenchmark`). Removed silent `.get()` fallbacks for required fields (`gold`, `query`, `model_id`) so missing data surfaces errors immediately instead of failing silently. `DISCOQueue` moved from `maseval.benchmark.mmlu` to `maseval.core.task` and now extends `SequentialTaskQueue` instead of `AdaptiveTaskQueue`. Added `mmlu` optional extra (`pip install maseval[mmlu]`). `DefaultMMLUBenchmark` now delegates log-likelihood computation to `HuggingFaceModelScorer` and uses `ModelAgentAdapter` instead of the MMLU-specific `MMLUModelAgent`/`MMLUAgentAdapter` (removed). (PR: #34) - `MACSBenchmark` and `Tau2Benchmark` benchmarks now actively use the seeding system by deriving seeds for model adapters. Seeds are passed to agents, user simulators, tool simulators, and LLM-based evaluators for reproducible runs. 
(PR: #26) - `Gaia2Benchmark`: Seeds `agents/gaia2_agent`, `evaluators/judge` - `MACSBenchmark`: Seeds `environment/tools/tool_{name}`, `simulators/user`, `evaluators/user_gsr`, `evaluators/system_gsr` diff --git a/docs/benchmark/mmlu.md b/docs/benchmark/mmlu.md index 965d1a5f..7514ad18 100644 --- a/docs/benchmark/mmlu.md +++ b/docs/benchmark/mmlu.md @@ -97,13 +97,13 @@ print(f"Evaluating {len(tasks)} anchor tasks") `MMLUBenchmark` is a framework-agnostic base class. To use a different model backend, subclass it and implement `setup_agents()` and `get_model_adapter()`: ```python -from maseval.benchmark.mmlu import MMLUBenchmark, MMLUModelAgent, MMLUAgentAdapter +from maseval import ModelAgentAdapter +from maseval.benchmark.mmlu import MMLUBenchmark class MyMMLUBenchmark(MMLUBenchmark): def setup_agents(self, agent_data, environment, task, user, seed_generator): model = self.get_model_adapter(agent_data["model_id"]) - agent = MMLUModelAgent(model, name="mmlu_agent") - adapter = MMLUAgentAdapter(agent, "mmlu_agent") + adapter = ModelAgentAdapter(model, name="mmlu_agent") return [adapter], {"mmlu_agent": adapter} def get_model_adapter(self, model_id, **kwargs): @@ -124,10 +124,6 @@ class MyMMLUBenchmark(MMLUBenchmark): ::: maseval.benchmark.mmlu.MMLUEvaluator -::: maseval.benchmark.mmlu.MMLUModelAgent - -::: maseval.benchmark.mmlu.MMLUAgentAdapter - ::: maseval.benchmark.mmlu.load_tasks ::: maseval.benchmark.mmlu.compute_benchmark_metrics diff --git a/maseval/__init__.py b/maseval/__init__.py index 957fee9b..2aa5b927 100644 --- a/maseval/__init__.py +++ b/maseval/__init__.py @@ -22,7 +22,7 @@ AdaptiveTaskQueue, ) from .core.environment import Environment -from .core.agent import AgentAdapter +from .core.agent import AgentAdapter, ModelAgentAdapter from .core.benchmark import Benchmark, TaskExecutionStatus from .core.callback_handler import CallbackHandler from .core.callback import BenchmarkCallback, EnvironmentCallback, AgentCallback @@ -35,6 +35,7 @@ 
UserSimulatorError, ) from .core.model import ModelAdapter, ChatResponse +from .core.scorer import ModelScorer from .core.user import User, LLMUser, AgenticLLMUser, TerminationReason from .core.evaluator import Evaluator from .core.history import MessageHistory, ToolInvocationHistory @@ -63,6 +64,7 @@ # Core abstractions "Environment", "AgentAdapter", + "ModelAgentAdapter", "Benchmark", "TaskExecutionStatus", # Callbacks @@ -99,9 +101,10 @@ "DISCOQueue", "PriorityTaskQueue", "AdaptiveTaskQueue", - # Model adapters + # Model adapters and scorers "ModelAdapter", "ChatResponse", + "ModelScorer", # Exceptions and validation "MASEvalError", "AgentError", diff --git a/maseval/benchmark/mmlu/__init__.py b/maseval/benchmark/mmlu/__init__.py index dd9fd3dc..6c6f751c 100644 --- a/maseval/benchmark/mmlu/__init__.py +++ b/maseval/benchmark/mmlu/__init__.py @@ -36,8 +36,6 @@ DefaultMMLUBenchmark, MMLUEnvironment, MMLUEvaluator, - MMLUModelAgent, - MMLUAgentAdapter, load_tasks, compute_benchmark_metrics, ) @@ -56,8 +54,6 @@ "DefaultMMLUBenchmark", "MMLUEnvironment", "MMLUEvaluator", - "MMLUModelAgent", - "MMLUAgentAdapter", "InformativeSubsetQueue", "DISCOQueue", "load_tasks", diff --git a/maseval/benchmark/mmlu/mmlu.py b/maseval/benchmark/mmlu/mmlu.py index 870781ff..ef895e65 100644 --- a/maseval/benchmark/mmlu/mmlu.py +++ b/maseval/benchmark/mmlu/mmlu.py @@ -43,15 +43,13 @@ Benchmark, Environment, Evaluator, - MessageHistory, ModelAdapter, + ModelAgentAdapter, Task, User, SeedGenerator, ) from maseval.core.task import SequentialTaskQueue -from maseval.core.tracing import TraceableMixin -from maseval.core.config import ConfigurableMixin # ============================================================================= @@ -229,103 +227,6 @@ def _parse_answer(self, response: str) -> int: return -1 -# ============================================================================= -# Model Adapter Wrapper for MCQ -# 
============================================================================= - - -class MMLUModelAgent(TraceableMixin, ConfigurableMixin): - """Simple agent wrapper that passes prompts to a model for MCQ evaluation. - - This is a minimal agent that just forwards prompts to the model - and returns the response. It supports tracing for MASEval integration. - """ - - def __init__(self, model: ModelAdapter, name: str = DEFAULT_AGENT_NAME): - """Initialize MMLU model agent. - - Args: - model: ModelAdapter to use for generation. - name: Agent name for tracing. - """ - super().__init__() - self.model = model - self.name = name - self._messages: List[Dict[str, Any]] = [] - - def run(self, prompt: str) -> str: - """Run the model on a prompt. - - Args: - prompt: The prompt to send to the model. - - Returns: - Model's response string. - """ - # Record input message - self._messages.append({"role": "user", "content": prompt}) - - # Generate response - response = self.model.generate(prompt) - - # Record output message - self._messages.append({"role": "assistant", "content": response}) - - return response - - def gather_traces(self) -> Dict[str, Any]: - """Gather traces for this agent.""" - return { - **super().gather_traces(), - "name": self.name, - "messages": list(self._messages), - } - - def gather_config(self) -> Dict[str, Any]: - """Gather configuration.""" - return { - **super().gather_config(), - "name": self.name, - "model_id": self.model.model_id, - } - - -class MMLUAgentAdapter(AgentAdapter): - """AgentAdapter wrapper for MMLUModelAgent.""" - - def __init__(self, agent: MMLUModelAgent, name: str): - """Initialize adapter. - - Args: - agent: MMLUModelAgent instance. - name: Adapter name. 
- """ - super().__init__(agent, name) - - def _run_agent(self, query: str) -> Any: - """Execute the agent.""" - return self.agent.run(query) - - def get_messages(self) -> MessageHistory: - """Get agent messages.""" - return MessageHistory(self.agent._messages) - - def gather_traces(self) -> Dict[str, Any]: - """Gather execution traces from this agent.""" - from maseval.core.tracing import TraceableMixin - - messages = self.get_messages() - return { - **TraceableMixin.gather_traces(self), - "name": self.name, - "agent_type": type(self.agent).__name__, - "message_count": len(messages), - "messages": messages.to_list(), - "callbacks": [type(cb).__name__ for cb in self.callbacks], - "logs": self.logs, - } - - # ============================================================================= # Benchmark # ============================================================================= @@ -435,12 +336,14 @@ class DefaultMMLUBenchmark(MMLUBenchmark): """MMLU Benchmark using HuggingFace transformers models. This concrete implementation uses log-likelihood based MCQ evaluation - with the same optimizations as lm-evaluation-harness: + via ``HuggingFaceModelScorer``, with the same optimisations as + lm-evaluation-harness: - 1. Single forward pass per question (one-token continuation optimization) - 2. Batching multiple questions together - 3. Efficient log-softmax computation - 4. Proper left-padding for batch processing + 1. Single forward pass per question (one-token continuation optimisation) + 2. Efficient log-softmax computation + 3. Proper left-padding for batch processing + + Agents are created using the generic ``ModelAgentAdapter``. """ def __init__( @@ -459,16 +362,22 @@ def __init__( device: Device to run model on. trust_remote_code: Trust remote code when loading model (default True). use_full_prompt: Use full prompt with few-shot examples (default True). - batch_size: Batch size for evaluation (number of questions per batch). 
- **kwargs: Additional arguments passed to MMLUBenchmark. + batch_size: Batch size for lm-eval batching (number of questions per batch). + **kwargs: Additional arguments passed to ``MMLUBenchmark``. """ super().__init__(use_full_prompt=use_full_prompt, **kwargs) self._model_id = model_id self._device = device self._trust_remote_code = trust_remote_code self._batch_size = batch_size - self._model = None - self._tokenizer = None + + from maseval.interface.inference.huggingface_scorer import HuggingFaceModelScorer + + self._scorer = HuggingFaceModelScorer( + model_id=model_id, + device=device, + trust_remote_code=trust_remote_code, + ) def setup_agents( self, @@ -492,189 +401,9 @@ def setup_agents( """ model_id = agent_data["model_id"] model = self.get_model_adapter(model_id, register_name=DEFAULT_MODEL_REGISTER_NAME) - - agent = MMLUModelAgent(model, name=DEFAULT_AGENT_NAME) - adapter = MMLUAgentAdapter(agent, DEFAULT_AGENT_NAME) - + adapter = ModelAgentAdapter(model, DEFAULT_AGENT_NAME) return [adapter], {DEFAULT_AGENT_NAME: adapter} - def _load_model(self): - """Lazy load the model and tokenizer for log-likelihood computation.""" - if self._model is None: - from transformers import AutoModelForCausalLM, AutoTokenizer - - print(f"Loading model: {self._model_id}") - self._tokenizer = AutoTokenizer.from_pretrained( - self._model_id, - trust_remote_code=self._trust_remote_code, - ) - self._tokenizer.padding_side = "left" - if self._tokenizer.pad_token is None: - self._tokenizer.pad_token = self._tokenizer.eos_token - - # Load model with torch_dtype="auto" to match lm-evaluation-harness exactly - # This uses the model's native dtype (bfloat16 for most modern models) - # Then move to device manually - self._model = AutoModelForCausalLM.from_pretrained( - self._model_id, - trust_remote_code=self._trust_remote_code, - torch_dtype="auto", - ) - self._model = self._model.to(self._device) - self._model.eval() - - # Note: We don't pre-cache choice token IDs here because they 
depend on context. - # Token IDs are computed dynamically in _get_choice_token_id_in_context() - # to match lm-evaluation-harness behavior exactly. - - return self._model, self._tokenizer - - def _get_choice_token_id_separate(self, choice: str) -> Optional[int]: - """Get the token ID for a choice when tokenized SEPARATELY. - - CRITICAL: lm-evaluation-harness encodes context and continuation separately, - then concatenates. This means "A" is always tokenized standalone (token 330), - NOT in context after "Answer:" (which would be token 28741). - - We must match this behavior to get identical log-likelihood values. - - Args: - choice: The choice string (e.g., "A"). - - Returns: - Token ID for the choice (standalone tokenization), or None if multi-token. - """ - _, tokenizer = self._load_model() - - # Tokenize choice ALONE (not in context) - this is how lm-eval does it - choice_tokens = tokenizer.encode(choice, add_special_tokens=False) - - if len(choice_tokens) == 1: - return choice_tokens[0] - else: - # Multi-token choice - return None to trigger multi-token fallback - return None - - def _encode_pair(self, context: str, continuation: str) -> tuple: - """Encode a context-continuation pair like lm-evaluation-harness. - - This matches lm-eval's _encode_pair method exactly: - 1. Encode whole = context + continuation - 2. Encode context alone - 3. continuation_enc = whole[len(context_enc):] - - This handles tokenization boundary effects correctly. - - Args: - context: The context/prompt string. - continuation: The continuation string (e.g., " A" with target_delimiter). - - Returns: - Tuple of (context_enc, continuation_enc) token lists. 
- """ - _, tokenizer = self._load_model() - - # Handle trailing spaces in context (move to continuation) - n_spaces = len(context) - len(context.rstrip()) - if n_spaces > 0: - continuation = context[-n_spaces:] + continuation - context = context[:-n_spaces] - - # Encode whole string together, then split - whole_enc = tokenizer.encode(context + continuation, add_special_tokens=True) - context_enc = tokenizer.encode(context, add_special_tokens=True) - - # Continuation tokens are what's left after context - continuation_enc = whole_enc[len(context_enc) :] - - return context_enc, continuation_enc - - def _compute_logprobs_single_token(self, prompt: str, choices: list) -> list: - """Compute log-likelihoods using single-token optimization. - - For MCQ with single-letter answers (A, B, C, D), we can compute all - choices in one forward pass since they share the same context. - - IMPORTANT: To match lm-evaluation-harness EXACTLY: - 1. Use target_delimiter=" " before choices (e.g., " A" not "A") - 2. Use _encode_pair to handle tokenization boundaries correctly - 3. Input = (context + continuation)[:-1] - 4. Apply log_softmax to get log probabilities - - Args: - prompt: The prompt/question text. - choices: List of answer choice strings (e.g., ["A", "B", "C", "D"]). - - Returns: - List of log-likelihoods, one per choice. 
- """ - import torch - - model, _ = self._load_model() - - # lm-eval uses target_delimiter=" " for multiple choice tasks - target_delimiter = TARGET_DELIMITER - - # Encode first choice to get the shared context - first_continuation = f"{target_delimiter}{choices[0]}" - context_enc, first_cont_enc = self._encode_pair(prompt, first_continuation) - - # Build input: (context + continuation)[:-1] - full_sequence = context_enc + first_cont_enc - input_tokens = full_sequence[:-1] # Remove last token - - input_ids = torch.tensor([input_tokens], dtype=torch.long, device=self._device) - - with torch.no_grad(): - outputs = model(input_ids) - logits = outputs.logits[0] # (seq_len, vocab_size) - - # Select logits at position where continuation is predicted - # For single-token continuation, this is the last position - inplen = len(input_tokens) - contlen = len(first_cont_enc) - selected_logits = logits[inplen - contlen : inplen] - - # Compute log-softmax - log_probs = torch.nn.functional.log_softmax(selected_logits, dim=-1) - - # Get log prob for each choice's continuation token - logprobs = [] - for choice in choices: - continuation = f"{target_delimiter}{choice}" - _, cont_enc = self._encode_pair(prompt, continuation) - - # Sum log probs for multi-token continuations - total = 0.0 - for i, token_id in enumerate(cont_enc): - total += log_probs[i, token_id].item() - logprobs.append(total) - - return logprobs - - def _compute_logprobs_batched(self, prompts: list, choices_list: list) -> list: - """Compute log-likelihoods for a batch of prompts. - - For exact match with lm-evaluation-harness, we process each prompt - individually using _compute_logprobs_single_token which uses the - correct _encode_pair tokenization logic. - - Args: - prompts: List of prompt strings. - choices_list: List of choice lists (one per prompt). - - Returns: - List of log-likelihood lists, one per prompt. 
- """ - # For exact match with lm-eval, process individually - # This ensures correct tokenization via _encode_pair - all_logprobs = [] - for prompt, choices in zip(prompts, choices_list): - logprobs = self._compute_logprobs_single_token(prompt, choices) - all_logprobs.append(logprobs) - - return all_logprobs - def precompute_all_logprobs_lmeval(self, tasks: Sequence[Task]) -> Dict[Any, List[float]]: """Precompute log-likelihoods for ALL tasks using lm-eval's batching. @@ -755,60 +484,6 @@ def precompute_all_logprobs_lmeval(self, tasks: Sequence[Task]) -> Dict[Any, Lis return doc_logprobs - def _compute_logprobs_multi_token(self, prompt: str, choices: list) -> list: - """Compute log-likelihoods for multi-token continuations. - - This is the fallback for when answer choices have multiple tokens. - Uses _encode_pair to match lm-evaluation-harness exactly. - - Args: - prompt: The prompt/question text. - choices: List of answer choice strings. - - Returns: - List of log-likelihoods, one per choice. 
- """ - import torch - - model, _ = self._load_model() - - # lm-eval uses target_delimiter=" " for multiple choice tasks - target_delimiter = TARGET_DELIMITER - - all_logprobs = [] - for choice in choices: - continuation = f"{target_delimiter}{choice}" - - # Use _encode_pair for correct tokenization - context_enc, continuation_enc = self._encode_pair(prompt, continuation) - - # Build input: (context + continuation)[:-1] - full_sequence = context_enc + continuation_enc - input_tokens = full_sequence[:-1] - - input_ids = torch.tensor([input_tokens], dtype=torch.long, device=self._device) - - with torch.no_grad(): - outputs = model(input_ids) - logits = outputs.logits[0] # (seq_len, vocab_size) - - # Select continuation logits - inplen = len(input_tokens) - contlen = len(continuation_enc) - selected = logits[inplen - contlen : inplen] - - # Compute log-softmax - log_probs = torch.nn.functional.log_softmax(selected, dim=-1) - - # Sum log probs for all continuation tokens - total = 0.0 - for i, token_id in enumerate(continuation_enc): - total += log_probs[i, token_id].item() - - all_logprobs.append(total) - - return all_logprobs - def run_agents( self, agents: Sequence[AgentAdapter], @@ -819,111 +494,62 @@ def run_agents( """Execute log-likelihood based MCQ evaluation. Uses precomputed logprobs if available (for exact lm-eval match), - otherwise falls back to single-forward-pass optimization for - single-token answers, or multi-token batched computation. + otherwise delegates to ``HuggingFaceModelScorer.loglikelihood_choices()`` + which automatically picks single-token or multi-token scoring. 
""" - # Get the prompt from environment prompt = environment.get_prompt() choices = environment.state.get("choices", DEFAULT_CHOICES) doc_id = task.metadata.get("doc_id") if task else None - # Check if we have precomputed logprobs (for exact lm-eval match) if hasattr(self, "_precomputed_logprobs") and doc_id is not None: logprobs = self._precomputed_logprobs.get(doc_id) if logprobs is not None: - # Use precomputed values for exact match best_idx = logprobs.index(max(logprobs)) answer = choices[best_idx] - - # Store logprobs in environment for later retrieval environment.state["logprobs"] = logprobs environment.state["predicted_idx"] = best_idx - - # Record in agent messages for tracing agent = agents[0] - agent.agent._messages.append({"role": "user", "content": prompt}) - agent.agent._messages.append( - { - "role": "assistant", - "content": answer, - "logprobs": logprobs, - } - ) - + agent._messages.append({"role": "user", "content": prompt}) + agent._messages.append({"role": "assistant", "content": answer, "logprobs": logprobs}) return answer - # Fall back to computing logprobs on-the-fly - # Load model - self._load_model() - - # lm-eval uses target_delimiter=" " for multiple choice tasks - target_delimiter = TARGET_DELIMITER - - # Check if all choices result in single-token continuations - # using _encode_pair to get the correct tokenization - all_single_token = True - for choice in choices: - continuation = f"{target_delimiter}{choice}" - _, cont_enc = self._encode_pair(prompt, continuation) - if len(cont_enc) != 1: - all_single_token = False - break + logprobs = self._scorer.loglikelihood_choices(prompt, choices, delimiter=TARGET_DELIMITER) - if all_single_token: - # Use optimized single-token path (one forward pass) - logprobs = self._compute_logprobs_single_token(prompt, choices) - else: - # Fall back to multi-token computation - logprobs = self._compute_logprobs_multi_token(prompt, choices) - - # Select the choice with highest log-probability best_idx = 
logprobs.index(max(logprobs)) answer = choices[best_idx] - - # Store logprobs in environment for later retrieval if needed environment.state["logprobs"] = logprobs environment.state["predicted_idx"] = best_idx - # Record in agent messages for tracing agent = agents[0] - agent.agent._messages.append({"role": "user", "content": prompt}) - agent.agent._messages.append( - { - "role": "assistant", - "content": answer, - "logprobs": logprobs, - } - ) - + agent._messages.append({"role": "user", "content": prompt}) + agent._messages.append({"role": "assistant", "content": answer, "logprobs": logprobs}) return answer def get_model_adapter(self, model_id: str, **kwargs: Any) -> ModelAdapter: - """Provide a HuggingFace ModelAdapter. + """Provide a HuggingFace ``ModelAdapter``. - Note: For logprobs-based evaluation, we don't actually use the adapter - for generation. This is kept for API compatibility. + The returned adapter is a placeholder — actual evaluation uses + ``HuggingFaceModelScorer`` for log-likelihood scoring. The adapter + is required by the ``Benchmark`` contract for ``setup_agents()``. Args: model_id: Model identifier (ignored, uses instance model_id). - **kwargs: Additional arguments (e.g., register_name). + **kwargs: Additional arguments (e.g., ``register_name``). Returns: - ``HuggingFaceModelAdapter`` instance. + ``HuggingFacePipelineModelAdapter`` instance. 
""" - from maseval.interface.inference import HuggingFaceModelAdapter + from maseval.interface.inference import HuggingFacePipelineModelAdapter - # Create a minimal adapter for compatibility - # The actual evaluation uses _compute_logprobs_* - class DummyCallable: - def __call__(self, prompt, **kwargs): + class _DummyCallable: + def __call__(self, prompt: str, **kw: Any) -> str: return "" - adapter = HuggingFaceModelAdapter( - model=DummyCallable(), + adapter = HuggingFacePipelineModelAdapter( + model=_DummyCallable(), model_id=self._model_id, ) - # Register for tracing if requested register_name = kwargs.get("register_name") if register_name: self.register("models", register_name, adapter) diff --git a/maseval/core/agent.py b/maseval/core/agent.py index 97011527..e76a3ea6 100644 --- a/maseval/core/agent.py +++ b/maseval/core/agent.py @@ -1,11 +1,16 @@ +from __future__ import annotations + from abc import ABC, abstractmethod -from typing import List, Any, Optional, Dict +from typing import TYPE_CHECKING, List, Any, Optional, Dict from .callback import AgentCallback from .history import MessageHistory from .tracing import TraceableMixin from .config import ConfigurableMixin +if TYPE_CHECKING: + from .model import ModelAdapter + class AgentAdapter(ABC, TraceableMixin, ConfigurableMixin): """Wraps an agent from any framework to provide a standard interface. @@ -186,3 +191,72 @@ def gather_config(self) -> Dict[str, Any]: def __repr__(self): return f"AgentAdapter(name={self.name}, agent_type={type(self.agent).__name__})" + + +class ModelAgentAdapter(AgentAdapter): + """Wraps a ``ModelAdapter`` as an ``AgentAdapter`` for direct model evaluation. + + Use this when a benchmark needs to plug a model directly into the agent + slot without an agentic framework. The adapter forwards queries to + ``ModelAdapter.generate()`` and records the conversation for tracing. 
+ + Example: + ```python + from maseval import ModelAgentAdapter + from maseval.interface.inference import LiteLLMModelAdapter + + model = LiteLLMModelAdapter(model_id="gpt-4") + agent = ModelAgentAdapter(model, name="evaluator") + result = agent.run("What is the capital of France?") + ``` + """ + + def __init__( + self, + model: ModelAdapter, + name: str, + callbacks: Optional[List[AgentCallback]] = None, + ): + """Initialize a model-backed agent adapter. + + Args: + model: ``ModelAdapter`` instance used for generation. + name: Agent name for tracing and identification. + callbacks: Optional agent callbacks. + """ + super().__init__(model, name, callbacks) + self._messages: List[Dict[str, Any]] = [] + + @property + def model(self) -> ModelAdapter: + """The underlying ``ModelAdapter``.""" + return self.agent + + def _run_agent(self, query: str) -> str: + """Generate a response by forwarding the query to the model. + + Args: + query: The prompt to send to the model. + + Returns: + The model's text response. + """ + self._messages.append({"role": "user", "content": query}) + response = self.agent.generate(query) + self._messages.append({"role": "assistant", "content": response}) + return response + + def get_messages(self) -> MessageHistory: + """Return the recorded conversation history.""" + return MessageHistory(self._messages) + + def gather_config(self) -> Dict[str, Any]: + """Gather configuration including model identifier. + + Returns: + Dictionary containing agent and model configuration. 
+ """ + return { + **super().gather_config(), + "model_id": self.agent.model_id, + } diff --git a/maseval/core/model.py b/maseval/core/model.py index cac1c2ed..d62d204c 100644 --- a/maseval/core/model.py +++ b/maseval/core/model.py @@ -155,7 +155,7 @@ class ModelAdapter(ABC, TraceableMixin, ConfigurableMixin): See maseval.interface.inference for concrete implementations: - AnthropicModelAdapter - GoogleGenAIModelAdapter - - HuggingFaceModelAdapter + - HuggingFacePipelineModelAdapter (alias: HuggingFaceModelAdapter) - LiteLLMModelAdapter - OpenAIModelAdapter diff --git a/maseval/core/scorer.py b/maseval/core/scorer.py new file mode 100644 index 00000000..aed7d672 --- /dev/null +++ b/maseval/core/scorer.py @@ -0,0 +1,276 @@ +"""Core model scorer abstractions for likelihood-based evaluation. + +This module provides the base `ModelScorer` class for computing token-level +scores (log-likelihoods) from language models. While `ModelAdapter` handles +text generation (``chat``, ``generate``), ``ModelScorer`` handles scoring by +computing how likely a model considers a given continuation. + +See `maseval.interface.inference` for concrete implementations. 
+ +Example: + ```python + from maseval.interface.inference import HuggingFaceModelScorer + + scorer = HuggingFaceModelScorer( + model_id="meta-llama/Llama-2-7b-hf", + device="cuda:0", + ) + + # Single pair + ll = scorer.loglikelihood("The capital of France is", " Paris") + + # MCQ evaluation + logprobs = scorer.loglikelihood_choices( + "What is 2+2?\\nA) 3\\nB) 4\\nC) 5\\nD) 6\\nAnswer:", + choices=["A", "B", "C", "D"], + ) + best = ["A", "B", "C", "D"][logprobs.index(max(logprobs))] + ``` +""" + +from __future__ import annotations + +import time +from abc import ABC, abstractmethod +from datetime import datetime +from typing import Any, Dict, List, Optional, Tuple + +from .config import ConfigurableMixin +from .tracing import TraceableMixin + + +class ModelScorer(ABC, TraceableMixin, ConfigurableMixin): + """Abstract base class for model scorers. + + ``ModelScorer`` provides a consistent interface for computing token-level + log-likelihoods from language models. All scorers implement the same + methods, so you can swap providers without changing evaluation code. + + To use a scorer: + + 1. Create an instance with provider-specific configuration + 2. Call ``loglikelihood()`` for single context-continuation pairs + 3. Call ``loglikelihood_batch()`` for efficient batch computation + 4. Call ``loglikelihood_choices()`` for MCQ evaluation + + Implementing a custom scorer: + + Subclass ``ModelScorer`` and implement: + + - ``model_id`` property: Return the model identifier string + - ``_loglikelihood_impl()``: Score a single (context, continuation) pair + + Optionally override: + + - ``_loglikelihood_batch_impl()``: Optimised batch scoring + - ``loglikelihood_choices()``: MCQ-specific optimisations (e.g. shared-context single-pass) + """ + + def __init__(self, seed: Optional[int] = None): + """Initialize the model scorer. + + Args: + seed: Seed for deterministic scoring. Passed to the underlying + model if supported. 
+ """ + super().__init__() + self._seed = seed + self.logs: List[Dict[str, Any]] = [] + + @property + def seed(self) -> Optional[int]: + """Seed for deterministic scoring, or None if unseeded.""" + return self._seed + + @property + @abstractmethod + def model_id(self) -> str: + """The identifier for the underlying model. + + Returns: + A string identifying the model (e.g., ``"meta-llama/Llama-2-7b-hf"``). + """ + + def loglikelihood(self, context: str, continuation: str) -> float: + """Compute the log-likelihood of ``continuation`` given ``context``. + + Args: + context: The conditioning text (prompt). + continuation: The text whose likelihood is scored. + + Returns: + Log-likelihood (negative float; higher = more likely). + """ + start_time = time.time() + try: + result = self._loglikelihood_impl(context, continuation) + duration = time.time() - start_time + self.logs.append( + { + "timestamp": datetime.now().isoformat(), + "type": "loglikelihood", + "duration_seconds": duration, + "status": "success", + } + ) + return result + except Exception as e: + duration = time.time() - start_time + self.logs.append( + { + "timestamp": datetime.now().isoformat(), + "type": "loglikelihood", + "duration_seconds": duration, + "status": "error", + "error": str(e), + "error_type": type(e).__name__, + } + ) + raise + + @abstractmethod + def _loglikelihood_impl(self, context: str, continuation: str) -> float: + """Internal implementation for single-pair scoring. + + Subclasses must implement this. The base class handles timing + and error logging. + + Args: + context: The conditioning text. + continuation: The text to score. + + Returns: + Log-likelihood of the continuation. + """ + + def loglikelihood_batch(self, pairs: List[Tuple[str, str]]) -> List[float]: + """Compute log-likelihoods for a batch of (context, continuation) pairs. + + Override ``_loglikelihood_batch_impl`` for provider-specific batching + optimisations. The default loops over ``_loglikelihood_impl``. 
+ + Args: + pairs: List of (context, continuation) tuples. + + Returns: + List of log-likelihoods, one per pair. + """ + start_time = time.time() + try: + results = self._loglikelihood_batch_impl(pairs) + duration = time.time() - start_time + self.logs.append( + { + "timestamp": datetime.now().isoformat(), + "type": "loglikelihood_batch", + "batch_size": len(pairs), + "duration_seconds": duration, + "status": "success", + } + ) + return results + except Exception as e: + duration = time.time() - start_time + self.logs.append( + { + "timestamp": datetime.now().isoformat(), + "type": "loglikelihood_batch", + "batch_size": len(pairs), + "duration_seconds": duration, + "status": "error", + "error": str(e), + "error_type": type(e).__name__, + } + ) + raise + + def _loglikelihood_batch_impl(self, pairs: List[Tuple[str, str]]) -> List[float]: + """Default batch implementation — loops over ``_loglikelihood_impl``. + + Override in subclasses for provider-specific batching. + + Args: + pairs: List of (context, continuation) tuples. + + Returns: + List of log-likelihoods. + """ + return [self._loglikelihood_impl(ctx, cont) for ctx, cont in pairs] + + def loglikelihood_choices( + self, + context: str, + choices: List[str], + delimiter: str = " ", + ) -> List[float]: + """Compute log-likelihoods for multiple-choice continuations. + + Convenience method for MCQ evaluation. Each choice is prepended with + ``delimiter`` before scoring (e.g. ``" A"``, ``" B"``). + + Subclasses may override this for optimised shared-context scoring + (e.g. single forward pass for single-token choices). + + Args: + context: The question/prompt text. + choices: Answer choice strings (e.g. ``["A", "B", "C", "D"]``). + delimiter: String prepended to each choice (default ``" "``). + + Returns: + List of log-likelihoods, one per choice. 
+ """ + pairs = [(context, f"{delimiter}{c}") for c in choices] + return self.loglikelihood_batch(pairs) + + def gather_traces(self) -> Dict[str, Any]: + """Gather execution traces from this scorer. + + Output fields: + + - ``type`` - Component class name + - ``gathered_at`` - ISO timestamp + - ``model_id`` - Model identifier + - ``total_calls`` - Number of scoring calls + - ``successful_calls`` - Number of successful calls + - ``failed_calls`` - Number of failed calls + - ``total_duration_seconds`` - Total time spent in calls + - ``logs`` - List of individual call records + + Returns: + Dictionary containing scorer execution traces. + """ + total_calls = len(self.logs) + successful_calls = sum(1 for call in self.logs if call["status"] == "success") + failed_calls = total_calls - successful_calls + total_duration = sum(call["duration_seconds"] for call in self.logs) + + return { + **super().gather_traces(), + "model_id": self.model_id, + "total_calls": total_calls, + "successful_calls": successful_calls, + "failed_calls": failed_calls, + "total_duration_seconds": total_duration, + "logs": self.logs, + } + + def gather_config(self) -> Dict[str, Any]: + """Gather configuration from this scorer. + + Output fields: + + - ``type`` - Component class name + - ``gathered_at`` - ISO timestamp + - ``model_id`` - Model identifier + - ``scorer_type`` - The specific scorer class name + - ``seed`` - Seed for deterministic scoring, or None if unseeded + + Returns: + Dictionary containing scorer configuration. + """ + return { + **super().gather_config(), + "model_id": self.model_id, + "scorer_type": type(self).__name__, + "seed": self._seed, + } diff --git a/maseval/interface/inference/__init__.py b/maseval/interface/inference/__init__.py index e6765d1e..549c719b 100644 --- a/maseval/interface/inference/__init__.py +++ b/maseval/interface/inference/__init__.py @@ -1,14 +1,20 @@ -"""Inference model adapters for various providers. 
+"""Inference model adapters and scorers for various providers. -This package contains concrete implementations of ModelAdapter for different -inference providers. Each adapter requires the corresponding optional dependency. +This package contains concrete implementations of ``ModelAdapter`` and +``ModelScorer`` for different inference providers. Each adapter/scorer +requires the corresponding optional dependency. -Available adapters: - - AnthropicModelAdapter: Anthropic Claude models (requires anthropic) - - GoogleGenAIModelAdapter: Google Gemini models (requires google-genai) - - HuggingFaceModelAdapter: HuggingFace transformers (requires transformers) - - LiteLLMModelAdapter: 100+ providers via LiteLLM (requires litellm) - - OpenAIModelAdapter: OpenAI and compatible APIs (requires openai) +Available adapters (text generation): + +- ``AnthropicModelAdapter``: Anthropic Claude models (requires ``anthropic``) +- ``GoogleGenAIModelAdapter``: Google Gemini models (requires ``google-genai``) +- ``HuggingFacePipelineModelAdapter``: HuggingFace pipelines (requires ``transformers``) +- ``LiteLLMModelAdapter``: 100+ providers via LiteLLM (requires ``litellm``) +- ``OpenAIModelAdapter``: OpenAI and compatible APIs (requires ``openai``) + +Available scorers (log-likelihood): + +- ``HuggingFaceModelScorer``: HuggingFace causal LMs (requires ``transformers``) Example: ```python @@ -49,13 +55,26 @@ # Conditionally import HuggingFace adapter try: - from .huggingface import HuggingFaceModelAdapter, ToolCallingNotSupportedError # noqa: F401 + from .huggingface import ( # noqa: F401 + HuggingFacePipelineModelAdapter, + HuggingFaceModelAdapter, + ToolCallingNotSupportedError, + ) + __all__.append("HuggingFacePipelineModelAdapter") __all__.append("HuggingFaceModelAdapter") __all__.append("ToolCallingNotSupportedError") except ImportError: pass +# Conditionally import HuggingFace scorer +try: + from .huggingface_scorer import HuggingFaceModelScorer # noqa: F401 + + 
__all__.append("HuggingFaceModelScorer") +except ImportError: + pass + # Conditionally import LiteLLM adapter try: from .litellm import LiteLLMModelAdapter # noqa: F401 diff --git a/maseval/interface/inference/huggingface.py b/maseval/interface/inference/huggingface.py index 45fac7e8..f765eb49 100644 --- a/maseval/interface/inference/huggingface.py +++ b/maseval/interface/inference/huggingface.py @@ -1,7 +1,10 @@ -"""HuggingFace model adapter. +"""HuggingFace pipeline model adapter. -This adapter works with HuggingFace transformers pipelines and models. -It supports both simple callable models and full pipeline objects. +This adapter works with HuggingFace transformers pipelines and callables +for text generation via ``chat()`` and ``generate()``. + +For log-likelihood scoring (e.g. MCQ evaluation), see +``HuggingFaceModelScorer`` in ``maseval.interface.inference.huggingface_scorer``. Requires transformers to be installed: pip install maseval[transformers] @@ -9,11 +12,11 @@ Example: ```python from transformers import pipeline - from maseval.interface.inference import HuggingFaceModelAdapter + from maseval.interface.inference import HuggingFacePipelineModelAdapter # Using a pipeline pipe = pipeline("text-generation", model="meta-llama/Llama-3.1-8B-Instruct") - model = HuggingFaceModelAdapter(model=pipe, model_id="llama-3.1-8b") + model = HuggingFacePipelineModelAdapter(model=pipe, model_id="llama-3.1-8b") # Simple generation response = model.generate("Hello!") @@ -42,12 +45,18 @@ class ToolCallingNotSupportedError(Exception): pass -class HuggingFaceModelAdapter(ModelAdapter): - """Adapter for HuggingFace transformers models and pipelines. +class HuggingFacePipelineModelAdapter(ModelAdapter): + """Adapter for HuggingFace transformers pipelines and callables. + + Wraps a HuggingFace ``pipeline()`` object (or any text-generation callable) + for use with the ``ModelAdapter`` interface (``chat()``, ``generate()``). 
+ + For log-likelihood scoring, see ``HuggingFaceModelScorer``. Works with: - - transformers.pipeline() objects - - Any callable that accepts a prompt and returns text + + - ``transformers.pipeline()`` objects + - Any callable that accepts a prompt and returns text For chat functionality, the adapter uses the tokenizer's chat template if available. This provides proper formatting for instruction-tuned models. @@ -55,8 +64,8 @@ class HuggingFaceModelAdapter(ModelAdapter): Tool calling support: Tool calling is only supported if the model's chat template explicitly supports it. If you pass tools and the model doesn't support them, - a ToolCallingNotSupportedError is raised. For reliable tool calling, - consider using LiteLLMModelAdapter instead. + a ``ToolCallingNotSupportedError`` is raised. For reliable tool calling, + consider using ``LiteLLMModelAdapter`` instead. """ def __init__( @@ -378,3 +387,7 @@ def gather_config(self) -> Dict[str, Any]: base_config["pipeline_config"] = pipeline_config return base_config + + +# Backwards compatibility alias +HuggingFaceModelAdapter = HuggingFacePipelineModelAdapter diff --git a/maseval/interface/inference/huggingface_scorer.py b/maseval/interface/inference/huggingface_scorer.py new file mode 100644 index 00000000..53d43dad --- /dev/null +++ b/maseval/interface/inference/huggingface_scorer.py @@ -0,0 +1,264 @@ +"""HuggingFace model scorer for log-likelihood evaluation. + +Wraps a raw HuggingFace ``AutoModelForCausalLM`` (not a pipeline) and +exposes ``loglikelihood()`` for scoring context-continuation pairs. Designed +for MCQ-style evaluation where the best answer is chosen by highest +log-likelihood. + +For text generation (``chat()``, ``generate()``), see +``HuggingFacePipelineModelAdapter`` in ``maseval.interface.inference.huggingface``. 
+ +Requires transformers and torch: + pip install maseval[transformers] + +Example: + ```python + from maseval.interface.inference import HuggingFaceModelScorer + + scorer = HuggingFaceModelScorer( + model_id="meta-llama/Llama-2-7b-hf", + device="cuda:0", + ) + + # Score a single continuation + ll = scorer.loglikelihood("The capital of France is", " Paris") + + # MCQ: pick the most likely answer + logprobs = scorer.loglikelihood_choices( + context="What is 2+2? Answer:", + choices=["A", "B", "C", "D"], + ) + best_idx = logprobs.index(max(logprobs)) + ``` +""" + +from __future__ import annotations + +from typing import Any, Dict, List, Optional, Tuple + +from maseval.core.scorer import ModelScorer + + +class HuggingFaceModelScorer(ModelScorer): + """Log-likelihood scorer backed by a HuggingFace causal language model. + + Loads the model lazily on first use. Supports: + + - Single-token optimisation: when all continuations map to a single token, + one forward pass scores every choice. + - Multi-token fallback: separate forward pass per continuation. + - ``loglikelihood_choices()`` override that picks the optimal path + automatically. + + The tokenisation strategy matches ``lm-evaluation-harness``: context and + continuation are encoded separately, then concatenated to handle + tokenisation-boundary effects correctly. + """ + + def __init__( + self, + model_id: str, + device: str = "cuda:0", + trust_remote_code: bool = True, + seed: Optional[int] = None, + ): + """Initialize HuggingFace model scorer. + + Args: + model_id: HuggingFace model identifier + (e.g. ``"meta-llama/Llama-2-7b-hf"``). + device: Torch device string (e.g. ``"cuda:0"``, ``"cpu"``). + trust_remote_code: Trust remote code when loading the model. + seed: Seed for deterministic scoring. 
+ """ + super().__init__(seed=seed) + self._model_id = model_id + self._device = device + self._trust_remote_code = trust_remote_code + self._model: Any = None + self._tokenizer: Any = None + + @property + def model_id(self) -> str: + return self._model_id + + # ------------------------------------------------------------------ + # Model loading + # ------------------------------------------------------------------ + + def _load_model(self) -> Tuple[Any, Any]: + """Lazy-load the model and tokenizer. + + Returns: + Tuple of (model, tokenizer). + """ + if self._model is None: + from transformers import AutoModelForCausalLM, AutoTokenizer + + self._tokenizer = AutoTokenizer.from_pretrained( + self._model_id, + trust_remote_code=self._trust_remote_code, + ) + self._tokenizer.padding_side = "left" + if self._tokenizer.pad_token is None: + self._tokenizer.pad_token = self._tokenizer.eos_token + + self._model = AutoModelForCausalLM.from_pretrained( + self._model_id, + trust_remote_code=self._trust_remote_code, + torch_dtype="auto", + ) + self._model = self._model.to(self._device) + self._model.eval() + + return self._model, self._tokenizer + + # ------------------------------------------------------------------ + # Tokenisation helpers (matches lm-evaluation-harness) + # ------------------------------------------------------------------ + + def _encode_pair(self, context: str, continuation: str) -> Tuple[List[int], List[int]]: + """Encode a context-continuation pair like lm-evaluation-harness. + + 1. Encode ``whole = context + continuation`` + 2. Encode ``context`` alone + 3. ``continuation_enc = whole[len(context_enc):]`` + + Args: + context: The context/prompt string. + continuation: The continuation string. + + Returns: + Tuple of (context_enc, continuation_enc) token lists. 
+ """ + _, tokenizer = self._load_model() + + n_spaces = len(context) - len(context.rstrip()) + if n_spaces > 0: + continuation = context[-n_spaces:] + continuation + context = context[:-n_spaces] + + whole_enc = tokenizer.encode(context + continuation, add_special_tokens=True) + context_enc = tokenizer.encode(context, add_special_tokens=True) + + continuation_enc = whole_enc[len(context_enc) :] + return context_enc, continuation_enc + + # ------------------------------------------------------------------ + # Core scoring + # ------------------------------------------------------------------ + + def _loglikelihood_impl(self, context: str, continuation: str) -> float: + """Score a single (context, continuation) pair. + + Uses ``_encode_pair`` for correct tokenisation, then computes the + sum of per-token log-probabilities over the continuation. + """ + import torch + + model, _ = self._load_model() + + context_enc, continuation_enc = self._encode_pair(context, continuation) + full_sequence = context_enc + continuation_enc + input_tokens = full_sequence[:-1] + + input_ids = torch.tensor([input_tokens], dtype=torch.long, device=self._device) + + with torch.no_grad(): + logits = model(input_ids).logits[0] + inplen = len(input_tokens) + contlen = len(continuation_enc) + selected = logits[inplen - contlen : inplen] + log_probs = torch.nn.functional.log_softmax(selected, dim=-1) + + total = 0.0 + for i, token_id in enumerate(continuation_enc): + total += log_probs[i, token_id].item() + + return total + + # ------------------------------------------------------------------ + # MCQ optimisation + # ------------------------------------------------------------------ + + def loglikelihood_choices( + self, + context: str, + choices: List[str], + delimiter: str = " ", + ) -> List[float]: + """Score multiple-choice continuations with shared-context optimisation. 
+ + When every ``delimiter + choice`` maps to a single continuation token, + all choices are scored in **one** forward pass. Otherwise falls back to + per-choice scoring via ``_loglikelihood_impl``. + + Args: + context: The question/prompt text. + choices: Answer choice strings (e.g. ``["A", "B", "C", "D"]``). + delimiter: String prepended to each choice (default ``" "``). + + Returns: + List of log-likelihoods, one per choice. + """ + model, _ = self._load_model() + + continuations = [f"{delimiter}{c}" for c in choices] + encoded_continuations = [self._encode_pair(context, cont) for cont in continuations] + + all_single_token = all(len(cont_enc) == 1 for _, cont_enc in encoded_continuations) + + if all_single_token: + return self._score_single_token(context, choices, delimiter, encoded_continuations) + + return [self._loglikelihood_impl(context, cont) for cont in continuations] + + def _score_single_token( + self, + context: str, + choices: List[str], + delimiter: str, + encoded_continuations: List[Tuple[List[int], List[int]]], + ) -> List[float]: + """One-forward-pass scoring for single-token continuations.""" + import torch + + model, _ = self._load_model() + + context_enc, first_cont_enc = encoded_continuations[0] + full_sequence = context_enc + first_cont_enc + input_tokens = full_sequence[:-1] + + input_ids = torch.tensor([input_tokens], dtype=torch.long, device=self._device) + + with torch.no_grad(): + logits = model(input_ids).logits[0] + inplen = len(input_tokens) + contlen = len(first_cont_enc) + selected_logits = logits[inplen - contlen : inplen] + log_probs = torch.nn.functional.log_softmax(selected_logits, dim=-1) + + logprobs: List[float] = [] + for _, cont_enc in encoded_continuations: + total = 0.0 + for i, token_id in enumerate(cont_enc): + total += log_probs[i, token_id].item() + logprobs.append(total) + + return logprobs + + # ------------------------------------------------------------------ + # Tracing + # 
------------------------------------------------------------------ + + def gather_config(self) -> Dict[str, Any]: + """Gather configuration including device and model settings. + + Returns: + Dictionary containing scorer configuration. + """ + return { + **super().gather_config(), + "device": self._device, + "trust_remote_code": self._trust_remote_code, + } From 079ef47fae7769bca32dc4bb1aee33134f49288b Mon Sep 17 00:00:00 2001 From: Alexander Rubinstein Date: Thu, 12 Mar 2026 09:00:35 +0100 Subject: [PATCH 06/23] [Move DISCO queue to core]: - Replace all .get() calls on required fields by explicit dict lookup. --- maseval/__init__.py | 2 ++ maseval/benchmark/mmlu/mmlu.py | 55 +++++++++++++++++----------------- maseval/core/exceptions.py | 38 +++++++++++++++++++++++ 3 files changed, 67 insertions(+), 28 deletions(-) diff --git a/maseval/__init__.py b/maseval/__init__.py index 2aa5b927..bde2e121 100644 --- a/maseval/__init__.py +++ b/maseval/__init__.py @@ -49,6 +49,7 @@ UserError, UserExhaustedError, TaskTimeoutError, + get_with_assert, validate_argument_type, validate_required_arguments, validate_no_extra_arguments, @@ -106,6 +107,7 @@ "ChatResponse", "ModelScorer", # Exceptions and validation + "get_with_assert", "MASEvalError", "AgentError", "EnvironmentError", diff --git a/maseval/benchmark/mmlu/mmlu.py b/maseval/benchmark/mmlu/mmlu.py index ef895e65..79ff8ce3 100644 --- a/maseval/benchmark/mmlu/mmlu.py +++ b/maseval/benchmark/mmlu/mmlu.py @@ -84,14 +84,14 @@ def setup_state(self, task_data: Dict[str, Any]) -> Dict[str, Any]: Args: task_data: Must contain ``"query"`` (str) and ``"environment_data"`` - (dict with optional ``"choices"``, ``"full_prompt"``, ``"use_full_prompt"``). + (dict with ``"choices"``, ``"full_prompt"``, ``"use_full_prompt"``). 
""" env_data = task_data["environment_data"] return { "query": task_data["query"], - "choices": env_data.get("choices", DEFAULT_CHOICES), - "full_prompt": env_data.get("full_prompt", ""), - "use_full_prompt": env_data.get("use_full_prompt", False), + "choices": env_data["choices"], + "full_prompt": env_data["full_prompt"], + "use_full_prompt": env_data["use_full_prompt"], } def create_tools(self) -> Dict[str, Any]: @@ -137,7 +137,7 @@ def __init__( self.task = task self.environment = environment self.gold = task.evaluation_data["gold"] - self.choices = task.environment_data.get("choices", DEFAULT_CHOICES) + self.choices = task.environment_data["choices"] def filter_traces(self, traces: Dict[str, Any]) -> Dict[str, Any]: """Extract relevant traces for evaluation. @@ -175,11 +175,11 @@ def __call__(self, traces: Dict[str, Any], final_answer: Optional[str] = None) - "predicted": predicted, "gold": self.gold, "correct": correct, - "doc_id": self.task.metadata.get("doc_id"), + "doc_id": self.task.metadata["doc_id"], } # Extract logprobs from traces if available (for logprobs-based evaluation) - messages = traces.get("messages", []) + messages = traces["messages"] for msg in messages: if isinstance(msg, dict) and "logprobs" in msg: result["logprobs"] = msg["logprobs"] @@ -445,7 +445,7 @@ def precompute_all_logprobs_lmeval(self, tasks: Sequence[Task]) -> Dict[Any, Lis instance_map = {} # (doc_id, choice_idx) -> position in results for task in tasks: - doc_id = task.metadata.get("doc_id") + doc_id = task.metadata["doc_id"] # Get prompt from task - use full_prompt from environment_data if available if self.use_full_prompt and "full_prompt" in task.environment_data: prompt = task.environment_data["full_prompt"] @@ -471,7 +471,7 @@ def precompute_all_logprobs_lmeval(self, tasks: Sequence[Task]) -> Dict[Any, Lis # Map results back to doc_ids doc_logprobs = {} for task in tasks: - doc_id = task.metadata.get("doc_id") + doc_id = task.metadata["doc_id"] logprobs = [] for i in 
range(len(choices)): pos = instance_map[(doc_id, i)] @@ -498,20 +498,19 @@ def run_agents( which automatically picks single-token or multi-token scoring. """ prompt = environment.get_prompt() - choices = environment.state.get("choices", DEFAULT_CHOICES) - doc_id = task.metadata.get("doc_id") if task else None - - if hasattr(self, "_precomputed_logprobs") and doc_id is not None: - logprobs = self._precomputed_logprobs.get(doc_id) - if logprobs is not None: - best_idx = logprobs.index(max(logprobs)) - answer = choices[best_idx] - environment.state["logprobs"] = logprobs - environment.state["predicted_idx"] = best_idx - agent = agents[0] - agent._messages.append({"role": "user", "content": prompt}) - agent._messages.append({"role": "assistant", "content": answer, "logprobs": logprobs}) - return answer + choices = environment.state["choices"] + doc_id = task.metadata["doc_id"] + + if hasattr(self, "_precomputed_logprobs") and doc_id in self._precomputed_logprobs: + logprobs = self._precomputed_logprobs[doc_id] + best_idx = logprobs.index(max(logprobs)) + answer = choices[best_idx] + environment.state["logprobs"] = logprobs + environment.state["predicted_idx"] = best_idx + agent = agents[0] + agent._messages.append({"role": "user", "content": prompt}) + agent._messages.append({"role": "assistant", "content": answer, "logprobs": logprobs}) + return answer logprobs = self._scorer.loglikelihood_choices(prompt, choices, delimiter=TARGET_DELIMITER) @@ -677,14 +676,14 @@ def compute_benchmark_metrics(results: List[Dict[str, Any]]) -> Dict[str, Any]: acc_norm_sum = 0.0 for res in results: - if res.get("status") != STATUS_SUCCESS: + if res["status"] != STATUS_SUCCESS: continue - evals = res.get("eval") or [] + evals = res["eval"] or [] for entry in evals: - acc_sum += entry.get("acc", 0.0) - acc_norm_sum += entry.get("acc_norm", 0.0) - if entry.get("correct", False): + acc_sum += entry["acc"] + acc_norm_sum += entry["acc_norm"] + if entry["correct"]: correct_count += 1 return { 
diff --git a/maseval/core/exceptions.py b/maseval/core/exceptions.py index e4c8c0f1..b3e297c0 100644 --- a/maseval/core/exceptions.py +++ b/maseval/core/exceptions.py @@ -308,6 +308,44 @@ def __init__( # ============================================================================= +def get_with_assert(container: Any, key: Any, error_msg: Optional[str] = None) -> Any: + """Get a value from a container, raising ``KeyError`` if not found. + + Use instead of ``dict.get(key, default)`` when the key is **required**. + A missing key means a bug — not a case to paper over with a fallback. + + Supports nested access via a list of keys:: + + get_with_assert(task, ["metadata", "doc_id"]) + # equivalent to: task["metadata"]["doc_id"] but with a clear error + + Args: + container: Dictionary or other container supporting ``in`` and ``[]``. + key: Key to look up. Pass a list for nested access. + error_msg: Custom error message. If ``None``, a descriptive default + is generated. + + Returns: + The value at the given key. + + Raises: + KeyError: If the key is not found in the container. + """ + if isinstance(key, list): + assert len(key) > 0 + value = get_with_assert(container, key[0], error_msg) + if len(key) == 1: + return value + return get_with_assert(value, key[1:], error_msg) + + if key not in container: + if error_msg is None: + error_msg = f'Required key "{key}" not in container: {container}' + raise KeyError(error_msg) + + return container[key] + + def validate_argument_type( value: Any, expected_type: str, From e23b1df36cc468c7b9108add2afe32f72ed69a70 Mon Sep 17 00:00:00 2001 From: Alexander Rubinstein Date: Thu, 12 Mar 2026 10:04:39 +0100 Subject: [PATCH 07/23] [Move DISCO queue to core] Remove dummy implementations and tighten data access in MMLU benchmark - Replace silent .get() fallbacks with direct dict access for required fields (choices, doc_id, gold, acc, etc.) 
so missing data fails fast - Add get_with_assert utility to maseval.core.exceptions for required key lookups with clear error messages - Remove _DummyCallable from DefaultMMLUBenchmark.get_model_adapter(); raise NotImplementedError instead since scoring uses HuggingFaceModelScorer - Restructure DefaultMMLUBenchmark.setup_agents() to use a scorer-backed adapter directly instead of routing through get_model_adapter() - Remove redundant MMLUBenchmark.setup_user() override (base class already returns None) - Remove ModelAgentAdapter (no consumers) from core, exports, and docs --- CHANGELOG.md | 3 +- docs/benchmark/mmlu.md | 19 +++++++- maseval/__init__.py | 3 +- maseval/benchmark/mmlu/mmlu.py | 86 +++++++++++++++++----------------- maseval/core/agent.py | 74 +---------------------------- 5 files changed, 63 insertions(+), 122 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index c6508428..bea3eedd 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -43,7 +43,6 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - Added `DISCOQueue` to `maseval.core.task` for subset-based evaluation (e.g., anchor-point selection for DISCO). Available via `from maseval import DISCOQueue`. (PR: #34) - Added `ModelScorer` abstract base class in `maseval.core.scorer` for log-likelihood scoring, with `loglikelihood()`, `loglikelihood_batch()`, and `loglikelihood_choices()` methods. (PR: #PR_NUMBER_PLACEHOLDER) -- Added `ModelAgentAdapter` in `maseval.core.agent` — a generic adapter that wraps any `ModelAdapter` as an `AgentAdapter` for direct model evaluation (replaces benchmark-specific agent wrappers). 
(PR: #PR_NUMBER_PLACEHOLDER) - Added `SeedGenerator` abstract base class and `DefaultSeedGenerator` implementation for reproducible benchmark runs via SHA-256-based seed derivation (PR: #24) - Added `seed` and `seed_generator` parameters to `Benchmark.__init__` for enabling reproducibility (PR: #24) - Added `seed_generator` parameter to all benchmark setup methods (`setup_environment`, `setup_user`, `setup_agents`, `setup_evaluators`) (PR: #24) @@ -93,7 +92,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 **Benchmarks** -- `MMLUBenchmark` no longer implements `setup_agents()` — consistent with other benchmarks, agent creation is left to concrete subclasses (e.g., `DefaultMMLUBenchmark`). Removed silent `.get()` fallbacks for required fields (`gold`, `query`, `model_id`) so missing data surfaces errors immediately instead of failing silently. `DISCOQueue` moved from `maseval.benchmark.mmlu` to `maseval.core.task` and now extends `SequentialTaskQueue` instead of `AdaptiveTaskQueue`. Added `mmlu` optional extra (`pip install maseval[mmlu]`). `DefaultMMLUBenchmark` now delegates log-likelihood computation to `HuggingFaceModelScorer` and uses `ModelAgentAdapter` instead of the MMLU-specific `MMLUModelAgent`/`MMLUAgentAdapter` (removed). (PR: #34) +- `MMLUBenchmark` no longer implements `setup_agents()` — consistent with other benchmarks, agent creation is left to concrete subclasses (e.g., `DefaultMMLUBenchmark`). Removed silent `.get()` fallbacks for required fields (`gold`, `query`, `model_id`) so missing data surfaces errors immediately instead of failing silently. `DISCOQueue` moved from `maseval.benchmark.mmlu` to `maseval.core.task` and now extends `SequentialTaskQueue` instead of `AdaptiveTaskQueue`. Added `mmlu` optional extra (`pip install maseval[mmlu]`). 
`DefaultMMLUBenchmark` now delegates log-likelihood computation to `HuggingFaceModelScorer` and uses a scorer-backed adapter instead of the MMLU-specific `MMLUModelAgent`/`MMLUAgentAdapter` (removed). (PR: #34) - `MACSBenchmark` and `Tau2Benchmark` benchmarks now actively use the seeding system by deriving seeds for model adapters. Seeds are passed to agents, user simulators, tool simulators, and LLM-based evaluators for reproducible runs. (PR: #26) - `Gaia2Benchmark`: Seeds `agents/gaia2_agent`, `evaluators/judge` - `MACSBenchmark`: Seeds `environment/tools/tool_{name}`, `simulators/user`, `evaluators/user_gsr`, `evaluators/system_gsr` diff --git a/docs/benchmark/mmlu.md b/docs/benchmark/mmlu.md index 7514ad18..d2e58544 100644 --- a/docs/benchmark/mmlu.md +++ b/docs/benchmark/mmlu.md @@ -97,13 +97,28 @@ print(f"Evaluating {len(tasks)} anchor tasks") `MMLUBenchmark` is a framework-agnostic base class. To use a different model backend, subclass it and implement `setup_agents()` and `get_model_adapter()`: ```python -from maseval import ModelAgentAdapter +from maseval import AgentAdapter +from maseval.core.history import MessageHistory from maseval.benchmark.mmlu import MMLUBenchmark +class MyAgentAdapter(AgentAdapter): + def __init__(self, model, name): + super().__init__(model, name) + self._messages = [] + + def _run_agent(self, query): + self._messages.append({"role": "user", "content": query}) + response = self.agent.generate(query) + self._messages.append({"role": "assistant", "content": response}) + return response + + def get_messages(self): + return MessageHistory(self._messages) + class MyMMLUBenchmark(MMLUBenchmark): def setup_agents(self, agent_data, environment, task, user, seed_generator): model = self.get_model_adapter(agent_data["model_id"]) - adapter = ModelAgentAdapter(model, name="mmlu_agent") + adapter = MyAgentAdapter(model, name="mmlu_agent") return [adapter], {"mmlu_agent": adapter} def get_model_adapter(self, model_id, **kwargs): diff --git 
a/maseval/__init__.py b/maseval/__init__.py index bde2e121..c6fa6cec 100644 --- a/maseval/__init__.py +++ b/maseval/__init__.py @@ -22,7 +22,7 @@ AdaptiveTaskQueue, ) from .core.environment import Environment -from .core.agent import AgentAdapter, ModelAgentAdapter +from .core.agent import AgentAdapter from .core.benchmark import Benchmark, TaskExecutionStatus from .core.callback_handler import CallbackHandler from .core.callback import BenchmarkCallback, EnvironmentCallback, AgentCallback @@ -65,7 +65,6 @@ # Core abstractions "Environment", "AgentAdapter", - "ModelAgentAdapter", "Benchmark", "TaskExecutionStatus", # Callbacks diff --git a/maseval/benchmark/mmlu/mmlu.py b/maseval/benchmark/mmlu/mmlu.py index 79ff8ce3..e59f41b8 100644 --- a/maseval/benchmark/mmlu/mmlu.py +++ b/maseval/benchmark/mmlu/mmlu.py @@ -44,11 +44,11 @@ Environment, Evaluator, ModelAdapter, - ModelAgentAdapter, Task, User, SeedGenerator, ) +from maseval.core.history import MessageHistory from maseval.core.task import SequentialTaskQueue @@ -67,6 +67,34 @@ STATUS_SUCCESS = "success" +# ============================================================================= +# Agent adapter for scorer-based evaluation +# ============================================================================= + + +class _ScorerBackedAdapter(AgentAdapter): + """Agent adapter for benchmarks that use scorer-based evaluation. + + This adapter is a message container for tracing — the benchmark's + ``run_agents()`` drives evaluation via a ``ModelScorer`` and records + results here. Calling ``agent.run()`` directly is an error because + there is no generation model behind this adapter. + """ + + def __init__(self, scorer: Any, name: str) -> None: + super().__init__(agent_instance=scorer, name=name) + self._messages: List[Dict[str, Any]] = [] + + def _run_agent(self, query: str) -> Any: + raise NotImplementedError( + f"{type(self).__name__} is backed by a ModelScorer, not a generation model. 
" + "Use benchmark.run_agents() instead of calling agent.run() directly." + ) + + def get_messages(self) -> MessageHistory: + return MessageHistory(self._messages) + + # ============================================================================= # Environment # ============================================================================= @@ -280,16 +308,6 @@ def setup_environment( } return MMLUEnvironment(task_data) - def setup_user( - self, - agent_data: Dict[str, Any], - environment: Environment, - task: Task, - seed_generator: SeedGenerator, - ) -> Optional[User]: - """MMLU doesn't use a user simulator.""" - return None - def setup_evaluators( self, environment: Environment, @@ -343,7 +361,7 @@ class DefaultMMLUBenchmark(MMLUBenchmark): 2. Efficient log-softmax computation 3. Proper left-padding for batch processing - Agents are created using the generic ``ModelAgentAdapter``. + Agents are created using a scorer-backed adapter (see ``_ScorerBackedAdapter``). """ def __init__( @@ -387,10 +405,13 @@ def setup_agents( user: Optional[User], seed_generator: SeedGenerator, ) -> Tuple[Sequence[AgentAdapter], Dict[str, AgentAdapter]]: - """Create model agent for MCQ evaluation. + """Create scorer-backed agent for MCQ evaluation. + + The returned adapter is a tracing container — actual evaluation is + driven by ``self._scorer`` in ``run_agents()``. Args: - agent_data: Agent config. Must contain ``"model_id"`` (str). + agent_data: Agent config (unused; model is set at ``__init__``). environment: MMLU environment. task: Current task. user: Unused. @@ -399,9 +420,7 @@ def setup_agents( Returns: Tuple of (agents_to_run, agents_dict). 
""" - model_id = agent_data["model_id"] - model = self.get_model_adapter(model_id, register_name=DEFAULT_MODEL_REGISTER_NAME) - adapter = ModelAgentAdapter(model, DEFAULT_AGENT_NAME) + adapter = _ScorerBackedAdapter(self._scorer, DEFAULT_AGENT_NAME) return [adapter], {DEFAULT_AGENT_NAME: adapter} def precompute_all_logprobs_lmeval(self, tasks: Sequence[Task]) -> Dict[Any, List[float]]: @@ -525,36 +544,17 @@ def run_agents( return answer def get_model_adapter(self, model_id: str, **kwargs: Any) -> ModelAdapter: - """Provide a HuggingFace ``ModelAdapter``. - - The returned adapter is a placeholder — actual evaluation uses - ``HuggingFaceModelScorer`` for log-likelihood scoring. The adapter - is required by the ``Benchmark`` contract for ``setup_agents()``. + """Not used — ``DefaultMMLUBenchmark`` uses ``HuggingFaceModelScorer``. - Args: - model_id: Model identifier (ignored, uses instance model_id). - **kwargs: Additional arguments (e.g., ``register_name``). - - Returns: - ``HuggingFacePipelineModelAdapter`` instance. + Raises: + NotImplementedError: Always. Use ``HuggingFaceModelScorer`` via + ``self._scorer`` for log-likelihood evaluation. """ - from maseval.interface.inference import HuggingFacePipelineModelAdapter - - class _DummyCallable: - def __call__(self, prompt: str, **kw: Any) -> str: - return "" - - adapter = HuggingFacePipelineModelAdapter( - model=_DummyCallable(), - model_id=self._model_id, + raise NotImplementedError( + "DefaultMMLUBenchmark uses HuggingFaceModelScorer for log-likelihood " + "evaluation, not a generation ModelAdapter. Access the scorer via self._scorer." 
) - register_name = kwargs.get("register_name") - if register_name: - self.register("models", register_name, adapter) - - return adapter - # ============================================================================= # Data Loading diff --git a/maseval/core/agent.py b/maseval/core/agent.py index e76a3ea6..1f0aeb9b 100644 --- a/maseval/core/agent.py +++ b/maseval/core/agent.py @@ -1,16 +1,13 @@ from __future__ import annotations from abc import ABC, abstractmethod -from typing import TYPE_CHECKING, List, Any, Optional, Dict +from typing import List, Any, Optional, Dict from .callback import AgentCallback from .history import MessageHistory from .tracing import TraceableMixin from .config import ConfigurableMixin -if TYPE_CHECKING: - from .model import ModelAdapter - class AgentAdapter(ABC, TraceableMixin, ConfigurableMixin): """Wraps an agent from any framework to provide a standard interface. @@ -191,72 +188,3 @@ def gather_config(self) -> Dict[str, Any]: def __repr__(self): return f"AgentAdapter(name={self.name}, agent_type={type(self.agent).__name__})" - - -class ModelAgentAdapter(AgentAdapter): - """Wraps a ``ModelAdapter`` as an ``AgentAdapter`` for direct model evaluation. - - Use this when a benchmark needs to plug a model directly into the agent - slot without an agentic framework. The adapter forwards queries to - ``ModelAdapter.generate()`` and records the conversation for tracing. - - Example: - ```python - from maseval import ModelAgentAdapter - from maseval.interface.inference import LiteLLMModelAdapter - - model = LiteLLMModelAdapter(model_id="gpt-4") - agent = ModelAgentAdapter(model, name="evaluator") - result = agent.run("What is the capital of France?") - ``` - """ - - def __init__( - self, - model: ModelAdapter, - name: str, - callbacks: Optional[List[AgentCallback]] = None, - ): - """Initialize a model-backed agent adapter. - - Args: - model: ``ModelAdapter`` instance used for generation. - name: Agent name for tracing and identification. 
- callbacks: Optional agent callbacks. - """ - super().__init__(model, name, callbacks) - self._messages: List[Dict[str, Any]] = [] - - @property - def model(self) -> ModelAdapter: - """The underlying ``ModelAdapter``.""" - return self.agent - - def _run_agent(self, query: str) -> str: - """Generate a response by forwarding the query to the model. - - Args: - query: The prompt to send to the model. - - Returns: - The model's text response. - """ - self._messages.append({"role": "user", "content": query}) - response = self.agent.generate(query) - self._messages.append({"role": "assistant", "content": response}) - return response - - def get_messages(self) -> MessageHistory: - """Return the recorded conversation history.""" - return MessageHistory(self._messages) - - def gather_config(self) -> Dict[str, Any]: - """Gather configuration including model identifier. - - Returns: - Dictionary containing agent and model configuration. - """ - return { - **super().gather_config(), - "model_id": self.agent.model_id, - } From dd46f1a35a9fff535167cc7b896a3990cc41fe08 Mon Sep 17 00:00:00 2001 From: Alexander Rubinstein Date: Fri, 13 Mar 2026 07:54:59 +0100 Subject: [PATCH 08/23] [Move DISCO queue to core]: - Update BENCHMARKS.md --- BENCHMARKS.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/BENCHMARKS.md b/BENCHMARKS.md index 0916ef69..0cc5473c 100644 --- a/BENCHMARKS.md +++ b/BENCHMARKS.md @@ -81,10 +81,12 @@ CONVERSE evaluates contextual safety in agent-to-agent conversations. It focuses ## 6. MMLU (Massive Multitask Language Understanding) (Beta) -MMLU evaluates language models on multiple-choice questions spanning 57 academic subjects. The MASEval integration includes anchor-point-based evaluation for DISCO prediction, allowing efficient estimation of full benchmark performance from a subset of tasks. +MMLU evaluates language models on multiple-choice questions spanning 57 academic subjects. 
The MASEval integration includes anchor-point-based evaluation for DISCO prediction, allowing efficient estimation of full benchmark performance from a subset of tasks. > **Beta:** This benchmark has been implemented carefully, but we have not yet validated the results against the original implementation. Use with caution when comparing with existing results or the original paper's numbers. Contributions and compute donations welcome! +> **Implemented:** A ready-to-use implementation is available via `DefaultMMLUBenchmark` with HuggingFace model support. Install with `pip install maseval[mmlu]`. See the [MMLU documentation](docs/benchmark/mmlu.md) for usage details. + ### Source and License - **Original Paper:** [Measuring Massive Multitask Language Understanding](https://arxiv.org/abs/2009.03300) (Hendrycks et al., 2021) From 3779e2e84f304d5895432ba1e6ef66659b6f3b7c Mon Sep 17 00:00:00 2001 From: Alexander Rubinstein Date: Fri, 13 Mar 2026 10:18:48 +0100 Subject: [PATCH 09/23] [Move DISCO queue to core]: - Update links in mmlu.md --- docs/benchmark/mmlu.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/benchmark/mmlu.md b/docs/benchmark/mmlu.md index d2e58544..348f7aaa 100644 --- a/docs/benchmark/mmlu.md +++ b/docs/benchmark/mmlu.md @@ -3,13 +3,13 @@ !!! warning "Beta" This benchmark has been implemented carefully, but we have not yet validated the results against the original implementation. Use with caution when comparing with existing results or the original paper's numbers. Contributions and compute donations welcome! -The **MMLU Benchmark** evaluates language models on multiple-choice questions spanning 57 academic subjects. The MASEval integration supports anchor-point-based evaluation for [DISCO](https://arxiv.org/abs/2407.12890) prediction, enabling efficient estimation of full benchmark performance from a subset of tasks. 
+The **MMLU Benchmark** evaluates language models on multiple-choice questions spanning 57 academic subjects. The MASEval integration supports anchor-point-based evaluation for [DISCO](https://arxiv.org/abs/2510.07959) prediction, enabling efficient estimation of full benchmark performance from a subset of tasks. ## Overview [MMLU](https://arxiv.org/abs/2009.03300) (Hendrycks et al., 2021) is a widely used benchmark for measuring knowledge and reasoning across diverse domains. The MASEval implementation features: -- **Log-likelihood MCQ evaluation** matching lm-evaluation-harness methodology +- **Log-likelihood MCQ evaluation** matching [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) methodology - **Anchor-point task selection** via `DISCOQueue` for DISCO-style subset evaluation - **HuggingFace integration** with batched log-probability computation - **lm-eval compatibility** mode for exact numerical reproduction From bf4abbb6ab51b00f6ca88119d673d0d2a34776b8 Mon Sep 17 00:00:00 2001 From: Alexander Rubinstein Date: Sat, 14 Mar 2026 08:36:39 +0100 Subject: [PATCH 10/23] [Move DISCO queue to core]: - Update mmlu and disco dependencies - Add installation guide to mmlu example --- examples/mmlu_benchmark/README.md | 14 ++++++++++++++ pyproject.toml | 13 +++++++------ 2 files changed, 21 insertions(+), 6 deletions(-) diff --git a/examples/mmlu_benchmark/README.md b/examples/mmlu_benchmark/README.md index 62c6bafc..0e90291b 100644 --- a/examples/mmlu_benchmark/README.md +++ b/examples/mmlu_benchmark/README.md @@ -2,6 +2,20 @@ Evaluate language models on [MMLU (Massive Multitask Language Understanding)](https://arxiv.org/abs/2009.03300) with optional efficient evaluation via [DISCO](https://arxiv.org/abs/2510.07959). 
+## Installation + +For basic MMLU evaluation: + +```bash +uv pip install .[mmlu] +``` + +For DISCO prediction (includes DISCO dependencies): + +```bash +uv pip install .[disco] +``` + ## Run without DISCO (full evaluation) From the project root: diff --git a/pyproject.toml b/pyproject.toml index c252adeb..59e7eb02 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -82,22 +82,22 @@ multiagentbench = [ ] tau2 = ["docstring-parser>=0.16", "addict>=2.4.0"] converse = [] -# HuggingFace model + tokenizer, default dataset download; numpy for example script and anchor-point loading; -# lm-eval for --use_lmeval_batching (exact lm-evaluation-harness reproduction); aiohttp required by lm_eval.models.api_models +# HuggingFace model + tokenizer, default dataset download; numpy for example script and anchor-point loading. +# For exact lm-evaluation-harness reproduction (--use_lmeval_batching), also install maseval[lm-eval]. mmlu = [ + "torch>=2.0.0", "transformers>=4.37.0", "numpy>=1.20.0", - "aiohttp>=3.9.0", - "lm-eval @ git+https://github.com/arubique/lm-evaluation-harness.git@main", ] -# LM Evaluation Harness (same as in mmlu; aiohttp required by lm_eval.models.api_models) +# LM Evaluation Harness — requires transformers 4.x (lm-eval uses APIs removed in 5.x) lm-eval = [ "aiohttp>=3.9.0", + "transformers>=4.37.0,<5.0.0", "lm-eval @ git+https://github.com/arubique/lm-evaluation-harness.git@main", ] -# DISCO prediction (for MMLU benchmark example) +# DISCO prediction (for MMLU benchmark example) — requires transformers 4.x via lm-eval disco = [ "aiohttp>=3.9.0", "click>=8.1.0", @@ -108,6 +108,7 @@ disco = [ "jsonlines>=4.0.0", "lm-eval @ git+https://github.com/arubique/lm-evaluation-harness.git@main", "matplotlib>=3.5.0", + "transformers>=4.37.0,<5.0.0", "scikit-learn>=1.7.2", "scipy>=1.11.0", "stnd @ git+https://github.com/arubique/stnd.git@0d23b52f7742c08b28be560d2d52d450fcd274b7", From f6a5885c8762a9d49540f7dcb64a0219fefb09af Mon Sep 17 00:00:00 2001 From: Alexander 
Rubinstein Date: Sat, 14 Mar 2026 08:50:12 +0100 Subject: [PATCH 11/23] [Move DISCO queue to core]: - Update DefaultMMLUBenchmark.run_agents to pass type checks. --- maseval/benchmark/mmlu/mmlu.py | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/maseval/benchmark/mmlu/mmlu.py b/maseval/benchmark/mmlu/mmlu.py index e59f41b8..1e778169 100644 --- a/maseval/benchmark/mmlu/mmlu.py +++ b/maseval/benchmark/mmlu/mmlu.py @@ -516,17 +516,18 @@ def run_agents( otherwise delegates to ``HuggingFaceModelScorer.loglikelihood_choices()`` which automatically picks single-token or multi-token scoring. """ - prompt = environment.get_prompt() - choices = environment.state["choices"] + mmlu_env = cast(MMLUEnvironment, environment) + prompt = mmlu_env.get_prompt() + choices = mmlu_env.state["choices"] doc_id = task.metadata["doc_id"] + agent = cast(_ScorerBackedAdapter, agents[0]) if hasattr(self, "_precomputed_logprobs") and doc_id in self._precomputed_logprobs: logprobs = self._precomputed_logprobs[doc_id] best_idx = logprobs.index(max(logprobs)) answer = choices[best_idx] - environment.state["logprobs"] = logprobs - environment.state["predicted_idx"] = best_idx - agent = agents[0] + mmlu_env.state["logprobs"] = logprobs + mmlu_env.state["predicted_idx"] = best_idx agent._messages.append({"role": "user", "content": prompt}) agent._messages.append({"role": "assistant", "content": answer, "logprobs": logprobs}) return answer @@ -535,10 +536,9 @@ def run_agents( best_idx = logprobs.index(max(logprobs)) answer = choices[best_idx] - environment.state["logprobs"] = logprobs - environment.state["predicted_idx"] = best_idx + mmlu_env.state["logprobs"] = logprobs + mmlu_env.state["predicted_idx"] = best_idx - agent = agents[0] agent._messages.append({"role": "user", "content": prompt}) agent._messages.append({"role": "assistant", "content": answer, "logprobs": logprobs}) return answer From 26931972ab9801d6789570a3f5cf5a3eb849a61e Mon Sep 17 00:00:00 2001 From: 
Alexander Rubinstein Date: Mon, 16 Mar 2026 08:50:56 +0100 Subject: [PATCH 12/23] [Move DISCO queue to core]: - Add tests for get_with_assert, ModelScorer, InformativeSubsetQueue, and DISCOQueue --- tests/test_core/test_exceptions.py | 47 +++++++ tests/test_core/test_queue.py | 106 ++++++++++++++++ tests/test_core/test_scorer.py | 191 +++++++++++++++++++++++++++++ 3 files changed, 344 insertions(+) create mode 100644 tests/test_core/test_scorer.py diff --git a/tests/test_core/test_exceptions.py b/tests/test_core/test_exceptions.py index 416ebb7e..1698fa61 100644 --- a/tests/test_core/test_exceptions.py +++ b/tests/test_core/test_exceptions.py @@ -14,6 +14,7 @@ AgentError, EnvironmentError, UserError, + get_with_assert, validate_argument_type, validate_required_arguments, validate_no_extra_arguments, @@ -370,6 +371,52 @@ def test_validate_arguments_from_schema_strict_mode(self): validate_arguments_from_schema({"name": "test", "extra": 1}, schema, strict=True) +@pytest.mark.core +class TestGetWithAssert: + """Tests for get_with_assert required-key lookup.""" + + def test_single_key_present(self): + """Returns value when key exists.""" + assert get_with_assert({"a": 1}, "a") == 1 + + def test_single_key_missing_raises_key_error(self): + """Raises KeyError with descriptive message when key is missing.""" + with pytest.raises(KeyError, match='Required key "x"'): + get_with_assert({"a": 1}, "x") + + def test_nested_key_access(self): + """Supports nested access via a list of keys.""" + data = {"level1": {"level2": {"level3": "value"}}} + assert get_with_assert(data, ["level1", "level2", "level3"]) == "value" + + def test_nested_key_missing_raises_key_error(self): + """Raises KeyError when a nested key is missing.""" + data = {"level1": {"level2": {}}} + with pytest.raises(KeyError): + get_with_assert(data, ["level1", "level2", "level3"]) + + def test_custom_error_message(self): + """Uses custom error message when provided.""" + with pytest.raises(KeyError, match="MMLU 
task missing query"): + get_with_assert({}, "query", error_msg="MMLU task missing query") + + def test_single_element_list_key(self): + """List with one key behaves like a single key.""" + assert get_with_assert({"a": 42}, ["a"]) == 42 + + def test_falsy_values_returned(self): + """Falsy values (0, empty string, False, None) are returned, not treated as missing.""" + assert get_with_assert({"k": 0}, "k") == 0 + assert get_with_assert({"k": ""}, "k") == "" + assert get_with_assert({"k": False}, "k") is False + assert get_with_assert({"k": None}, "k") is None + + def test_empty_key_list_raises(self): + """Empty key list triggers assertion error.""" + with pytest.raises(AssertionError): + get_with_assert({"a": 1}, []) + + class TestFilteringByErrorType: """Tests for filtering failed tasks by error type.""" diff --git a/tests/test_core/test_queue.py b/tests/test_core/test_queue.py index 9ffdd7d9..ace96588 100644 --- a/tests/test_core/test_queue.py +++ b/tests/test_core/test_queue.py @@ -15,6 +15,8 @@ AdaptiveTaskQueue, TaskQueue, BaseTaskQueue, + InformativeSubsetQueue, + DISCOQueue, ) @@ -212,6 +214,110 @@ def test_single_task(self): assert items[0].query == "Only one" +# ==================== InformativeSubsetQueue Tests ==================== + + +@pytest.mark.core +class TestInformativeSubsetQueue: + """Tests for InformativeSubsetQueue subset filtering.""" + + def test_filters_to_indices(self, simple_tasks): + """Only tasks at the given indices should be yielded.""" + queue = InformativeSubsetQueue(simple_tasks, indices=[0, 2]) + + queries = [task.query for task in queue] + + assert queries == ["Q1", "Q3"] + + def test_preserves_index_order(self): + """Tasks should be yielded in the order given by indices, not original order.""" + tasks = [Task(query=f"Q{i}") for i in range(5)] + queue = InformativeSubsetQueue(tasks, indices=[4, 1, 3]) + + queries = [task.query for task in queue] + + assert queries == ["Q4", "Q1", "Q3"] + + def test_none_indices_yields_all(self, 
simple_tasks): + """indices=None should yield all tasks in original order.""" + queue = InformativeSubsetQueue(simple_tasks, indices=None) + + queries = [task.query for task in queue] + + assert queries == ["Q1", "Q2", "Q3"] + + def test_stores_all_tasks(self, simple_tasks): + """_all_tasks should contain the full unfiltered list.""" + queue = InformativeSubsetQueue(simple_tasks, indices=[0]) + + assert len(queue._all_tasks) == 3 + assert len(queue) == 1 + + def test_out_of_range_indices_skipped(self): + """Indices not present in the task list should be silently skipped.""" + tasks = [Task(query="Q0"), Task(query="Q1")] + queue = InformativeSubsetQueue(tasks, indices=[0, 5, 99]) + + queries = [task.query for task in queue] + + assert queries == ["Q0"] + + def test_empty_indices(self, simple_tasks): + """Empty indices list should yield no tasks.""" + queue = InformativeSubsetQueue(simple_tasks, indices=[]) + + assert list(queue) == [] + assert len(queue) == 0 + + def test_is_subclass_of_sequential(self, simple_tasks): + """InformativeSubsetQueue should be a SequentialTaskQueue.""" + queue = InformativeSubsetQueue(simple_tasks) + assert isinstance(queue, SequentialTaskQueue) + + +# ==================== DISCOQueue Tests ==================== + + +@pytest.mark.core +class TestDISCOQueue: + """Tests for DISCOQueue diversity-based subset.""" + + def test_filters_to_anchor_points(self): + """Only tasks at anchor-point indices should be yielded.""" + tasks = [Task(query=f"Q{i}") for i in range(10)] + queue = DISCOQueue(tasks, anchor_points=[2, 5, 8]) + + queries = [task.query for task in queue] + + assert queries == ["Q2", "Q5", "Q8"] + + def test_none_anchor_points_yields_all(self, simple_tasks): + """anchor_points=None should yield all tasks.""" + queue = DISCOQueue(simple_tasks, anchor_points=None) + + assert len(list(queue)) == 3 + + def test_stores_anchor_points(self): + """_anchor_points should be accessible.""" + tasks = [Task(query=f"Q{i}") for i in range(5)] + 
anchor_pts = [0, 3, 4] + queue = DISCOQueue(tasks, anchor_points=anchor_pts) + + assert queue._anchor_points == [0, 3, 4] + + def test_is_subclass_of_informative_subset(self, simple_tasks): + """DISCOQueue should be an InformativeSubsetQueue.""" + queue = DISCOQueue(simple_tasks) + assert isinstance(queue, InformativeSubsetQueue) + + def test_len_matches_anchor_count(self): + """Queue length should match number of valid anchor points.""" + tasks = [Task(query=f"Q{i}") for i in range(10)] + queue = DISCOQueue(tasks, anchor_points=[1, 3, 7]) + + assert len(queue) == 3 + + # ==================== PriorityTaskQueue Tests ==================== diff --git a/tests/test_core/test_scorer.py b/tests/test_core/test_scorer.py new file mode 100644 index 00000000..1c1570d0 --- /dev/null +++ b/tests/test_core/test_scorer.py @@ -0,0 +1,191 @@ +"""Tests for ModelScorer abstract base class. + +These tests verify that the ModelScorer ABC correctly delegates to +subclass implementations, handles logging/tracing, and provides +the expected batch and MCQ convenience methods. 
+""" + +import pytest +from typing import Dict, List, Optional, Tuple + +from maseval.core.scorer import ModelScorer + + +class StubScorer(ModelScorer): + """Minimal concrete scorer for testing the ABC contract.""" + + def __init__(self, scores: Dict[Tuple[str, str], float], seed: Optional[int] = None): + super().__init__(seed=seed) + self._scores = scores + self._call_log: List[Tuple[str, str]] = [] + + @property + def model_id(self) -> str: + return "stub-model" + + def _loglikelihood_impl(self, context: str, continuation: str) -> float: + self._call_log.append((context, continuation)) + return self._scores[(context, continuation)] + + +class FailingScorer(ModelScorer): + """Scorer that raises on every call, for error-path testing.""" + + @property + def model_id(self) -> str: + return "failing-model" + + def _loglikelihood_impl(self, context: str, continuation: str) -> float: + raise ValueError("model exploded") + + +pytestmark = pytest.mark.core + + +class TestModelScorerLoglikelihood: + """Tests for single-pair loglikelihood.""" + + def test_delegates_to_impl(self): + """loglikelihood() should delegate to _loglikelihood_impl().""" + scorer = StubScorer({("ctx", " cont"): -1.5}) + result = scorer.loglikelihood("ctx", " cont") + + assert result == -1.5 + assert scorer._call_log == [("ctx", " cont")] + + def test_logs_success(self): + """Successful call should be logged.""" + scorer = StubScorer({("a", "b"): -2.0}) + scorer.loglikelihood("a", "b") + + assert len(scorer.logs) == 1 + assert scorer.logs[0]["status"] == "success" + assert scorer.logs[0]["type"] == "loglikelihood" + assert scorer.logs[0]["duration_seconds"] >= 0 + + def test_logs_error_and_reraises(self): + """Failed call should be logged and the exception re-raised.""" + scorer = FailingScorer() + + with pytest.raises(ValueError, match="model exploded"): + scorer.loglikelihood("a", "b") + + assert len(scorer.logs) == 1 + assert scorer.logs[0]["status"] == "error" + assert scorer.logs[0]["error_type"] 
== "ValueError" + + +class TestModelScorerBatch: + """Tests for batch loglikelihood.""" + + def test_default_batch_loops_over_impl(self): + """Default _loglikelihood_batch_impl loops over _loglikelihood_impl.""" + scores = {("q", " A"): -1.0, ("q", " B"): -2.0, ("q", " C"): -0.5} + scorer = StubScorer(scores) + + results = scorer.loglikelihood_batch([("q", " A"), ("q", " B"), ("q", " C")]) + + assert results == [-1.0, -2.0, -0.5] + assert len(scorer._call_log) == 3 + + def test_batch_logs_single_entry(self): + """Batch call should produce one log entry (not per-pair).""" + scores = {("q", " A"): -1.0, ("q", " B"): -2.0} + scorer = StubScorer(scores) + + scorer.loglikelihood_batch([("q", " A"), ("q", " B")]) + + assert len(scorer.logs) == 1 + assert scorer.logs[0]["type"] == "loglikelihood_batch" + assert scorer.logs[0]["batch_size"] == 2 + + def test_empty_batch(self): + """Empty batch should return empty list.""" + scorer = StubScorer({}) + assert scorer.loglikelihood_batch([]) == [] + + +class TestModelScorerChoices: + """Tests for MCQ loglikelihood_choices.""" + + def test_prepends_delimiter(self): + """Choices should be prepended with the delimiter before scoring.""" + scores = {("Q?", " A"): -1.0, ("Q?", " B"): -0.5, ("Q?", " C"): -2.0} + scorer = StubScorer(scores) + + results = scorer.loglikelihood_choices("Q?", ["A", "B", "C"]) + + assert results == [-1.0, -0.5, -2.0] + assert scorer._call_log == [("Q?", " A"), ("Q?", " B"), ("Q?", " C")] + + def test_custom_delimiter(self): + """Custom delimiter should be used instead of default space.""" + scores = {("Q?", "\nA"): -1.0, ("Q?", "\nB"): -0.5} + scorer = StubScorer(scores) + + results = scorer.loglikelihood_choices("Q?", ["A", "B"], delimiter="\n") + + assert results == [-1.0, -0.5] + assert scorer._call_log == [("Q?", "\nA"), ("Q?", "\nB")] + + +class TestModelScorerTracing: + """Tests for gather_traces and gather_config.""" + + def test_gather_traces_includes_call_stats(self): + """Traces should contain 
call counts and timing.""" + scores = {("a", "b"): -1.0, ("c", "d"): -2.0} + scorer = StubScorer(scores) + scorer.loglikelihood("a", "b") + scorer.loglikelihood("c", "d") + + traces = scorer.gather_traces() + + assert traces["model_id"] == "stub-model" + assert traces["total_calls"] == 2 + assert traces["successful_calls"] == 2 + assert traces["failed_calls"] == 0 + assert traces["total_duration_seconds"] >= 0 + assert len(traces["logs"]) == 2 + + def test_gather_traces_counts_failures(self): + """Traces should correctly count failed calls.""" + scorer = FailingScorer() + with pytest.raises(ValueError): + scorer.loglikelihood("a", "b") + + traces = scorer.gather_traces() + + assert traces["total_calls"] == 1 + assert traces["successful_calls"] == 0 + assert traces["failed_calls"] == 1 + + def test_gather_config(self): + """Config should include model_id, scorer_type, and seed.""" + scorer = StubScorer({}, seed=42) + + config = scorer.gather_config() + + assert config["model_id"] == "stub-model" + assert config["scorer_type"] == "StubScorer" + assert config["seed"] == 42 + + def test_gather_config_seed_none(self): + """Config should report None seed when unseeded.""" + scorer = StubScorer({}) + + config = scorer.gather_config() + + assert config["seed"] is None + + +class TestModelScorerSeed: + """Tests for seed property.""" + + def test_seed_stored(self): + scorer = StubScorer({}, seed=123) + assert scorer.seed == 123 + + def test_seed_default_none(self): + scorer = StubScorer({}) + assert scorer.seed is None From afd2cf95fb70ae0deea9bf4cce0e1023d5616b4c Mon Sep 17 00:00:00 2001 From: Alexander Rubinstein Date: Mon, 16 Mar 2026 08:58:05 +0100 Subject: [PATCH 13/23] [Move DISCO queue to core]: - Move load_anchor_points to DISCOQueue --- maseval/benchmark/mmlu/mmlu.py | 43 +--------------- maseval/core/task.py | 61 +++++++++++++++++++++-- tests/test_core/test_queue.py | 90 ++++++++++++++++++++++++++++++++++ 3 files changed, 147 insertions(+), 47 deletions(-) diff 
--git a/maseval/benchmark/mmlu/mmlu.py b/maseval/benchmark/mmlu/mmlu.py index 1e778169..d00fe5c0 100644 --- a/maseval/benchmark/mmlu/mmlu.py +++ b/maseval/benchmark/mmlu/mmlu.py @@ -24,19 +24,9 @@ """ import json -import pickle from pathlib import Path from typing import Any, Dict, List, Optional, Sequence, Tuple, Union, cast -# numpy is optional - only needed for anchor points processing -try: - import numpy as np - - HAS_NUMPY = True -except ImportError: - np = None # type: ignore[assignment] - HAS_NUMPY = False - from maseval import ( AgentAdapter, DISCOQueue, @@ -561,29 +551,6 @@ def get_model_adapter(self, model_id: str, **kwargs: Any) -> ModelAdapter: # ============================================================================= -def load_pickle(path: Union[str, Path]) -> Any: - """Load a pickle file.""" - with open(path, "rb") as f: - return pickle.load(f) - - -def load_anchor_points(path: Union[str, Path]) -> List[int]: - """Load anchor points from a .json or .pkl file. Returns a list of doc_ids.""" - path = Path(path) - if not path.exists(): - raise FileNotFoundError(f"Anchor points file not found: {path}") - if path.suffix.lower() == ".json": - with open(path) as f: - anchor_points = json.load(f) - else: - anchor_points = load_pickle(path) - if HAS_NUMPY and isinstance(anchor_points, np.ndarray): - anchor_points = anchor_points.tolist() - elif not HAS_NUMPY and hasattr(anchor_points, "tolist"): - anchor_points = anchor_points.tolist() - return list(anchor_points) - - def load_tasks( data_path: Union[str, Path], anchor_points_path: Optional[Union[str, Path]] = None, @@ -601,8 +568,6 @@ def load_tasks( Returns: TaskQueue containing MMLU tasks. - Raises: - ImportError: If anchor_points_path is provided but numpy is not installed. 
""" data_path = Path(data_path) @@ -642,14 +607,8 @@ def load_tasks( ) tasks.append(task) - # Load anchor points if provided - anchor_points = None if anchor_points_path is not None: - anchor_points = load_anchor_points(anchor_points_path) - - # Create appropriate queue - if anchor_points is not None: - return DISCOQueue(tasks, anchor_points) + return DISCOQueue(tasks, anchor_points_path=anchor_points_path) else: return SequentialTaskQueue(tasks) diff --git a/maseval/core/task.py b/maseval/core/task.py index 22ec5e0f..07a3af9b 100644 --- a/maseval/core/task.py +++ b/maseval/core/task.py @@ -5,6 +5,7 @@ from collections.abc import Sequence from typing import Iterable, List, Union, Iterator, Optional import json +import pickle from pathlib import Path from enum import Enum @@ -339,25 +340,75 @@ class DISCOQueue(InformativeSubsetQueue): Example: ```python queue = DISCOQueue(tasks, anchor_points=[0, 5, 12]) + # or load from file: + queue = DISCOQueue(tasks, anchor_points_path="anchor_points.pkl") for task in queue: - result = execute(task) # Only 3 tasks + result = execute(task) # Only anchor-point tasks ``` """ - def __init__(self, tasks: Iterable[Task], anchor_points: Optional[List[int]] = None) -> None: + def __init__( + self, + tasks: Iterable[Task], + anchor_points: Optional[List[int]] = None, + anchor_points_path: Optional[Union[str, Path]] = None, + ) -> None: """Initialize DISCO task queue. + Anchor points can be supplied directly via ``anchor_points`` or loaded + from a file via ``anchor_points_path``. Providing both is an error. + Args: tasks: Full list of tasks (ordered by index). anchor_points: Diversity-selected indices into ``tasks``. - Typically loaded from a DISCO anchor-points file or - downloaded from a HuggingFace DISCO model repo. - If ``None``, evaluates all tasks in order. + Typically downloaded from a HuggingFace DISCO model repo. + If ``None`` and ``anchor_points_path`` is also ``None``, + evaluates all tasks in order. 
+ anchor_points_path: Path to a ``.json`` or ``.pkl`` file + containing anchor-point indices. Mutually exclusive with + ``anchor_points``. """ + if anchor_points is not None and anchor_points_path is not None: + raise ValueError("Provide either anchor_points or anchor_points_path, not both.") + + if anchor_points_path is not None: + anchor_points = self.load_anchor_points(anchor_points_path) + self._anchor_points: Optional[List[int]] = anchor_points super().__init__(tasks, indices=anchor_points) + @staticmethod + def load_anchor_points(path: Union[str, Path]) -> List[int]: + """Load anchor points from a ``.json`` or ``.pkl`` file. + + Args: + path: Path to anchor points file. JSON files should contain a + list of integer indices. Pickle files may contain a list or + a numpy array. + + Returns: + List of integer anchor-point indices. + + Raises: + FileNotFoundError: If the file does not exist. + """ + path = Path(path) + if not path.exists(): + raise FileNotFoundError(f"Anchor points file not found: {path}") + + if path.suffix.lower() == ".json": + with open(path) as f: + anchor_points = json.load(f) + else: + with open(path, "rb") as f: + anchor_points = pickle.load(f) + + if hasattr(anchor_points, "tolist"): + anchor_points = anchor_points.tolist() + + return list(anchor_points) + class PriorityTaskQueue(BaseTaskQueue): """Execute tasks ordered by priority. 
diff --git a/tests/test_core/test_queue.py b/tests/test_core/test_queue.py index ace96588..35bf1933 100644 --- a/tests/test_core/test_queue.py +++ b/tests/test_core/test_queue.py @@ -20,6 +20,16 @@ ) +class _FakeArray: + """Pickle-serializable array-like for testing .tolist() conversion.""" + + def tolist(self): + return [1, 2, 3] + + def __iter__(self): + return iter([1, 2, 3]) + + # ==================== Fixtures ==================== @@ -318,6 +328,86 @@ def test_len_matches_anchor_count(self): assert len(queue) == 3 +@pytest.mark.core +class TestDISCOQueueLoadAnchorPoints: + """Tests for DISCOQueue.load_anchor_points static method.""" + + def test_load_from_json(self, tmp_path): + """Should load anchor points from a JSON file.""" + import json + + path = tmp_path / "anchors.json" + path.write_text(json.dumps([0, 5, 12, 99])) + + result = DISCOQueue.load_anchor_points(path) + + assert result == [0, 5, 12, 99] + + def test_load_from_pickle(self, tmp_path): + """Should load anchor points from a pickle file.""" + import pickle + + path = tmp_path / "anchors.pkl" + with open(path, "wb") as f: + pickle.dump([2, 7, 15], f) + + result = DISCOQueue.load_anchor_points(path) + + assert result == [2, 7, 15] + + def test_load_converts_tolist(self, tmp_path): + """Should call .tolist() on array-like objects (e.g. 
numpy arrays).""" + import pickle + + path = tmp_path / "anchors.pkl" + with open(path, "wb") as f: + pickle.dump(_FakeArray(), f) + + result = DISCOQueue.load_anchor_points(path) + + assert result == [1, 2, 3] + + def test_file_not_found(self, tmp_path): + """Should raise FileNotFoundError for missing files.""" + with pytest.raises(FileNotFoundError, match="not found"): + DISCOQueue.load_anchor_points(tmp_path / "nonexistent.json") + + def test_accepts_string_path(self, tmp_path): + """Should accept a string path, not just Path objects.""" + import json + + path = tmp_path / "anchors.json" + path.write_text(json.dumps([10, 20])) + + result = DISCOQueue.load_anchor_points(str(path)) + + assert result == [10, 20] + + def test_init_with_anchor_points_path(self, tmp_path): + """DISCOQueue should load anchor points from file when anchor_points_path is given.""" + import json + + tasks = [Task(query=f"Q{i}") for i in range(10)] + path = tmp_path / "anchors.json" + path.write_text(json.dumps([2, 5, 8])) + + queue = DISCOQueue(tasks, anchor_points_path=path) + + assert len(queue) == 3 + assert queue._anchor_points == [2, 5, 8] + + def test_init_rejects_both_anchor_args(self, tmp_path): + """DISCOQueue should raise ValueError when both anchor_points and anchor_points_path are given.""" + import json + + tasks = [Task(query=f"Q{i}") for i in range(5)] + path = tmp_path / "anchors.json" + path.write_text(json.dumps([0, 1])) + + with pytest.raises(ValueError, match="not both"): + DISCOQueue(tasks, anchor_points=[0, 1], anchor_points_path=path) + + # ==================== PriorityTaskQueue Tests ==================== From 72018322928c907c22b2efd271f4be6499cd073a Mon Sep 17 00:00:00 2001 From: Alexander Rubinstein Date: Mon, 16 Mar 2026 09:12:29 +0100 Subject: [PATCH 14/23] [Move DISCO queue to core]: Update docs to reflect MMLU, scorer, and queue changes - Add missing documentation for new core components introduced alongside the MMLU benchmark: ModelScorer reference page, 
InformativeSubsetQueue/DISCOQueue in task reference, get_with_assert in exceptions reference, and HuggingFacePipelineModelAdapter rename in model/HuggingFace pages. - Add mmlu extra to README install section. - Fix grammar in MMLU docs and fill CHANGELOG PR placeholders. --- CHANGELOG.md | 9 +++++---- README.md | 7 +++++++ docs/benchmark/mmlu.md | 2 +- docs/interface/inference/huggingface.md | 17 ++++++++++++++--- docs/reference/exceptions.md | 4 ++++ docs/reference/model.md | 2 +- docs/reference/scorer.md | 19 +++++++++++++++++++ docs/reference/task.md | 22 +++++++++++++++------- mkdocs.yml | 1 + 9 files changed, 67 insertions(+), 16 deletions(-) create mode 100644 docs/reference/scorer.md diff --git a/CHANGELOG.md b/CHANGELOG.md index bea3eedd..e4b63450 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -41,8 +41,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 **Core** -- Added `DISCOQueue` to `maseval.core.task` for subset-based evaluation (e.g., anchor-point selection for DISCO). Available via `from maseval import DISCOQueue`. (PR: #34) -- Added `ModelScorer` abstract base class in `maseval.core.scorer` for log-likelihood scoring, with `loglikelihood()`, `loglikelihood_batch()`, and `loglikelihood_choices()` methods. (PR: #PR_NUMBER_PLACEHOLDER) +- Added `InformativeSubsetQueue` and `DISCOQueue` to `maseval.core.task` for subset-based evaluation (e.g., anchor-point selection for DISCO). `DISCOQueue` accepts `anchor_points_path` to load indices from a `.json`/`.pkl` file via `DISCOQueue.load_anchor_points()`. Available via `from maseval import DISCOQueue, InformativeSubsetQueue`. (PR: #34) +- Added `get_with_assert()` utility in `maseval.core.exceptions` for strict dictionary access that raises `KeyError` instead of silently returning a default. Supports nested key lookups. 
(PR: #34) +- Added `ModelScorer` abstract base class in `maseval.core.scorer` for log-likelihood scoring, with `loglikelihood()`, `loglikelihood_batch()`, and `loglikelihood_choices()` methods. (PR: #34) - Added `SeedGenerator` abstract base class and `DefaultSeedGenerator` implementation for reproducible benchmark runs via SHA-256-based seed derivation (PR: #24) - Added `seed` and `seed_generator` parameters to `Benchmark.__init__` for enabling reproducibility (PR: #24) - Added `seed_generator` parameter to all benchmark setup methods (`setup_environment`, `setup_user`, `setup_agents`, `setup_evaluators`) (PR: #24) @@ -53,8 +54,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 **Interface** -- Added `HuggingFaceModelScorer` in `maseval.interface.inference` — log-likelihood scorer backed by a HuggingFace `AutoModelForCausalLM`, with single-token optimisation for MCQ evaluation. Implements the `ModelScorer` interface. (PR: #PR_NUMBER_PLACEHOLDER) -- Renamed `HuggingFaceModelAdapter` → `HuggingFacePipelineModelAdapter` to distinguish it from the new scorer. The old name remains as a backwards-compatible alias. (PR: #PR_NUMBER_PLACEHOLDER) +- Added `HuggingFaceModelScorer` in `maseval.interface.inference` — log-likelihood scorer backed by a HuggingFace `AutoModelForCausalLM`, with single-token optimisation for MCQ evaluation. Implements the `ModelScorer` interface. (PR: #34) +- Renamed `HuggingFaceModelAdapter` → `HuggingFacePipelineModelAdapter` to distinguish it from the new scorer. The old name remains as a backwards-compatible alias. 
(PR: #34) - CAMEL-AI integration: `CamelAgentAdapter` and `CamelLLMUser` for evaluating CAMEL-AI ChatAgent-based systems (PR: #22) - Added `CamelAgentUser` for using a CAMEL ChatAgent as the user in agent-to-agent evaluation (PR: #22) diff --git a/README.md b/README.md index dea369c6..9f71751a 100644 --- a/README.md +++ b/README.md @@ -109,6 +109,13 @@ pip install "maseval[langgraph]" pip install "maseval[llamaindex]" ``` +Or install benchmark-specific dependencies: + +```bash +# MMLU (HuggingFace models) +pip install "maseval[mmlu]" +``` + ## Example Examples are available in the [Documentation](https://maseval.readthedocs.io/en/stable/). diff --git a/docs/benchmark/mmlu.md b/docs/benchmark/mmlu.md index 348f7aaa..1b5d412b 100644 --- a/docs/benchmark/mmlu.md +++ b/docs/benchmark/mmlu.md @@ -88,7 +88,7 @@ tasks = load_tasks( anchor_points_path="/path/to/anchor_points.json", ) -# tasks is an DISCOQueue — only anchor tasks are evaluated +# tasks is a DISCOQueue — only anchor tasks are evaluated print(f"Evaluating {len(tasks)} anchor tasks") ``` diff --git a/docs/interface/inference/huggingface.md b/docs/interface/inference/huggingface.md index 00a424a4..28814b60 100644 --- a/docs/interface/inference/huggingface.md +++ b/docs/interface/inference/huggingface.md @@ -1,7 +1,18 @@ -# HuggingFace Inference Adapter +# HuggingFace Inference Adapters -This page documents the HuggingFace model adapter for MASEval. +This page documents the HuggingFace model adapters for MASEval. + +## Pipeline Model Adapter (Text Generation) [:material-github: View source](https://github.com/parameterlab/maseval/blob/main/maseval/interface/inference/huggingface.py){ .md-source-file } -::: maseval.interface.inference.huggingface.HuggingFaceModelAdapter +::: maseval.interface.inference.huggingface.HuggingFacePipelineModelAdapter + +!!! note + `HuggingFaceModelAdapter` is a backwards-compatible alias for `HuggingFacePipelineModelAdapter`. 
+ +## Model Scorer (Log-Likelihood) + +[:material-github: View source](https://github.com/parameterlab/maseval/blob/main/maseval/interface/inference/huggingface_scorer.py){ .md-source-file } + +::: maseval.interface.inference.huggingface_scorer.HuggingFaceModelScorer diff --git a/docs/reference/exceptions.md b/docs/reference/exceptions.md index ef96f9dc..99cf2c3e 100644 --- a/docs/reference/exceptions.md +++ b/docs/reference/exceptions.md @@ -38,6 +38,10 @@ SimulatorError (base for simulators) ::: maseval.core.simulator.UserSimulatorError +## Data Access Helpers + +::: maseval.core.exceptions.get_with_assert + ## Validation Helpers These functions simplify input validation and raise `AgentError` with helpful suggestions: diff --git a/docs/reference/model.md b/docs/reference/model.md index 1569d939..f0029c0d 100644 --- a/docs/reference/model.md +++ b/docs/reference/model.md @@ -20,7 +20,7 @@ The following adapter classes implement the ModelAdapter interface for specific [:material-github: View source](https://github.com/parameterlab/maseval/blob/main/maseval/interface/inference/huggingface.py){ .md-source-file } -::: maseval.interface.inference.huggingface.HuggingFaceModelAdapter +::: maseval.interface.inference.huggingface.HuggingFacePipelineModelAdapter [:material-github: View source](https://github.com/parameterlab/maseval/blob/main/maseval/interface/inference/google_genai.py){ .md-source-file } diff --git a/docs/reference/scorer.md b/docs/reference/scorer.md new file mode 100644 index 00000000..cf2eddd4 --- /dev/null +++ b/docs/reference/scorer.md @@ -0,0 +1,19 @@ +# Model Scorers + +Model Scorers provide a uniform interface for log-likelihood computation across model providers. Unlike `ModelAdapter` (which handles text generation and chat), scorers evaluate the log-likelihood a model assigns to a continuation, given some context. + +!!! note + + `ModelScorer` is the scoring counterpart to `ModelAdapter`.
Use it when you need log-likelihood evaluation (e.g., multiple-choice benchmarks) rather than text generation. + +[:material-github: View source](https://github.com/parameterlab/maseval/blob/main/maseval/core/scorer.py){ .md-source-file } + +::: maseval.core.scorer.ModelScorer + +## Interfaces + +The following scorer classes implement the ModelScorer interface for specific providers. + +[:material-github: View source](https://github.com/parameterlab/maseval/blob/main/maseval/interface/inference/huggingface_scorer.py){ .md-source-file } + +::: maseval.interface.inference.huggingface_scorer.HuggingFaceModelScorer diff --git a/docs/reference/task.md b/docs/reference/task.md index b70ef13f..ad3087d6 100644 --- a/docs/reference/task.md +++ b/docs/reference/task.md @@ -2,15 +2,15 @@ Tasks define individual benchmark scenarios including inputs, expected outputs, and metadata for evaluation. Task queues control execution order and scheduling strategy. -[:material-github: View source](https://github.com/parameterlab/maseval/blob/main/maseval/core/task.py#L55){ .md-source-file } +[:material-github: View source](https://github.com/parameterlab/maseval/blob/main/maseval/core/task.py#L56){ .md-source-file } ::: maseval.core.task.Task -[:material-github: View source](https://github.com/parameterlab/maseval/blob/main/maseval/core/task.py#L27){ .md-source-file } +[:material-github: View source](https://github.com/parameterlab/maseval/blob/main/maseval/core/task.py#L28){ .md-source-file } ::: maseval.core.task.TaskProtocol -[:material-github: View source](https://github.com/parameterlab/maseval/blob/main/maseval/core/task.py#L18){ .md-source-file } +[:material-github: View source](https://github.com/parameterlab/maseval/blob/main/maseval/core/task.py#L19){ .md-source-file } ::: maseval.core.task.TimeoutAction @@ -18,18 +18,26 @@ Tasks define individual benchmark scenarios including inputs, expected outputs, Task queues determine the order in which tasks are executed. 
Pass a queue to `Benchmark.run(queue=...)` to customize scheduling. -[:material-github: View source](https://github.com/parameterlab/maseval/blob/main/maseval/core/task.py#L86){ .md-source-file } +[:material-github: View source](https://github.com/parameterlab/maseval/blob/main/maseval/core/task.py#L87){ .md-source-file } ::: maseval.core.task.BaseTaskQueue -[:material-github: View source](https://github.com/parameterlab/maseval/blob/main/maseval/core/task.py#L256){ .md-source-file } +[:material-github: View source](https://github.com/parameterlab/maseval/blob/main/maseval/core/task.py#L257){ .md-source-file } ::: maseval.core.task.SequentialTaskQueue -[:material-github: View source](https://github.com/parameterlab/maseval/blob/main/maseval/core/task.py#L276){ .md-source-file } +[:material-github: View source](https://github.com/parameterlab/maseval/blob/main/maseval/core/task.py#L277){ .md-source-file } + +::: maseval.core.task.InformativeSubsetQueue + +[:material-github: View source](https://github.com/parameterlab/maseval/blob/main/maseval/core/task.py#L325){ .md-source-file } + +::: maseval.core.task.DISCOQueue + +[:material-github: View source](https://github.com/parameterlab/maseval/blob/main/maseval/core/task.py#L413){ .md-source-file } ::: maseval.core.task.PriorityTaskQueue -[:material-github: View source](https://github.com/parameterlab/maseval/blob/main/maseval/core/task.py#L322){ .md-source-file } +[:material-github: View source](https://github.com/parameterlab/maseval/blob/main/maseval/core/task.py#L459){ .md-source-file } ::: maseval.core.task.AdaptiveTaskQueue diff --git a/mkdocs.yml b/mkdocs.yml index 153215e9..dec8cc1e 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -110,6 +110,7 @@ nav: - Exceptions: reference/exceptions.md - History: reference/history.md - Model: reference/model.md + - Scorer: reference/scorer.md - Seeding: reference/seeding.md - Simulator: reference/simulator.md - Tasks: reference/task.md From 
e7d15a86c98a14dd508dab1eb3c7ab67496fc244 Mon Sep 17 00:00:00 2001 From: Alexander Rubinstein Date: Mon, 16 Mar 2026 09:15:34 +0100 Subject: [PATCH 15/23] [Move DISCO queue to core]: - Fix SmolAgents docs to make mkdocs build --strict pass --- docs/reference/environment.md | 10 ++++------ docs/reference/user.md | 23 +++++------------------ 2 files changed, 9 insertions(+), 24 deletions(-) diff --git a/docs/reference/environment.md b/docs/reference/environment.md index 77d40e30..7d65e9f1 100644 --- a/docs/reference/environment.md +++ b/docs/reference/environment.md @@ -8,10 +8,8 @@ Environments define the execution context for agents, including available tools, ## Tools and agent-provided helpers -Some agent adapters expose helper tools or user-simulation tools that can be used by the Environment. For example: +Some agent adapters expose helper tools or user-simulation tools that can be used by the Environment. See the framework-specific interface pages for details: -[:material-github: View source](https://github.com/parameterlab/maseval/blob/main/maseval/interface/agents/smolagents.py){ .md-source-file } - -::: maseval.interface.agents.smolagents.SmolAgentAdapter - -::: maseval.interface.agents.smolagents.SmolAgentLLMUser +- [SmolAgents](../interface/agents/smolagents.md) — `SmolAgentAdapter`, `SmolAgentLLMUser` +- [LangGraph](../interface/agents/langgraph.md) — `LangGraphAgentAdapter` +- [LlamaIndex](../interface/agents/llamaindex.md) — `LlamaIndexAgentAdapter` diff --git a/docs/reference/user.md b/docs/reference/user.md index c739ad25..c3cd1af8 100644 --- a/docs/reference/user.md +++ b/docs/reference/user.md @@ -14,22 +14,9 @@ The `LLMUser` is initialized with a persona and a scenario, both of which are ty ## Interfaces -Some integrations provide convenience user/tool implementations for specific agent frameworks. For example: +Some integrations provide convenience user implementations for specific agent frameworks. 
See the framework-specific interface pages for details: -[:material-github: View source](https://github.com/parameterlab/maseval/blob/main/maseval/interface/agents/smolagents.py){ .md-source-file } - -::: maseval.interface.agents.smolagents.SmolAgentLLMUser - -[:material-github: View source](https://github.com/parameterlab/maseval/blob/main/maseval/interface/agents/langgraph.py){ .md-source-file } - -::: maseval.interface.agents.langgraph.LangGraphLLMUser - -[:material-github: View source](https://github.com/parameterlab/maseval/blob/main/maseval/interface/agents/llamaindex.py){ .md-source-file } - -::: maseval.interface.agents.llamaindex.LlamaIndexLLMUser - -[:material-github: View source](https://github.com/parameterlab/maseval/blob/main/maseval/interface/agents/camel.py){ .md-source-file } - -::: maseval.interface.agents.camel.CamelLLMUser - -::: maseval.interface.agents.camel.CamelAgentUser +- [SmolAgents](../interface/agents/smolagents.md) — `SmolAgentLLMUser` +- [LangGraph](../interface/agents/langgraph.md) — `LangGraphLLMUser` +- [LlamaIndex](../interface/agents/llamaindex.md) — `LlamaIndexLLMUser` +- [CAMEL-AI](../interface/agents/camel.md) — `CamelLLMUser`, `CamelAgentUser` From 3aa675e3c97cf06f7d91737296b3fabbfedf660c Mon Sep 17 00:00:00 2001 From: Alexander Rubinstein Date: Mon, 16 Mar 2026 09:22:36 +0100 Subject: [PATCH 16/23] [Move DISCO queue to core]: - Fix DISCO references. 
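The DISCO idea referenced by this commit (select anchor tasks that maximise disagreement across models, then predict full-benchmark performance from results on that subset) can be conveyed with a small sketch. The correctness matrix and the pairwise-disagreement score below are illustrative assumptions, not the actual DISCO algorithm:

```python
from itertools import combinations

# Hypothetical correctness matrix: rows are models, columns are tasks
# (1 = that model answers the task correctly).
correct = [
    [1, 1, 0, 1, 0],
    [1, 0, 0, 1, 1],
    [1, 1, 1, 0, 0],
]

def disagreement(task_idx):
    # Number of model pairs that disagree on this task.
    col = [row[task_idx] for row in correct]
    return sum(1 for a, b in combinations(col, 2) if a != b)

# Keep the k most discriminative tasks as anchor points.
k = 2
ranked = sorted(range(len(correct[0])), key=disagreement, reverse=True)
anchor_indices = sorted(ranked[:k])
print(anchor_indices)  # [1, 2]; task 0, where all models agree, carries no signal
```

Tasks on which every model behaves identically contribute nothing to distinguishing models, which is why a small, diverse anchor set can stand in for the full benchmark.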
--- BENCHMARKS.md | 2 +- maseval/core/task.py | 7 +++---- 2 files changed, 4 insertions(+), 5 deletions(-) diff --git a/BENCHMARKS.md b/BENCHMARKS.md index 0cc5473c..4cb9f74c 100644 --- a/BENCHMARKS.md +++ b/BENCHMARKS.md @@ -90,7 +90,7 @@ MMLU evaluates language models on multiple-choice questions spanning 57 academic ### Source and License - **Original Paper:** [Measuring Massive Multitask Language Understanding](https://arxiv.org/abs/2009.03300) (Hendrycks et al., 2021) -- **DISCO Paper:** [DISCO: DISCOvering key features for accurate prediction of LLM abilities on benchmarks](https://arxiv.org/abs/2407.12890) (Rubinstein et al., 2025) +- **DISCO Paper:** [DISCO: Diversifying Sample Condensation for Efficient Model Evaluation](https://arxiv.org/abs/2510.07959) (Rubinstein et al., ICLR 2026) - **Dataset:** [arubique/flattened-MMLU](https://huggingface.co/datasets/arubique/flattened-MMLU) --- diff --git a/maseval/core/task.py b/maseval/core/task.py index 07a3af9b..9a7b3aca 100644 --- a/maseval/core/task.py +++ b/maseval/core/task.py @@ -327,15 +327,14 @@ class DISCOQueue(InformativeSubsetQueue): Selects a diverse subset of tasks (anchor points) for evaluation. Full benchmark performance is then predicted from results on this subset using - DISCO (DISCOvering key features for accurate prediction of LLM abilities - on benchmarks). + DISCO (Diversifying Sample Condensation for Efficient Model Evaluation). The informativeness criterion is **diversity**: anchor points are chosen to maximise disagreement across models, so that a small evaluation set captures the discriminative structure of the full benchmark. 
- Reference: `DISCO: DISCOvering key features for accurate prediction of - LLM abilities on benchmarks <https://arxiv.org/abs/2407.12890>`_ + Reference: `DISCO: Diversifying Sample Condensation for Efficient Model + Evaluation <https://arxiv.org/abs/2510.07959>`_ Example: ```python From 6f5b0e22e6ab514d84ae853439f45980866ac2e3 Mon Sep 17 00:00:00 2001 From: Alexander Rubinstein Date: Mon, 16 Mar 2026 09:27:11 +0100 Subject: [PATCH 17/23] Add benchmark/index.md to mkdocs.yml to fix warning during docs building --- mkdocs.yml | 1 + 1 file changed, 1 insertion(+) diff --git a/mkdocs.yml b/mkdocs.yml index dec8cc1e..4ba742bf 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -129,6 +129,7 @@ nav: - LiteLLM: interface/inference/litellm.md - OpenAI: interface/inference/openai.md - Benchmarks: + - Overview: benchmark/index.md - ConVerse: benchmark/converse.md - GAIA2: benchmark/gaia2.md - MACS: benchmark/macs.md From fa280fdf070f08d1b355168f99cfe45975843999 Mon Sep 17 00:00:00 2001 From: cemde <42615086+cemde@users.noreply.github.com> Date: Fri, 27 Mar 2026 23:13:48 +0100 Subject: [PATCH 18/23] small quality fixes --- BENCHMARKS.md | 2 +- CHANGELOG.md | 5 +- docs/interface/inference/huggingface.md | 3 - docs/reference/exceptions.md | 4 - maseval/__init__.py | 2 - maseval/benchmark/mmlu/__init__.py | 4 +- maseval/benchmark/mmlu/mmlu.py | 51 +- maseval/core/agent.py | 2 - maseval/core/exceptions.py | 38 - maseval/core/model.py | 2 +- maseval/core/task.py | 10 +- maseval/interface/inference/__init__.py | 2 - maseval/interface/inference/huggingface.py | 4 - .../test_model_adapter_contract.py | 8 +- tests/test_core/test_exceptions.py | 47 - tests/test_core/test_model_adapter.py | 6 +- tests/test_core/test_queue.py | 10 +- .../test_model_adapters.py | 98 +- uv.lock | 1582 +++++++++++++++-- 19 files changed, 1543 insertions(+), 337 deletions(-) diff --git a/BENCHMARKS.md b/BENCHMARKS.md index 4cb9f74c..3597e9cb 100644 --- a/BENCHMARKS.md +++ b/BENCHMARKS.md @@ -85,7 +85,7 @@ MMLU evaluates language models on multiple-choice questions spanning 57
academic > **Beta:** This benchmark has been implemented carefully, but we have not yet validated the results against the original implementation. Use with caution when comparing with existing results or the original paper's numbers. Contributions and compute donations welcome! -> **Implemented:** A ready-to-use implementation is available via `DefaultMMLUBenchmark` with HuggingFace model support. Install with `pip install maseval[mmlu]`. See the [MMLU documentation](docs/benchmark/mmlu.md) for usage details. +> **Implemented:** A ready-to-use implementation is available via `DefaultMMLUBenchmark` with HuggingFace model support. Install with `pip install maseval[mmlu]`. See the [MMLU documentation](https://maseval.readthedocs.io/en/stable/benchmark/mmlu/) for usage details. ### Source and License diff --git a/CHANGELOG.md b/CHANGELOG.md index e4b63450..321067ad 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -55,7 +55,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 **Interface** - Added `HuggingFaceModelScorer` in `maseval.interface.inference` — log-likelihood scorer backed by a HuggingFace `AutoModelForCausalLM`, with single-token optimisation for MCQ evaluation. Implements the `ModelScorer` interface. (PR: #34) -- Renamed `HuggingFaceModelAdapter` → `HuggingFacePipelineModelAdapter` to distinguish it from the new scorer. The old name remains as a backwards-compatible alias. (PR: #34) +- Renamed `HuggingFaceModelAdapter` → `HuggingFacePipelineModelAdapter` to distinguish it from the new scorer. 
(PR: #34) - CAMEL-AI integration: `CamelAgentAdapter` and `CamelLLMUser` for evaluating CAMEL-AI ChatAgent-based systems (PR: #22) - Added `CamelAgentUser` for using a CAMEL ChatAgent as the user in agent-to-agent evaluation (PR: #22) @@ -93,7 +93,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 **Benchmarks** -- `MMLUBenchmark` no longer implements `setup_agents()` — consistent with other benchmarks, agent creation is left to concrete subclasses (e.g., `DefaultMMLUBenchmark`). Removed silent `.get()` fallbacks for required fields (`gold`, `query`, `model_id`) so missing data surfaces errors immediately instead of failing silently. `DISCOQueue` moved from `maseval.benchmark.mmlu` to `maseval.core.task` and now extends `SequentialTaskQueue` instead of `AdaptiveTaskQueue`. Added `mmlu` optional extra (`pip install maseval[mmlu]`). `DefaultMMLUBenchmark` now delegates log-likelihood computation to `HuggingFaceModelScorer` and uses a scorer-backed adapter instead of the MMLU-specific `MMLUModelAgent`/`MMLUAgentAdapter` (removed). (PR: #34) +- `MMLUBenchmark` is now a framework-agnostic base class — `setup_agents()` and `get_model_adapter()` must be implemented by subclasses. Use `DefaultMMLUBenchmark` for HuggingFace models. Missing required fields now raise immediately instead of falling back to silent defaults. Install with `pip install maseval[mmlu]`. (PR: #34) +- `DISCOQueue` moved from `maseval.benchmark.mmlu` to `maseval.core.task` and now extends `SequentialTaskQueue`. Removed `MMLUModelAgent`, `MMLUAgentAdapter`, and `AnchorPointsTaskQueue`. (PR: #34) - `MACSBenchmark` and `Tau2Benchmark` benchmarks now actively use the seeding system by deriving seeds for model adapters. Seeds are passed to agents, user simulators, tool simulators, and LLM-based evaluators for reproducible runs. 
(PR: #26) - `Gaia2Benchmark`: Seeds `agents/gaia2_agent`, `evaluators/judge` - `MACSBenchmark`: Seeds `environment/tools/tool_{name}`, `simulators/user`, `evaluators/user_gsr`, `evaluators/system_gsr` diff --git a/docs/interface/inference/huggingface.md b/docs/interface/inference/huggingface.md index 28814b60..35a1e5b4 100644 --- a/docs/interface/inference/huggingface.md +++ b/docs/interface/inference/huggingface.md @@ -8,9 +8,6 @@ This page documents the HuggingFace model adapters for MASEval. ::: maseval.interface.inference.huggingface.HuggingFacePipelineModelAdapter -!!! note - `HuggingFaceModelAdapter` is a backwards-compatible alias for `HuggingFacePipelineModelAdapter`. - ## Model Scorer (Log-Likelihood) [:material-github: View source](https://github.com/parameterlab/maseval/blob/main/maseval/interface/inference/huggingface_scorer.py){ .md-source-file } diff --git a/docs/reference/exceptions.md b/docs/reference/exceptions.md index 99cf2c3e..ef96f9dc 100644 --- a/docs/reference/exceptions.md +++ b/docs/reference/exceptions.md @@ -38,10 +38,6 @@ SimulatorError (base for simulators) ::: maseval.core.simulator.UserSimulatorError -## Data Access Helpers - -::: maseval.core.exceptions.get_with_assert - ## Validation Helpers These functions simplify input validation and raise `AgentError` with helpful suggestions: diff --git a/maseval/__init__.py b/maseval/__init__.py index c6fa6cec..e10b47a4 100644 --- a/maseval/__init__.py +++ b/maseval/__init__.py @@ -49,7 +49,6 @@ UserError, UserExhaustedError, TaskTimeoutError, - get_with_assert, validate_argument_type, validate_required_arguments, validate_no_extra_arguments, @@ -106,7 +105,6 @@ "ChatResponse", "ModelScorer", # Exceptions and validation - "get_with_assert", "MASEvalError", "AgentError", "EnvironmentError", diff --git a/maseval/benchmark/mmlu/__init__.py b/maseval/benchmark/mmlu/__init__.py index 6c6f751c..d916f63c 100644 --- a/maseval/benchmark/mmlu/__init__.py +++ b/maseval/benchmark/mmlu/__init__.py @@ -7,7 
+7,7 @@ DefaultMMLUBenchmark, load_tasks, ) - from maseval import DISCOQueue, InformativeSubsetQueue + from maseval.core.task import DISCOQueue, InformativeSubsetQueue # Load tasks and anchor points tasks = load_tasks( @@ -20,7 +20,7 @@ results = benchmark.run(tasks=tasks, agent_data={"model_id": "meta-llama/Llama-2-7b-hf"}) """ -from maseval import DISCOQueue +from maseval.core.task import DISCOQueue, InformativeSubsetQueue from .mmlu import ( DEFAULT_AGENT_NAME, diff --git a/maseval/benchmark/mmlu/mmlu.py b/maseval/benchmark/mmlu/mmlu.py index d00fe5c0..b367288a 100644 --- a/maseval/benchmark/mmlu/mmlu.py +++ b/maseval/benchmark/mmlu/mmlu.py @@ -10,7 +10,7 @@ from maseval.benchmark.mmlu import ( DefaultMMLUBenchmark, load_tasks, ) - from maseval import DISCOQueue + from maseval.core.task import DISCOQueue # Load tasks (optionally filtered to anchor points) tasks = load_tasks( @@ -27,19 +27,15 @@ from pathlib import Path from typing import Any, Dict, List, Optional, Sequence, Tuple, Union, cast -from maseval import ( - AgentAdapter, - DISCOQueue, - Benchmark, - Environment, - Evaluator, - ModelAdapter, - Task, - User, - SeedGenerator, -) +from maseval.core.agent import AgentAdapter +from maseval.core.benchmark import Benchmark +from maseval.core.environment import Environment +from maseval.core.evaluator import Evaluator from maseval.core.history import MessageHistory -from maseval.core.task import SequentialTaskQueue +from maseval.core.model import ModelAdapter +from maseval.core.seeding import SeedGenerator +from maseval.core.task import DISCOQueue, SequentialTaskQueue, Task +from maseval.core.user import User # ============================================================================= @@ -75,6 +71,10 @@ def __init__(self, scorer: Any, name: str) -> None: super().__init__(agent_instance=scorer, name=name) self._messages: List[Dict[str, Any]] = [] + def record_message(self, message: Dict[str, Any]) -> None: + """Record a message for tracing purposes.""" + 
self._messages.append(message) + def _run_agent(self, query: str) -> Any: raise NotImplementedError( f"{type(self).__name__} is backed by a ModelScorer, not a generation model. " @@ -378,6 +378,7 @@ def __init__( self._device = device self._trust_remote_code = trust_remote_code self._batch_size = batch_size + self._precomputed_logprobs: Optional[Dict[Any, List[float]]] = None from maseval.interface.inference.huggingface_scorer import HuggingFaceModelScorer @@ -512,25 +513,18 @@ def run_agents( doc_id = task.metadata["doc_id"] agent = cast(_ScorerBackedAdapter, agents[0]) - if hasattr(self, "_precomputed_logprobs") and doc_id in self._precomputed_logprobs: + if self._precomputed_logprobs is not None and doc_id in self._precomputed_logprobs: logprobs = self._precomputed_logprobs[doc_id] - best_idx = logprobs.index(max(logprobs)) - answer = choices[best_idx] - mmlu_env.state["logprobs"] = logprobs - mmlu_env.state["predicted_idx"] = best_idx - agent._messages.append({"role": "user", "content": prompt}) - agent._messages.append({"role": "assistant", "content": answer, "logprobs": logprobs}) - return answer - - logprobs = self._scorer.loglikelihood_choices(prompt, choices, delimiter=TARGET_DELIMITER) + else: + logprobs = self._scorer.loglikelihood_choices(prompt, choices, delimiter=TARGET_DELIMITER) best_idx = logprobs.index(max(logprobs)) answer = choices[best_idx] mmlu_env.state["logprobs"] = logprobs mmlu_env.state["predicted_idx"] = best_idx - agent._messages.append({"role": "user", "content": prompt}) - agent._messages.append({"role": "assistant", "content": answer, "logprobs": logprobs}) + agent.record_message({"role": "user", "content": prompt}) + agent.record_message({"role": "assistant", "content": answer, "logprobs": logprobs}) return answer def get_model_adapter(self, model_id: str, **kwargs: Any) -> ModelAdapter: @@ -589,11 +583,14 @@ def load_tasks( if "gold" not in item: raise ValueError(f"MMLU task at index {i} missing required 'gold' field (correct 
answer index)") + if "choices" not in item: + raise ValueError(f"MMLU task at index {i} missing required 'choices' field") + task = Task( query=query, id=f"mmlu_{i}", environment_data={ - "choices": item.get("choices", DEFAULT_CHOICES), + "choices": item["choices"], "full_prompt": item.get("full_prompt", ""), "example": item.get("example", ""), }, @@ -638,7 +635,7 @@ def compute_benchmark_metrics(results: List[Dict[str, Any]]) -> Dict[str, Any]: if res["status"] != STATUS_SUCCESS: continue - evals = res["eval"] or [] + evals = res["eval"] if res["eval"] is not None else [] for entry in evals: acc_sum += entry["acc"] acc_norm_sum += entry["acc_norm"] diff --git a/maseval/core/agent.py b/maseval/core/agent.py index 1f0aeb9b..97011527 100644 --- a/maseval/core/agent.py +++ b/maseval/core/agent.py @@ -1,5 +1,3 @@ -from __future__ import annotations - from abc import ABC, abstractmethod from typing import List, Any, Optional, Dict diff --git a/maseval/core/exceptions.py b/maseval/core/exceptions.py index b3e297c0..e4c8c0f1 100644 --- a/maseval/core/exceptions.py +++ b/maseval/core/exceptions.py @@ -308,44 +308,6 @@ def __init__( # ============================================================================= -def get_with_assert(container: Any, key: Any, error_msg: Optional[str] = None) -> Any: - """Get a value from a container, raising ``KeyError`` if not found. - - Use instead of ``dict.get(key, default)`` when the key is **required**. - A missing key means a bug — not a case to paper over with a fallback. - - Supports nested access via a list of keys:: - - get_with_assert(task, ["metadata", "doc_id"]) - # equivalent to: task["metadata"]["doc_id"] but with a clear error - - Args: - container: Dictionary or other container supporting ``in`` and ``[]``. - key: Key to look up. Pass a list for nested access. - error_msg: Custom error message. If ``None``, a descriptive default - is generated. - - Returns: - The value at the given key. 
- - Raises: - KeyError: If the key is not found in the container. - """ - if isinstance(key, list): - assert len(key) > 0 - value = get_with_assert(container, key[0], error_msg) - if len(key) == 1: - return value - return get_with_assert(value, key[1:], error_msg) - - if key not in container: - if error_msg is None: - error_msg = f'Required key "{key}" not in container: {container}' - raise KeyError(error_msg) - - return container[key] - - def validate_argument_type( value: Any, expected_type: str, diff --git a/maseval/core/model.py b/maseval/core/model.py index d62d204c..10a615c2 100644 --- a/maseval/core/model.py +++ b/maseval/core/model.py @@ -155,7 +155,7 @@ class ModelAdapter(ABC, TraceableMixin, ConfigurableMixin): See maseval.interface.inference for concrete implementations: - AnthropicModelAdapter - GoogleGenAIModelAdapter - - HuggingFacePipelineModelAdapter (alias: HuggingFaceModelAdapter) + - HuggingFacePipelineModelAdapter - LiteLLMModelAdapter - OpenAIModelAdapter diff --git a/maseval/core/task.py b/maseval/core/task.py index 9a7b3aca..004ed70a 100644 --- a/maseval/core/task.py +++ b/maseval/core/task.py @@ -315,8 +315,14 @@ def __init__(self, tasks: Iterable[Task], indices: Optional[List[int]] = None) - self._indices: Optional[List[int]] = indices if indices is not None: - task_by_index: Dict[int, Task] = {i: task for i, task in enumerate(all_tasks)} - filtered = [task_by_index[idx] for idx in indices if idx in task_by_index] + n_tasks = len(all_tasks) + out_of_range = [idx for idx in indices if idx < 0 or idx >= n_tasks] + if out_of_range: + raise IndexError( + f"Indices {out_of_range} are out of range for task list of length {n_tasks}. " + "This likely means a mismatch between the task data and the index file." 
+ ) + filtered = [all_tasks[idx] for idx in indices] super().__init__(filtered) else: super().__init__(all_tasks) diff --git a/maseval/interface/inference/__init__.py b/maseval/interface/inference/__init__.py index 549c719b..910cb2de 100644 --- a/maseval/interface/inference/__init__.py +++ b/maseval/interface/inference/__init__.py @@ -57,12 +57,10 @@ try: from .huggingface import ( # noqa: F401 HuggingFacePipelineModelAdapter, - HuggingFaceModelAdapter, ToolCallingNotSupportedError, ) __all__.append("HuggingFacePipelineModelAdapter") - __all__.append("HuggingFaceModelAdapter") __all__.append("ToolCallingNotSupportedError") except ImportError: pass diff --git a/maseval/interface/inference/huggingface.py b/maseval/interface/inference/huggingface.py index f765eb49..9533d4f2 100644 --- a/maseval/interface/inference/huggingface.py +++ b/maseval/interface/inference/huggingface.py @@ -387,7 +387,3 @@ def gather_config(self) -> Dict[str, Any]: base_config["pipeline_config"] = pipeline_config return base_config - - -# Backwards compatibility alias -HuggingFaceModelAdapter = HuggingFacePipelineModelAdapter diff --git a/tests/test_contract/test_model_adapter_contract.py b/tests/test_contract/test_model_adapter_contract.py index 1a4bfbcf..3943cedb 100644 --- a/tests/test_contract/test_model_adapter_contract.py +++ b/tests/test_contract/test_model_adapter_contract.py @@ -1,7 +1,7 @@ """Cross-implementation ModelAdapter contract tests. Verifies that ALL ModelAdapter implementations (DummyModelAdapter, OpenAIModelAdapter, -GoogleGenAIModelAdapter, HuggingFaceModelAdapter, LiteLLMModelAdapter) implement the +GoogleGenAIModelAdapter, HuggingFacePipelineModelAdapter, LiteLLMModelAdapter) implement the same contract and behave identically for key operations. This validates MASEval's CORE PROMISE: provider-agnostic model abstraction. 
@@ -237,9 +237,9 @@ def create_huggingface_adapter( tool_calls: Optional[List[Optional[List[Dict[str, Any]]]]] = None, seed: Optional[int] = None, ) -> Any: - """Create HuggingFaceModelAdapter instance.""" + """Create HuggingFacePipelineModelAdapter instance.""" pytest.importorskip("transformers") - from maseval.interface.inference.huggingface import HuggingFaceModelAdapter + from maseval.interface.inference.huggingface import HuggingFacePipelineModelAdapter response_list: List[Optional[str]] = responses or ["Test response"] call_count = [0] @@ -249,7 +249,7 @@ def mock_model(prompt, **kwargs): call_count[0] += 1 return response - return HuggingFaceModelAdapter(model=mock_model, model_id=model_id, seed=seed) + return HuggingFacePipelineModelAdapter(model=mock_model, model_id=model_id, seed=seed) def create_litellm_adapter( diff --git a/tests/test_core/test_exceptions.py b/tests/test_core/test_exceptions.py index 1698fa61..416ebb7e 100644 --- a/tests/test_core/test_exceptions.py +++ b/tests/test_core/test_exceptions.py @@ -14,7 +14,6 @@ AgentError, EnvironmentError, UserError, - get_with_assert, validate_argument_type, validate_required_arguments, validate_no_extra_arguments, @@ -371,52 +370,6 @@ def test_validate_arguments_from_schema_strict_mode(self): validate_arguments_from_schema({"name": "test", "extra": 1}, schema, strict=True) -@pytest.mark.core -class TestGetWithAssert: - """Tests for get_with_assert required-key lookup.""" - - def test_single_key_present(self): - """Returns value when key exists.""" - assert get_with_assert({"a": 1}, "a") == 1 - - def test_single_key_missing_raises_key_error(self): - """Raises KeyError with descriptive message when key is missing.""" - with pytest.raises(KeyError, match='Required key "x"'): - get_with_assert({"a": 1}, "x") - - def test_nested_key_access(self): - """Supports nested access via a list of keys.""" - data = {"level1": {"level2": {"level3": "value"}}} - assert get_with_assert(data, ["level1", "level2", 
"level3"]) == "value" - - def test_nested_key_missing_raises_key_error(self): - """Raises KeyError when a nested key is missing.""" - data = {"level1": {"level2": {}}} - with pytest.raises(KeyError): - get_with_assert(data, ["level1", "level2", "level3"]) - - def test_custom_error_message(self): - """Uses custom error message when provided.""" - with pytest.raises(KeyError, match="MMLU task missing query"): - get_with_assert({}, "query", error_msg="MMLU task missing query") - - def test_single_element_list_key(self): - """List with one key behaves like a single key.""" - assert get_with_assert({"a": 42}, ["a"]) == 42 - - def test_falsy_values_returned(self): - """Falsy values (0, empty string, False, None) are returned, not treated as missing.""" - assert get_with_assert({"k": 0}, "k") == 0 - assert get_with_assert({"k": ""}, "k") == "" - assert get_with_assert({"k": False}, "k") is False - assert get_with_assert({"k": None}, "k") is None - - def test_empty_key_list_raises(self): - """Empty key list triggers assertion error.""" - with pytest.raises(AssertionError): - get_with_assert({"a": 1}, []) - - class TestFilteringByErrorType: """Tests for filtering failed tasks by error type.""" diff --git a/tests/test_core/test_model_adapter.py b/tests/test_core/test_model_adapter.py index 9682b0bb..09a4aab2 100644 --- a/tests/test_core/test_model_adapter.py +++ b/tests/test_core/test_model_adapter.py @@ -628,12 +628,12 @@ def test_litellm_adapter_accepts_seed(self): assert adapter.seed == 42 def test_huggingface_adapter_accepts_seed(self): - """HuggingFaceModelAdapter accepts seed parameter.""" - from maseval.interface.inference import HuggingFaceModelAdapter + """HuggingFacePipelineModelAdapter accepts seed parameter.""" + from maseval.interface.inference import HuggingFacePipelineModelAdapter def mock_model(x): return "response" - adapter = HuggingFaceModelAdapter(model=mock_model, model_id="llama", seed=42) + adapter = HuggingFacePipelineModelAdapter(model=mock_model, 
model_id="llama", seed=42) assert adapter.seed == 42 diff --git a/tests/test_core/test_queue.py b/tests/test_core/test_queue.py index 35bf1933..c7e9ea8c 100644 --- a/tests/test_core/test_queue.py +++ b/tests/test_core/test_queue.py @@ -263,14 +263,12 @@ def test_stores_all_tasks(self, simple_tasks): assert len(queue._all_tasks) == 3 assert len(queue) == 1 - def test_out_of_range_indices_skipped(self): - """Indices not present in the task list should be silently skipped.""" + def test_out_of_range_indices_raises(self): + """Out-of-range indices should raise IndexError.""" tasks = [Task(query="Q0"), Task(query="Q1")] - queue = InformativeSubsetQueue(tasks, indices=[0, 5, 99]) - queries = [task.query for task in queue] - - assert queries == ["Q0"] + with pytest.raises(IndexError, match="out of range"): + InformativeSubsetQueue(tasks, indices=[0, 5, 99]) def test_empty_indices(self, simple_tasks): """Empty indices list should yield no tasks.""" diff --git a/tests/test_interface/test_model_integration/test_model_adapters.py b/tests/test_interface/test_model_integration/test_model_adapters.py index 50f418ca..642a002a 100644 --- a/tests/test_interface/test_model_integration/test_model_adapters.py +++ b/tests/test_interface/test_model_integration/test_model_adapters.py @@ -3,7 +3,7 @@ Tests specific behavior and integration for each ModelAdapter implementation: - OpenAIModelAdapter - GoogleGenAIModelAdapter -- HuggingFaceModelAdapter +- HuggingFacePipelineModelAdapter - LiteLLMModelAdapter These tests verify that each adapter correctly wraps its underlying client @@ -645,30 +645,30 @@ def __init__(self): @pytest.mark.interface -class TestHuggingFaceModelAdapterIntegration: - """Test HuggingFaceModelAdapter specific behavior.""" +class TestHuggingFacePipelineModelAdapterIntegration: + """Test HuggingFacePipelineModelAdapter specific behavior.""" def test_huggingface_adapter_initialization(self): - """HuggingFaceModelAdapter initializes with callable.""" + 
"""HuggingFacePipelineModelAdapter initializes with callable.""" pytest.importorskip("transformers") - from maseval.interface.inference.huggingface import HuggingFaceModelAdapter + from maseval.interface.inference.huggingface import HuggingFacePipelineModelAdapter def mock_model(prompt, **kwargs): return f"Response to: {prompt}" - adapter = HuggingFaceModelAdapter(model=mock_model, model_id="gpt2") + adapter = HuggingFacePipelineModelAdapter(model=mock_model, model_id="gpt2") assert adapter.model_id == "gpt2" def test_huggingface_adapter_generate(self): - """HuggingFaceModelAdapter generates text with message formatting.""" + """HuggingFacePipelineModelAdapter generates text with message formatting.""" pytest.importorskip("transformers") - from maseval.interface.inference.huggingface import HuggingFaceModelAdapter + from maseval.interface.inference.huggingface import HuggingFacePipelineModelAdapter def mock_model(prompt, **kwargs): return f"Generated: {prompt}" - adapter = HuggingFaceModelAdapter(model=mock_model, model_id="gpt2") + adapter = HuggingFacePipelineModelAdapter(model=mock_model, model_id="gpt2") result = adapter.generate("Test prompt") assert isinstance(result, str) @@ -676,9 +676,9 @@ def mock_model(prompt, **kwargs): assert "Generated:" in result def test_huggingface_adapter_default_generation_params(self): - """HuggingFaceModelAdapter uses default generation parameters.""" + """HuggingFacePipelineModelAdapter uses default generation parameters.""" pytest.importorskip("transformers") - from maseval.interface.inference.huggingface import HuggingFaceModelAdapter + from maseval.interface.inference.huggingface import HuggingFacePipelineModelAdapter captured_params = {} @@ -686,7 +686,7 @@ def mock_model(prompt, **kwargs): captured_params.update(kwargs) return "Response" - adapter = HuggingFaceModelAdapter( + adapter = HuggingFacePipelineModelAdapter( model=mock_model, model_id="gpt2", default_generation_params={"max_length": 50, "temperature": 0.7}, @@ 
-699,29 +699,29 @@ def mock_model(prompt, **kwargs): assert captured_params["temperature"] == 0.7 def test_huggingface_adapter_fallback_without_kwargs(self): - """HuggingFaceModelAdapter falls back to calling without kwargs.""" + """HuggingFacePipelineModelAdapter falls back to calling without kwargs.""" pytest.importorskip("transformers") - from maseval.interface.inference.huggingface import HuggingFaceModelAdapter + from maseval.interface.inference.huggingface import HuggingFacePipelineModelAdapter def mock_model(prompt): # Only accepts prompt return f"Response: {prompt}" - adapter = HuggingFaceModelAdapter(model=mock_model, model_id="gpt2") + adapter = HuggingFacePipelineModelAdapter(model=mock_model, model_id="gpt2") result = adapter.generate("Test") # Should still work, just formats the prompt as messages assert "Response:" in result def test_huggingface_adapter_gather_config(self): - """HuggingFaceModelAdapter config includes parameters.""" + """HuggingFacePipelineModelAdapter config includes parameters.""" pytest.importorskip("transformers") - from maseval.interface.inference.huggingface import HuggingFaceModelAdapter + from maseval.interface.inference.huggingface import HuggingFacePipelineModelAdapter def mock_model(prompt): return "Response" - adapter = HuggingFaceModelAdapter( + adapter = HuggingFacePipelineModelAdapter( model=mock_model, model_id="gpt2", default_generation_params={"max_length": 100}, @@ -734,9 +734,9 @@ def mock_model(prompt): assert "callable_type" in config def test_huggingface_adapter_gather_config_with_pipeline(self): - """HuggingFaceModelAdapter config includes pipeline configuration.""" + """HuggingFacePipelineModelAdapter config includes pipeline configuration.""" pytest.importorskip("transformers") - from maseval.interface.inference.huggingface import HuggingFaceModelAdapter + from maseval.interface.inference.huggingface import HuggingFacePipelineModelAdapter # Mock pipeline object with attributes class MockPipeline: @@ -749,7 
+749,7 @@ def __call__(self, prompt, **kwargs): return "Response" pipeline = MockPipeline() - adapter = HuggingFaceModelAdapter(model=pipeline, model_id="gpt2") + adapter = HuggingFacePipelineModelAdapter(model=pipeline, model_id="gpt2") config = adapter.gather_config() @@ -759,17 +759,17 @@ def __call__(self, prompt, **kwargs): assert config["pipeline_config"]["framework"] == "pt" def test_huggingface_adapter_tools_raises_error_without_support(self): - """HuggingFaceModelAdapter raises error when tools not supported.""" + """HuggingFacePipelineModelAdapter raises error when tools not supported.""" pytest.importorskip("transformers") from maseval.interface.inference.huggingface import ( - HuggingFaceModelAdapter, + HuggingFacePipelineModelAdapter, ToolCallingNotSupportedError, ) def mock_model(prompt, **kwargs): return "Response" - adapter = HuggingFaceModelAdapter(model=mock_model, model_id="test-model") + adapter = HuggingFacePipelineModelAdapter(model=mock_model, model_id="test-model") with pytest.raises(ToolCallingNotSupportedError): adapter.chat( @@ -778,10 +778,10 @@ def mock_model(prompt, **kwargs): ) def test_huggingface_adapter_tools_raises_when_template_doesnt_support(self): - """HuggingFaceModelAdapter raises error when template doesn't support tools.""" + """HuggingFacePipelineModelAdapter raises error when template doesn't support tools.""" pytest.importorskip("transformers") from maseval.interface.inference.huggingface import ( - HuggingFaceModelAdapter, + HuggingFacePipelineModelAdapter, ToolCallingNotSupportedError, ) @@ -797,7 +797,7 @@ class MockPipeline: def __call__(self, prompt, **kwargs): return "Response" - adapter = HuggingFaceModelAdapter(model=MockPipeline(), model_id="test-model") + adapter = HuggingFacePipelineModelAdapter(model=MockPipeline(), model_id="test-model") with pytest.raises(ToolCallingNotSupportedError): adapter.chat( @@ -806,9 +806,9 @@ def __call__(self, prompt, **kwargs): ) def 
test_huggingface_adapter_chat_template_with_tools(self): - """HuggingFaceModelAdapter works when template supports tools.""" + """HuggingFacePipelineModelAdapter works when template supports tools.""" pytest.importorskip("transformers") - from maseval.interface.inference.huggingface import HuggingFaceModelAdapter + from maseval.interface.inference.huggingface import HuggingFacePipelineModelAdapter class MockTokenizer: def apply_chat_template(self, messages, add_generation_prompt=True, tokenize=False, tools=None, **kwargs): @@ -820,7 +820,7 @@ class MockPipeline: def __call__(self, prompt, **kwargs): return "Response" - adapter = HuggingFaceModelAdapter(model=MockPipeline(), model_id="test-model") + adapter = HuggingFacePipelineModelAdapter(model=MockPipeline(), model_id="test-model") response = adapter.chat( [{"role": "user", "content": "Test"}], tools=[{"type": "function", "function": {"name": "test"}}], @@ -829,9 +829,9 @@ def __call__(self, prompt, **kwargs): assert response is not None def test_huggingface_adapter_parses_tool_calls_from_output(self): - """HuggingFaceModelAdapter parses tool calls from model output.""" + """HuggingFacePipelineModelAdapter parses tool calls from model output.""" pytest.importorskip("transformers") - from maseval.interface.inference.huggingface import HuggingFaceModelAdapter + from maseval.interface.inference.huggingface import HuggingFacePipelineModelAdapter class MockTokenizer: def apply_chat_template(self, messages, add_generation_prompt=True, tokenize=False, tools=None, **kwargs): @@ -843,7 +843,7 @@ class MockPipeline: def __call__(self, prompt, **kwargs): return '{"name": "search", "arguments": {"q": "test"}}' - adapter = HuggingFaceModelAdapter(model=MockPipeline(), model_id="test-model") + adapter = HuggingFacePipelineModelAdapter(model=MockPipeline(), model_id="test-model") response = adapter.chat( [{"role": "user", "content": "Search"}], tools=[{"type": "function", "function": {"name": "search"}}], @@ -854,9 +854,9 @@ 
def __call__(self, prompt, **kwargs): assert any(tc["function"]["name"] == "search" for tc in response.tool_calls) def test_huggingface_adapter_chat_with_tokenizer(self): - """HuggingFaceModelAdapter uses chat template when available.""" + """HuggingFacePipelineModelAdapter uses chat template when available.""" pytest.importorskip("transformers") - from maseval.interface.inference.huggingface import HuggingFaceModelAdapter + from maseval.interface.inference.huggingface import HuggingFacePipelineModelAdapter class MockTokenizer: def apply_chat_template(self, messages, add_generation_prompt=True, tokenize=False, **kwargs): @@ -868,42 +868,42 @@ class MockPipeline: def __call__(self, prompt, **kwargs): return f"Response to: {prompt}" - adapter = HuggingFaceModelAdapter(model=MockPipeline(), model_id="test-model") + adapter = HuggingFacePipelineModelAdapter(model=MockPipeline(), model_id="test-model") response = adapter.chat([{"role": "user", "content": "Hello"}]) assert response.content is not None def test_huggingface_adapter_pipeline_response_format(self): - """HuggingFaceModelAdapter handles pipeline list response format.""" + """HuggingFacePipelineModelAdapter handles pipeline list response format.""" pytest.importorskip("transformers") - from maseval.interface.inference.huggingface import HuggingFaceModelAdapter + from maseval.interface.inference.huggingface import HuggingFacePipelineModelAdapter def mock_model(prompt, **kwargs): return [{"generated_text": prompt + " Generated"}] - adapter = HuggingFaceModelAdapter(model=mock_model, model_id="test-model") + adapter = HuggingFacePipelineModelAdapter(model=mock_model, model_id="test-model") response = adapter.chat([{"role": "user", "content": "Test"}]) assert response.content is not None assert "Generated" in response.content def test_huggingface_adapter_dict_response_format(self): - """HuggingFaceModelAdapter handles dict response format.""" + """HuggingFacePipelineModelAdapter handles dict response format.""" 
pytest.importorskip("transformers") - from maseval.interface.inference.huggingface import HuggingFaceModelAdapter + from maseval.interface.inference.huggingface import HuggingFacePipelineModelAdapter def mock_model(prompt, **kwargs): return {"generated_text": "Dict response"} - adapter = HuggingFaceModelAdapter(model=mock_model, model_id="test-model") + adapter = HuggingFacePipelineModelAdapter(model=mock_model, model_id="test-model") response = adapter.chat([{"role": "user", "content": "Test"}]) assert response.content == "Dict response" def test_huggingface_adapter_nested_tokenizer(self): - """HuggingFaceModelAdapter gets tokenizer from model.model.tokenizer.""" + """HuggingFacePipelineModelAdapter gets tokenizer from model.model.tokenizer.""" pytest.importorskip("transformers") - from maseval.interface.inference.huggingface import HuggingFaceModelAdapter + from maseval.interface.inference.huggingface import HuggingFacePipelineModelAdapter class MockTokenizer: def apply_chat_template(self, messages, add_generation_prompt=True, tokenize=False, **kwargs): @@ -918,7 +918,7 @@ class MockPipeline: def __call__(self, prompt, **kwargs): return "Response" - adapter = HuggingFaceModelAdapter(model=MockPipeline(), model_id="test-model") + adapter = HuggingFacePipelineModelAdapter(model=MockPipeline(), model_id="test-model") response = adapter.chat([{"role": "user", "content": "Test"}]) assert response is not None @@ -1673,7 +1673,7 @@ def test_all_adapters_expose_model_id(self): from maseval.interface.inference.openai import OpenAIModelAdapter from maseval.interface.inference.google_genai import GoogleGenAIModelAdapter - from maseval.interface.inference.huggingface import HuggingFaceModelAdapter + from maseval.interface.inference.huggingface import HuggingFacePipelineModelAdapter from maseval.interface.inference.litellm import LiteLLMModelAdapter # OpenAI - mock with modern interface @@ -1706,7 +1706,7 @@ def __init__(self): assert google_adapter.model_id == "gemini-pro" # 
HuggingFace - hf_adapter = HuggingFaceModelAdapter(model=lambda p: "R", model_id="gpt2") + hf_adapter = HuggingFacePipelineModelAdapter(model=lambda p: "R", model_id="gpt2") assert hf_adapter.model_id == "gpt2" # LiteLLM @@ -1722,7 +1722,7 @@ def test_all_adapters_include_default_params_in_config(self): from maseval.interface.inference.openai import OpenAIModelAdapter from maseval.interface.inference.google_genai import GoogleGenAIModelAdapter - from maseval.interface.inference.huggingface import HuggingFaceModelAdapter + from maseval.interface.inference.huggingface import HuggingFacePipelineModelAdapter from maseval.interface.inference.litellm import LiteLLMModelAdapter params = {"temperature": 0.7} @@ -1761,7 +1761,7 @@ def __init__(self): assert "default_generation_params" in google_config # HuggingFace - hf_config = HuggingFaceModelAdapter(model=lambda p: "R", model_id="gpt2", default_generation_params=params).gather_config() + hf_config = HuggingFacePipelineModelAdapter(model=lambda p: "R", model_id="gpt2", default_generation_params=params).gather_config() assert "default_generation_params" in hf_config # LiteLLM diff --git a/uv.lock b/uv.lock index 5d02dcf0..f65fd4d6 100644 --- a/uv.lock +++ b/uv.lock @@ -3,11 +3,13 @@ revision = 3 requires-python = ">=3.10" resolution-markers = [ "python_full_version >= '3.14' and sys_platform == 'linux'", - "python_full_version >= '3.12' and python_full_version < '3.14' and sys_platform == 'linux'", + "python_full_version == '3.13.*' and sys_platform == 'linux'", + "python_full_version == '3.12.*' and sys_platform == 'linux'", "python_full_version == '3.11.*' and sys_platform == 'linux'", "python_full_version < '3.11' and sys_platform == 'linux'", "python_full_version >= '3.14' and sys_platform != 'linux'", - "python_full_version >= '3.12' and python_full_version < '3.14' and sys_platform != 'linux'", + "python_full_version == '3.13.*' and sys_platform != 'linux'", + "python_full_version == '3.12.*' and sys_platform != 
'linux'", "python_full_version == '3.11.*' and sys_platform != 'linux'", "python_full_version < '3.11' and sys_platform != 'linux'", ] @@ -37,6 +39,34 @@ overrides = [ { name = "termcolor", specifier = ">=2.5.0" }, ] +[[package]] +name = "absl-py" +version = "2.4.0" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/64/c7/8de93764ad66968d19329a7e0c147a2bb3c7054c554d4a119111b8f9440f/absl_py-2.4.0.tar.gz", hash = "sha256:8c6af82722b35cf71e0f4d1d47dcaebfff286e27110a99fc359349b247dfb5d4", size = 116543, upload-time = "2026-01-28T10:17:05.322Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/18/a6/907a406bb7d359e6a63f99c313846d9eec4f7e6f7437809e03aa00fa3074/absl_py-2.4.0-py3-none-any.whl", hash = "sha256:88476fd881ca8aab94ffa78b7b6c632a782ab3ba1cd19c9bd423abc4fb4cd28d", size = 135750, upload-time = "2026-01-28T10:17:04.19Z" }, +] + +[[package]] +name = "accelerate" +version = "1.13.0" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "huggingface-hub" }, + { name = "numpy", version = "2.2.6", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.11'" }, + { name = "numpy", version = "2.4.0", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.11'" }, + { name = "packaging" }, + { name = "psutil" }, + { name = "pyyaml" }, + { name = "safetensors" }, + { name = "torch" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/ca/14/787e5498cd062640f0f3d92ef4ae4063174f76f9afd29d13fc52a319daae/accelerate-1.13.0.tar.gz", hash = "sha256:d631b4e0f5b3de4aff2d7e9e6857d164810dfc3237d54d017f075122d057b236", size = 402835, upload-time = "2026-03-04T19:34:12.359Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/7e/46/02ac5e262d4af18054b3e922b2baedbb2a03289ee792162de60a865defc5/accelerate-1.13.0-py3-none-any.whl", hash = 
"sha256:cf1a3efb96c18f7b152eb0fa7490f3710b19c3f395699358f08decca2b8b62e0", size = 383744, upload-time = "2026-03-04T19:34:10.313Z" }, +] + [[package]] name = "addict" version = "2.4.0" @@ -197,6 +227,15 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/ef/39/b2181148075272edfbbd6d87e6cd78cc71dca243446fa3b381fd4116950b/aiosqlite-0.22.0-py3-none-any.whl", hash = "sha256:96007fac2ce70eda3ca1bba7a3008c435258a592b8fbf2ee3eeaa36d33971a09", size = 17263, upload-time = "2025-12-13T18:32:44.619Z" }, ] +[[package]] +name = "annotated-doc" +version = "0.0.4" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/57/ba/046ceea27344560984e26a590f90bc7f4a75b06701f653222458922b558c/annotated_doc-0.0.4.tar.gz", hash = "sha256:fbcda96e87e9c92ad167c2e53839e57503ecfda18804ea28102353485033faa4", size = 7288, upload-time = "2025-11-10T22:07:42.062Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/1e/d3/26bf1008eb3d2daa8ef4cacc7f3bfdc11818d111f7e2d0201bc6e3b49d45/annotated_doc-0.0.4-py3-none-any.whl", hash = "sha256:571ac1dc6991c450b25a9c2d84a3705e2ae7a53467b5d111c24fa8baabbed320", size = 5303, upload-time = "2025-11-10T22:07:40.673Z" }, +] + [[package]] name = "annotated-types" version = "0.7.0" @@ -599,6 +638,15 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/db/3c/33bac158f8ab7f89b2e59426d5fe2e4f63f7ed25df84c036890172b412b5/cfgv-3.5.0-py2.py3-none-any.whl", hash = "sha256:a8dc6b26ad22ff227d2634a65cb388215ce6cc96bbcc5cfde7641ae87e8dacc0", size = 7445, upload-time = "2025-11-19T20:55:50.744Z" }, ] +[[package]] +name = "chardet" +version = "5.2.0" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/f3/0d/f7b6ab21ec75897ed80c17d79b15951a719226b9fababf1e40ea74d69079/chardet-5.2.0.tar.gz", hash = "sha256:1b3b6ff479a8c414bc3fa2c0852995695c4a026dcd6d0633b2dd092ca39c1cf7", size = 2069618, upload-time = "2023-08-01T19:23:02.662Z" } +wheels 
= [ + { url = "https://files.pythonhosted.org/packages/38/6f/f5fbc992a329ee4e0f288c1fe0e2ad9485ed064cac731ed2fe47dcc38cbf/chardet-5.2.0-py3-none-any.whl", hash = "sha256:e1cf59446890a00105fe7b7912492ea04b6e6f06d4b742b2c788469e34c82970", size = 199385, upload-time = "2023-08-01T19:23:00.661Z" }, +] + [[package]] name = "charset-normalizer" version = "3.4.4" @@ -739,6 +787,169 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/60/97/891a0971e1e4a8c5d2b20bbe0e524dc04548d2307fee33cdeba148fd4fc7/comm-0.2.3-py3-none-any.whl", hash = "sha256:c615d91d75f7f04f095b30d1c1711babd43bdc6419c1be9886a85f2f4e489417", size = 7294, upload-time = "2025-07-25T14:02:02.896Z" }, ] +[[package]] +name = "contourpy" +version = "1.3.2" +source = { registry = "https://pypi.org/simple" } +resolution-markers = [ + "python_full_version < '3.11' and sys_platform == 'linux'", + "python_full_version < '3.11' and sys_platform != 'linux'", +] +dependencies = [ + { name = "numpy", version = "2.2.6", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.11'" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/66/54/eb9bfc647b19f2009dd5c7f5ec51c4e6ca831725f1aea7a993034f483147/contourpy-1.3.2.tar.gz", hash = "sha256:b6945942715a034c671b7fc54f9588126b0b8bf23db2696e3ca8328f3ff0ab54", size = 13466130, upload-time = "2025-04-15T17:47:53.79Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/12/a3/da4153ec8fe25d263aa48c1a4cbde7f49b59af86f0b6f7862788c60da737/contourpy-1.3.2-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:ba38e3f9f330af820c4b27ceb4b9c7feee5fe0493ea53a8720f4792667465934", size = 268551, upload-time = "2025-04-15T17:34:46.581Z" }, + { url = "https://files.pythonhosted.org/packages/2f/6c/330de89ae1087eb622bfca0177d32a7ece50c3ef07b28002de4757d9d875/contourpy-1.3.2-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:dc41ba0714aa2968d1f8674ec97504a8f7e334f48eeacebcaa6256213acb0989", size = 253399, upload-time = 
"2025-04-15T17:34:51.427Z" }, + { url = "https://files.pythonhosted.org/packages/c1/bd/20c6726b1b7f81a8bee5271bed5c165f0a8e1f572578a9d27e2ccb763cb2/contourpy-1.3.2-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:9be002b31c558d1ddf1b9b415b162c603405414bacd6932d031c5b5a8b757f0d", size = 312061, upload-time = "2025-04-15T17:34:55.961Z" }, + { url = "https://files.pythonhosted.org/packages/22/fc/a9665c88f8a2473f823cf1ec601de9e5375050f1958cbb356cdf06ef1ab6/contourpy-1.3.2-cp310-cp310-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:8d2e74acbcba3bfdb6d9d8384cdc4f9260cae86ed9beee8bd5f54fee49a430b9", size = 351956, upload-time = "2025-04-15T17:35:00.992Z" }, + { url = "https://files.pythonhosted.org/packages/25/eb/9f0a0238f305ad8fb7ef42481020d6e20cf15e46be99a1fcf939546a177e/contourpy-1.3.2-cp310-cp310-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:e259bced5549ac64410162adc973c5e2fb77f04df4a439d00b478e57a0e65512", size = 320872, upload-time = "2025-04-15T17:35:06.177Z" }, + { url = "https://files.pythonhosted.org/packages/32/5c/1ee32d1c7956923202f00cf8d2a14a62ed7517bdc0ee1e55301227fc273c/contourpy-1.3.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:ad687a04bc802cbe8b9c399c07162a3c35e227e2daccf1668eb1f278cb698631", size = 325027, upload-time = "2025-04-15T17:35:11.244Z" }, + { url = "https://files.pythonhosted.org/packages/83/bf/9baed89785ba743ef329c2b07fd0611d12bfecbedbdd3eeecf929d8d3b52/contourpy-1.3.2-cp310-cp310-musllinux_1_2_aarch64.whl", hash = "sha256:cdd22595308f53ef2f891040ab2b93d79192513ffccbd7fe19be7aa773a5e09f", size = 1306641, upload-time = "2025-04-15T17:35:26.701Z" }, + { url = "https://files.pythonhosted.org/packages/d4/cc/74e5e83d1e35de2d28bd97033426b450bc4fd96e092a1f7a63dc7369b55d/contourpy-1.3.2-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:b4f54d6a2defe9f257327b0f243612dd051cc43825587520b1bf74a31e2f6ef2", size = 1374075, upload-time = 
"2025-04-15T17:35:43.204Z" }, + { url = "https://files.pythonhosted.org/packages/0c/42/17f3b798fd5e033b46a16f8d9fcb39f1aba051307f5ebf441bad1ecf78f8/contourpy-1.3.2-cp310-cp310-win32.whl", hash = "sha256:f939a054192ddc596e031e50bb13b657ce318cf13d264f095ce9db7dc6ae81c0", size = 177534, upload-time = "2025-04-15T17:35:46.554Z" }, + { url = "https://files.pythonhosted.org/packages/54/ec/5162b8582f2c994721018d0c9ece9dc6ff769d298a8ac6b6a652c307e7df/contourpy-1.3.2-cp310-cp310-win_amd64.whl", hash = "sha256:c440093bbc8fc21c637c03bafcbef95ccd963bc6e0514ad887932c18ca2a759a", size = 221188, upload-time = "2025-04-15T17:35:50.064Z" }, + { url = "https://files.pythonhosted.org/packages/b3/b9/ede788a0b56fc5b071639d06c33cb893f68b1178938f3425debebe2dab78/contourpy-1.3.2-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:6a37a2fb93d4df3fc4c0e363ea4d16f83195fc09c891bc8ce072b9d084853445", size = 269636, upload-time = "2025-04-15T17:35:54.473Z" }, + { url = "https://files.pythonhosted.org/packages/e6/75/3469f011d64b8bbfa04f709bfc23e1dd71be54d05b1b083be9f5b22750d1/contourpy-1.3.2-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:b7cd50c38f500bbcc9b6a46643a40e0913673f869315d8e70de0438817cb7773", size = 254636, upload-time = "2025-04-15T17:35:58.283Z" }, + { url = "https://files.pythonhosted.org/packages/8d/2f/95adb8dae08ce0ebca4fd8e7ad653159565d9739128b2d5977806656fcd2/contourpy-1.3.2-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:d6658ccc7251a4433eebd89ed2672c2ed96fba367fd25ca9512aa92a4b46c4f1", size = 313053, upload-time = "2025-04-15T17:36:03.235Z" }, + { url = "https://files.pythonhosted.org/packages/c3/a6/8ccf97a50f31adfa36917707fe39c9a0cbc24b3bbb58185577f119736cc9/contourpy-1.3.2-cp311-cp311-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:70771a461aaeb335df14deb6c97439973d253ae70660ca085eec25241137ef43", size = 352985, upload-time = "2025-04-15T17:36:08.275Z" }, + { url = 
"https://files.pythonhosted.org/packages/1d/b6/7925ab9b77386143f39d9c3243fdd101621b4532eb126743201160ffa7e6/contourpy-1.3.2-cp311-cp311-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:65a887a6e8c4cd0897507d814b14c54a8c2e2aa4ac9f7686292f9769fcf9a6ab", size = 323750, upload-time = "2025-04-15T17:36:13.29Z" }, + { url = "https://files.pythonhosted.org/packages/c2/f3/20c5d1ef4f4748e52d60771b8560cf00b69d5c6368b5c2e9311bcfa2a08b/contourpy-1.3.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:3859783aefa2b8355697f16642695a5b9792e7a46ab86da1118a4a23a51a33d7", size = 326246, upload-time = "2025-04-15T17:36:18.329Z" }, + { url = "https://files.pythonhosted.org/packages/8c/e5/9dae809e7e0b2d9d70c52b3d24cba134dd3dad979eb3e5e71f5df22ed1f5/contourpy-1.3.2-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:eab0f6db315fa4d70f1d8ab514e527f0366ec021ff853d7ed6a2d33605cf4b83", size = 1308728, upload-time = "2025-04-15T17:36:33.878Z" }, + { url = "https://files.pythonhosted.org/packages/e2/4a/0058ba34aeea35c0b442ae61a4f4d4ca84d6df8f91309bc2d43bb8dd248f/contourpy-1.3.2-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:d91a3ccc7fea94ca0acab82ceb77f396d50a1f67412efe4c526f5d20264e6ecd", size = 1375762, upload-time = "2025-04-15T17:36:51.295Z" }, + { url = "https://files.pythonhosted.org/packages/09/33/7174bdfc8b7767ef2c08ed81244762d93d5c579336fc0b51ca57b33d1b80/contourpy-1.3.2-cp311-cp311-win32.whl", hash = "sha256:1c48188778d4d2f3d48e4643fb15d8608b1d01e4b4d6b0548d9b336c28fc9b6f", size = 178196, upload-time = "2025-04-15T17:36:55.002Z" }, + { url = "https://files.pythonhosted.org/packages/5e/fe/4029038b4e1c4485cef18e480b0e2cd2d755448bb071eb9977caac80b77b/contourpy-1.3.2-cp311-cp311-win_amd64.whl", hash = "sha256:5ebac872ba09cb8f2131c46b8739a7ff71de28a24c869bcad554477eb089a878", size = 222017, upload-time = "2025-04-15T17:36:58.576Z" }, + { url = 
"https://files.pythonhosted.org/packages/34/f7/44785876384eff370c251d58fd65f6ad7f39adce4a093c934d4a67a7c6b6/contourpy-1.3.2-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:4caf2bcd2969402bf77edc4cb6034c7dd7c0803213b3523f111eb7460a51b8d2", size = 271580, upload-time = "2025-04-15T17:37:03.105Z" }, + { url = "https://files.pythonhosted.org/packages/93/3b/0004767622a9826ea3d95f0e9d98cd8729015768075d61f9fea8eeca42a8/contourpy-1.3.2-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:82199cb78276249796419fe36b7386bd8d2cc3f28b3bc19fe2454fe2e26c4c15", size = 255530, upload-time = "2025-04-15T17:37:07.026Z" }, + { url = "https://files.pythonhosted.org/packages/e7/bb/7bd49e1f4fa805772d9fd130e0d375554ebc771ed7172f48dfcd4ca61549/contourpy-1.3.2-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:106fab697af11456fcba3e352ad50effe493a90f893fca6c2ca5c033820cea92", size = 307688, upload-time = "2025-04-15T17:37:11.481Z" }, + { url = "https://files.pythonhosted.org/packages/fc/97/e1d5dbbfa170725ef78357a9a0edc996b09ae4af170927ba8ce977e60a5f/contourpy-1.3.2-cp312-cp312-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:d14f12932a8d620e307f715857107b1d1845cc44fdb5da2bc8e850f5ceba9f87", size = 347331, upload-time = "2025-04-15T17:37:18.212Z" }, + { url = "https://files.pythonhosted.org/packages/6f/66/e69e6e904f5ecf6901be3dd16e7e54d41b6ec6ae3405a535286d4418ffb4/contourpy-1.3.2-cp312-cp312-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:532fd26e715560721bb0d5fc7610fce279b3699b018600ab999d1be895b09415", size = 318963, upload-time = "2025-04-15T17:37:22.76Z" }, + { url = "https://files.pythonhosted.org/packages/a8/32/b8a1c8965e4f72482ff2d1ac2cd670ce0b542f203c8e1d34e7c3e6925da7/contourpy-1.3.2-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:f26b383144cf2d2c29f01a1e8170f50dacf0eac02d64139dcd709a8ac4eb3cfe", size = 323681, upload-time = "2025-04-15T17:37:33.001Z" }, + { url = 
"https://files.pythonhosted.org/packages/30/c6/12a7e6811d08757c7162a541ca4c5c6a34c0f4e98ef2b338791093518e40/contourpy-1.3.2-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:c49f73e61f1f774650a55d221803b101d966ca0c5a2d6d5e4320ec3997489441", size = 1308674, upload-time = "2025-04-15T17:37:48.64Z" }, + { url = "https://files.pythonhosted.org/packages/2a/8a/bebe5a3f68b484d3a2b8ffaf84704b3e343ef1addea528132ef148e22b3b/contourpy-1.3.2-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:3d80b2c0300583228ac98d0a927a1ba6a2ba6b8a742463c564f1d419ee5b211e", size = 1380480, upload-time = "2025-04-15T17:38:06.7Z" }, + { url = "https://files.pythonhosted.org/packages/34/db/fcd325f19b5978fb509a7d55e06d99f5f856294c1991097534360b307cf1/contourpy-1.3.2-cp312-cp312-win32.whl", hash = "sha256:90df94c89a91b7362e1142cbee7568f86514412ab8a2c0d0fca72d7e91b62912", size = 178489, upload-time = "2025-04-15T17:38:10.338Z" }, + { url = "https://files.pythonhosted.org/packages/01/c8/fadd0b92ffa7b5eb5949bf340a63a4a496a6930a6c37a7ba0f12acb076d6/contourpy-1.3.2-cp312-cp312-win_amd64.whl", hash = "sha256:8c942a01d9163e2e5cfb05cb66110121b8d07ad438a17f9e766317bcb62abf73", size = 223042, upload-time = "2025-04-15T17:38:14.239Z" }, + { url = "https://files.pythonhosted.org/packages/2e/61/5673f7e364b31e4e7ef6f61a4b5121c5f170f941895912f773d95270f3a2/contourpy-1.3.2-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:de39db2604ae755316cb5967728f4bea92685884b1e767b7c24e983ef5f771cb", size = 271630, upload-time = "2025-04-15T17:38:19.142Z" }, + { url = "https://files.pythonhosted.org/packages/ff/66/a40badddd1223822c95798c55292844b7e871e50f6bfd9f158cb25e0bd39/contourpy-1.3.2-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:3f9e896f447c5c8618f1edb2bafa9a4030f22a575ec418ad70611450720b5b08", size = 255670, upload-time = "2025-04-15T17:38:23.688Z" }, + { url = 
"https://files.pythonhosted.org/packages/1e/c7/cf9fdee8200805c9bc3b148f49cb9482a4e3ea2719e772602a425c9b09f8/contourpy-1.3.2-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:71e2bd4a1c4188f5c2b8d274da78faab884b59df20df63c34f74aa1813c4427c", size = 306694, upload-time = "2025-04-15T17:38:28.238Z" }, + { url = "https://files.pythonhosted.org/packages/dd/e7/ccb9bec80e1ba121efbffad7f38021021cda5be87532ec16fd96533bb2e0/contourpy-1.3.2-cp313-cp313-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:de425af81b6cea33101ae95ece1f696af39446db9682a0b56daaa48cfc29f38f", size = 345986, upload-time = "2025-04-15T17:38:33.502Z" }, + { url = "https://files.pythonhosted.org/packages/dc/49/ca13bb2da90391fa4219fdb23b078d6065ada886658ac7818e5441448b78/contourpy-1.3.2-cp313-cp313-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:977e98a0e0480d3fe292246417239d2d45435904afd6d7332d8455981c408b85", size = 318060, upload-time = "2025-04-15T17:38:38.672Z" }, + { url = "https://files.pythonhosted.org/packages/c8/65/5245ce8c548a8422236c13ffcdcdada6a2a812c361e9e0c70548bb40b661/contourpy-1.3.2-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:434f0adf84911c924519d2b08fc10491dd282b20bdd3fa8f60fd816ea0b48841", size = 322747, upload-time = "2025-04-15T17:38:43.712Z" }, + { url = "https://files.pythonhosted.org/packages/72/30/669b8eb48e0a01c660ead3752a25b44fdb2e5ebc13a55782f639170772f9/contourpy-1.3.2-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:c66c4906cdbc50e9cba65978823e6e00b45682eb09adbb78c9775b74eb222422", size = 1308895, upload-time = "2025-04-15T17:39:00.224Z" }, + { url = "https://files.pythonhosted.org/packages/05/5a/b569f4250decee6e8d54498be7bdf29021a4c256e77fe8138c8319ef8eb3/contourpy-1.3.2-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:8b7fc0cd78ba2f4695fd0a6ad81a19e7e3ab825c31b577f384aa9d7817dc3bef", size = 1379098, upload-time = "2025-04-15T17:43:29.649Z" }, + { url = 
"https://files.pythonhosted.org/packages/19/ba/b227c3886d120e60e41b28740ac3617b2f2b971b9f601c835661194579f1/contourpy-1.3.2-cp313-cp313-win32.whl", hash = "sha256:15ce6ab60957ca74cff444fe66d9045c1fd3e92c8936894ebd1f3eef2fff075f", size = 178535, upload-time = "2025-04-15T17:44:44.532Z" }, + { url = "https://files.pythonhosted.org/packages/12/6e/2fed56cd47ca739b43e892707ae9a13790a486a3173be063681ca67d2262/contourpy-1.3.2-cp313-cp313-win_amd64.whl", hash = "sha256:e1578f7eafce927b168752ed7e22646dad6cd9bca673c60bff55889fa236ebf9", size = 223096, upload-time = "2025-04-15T17:44:48.194Z" }, + { url = "https://files.pythonhosted.org/packages/54/4c/e76fe2a03014a7c767d79ea35c86a747e9325537a8b7627e0e5b3ba266b4/contourpy-1.3.2-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:0475b1f6604896bc7c53bb070e355e9321e1bc0d381735421a2d2068ec56531f", size = 285090, upload-time = "2025-04-15T17:43:34.084Z" }, + { url = "https://files.pythonhosted.org/packages/7b/e2/5aba47debd55d668e00baf9651b721e7733975dc9fc27264a62b0dd26eb8/contourpy-1.3.2-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:c85bb486e9be652314bb5b9e2e3b0d1b2e643d5eec4992c0fbe8ac71775da739", size = 268643, upload-time = "2025-04-15T17:43:38.626Z" }, + { url = "https://files.pythonhosted.org/packages/a1/37/cd45f1f051fe6230f751cc5cdd2728bb3a203f5619510ef11e732109593c/contourpy-1.3.2-cp313-cp313t-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:745b57db7758f3ffc05a10254edd3182a2a83402a89c00957a8e8a22f5582823", size = 310443, upload-time = "2025-04-15T17:43:44.522Z" }, + { url = "https://files.pythonhosted.org/packages/8b/a2/36ea6140c306c9ff6dd38e3bcec80b3b018474ef4d17eb68ceecd26675f4/contourpy-1.3.2-cp313-cp313t-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:970e9173dbd7eba9b4e01aab19215a48ee5dd3f43cef736eebde064a171f89a5", size = 349865, upload-time = "2025-04-15T17:43:49.545Z" }, + { url = 
"https://files.pythonhosted.org/packages/95/b7/2fc76bc539693180488f7b6cc518da7acbbb9e3b931fd9280504128bf956/contourpy-1.3.2-cp313-cp313t-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:c6c4639a9c22230276b7bffb6a850dfc8258a2521305e1faefe804d006b2e532", size = 321162, upload-time = "2025-04-15T17:43:54.203Z" }, + { url = "https://files.pythonhosted.org/packages/f4/10/76d4f778458b0aa83f96e59d65ece72a060bacb20cfbee46cf6cd5ceba41/contourpy-1.3.2-cp313-cp313t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:cc829960f34ba36aad4302e78eabf3ef16a3a100863f0d4eeddf30e8a485a03b", size = 327355, upload-time = "2025-04-15T17:44:01.025Z" }, + { url = "https://files.pythonhosted.org/packages/43/a3/10cf483ea683f9f8ab096c24bad3cce20e0d1dd9a4baa0e2093c1c962d9d/contourpy-1.3.2-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:d32530b534e986374fc19eaa77fcb87e8a99e5431499949b828312bdcd20ac52", size = 1307935, upload-time = "2025-04-15T17:44:17.322Z" }, + { url = "https://files.pythonhosted.org/packages/78/73/69dd9a024444489e22d86108e7b913f3528f56cfc312b5c5727a44188471/contourpy-1.3.2-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:e298e7e70cf4eb179cc1077be1c725b5fd131ebc81181bf0c03525c8abc297fd", size = 1372168, upload-time = "2025-04-15T17:44:33.43Z" }, + { url = "https://files.pythonhosted.org/packages/0f/1b/96d586ccf1b1a9d2004dd519b25fbf104a11589abfd05484ff12199cca21/contourpy-1.3.2-cp313-cp313t-win32.whl", hash = "sha256:d0e589ae0d55204991450bb5c23f571c64fe43adaa53f93fc902a84c96f52fe1", size = 189550, upload-time = "2025-04-15T17:44:37.092Z" }, + { url = "https://files.pythonhosted.org/packages/b0/e6/6000d0094e8a5e32ad62591c8609e269febb6e4db83a1c75ff8868b42731/contourpy-1.3.2-cp313-cp313t-win_amd64.whl", hash = "sha256:78e9253c3de756b3f6a5174d024c4835acd59eb3f8e2ca13e775dbffe1558f69", size = 238214, upload-time = "2025-04-15T17:44:40.827Z" }, + { url = 
"https://files.pythonhosted.org/packages/33/05/b26e3c6ecc05f349ee0013f0bb850a761016d89cec528a98193a48c34033/contourpy-1.3.2-pp310-pypy310_pp73-macosx_10_15_x86_64.whl", hash = "sha256:fd93cc7f3139b6dd7aab2f26a90dde0aa9fc264dbf70f6740d498a70b860b82c", size = 265681, upload-time = "2025-04-15T17:44:59.314Z" }, + { url = "https://files.pythonhosted.org/packages/2b/25/ac07d6ad12affa7d1ffed11b77417d0a6308170f44ff20fa1d5aa6333f03/contourpy-1.3.2-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:107ba8a6a7eec58bb475329e6d3b95deba9440667c4d62b9b6063942b61d7f16", size = 315101, upload-time = "2025-04-15T17:45:04.165Z" }, + { url = "https://files.pythonhosted.org/packages/8f/4d/5bb3192bbe9d3f27e3061a6a8e7733c9120e203cb8515767d30973f71030/contourpy-1.3.2-pp310-pypy310_pp73-win_amd64.whl", hash = "sha256:ded1706ed0c1049224531b81128efbd5084598f18d8a2d9efae833edbd2b40ad", size = 220599, upload-time = "2025-04-15T17:45:08.456Z" }, + { url = "https://files.pythonhosted.org/packages/ff/c0/91f1215d0d9f9f343e4773ba6c9b89e8c0cc7a64a6263f21139da639d848/contourpy-1.3.2-pp311-pypy311_pp73-macosx_10_15_x86_64.whl", hash = "sha256:5f5964cdad279256c084b69c3f412b7801e15356b16efa9d78aa974041903da0", size = 266807, upload-time = "2025-04-15T17:45:15.535Z" }, + { url = "https://files.pythonhosted.org/packages/d4/79/6be7e90c955c0487e7712660d6cead01fa17bff98e0ea275737cc2bc8e71/contourpy-1.3.2-pp311-pypy311_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:49b65a95d642d4efa8f64ba12558fcb83407e58a2dfba9d796d77b63ccfcaff5", size = 318729, upload-time = "2025-04-15T17:45:20.166Z" }, + { url = "https://files.pythonhosted.org/packages/87/68/7f46fb537958e87427d98a4074bcde4b67a70b04900cfc5ce29bc2f556c1/contourpy-1.3.2-pp311-pypy311_pp73-win_amd64.whl", hash = "sha256:8c5acb8dddb0752bf252e01a3035b21443158910ac16a3b0d20e7fed7d534ce5", size = 221791, upload-time = "2025-04-15T17:45:24.794Z" }, +] + +[[package]] +name = "contourpy" +version = "1.3.3" 
+source = { registry = "https://pypi.org/simple" } +resolution-markers = [ + "python_full_version >= '3.14' and sys_platform == 'linux'", + "python_full_version == '3.13.*' and sys_platform == 'linux'", + "python_full_version == '3.12.*' and sys_platform == 'linux'", + "python_full_version == '3.11.*' and sys_platform == 'linux'", + "python_full_version >= '3.14' and sys_platform != 'linux'", + "python_full_version == '3.13.*' and sys_platform != 'linux'", + "python_full_version == '3.12.*' and sys_platform != 'linux'", + "python_full_version == '3.11.*' and sys_platform != 'linux'", +] +dependencies = [ + { name = "numpy", version = "2.4.0", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.11'" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/58/01/1253e6698a07380cd31a736d248a3f2a50a7c88779a1813da27503cadc2a/contourpy-1.3.3.tar.gz", hash = "sha256:083e12155b210502d0bca491432bb04d56dc3432f95a979b429f2848c3dbe880", size = 13466174, upload-time = "2025-07-26T12:03:12.549Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/91/2e/c4390a31919d8a78b90e8ecf87cd4b4c4f05a5b48d05ec17db8e5404c6f4/contourpy-1.3.3-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:709a48ef9a690e1343202916450bc48b9e51c049b089c7f79a267b46cffcdaa1", size = 288773, upload-time = "2025-07-26T12:01:02.277Z" }, + { url = "https://files.pythonhosted.org/packages/0d/44/c4b0b6095fef4dc9c420e041799591e3b63e9619e3044f7f4f6c21c0ab24/contourpy-1.3.3-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:23416f38bfd74d5d28ab8429cc4d63fa67d5068bd711a85edb1c3fb0c3e2f381", size = 270149, upload-time = "2025-07-26T12:01:04.072Z" }, + { url = "https://files.pythonhosted.org/packages/30/2e/dd4ced42fefac8470661d7cb7e264808425e6c5d56d175291e93890cce09/contourpy-1.3.3-cp311-cp311-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:929ddf8c4c7f348e4c0a5a3a714b5c8542ffaa8c22954862a46ca1813b667ee7", size = 329222, upload-time = 
"2025-07-26T12:01:05.688Z" }, + { url = "https://files.pythonhosted.org/packages/f2/74/cc6ec2548e3d276c71389ea4802a774b7aa3558223b7bade3f25787fafc2/contourpy-1.3.3-cp311-cp311-manylinux_2_26_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:9e999574eddae35f1312c2b4b717b7885d4edd6cb46700e04f7f02db454e67c1", size = 377234, upload-time = "2025-07-26T12:01:07.054Z" }, + { url = "https://files.pythonhosted.org/packages/03/b3/64ef723029f917410f75c09da54254c5f9ea90ef89b143ccadb09df14c15/contourpy-1.3.3-cp311-cp311-manylinux_2_26_s390x.manylinux_2_28_s390x.whl", hash = "sha256:0bf67e0e3f482cb69779dd3061b534eb35ac9b17f163d851e2a547d56dba0a3a", size = 380555, upload-time = "2025-07-26T12:01:08.801Z" }, + { url = "https://files.pythonhosted.org/packages/5f/4b/6157f24ca425b89fe2eb7e7be642375711ab671135be21e6faa100f7448c/contourpy-1.3.3-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:51e79c1f7470158e838808d4a996fa9bac72c498e93d8ebe5119bc1e6becb0db", size = 355238, upload-time = "2025-07-26T12:01:10.319Z" }, + { url = "https://files.pythonhosted.org/packages/98/56/f914f0dd678480708a04cfd2206e7c382533249bc5001eb9f58aa693e200/contourpy-1.3.3-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:598c3aaece21c503615fd59c92a3598b428b2f01bfb4b8ca9c4edeecc2438620", size = 1326218, upload-time = "2025-07-26T12:01:12.659Z" }, + { url = "https://files.pythonhosted.org/packages/fb/d7/4a972334a0c971acd5172389671113ae82aa7527073980c38d5868ff1161/contourpy-1.3.3-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:322ab1c99b008dad206d406bb61d014cf0174df491ae9d9d0fac6a6fda4f977f", size = 1392867, upload-time = "2025-07-26T12:01:15.533Z" }, + { url = "https://files.pythonhosted.org/packages/75/3e/f2cc6cd56dc8cff46b1a56232eabc6feea52720083ea71ab15523daab796/contourpy-1.3.3-cp311-cp311-win32.whl", hash = "sha256:fd907ae12cd483cd83e414b12941c632a969171bf90fc937d0c9f268a31cafff", size = 183677, upload-time = "2025-07-26T12:01:17.088Z" }, + { url = 
"https://files.pythonhosted.org/packages/98/4b/9bd370b004b5c9d8045c6c33cf65bae018b27aca550a3f657cdc99acdbd8/contourpy-1.3.3-cp311-cp311-win_amd64.whl", hash = "sha256:3519428f6be58431c56581f1694ba8e50626f2dd550af225f82fb5f5814d2a42", size = 225234, upload-time = "2025-07-26T12:01:18.256Z" }, + { url = "https://files.pythonhosted.org/packages/d9/b6/71771e02c2e004450c12b1120a5f488cad2e4d5b590b1af8bad060360fe4/contourpy-1.3.3-cp311-cp311-win_arm64.whl", hash = "sha256:15ff10bfada4bf92ec8b31c62bf7c1834c244019b4a33095a68000d7075df470", size = 193123, upload-time = "2025-07-26T12:01:19.848Z" }, + { url = "https://files.pythonhosted.org/packages/be/45/adfee365d9ea3d853550b2e735f9d66366701c65db7855cd07621732ccfc/contourpy-1.3.3-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:b08a32ea2f8e42cf1d4be3169a98dd4be32bafe4f22b6c4cb4ba810fa9e5d2cb", size = 293419, upload-time = "2025-07-26T12:01:21.16Z" }, + { url = "https://files.pythonhosted.org/packages/53/3e/405b59cfa13021a56bba395a6b3aca8cec012b45bf177b0eaf7a202cde2c/contourpy-1.3.3-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:556dba8fb6f5d8742f2923fe9457dbdd51e1049c4a43fd3986a0b14a1d815fc6", size = 273979, upload-time = "2025-07-26T12:01:22.448Z" }, + { url = "https://files.pythonhosted.org/packages/d4/1c/a12359b9b2ca3a845e8f7f9ac08bdf776114eb931392fcad91743e2ea17b/contourpy-1.3.3-cp312-cp312-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:92d9abc807cf7d0e047b95ca5d957cf4792fcd04e920ca70d48add15c1a90ea7", size = 332653, upload-time = "2025-07-26T12:01:24.155Z" }, + { url = "https://files.pythonhosted.org/packages/63/12/897aeebfb475b7748ea67b61e045accdfcf0d971f8a588b67108ed7f5512/contourpy-1.3.3-cp312-cp312-manylinux_2_26_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:b2e8faa0ed68cb29af51edd8e24798bb661eac3bd9f65420c1887b6ca89987c8", size = 379536, upload-time = "2025-07-26T12:01:25.91Z" }, + { url = 
"https://files.pythonhosted.org/packages/43/8a/a8c584b82deb248930ce069e71576fc09bd7174bbd35183b7943fb1064fd/contourpy-1.3.3-cp312-cp312-manylinux_2_26_s390x.manylinux_2_28_s390x.whl", hash = "sha256:626d60935cf668e70a5ce6ff184fd713e9683fb458898e4249b63be9e28286ea", size = 384397, upload-time = "2025-07-26T12:01:27.152Z" }, + { url = "https://files.pythonhosted.org/packages/cc/8f/ec6289987824b29529d0dfda0d74a07cec60e54b9c92f3c9da4c0ac732de/contourpy-1.3.3-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:4d00e655fcef08aba35ec9610536bfe90267d7ab5ba944f7032549c55a146da1", size = 362601, upload-time = "2025-07-26T12:01:28.808Z" }, + { url = "https://files.pythonhosted.org/packages/05/0a/a3fe3be3ee2dceb3e615ebb4df97ae6f3828aa915d3e10549ce016302bd1/contourpy-1.3.3-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:451e71b5a7d597379ef572de31eeb909a87246974d960049a9848c3bc6c41bf7", size = 1331288, upload-time = "2025-07-26T12:01:31.198Z" }, + { url = "https://files.pythonhosted.org/packages/33/1d/acad9bd4e97f13f3e2b18a3977fe1b4a37ecf3d38d815333980c6c72e963/contourpy-1.3.3-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:459c1f020cd59fcfe6650180678a9993932d80d44ccde1fa1868977438f0b411", size = 1403386, upload-time = "2025-07-26T12:01:33.947Z" }, + { url = "https://files.pythonhosted.org/packages/cf/8f/5847f44a7fddf859704217a99a23a4f6417b10e5ab1256a179264561540e/contourpy-1.3.3-cp312-cp312-win32.whl", hash = "sha256:023b44101dfe49d7d53932be418477dba359649246075c996866106da069af69", size = 185018, upload-time = "2025-07-26T12:01:35.64Z" }, + { url = "https://files.pythonhosted.org/packages/19/e8/6026ed58a64563186a9ee3f29f41261fd1828f527dd93d33b60feca63352/contourpy-1.3.3-cp312-cp312-win_amd64.whl", hash = "sha256:8153b8bfc11e1e4d75bcb0bff1db232f9e10b274e0929de9d608027e0d34ff8b", size = 226567, upload-time = "2025-07-26T12:01:36.804Z" }, + { url = 
"https://files.pythonhosted.org/packages/d1/e2/f05240d2c39a1ed228d8328a78b6f44cd695f7ef47beb3e684cf93604f86/contourpy-1.3.3-cp312-cp312-win_arm64.whl", hash = "sha256:07ce5ed73ecdc4a03ffe3e1b3e3c1166db35ae7584be76f65dbbe28a7791b0cc", size = 193655, upload-time = "2025-07-26T12:01:37.999Z" }, + { url = "https://files.pythonhosted.org/packages/68/35/0167aad910bbdb9599272bd96d01a9ec6852f36b9455cf2ca67bd4cc2d23/contourpy-1.3.3-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:177fb367556747a686509d6fef71d221a4b198a3905fe824430e5ea0fda54eb5", size = 293257, upload-time = "2025-07-26T12:01:39.367Z" }, + { url = "https://files.pythonhosted.org/packages/96/e4/7adcd9c8362745b2210728f209bfbcf7d91ba868a2c5f40d8b58f54c509b/contourpy-1.3.3-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:d002b6f00d73d69333dac9d0b8d5e84d9724ff9ef044fd63c5986e62b7c9e1b1", size = 274034, upload-time = "2025-07-26T12:01:40.645Z" }, + { url = "https://files.pythonhosted.org/packages/73/23/90e31ceeed1de63058a02cb04b12f2de4b40e3bef5e082a7c18d9c8ae281/contourpy-1.3.3-cp313-cp313-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:348ac1f5d4f1d66d3322420f01d42e43122f43616e0f194fc1c9f5d830c5b286", size = 334672, upload-time = "2025-07-26T12:01:41.942Z" }, + { url = "https://files.pythonhosted.org/packages/ed/93/b43d8acbe67392e659e1d984700e79eb67e2acb2bd7f62012b583a7f1b55/contourpy-1.3.3-cp313-cp313-manylinux_2_26_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:655456777ff65c2c548b7c454af9c6f33f16c8884f11083244b5819cc214f1b5", size = 381234, upload-time = "2025-07-26T12:01:43.499Z" }, + { url = "https://files.pythonhosted.org/packages/46/3b/bec82a3ea06f66711520f75a40c8fc0b113b2a75edb36aa633eb11c4f50f/contourpy-1.3.3-cp313-cp313-manylinux_2_26_s390x.manylinux_2_28_s390x.whl", hash = "sha256:644a6853d15b2512d67881586bd03f462c7ab755db95f16f14d7e238f2852c67", size = 385169, upload-time = "2025-07-26T12:01:45.219Z" }, + { url = 
"https://files.pythonhosted.org/packages/4b/32/e0f13a1c5b0f8572d0ec6ae2f6c677b7991fafd95da523159c19eff0696a/contourpy-1.3.3-cp313-cp313-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:4debd64f124ca62069f313a9cb86656ff087786016d76927ae2cf37846b006c9", size = 362859, upload-time = "2025-07-26T12:01:46.519Z" }, + { url = "https://files.pythonhosted.org/packages/33/71/e2a7945b7de4e58af42d708a219f3b2f4cff7386e6b6ab0a0fa0033c49a9/contourpy-1.3.3-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:a15459b0f4615b00bbd1e91f1b9e19b7e63aea7483d03d804186f278c0af2659", size = 1332062, upload-time = "2025-07-26T12:01:48.964Z" }, + { url = "https://files.pythonhosted.org/packages/12/fc/4e87ac754220ccc0e807284f88e943d6d43b43843614f0a8afa469801db0/contourpy-1.3.3-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:ca0fdcd73925568ca027e0b17ab07aad764be4706d0a925b89227e447d9737b7", size = 1403932, upload-time = "2025-07-26T12:01:51.979Z" }, + { url = "https://files.pythonhosted.org/packages/a6/2e/adc197a37443f934594112222ac1aa7dc9a98faf9c3842884df9a9d8751d/contourpy-1.3.3-cp313-cp313-win32.whl", hash = "sha256:b20c7c9a3bf701366556e1b1984ed2d0cedf999903c51311417cf5f591d8c78d", size = 185024, upload-time = "2025-07-26T12:01:53.245Z" }, + { url = "https://files.pythonhosted.org/packages/18/0b/0098c214843213759692cc638fce7de5c289200a830e5035d1791d7a2338/contourpy-1.3.3-cp313-cp313-win_amd64.whl", hash = "sha256:1cadd8b8969f060ba45ed7c1b714fe69185812ab43bd6b86a9123fe8f99c3263", size = 226578, upload-time = "2025-07-26T12:01:54.422Z" }, + { url = "https://files.pythonhosted.org/packages/8a/9a/2f6024a0c5995243cd63afdeb3651c984f0d2bc727fd98066d40e141ad73/contourpy-1.3.3-cp313-cp313-win_arm64.whl", hash = "sha256:fd914713266421b7536de2bfa8181aa8c699432b6763a0ea64195ebe28bff6a9", size = 193524, upload-time = "2025-07-26T12:01:55.73Z" }, + { url = 
"https://files.pythonhosted.org/packages/c0/b3/f8a1a86bd3298513f500e5b1f5fd92b69896449f6cab6a146a5d52715479/contourpy-1.3.3-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:88df9880d507169449d434c293467418b9f6cbe82edd19284aa0409e7fdb933d", size = 306730, upload-time = "2025-07-26T12:01:57.051Z" }, + { url = "https://files.pythonhosted.org/packages/3f/11/4780db94ae62fc0c2053909b65dc3246bd7cecfc4f8a20d957ad43aa4ad8/contourpy-1.3.3-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:d06bb1f751ba5d417047db62bca3c8fde202b8c11fb50742ab3ab962c81e8216", size = 287897, upload-time = "2025-07-26T12:01:58.663Z" }, + { url = "https://files.pythonhosted.org/packages/ae/15/e59f5f3ffdd6f3d4daa3e47114c53daabcb18574a26c21f03dc9e4e42ff0/contourpy-1.3.3-cp313-cp313t-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:e4e6b05a45525357e382909a4c1600444e2a45b4795163d3b22669285591c1ae", size = 326751, upload-time = "2025-07-26T12:02:00.343Z" }, + { url = "https://files.pythonhosted.org/packages/0f/81/03b45cfad088e4770b1dcf72ea78d3802d04200009fb364d18a493857210/contourpy-1.3.3-cp313-cp313t-manylinux_2_26_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:ab3074b48c4e2cf1a960e6bbeb7f04566bf36b1861d5c9d4d8ac04b82e38ba20", size = 375486, upload-time = "2025-07-26T12:02:02.128Z" }, + { url = "https://files.pythonhosted.org/packages/0c/ba/49923366492ffbdd4486e970d421b289a670ae8cf539c1ea9a09822b371a/contourpy-1.3.3-cp313-cp313t-manylinux_2_26_s390x.manylinux_2_28_s390x.whl", hash = "sha256:6c3d53c796f8647d6deb1abe867daeb66dcc8a97e8455efa729516b997b8ed99", size = 388106, upload-time = "2025-07-26T12:02:03.615Z" }, + { url = "https://files.pythonhosted.org/packages/9f/52/5b00ea89525f8f143651f9f03a0df371d3cbd2fccd21ca9b768c7a6500c2/contourpy-1.3.3-cp313-cp313t-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:50ed930df7289ff2a8d7afeb9603f8289e5704755c7e5c3bbd929c90c817164b", size = 352548, upload-time = "2025-07-26T12:02:05.165Z" }, + { url = 
"https://files.pythonhosted.org/packages/32/1d/a209ec1a3a3452d490f6b14dd92e72280c99ae3d1e73da74f8277d4ee08f/contourpy-1.3.3-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:4feffb6537d64b84877da813a5c30f1422ea5739566abf0bd18065ac040e120a", size = 1322297, upload-time = "2025-07-26T12:02:07.379Z" }, + { url = "https://files.pythonhosted.org/packages/bc/9e/46f0e8ebdd884ca0e8877e46a3f4e633f6c9c8c4f3f6e72be3fe075994aa/contourpy-1.3.3-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:2b7e9480ffe2b0cd2e787e4df64270e3a0440d9db8dc823312e2c940c167df7e", size = 1391023, upload-time = "2025-07-26T12:02:10.171Z" }, + { url = "https://files.pythonhosted.org/packages/b9/70/f308384a3ae9cd2209e0849f33c913f658d3326900d0ff5d378d6a1422d2/contourpy-1.3.3-cp313-cp313t-win32.whl", hash = "sha256:283edd842a01e3dcd435b1c5116798d661378d83d36d337b8dde1d16a5fc9ba3", size = 196157, upload-time = "2025-07-26T12:02:11.488Z" }, + { url = "https://files.pythonhosted.org/packages/b2/dd/880f890a6663b84d9e34a6f88cded89d78f0091e0045a284427cb6b18521/contourpy-1.3.3-cp313-cp313t-win_amd64.whl", hash = "sha256:87acf5963fc2b34825e5b6b048f40e3635dd547f590b04d2ab317c2619ef7ae8", size = 240570, upload-time = "2025-07-26T12:02:12.754Z" }, + { url = "https://files.pythonhosted.org/packages/80/99/2adc7d8ffead633234817ef8e9a87115c8a11927a94478f6bb3d3f4d4f7d/contourpy-1.3.3-cp313-cp313t-win_arm64.whl", hash = "sha256:3c30273eb2a55024ff31ba7d052dde990d7d8e5450f4bbb6e913558b3d6c2301", size = 199713, upload-time = "2025-07-26T12:02:14.4Z" }, + { url = "https://files.pythonhosted.org/packages/72/8b/4546f3ab60f78c514ffb7d01a0bd743f90de36f0019d1be84d0a708a580a/contourpy-1.3.3-cp314-cp314-macosx_10_13_x86_64.whl", hash = "sha256:fde6c716d51c04b1c25d0b90364d0be954624a0ee9d60e23e850e8d48353d07a", size = 292189, upload-time = "2025-07-26T12:02:16.095Z" }, + { url = 
"https://files.pythonhosted.org/packages/fd/e1/3542a9cb596cadd76fcef413f19c79216e002623158befe6daa03dbfa88c/contourpy-1.3.3-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:cbedb772ed74ff5be440fa8eee9bd49f64f6e3fc09436d9c7d8f1c287b121d77", size = 273251, upload-time = "2025-07-26T12:02:17.524Z" }, + { url = "https://files.pythonhosted.org/packages/b1/71/f93e1e9471d189f79d0ce2497007731c1e6bf9ef6d1d61b911430c3db4e5/contourpy-1.3.3-cp314-cp314-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:22e9b1bd7a9b1d652cd77388465dc358dafcd2e217d35552424aa4f996f524f5", size = 335810, upload-time = "2025-07-26T12:02:18.9Z" }, + { url = "https://files.pythonhosted.org/packages/91/f9/e35f4c1c93f9275d4e38681a80506b5510e9327350c51f8d4a5a724d178c/contourpy-1.3.3-cp314-cp314-manylinux_2_26_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:a22738912262aa3e254e4f3cb079a95a67132fc5a063890e224393596902f5a4", size = 382871, upload-time = "2025-07-26T12:02:20.418Z" }, + { url = "https://files.pythonhosted.org/packages/b5/71/47b512f936f66a0a900d81c396a7e60d73419868fba959c61efed7a8ab46/contourpy-1.3.3-cp314-cp314-manylinux_2_26_s390x.manylinux_2_28_s390x.whl", hash = "sha256:afe5a512f31ee6bd7d0dda52ec9864c984ca3d66664444f2d72e0dc4eb832e36", size = 386264, upload-time = "2025-07-26T12:02:21.916Z" }, + { url = "https://files.pythonhosted.org/packages/04/5f/9ff93450ba96b09c7c2b3f81c94de31c89f92292f1380261bd7195bea4ea/contourpy-1.3.3-cp314-cp314-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:f64836de09927cba6f79dcd00fdd7d5329f3fccc633468507079c829ca4db4e3", size = 363819, upload-time = "2025-07-26T12:02:23.759Z" }, + { url = "https://files.pythonhosted.org/packages/3e/a6/0b185d4cc480ee494945cde102cb0149ae830b5fa17bf855b95f2e70ad13/contourpy-1.3.3-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:1fd43c3be4c8e5fd6e4f2baeae35ae18176cf2e5cced681cca908addf1cdd53b", size = 1333650, upload-time = "2025-07-26T12:02:26.181Z" }, + { url = 
"https://files.pythonhosted.org/packages/43/d7/afdc95580ca56f30fbcd3060250f66cedbde69b4547028863abd8aa3b47e/contourpy-1.3.3-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:6afc576f7b33cf00996e5c1102dc2a8f7cc89e39c0b55df93a0b78c1bd992b36", size = 1404833, upload-time = "2025-07-26T12:02:28.782Z" }, + { url = "https://files.pythonhosted.org/packages/e2/e2/366af18a6d386f41132a48f033cbd2102e9b0cf6345d35ff0826cd984566/contourpy-1.3.3-cp314-cp314-win32.whl", hash = "sha256:66c8a43a4f7b8df8b71ee1840e4211a3c8d93b214b213f590e18a1beca458f7d", size = 189692, upload-time = "2025-07-26T12:02:30.128Z" }, + { url = "https://files.pythonhosted.org/packages/7d/c2/57f54b03d0f22d4044b8afb9ca0e184f8b1afd57b4f735c2fa70883dc601/contourpy-1.3.3-cp314-cp314-win_amd64.whl", hash = "sha256:cf9022ef053f2694e31d630feaacb21ea24224be1c3ad0520b13d844274614fd", size = 232424, upload-time = "2025-07-26T12:02:31.395Z" }, + { url = "https://files.pythonhosted.org/packages/18/79/a9416650df9b525737ab521aa181ccc42d56016d2123ddcb7b58e926a42c/contourpy-1.3.3-cp314-cp314-win_arm64.whl", hash = "sha256:95b181891b4c71de4bb404c6621e7e2390745f887f2a026b2d99e92c17892339", size = 198300, upload-time = "2025-07-26T12:02:32.956Z" }, + { url = "https://files.pythonhosted.org/packages/1f/42/38c159a7d0f2b7b9c04c64ab317042bb6952b713ba875c1681529a2932fe/contourpy-1.3.3-cp314-cp314t-macosx_10_13_x86_64.whl", hash = "sha256:33c82d0138c0a062380332c861387650c82e4cf1747aaa6938b9b6516762e772", size = 306769, upload-time = "2025-07-26T12:02:34.2Z" }, + { url = "https://files.pythonhosted.org/packages/c3/6c/26a8205f24bca10974e77460de68d3d7c63e282e23782f1239f226fcae6f/contourpy-1.3.3-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:ea37e7b45949df430fe649e5de8351c423430046a2af20b1c1961cae3afcda77", size = 287892, upload-time = "2025-07-26T12:02:35.807Z" }, + { url = 
"https://files.pythonhosted.org/packages/66/06/8a475c8ab718ebfd7925661747dbb3c3ee9c82ac834ccb3570be49d129f4/contourpy-1.3.3-cp314-cp314t-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:d304906ecc71672e9c89e87c4675dc5c2645e1f4269a5063b99b0bb29f232d13", size = 326748, upload-time = "2025-07-26T12:02:37.193Z" }, + { url = "https://files.pythonhosted.org/packages/b4/a3/c5ca9f010a44c223f098fccd8b158bb1cb287378a31ac141f04730dc49be/contourpy-1.3.3-cp314-cp314t-manylinux_2_26_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:ca658cd1a680a5c9ea96dc61cdbae1e85c8f25849843aa799dfd3cb370ad4fbe", size = 375554, upload-time = "2025-07-26T12:02:38.894Z" }, + { url = "https://files.pythonhosted.org/packages/80/5b/68bd33ae63fac658a4145088c1e894405e07584a316738710b636c6d0333/contourpy-1.3.3-cp314-cp314t-manylinux_2_26_s390x.manylinux_2_28_s390x.whl", hash = "sha256:ab2fd90904c503739a75b7c8c5c01160130ba67944a7b77bbf36ef8054576e7f", size = 388118, upload-time = "2025-07-26T12:02:40.642Z" }, + { url = "https://files.pythonhosted.org/packages/40/52/4c285a6435940ae25d7410a6c36bda5145839bc3f0beb20c707cda18b9d2/contourpy-1.3.3-cp314-cp314t-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:b7301b89040075c30e5768810bc96a8e8d78085b47d8be6e4c3f5a0b4ed478a0", size = 352555, upload-time = "2025-07-26T12:02:42.25Z" }, + { url = "https://files.pythonhosted.org/packages/24/ee/3e81e1dd174f5c7fefe50e85d0892de05ca4e26ef1c9a59c2a57e43b865a/contourpy-1.3.3-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:2a2a8b627d5cc6b7c41a4beff6c5ad5eb848c88255fda4a8745f7e901b32d8e4", size = 1322295, upload-time = "2025-07-26T12:02:44.668Z" }, + { url = "https://files.pythonhosted.org/packages/3c/b2/6d913d4d04e14379de429057cd169e5e00f6c2af3bb13e1710bcbdb5da12/contourpy-1.3.3-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:fd6ec6be509c787f1caf6b247f0b1ca598bef13f4ddeaa126b7658215529ba0f", size = 1391027, upload-time = "2025-07-26T12:02:47.09Z" }, + { url = 
"https://files.pythonhosted.org/packages/93/8a/68a4ec5c55a2971213d29a9374913f7e9f18581945a7a31d1a39b5d2dfe5/contourpy-1.3.3-cp314-cp314t-win32.whl", hash = "sha256:e74a9a0f5e3fff48fb5a7f2fd2b9b70a3fe014a67522f79b7cca4c0c7e43c9ae", size = 202428, upload-time = "2025-07-26T12:02:48.691Z" }, + { url = "https://files.pythonhosted.org/packages/fa/96/fd9f641ffedc4fa3ace923af73b9d07e869496c9cc7a459103e6e978992f/contourpy-1.3.3-cp314-cp314t-win_amd64.whl", hash = "sha256:13b68d6a62db8eafaebb8039218921399baf6e47bf85006fd8529f2a08ef33fc", size = 250331, upload-time = "2025-07-26T12:02:50.137Z" }, + { url = "https://files.pythonhosted.org/packages/ae/8c/469afb6465b853afff216f9528ffda78a915ff880ed58813ba4faf4ba0b6/contourpy-1.3.3-cp314-cp314t-win_arm64.whl", hash = "sha256:b7448cb5a725bb1e35ce88771b86fba35ef418952474492cf7c764059933ff8b", size = 203831, upload-time = "2025-07-26T12:02:51.449Z" }, + { url = "https://files.pythonhosted.org/packages/a5/29/8dcfe16f0107943fa92388c23f6e05cff0ba58058c4c95b00280d4c75a14/contourpy-1.3.3-pp311-pypy311_pp73-macosx_10_15_x86_64.whl", hash = "sha256:cd5dfcaeb10f7b7f9dc8941717c6c2ade08f587be2226222c12b25f0483ed497", size = 278809, upload-time = "2025-07-26T12:02:52.74Z" }, + { url = "https://files.pythonhosted.org/packages/85/a9/8b37ef4f7dafeb335daee3c8254645ef5725be4d9c6aa70b50ec46ef2f7e/contourpy-1.3.3-pp311-pypy311_pp73-macosx_11_0_arm64.whl", hash = "sha256:0c1fc238306b35f246d61a1d416a627348b5cf0648648a031e14bb8705fcdfe8", size = 261593, upload-time = "2025-07-26T12:02:54.037Z" }, + { url = "https://files.pythonhosted.org/packages/0a/59/ebfb8c677c75605cc27f7122c90313fd2f375ff3c8d19a1694bda74aaa63/contourpy-1.3.3-pp311-pypy311_pp73-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:70f9aad7de812d6541d29d2bbf8feb22ff7e1c299523db288004e3157ff4674e", size = 302202, upload-time = "2025-07-26T12:02:55.947Z" }, + { url = 
"https://files.pythonhosted.org/packages/3c/37/21972a15834d90bfbfb009b9d004779bd5a07a0ec0234e5ba8f64d5736f4/contourpy-1.3.3-pp311-pypy311_pp73-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:5ed3657edf08512fc3fe81b510e35c2012fbd3081d2e26160f27ca28affec989", size = 329207, upload-time = "2025-07-26T12:02:57.468Z" }, + { url = "https://files.pythonhosted.org/packages/0c/58/bd257695f39d05594ca4ad60df5bcb7e32247f9951fd09a9b8edb82d1daa/contourpy-1.3.3-pp311-pypy311_pp73-win_amd64.whl", hash = "sha256:3d1a3799d62d45c18bafd41c5fa05120b96a28079f2393af559b843d1a966a77", size = 225315, upload-time = "2025-07-26T12:02:58.801Z" }, +] + [[package]] name = "coverage" version = "7.13.0" @@ -845,92 +1056,44 @@ toml = [ [[package]] name = "cryptography" -version = "46.0.3" +version = "43.0.3" source = { registry = "https://pypi.org/simple" } dependencies = [ { name = "cffi", marker = "platform_python_implementation != 'PyPy'" }, - { name = "typing-extensions", marker = "python_full_version < '3.11'" }, ] -sdist = { url = "https://files.pythonhosted.org/packages/9f/33/c00162f49c0e2fe8064a62cb92b93e50c74a72bc370ab92f86112b33ff62/cryptography-46.0.3.tar.gz", hash = "sha256:a8b17438104fed022ce745b362294d9ce35b4c2e45c1d958ad4a4b019285f4a1", size = 749258, upload-time = "2025-10-15T23:18:31.74Z" } -wheels = [ - { url = "https://files.pythonhosted.org/packages/1d/42/9c391dd801d6cf0d561b5890549d4b27bafcc53b39c31a817e69d87c625b/cryptography-46.0.3-cp311-abi3-macosx_10_9_universal2.whl", hash = "sha256:109d4ddfadf17e8e7779c39f9b18111a09efb969a301a31e987416a0191ed93a", size = 7225004, upload-time = "2025-10-15T23:16:52.239Z" }, - { url = "https://files.pythonhosted.org/packages/1c/67/38769ca6b65f07461eb200e85fc1639b438bdc667be02cf7f2cd6a64601c/cryptography-46.0.3-cp311-abi3-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:09859af8466b69bc3c27bdf4f5d84a665e0f7ab5088412e9e2ec49758eca5cbc", size = 4296667, upload-time = "2025-10-15T23:16:54.369Z" }, - { url = 
"https://files.pythonhosted.org/packages/5c/49/498c86566a1d80e978b42f0d702795f69887005548c041636df6ae1ca64c/cryptography-46.0.3-cp311-abi3-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:01ca9ff2885f3acc98c29f1860552e37f6d7c7d013d7334ff2a9de43a449315d", size = 4450807, upload-time = "2025-10-15T23:16:56.414Z" }, - { url = "https://files.pythonhosted.org/packages/4b/0a/863a3604112174c8624a2ac3c038662d9e59970c7f926acdcfaed8d61142/cryptography-46.0.3-cp311-abi3-manylinux_2_28_aarch64.whl", hash = "sha256:6eae65d4c3d33da080cff9c4ab1f711b15c1d9760809dad6ea763f3812d254cb", size = 4299615, upload-time = "2025-10-15T23:16:58.442Z" }, - { url = "https://files.pythonhosted.org/packages/64/02/b73a533f6b64a69f3cd3872acb6ebc12aef924d8d103133bb3ea750dc703/cryptography-46.0.3-cp311-abi3-manylinux_2_28_armv7l.manylinux_2_31_armv7l.whl", hash = "sha256:e5bf0ed4490068a2e72ac03d786693adeb909981cc596425d09032d372bcc849", size = 4016800, upload-time = "2025-10-15T23:17:00.378Z" }, - { url = "https://files.pythonhosted.org/packages/25/d5/16e41afbfa450cde85a3b7ec599bebefaef16b5c6ba4ec49a3532336ed72/cryptography-46.0.3-cp311-abi3-manylinux_2_28_ppc64le.whl", hash = "sha256:5ecfccd2329e37e9b7112a888e76d9feca2347f12f37918facbb893d7bb88ee8", size = 4984707, upload-time = "2025-10-15T23:17:01.98Z" }, - { url = "https://files.pythonhosted.org/packages/c9/56/e7e69b427c3878352c2fb9b450bd0e19ed552753491d39d7d0a2f5226d41/cryptography-46.0.3-cp311-abi3-manylinux_2_28_x86_64.whl", hash = "sha256:a2c0cd47381a3229c403062f764160d57d4d175e022c1df84e168c6251a22eec", size = 4482541, upload-time = "2025-10-15T23:17:04.078Z" }, - { url = "https://files.pythonhosted.org/packages/78/f6/50736d40d97e8483172f1bb6e698895b92a223dba513b0ca6f06b2365339/cryptography-46.0.3-cp311-abi3-manylinux_2_34_aarch64.whl", hash = "sha256:549e234ff32571b1f4076ac269fcce7a808d3bf98b76c8dd560e42dbc66d7d91", size = 4299464, upload-time = "2025-10-15T23:17:05.483Z" }, - { url = 
"https://files.pythonhosted.org/packages/00/de/d8e26b1a855f19d9994a19c702fa2e93b0456beccbcfe437eda00e0701f2/cryptography-46.0.3-cp311-abi3-manylinux_2_34_ppc64le.whl", hash = "sha256:c0a7bb1a68a5d3471880e264621346c48665b3bf1c3759d682fc0864c540bd9e", size = 4950838, upload-time = "2025-10-15T23:17:07.425Z" }, - { url = "https://files.pythonhosted.org/packages/8f/29/798fc4ec461a1c9e9f735f2fc58741b0daae30688f41b2497dcbc9ed1355/cryptography-46.0.3-cp311-abi3-manylinux_2_34_x86_64.whl", hash = "sha256:10b01676fc208c3e6feeb25a8b83d81767e8059e1fe86e1dc62d10a3018fa926", size = 4481596, upload-time = "2025-10-15T23:17:09.343Z" }, - { url = "https://files.pythonhosted.org/packages/15/8d/03cd48b20a573adfff7652b76271078e3045b9f49387920e7f1f631d125e/cryptography-46.0.3-cp311-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:0abf1ffd6e57c67e92af68330d05760b7b7efb243aab8377e583284dbab72c71", size = 4426782, upload-time = "2025-10-15T23:17:11.22Z" }, - { url = "https://files.pythonhosted.org/packages/fa/b1/ebacbfe53317d55cf33165bda24c86523497a6881f339f9aae5c2e13e57b/cryptography-46.0.3-cp311-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:a04bee9ab6a4da801eb9b51f1b708a1b5b5c9eb48c03f74198464c66f0d344ac", size = 4698381, upload-time = "2025-10-15T23:17:12.829Z" }, - { url = "https://files.pythonhosted.org/packages/96/92/8a6a9525893325fc057a01f654d7efc2c64b9de90413adcf605a85744ff4/cryptography-46.0.3-cp311-abi3-win32.whl", hash = "sha256:f260d0d41e9b4da1ed1e0f1ce571f97fe370b152ab18778e9e8f67d6af432018", size = 3055988, upload-time = "2025-10-15T23:17:14.65Z" }, - { url = "https://files.pythonhosted.org/packages/7e/bf/80fbf45253ea585a1e492a6a17efcb93467701fa79e71550a430c5e60df0/cryptography-46.0.3-cp311-abi3-win_amd64.whl", hash = "sha256:a9a3008438615669153eb86b26b61e09993921ebdd75385ddd748702c5adfddb", size = 3514451, upload-time = "2025-10-15T23:17:16.142Z" }, - { url = 
"https://files.pythonhosted.org/packages/2e/af/9b302da4c87b0beb9db4e756386a7c6c5b8003cd0e742277888d352ae91d/cryptography-46.0.3-cp311-abi3-win_arm64.whl", hash = "sha256:5d7f93296ee28f68447397bf5198428c9aeeab45705a55d53a6343455dcb2c3c", size = 2928007, upload-time = "2025-10-15T23:17:18.04Z" }, - { url = "https://files.pythonhosted.org/packages/f5/e2/a510aa736755bffa9d2f75029c229111a1d02f8ecd5de03078f4c18d91a3/cryptography-46.0.3-cp314-cp314t-macosx_10_9_universal2.whl", hash = "sha256:00a5e7e87938e5ff9ff5447ab086a5706a957137e6e433841e9d24f38a065217", size = 7158012, upload-time = "2025-10-15T23:17:19.982Z" }, - { url = "https://files.pythonhosted.org/packages/73/dc/9aa866fbdbb95b02e7f9d086f1fccfeebf8953509b87e3f28fff927ff8a0/cryptography-46.0.3-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:c8daeb2d2174beb4575b77482320303f3d39b8e81153da4f0fb08eb5fe86a6c5", size = 4288728, upload-time = "2025-10-15T23:17:21.527Z" }, - { url = "https://files.pythonhosted.org/packages/c5/fd/bc1daf8230eaa075184cbbf5f8cd00ba9db4fd32d63fb83da4671b72ed8a/cryptography-46.0.3-cp314-cp314t-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:39b6755623145ad5eff1dab323f4eae2a32a77a7abef2c5089a04a3d04366715", size = 4435078, upload-time = "2025-10-15T23:17:23.042Z" }, - { url = "https://files.pythonhosted.org/packages/82/98/d3bd5407ce4c60017f8ff9e63ffee4200ab3e23fe05b765cab805a7db008/cryptography-46.0.3-cp314-cp314t-manylinux_2_28_aarch64.whl", hash = "sha256:db391fa7c66df6762ee3f00c95a89e6d428f4d60e7abc8328f4fe155b5ac6e54", size = 4293460, upload-time = "2025-10-15T23:17:24.885Z" }, - { url = "https://files.pythonhosted.org/packages/26/e9/e23e7900983c2b8af7a08098db406cf989d7f09caea7897e347598d4cd5b/cryptography-46.0.3-cp314-cp314t-manylinux_2_28_armv7l.manylinux_2_31_armv7l.whl", hash = "sha256:78a97cf6a8839a48c49271cdcbd5cf37ca2c1d6b7fdd86cc864f302b5e9bf459", size = 3995237, upload-time = "2025-10-15T23:17:26.449Z" }, - { url = 
"https://files.pythonhosted.org/packages/91/15/af68c509d4a138cfe299d0d7ddb14afba15233223ebd933b4bbdbc7155d3/cryptography-46.0.3-cp314-cp314t-manylinux_2_28_ppc64le.whl", hash = "sha256:dfb781ff7eaa91a6f7fd41776ec37c5853c795d3b358d4896fdbb5df168af422", size = 4967344, upload-time = "2025-10-15T23:17:28.06Z" }, - { url = "https://files.pythonhosted.org/packages/ca/e3/8643d077c53868b681af077edf6b3cb58288b5423610f21c62aadcbe99f4/cryptography-46.0.3-cp314-cp314t-manylinux_2_28_x86_64.whl", hash = "sha256:6f61efb26e76c45c4a227835ddeae96d83624fb0d29eb5df5b96e14ed1a0afb7", size = 4466564, upload-time = "2025-10-15T23:17:29.665Z" }, - { url = "https://files.pythonhosted.org/packages/0e/43/c1e8726fa59c236ff477ff2b5dc071e54b21e5a1e51aa2cee1676f1c986f/cryptography-46.0.3-cp314-cp314t-manylinux_2_34_aarch64.whl", hash = "sha256:23b1a8f26e43f47ceb6d6a43115f33a5a37d57df4ea0ca295b780ae8546e8044", size = 4292415, upload-time = "2025-10-15T23:17:31.686Z" }, - { url = "https://files.pythonhosted.org/packages/42/f9/2f8fefdb1aee8a8e3256a0568cffc4e6d517b256a2fe97a029b3f1b9fe7e/cryptography-46.0.3-cp314-cp314t-manylinux_2_34_ppc64le.whl", hash = "sha256:b419ae593c86b87014b9be7396b385491ad7f320bde96826d0dd174459e54665", size = 4931457, upload-time = "2025-10-15T23:17:33.478Z" }, - { url = "https://files.pythonhosted.org/packages/79/30/9b54127a9a778ccd6d27c3da7563e9f2d341826075ceab89ae3b41bf5be2/cryptography-46.0.3-cp314-cp314t-manylinux_2_34_x86_64.whl", hash = "sha256:50fc3343ac490c6b08c0cf0d704e881d0d660be923fd3076db3e932007e726e3", size = 4466074, upload-time = "2025-10-15T23:17:35.158Z" }, - { url = "https://files.pythonhosted.org/packages/ac/68/b4f4a10928e26c941b1b6a179143af9f4d27d88fe84a6a3c53592d2e76bf/cryptography-46.0.3-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:22d7e97932f511d6b0b04f2bfd818d73dcd5928db509460aaf48384778eb6d20", size = 4420569, upload-time = "2025-10-15T23:17:37.188Z" }, - { url = 
"https://files.pythonhosted.org/packages/a3/49/3746dab4c0d1979888f125226357d3262a6dd40e114ac29e3d2abdf1ec55/cryptography-46.0.3-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:d55f3dffadd674514ad19451161118fd010988540cee43d8bc20675e775925de", size = 4681941, upload-time = "2025-10-15T23:17:39.236Z" }, - { url = "https://files.pythonhosted.org/packages/fd/30/27654c1dbaf7e4a3531fa1fc77986d04aefa4d6d78259a62c9dc13d7ad36/cryptography-46.0.3-cp314-cp314t-win32.whl", hash = "sha256:8a6e050cb6164d3f830453754094c086ff2d0b2f3a897a1d9820f6139a1f0914", size = 3022339, upload-time = "2025-10-15T23:17:40.888Z" }, - { url = "https://files.pythonhosted.org/packages/f6/30/640f34ccd4d2a1bc88367b54b926b781b5a018d65f404d409aba76a84b1c/cryptography-46.0.3-cp314-cp314t-win_amd64.whl", hash = "sha256:760f83faa07f8b64e9c33fc963d790a2edb24efb479e3520c14a45741cd9b2db", size = 3494315, upload-time = "2025-10-15T23:17:42.769Z" }, - { url = "https://files.pythonhosted.org/packages/ba/8b/88cc7e3bd0a8e7b861f26981f7b820e1f46aa9d26cc482d0feba0ecb4919/cryptography-46.0.3-cp314-cp314t-win_arm64.whl", hash = "sha256:516ea134e703e9fe26bcd1277a4b59ad30586ea90c365a87781d7887a646fe21", size = 2919331, upload-time = "2025-10-15T23:17:44.468Z" }, - { url = "https://files.pythonhosted.org/packages/fd/23/45fe7f376a7df8daf6da3556603b36f53475a99ce4faacb6ba2cf3d82021/cryptography-46.0.3-cp38-abi3-macosx_10_9_universal2.whl", hash = "sha256:cb3d760a6117f621261d662bccc8ef5bc32ca673e037c83fbe565324f5c46936", size = 7218248, upload-time = "2025-10-15T23:17:46.294Z" }, - { url = "https://files.pythonhosted.org/packages/27/32/b68d27471372737054cbd34c84981f9edbc24fe67ca225d389799614e27f/cryptography-46.0.3-cp38-abi3-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:4b7387121ac7d15e550f5cb4a43aef2559ed759c35df7336c402bb8275ac9683", size = 4294089, upload-time = "2025-10-15T23:17:48.269Z" }, - { url = 
"https://files.pythonhosted.org/packages/26/42/fa8389d4478368743e24e61eea78846a0006caffaf72ea24a15159215a14/cryptography-46.0.3-cp38-abi3-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:15ab9b093e8f09daab0f2159bb7e47532596075139dd74365da52ecc9cb46c5d", size = 4440029, upload-time = "2025-10-15T23:17:49.837Z" }, - { url = "https://files.pythonhosted.org/packages/5f/eb/f483db0ec5ac040824f269e93dd2bd8a21ecd1027e77ad7bdf6914f2fd80/cryptography-46.0.3-cp38-abi3-manylinux_2_28_aarch64.whl", hash = "sha256:46acf53b40ea38f9c6c229599a4a13f0d46a6c3fa9ef19fc1a124d62e338dfa0", size = 4297222, upload-time = "2025-10-15T23:17:51.357Z" }, - { url = "https://files.pythonhosted.org/packages/fd/cf/da9502c4e1912cb1da3807ea3618a6829bee8207456fbbeebc361ec38ba3/cryptography-46.0.3-cp38-abi3-manylinux_2_28_armv7l.manylinux_2_31_armv7l.whl", hash = "sha256:10ca84c4668d066a9878890047f03546f3ae0a6b8b39b697457b7757aaf18dbc", size = 4012280, upload-time = "2025-10-15T23:17:52.964Z" }, - { url = "https://files.pythonhosted.org/packages/6b/8f/9adb86b93330e0df8b3dcf03eae67c33ba89958fc2e03862ef1ac2b42465/cryptography-46.0.3-cp38-abi3-manylinux_2_28_ppc64le.whl", hash = "sha256:36e627112085bb3b81b19fed209c05ce2a52ee8b15d161b7c643a7d5a88491f3", size = 4978958, upload-time = "2025-10-15T23:17:54.965Z" }, - { url = "https://files.pythonhosted.org/packages/d1/a0/5fa77988289c34bdb9f913f5606ecc9ada1adb5ae870bd0d1054a7021cc4/cryptography-46.0.3-cp38-abi3-manylinux_2_28_x86_64.whl", hash = "sha256:1000713389b75c449a6e979ffc7dcc8ac90b437048766cef052d4d30b8220971", size = 4473714, upload-time = "2025-10-15T23:17:56.754Z" }, - { url = "https://files.pythonhosted.org/packages/14/e5/fc82d72a58d41c393697aa18c9abe5ae1214ff6f2a5c18ac470f92777895/cryptography-46.0.3-cp38-abi3-manylinux_2_34_aarch64.whl", hash = "sha256:b02cf04496f6576afffef5ddd04a0cb7d49cf6be16a9059d793a30b035f6b6ac", size = 4296970, upload-time = "2025-10-15T23:17:58.588Z" }, - { url = 
"https://files.pythonhosted.org/packages/78/06/5663ed35438d0b09056973994f1aec467492b33bd31da36e468b01ec1097/cryptography-46.0.3-cp38-abi3-manylinux_2_34_ppc64le.whl", hash = "sha256:71e842ec9bc7abf543b47cf86b9a743baa95f4677d22baa4c7d5c69e49e9bc04", size = 4940236, upload-time = "2025-10-15T23:18:00.897Z" }, - { url = "https://files.pythonhosted.org/packages/fc/59/873633f3f2dcd8a053b8dd1d38f783043b5fce589c0f6988bf55ef57e43e/cryptography-46.0.3-cp38-abi3-manylinux_2_34_x86_64.whl", hash = "sha256:402b58fc32614f00980b66d6e56a5b4118e6cb362ae8f3fda141ba4689bd4506", size = 4472642, upload-time = "2025-10-15T23:18:02.749Z" }, - { url = "https://files.pythonhosted.org/packages/3d/39/8e71f3930e40f6877737d6f69248cf74d4e34b886a3967d32f919cc50d3b/cryptography-46.0.3-cp38-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:ef639cb3372f69ec44915fafcd6698b6cc78fbe0c2ea41be867f6ed612811963", size = 4423126, upload-time = "2025-10-15T23:18:04.85Z" }, - { url = "https://files.pythonhosted.org/packages/cd/c7/f65027c2810e14c3e7268353b1681932b87e5a48e65505d8cc17c99e36ae/cryptography-46.0.3-cp38-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:3b51b8ca4f1c6453d8829e1eb7299499ca7f313900dd4d89a24b8b87c0a780d4", size = 4686573, upload-time = "2025-10-15T23:18:06.908Z" }, - { url = "https://files.pythonhosted.org/packages/0a/6e/1c8331ddf91ca4730ab3086a0f1be19c65510a33b5a441cb334e7a2d2560/cryptography-46.0.3-cp38-abi3-win32.whl", hash = "sha256:6276eb85ef938dc035d59b87c8a7dc559a232f954962520137529d77b18ff1df", size = 3036695, upload-time = "2025-10-15T23:18:08.672Z" }, - { url = "https://files.pythonhosted.org/packages/90/45/b0d691df20633eff80955a0fc7695ff9051ffce8b69741444bd9ed7bd0db/cryptography-46.0.3-cp38-abi3-win_amd64.whl", hash = "sha256:416260257577718c05135c55958b674000baef9a1c7d9e8f306ec60d71db850f", size = 3501720, upload-time = "2025-10-15T23:18:10.632Z" }, - { url = 
"https://files.pythonhosted.org/packages/e8/cb/2da4cc83f5edb9c3257d09e1e7ab7b23f049c7962cae8d842bbef0a9cec9/cryptography-46.0.3-cp38-abi3-win_arm64.whl", hash = "sha256:d89c3468de4cdc4f08a57e214384d0471911a3830fcdaf7a8cc587e42a866372", size = 2918740, upload-time = "2025-10-15T23:18:12.277Z" }, - { url = "https://files.pythonhosted.org/packages/d9/cd/1a8633802d766a0fa46f382a77e096d7e209e0817892929655fe0586ae32/cryptography-46.0.3-pp310-pypy310_pp73-macosx_10_9_x86_64.whl", hash = "sha256:a23582810fedb8c0bc47524558fb6c56aac3fc252cb306072fd2815da2a47c32", size = 3689163, upload-time = "2025-10-15T23:18:13.821Z" }, - { url = "https://files.pythonhosted.org/packages/4c/59/6b26512964ace6480c3e54681a9859c974172fb141c38df11eadd8416947/cryptography-46.0.3-pp310-pypy310_pp73-win_amd64.whl", hash = "sha256:e7aec276d68421f9574040c26e2a7c3771060bc0cff408bae1dcb19d3ab1e63c", size = 3429474, upload-time = "2025-10-15T23:18:15.477Z" }, - { url = "https://files.pythonhosted.org/packages/06/8a/e60e46adab4362a682cf142c7dcb5bf79b782ab2199b0dcb81f55970807f/cryptography-46.0.3-pp311-pypy311_pp73-macosx_10_9_x86_64.whl", hash = "sha256:7ce938a99998ed3c8aa7e7272dca1a610401ede816d36d0693907d863b10d9ea", size = 3698132, upload-time = "2025-10-15T23:18:17.056Z" }, - { url = "https://files.pythonhosted.org/packages/da/38/f59940ec4ee91e93d3311f7532671a5cef5570eb04a144bf203b58552d11/cryptography-46.0.3-pp311-pypy311_pp73-manylinux_2_28_aarch64.whl", hash = "sha256:191bb60a7be5e6f54e30ba16fdfae78ad3a342a0599eb4193ba88e3f3d6e185b", size = 4243992, upload-time = "2025-10-15T23:18:18.695Z" }, - { url = "https://files.pythonhosted.org/packages/b0/0c/35b3d92ddebfdfda76bb485738306545817253d0a3ded0bfe80ef8e67aa5/cryptography-46.0.3-pp311-pypy311_pp73-manylinux_2_28_x86_64.whl", hash = "sha256:c70cc23f12726be8f8bc72e41d5065d77e4515efae3690326764ea1b07845cfb", size = 4409944, upload-time = "2025-10-15T23:18:20.597Z" }, - { url = 
"https://files.pythonhosted.org/packages/99/55/181022996c4063fc0e7666a47049a1ca705abb9c8a13830f074edb347495/cryptography-46.0.3-pp311-pypy311_pp73-manylinux_2_34_aarch64.whl", hash = "sha256:9394673a9f4de09e28b5356e7fff97d778f8abad85c9d5ac4a4b7e25a0de7717", size = 4242957, upload-time = "2025-10-15T23:18:22.18Z" }, - { url = "https://files.pythonhosted.org/packages/ba/af/72cd6ef29f9c5f731251acadaeb821559fe25f10852f44a63374c9ca08c1/cryptography-46.0.3-pp311-pypy311_pp73-manylinux_2_34_x86_64.whl", hash = "sha256:94cd0549accc38d1494e1f8de71eca837d0509d0d44bf11d158524b0e12cebf9", size = 4409447, upload-time = "2025-10-15T23:18:24.209Z" }, - { url = "https://files.pythonhosted.org/packages/0d/c3/e90f4a4feae6410f914f8ebac129b9ae7a8c92eb60a638012dde42030a9d/cryptography-46.0.3-pp311-pypy311_pp73-win_amd64.whl", hash = "sha256:6b5063083824e5509fdba180721d55909ffacccc8adbec85268b48439423d78c", size = 3438528, upload-time = "2025-10-15T23:18:26.227Z" }, -] - -[[package]] -name = "cuda-bindings" -version = "12.9.4" -source = { registry = "https://pypi.org/simple" } -dependencies = [ - { name = "cuda-pathfinder", marker = "sys_platform == 'linux'" }, -] -wheels = [ - { url = "https://files.pythonhosted.org/packages/7a/d8/b546104b8da3f562c1ff8ab36d130c8fe1dd6a045ced80b4f6ad74f7d4e1/cuda_bindings-12.9.4-cp310-cp310-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:4d3c842c2a4303b2a580fe955018e31aea30278be19795ae05226235268032e5", size = 12148218, upload-time = "2025-10-21T14:51:28.855Z" }, - { url = "https://files.pythonhosted.org/packages/45/e7/b47792cc2d01c7e1d37c32402182524774dadd2d26339bd224e0e913832e/cuda_bindings-12.9.4-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:c912a3d9e6b6651853eed8eed96d6800d69c08e94052c292fec3f282c5a817c9", size = 12210593, upload-time = "2025-10-21T14:51:36.574Z" }, - { url = 
"https://files.pythonhosted.org/packages/a9/c1/dabe88f52c3e3760d861401bb994df08f672ec893b8f7592dc91626adcf3/cuda_bindings-12.9.4-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:fda147a344e8eaeca0c6ff113d2851ffca8f7dfc0a6c932374ee5c47caa649c8", size = 12151019, upload-time = "2025-10-21T14:51:43.167Z" }, - { url = "https://files.pythonhosted.org/packages/63/56/e465c31dc9111be3441a9ba7df1941fe98f4aa6e71e8788a3fb4534ce24d/cuda_bindings-12.9.4-cp313-cp313-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:32bdc5a76906be4c61eb98f546a6786c5773a881f3b166486449b5d141e4a39f", size = 11906628, upload-time = "2025-10-21T14:51:49.905Z" }, - { url = "https://files.pythonhosted.org/packages/a3/84/1e6be415e37478070aeeee5884c2022713c1ecc735e6d82d744de0252eee/cuda_bindings-12.9.4-cp313-cp313t-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:56e0043c457a99ac473ddc926fe0dc4046694d99caef633e92601ab52cbe17eb", size = 11925991, upload-time = "2025-10-21T14:51:56.535Z" }, - { url = "https://files.pythonhosted.org/packages/d1/af/6dfd8f2ed90b1d4719bc053ff8940e494640fe4212dc3dd72f383e4992da/cuda_bindings-12.9.4-cp314-cp314-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:8b72ee72a9cc1b531db31eebaaee5c69a8ec3500e32c6933f2d3b15297b53686", size = 11922703, upload-time = "2025-10-21T14:52:03.585Z" }, - { url = "https://files.pythonhosted.org/packages/6c/19/90ac264acc00f6df8a49378eedec9fd2db3061bf9263bf9f39fd3d8377c3/cuda_bindings-12.9.4-cp314-cp314t-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:d80bffc357df9988dca279734bc9674c3934a654cab10cadeed27ce17d8635ee", size = 11924658, upload-time = "2025-10-21T14:52:10.411Z" }, -] - -[[package]] -name = "cuda-pathfinder" -version = "1.3.3" +sdist = { url = "https://files.pythonhosted.org/packages/0d/05/07b55d1fa21ac18c3a8c79f764e2514e6f6a9698f1be44994f5adf0d29db/cryptography-43.0.3.tar.gz", hash = 
"sha256:315b9001266a492a6ff443b61238f956b214dbec9910a081ba5b6646a055a805", size = 686989, upload-time = "2024-10-18T15:58:32.918Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/1f/f3/01fdf26701a26f4b4dbc337a26883ad5bccaa6f1bbbdd29cd89e22f18a1c/cryptography-43.0.3-cp37-abi3-macosx_10_9_universal2.whl", hash = "sha256:bf7a1932ac4176486eab36a19ed4c0492da5d97123f1406cf15e41b05e787d2e", size = 6225303, upload-time = "2024-10-18T15:57:36.753Z" }, + { url = "https://files.pythonhosted.org/packages/a3/01/4896f3d1b392025d4fcbecf40fdea92d3df8662123f6835d0af828d148fd/cryptography-43.0.3-cp37-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:63efa177ff54aec6e1c0aefaa1a241232dcd37413835a9b674b6e3f0ae2bfd3e", size = 3760905, upload-time = "2024-10-18T15:57:39.166Z" }, + { url = "https://files.pythonhosted.org/packages/0a/be/f9a1f673f0ed4b7f6c643164e513dbad28dd4f2dcdf5715004f172ef24b6/cryptography-43.0.3-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:7e1ce50266f4f70bf41a2c6dc4358afadae90e2a1e5342d3c08883df1675374f", size = 3977271, upload-time = "2024-10-18T15:57:41.227Z" }, + { url = "https://files.pythonhosted.org/packages/4e/49/80c3a7b5514d1b416d7350830e8c422a4d667b6d9b16a9392ebfd4a5388a/cryptography-43.0.3-cp37-abi3-manylinux_2_28_aarch64.whl", hash = "sha256:443c4a81bb10daed9a8f334365fe52542771f25aedaf889fd323a853ce7377d6", size = 3746606, upload-time = "2024-10-18T15:57:42.903Z" }, + { url = "https://files.pythonhosted.org/packages/0e/16/a28ddf78ac6e7e3f25ebcef69ab15c2c6be5ff9743dd0709a69a4f968472/cryptography-43.0.3-cp37-abi3-manylinux_2_28_x86_64.whl", hash = "sha256:74f57f24754fe349223792466a709f8e0c093205ff0dca557af51072ff47ab18", size = 3986484, upload-time = "2024-10-18T15:57:45.434Z" }, + { url = "https://files.pythonhosted.org/packages/01/f5/69ae8da70c19864a32b0315049866c4d411cce423ec169993d0434218762/cryptography-43.0.3-cp37-abi3-musllinux_1_2_aarch64.whl", hash = 
"sha256:9762ea51a8fc2a88b70cf2995e5675b38d93bf36bd67d91721c309df184f49bd", size = 3852131, upload-time = "2024-10-18T15:57:47.267Z" }, + { url = "https://files.pythonhosted.org/packages/fd/db/e74911d95c040f9afd3612b1f732e52b3e517cb80de8bf183be0b7d413c6/cryptography-43.0.3-cp37-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:81ef806b1fef6b06dcebad789f988d3b37ccaee225695cf3e07648eee0fc6b73", size = 4075647, upload-time = "2024-10-18T15:57:49.684Z" }, + { url = "https://files.pythonhosted.org/packages/56/48/7b6b190f1462818b324e674fa20d1d5ef3e24f2328675b9b16189cbf0b3c/cryptography-43.0.3-cp37-abi3-win32.whl", hash = "sha256:cbeb489927bd7af4aa98d4b261af9a5bc025bd87f0e3547e11584be9e9427be2", size = 2623873, upload-time = "2024-10-18T15:57:51.822Z" }, + { url = "https://files.pythonhosted.org/packages/eb/b1/0ebff61a004f7f89e7b65ca95f2f2375679d43d0290672f7713ee3162aff/cryptography-43.0.3-cp37-abi3-win_amd64.whl", hash = "sha256:f46304d6f0c6ab8e52770addfa2fc41e6629495548862279641972b6215451cd", size = 3068039, upload-time = "2024-10-18T15:57:54.426Z" }, + { url = "https://files.pythonhosted.org/packages/30/d5/c8b32c047e2e81dd172138f772e81d852c51f0f2ad2ae8a24f1122e9e9a7/cryptography-43.0.3-cp39-abi3-macosx_10_9_universal2.whl", hash = "sha256:8ac43ae87929a5982f5948ceda07001ee5e83227fd69cf55b109144938d96984", size = 6222984, upload-time = "2024-10-18T15:57:56.174Z" }, + { url = "https://files.pythonhosted.org/packages/2f/78/55356eb9075d0be6e81b59f45c7b48df87f76a20e73893872170471f3ee8/cryptography-43.0.3-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:846da004a5804145a5f441b8530b4bf35afbf7da70f82409f151695b127213d5", size = 3762968, upload-time = "2024-10-18T15:57:58.206Z" }, + { url = "https://files.pythonhosted.org/packages/2a/2c/488776a3dc843f95f86d2f957ca0fc3407d0242b50bede7fad1e339be03f/cryptography-43.0.3-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:0f996e7268af62598f2fc1204afa98a3b5712313a55c4c9d434aef49cadc91d4", 
size = 3977754, upload-time = "2024-10-18T15:58:00.683Z" }, + { url = "https://files.pythonhosted.org/packages/7c/04/2345ca92f7a22f601a9c62961741ef7dd0127c39f7310dffa0041c80f16f/cryptography-43.0.3-cp39-abi3-manylinux_2_28_aarch64.whl", hash = "sha256:f7b178f11ed3664fd0e995a47ed2b5ff0a12d893e41dd0494f406d1cf555cab7", size = 3749458, upload-time = "2024-10-18T15:58:02.225Z" }, + { url = "https://files.pythonhosted.org/packages/ac/25/e715fa0bc24ac2114ed69da33adf451a38abb6f3f24ec207908112e9ba53/cryptography-43.0.3-cp39-abi3-manylinux_2_28_x86_64.whl", hash = "sha256:c2e6fc39c4ab499049df3bdf567f768a723a5e8464816e8f009f121a5a9f4405", size = 3988220, upload-time = "2024-10-18T15:58:04.331Z" }, + { url = "https://files.pythonhosted.org/packages/21/ce/b9c9ff56c7164d8e2edfb6c9305045fbc0df4508ccfdb13ee66eb8c95b0e/cryptography-43.0.3-cp39-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:e1be4655c7ef6e1bbe6b5d0403526601323420bcf414598955968c9ef3eb7d16", size = 3853898, upload-time = "2024-10-18T15:58:06.113Z" }, + { url = "https://files.pythonhosted.org/packages/2a/33/b3682992ab2e9476b9c81fff22f02c8b0a1e6e1d49ee1750a67d85fd7ed2/cryptography-43.0.3-cp39-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:df6b6c6d742395dd77a23ea3728ab62f98379eff8fb61be2744d4679ab678f73", size = 4076592, upload-time = "2024-10-18T15:58:08.673Z" }, + { url = "https://files.pythonhosted.org/packages/81/1e/ffcc41b3cebd64ca90b28fd58141c5f68c83d48563c88333ab660e002cd3/cryptography-43.0.3-cp39-abi3-win32.whl", hash = "sha256:d56e96520b1020449bbace2b78b603442e7e378a9b3bd68de65c782db1507995", size = 2623145, upload-time = "2024-10-18T15:58:10.264Z" }, + { url = "https://files.pythonhosted.org/packages/87/5c/3dab83cc4aba1f4b0e733e3f0c3e7d4386440d660ba5b1e3ff995feb734d/cryptography-43.0.3-cp39-abi3-win_amd64.whl", hash = "sha256:0c580952eef9bf68c4747774cde7ec1d85a6e61de97281f2dba83c7d2c806362", size = 3068026, upload-time = "2024-10-18T15:58:11.916Z" }, + { url = 
"https://files.pythonhosted.org/packages/6f/db/d8b8a039483f25fc3b70c90bc8f3e1d4497a99358d610c5067bf3bd4f0af/cryptography-43.0.3-pp310-pypy310_pp73-macosx_10_9_x86_64.whl", hash = "sha256:d03b5621a135bffecad2c73e9f4deb1a0f977b9a8ffe6f8e002bf6c9d07b918c", size = 3144545, upload-time = "2024-10-18T15:58:13.572Z" }, + { url = "https://files.pythonhosted.org/packages/93/90/116edd5f8ec23b2dc879f7a42443e073cdad22950d3c8ee834e3b8124543/cryptography-43.0.3-pp310-pypy310_pp73-manylinux_2_28_aarch64.whl", hash = "sha256:a2a431ee15799d6db9fe80c82b055bae5a752bef645bba795e8e52687c69efe3", size = 3679828, upload-time = "2024-10-18T15:58:15.254Z" }, + { url = "https://files.pythonhosted.org/packages/d8/32/1e1d78b316aa22c0ba6493cc271c1c309969e5aa5c22c830a1d7ce3471e6/cryptography-43.0.3-pp310-pypy310_pp73-manylinux_2_28_x86_64.whl", hash = "sha256:281c945d0e28c92ca5e5930664c1cefd85efe80e5c0d2bc58dd63383fda29f83", size = 3908132, upload-time = "2024-10-18T15:58:16.943Z" }, + { url = "https://files.pythonhosted.org/packages/91/bb/cd2c13be3332e7af3cdf16154147952d39075b9f61ea5e6b5241bf4bf436/cryptography-43.0.3-pp310-pypy310_pp73-win_amd64.whl", hash = "sha256:f18c716be16bc1fea8e95def49edf46b82fccaa88587a45f8dc0ff6ab5d8e0a7", size = 2988811, upload-time = "2024-10-18T15:58:19.674Z" }, +] + +[[package]] +name = "cycler" +version = "0.12.1" source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/a9/95/a3dbbb5028f35eafb79008e7522a75244477d2838f38cbb722248dabc2a8/cycler-0.12.1.tar.gz", hash = "sha256:88bb128f02ba341da8ef447245a9e138fae777f6a23943da4540077d3601eb1c", size = 7615, upload-time = "2023-10-07T05:32:18.335Z" } wheels = [ - { url = "https://files.pythonhosted.org/packages/0b/02/4dbe7568a42e46582248942f54dc64ad094769532adbe21e525e4edf7bc4/cuda_pathfinder-1.3.3-py3-none-any.whl", hash = "sha256:9984b664e404f7c134954a771be8775dfd6180ea1e1aef4a5a37d4be05d9bbb1", size = 27154, upload-time = "2025-12-04T22:35:08.996Z" }, + { url = 
"https://files.pythonhosted.org/packages/e7/05/c19819d5e3d95294a6f5947fb9b9629efb316b96de511b418c53d245aae6/cycler-0.12.1-py3-none-any.whl", hash = "sha256:85cef7cff222d8644161529808465972e51340599459b8ac3ccbac5a854e0d30", size = 8321, upload-time = "2023-10-07T05:32:16.783Z" }, ] [[package]] @@ -946,6 +1109,19 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/c3/be/d0d44e092656fe7a06b55e6103cbce807cdbdee17884a5367c68c9860853/dataclasses_json-0.6.7-py3-none-any.whl", hash = "sha256:0dbf33f26c8d5305befd61b39d2b3414e8a407bedc2834dea9b8d642666fb40a", size = 28686, upload-time = "2024-06-09T16:20:16.715Z" }, ] +[[package]] +name = "dataproperty" +version = "1.1.0" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "mbstrdecoder" }, + { name = "typepy", extra = ["datetime"] }, +] +sdist = { url = "https://files.pythonhosted.org/packages/0b/81/8c8b64ae873cb9014815214c07b63b12e3b18835780fb342223cfe3fe7d8/dataproperty-1.1.0.tar.gz", hash = "sha256:b038437a4097d1a1c497695c3586ea34bea67fdd35372b9a50f30bf044d77d04", size = 42574, upload-time = "2024-12-31T14:37:26.033Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/21/c2/e12e95e289e6081a40454199ab213139ef16a528c7c86432de545b05a23a/DataProperty-1.1.0-py3-none-any.whl", hash = "sha256:c61fcb2e2deca35e6d1eb1f251a7f22f0dcde63e80e61f0cc18c19f42abfd25b", size = 27581, upload-time = "2024-12-31T14:37:22.657Z" }, +] + [[package]] name = "datasets" version = "4.0.0" @@ -1075,6 +1251,29 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/55/e2/2537ebcff11c1ee1ff17d8d0b6f4db75873e3b0fb32c2d4a2ee31ecb310a/docstring_parser-0.17.0-py3-none-any.whl", hash = "sha256:cf2569abd23dce8099b300f9b4fa8191e9582dda731fd533daf54c4551658708", size = 36896, upload-time = "2025-07-21T07:35:00.684Z" }, ] +[[package]] +name = "evaluate" +version = "0.4.6" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "datasets" }, + { name = "dill" }, + { name = 
"fsspec" }, + { name = "huggingface-hub" }, + { name = "multiprocess" }, + { name = "numpy", version = "2.2.6", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.11'" }, + { name = "numpy", version = "2.4.0", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.11'" }, + { name = "packaging" }, + { name = "pandas" }, + { name = "requests" }, + { name = "tqdm" }, + { name = "xxhash" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/ad/d0/0c17a8e6e8dc7245f22dea860557c32bae50fc4d287ae030cb0e8ab8720f/evaluate-0.4.6.tar.gz", hash = "sha256:e07036ca12b3c24331f83ab787f21cc2dbf3631813a1631e63e40897c69a3f21", size = 65716, upload-time = "2025-09-18T13:06:30.581Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/3e/af/3e990d8d4002bbc9342adb4facd59506e653da93b2417de0fa6027cb86b1/evaluate-0.4.6-py3-none-any.whl", hash = "sha256:bca85bc294f338377b7ac2f861e21c308b11b2a285f510d7d5394d5df437db29", size = 84069, upload-time = "2025-09-18T13:06:29.265Z" }, +] + [[package]] name = "exceptiongroup" version = "1.3.1" @@ -1215,6 +1414,63 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/ec/f9/7f9263c5695f4bd0023734af91bedb2ff8209e8de6ead162f35d8dc762fd/flask-3.1.2-py3-none-any.whl", hash = "sha256:ca1d8112ec8a6158cc29ea4858963350011b5c846a414cdb7a954aa9e967d03c", size = 103308, upload-time = "2025-08-19T21:03:19.499Z" }, ] +[[package]] +name = "fonttools" +version = "4.62.1" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/9a/08/7012b00a9a5874311b639c3920270c36ee0c445b69d9989a85e5c92ebcb0/fonttools-4.62.1.tar.gz", hash = "sha256:e54c75fd6041f1122476776880f7c3c3295ffa31962dc6ebe2543c00dca58b5d", size = 3580737, upload-time = "2026-03-13T13:54:25.52Z" } +wheels = [ + { url = 
"https://files.pythonhosted.org/packages/5a/ff/532ed43808b469c807e8cb6b21358da3fe6fd51486b3a8c93db0bb5d957f/fonttools-4.62.1-cp310-cp310-macosx_10_9_universal2.whl", hash = "sha256:ad5cca75776cd453b1b035b530e943334957ae152a36a88a320e779d61fc980c", size = 2873740, upload-time = "2026-03-13T13:52:11.822Z" }, + { url = "https://files.pythonhosted.org/packages/85/e4/2318d2b430562da7227010fb2bb029d2fa54d7b46443ae8942bab224e2a0/fonttools-4.62.1-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:0b3ae47e8636156a9accff64c02c0924cbebad62854c4a6dbdc110cd5b4b341a", size = 2417649, upload-time = "2026-03-13T13:52:14.605Z" }, + { url = "https://files.pythonhosted.org/packages/4c/28/40f15523b5188598018e7956899fed94eb7debec89e2dd70cb4a8df90492/fonttools-4.62.1-cp310-cp310-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:c9b9e288b4da2f64fd6180644221749de651703e8d0c16bd4b719533a3a7d6e3", size = 4935213, upload-time = "2026-03-13T13:52:17.399Z" }, + { url = "https://files.pythonhosted.org/packages/42/09/7dbe3d7023f57d9b580cfa832109d521988112fd59dddfda3fddda8218f9/fonttools-4.62.1-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:7bca7a1c1faf235ffe25d4f2e555246b4750220b38de8261d94ebc5ce8a23c23", size = 4892374, upload-time = "2026-03-13T13:52:20.175Z" }, + { url = "https://files.pythonhosted.org/packages/d1/2d/84509a2e32cb925371560ef5431365d8da2183c11d98e5b4b8b4e42426a5/fonttools-4.62.1-cp310-cp310-musllinux_1_2_aarch64.whl", hash = "sha256:b4e0fcf265ad26e487c56cb12a42dffe7162de708762db951e1b3f755319507d", size = 4911856, upload-time = "2026-03-13T13:52:22.777Z" }, + { url = "https://files.pythonhosted.org/packages/a5/80/df28131379eed93d9e6e6fccd3bf6e3d077bebbfe98cc83f21bbcd83ed02/fonttools-4.62.1-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:2d850f66830a27b0d498ee05adb13a3781637b1826982cd7e2b3789ef0cc71ae", size = 5031712, upload-time = "2026-03-13T13:52:25.14Z" }, + { url = 
"https://files.pythonhosted.org/packages/3d/03/3c8f09aad64230cd6d921ae7a19f9603c36f70930b00459f112706f6769a/fonttools-4.62.1-cp310-cp310-win32.whl", hash = "sha256:486f32c8047ccd05652aba17e4a8819a3a9d78570eb8a0e3b4503142947880ed", size = 1507878, upload-time = "2026-03-13T13:52:28.149Z" }, + { url = "https://files.pythonhosted.org/packages/dd/ec/f53f626f8f3e89f4cadd8fc08f3452c8fd182c951ad5caa35efac22b29ab/fonttools-4.62.1-cp310-cp310-win_amd64.whl", hash = "sha256:5a648bde915fba9da05ae98856987ca91ba832949a9e2888b48c47ef8b96c5a9", size = 1556766, upload-time = "2026-03-13T13:52:30.814Z" }, + { url = "https://files.pythonhosted.org/packages/88/39/23ff32561ec8d45a4d48578b4d241369d9270dc50926c017570e60893701/fonttools-4.62.1-cp311-cp311-macosx_10_9_universal2.whl", hash = "sha256:40975849bac44fb0b9253d77420c6d8b523ac4dcdcefeff6e4d706838a5b80f7", size = 2871039, upload-time = "2026-03-13T13:52:33.127Z" }, + { url = "https://files.pythonhosted.org/packages/24/7f/66d3f8a9338a9b67fe6e1739f47e1cd5cee78bd3bc1206ef9b0b982289a5/fonttools-4.62.1-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:9dde91633f77fa576879a0c76b1d89de373cae751a98ddf0109d54e173b40f14", size = 2416346, upload-time = "2026-03-13T13:52:35.676Z" }, + { url = "https://files.pythonhosted.org/packages/aa/53/5276ceba7bff95da7793a07c5284e1da901cf00341ce5e2f3273056c0cca/fonttools-4.62.1-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:6acb4109f8bee00fec985c8c7afb02299e35e9c94b57287f3ea542f28bd0b0a7", size = 5100897, upload-time = "2026-03-13T13:52:38.102Z" }, + { url = "https://files.pythonhosted.org/packages/cc/a1/40a5c4d8e28b0851d53a8eeeb46fbd73c325a2a9a165f290a5ed90e6c597/fonttools-4.62.1-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:1c5c25671ce8805e0d080e2ffdeca7f1e86778c5cbfbeae86d7f866d8830517b", size = 5071078, upload-time = "2026-03-13T13:52:41.305Z" }, + { url = 
"https://files.pythonhosted.org/packages/e3/be/d378fca4c65ea1956fee6d90ace6e861776809cbbc5af22388a090c3c092/fonttools-4.62.1-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:a5d8825e1140f04e6c99bb7d37a9e31c172f3bc208afbe02175339e699c710e1", size = 5076908, upload-time = "2026-03-13T13:52:44.122Z" }, + { url = "https://files.pythonhosted.org/packages/f8/d9/ae6a1d0693a4185a84605679c8a1f719a55df87b9c6e8e817bfdd9ef5936/fonttools-4.62.1-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:268abb1cb221e66c014acc234e872b7870d8b5d4657a83a8f4205094c32d2416", size = 5202275, upload-time = "2026-03-13T13:52:46.591Z" }, + { url = "https://files.pythonhosted.org/packages/54/6c/af95d9c4efb15cabff22642b608342f2bd67137eea6107202d91b5b03184/fonttools-4.62.1-cp311-cp311-win32.whl", hash = "sha256:942b03094d7edbb99bdf1ae7e9090898cad7bf9030b3d21f33d7072dbcb51a53", size = 2293075, upload-time = "2026-03-13T13:52:48.711Z" }, + { url = "https://files.pythonhosted.org/packages/d3/97/bf54c5b3f2be34e1f143e6db838dfdc54f2ffa3e68c738934c82f3b2a08d/fonttools-4.62.1-cp311-cp311-win_amd64.whl", hash = "sha256:e8514f4924375f77084e81467e63238b095abda5107620f49421c368a6017ed2", size = 2344593, upload-time = "2026-03-13T13:52:50.725Z" }, + { url = "https://files.pythonhosted.org/packages/47/d4/dbacced3953544b9a93088cc10ef2b596d348c983d5c67a404fa41ec51ba/fonttools-4.62.1-cp312-cp312-macosx_10_13_universal2.whl", hash = "sha256:90365821debbd7db678809c7491ca4acd1e0779b9624cdc6ddaf1f31992bf974", size = 2870219, upload-time = "2026-03-13T13:52:53.664Z" }, + { url = "https://files.pythonhosted.org/packages/66/9e/a769c8e99b81e5a87ab7e5e7236684de4e96246aae17274e5347d11ebd78/fonttools-4.62.1-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:12859ff0b47dd20f110804c3e0d0970f7b832f561630cd879969011541a464a9", size = 2414891, upload-time = "2026-03-13T13:52:56.493Z" }, + { url = 
"https://files.pythonhosted.org/packages/69/64/f19a9e3911968c37e1e620e14dfc5778299e1474f72f4e57c5ec771d9489/fonttools-4.62.1-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:9c125ffa00c3d9003cdaaf7f2c79e6e535628093e14b5de1dccb08859b680936", size = 5033197, upload-time = "2026-03-13T13:52:59.179Z" }, + { url = "https://files.pythonhosted.org/packages/9b/8a/99c8b3c3888c5c474c08dbfd7c8899786de9604b727fcefb055b42c84bba/fonttools-4.62.1-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:149f7d84afca659d1a97e39a4778794a2f83bf344c5ee5134e09995086cc2392", size = 4988768, upload-time = "2026-03-13T13:53:02.761Z" }, + { url = "https://files.pythonhosted.org/packages/d1/c6/0f904540d3e6ab463c1243a0d803504826a11604c72dd58c2949796a1762/fonttools-4.62.1-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:0aa72c43a601cfa9273bb1ae0518f1acadc01ee181a6fc60cd758d7fdadffc04", size = 4971512, upload-time = "2026-03-13T13:53:05.678Z" }, + { url = "https://files.pythonhosted.org/packages/29/0b/5cbef6588dc9bd6b5c9ad6a4d5a8ca384d0cea089da31711bbeb4f9654a6/fonttools-4.62.1-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:19177c8d96c7c36359266e571c5173bcee9157b59cfc8cb0153c5673dc5a3a7d", size = 5122723, upload-time = "2026-03-13T13:53:08.662Z" }, + { url = "https://files.pythonhosted.org/packages/4a/47/b3a5342d381595ef439adec67848bed561ab7fdb1019fa522e82101b7d9c/fonttools-4.62.1-cp312-cp312-win32.whl", hash = "sha256:a24decd24d60744ee8b4679d38e88b8303d86772053afc29b19d23bb8207803c", size = 2281278, upload-time = "2026-03-13T13:53:10.998Z" }, + { url = "https://files.pythonhosted.org/packages/28/b1/0c2ab56a16f409c6c8a68816e6af707827ad5d629634691ff60a52879792/fonttools-4.62.1-cp312-cp312-win_amd64.whl", hash = "sha256:9e7863e10b3de72376280b515d35b14f5eeed639d1aa7824f4cf06779ec65e42", size = 2331414, upload-time = "2026-03-13T13:53:13.992Z" }, + { url = 
"https://files.pythonhosted.org/packages/3b/56/6f389de21c49555553d6a5aeed5ac9767631497ac836c4f076273d15bd72/fonttools-4.62.1-cp313-cp313-macosx_10_13_universal2.whl", hash = "sha256:c22b1014017111c401469e3acc5433e6acf6ebcc6aa9efb538a533c800971c79", size = 2865155, upload-time = "2026-03-13T13:53:16.132Z" }, + { url = "https://files.pythonhosted.org/packages/03/c5/0e3966edd5ec668d41dfe418787726752bc07e2f5fd8c8f208615e61fa89/fonttools-4.62.1-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:68959f5fc58ed4599b44aad161c2837477d7f35f5f79402d97439974faebfebe", size = 2412802, upload-time = "2026-03-13T13:53:18.878Z" }, + { url = "https://files.pythonhosted.org/packages/52/94/e6ac4b44026de7786fe46e3bfa0c87e51d5d70a841054065d49cd62bb909/fonttools-4.62.1-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:ef46db46c9447103b8f3ff91e8ba009d5fe181b1920a83757a5762551e32bb68", size = 5013926, upload-time = "2026-03-13T13:53:21.379Z" }, + { url = "https://files.pythonhosted.org/packages/e2/98/8b1e801939839d405f1f122e7d175cebe9aeb4e114f95bfc45e3152af9a7/fonttools-4.62.1-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:6706d1cb1d5e6251a97ad3c1b9347505c5615c112e66047abbef0f8545fa30d1", size = 4964575, upload-time = "2026-03-13T13:53:23.857Z" }, + { url = "https://files.pythonhosted.org/packages/46/76/7d051671e938b1881670528fec69cc4044315edd71a229c7fd712eaa5119/fonttools-4.62.1-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:2e7abd2b1e11736f58c1de27819e1955a53267c21732e78243fa2fa2e5c1e069", size = 4953693, upload-time = "2026-03-13T13:53:26.569Z" }, + { url = "https://files.pythonhosted.org/packages/1f/ae/b41f8628ec0be3c1b934fc12b84f4576a5c646119db4d3bdd76a217c90b5/fonttools-4.62.1-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:403d28ce06ebfc547fbcb0cb8b7f7cc2f7a2d3e1a67ba9a34b14632df9e080f9", size = 5094920, upload-time = "2026-03-13T13:53:29.329Z" }, + { url = 
"https://files.pythonhosted.org/packages/f2/f6/53a1e9469331a23dcc400970a27a4caa3d9f6edbf5baab0260285238b884/fonttools-4.62.1-cp313-cp313-win32.whl", hash = "sha256:93c316e0f5301b2adbe6a5f658634307c096fd5aae60a5b3412e4f3e1728ab24", size = 2279928, upload-time = "2026-03-13T13:53:32.352Z" }, + { url = "https://files.pythonhosted.org/packages/38/60/35186529de1db3c01f5ad625bde07c1f576305eab6d86bbda4c58445f721/fonttools-4.62.1-cp313-cp313-win_amd64.whl", hash = "sha256:7aa21ff53e28a9c2157acbc44e5b401149d3c9178107130e82d74ceb500e5056", size = 2330514, upload-time = "2026-03-13T13:53:34.991Z" }, + { url = "https://files.pythonhosted.org/packages/36/f0/2888cdac391807d68d90dcb16ef858ddc1b5309bfc6966195a459dd326e2/fonttools-4.62.1-cp314-cp314-macosx_10_15_universal2.whl", hash = "sha256:fa1d16210b6b10a826d71bed68dd9ec24a9e218d5a5e2797f37c573e7ec215ca", size = 2864442, upload-time = "2026-03-13T13:53:37.509Z" }, + { url = "https://files.pythonhosted.org/packages/4b/b2/e521803081f8dc35990816b82da6360fa668a21b44da4b53fc9e77efcd62/fonttools-4.62.1-cp314-cp314-macosx_10_15_x86_64.whl", hash = "sha256:aa69d10ed420d8121118e628ad47d86e4caa79ba37f968597b958f6cceab7eca", size = 2410901, upload-time = "2026-03-13T13:53:40.55Z" }, + { url = "https://files.pythonhosted.org/packages/00/a4/8c3511ff06e53110039358dbbdc1a65d72157a054638387aa2ada300a8b8/fonttools-4.62.1-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:bd13b7999d59c5eb1c2b442eb2d0c427cb517a0b7a1f5798fc5c9e003f5ff782", size = 4999608, upload-time = "2026-03-13T13:53:42.798Z" }, + { url = "https://files.pythonhosted.org/packages/28/63/cd0c3b26afe60995a5295f37c246a93d454023726c3261cfbb3559969bb9/fonttools-4.62.1-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:8d337fdd49a79b0d51c4da87bc38169d21c3abbf0c1aa9367eff5c6656fb6dae", size = 4912726, upload-time = "2026-03-13T13:53:45.405Z" }, + { url = 
"https://files.pythonhosted.org/packages/70/b9/ac677cb07c24c685cf34f64e140617d58789d67a3dd524164b63648c6114/fonttools-4.62.1-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:d241cdc4a67b5431c6d7f115fdf63335222414995e3a1df1a41e1182acd4bcc7", size = 4951422, upload-time = "2026-03-13T13:53:48.326Z" }, + { url = "https://files.pythonhosted.org/packages/e6/10/11c08419a14b85b7ca9a9faca321accccc8842dd9e0b1c8a72908de05945/fonttools-4.62.1-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:c05557a78f8fa514da0f869556eeda40887a8abc77c76ee3f74cf241778afd5a", size = 5060979, upload-time = "2026-03-13T13:53:51.366Z" }, + { url = "https://files.pythonhosted.org/packages/4e/3c/12eea4a4cf054e7ab058ed5ceada43b46809fce2bf319017c4d63ae55bb4/fonttools-4.62.1-cp314-cp314-win32.whl", hash = "sha256:49a445d2f544ce4a69338694cad575ba97b9a75fff02720da0882d1a73f12800", size = 2283733, upload-time = "2026-03-13T13:53:53.606Z" }, + { url = "https://files.pythonhosted.org/packages/6b/67/74b070029043186b5dd13462c958cb7c7f811be0d2e634309d9a1ffb1505/fonttools-4.62.1-cp314-cp314-win_amd64.whl", hash = "sha256:1eecc128c86c552fb963fe846ca4e011b1be053728f798185a1687502f6d398e", size = 2335663, upload-time = "2026-03-13T13:53:56.23Z" }, + { url = "https://files.pythonhosted.org/packages/42/c5/4d2ed3ca6e33617fc5624467da353337f06e7f637707478903c785bd8e20/fonttools-4.62.1-cp314-cp314t-macosx_10_15_universal2.whl", hash = "sha256:1596aeaddf7f78e21e68293c011316a25267b3effdaccaf4d59bc9159d681b82", size = 2947288, upload-time = "2026-03-13T13:53:59.397Z" }, + { url = "https://files.pythonhosted.org/packages/1f/e9/7ab11ddfda48ed0f89b13380e5595ba572619c27077be0b2c447a63ff351/fonttools-4.62.1-cp314-cp314t-macosx_10_15_x86_64.whl", hash = "sha256:8f8fca95d3bb3208f59626a4b0ea6e526ee51f5a8ad5d91821c165903e8d9260", size = 2449023, upload-time = "2026-03-13T13:54:01.642Z" }, + { url = 
"https://files.pythonhosted.org/packages/b2/10/a800fa090b5e8819942e54e19b55fc7c21fe14a08757c3aa3ca8db358939/fonttools-4.62.1-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:ee91628c08e76f77b533d65feb3fbe6d9dad699f95be51cf0d022db94089cdc4", size = 5137599, upload-time = "2026-03-13T13:54:04.495Z" }, + { url = "https://files.pythonhosted.org/packages/37/dc/8ccd45033fffd74deb6912fa1ca524643f584b94c87a16036855b498a1ed/fonttools-4.62.1-cp314-cp314t-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:5f37df1cac61d906e7b836abe356bc2f34c99d4477467755c216b72aa3dc748b", size = 4920933, upload-time = "2026-03-13T13:54:07.557Z" }, + { url = "https://files.pythonhosted.org/packages/99/eb/e618adefb839598d25ac8136cd577925d6c513dc0d931d93b8af956210f0/fonttools-4.62.1-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:92bb00a947e666169c99b43753c4305fc95a890a60ef3aeb2a6963e07902cc87", size = 5016232, upload-time = "2026-03-13T13:54:10.611Z" }, + { url = "https://files.pythonhosted.org/packages/d9/5f/9b5c9bfaa8ec82def8d8168c4f13615990d6ce5996fe52bd49bfb5e05134/fonttools-4.62.1-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:bdfe592802ef939a0e33106ea4a318eeb17822c7ee168c290273cbd5fabd746c", size = 5042987, upload-time = "2026-03-13T13:54:13.569Z" }, + { url = "https://files.pythonhosted.org/packages/90/aa/dfbbe24c6a6afc5c203d90cc0343e24bcbb09e76d67c4d6eef8c2558d7ba/fonttools-4.62.1-cp314-cp314t-win32.whl", hash = "sha256:b820fcb92d4655513d8402d5b219f94481c4443d825b4372c75a2072aa4b357a", size = 2348021, upload-time = "2026-03-13T13:54:16.98Z" }, + { url = "https://files.pythonhosted.org/packages/13/6f/ae9c4e4dd417948407b680855c2c7790efb52add6009aaecff1e3bc50e8e/fonttools-4.62.1-cp314-cp314t-win_amd64.whl", hash = "sha256:59b372b4f0e113d3746b88985f1c796e7bf830dd54b28374cd85c2b8acd7583e", size = 2414147, upload-time = "2026-03-13T13:54:19.416Z" }, + { url = 
"https://files.pythonhosted.org/packages/fd/ba/56147c165442cc5ba7e82ecf301c9a68353cede498185869e6e02b4c264f/fonttools-4.62.1-py3-none-any.whl", hash = "sha256:7487782e2113861f4ddcc07c3436450659e3caa5e470b27dc2177cade2d8e7fd", size = 1152647, upload-time = "2026-03-13T13:54:22.735Z" }, +] + [[package]] name = "fqdn" version = "1.5.1" @@ -1354,6 +1610,21 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/eb/02/a6b21098b1d5d6249b7c5ab69dde30108a71e4e819d4a9778f1de1d5b70d/fsspec-2025.10.0-py3-none-any.whl", hash = "sha256:7c7712353ae7d875407f97715f0e1ffcc21e33d5b24556cb1e090ae9409ec61d", size = 200966, upload-time = "2025-10-30T14:58:42.53Z" }, ] +[[package]] +name = "gdown" +version = "5.2.1" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "beautifulsoup4" }, + { name = "filelock" }, + { name = "requests", extra = ["socks"] }, + { name = "tqdm" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/f4/cf/919a9fa16faf8e4572a24d941353edaf4d54e3ddcd048e6c1aeb8c7a9903/gdown-5.2.1.tar.gz", hash = "sha256:247c2ad1f579db5b66b54c04e6a871995fc8fd7021708b950b8ba7b32cf90323", size = 284743, upload-time = "2026-01-11T09:34:01.037Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/87/21/35dd0a0b7428bd67b12b358d7b4277f693493a3839b071d540a4c8357b78/gdown-5.2.1-py3-none-any.whl", hash = "sha256:391f0480d495fb87644d1a1ee3ddfeb2144e1de31408fbc74f7e3b3ba927052b", size = 18241, upload-time = "2026-01-11T09:34:02.637Z" }, +] + [[package]] name = "ghp-import" version = "2.1.0" @@ -1390,6 +1661,38 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/01/61/d4b89fec821f72385526e1b9d9a3a0385dda4a72b206d28049e2c7cd39b8/gitpython-3.1.45-py3-none-any.whl", hash = "sha256:8908cb2e02fb3b93b7eb0f2827125cb699869470432cc885f019b8fd0fccff77", size = 208168, upload-time = "2025-07-24T03:45:52.517Z" }, ] +[[package]] +name = "google-api-core" +version = "2.30.0" +source = { registry = "https://pypi.org/simple" } +dependencies 
= [ + { name = "google-auth" }, + { name = "googleapis-common-protos" }, + { name = "proto-plus" }, + { name = "protobuf" }, + { name = "requests" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/22/98/586ec94553b569080caef635f98a3723db36a38eac0e3d7eb3ea9d2e4b9a/google_api_core-2.30.0.tar.gz", hash = "sha256:02edfa9fab31e17fc0befb5f161b3bf93c9096d99aed584625f38065c511ad9b", size = 176959, upload-time = "2026-02-18T20:28:11.926Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/45/27/09c33d67f7e0dcf06d7ac17d196594e66989299374bfb0d4331d1038e76b/google_api_core-2.30.0-py3-none-any.whl", hash = "sha256:80be49ee937ff9aba0fd79a6eddfde35fe658b9953ab9b79c57dd7061afa8df5", size = 173288, upload-time = "2026-02-18T20:28:10.367Z" }, +] + +[[package]] +name = "google-api-python-client" +version = "2.193.0" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "google-api-core" }, + { name = "google-auth" }, + { name = "google-auth-httplib2" }, + { name = "httplib2" }, + { name = "uritemplate" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/90/f4/e14b6815d3b1885328dd209676a3a4c704882743ac94e18ef0093894f5c8/google_api_python_client-2.193.0.tar.gz", hash = "sha256:8f88d16e89d11341e0a8b199cafde0fb7e6b44260dffb88d451577cbd1bb5d33", size = 14281006, upload-time = "2026-03-17T18:25:29.415Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/f0/6d/fe75167797790a56d17799b75e1129bb93f7ff061efc7b36e9731bd4be2b/google_api_python_client-2.193.0-py3-none-any.whl", hash = "sha256:c42aa324b822109901cfecab5dc4fc3915d35a7b376835233c916c70610322db", size = 14856490, upload-time = "2026-03-17T18:25:26.608Z" }, +] + [[package]] name = "google-auth" version = "2.45.0" @@ -1409,6 +1712,32 @@ requests = [ { name = "requests" }, ] +[[package]] +name = "google-auth-httplib2" +version = "0.3.0" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "google-auth" }, + { name = "httplib2" }, +] 
+sdist = { url = "https://files.pythonhosted.org/packages/d5/ad/c1f2b1175096a8d04cf202ad5ea6065f108d26be6fc7215876bde4a7981d/google_auth_httplib2-0.3.0.tar.gz", hash = "sha256:177898a0175252480d5ed916aeea183c2df87c1f9c26705d74ae6b951c268b0b", size = 11134, upload-time = "2025-12-15T22:13:51.825Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/99/d5/3c97526c8796d3caf5f4b3bed2b05e8a7102326f00a334e7a438237f3b22/google_auth_httplib2-0.3.0-py3-none-any.whl", hash = "sha256:426167e5df066e3f5a0fc7ea18768c08e7296046594ce4c8c409c2457dd1f776", size = 9529, upload-time = "2025-12-15T22:13:51.048Z" }, +] + +[[package]] +name = "google-auth-oauthlib" +version = "1.2.2" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "google-auth" }, + { name = "requests-oauthlib" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/fb/87/e10bf24f7bcffc1421b84d6f9c3377c30ec305d082cd737ddaa6d8f77f7c/google_auth_oauthlib-1.2.2.tar.gz", hash = "sha256:11046fb8d3348b296302dd939ace8af0a724042e8029c1b872d87fabc9f41684", size = 20955, upload-time = "2025-04-22T16:40:29.172Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/ac/84/40ee070be95771acd2f4418981edb834979424565c3eec3cd88b6aa09d24/google_auth_oauthlib-1.2.2-py3-none-any.whl", hash = "sha256:fd619506f4b3908b5df17b65f39ca8d66ea56986e5472eb5978fd8f3786f00a2", size = 19072, upload-time = "2025-04-22T16:40:28.174Z" }, +] + [[package]] name = "google-genai" version = "1.56.0" @@ -1509,6 +1838,19 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/9c/83/3b1d03d36f224edded98e9affd0467630fc09d766c0e56fb1498cbb04a9b/griffe-1.15.0-py3-none-any.whl", hash = "sha256:6f6762661949411031f5fcda9593f586e6ce8340f0ba88921a0f2ef7a81eb9a3", size = 150705, upload-time = "2025-11-10T15:03:13.549Z" }, ] +[[package]] +name = "gspread" +version = "6.2.1" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "google-auth" }, + { name = "google-auth-oauthlib" 
}, +] +sdist = { url = "https://files.pythonhosted.org/packages/91/83/42d1d813822ed016d77aabadc99b09de3b5bd68532fd6bae23fd62347c41/gspread-6.2.1.tar.gz", hash = "sha256:2c7c99f7c32ebea6ec0d36f2d5cbe8a2be5e8f2a48bde87ad1ea203eff32bd03", size = 82590, upload-time = "2025-05-14T15:56:25.254Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/27/76/563fb20dedd0e12794d9a12cfe0198458cc0501fdc7b034eee2166d035d5/gspread-6.2.1-py3-none-any.whl", hash = "sha256:6d4ec9f1c23ae3c704a9219026dac01f2b328ac70b96f1495055d453c4c184db", size = 59977, upload-time = "2025-05-14T15:56:24.014Z" }, +] + [[package]] name = "h11" version = "0.16.0" @@ -1518,6 +1860,65 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/04/4b/29cac41a4d98d144bf5f6d33995617b185d14b22401f75ca86f384e87ff1/h11-0.16.0-py3-none-any.whl", hash = "sha256:63cf8bbe7522de3bf65932fda1d9c2772064ffb3dae62d55932da54b31cb6c86", size = 37515, upload-time = "2025-04-24T03:35:24.344Z" }, ] +[[package]] +name = "h5py" +version = "3.16.0" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "numpy", version = "2.2.6", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.11'" }, + { name = "numpy", version = "2.4.0", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.11'" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/db/33/acd0ce6863b6c0d7735007df01815403f5589a21ff8c2e1ee2587a38f548/h5py-3.16.0.tar.gz", hash = "sha256:a0dbaad796840ccaa67a4c144a0d0c8080073c34c76d5a6941d6818678ef2738", size = 446526, upload-time = "2026-03-06T13:49:08.07Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/3a/6b/231413e58a787a89b316bb0d1777da3c62257e4797e09afd8d17ad3549dc/h5py-3.16.0-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:e06f864bedb2c8e7c1358e6c73af48519e317457c444d6f3d332bb4e8fa6d7d9", size = 3724137, upload-time = "2026-03-06T13:47:35.242Z" }, + { url = 
"https://files.pythonhosted.org/packages/74/f9/557ce3aad0fe8471fb5279bab0fc56ea473858a022c4ce8a0b8f303d64e9/h5py-3.16.0-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:ec86d4fffd87a0f4cb3d5796ceb5a50123a2a6d99b43e616e5504e66a953eca3", size = 3090112, upload-time = "2026-03-06T13:47:37.634Z" }, + { url = "https://files.pythonhosted.org/packages/7a/f5/e15b3d0dc8a18e56409a839e6468d6fb589bc5207c917399c2e0706eeb44/h5py-3.16.0-cp310-cp310-manylinux_2_28_aarch64.whl", hash = "sha256:86385ea895508220b8a7e45efa428aeafaa586bd737c7af9ee04661d8d84a10d", size = 4844847, upload-time = "2026-03-06T13:47:39.811Z" }, + { url = "https://files.pythonhosted.org/packages/cb/92/a8851d936547efe30cc0ce5245feac01f3ec6171f7899bc3f775c72030b3/h5py-3.16.0-cp310-cp310-manylinux_2_28_x86_64.whl", hash = "sha256:8975273c2c5921c25700193b408e28d6bdd0111c37468b2d4e25dcec4cd1d84d", size = 5065352, upload-time = "2026-03-06T13:47:41.489Z" }, + { url = "https://files.pythonhosted.org/packages/2b/ae/f2adc5d0ca9626db3277a3d87516e124cbc5d0eea0bd79bc085702d04f2c/h5py-3.16.0-cp310-cp310-musllinux_1_2_aarch64.whl", hash = "sha256:1677ad48b703f44efc9ea0c3ab284527f81bc4f318386aaaebc5fede6bbae56f", size = 4839173, upload-time = "2026-03-06T13:47:43.586Z" }, + { url = "https://files.pythonhosted.org/packages/64/0b/e0c8c69da1d8838da023a50cd3080eae5d475691f7636b35eff20bb6ef20/h5py-3.16.0-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:7c4dd4cf5f0a4e36083f73172f6cfc25a5710789269547f132a20975bfe2434c", size = 5076216, upload-time = "2026-03-06T13:47:45.315Z" }, + { url = "https://files.pythonhosted.org/packages/66/35/d88fd6718832133c885004c61ceeeb24dbd6397ef877dbed6b3a64d6a286/h5py-3.16.0-cp310-cp310-win_amd64.whl", hash = "sha256:bdef06507725b455fccba9c16529121a5e1fbf56aa375f7d9713d9e8ff42454d", size = 3183639, upload-time = "2026-03-06T13:47:47.041Z" }, + { url = 
"https://files.pythonhosted.org/packages/ba/95/a825894f3e45cbac7554c4e97314ce886b233a20033787eda755ca8fecc7/h5py-3.16.0-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:719439d14b83f74eeb080e9650a6c7aa6d0d9ea0ca7f804347b05fac6fbf18af", size = 3721663, upload-time = "2026-03-06T13:47:49.599Z" }, + { url = "https://files.pythonhosted.org/packages/bf/3b/38ff88b347c3e346cda1d3fc1b65a7aa75d40632228d8b8a5d7b58508c24/h5py-3.16.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:c3f0a0e136f2e95dd0b67146abb6668af4f1a69c81ef8651a2d316e8e01de447", size = 3087630, upload-time = "2026-03-06T13:47:51.249Z" }, + { url = "https://files.pythonhosted.org/packages/98/a8/2594cef906aee761601eff842c7dc598bea2b394a3e1c00966832b8eeb7c/h5py-3.16.0-cp311-cp311-manylinux_2_28_aarch64.whl", hash = "sha256:a6fbc5367d4046801f9b7db9191b31895f22f1c6df1f9987d667854cac493538", size = 4823472, upload-time = "2026-03-06T13:47:53.085Z" }, + { url = "https://files.pythonhosted.org/packages/52/a0/c1f604538ff6db22a0690be2dc44ab59178e115f63c917794e529356ab23/h5py-3.16.0-cp311-cp311-manylinux_2_28_x86_64.whl", hash = "sha256:fb1720028d99040792bb2fb31facb8da44a6f29df7697e0b84f0d79aff2e9bd3", size = 5027150, upload-time = "2026-03-06T13:47:55.043Z" }, + { url = "https://files.pythonhosted.org/packages/2e/fd/301739083c2fc4fd89950f9bcfce75d6e14b40b0ca3d40e48a8993d1722c/h5py-3.16.0-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:314b6054fe0b1051c2b0cb2df5cbdab15622fb05e80f202e3b6a5eee0d6fe365", size = 4814544, upload-time = "2026-03-06T13:47:56.893Z" }, + { url = "https://files.pythonhosted.org/packages/4c/42/2193ed41ccee78baba8fcc0cff2c925b8b9ee3793305b23e1f22c20bf4c7/h5py-3.16.0-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:ffbab2fedd6581f6aa31cf1639ca2cb86e02779de525667892ebf4cc9fd26434", size = 5034013, upload-time = "2026-03-06T13:47:59.01Z" }, + { url = 
"https://files.pythonhosted.org/packages/f7/20/e6c0ff62ca2ad1a396a34f4380bafccaaf8791ff8fccf3d995a1fc12d417/h5py-3.16.0-cp311-cp311-win_amd64.whl", hash = "sha256:17d1f1630f92ad74494a9a7392ab25982ce2b469fc62da6074c0ce48366a2999", size = 3191673, upload-time = "2026-03-06T13:48:00.626Z" }, + { url = "https://files.pythonhosted.org/packages/f2/48/239cbe352ac4f2b8243a8e620fa1a2034635f633731493a7ff1ed71e8658/h5py-3.16.0-cp311-cp311-win_arm64.whl", hash = "sha256:85b9c49dd58dc44cf70af944784e2c2038b6f799665d0dcbbc812a26e0faa859", size = 2673834, upload-time = "2026-03-06T13:48:02.579Z" }, + { url = "https://files.pythonhosted.org/packages/c8/c0/5d4119dba94093bbafede500d3defd2f5eab7897732998c04b54021e530b/h5py-3.16.0-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:c5313566f4643121a78503a473f0fb1e6dcc541d5115c44f05e037609c565c4d", size = 3685604, upload-time = "2026-03-06T13:48:04.198Z" }, + { url = "https://files.pythonhosted.org/packages/b0/42/c84efcc1d4caebafb1ecd8be4643f39c85c47a80fe254d92b8b43b1eadaf/h5py-3.16.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:42b012933a83e1a558c673176676a10ce2fd3759976a0fedee1e672d1e04fc9d", size = 3061940, upload-time = "2026-03-06T13:48:05.783Z" }, + { url = "https://files.pythonhosted.org/packages/89/84/06281c82d4d1686fde1ac6b0f307c50918f1c0151062445ab3b6fa5a921d/h5py-3.16.0-cp312-cp312-manylinux_2_28_aarch64.whl", hash = "sha256:ff24039e2573297787c3063df64b60aab0591980ac898329a08b0320e0cf2527", size = 5198852, upload-time = "2026-03-06T13:48:07.482Z" }, + { url = "https://files.pythonhosted.org/packages/9e/e9/1a19e42cd43cc1365e127db6aae85e1c671da1d9a5d746f4d34a50edb577/h5py-3.16.0-cp312-cp312-manylinux_2_28_x86_64.whl", hash = "sha256:dfc21898ff025f1e8e67e194965a95a8d4754f452f83454538f98f8a3fcb207e", size = 5405250, upload-time = "2026-03-06T13:48:09.628Z" }, + { url = 
"https://files.pythonhosted.org/packages/b7/8e/9790c1655eabeb85b92b1ecab7d7e62a2069e53baefd58c98f0909c7a948/h5py-3.16.0-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:698dd69291272642ffda44a0ecd6cd3bda5faf9621452d255f57ce91487b9794", size = 5190108, upload-time = "2026-03-06T13:48:11.26Z" }, + { url = "https://files.pythonhosted.org/packages/51/d7/ab693274f1bd7e8c5f9fdd6c7003a88d59bedeaf8752716a55f532924fbb/h5py-3.16.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:2b2c02b0a160faed5fb33f1ba8a264a37ee240b22e049ecc827345d0d9043074", size = 5419216, upload-time = "2026-03-06T13:48:13.322Z" }, + { url = "https://files.pythonhosted.org/packages/03/c1/0976b235cf29ead553e22f2fb6385a8252b533715e00d0ae52ed7b900582/h5py-3.16.0-cp312-cp312-win_amd64.whl", hash = "sha256:96b422019a1c8975c2d5dadcf61d4ba6f01c31f92bbde6e4649607885fe502d6", size = 3182868, upload-time = "2026-03-06T13:48:15.759Z" }, + { url = "https://files.pythonhosted.org/packages/14/d9/866b7e570b39070f92d47b0ff1800f0f8239b6f9e45f02363d7112336c1f/h5py-3.16.0-cp312-cp312-win_arm64.whl", hash = "sha256:39c2838fb1e8d97bcf1755e60ad1f3dd76a7b2a475928dc321672752678b96db", size = 2653286, upload-time = "2026-03-06T13:48:17.279Z" }, + { url = "https://files.pythonhosted.org/packages/0f/9e/6142ebfda0cb6e9349c091eae73c2e01a770b7659255248d637bec54a88b/h5py-3.16.0-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:370a845f432c2c9619db8eed334d1e610c6015796122b0e57aa46312c22617d9", size = 3671808, upload-time = "2026-03-06T13:48:19.737Z" }, + { url = "https://files.pythonhosted.org/packages/b0/65/5e088a45d0f43cd814bc5bec521c051d42005a472e804b1a36c48dada09b/h5py-3.16.0-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:42108e93326c50c2810025aade9eac9d6827524cdccc7d4b75a546e5ab308edb", size = 3045837, upload-time = "2026-03-06T13:48:21.854Z" }, + { url = 
"https://files.pythonhosted.org/packages/da/1e/6172269e18cc5a484e2913ced33339aad588e02ba407fafd00d369e22ef3/h5py-3.16.0-cp313-cp313-manylinux_2_28_aarch64.whl", hash = "sha256:099f2525c9dcf28de366970a5fb34879aab20491589fa89ce2863a84218bb524", size = 5193860, upload-time = "2026-03-06T13:48:24.071Z" }, + { url = "https://files.pythonhosted.org/packages/bd/98/ef2b6fe2903e377cbe870c3b2800d62552f1e3dbe81ce49e1923c53d1c5c/h5py-3.16.0-cp313-cp313-manylinux_2_28_x86_64.whl", hash = "sha256:9300ad32dea9dfc5171f94d5f6948e159ed93e4701280b0f508773b3f582f402", size = 5400417, upload-time = "2026-03-06T13:48:25.728Z" }, + { url = "https://files.pythonhosted.org/packages/bc/81/5b62d760039eed64348c98129d17061fdfc7839fc9c04eaaad6dee1004e4/h5py-3.16.0-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:171038f23bccddfc23f344cadabdfc9917ff554db6a0d417180d2747fe4c75a7", size = 5185214, upload-time = "2026-03-06T13:48:27.436Z" }, + { url = "https://files.pythonhosted.org/packages/28/c4/532123bcd9080e250696779c927f2cb906c8bf3447df98f5ceb8dcded539/h5py-3.16.0-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:7e420b539fb6023a259a1b14d4c9f6df8cf50d7268f48e161169987a57b737ff", size = 5414598, upload-time = "2026-03-06T13:48:29.49Z" }, + { url = "https://files.pythonhosted.org/packages/c3/d9/a27997f84341fc0dfcdd1fe4179b6ba6c32a7aa880fdb8c514d4dad6fba3/h5py-3.16.0-cp313-cp313-win_amd64.whl", hash = "sha256:18f2bbcd545e6991412253b98727374c356d67caa920e68dc79eab36bf5fedad", size = 3175509, upload-time = "2026-03-06T13:48:31.131Z" }, + { url = "https://files.pythonhosted.org/packages/a5/23/bb8647521d4fd770c30a76cfc6cb6a2f5495868904054e92f2394c5a78ff/h5py-3.16.0-cp313-cp313-win_arm64.whl", hash = "sha256:656f00e4d903199a1d58df06b711cf3ca632b874b4207b7dbec86185b5c8c7d4", size = 2647362, upload-time = "2026-03-06T13:48:33.411Z" }, + { url = 
"https://files.pythonhosted.org/packages/48/3c/7fcd9b4c9eed82e91fb15568992561019ae7a829d1f696b2c844355d95dd/h5py-3.16.0-cp314-cp314-macosx_10_15_x86_64.whl", hash = "sha256:9c9d307c0ef862d1cd5714f72ecfafe0a5d7529c44845afa8de9f46e5ba8bd65", size = 3678608, upload-time = "2026-03-06T13:48:35.183Z" }, + { url = "https://files.pythonhosted.org/packages/6a/b7/9366ed44ced9b7ef357ab48c94205280276db9d7f064aa3012a97227e966/h5py-3.16.0-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:8c1eff849cdd53cbc73c214c30ebdb6f1bb8b64790b4b4fc36acdb5e43570210", size = 3054773, upload-time = "2026-03-06T13:48:37.139Z" }, + { url = "https://files.pythonhosted.org/packages/58/a5/4964bc0e91e86340c2bbda83420225b2f770dcf1eb8a39464871ad769436/h5py-3.16.0-cp314-cp314-manylinux_2_28_aarch64.whl", hash = "sha256:e2c04d129f180019e216ee5f9c40b78a418634091c8782e1f723a6ca3658b965", size = 5198886, upload-time = "2026-03-06T13:48:38.879Z" }, + { url = "https://files.pythonhosted.org/packages/f1/16/d905e7f53e661ce2c24686c38048d8e2b750ffc4350009d41c4e6c6c9826/h5py-3.16.0-cp314-cp314-manylinux_2_28_x86_64.whl", hash = "sha256:e4360f15875a532bc7b98196c7592ed4fc92672a57c0a621355961cafb17a6dd", size = 5404883, upload-time = "2026-03-06T13:48:41.324Z" }, + { url = "https://files.pythonhosted.org/packages/4b/f2/58f34cb74af46d39f4cd18ea20909a8514960c5a3e5b92fd06a28161e0a8/h5py-3.16.0-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:3fae9197390c325e62e0a1aa977f2f62d994aa87aab182abbea85479b791197c", size = 5192039, upload-time = "2026-03-06T13:48:43.117Z" }, + { url = "https://files.pythonhosted.org/packages/ce/ca/934a39c24ce2e2db017268c08da0537c20fa0be7e1549be3e977313fc8f5/h5py-3.16.0-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:43259303989ac8adacc9986695b31e35dba6fd1e297ff9c6a04b7da5542139cc", size = 5421526, upload-time = "2026-03-06T13:48:44.838Z" }, + { url = 
"https://files.pythonhosted.org/packages/3e/14/615a450205e1b56d16c6783f5ccd116cde05550faad70ae077c955654a75/h5py-3.16.0-cp314-cp314-win_amd64.whl", hash = "sha256:fa48993a0b799737ba7fd21e2350fa0a60701e58180fae9f2de834bc39a147ab", size = 3183263, upload-time = "2026-03-06T13:48:47.117Z" }, + { url = "https://files.pythonhosted.org/packages/7b/48/a6faef5ed632cae0c65ac6b214a6614a0b510c3183532c521bdb0055e117/h5py-3.16.0-cp314-cp314-win_arm64.whl", hash = "sha256:1897a771a7f40d05c262fc8f37376ec37873218544b70216872876c627640f63", size = 2663450, upload-time = "2026-03-06T13:48:48.707Z" }, + { url = "https://files.pythonhosted.org/packages/5d/32/0c8bb8aedb62c772cf7c1d427c7d1951477e8c2835f872bc0a13d1f85f86/h5py-3.16.0-cp314-cp314t-macosx_10_15_x86_64.whl", hash = "sha256:15922e485844f77c0b9d275396d435db3baa58292a9c2176a386e072e0cf2491", size = 3760693, upload-time = "2026-03-06T13:48:50.453Z" }, + { url = "https://files.pythonhosted.org/packages/1d/1f/fcc5977d32d6387c5c9a694afee716a5e20658ac08b3ff24fdec79fb05f2/h5py-3.16.0-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:df02dd29bd247f98674634dfe41f89fd7c16ba3d7de8695ec958f58404a4e618", size = 3181305, upload-time = "2026-03-06T13:48:52.221Z" }, + { url = "https://files.pythonhosted.org/packages/f5/a1/af87f64b9f986889884243643621ebbd4ac72472ba8ec8cec891ac8e2ca1/h5py-3.16.0-cp314-cp314t-manylinux_2_28_aarch64.whl", hash = "sha256:0f456f556e4e2cebeebd9d66adf8dc321770a42593494a0b6f0af54a7567b242", size = 5074061, upload-time = "2026-03-06T13:48:54.089Z" }, + { url = "https://files.pythonhosted.org/packages/cc/d0/146f5eaff3dc246a9c7f6e5e4f42bd45cc613bce16693bcd4d1f7c958bf5/h5py-3.16.0-cp314-cp314t-manylinux_2_28_x86_64.whl", hash = "sha256:3e6cb3387c756de6a9492d601553dffea3fe11b5f22b443aac708c69f3f55e16", size = 5279216, upload-time = "2026-03-06T13:48:56.75Z" }, + { url = 
"https://files.pythonhosted.org/packages/a1/9d/12a13424f1e604fc7df9497b73c0356fb78c2fb206abd7465ce47226e8fd/h5py-3.16.0-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:8389e13a1fd745ad2856873e8187fd10268b2d9677877bb667b41aebd771d8b7", size = 5070068, upload-time = "2026-03-06T13:48:59.169Z" }, + { url = "https://files.pythonhosted.org/packages/41/8c/bbe98f813722b4873818a8db3e15aa3e625b59278566905ac439725e8070/h5py-3.16.0-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:346df559a0f7dcb31cf8e44805319e2ab24b8957c45e7708ce503b2ec79ba725", size = 5300253, upload-time = "2026-03-06T13:49:02.033Z" }, + { url = "https://files.pythonhosted.org/packages/32/9e/87e6705b4d6890e7cecdf876e2a7d3e40654a2ae37482d79a6f1b87f7b92/h5py-3.16.0-cp314-cp314t-win_amd64.whl", hash = "sha256:4c6ab014ab704b4feaa719ae783b86522ed0bf1f82184704ed3c9e4e3228796e", size = 3381671, upload-time = "2026-03-06T13:49:04.351Z" }, + { url = "https://files.pythonhosted.org/packages/96/91/9fad90cfc5f9b2489c7c26ad897157bce82f0e9534a986a221b99760b23b/h5py-3.16.0-cp314-cp314t-win_arm64.whl", hash = "sha256:faca8fb4e4319c09d83337adc80b2ca7d5c5a343c2d6f1b6388f32cfecca13c1", size = 2740706, upload-time = "2026-03-06T13:49:06.347Z" }, +] + [[package]] name = "hf-xet" version = "1.2.0" @@ -1560,6 +1961,18 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/7e/f5/f66802a942d491edb555dd61e3a9961140fd64c90bce1eafd741609d334d/httpcore-1.0.9-py3-none-any.whl", hash = "sha256:2d400746a40668fc9dec9810239072b40b4484b640a8c38fd654a024c7a1bf55", size = 78784, upload-time = "2025-04-24T22:06:20.566Z" }, ] +[[package]] +name = "httplib2" +version = "0.31.2" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "pyparsing" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/c1/1f/e86365613582c027dda5ddb64e1010e57a3d53e99ab8a72093fa13d565ec/httplib2-0.31.2.tar.gz", hash = "sha256:385e0869d7397484f4eab426197a4c020b606edd43372492337c0b4010ae5d24", size = 250800, 
upload-time = "2026-01-23T11:04:44.165Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/2f/90/fd509079dfcab01102c0fdd87f3a9506894bc70afcf9e9785ef6b2b3aff6/httplib2-0.31.2-py3-none-any.whl", hash = "sha256:dbf0c2fa3862acf3c55c078ea9c0bc4481d7dc5117cae71be9514912cf9f8349", size = 91099, upload-time = "2026-01-23T11:04:42.78Z" }, +] + [[package]] name = "httpx" version = "0.28.1" @@ -1707,10 +2120,12 @@ version = "9.8.0" source = { registry = "https://pypi.org/simple" } resolution-markers = [ "python_full_version >= '3.14' and sys_platform == 'linux'", - "python_full_version >= '3.12' and python_full_version < '3.14' and sys_platform == 'linux'", + "python_full_version == '3.13.*' and sys_platform == 'linux'", + "python_full_version == '3.12.*' and sys_platform == 'linux'", "python_full_version == '3.11.*' and sys_platform == 'linux'", "python_full_version >= '3.14' and sys_platform != 'linux'", - "python_full_version >= '3.12' and python_full_version < '3.14' and sys_platform != 'linux'", + "python_full_version == '3.13.*' and sys_platform != 'linux'", + "python_full_version == '3.12.*' and sys_platform != 'linux'", "python_full_version == '3.11.*' and sys_platform != 'linux'", ] dependencies = [ @@ -1929,6 +2344,18 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/85/e2/05328bd2621be49a6fed9e3030b1e51a2d04537d3f816d211b9cc53c5262/json5-0.12.1-py3-none-any.whl", hash = "sha256:d9c9b3bc34a5f54d43c35e11ef7cb87d8bdd098c6ace87117a7b7e83e705c1d5", size = 36119, upload-time = "2025-08-12T19:47:41.131Z" }, ] +[[package]] +name = "jsonlines" +version = "4.0.0" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "attrs" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/35/87/bcda8e46c88d0e34cad2f09ee2d0c7f5957bccdb9791b0b934ec84d84be4/jsonlines-4.0.0.tar.gz", hash = "sha256:0c6d2c09117550c089995247f605ae4cf77dd1533041d366351f6f298822ea74", size = 11359, upload-time = "2023-09-01T12:34:44.187Z" } +wheels 
= [ + { url = "https://files.pythonhosted.org/packages/f8/62/d9ba6323b9202dd2fe166beab8a86d29465c41a0288cbe229fac60c1ab8d/jsonlines-4.0.0-py3-none-any.whl", hash = "sha256:185b334ff2ca5a91362993f42e83588a360cf95ce4b71a73548502bda52a7c55", size = 8701, upload-time = "2023-09-01T12:34:42.563Z" }, +] + [[package]] name = "jsonpatch" version = "1.33" @@ -2225,6 +2652,130 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/82/07/e2f42a8ec3ff1935debbf2a5255570d22033fca3fe3180d5af99a6c9ee8c/keybert-0.9.0-py3-none-any.whl", hash = "sha256:afa2f300a72f69d279e4482bc85d8b34493b119876dc0818cb4f260466285b36", size = 41364, upload-time = "2025-02-07T08:45:08.093Z" }, ] +[[package]] +name = "kiwisolver" +version = "1.5.0" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/d0/67/9c61eccb13f0bdca9307614e782fec49ffdde0f7a2314935d489fa93cd9c/kiwisolver-1.5.0.tar.gz", hash = "sha256:d4193f3d9dc3f6f79aaed0e5637f45d98850ebf01f7ca20e69457f3e8946b66a", size = 103482, upload-time = "2026-03-09T13:15:53.382Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/ac/f8/06549565caa026e540b7e7bab5c5a90eb7ca986015f4c48dace243cd24d9/kiwisolver-1.5.0-cp310-cp310-macosx_10_9_universal2.whl", hash = "sha256:32cc0a5365239a6ea0c6ed461e8838d053b57e397443c0ca894dcc8e388d4374", size = 122802, upload-time = "2026-03-09T13:12:37.515Z" }, + { url = "https://files.pythonhosted.org/packages/84/eb/8476a0818850c563ff343ea7c9c05dcdcbd689a38e01aa31657df01f91fa/kiwisolver-1.5.0-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:cc0b66c1eec9021353a4b4483afb12dfd50e3669ffbb9152d6842eb34c7e29fd", size = 66216, upload-time = "2026-03-09T13:12:38.812Z" }, + { url = "https://files.pythonhosted.org/packages/f3/c4/f9c8a6b4c21aed4198566e45923512986d6cef530e7263b3a5f823546561/kiwisolver-1.5.0-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:86e0287879f75621ae85197b0877ed2f8b7aa57b511c7331dce2eb6f4de7d476", size = 63917, upload-time = 
"2026-03-09T13:12:40.053Z" }, + { url = "https://files.pythonhosted.org/packages/f1/0e/ba4ae25d03722f64de8b2c13e80d82ab537a06b30fc7065183c6439357e3/kiwisolver-1.5.0-cp310-cp310-manylinux_2_12_x86_64.manylinux2010_x86_64.whl", hash = "sha256:62f59da443c4f4849f73a51a193b1d9d258dcad0c41bc4d1b8fb2bcc04bfeb22", size = 1628776, upload-time = "2026-03-09T13:12:41.976Z" }, + { url = "https://files.pythonhosted.org/packages/8a/e4/3f43a011bc8a0860d1c96f84d32fa87439d3feedf66e672fef03bf5e8bac/kiwisolver-1.5.0-cp310-cp310-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:9190426b7aa26c5229501fa297b8d0653cfd3f5a36f7990c264e157cbf886b3b", size = 1228164, upload-time = "2026-03-09T13:12:44.002Z" }, + { url = "https://files.pythonhosted.org/packages/4b/34/3a901559a1e0c218404f9a61a93be82d45cb8f44453ba43088644980f033/kiwisolver-1.5.0-cp310-cp310-manylinux_2_24_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:c8277104ded0a51e699c8c3aff63ce2c56d4ed5519a5f73e0fd7057f959a2b9e", size = 1246656, upload-time = "2026-03-09T13:12:45.557Z" }, + { url = "https://files.pythonhosted.org/packages/87/9e/f78c466ea20527822b95ad38f141f2de1dcd7f23fb8716b002b0d91bbe59/kiwisolver-1.5.0-cp310-cp310-manylinux_2_24_s390x.manylinux_2_28_s390x.whl", hash = "sha256:8f9baf6f0a6e7571c45c8863010b45e837c3ee1c2c77fcd6ef423be91b21fedb", size = 1295562, upload-time = "2026-03-09T13:12:47.562Z" }, + { url = "https://files.pythonhosted.org/packages/0a/66/fd0e4a612e3a286c24e6d6f3a5428d11258ed1909bc530ba3b59807fd980/kiwisolver-1.5.0-cp310-cp310-musllinux_1_2_aarch64.whl", hash = "sha256:cff8e5383db4989311f99e814feeb90c4723eb4edca425b9d5d9c3fefcdd9537", size = 2178473, upload-time = "2026-03-09T13:12:50.254Z" }, + { url = "https://files.pythonhosted.org/packages/dc/8e/6cac929e0049539e5ee25c1ee937556f379ba5204840d03008363ced662d/kiwisolver-1.5.0-cp310-cp310-musllinux_1_2_ppc64le.whl", hash = "sha256:ebae99ed6764f2b5771c522477b311be313e8841d2e0376db2b10922daebbba4", size = 2274035, upload-time = 
"2026-03-09T13:12:51.785Z" }, + { url = "https://files.pythonhosted.org/packages/ca/d3/9d0c18f1b52ea8074b792452cf17f1f5a56bd0302a85191f405cfbf9da16/kiwisolver-1.5.0-cp310-cp310-musllinux_1_2_s390x.whl", hash = "sha256:d5cd5189fc2b6a538b75ae45433140c4823463918f7b1617c31e68b085c0022c", size = 2443217, upload-time = "2026-03-09T13:12:53.329Z" }, + { url = "https://files.pythonhosted.org/packages/45/2a/6e19368803a038b2a90857bf4ee9e3c7b667216d045866bf22d3439fd75e/kiwisolver-1.5.0-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:f42c23db5d1521218a3276bb08666dcb662896a0be7347cba864eca45ff64ede", size = 2249196, upload-time = "2026-03-09T13:12:55.057Z" }, + { url = "https://files.pythonhosted.org/packages/75/2b/3f641dfcbe72e222175d626bacf2f72c3b34312afec949dd1c50afa400f5/kiwisolver-1.5.0-cp310-cp310-win_amd64.whl", hash = "sha256:94eff26096eb5395136634622515b234ecb6c9979824c1f5004c6e3c3c85ccd2", size = 73389, upload-time = "2026-03-09T13:12:56.496Z" }, + { url = "https://files.pythonhosted.org/packages/da/88/299b137b9e0025d8982e03d2d52c123b0a2b159e84b0ef1501ef446339cf/kiwisolver-1.5.0-cp310-cp310-win_arm64.whl", hash = "sha256:dd952e03bfbb096cfe2dd35cd9e00f269969b67536cb4370994afc20ff2d0875", size = 64782, upload-time = "2026-03-09T13:12:57.609Z" }, + { url = "https://files.pythonhosted.org/packages/12/dd/a495a9c104be1c476f0386e714252caf2b7eca883915422a64c50b88c6f5/kiwisolver-1.5.0-cp311-cp311-macosx_10_9_universal2.whl", hash = "sha256:9eed0f7edbb274413b6ee781cca50541c8c0facd3d6fd289779e494340a2b85c", size = 122798, upload-time = "2026-03-09T13:12:58.963Z" }, + { url = "https://files.pythonhosted.org/packages/11/60/37b4047a2af0cf5ef6d8b4b26e91829ae6fc6a2d1f74524bcb0e7cd28a32/kiwisolver-1.5.0-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:3c4923e404d6bcd91b6779c009542e5647fef32e4a5d75e115e3bbac6f2335eb", size = 66216, upload-time = "2026-03-09T13:13:00.155Z" }, + { url = 
"https://files.pythonhosted.org/packages/0a/aa/510dc933d87767584abfe03efa445889996c70c2990f6f87c3ebaa0a18c5/kiwisolver-1.5.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:0df54df7e686afa55e6f21fb86195224a6d9beb71d637e8d7920c95cf0f89aac", size = 63911, upload-time = "2026-03-09T13:13:01.671Z" }, + { url = "https://files.pythonhosted.org/packages/80/46/bddc13df6c2a40741e0cc7865bb1c9ed4796b6760bd04ce5fae3928ef917/kiwisolver-1.5.0-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:2517e24d7315eb51c10664cdb865195df38ab74456c677df67bb47f12d088a27", size = 1438209, upload-time = "2026-03-09T13:13:03.385Z" }, + { url = "https://files.pythonhosted.org/packages/fd/d6/76621246f5165e5372f02f5e6f3f48ea336a8f9e96e43997d45b240ed8cd/kiwisolver-1.5.0-cp311-cp311-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:ff710414307fefa903e0d9bdf300972f892c23477829f49504e59834f4195398", size = 1248888, upload-time = "2026-03-09T13:13:05.231Z" }, + { url = "https://files.pythonhosted.org/packages/b2/c1/31559ec6fb39a5b48035ce29bb63ade628f321785f38c384dee3e2c08bc1/kiwisolver-1.5.0-cp311-cp311-manylinux_2_24_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:6176c1811d9d5a04fa391c490cc44f451e240697a16977f11c6f722efb9041db", size = 1266304, upload-time = "2026-03-09T13:13:06.743Z" }, + { url = "https://files.pythonhosted.org/packages/5e/ef/1cb8276f2d29cc6a41e0a042f27946ca347d3a4a75acf85d0a16aa6dcc82/kiwisolver-1.5.0-cp311-cp311-manylinux_2_24_s390x.manylinux_2_28_s390x.whl", hash = "sha256:50847dca5d197fcbd389c805aa1a1cf32f25d2e7273dc47ab181a517666b68cc", size = 1319650, upload-time = "2026-03-09T13:13:08.607Z" }, + { url = "https://files.pythonhosted.org/packages/4c/e4/5ba3cecd7ce6236ae4a80f67e5d5531287337d0e1f076ca87a5abe4cd5d0/kiwisolver-1.5.0-cp311-cp311-manylinux_2_39_riscv64.whl", hash = "sha256:01808c6d15f4c3e8559595d6d1fe6411c68e4a3822b4b9972b44473b24f4e679", size = 970949, upload-time = "2026-03-09T13:13:10.299Z" }, + { url = 
"https://files.pythonhosted.org/packages/5a/69/dc61f7ae9a2f071f26004ced87f078235b5507ab6e5acd78f40365655034/kiwisolver-1.5.0-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:f1f9f4121ec58628c96baa3de1a55a4e3a333c5102c8e94b64e23bf7b2083309", size = 2199125, upload-time = "2026-03-09T13:13:11.841Z" }, + { url = "https://files.pythonhosted.org/packages/e5/7b/abbe0f1b5afa85f8d084b73e90e5f801c0939eba16ac2e49af7c61a6c28d/kiwisolver-1.5.0-cp311-cp311-musllinux_1_2_ppc64le.whl", hash = "sha256:b7d335370ae48a780c6e6a6bbfa97342f563744c39c35562f3f367665f5c1de2", size = 2293783, upload-time = "2026-03-09T13:13:14.399Z" }, + { url = "https://files.pythonhosted.org/packages/8a/80/5908ae149d96d81580d604c7f8aefd0e98f4fd728cf172f477e9f2a81744/kiwisolver-1.5.0-cp311-cp311-musllinux_1_2_riscv64.whl", hash = "sha256:800ee55980c18545af444d93fdd60c56b580db5cc54867d8cbf8a1dc0829938c", size = 1960726, upload-time = "2026-03-09T13:13:16.047Z" }, + { url = "https://files.pythonhosted.org/packages/84/08/a78cb776f8c085b7143142ce479859cfec086bd09ee638a317040b6ef420/kiwisolver-1.5.0-cp311-cp311-musllinux_1_2_s390x.whl", hash = "sha256:c438f6ca858697c9ab67eb28246c92508af972e114cac34e57a6d4ba17a3ac08", size = 2464738, upload-time = "2026-03-09T13:13:17.897Z" }, + { url = "https://files.pythonhosted.org/packages/b1/e1/65584da5356ed6cb12c63791a10b208860ac40a83de165cb6a6751a686e3/kiwisolver-1.5.0-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:8c63c91f95173f9c2a67c7c526b2cea976828a0e7fced9cdcead2802dc10f8a4", size = 2270718, upload-time = "2026-03-09T13:13:19.421Z" }, + { url = "https://files.pythonhosted.org/packages/be/6c/28f17390b62b8f2f520e2915095b3c94d88681ecf0041e75389d9667f202/kiwisolver-1.5.0-cp311-cp311-win_amd64.whl", hash = "sha256:beb7f344487cdcb9e1efe4b7a29681b74d34c08f0043a327a74da852a6749e7b", size = 73480, upload-time = "2026-03-09T13:13:20.818Z" }, + { url = 
"https://files.pythonhosted.org/packages/d8/0e/2ee5debc4f77a625778fec5501ff3e8036fe361b7ee28ae402a485bb9694/kiwisolver-1.5.0-cp311-cp311-win_arm64.whl", hash = "sha256:ad4ae4ffd1ee9cd11357b4c66b612da9888f4f4daf2f36995eda64bd45370cac", size = 64930, upload-time = "2026-03-09T13:13:21.997Z" }, + { url = "https://files.pythonhosted.org/packages/4d/b2/818b74ebea34dabe6d0c51cb1c572e046730e64844da6ed646d5298c40ce/kiwisolver-1.5.0-cp312-cp312-macosx_10_13_universal2.whl", hash = "sha256:4e9750bc21b886308024f8a54ccb9a2cc38ac9fa813bf4348434e3d54f337ff9", size = 123158, upload-time = "2026-03-09T13:13:23.127Z" }, + { url = "https://files.pythonhosted.org/packages/bf/d9/405320f8077e8e1c5c4bd6adc45e1e6edf6d727b6da7f2e2533cf58bff71/kiwisolver-1.5.0-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:72ec46b7eba5b395e0a7b63025490d3214c11013f4aacb4f5e8d6c3041829588", size = 66388, upload-time = "2026-03-09T13:13:24.765Z" }, + { url = "https://files.pythonhosted.org/packages/99/9f/795fedf35634f746151ca8839d05681ceb6287fbed6cc1c9bf235f7887c2/kiwisolver-1.5.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:ed3a984b31da7481b103f68776f7128a89ef26ed40f4dc41a2223cda7fb24819", size = 64068, upload-time = "2026-03-09T13:13:25.878Z" }, + { url = "https://files.pythonhosted.org/packages/c4/13/680c54afe3e65767bed7ec1a15571e1a2f1257128733851ade24abcefbcc/kiwisolver-1.5.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:bb5136fb5352d3f422df33f0c879a1b0c204004324150cc3b5e3c4f310c9049f", size = 1477934, upload-time = "2026-03-09T13:13:27.166Z" }, + { url = "https://files.pythonhosted.org/packages/c8/2f/cebfcdb60fd6a9b0f6b47a9337198bcbad6fbe15e68189b7011fd914911f/kiwisolver-1.5.0-cp312-cp312-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:b2af221f268f5af85e776a73d62b0845fc8baf8ef0abfae79d29c77d0e776aaf", size = 1278537, upload-time = "2026-03-09T13:13:28.707Z" }, + { url = 
"https://files.pythonhosted.org/packages/f2/0d/9b782923aada3fafb1d6b84e13121954515c669b18af0c26e7d21f579855/kiwisolver-1.5.0-cp312-cp312-manylinux_2_24_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:b0f172dc8ffaccb8522d7c5d899de00133f2f1ca7b0a49b7da98e901de87bf2d", size = 1296685, upload-time = "2026-03-09T13:13:30.528Z" }, + { url = "https://files.pythonhosted.org/packages/27/70/83241b6634b04fe44e892688d5208332bde130f38e610c0418f9ede47ded/kiwisolver-1.5.0-cp312-cp312-manylinux_2_24_s390x.manylinux_2_28_s390x.whl", hash = "sha256:6ab8ba9152203feec73758dad83af9a0bbe05001eb4639e547207c40cfb52083", size = 1346024, upload-time = "2026-03-09T13:13:32.818Z" }, + { url = "https://files.pythonhosted.org/packages/e4/db/30ed226fb271ae1a6431fc0fe0edffb2efe23cadb01e798caeb9f2ceae8f/kiwisolver-1.5.0-cp312-cp312-manylinux_2_39_riscv64.whl", hash = "sha256:cdee07c4d7f6d72008d3f73b9bf027f4e11550224c7c50d8df1ae4a37c1402a6", size = 987241, upload-time = "2026-03-09T13:13:34.435Z" }, + { url = "https://files.pythonhosted.org/packages/ec/bd/c314595208e4c9587652d50959ead9e461995389664e490f4dce7ff0f782/kiwisolver-1.5.0-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:7c60d3c9b06fb23bd9c6139281ccbdc384297579ae037f08ae90c69f6845c0b1", size = 2227742, upload-time = "2026-03-09T13:13:36.4Z" }, + { url = "https://files.pythonhosted.org/packages/c1/43/0499cec932d935229b5543d073c2b87c9c22846aab48881e9d8d6e742a2d/kiwisolver-1.5.0-cp312-cp312-musllinux_1_2_ppc64le.whl", hash = "sha256:e315e5ec90d88e140f57696ff85b484ff68bb311e36f2c414aa4286293e6dee0", size = 2323966, upload-time = "2026-03-09T13:13:38.204Z" }, + { url = "https://files.pythonhosted.org/packages/3d/6f/79b0d760907965acfd9d61826a3d41f8f093c538f55cd2633d3f0db269f6/kiwisolver-1.5.0-cp312-cp312-musllinux_1_2_riscv64.whl", hash = "sha256:1465387ac63576c3e125e5337a6892b9e99e0627d52317f3ca79e6930d889d15", size = 1977417, upload-time = "2026-03-09T13:13:39.966Z" }, + { url = 
"https://files.pythonhosted.org/packages/ab/31/01d0537c41cb75a551a438c3c7a80d0c60d60b81f694dac83dd436aec0d0/kiwisolver-1.5.0-cp312-cp312-musllinux_1_2_s390x.whl", hash = "sha256:530a3fd64c87cffa844d4b6b9768774763d9caa299e9b75d8eca6a4423b31314", size = 2491238, upload-time = "2026-03-09T13:13:41.698Z" }, + { url = "https://files.pythonhosted.org/packages/e4/34/8aefdd0be9cfd00a44509251ba864f5caf2991e36772e61c408007e7f417/kiwisolver-1.5.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:1d9daea4ea6b9be74fe2f01f7fbade8d6ffab263e781274cffca0dba9be9eec9", size = 2294947, upload-time = "2026-03-09T13:13:43.343Z" }, + { url = "https://files.pythonhosted.org/packages/ad/cf/0348374369ca588f8fe9c338fae49fa4e16eeb10ffb3d012f23a54578a9e/kiwisolver-1.5.0-cp312-cp312-win_amd64.whl", hash = "sha256:f18c2d9782259a6dc132fdc7a63c168cbc74b35284b6d75c673958982a378384", size = 73569, upload-time = "2026-03-09T13:13:45.792Z" }, + { url = "https://files.pythonhosted.org/packages/28/26/192b26196e2316e2bd29deef67e37cdf9870d9af8e085e521afff0fed526/kiwisolver-1.5.0-cp312-cp312-win_arm64.whl", hash = "sha256:f7c7553b13f69c1b29a5bde08ddc6d9d0c8bfb84f9ed01c30db25944aeb852a7", size = 64997, upload-time = "2026-03-09T13:13:46.878Z" }, + { url = "https://files.pythonhosted.org/packages/9d/69/024d6711d5ba575aa65d5538042e99964104e97fa153a9f10bc369182bc2/kiwisolver-1.5.0-cp313-cp313-macosx_10_13_universal2.whl", hash = "sha256:fd40bb9cd0891c4c3cb1ddf83f8bbfa15731a248fdc8162669405451e2724b09", size = 123166, upload-time = "2026-03-09T13:13:48.032Z" }, + { url = "https://files.pythonhosted.org/packages/ce/48/adbb40df306f587054a348831220812b9b1d787aff714cfbc8556e38fccd/kiwisolver-1.5.0-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:c0e1403fd7c26d77c1f03e096dc58a5c726503fa0db0456678b8668f76f521e3", size = 66395, upload-time = "2026-03-09T13:13:49.365Z" }, + { url = 
"https://files.pythonhosted.org/packages/a8/3a/d0a972b34e1c63e2409413104216cd1caa02c5a37cb668d1687d466c1c45/kiwisolver-1.5.0-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:dda366d548e89a90d88a86c692377d18d8bd64b39c1fb2b92cb31370e2896bbd", size = 64065, upload-time = "2026-03-09T13:13:50.562Z" }, + { url = "https://files.pythonhosted.org/packages/2b/0a/7b98e1e119878a27ba8618ca1e18b14f992ff1eda40f47bccccf4de44121/kiwisolver-1.5.0-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:332b4f0145c30b5f5ad9374881133e5aa64320428a57c2c2b61e9d891a51c2f3", size = 1477903, upload-time = "2026-03-09T13:13:52.084Z" }, + { url = "https://files.pythonhosted.org/packages/18/d8/55638d89ffd27799d5cc3d8aa28e12f4ce7a64d67b285114dbedc8ea4136/kiwisolver-1.5.0-cp313-cp313-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:0c50b89ffd3e1a911c69a1dd3de7173c0cd10b130f56222e57898683841e4f96", size = 1278751, upload-time = "2026-03-09T13:13:54.673Z" }, + { url = "https://files.pythonhosted.org/packages/b8/97/b4c8d0d18421ecceba20ad8701358453b88e32414e6f6950b5a4bad54e65/kiwisolver-1.5.0-cp313-cp313-manylinux_2_24_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:4db576bb8c3ef9365f8b40fe0f671644de6736ae2c27a2c62d7d8a1b4329f099", size = 1296793, upload-time = "2026-03-09T13:13:56.287Z" }, + { url = "https://files.pythonhosted.org/packages/c4/10/f862f94b6389d8957448ec9df59450b81bec4abb318805375c401a1e6892/kiwisolver-1.5.0-cp313-cp313-manylinux_2_24_s390x.manylinux_2_28_s390x.whl", hash = "sha256:0b85aad90cea8ac6797a53b5d5f2e967334fa4d1149f031c4537569972596cb8", size = 1346041, upload-time = "2026-03-09T13:13:58.269Z" }, + { url = "https://files.pythonhosted.org/packages/a3/6a/f1650af35821eaf09de398ec0bc2aefc8f211f0cda50204c9f1673741ba9/kiwisolver-1.5.0-cp313-cp313-manylinux_2_39_riscv64.whl", hash = "sha256:d36ca54cb4c6c4686f7cbb7b817f66f5911c12ddb519450bbe86707155028f87", size = 987292, upload-time = "2026-03-09T13:13:59.871Z" }, + { url = 
"https://files.pythonhosted.org/packages/de/19/d7fb82984b9238115fe629c915007be608ebd23dc8629703d917dbfaffd4/kiwisolver-1.5.0-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:38f4a703656f493b0ad185211ccfca7f0386120f022066b018eb5296d8613e23", size = 2227865, upload-time = "2026-03-09T13:14:01.401Z" }, + { url = "https://files.pythonhosted.org/packages/7f/b9/46b7f386589fd222dac9e9de9c956ce5bcefe2ee73b4e79891381dda8654/kiwisolver-1.5.0-cp313-cp313-musllinux_1_2_ppc64le.whl", hash = "sha256:3ac2360e93cb41be81121755c6462cff3beaa9967188c866e5fce5cf13170859", size = 2324369, upload-time = "2026-03-09T13:14:02.972Z" }, + { url = "https://files.pythonhosted.org/packages/92/8b/95e237cf3d9c642960153c769ddcbe278f182c8affb20cecc1cc983e7cc5/kiwisolver-1.5.0-cp313-cp313-musllinux_1_2_riscv64.whl", hash = "sha256:c95cab08d1965db3d84a121f1c7ce7479bdd4072c9b3dafd8fecce48a2e6b902", size = 1977989, upload-time = "2026-03-09T13:14:04.503Z" }, + { url = "https://files.pythonhosted.org/packages/1b/95/980c9df53501892784997820136c01f62bc1865e31b82b9560f980c0e649/kiwisolver-1.5.0-cp313-cp313-musllinux_1_2_s390x.whl", hash = "sha256:fc20894c3d21194d8041a28b65622d5b86db786da6e3cfe73f0c762951a61167", size = 2491645, upload-time = "2026-03-09T13:14:06.106Z" }, + { url = "https://files.pythonhosted.org/packages/cb/32/900647fd0840abebe1561792c6b31e6a7c0e278fc3973d30572a965ca14c/kiwisolver-1.5.0-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:7a32f72973f0f950c1920475d5c5ea3d971b81b6f0ec53b8d0a956cc965f22e0", size = 2295237, upload-time = "2026-03-09T13:14:08.891Z" }, + { url = "https://files.pythonhosted.org/packages/be/8a/be60e3bbcf513cc5a50f4a3e88e1dcecebb79c1ad607a7222877becaa101/kiwisolver-1.5.0-cp313-cp313-win_amd64.whl", hash = "sha256:0bf3acf1419fa93064a4c2189ac0b58e3be7872bf6ee6177b0d4c63dc4cea276", size = 73573, upload-time = "2026-03-09T13:14:12.327Z" }, + { url = 
"https://files.pythonhosted.org/packages/4d/d2/64be2e429eb4fca7f7e1c52a91b12663aeaf25de3895e5cca0f47ef2a8d0/kiwisolver-1.5.0-cp313-cp313-win_arm64.whl", hash = "sha256:fa8eb9ecdb7efb0b226acec134e0d709e87a909fa4971a54c0c4f6e88635484c", size = 64998, upload-time = "2026-03-09T13:14:13.469Z" }, + { url = "https://files.pythonhosted.org/packages/b0/69/ce68dd0c85755ae2de490bf015b62f2cea5f6b14ff00a463f9d0774449ff/kiwisolver-1.5.0-cp313-cp313t-macosx_10_13_universal2.whl", hash = "sha256:db485b3847d182b908b483b2ed133c66d88d49cacf98fd278fadafe11b4478d1", size = 125700, upload-time = "2026-03-09T13:14:14.636Z" }, + { url = "https://files.pythonhosted.org/packages/74/aa/937aac021cf9d4349990d47eb319309a51355ed1dbdc9c077cdc9224cb11/kiwisolver-1.5.0-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:be12f931839a3bdfe28b584db0e640a65a8bcbc24560ae3fdb025a449b3d754e", size = 67537, upload-time = "2026-03-09T13:14:15.808Z" }, + { url = "https://files.pythonhosted.org/packages/ee/20/3a87fbece2c40ad0f6f0aefa93542559159c5f99831d596050e8afae7a9f/kiwisolver-1.5.0-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:16b85d37c2cbb3253226d26e64663f755d88a03439a9c47df6246b35defbdfb7", size = 65514, upload-time = "2026-03-09T13:14:18.035Z" }, + { url = "https://files.pythonhosted.org/packages/f0/7f/f943879cda9007c45e1f7dba216d705c3a18d6b35830e488b6c6a4e7cdf0/kiwisolver-1.5.0-cp313-cp313t-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:4432b835675f0ea7414aab3d37d119f7226d24869b7a829caeab49ebda407b0c", size = 1584848, upload-time = "2026-03-09T13:14:19.745Z" }, + { url = "https://files.pythonhosted.org/packages/37/f8/4d4f85cc1870c127c88d950913370dd76138482161cd07eabbc450deff01/kiwisolver-1.5.0-cp313-cp313t-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:1b0feb50971481a2cc44d94e88bdb02cdd497618252ae226b8eb1201b957e368", size = 1391542, upload-time = "2026-03-09T13:14:21.54Z" }, + { url = 
"https://files.pythonhosted.org/packages/04/0b/65dd2916c84d252b244bd405303220f729e7c17c9d7d33dca6feeff9ffc4/kiwisolver-1.5.0-cp313-cp313t-manylinux_2_24_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:56fa888f10d0f367155e76ce849fa1166fc9730d13bd2d65a2aa13b6f5424489", size = 1404447, upload-time = "2026-03-09T13:14:23.205Z" }, + { url = "https://files.pythonhosted.org/packages/39/5c/2606a373247babce9b1d056c03a04b65f3cf5290a8eac5d7bdead0a17e21/kiwisolver-1.5.0-cp313-cp313t-manylinux_2_24_s390x.manylinux_2_28_s390x.whl", hash = "sha256:940dda65d5e764406b9fb92761cbf462e4e63f712ab60ed98f70552e496f3bf1", size = 1455918, upload-time = "2026-03-09T13:14:24.74Z" }, + { url = "https://files.pythonhosted.org/packages/d5/d1/c6078b5756670658e9192a2ef11e939c92918833d2745f85cd14a6004bdf/kiwisolver-1.5.0-cp313-cp313t-manylinux_2_39_riscv64.whl", hash = "sha256:89fc958c702ee9a745e4700378f5d23fddbc46ff89e8fdbf5395c24d5c1452a3", size = 1072856, upload-time = "2026-03-09T13:14:26.597Z" }, + { url = "https://files.pythonhosted.org/packages/cb/c8/7def6ddf16eb2b3741d8b172bdaa9af882b03c78e9b0772975408801fa63/kiwisolver-1.5.0-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:9027d773c4ff81487181a925945743413f6069634d0b122d0b37684ccf4f1e18", size = 2333580, upload-time = "2026-03-09T13:14:28.237Z" }, + { url = "https://files.pythonhosted.org/packages/9e/87/2ac1fce0eb1e616fcd3c35caa23e665e9b1948bb984f4764790924594128/kiwisolver-1.5.0-cp313-cp313t-musllinux_1_2_ppc64le.whl", hash = "sha256:5b233ea3e165e43e35dba1d2b8ecc21cf070b45b65ae17dd2747d2713d942021", size = 2423018, upload-time = "2026-03-09T13:14:30.018Z" }, + { url = "https://files.pythonhosted.org/packages/67/13/c6700ccc6cc218716bfcda4935e4b2997039869b4ad8a94f364c5a3b8e63/kiwisolver-1.5.0-cp313-cp313t-musllinux_1_2_riscv64.whl", hash = "sha256:ce9bf03dad3b46408c08649c6fbd6ca28a9fce0eb32fdfffa6775a13103b5310", size = 2062804, upload-time = "2026-03-09T13:14:32.888Z" }, + { url = 
"https://files.pythonhosted.org/packages/1b/bd/877056304626943ff0f1f44c08f584300c199b887cb3176cd7e34f1515f1/kiwisolver-1.5.0-cp313-cp313t-musllinux_1_2_s390x.whl", hash = "sha256:fc4d3f1fb9ca0ae9f97b095963bc6326f1dbfd3779d6679a1e016b9baaa153d3", size = 2597482, upload-time = "2026-03-09T13:14:34.971Z" }, + { url = "https://files.pythonhosted.org/packages/75/19/c60626c47bf0f8ac5dcf72c6c98e266d714f2fbbfd50cf6dab5ede3aaa50/kiwisolver-1.5.0-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:f443b4825c50a51ee68585522ab4a1d1257fac65896f282b4c6763337ac9f5d2", size = 2394328, upload-time = "2026-03-09T13:14:36.816Z" }, + { url = "https://files.pythonhosted.org/packages/47/84/6a6d5e5bb8273756c27b7d810d47f7ef2f1f9b9fd23c9ee9a3f8c75c9cef/kiwisolver-1.5.0-cp313-cp313t-win_arm64.whl", hash = "sha256:893ff3a711d1b515ba9da14ee090519bad4610ed1962fbe298a434e8c5f8db53", size = 68410, upload-time = "2026-03-09T13:14:38.695Z" }, + { url = "https://files.pythonhosted.org/packages/e4/d7/060f45052f2a01ad5762c8fdecd6d7a752b43400dc29ff75cd47225a40fd/kiwisolver-1.5.0-cp314-cp314-macosx_10_15_universal2.whl", hash = "sha256:8df31fe574b8b3993cc61764f40941111b25c2d9fea13d3ce24a49907cd2d615", size = 123231, upload-time = "2026-03-09T13:14:41.323Z" }, + { url = "https://files.pythonhosted.org/packages/c2/a7/78da680eadd06ff35edef6ef68a1ad273bad3e2a0936c9a885103230aece/kiwisolver-1.5.0-cp314-cp314-macosx_10_15_x86_64.whl", hash = "sha256:1d49a49ac4cbfb7c1375301cd1ec90169dfeae55ff84710d782260ce77a75a02", size = 66489, upload-time = "2026-03-09T13:14:42.534Z" }, + { url = "https://files.pythonhosted.org/packages/49/b2/97980f3ad4fae37dd7fe31626e2bf75fbf8bdf5d303950ec1fab39a12da8/kiwisolver-1.5.0-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:0cbe94b69b819209a62cb27bdfa5dc2a8977d8de2f89dfd97ba4f53ed3af754e", size = 64063, upload-time = "2026-03-09T13:14:44.759Z" }, + { url = 
"https://files.pythonhosted.org/packages/e7/f9/b06c934a6aa8bc91f566bd2a214fd04c30506c2d9e2b6b171953216a65b6/kiwisolver-1.5.0-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:80aa065ffd378ff784822a6d7c3212f2d5f5e9c3589614b5c228b311fd3063ac", size = 1475913, upload-time = "2026-03-09T13:14:46.247Z" }, + { url = "https://files.pythonhosted.org/packages/6b/f0/f768ae564a710135630672981231320bc403cf9152b5596ec5289de0f106/kiwisolver-1.5.0-cp314-cp314-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:4e7f886f47ab881692f278ae901039a234e4025a68e6dfab514263a0b1c4ae05", size = 1282782, upload-time = "2026-03-09T13:14:48.458Z" }, + { url = "https://files.pythonhosted.org/packages/e2/9f/1de7aad00697325f05238a5f2eafbd487fb637cc27a558b5367a5f37fb7f/kiwisolver-1.5.0-cp314-cp314-manylinux_2_24_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:5060731cc3ed12ca3a8b57acd4aeca5bbc2f49216dd0bec1650a1acd89486bcd", size = 1300815, upload-time = "2026-03-09T13:14:50.721Z" }, + { url = "https://files.pythonhosted.org/packages/5a/c2/297f25141d2e468e0ce7f7a7b92e0cf8918143a0cbd3422c1ad627e85a06/kiwisolver-1.5.0-cp314-cp314-manylinux_2_24_s390x.manylinux_2_28_s390x.whl", hash = "sha256:7a4aa69609f40fce3cbc3f87b2061f042eee32f94b8f11db707b66a26461591a", size = 1347925, upload-time = "2026-03-09T13:14:52.304Z" }, + { url = "https://files.pythonhosted.org/packages/b9/d3/f4c73a02eb41520c47610207b21afa8cdd18fdbf64ffd94674ae21c4812d/kiwisolver-1.5.0-cp314-cp314-manylinux_2_39_riscv64.whl", hash = "sha256:d168fda2dbff7b9b5f38e693182d792a938c31db4dac3a80a4888de603c99554", size = 991322, upload-time = "2026-03-09T13:14:54.637Z" }, + { url = "https://files.pythonhosted.org/packages/7b/46/d3f2efef7732fcda98d22bf4ad5d3d71d545167a852ca710a494f4c15343/kiwisolver-1.5.0-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:413b820229730d358efd838ecbab79902fe97094565fdc80ddb6b0a18c18a581", size = 2232857, upload-time = "2026-03-09T13:14:56.471Z" }, + { url = 
"https://files.pythonhosted.org/packages/3f/ec/2d9756bf2b6d26ae4349b8d3662fb3993f16d80c1f971c179ce862b9dbae/kiwisolver-1.5.0-cp314-cp314-musllinux_1_2_ppc64le.whl", hash = "sha256:5124d1ea754509b09e53738ec185584cc609aae4a3b510aaf4ed6aa047ef9303", size = 2329376, upload-time = "2026-03-09T13:14:58.072Z" }, + { url = "https://files.pythonhosted.org/packages/8f/9f/876a0a0f2260f1bde92e002b3019a5fabc35e0939c7d945e0fa66185eb20/kiwisolver-1.5.0-cp314-cp314-musllinux_1_2_riscv64.whl", hash = "sha256:e4415a8db000bf49a6dd1c478bf70062eaacff0f462b92b0ba68791a905861f9", size = 1982549, upload-time = "2026-03-09T13:14:59.668Z" }, + { url = "https://files.pythonhosted.org/packages/6c/4f/ba3624dfac23a64d54ac4179832860cb537c1b0af06024936e82ca4154a0/kiwisolver-1.5.0-cp314-cp314-musllinux_1_2_s390x.whl", hash = "sha256:d618fd27420381a4f6044faa71f46d8bfd911bd077c555f7138ed88729bfbe79", size = 2494680, upload-time = "2026-03-09T13:15:01.364Z" }, + { url = "https://files.pythonhosted.org/packages/39/b7/97716b190ab98911b20d10bf92eca469121ec483b8ce0edd314f51bc85af/kiwisolver-1.5.0-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:5092eb5b1172947f57d6ea7d89b2f29650414e4293c47707eb499ec07a0ac796", size = 2297905, upload-time = "2026-03-09T13:15:03.925Z" }, + { url = "https://files.pythonhosted.org/packages/a3/36/4e551e8aa55c9188bca9abb5096805edbf7431072b76e2298e34fd3a3008/kiwisolver-1.5.0-cp314-cp314-win_amd64.whl", hash = "sha256:d76e2d8c75051d58177e762164d2e9ab92886534e3a12e795f103524f221dd8e", size = 75086, upload-time = "2026-03-09T13:15:07.775Z" }, + { url = "https://files.pythonhosted.org/packages/70/15/9b90f7df0e31a003c71649cf66ef61c3c1b862f48c81007fa2383c8bd8d7/kiwisolver-1.5.0-cp314-cp314-win_arm64.whl", hash = "sha256:fa6248cd194edff41d7ea9425ced8ca3a6f838bfb295f6f1d6e6bb694a8518df", size = 66577, upload-time = "2026-03-09T13:15:09.139Z" }, + { url = 
"https://files.pythonhosted.org/packages/17/01/7dc8c5443ff42b38e72731643ed7cf1ed9bf01691ae5cdca98501999ed83/kiwisolver-1.5.0-cp314-cp314t-macosx_10_15_universal2.whl", hash = "sha256:d1ffeb80b5676463d7a7d56acbe8e37a20ce725570e09549fe738e02ca6b7e1e", size = 125794, upload-time = "2026-03-09T13:15:10.525Z" }, + { url = "https://files.pythonhosted.org/packages/46/8a/b4ebe46ebaac6a303417fab10c2e165c557ddaff558f9699d302b256bc53/kiwisolver-1.5.0-cp314-cp314t-macosx_10_15_x86_64.whl", hash = "sha256:bc4d8e252f532ab46a1de9349e2d27b91fce46736a9eedaa37beaca66f574ed4", size = 67646, upload-time = "2026-03-09T13:15:12.016Z" }, + { url = "https://files.pythonhosted.org/packages/60/35/10a844afc5f19d6f567359bf4789e26661755a2f36200d5d1ed8ad0126e5/kiwisolver-1.5.0-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:6783e069732715ad0c3ce96dbf21dbc2235ab0593f2baf6338101f70371f4028", size = 65511, upload-time = "2026-03-09T13:15:13.311Z" }, + { url = "https://files.pythonhosted.org/packages/f8/8a/685b297052dd041dcebce8e8787b58923b6e78acc6115a0dc9189011c44b/kiwisolver-1.5.0-cp314-cp314t-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:e7c4c09a490dc4d4a7f8cbee56c606a320f9dc28cf92a7157a39d1ce7676a657", size = 1584858, upload-time = "2026-03-09T13:15:15.103Z" }, + { url = "https://files.pythonhosted.org/packages/9e/80/04865e3d4638ac5bddec28908916df4a3075b8c6cc101786a96803188b96/kiwisolver-1.5.0-cp314-cp314t-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:2a075bd7bd19c70cf67c8badfa36cf7c5d8de3c9ddb8420c51e10d9c50e94920", size = 1392539, upload-time = "2026-03-09T13:15:16.661Z" }, + { url = "https://files.pythonhosted.org/packages/ba/01/77a19cacc0893fa13fafa46d1bba06fb4dc2360b3292baf4b56d8e067b24/kiwisolver-1.5.0-cp314-cp314t-manylinux_2_24_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:bdd3e53429ff02aa319ba59dfe4ceeec345bf46cf180ec2cf6fd5b942e7975e9", size = 1405310, upload-time = "2026-03-09T13:15:18.229Z" }, + { url = 
"https://files.pythonhosted.org/packages/53/39/bcaf5d0cca50e604cfa9b4e3ae1d64b50ca1ae5b754122396084599ef903/kiwisolver-1.5.0-cp314-cp314t-manylinux_2_24_s390x.manylinux_2_28_s390x.whl", hash = "sha256:3cdcb35dc9d807259c981a85531048ede628eabcffb3239adf3d17463518992d", size = 1456244, upload-time = "2026-03-09T13:15:20.444Z" }, + { url = "https://files.pythonhosted.org/packages/d0/7a/72c187abc6975f6978c3e39b7cf67aeb8b3c0a8f9790aa7fd412855e9e1f/kiwisolver-1.5.0-cp314-cp314t-manylinux_2_39_riscv64.whl", hash = "sha256:70d593af6a6ca332d1df73d519fddb5148edb15cd90d5f0155e3746a6d4fcc65", size = 1073154, upload-time = "2026-03-09T13:15:22.039Z" }, + { url = "https://files.pythonhosted.org/packages/c7/ca/cf5b25783ebbd59143b4371ed0c8428a278abe68d6d0104b01865b1bbd0f/kiwisolver-1.5.0-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:377815a8616074cabbf3f53354e1d040c35815a134e01d7614b7692e4bf8acfa", size = 2334377, upload-time = "2026-03-09T13:15:23.741Z" }, + { url = "https://files.pythonhosted.org/packages/4a/e5/b1f492adc516796e88751282276745340e2a72dcd0d36cf7173e0daf3210/kiwisolver-1.5.0-cp314-cp314t-musllinux_1_2_ppc64le.whl", hash = "sha256:0255a027391d52944eae1dbb5d4cc5903f57092f3674e8e544cdd2622826b3f0", size = 2425288, upload-time = "2026-03-09T13:15:25.789Z" }, + { url = "https://files.pythonhosted.org/packages/e6/e5/9b21fbe91a61b8f409d74a26498706e97a48008bfcd1864373d32a6ba31c/kiwisolver-1.5.0-cp314-cp314t-musllinux_1_2_riscv64.whl", hash = "sha256:012b1eb16e28718fa782b5e61dc6f2da1f0792ca73bd05d54de6cb9561665fc9", size = 2063158, upload-time = "2026-03-09T13:15:27.63Z" }, + { url = "https://files.pythonhosted.org/packages/b1/02/83f47986138310f95ea95531f851b2a62227c11cbc3e690ae1374fe49f0f/kiwisolver-1.5.0-cp314-cp314t-musllinux_1_2_s390x.whl", hash = "sha256:0e3aafb33aed7479377e5e9a82e9d4bf87063741fc99fc7ae48b0f16e32bdd6f", size = 2597260, upload-time = "2026-03-09T13:15:29.421Z" }, + { url = 
"https://files.pythonhosted.org/packages/07/18/43a5f24608d8c313dd189cf838c8e68d75b115567c6279de7796197cfb6a/kiwisolver-1.5.0-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:e7a116ae737f0000343218c4edf5bd45893bfeaff0993c0b215d7124c9f77646", size = 2394403, upload-time = "2026-03-09T13:15:31.517Z" }, + { url = "https://files.pythonhosted.org/packages/3b/b5/98222136d839b8afabcaa943b09bd05888c2d36355b7e448550211d1fca4/kiwisolver-1.5.0-cp314-cp314t-win_amd64.whl", hash = "sha256:1dd9b0b119a350976a6d781e7278ec7aca0b201e1a9e2d23d9804afecb6ca681", size = 79687, upload-time = "2026-03-09T13:15:33.204Z" }, + { url = "https://files.pythonhosted.org/packages/99/a2/ca7dc962848040befed12732dff6acae7fb3c4f6fc4272b3f6c9a30b8713/kiwisolver-1.5.0-cp314-cp314t-win_arm64.whl", hash = "sha256:58f812017cd2985c21fbffb4864d59174d4903dd66fa23815e74bbc7a0e2dd57", size = 70032, upload-time = "2026-03-09T13:15:34.411Z" }, + { url = "https://files.pythonhosted.org/packages/1c/fa/2910df836372d8761bb6eff7d8bdcb1613b5c2e03f260efe7abe34d388a7/kiwisolver-1.5.0-graalpy312-graalpy250_312_native-macosx_10_13_x86_64.whl", hash = "sha256:5ae8e62c147495b01a0f4765c878e9bfdf843412446a247e28df59936e99e797", size = 130262, upload-time = "2026-03-09T13:15:35.629Z" }, + { url = "https://files.pythonhosted.org/packages/0f/41/c5f71f9f00aabcc71fee8b7475e3f64747282580c2fe748961ba29b18385/kiwisolver-1.5.0-graalpy312-graalpy250_312_native-macosx_11_0_arm64.whl", hash = "sha256:f6764a4ccab3078db14a632420930f6186058750df066b8ea2a7106df91d3203", size = 138036, upload-time = "2026-03-09T13:15:36.894Z" }, + { url = "https://files.pythonhosted.org/packages/fa/06/7399a607f434119c6e1fdc8ec89a8d51ccccadf3341dee4ead6bd14caaf5/kiwisolver-1.5.0-graalpy312-graalpy250_312_native-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:c31c13da98624f957b0fb1b5bae5383b2333c2c3f6793d9825dd5ce79b525cb7", size = 194295, upload-time = "2026-03-09T13:15:38.22Z" }, + { url = 
"https://files.pythonhosted.org/packages/b5/91/53255615acd2a1eaca307ede3c90eb550bae9c94581f8c00081b6b1c8f44/kiwisolver-1.5.0-graalpy312-graalpy250_312_native-win_amd64.whl", hash = "sha256:1f1489f769582498610e015a8ef2d36f28f505ab3096d0e16b4858a9ec214f57", size = 75987, upload-time = "2026-03-09T13:15:39.65Z" }, + { url = "https://files.pythonhosted.org/packages/17/6f/6fd4f690a40c2582fa34b97d2678f718acf3706b91d270c65ecb455d0a06/kiwisolver-1.5.0-pp310-pypy310_pp73-macosx_10_15_x86_64.whl", hash = "sha256:295d9ffe712caa9f8a3081de8d32fc60191b4b51c76f02f951fd8407253528f4", size = 59606, upload-time = "2026-03-09T13:15:40.81Z" }, + { url = "https://files.pythonhosted.org/packages/82/a0/2355d5e3b338f13ce63f361abb181e3b6ea5fffdb73f739b3e80efa76159/kiwisolver-1.5.0-pp310-pypy310_pp73-macosx_11_0_arm64.whl", hash = "sha256:51e8c4084897de9f05898c2c2a39af6318044ae969d46ff7a34ed3f96274adca", size = 57537, upload-time = "2026-03-09T13:15:42.071Z" }, + { url = "https://files.pythonhosted.org/packages/c8/b9/1d50e610ecadebe205b71d6728fd224ce0e0ca6aba7b9cbe1da049203ac5/kiwisolver-1.5.0-pp310-pypy310_pp73-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:b83af57bdddef03c01a9138034c6ff03181a3028d9a1003b301eb1a55e161a3f", size = 79888, upload-time = "2026-03-09T13:15:43.317Z" }, + { url = "https://files.pythonhosted.org/packages/cd/ee/b85ffcd75afed0357d74f0e6fc02a4507da441165de1ca4760b9f496390d/kiwisolver-1.5.0-pp310-pypy310_pp73-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:bf4679a3d71012a7c2bf360e5cd878fbd5e4fcac0896b56393dec239d81529ed", size = 77584, upload-time = "2026-03-09T13:15:44.605Z" }, + { url = "https://files.pythonhosted.org/packages/6b/dd/644d0dde6010a8583b4cd66dd41c5f83f5325464d15c4f490b3340ab73b4/kiwisolver-1.5.0-pp310-pypy310_pp73-win_amd64.whl", hash = "sha256:41024ed50e44ab1a60d3fe0a9d15a4ccc9f5f2b1d814ff283c8d01134d5b81bc", size = 73390, upload-time = "2026-03-09T13:15:45.832Z" }, + { url = 
"https://files.pythonhosted.org/packages/e9/eb/5fcbbbf9a0e2c3a35effb88831a483345326bbc3a030a3b5b69aee647f84/kiwisolver-1.5.0-pp311-pypy311_pp73-macosx_10_15_x86_64.whl", hash = "sha256:ec4c85dc4b687c7f7f15f553ff26a98bfe8c58f5f7f0ac8905f0ba4c7be60232", size = 59532, upload-time = "2026-03-09T13:15:47.047Z" }, + { url = "https://files.pythonhosted.org/packages/c3/9b/e17104555bb4db148fd52327feea1e96be4b88e8e008b029002c281a21ab/kiwisolver-1.5.0-pp311-pypy311_pp73-macosx_11_0_arm64.whl", hash = "sha256:12e91c215a96e39f57989c8912ae761286ac5a9584d04030ceb3368a357f017a", size = 57420, upload-time = "2026-03-09T13:15:48.199Z" }, + { url = "https://files.pythonhosted.org/packages/48/44/2b5b95b7aa39fb2d8d9d956e0f3d5d45aef2ae1d942d4c3ffac2f9cfed1a/kiwisolver-1.5.0-pp311-pypy311_pp73-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:be4a51a55833dc29ab5d7503e7bcb3b3af3402d266018137127450005cdfe737", size = 79892, upload-time = "2026-03-09T13:15:49.694Z" }, + { url = "https://files.pythonhosted.org/packages/52/7d/7157f9bba6b455cfb4632ed411e199fc8b8977642c2b12082e1bd9e6d173/kiwisolver-1.5.0-pp311-pypy311_pp73-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:daae526907e262de627d8f70058a0f64acc9e2641c164c99c8f594b34a799a16", size = 77603, upload-time = "2026-03-09T13:15:50.945Z" }, + { url = "https://files.pythonhosted.org/packages/0a/dd/8050c947d435c8d4bc94e3252f4d8bb8a76cfb424f043a8680be637a57f1/kiwisolver-1.5.0-pp311-pypy311_pp73-win_amd64.whl", hash = "sha256:59cd8683f575d96df5bb48f6add94afc055012c29e28124fcae2b63661b9efb1", size = 73558, upload-time = "2026-03-09T13:15:52.112Z" }, +] + [[package]] name = "langchain" version = "1.2.0" @@ -2561,6 +3112,61 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/90/ac/e911594a2f10445717ea45b61b3a93f3bb91594320745fe1bb796c2dc87a/llama_index_workflows-2.11.5-py3-none-any.whl", hash = "sha256:3c5a419129114bb0b1bd83b88aa5f653f84181b2e39e33473e8747ec6e88538e", size = 91982, upload-time = 
"2025-11-24T18:37:58.265Z" }, ] +[[package]] +name = "llvmlite" +version = "0.46.0" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/74/cd/08ae687ba099c7e3d21fe2ea536500563ef1943c5105bf6ab4ee3829f68e/llvmlite-0.46.0.tar.gz", hash = "sha256:227c9fd6d09dce2783c18b754b7cd9d9b3b3515210c46acc2d3c5badd9870ceb", size = 193456, upload-time = "2025-12-08T18:15:36.295Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/3d/a4/3959e1c61c5ca9db7921e5fd115b344c29b9d57a5dadd87bef97963ca1a5/llvmlite-0.46.0-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:4323177e936d61ae0f73e653e2e614284d97d14d5dd12579adc92b6c2b0597b0", size = 37232766, upload-time = "2025-12-08T18:14:34.765Z" }, + { url = "https://files.pythonhosted.org/packages/c2/a5/a4d916f1015106e1da876028606a8e87fd5d5c840f98c87bc2d5153b6a2f/llvmlite-0.46.0-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:0a2d461cb89537b7c20feb04c46c32e12d5ad4f0896c9dfc0f60336219ff248e", size = 56275176, upload-time = "2025-12-08T18:14:37.944Z" }, + { url = "https://files.pythonhosted.org/packages/79/7f/a7f2028805dac8c1a6fae7bda4e739b7ebbcd45b29e15bf6d21556fcd3d5/llvmlite-0.46.0-cp310-cp310-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:b1f6595a35b7b39c3518b85a28bf18f45e075264e4b2dce3f0c2a4f232b4a910", size = 55128629, upload-time = "2025-12-08T18:14:41.674Z" }, + { url = "https://files.pythonhosted.org/packages/b2/bc/4689e1ba0c073c196b594471eb21be0aa51d9e64b911728aa13cd85ef0ae/llvmlite-0.46.0-cp310-cp310-win_amd64.whl", hash = "sha256:e7a34d4aa6f9a97ee006b504be6d2b8cb7f755b80ab2f344dda1ef992f828559", size = 38138651, upload-time = "2025-12-08T18:14:45.845Z" }, + { url = "https://files.pythonhosted.org/packages/7a/a1/2ad4b2367915faeebe8447f0a057861f646dbf5fbbb3561db42c65659cf3/llvmlite-0.46.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:82f3d39b16f19aa1a56d5fe625883a6ab600d5cc9ea8906cca70ce94cabba067", size = 
37232766, upload-time = "2025-12-08T18:14:48.836Z" }, + { url = "https://files.pythonhosted.org/packages/12/b5/99cf8772fdd846c07da4fd70f07812a3c8fd17ea2409522c946bb0f2b277/llvmlite-0.46.0-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:a3df43900119803bbc52720e758c76f316a9a0f34612a886862dfe0a5591a17e", size = 56275175, upload-time = "2025-12-08T18:14:51.604Z" }, + { url = "https://files.pythonhosted.org/packages/38/f2/ed806f9c003563732da156139c45d970ee435bd0bfa5ed8de87ba972b452/llvmlite-0.46.0-cp311-cp311-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:de183fefc8022d21b0aa37fc3e90410bc3524aed8617f0ff76732fc6c3af5361", size = 55128630, upload-time = "2025-12-08T18:14:55.107Z" }, + { url = "https://files.pythonhosted.org/packages/19/0c/8f5a37a65fc9b7b17408508145edd5f86263ad69c19d3574e818f533a0eb/llvmlite-0.46.0-cp311-cp311-win_amd64.whl", hash = "sha256:e8b10bc585c58bdffec9e0c309bb7d51be1f2f15e169a4b4d42f2389e431eb93", size = 38138652, upload-time = "2025-12-08T18:14:58.171Z" }, + { url = "https://files.pythonhosted.org/packages/2b/f8/4db016a5e547d4e054ff2f3b99203d63a497465f81ab78ec8eb2ff7b2304/llvmlite-0.46.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:6b9588ad4c63b4f0175a3984b85494f0c927c6b001e3a246a3a7fb3920d9a137", size = 37232767, upload-time = "2025-12-08T18:15:00.737Z" }, + { url = "https://files.pythonhosted.org/packages/aa/85/4890a7c14b4fa54400945cb52ac3cd88545bbdb973c440f98ca41591cdc5/llvmlite-0.46.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:3535bd2bb6a2d7ae4012681ac228e5132cdb75fefb1bcb24e33f2f3e0c865ed4", size = 56275176, upload-time = "2025-12-08T18:15:03.936Z" }, + { url = "https://files.pythonhosted.org/packages/6a/07/3d31d39c1a1a08cd5337e78299fca77e6aebc07c059fbd0033e3edfab45c/llvmlite-0.46.0-cp312-cp312-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:4cbfd366e60ff87ea6cc62f50bc4cd800ebb13ed4c149466f50cf2163a473d1e", size = 55128630, upload-time = 
"2025-12-08T18:15:07.196Z" }, + { url = "https://files.pythonhosted.org/packages/2a/6b/d139535d7590a1bba1ceb68751bef22fadaa5b815bbdf0e858e3875726b2/llvmlite-0.46.0-cp312-cp312-win_amd64.whl", hash = "sha256:398b39db462c39563a97b912d4f2866cd37cba60537975a09679b28fbbc0fb38", size = 38138940, upload-time = "2025-12-08T18:15:10.162Z" }, + { url = "https://files.pythonhosted.org/packages/e6/ff/3eba7eb0aed4b6fca37125387cd417e8c458e750621fce56d2c541f67fa8/llvmlite-0.46.0-cp313-cp313-macosx_12_0_arm64.whl", hash = "sha256:30b60892d034bc560e0ec6654737aaa74e5ca327bd8114d82136aa071d611172", size = 37232767, upload-time = "2025-12-08T18:15:13.22Z" }, + { url = "https://files.pythonhosted.org/packages/0e/54/737755c0a91558364b9200702c3c9c15d70ed63f9b98a2c32f1c2aa1f3ba/llvmlite-0.46.0-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:6cc19b051753368a9c9f31dc041299059ee91aceec81bd57b0e385e5d5bf1a54", size = 56275176, upload-time = "2025-12-08T18:15:16.339Z" }, + { url = "https://files.pythonhosted.org/packages/e6/91/14f32e1d70905c1c0aa4e6609ab5d705c3183116ca02ac6df2091868413a/llvmlite-0.46.0-cp313-cp313-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:bca185892908f9ede48c0acd547fe4dc1bafefb8a4967d47db6cf664f9332d12", size = 55128629, upload-time = "2025-12-08T18:15:19.493Z" }, + { url = "https://files.pythonhosted.org/packages/4a/a7/d526ae86708cea531935ae777b6dbcabe7db52718e6401e0fb9c5edea80e/llvmlite-0.46.0-cp313-cp313-win_amd64.whl", hash = "sha256:67438fd30e12349ebb054d86a5a1a57fd5e87d264d2451bcfafbbbaa25b82a35", size = 38138941, upload-time = "2025-12-08T18:15:22.536Z" }, + { url = "https://files.pythonhosted.org/packages/95/ae/af0ffb724814cc2ea64445acad05f71cff5f799bb7efb22e47ee99340dbc/llvmlite-0.46.0-cp314-cp314-macosx_12_0_arm64.whl", hash = "sha256:d252edfb9f4ac1fcf20652258e3f102b26b03eef738dc8a6ffdab7d7d341d547", size = 37232768, upload-time = "2025-12-08T18:15:25.055Z" }, + { url = 
"https://files.pythonhosted.org/packages/c9/19/5018e5352019be753b7b07f7759cdabb69ca5779fea2494be8839270df4c/llvmlite-0.46.0-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:379fdd1c59badeff8982cb47e4694a6143bec3bb49aa10a466e095410522064d", size = 56275173, upload-time = "2025-12-08T18:15:28.109Z" }, + { url = "https://files.pythonhosted.org/packages/9f/c9/d57877759d707e84c082163c543853245f91b70c804115a5010532890f18/llvmlite-0.46.0-cp314-cp314-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:2e8cbfff7f6db0fa2c771ad24154e2a7e457c2444d7673e6de06b8b698c3b269", size = 55128628, upload-time = "2025-12-08T18:15:31.098Z" }, + { url = "https://files.pythonhosted.org/packages/30/a8/e61a8c2b3cc7a597073d9cde1fcbb567e9d827f1db30c93cf80422eac70d/llvmlite-0.46.0-cp314-cp314-win_amd64.whl", hash = "sha256:7821eda3ec1f18050f981819756631d60b6d7ab1a6cf806d9efefbe3f4082d61", size = 39153056, upload-time = "2025-12-08T18:15:33.938Z" }, +] + +[[package]] +name = "lm-eval" +version = "0.4.9.1" +source = { git = "https://github.com/arubique/lm-evaluation-harness.git?rev=main#bef38e00bfd084f5050f3fc09278eefd6d3ad376" } +dependencies = [ + { name = "accelerate" }, + { name = "datasets" }, + { name = "dill" }, + { name = "evaluate" }, + { name = "jsonlines" }, + { name = "more-itertools" }, + { name = "numexpr" }, + { name = "peft" }, + { name = "pybind11" }, + { name = "pytablewriter" }, + { name = "rouge-score" }, + { name = "sacrebleu" }, + { name = "scikit-learn", version = "1.7.2", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.11'" }, + { name = "scikit-learn", version = "1.8.0", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.11'" }, + { name = "sqlitedict" }, + { name = "torch" }, + { name = "tqdm-multiprocess" }, + { name = "transformers" }, + { name = "word2number" }, + { name = "zstandard" }, +] + [[package]] name = "lxml" version = "6.0.2" @@ -2841,20 
+3447,28 @@ dependencies = [ [package.optional-dependencies] all = [ + { name = "accelerate" }, { name = "addict" }, + { name = "aiohttp" }, { name = "anthropic" }, { name = "arxiv" }, { name = "beartype" }, { name = "beautifulsoup4" }, { name = "camel-ai" }, + { name = "click" }, { name = "colorlog" }, { name = "datasets" }, { name = "docstring-parser" }, { name = "flask" }, + { name = "gdown" }, { name = "google-genai" }, + { name = "h5py" }, { name = "ipykernel" }, + { name = "ipython", version = "8.37.0", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.11'" }, + { name = "ipython", version = "9.8.0", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.11'" }, { name = "ipywidgets" }, { name = "javascript" }, + { name = "jsonlines" }, { name = "jupyter" }, { name = "keybert" }, { name = "langchain" }, @@ -2864,6 +3478,8 @@ all = [ { name = "levenshtein" }, { name = "litellm" }, { name = "llama-index-core" }, + { name = "lm-eval" }, + { name = "matplotlib" }, { name = "mcp" }, { name = "meta-agents-research-environments" }, { name = "names" }, @@ -2877,11 +3493,19 @@ all = [ { name = "ruamel-yaml" }, { name = "scikit-learn", version = "1.7.2", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.11'" }, { name = "scikit-learn", version = "1.8.0", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.11'" }, + { name = "scipy", version = "1.15.3", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.11'" }, + { name = "scipy", version = "1.17.0", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.11'" }, { name = "semanticscholar" }, { name = "sentence-transformers" }, { name = "smolagents" }, + { name = "stnd" }, + { name = "tiktoken" }, + { name = "torch" }, + { name = "torchvision" }, { name = "transformers" }, + { name = "typer" }, { name = 
"typing-extensions" }, + { name = "umap-learn" }, { name = "waitress" }, { name = "wandb" }, ] @@ -2891,15 +3515,46 @@ anthropic = [ camel = [ { name = "camel-ai" }, ] +disco = [ + { name = "aiohttp" }, + { name = "click" }, + { name = "datasets" }, + { name = "gdown" }, + { name = "h5py" }, + { name = "ipython", version = "8.37.0", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.11'" }, + { name = "ipython", version = "9.8.0", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.11'" }, + { name = "jsonlines" }, + { name = "lm-eval" }, + { name = "matplotlib" }, + { name = "scikit-learn", version = "1.7.2", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.11'" }, + { name = "scikit-learn", version = "1.8.0", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.11'" }, + { name = "scipy", version = "1.15.3", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.11'" }, + { name = "scipy", version = "1.17.0", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.11'" }, + { name = "stnd" }, + { name = "tiktoken" }, + { name = "torch" }, + { name = "torchvision" }, + { name = "transformers" }, + { name = "typer" }, + { name = "umap-learn" }, +] examples = [ + { name = "accelerate" }, { name = "addict" }, + { name = "aiohttp" }, { name = "anthropic" }, { name = "camel-ai" }, + { name = "click" }, { name = "datasets" }, { name = "docstring-parser" }, + { name = "gdown" }, { name = "google-genai" }, + { name = "h5py" }, { name = "ipykernel" }, + { name = "ipython", version = "8.37.0", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.11'" }, + { name = "ipython", version = "9.8.0", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.11'" }, { name = "ipywidgets" }, + { name = "jsonlines" }, { 
name = "jupyter" }, { name = "langchain" }, { name = "langchain-google-genai" }, @@ -2907,12 +3562,25 @@ examples = [ { name = "langgraph" }, { name = "litellm" }, { name = "llama-index-core" }, + { name = "lm-eval" }, + { name = "matplotlib" }, { name = "mcp" }, { name = "meta-agents-research-environments" }, { name = "openai" }, { name = "python-dotenv" }, + { name = "scikit-learn", version = "1.7.2", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.11'" }, + { name = "scikit-learn", version = "1.8.0", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.11'" }, + { name = "scipy", version = "1.15.3", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.11'" }, + { name = "scipy", version = "1.17.0", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.11'" }, { name = "smolagents" }, + { name = "stnd" }, + { name = "tiktoken" }, + { name = "torch" }, + { name = "torchvision" }, + { name = "transformers" }, + { name = "typer" }, { name = "typing-extensions" }, + { name = "umap-learn" }, ] gaia2 = [ { name = "datasets" }, @@ -2933,6 +3601,17 @@ litellm = [ llamaindex = [ { name = "llama-index-core" }, ] +lm-eval = [ + { name = "aiohttp" }, + { name = "lm-eval" }, + { name = "transformers" }, +] +mmlu = [ + { name = "numpy", version = "2.2.6", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.11'" }, + { name = "numpy", version = "2.4.0", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.11'" }, + { name = "torch" }, + { name = "transformers" }, +] multiagentbench = [ { name = "arxiv" }, { name = "beartype" }, @@ -2994,21 +3673,30 @@ docs = [ [package.metadata] requires-dist = [ + { name = "accelerate", marker = "extra == 'examples'", specifier = ">=1.11.0" }, { name = "addict", marker = "extra == 'tau2'", specifier = ">=2.4.0" }, + { name = "aiohttp", 
marker = "extra == 'disco'", specifier = ">=3.9.0" }, + { name = "aiohttp", marker = "extra == 'lm-eval'", specifier = ">=3.9.0" }, { name = "anthropic", marker = "extra == 'anthropic'", specifier = ">=0.40.0" }, { name = "arxiv", marker = "extra == 'multiagentbench'", specifier = ">=2.1.0" }, { name = "beartype", marker = "extra == 'multiagentbench'" }, { name = "beautifulsoup4", marker = "extra == 'multiagentbench'", specifier = ">=4.12.0" }, { name = "camel-ai", marker = "extra == 'camel'", specifier = ">=0.2.0" }, + { name = "click", marker = "extra == 'disco'", specifier = ">=8.1.0" }, { name = "colorlog", marker = "extra == 'multiagentbench'", specifier = ">=6.0.0" }, + { name = "datasets", marker = "extra == 'disco'", specifier = "==4.4.1" }, { name = "datasets", marker = "extra == 'gaia2'", specifier = ">=3.0.0" }, { name = "docstring-parser", marker = "extra == 'tau2'", specifier = ">=0.16" }, { name = "flask", marker = "extra == 'multiagentbench'", specifier = ">=3.0.0" }, + { name = "gdown", marker = "extra == 'disco'", specifier = ">=4.6.0" }, { name = "gitpython", specifier = ">=3.1.0" }, { name = "google-genai", marker = "extra == 'google-genai'", specifier = ">=1.37.0" }, + { name = "h5py", marker = "extra == 'disco'", specifier = ">=3.0.0" }, { name = "ipykernel", marker = "extra == 'examples'", specifier = ">=6.0.0" }, + { name = "ipython", marker = "extra == 'disco'", specifier = ">=8.0.0" }, { name = "ipywidgets", marker = "extra == 'examples'", specifier = ">=8.0.0" }, { name = "javascript", marker = "extra == 'multiagentbench'", specifier = ">=1!1.2.0" }, + { name = "jsonlines", marker = "extra == 'disco'", specifier = ">=4.0.0" }, { name = "jupyter", marker = "extra == 'examples'", specifier = ">=1.0.0" }, { name = "keybert", marker = "extra == 'multiagentbench'", specifier = ">=0.8.0" }, { name = "langchain", marker = "extra == 'examples'", specifier = ">=0.3.27" }, @@ -3019,11 +3707,15 @@ requires-dist = [ { name = "litellm", marker = "extra 
== 'litellm'", specifier = ">=1.0.0" }, { name = "litellm", marker = "extra == 'multiagentbench'", specifier = ">=1.0.0" }, { name = "llama-index-core", marker = "extra == 'llamaindex'", specifier = ">=0.12.0" }, + { name = "lm-eval", marker = "extra == 'disco'", git = "https://github.com/arubique/lm-evaluation-harness.git?rev=main" }, + { name = "lm-eval", marker = "extra == 'lm-eval'", git = "https://github.com/arubique/lm-evaluation-harness.git?rev=main" }, { name = "maseval", extras = ["examples", "transformers", "wandb", "multiagentbench"], marker = "extra == 'all'" }, - { name = "maseval", extras = ["smolagents", "langgraph", "llamaindex", "camel", "anthropic", "openai", "google-genai", "litellm", "langfuse", "gaia2", "macs", "tau2"], marker = "extra == 'examples'" }, + { name = "maseval", extras = ["smolagents", "langgraph", "llamaindex", "camel", "anthropic", "openai", "google-genai", "litellm", "langfuse", "gaia2", "macs", "tau2", "disco"], marker = "extra == 'examples'" }, + { name = "matplotlib", marker = "extra == 'disco'", specifier = ">=3.5.0" }, { name = "mcp", marker = "extra == 'examples'", specifier = ">=1.22.0" }, { name = "meta-agents-research-environments", marker = "extra == 'gaia2'", specifier = ">=1.2.0" }, { name = "names", marker = "extra == 'multiagentbench'", specifier = ">=0.3.0" }, + { name = "numpy", marker = "extra == 'mmlu'", specifier = ">=1.20.0" }, { name = "openai", marker = "extra == 'openai'", specifier = ">=1.107.2" }, { name = "psycopg2-binary", marker = "extra == 'multiagentbench'", specifier = ">=2.9.0" }, { name = "pydantic", specifier = ">=2.10.6" }, @@ -3034,17 +3726,29 @@ requires-dist = [ { name = "requests", marker = "extra == 'multiagentbench'", specifier = ">=2.28.0" }, { name = "rich", specifier = ">=14.1.0" }, { name = "ruamel-yaml", marker = "extra == 'multiagentbench'", specifier = ">=0.17.0" }, + { name = "scikit-learn", marker = "extra == 'disco'", specifier = ">=1.7.2" }, { name = "scikit-learn", marker = 
"extra == 'multiagentbench'", specifier = ">=1.3.0" }, + { name = "scipy", marker = "extra == 'disco'", specifier = ">=1.11.0" }, { name = "semanticscholar", marker = "extra == 'multiagentbench'", specifier = ">=0.8.0" }, { name = "sentence-transformers", marker = "extra == 'multiagentbench'", specifier = ">=2.3.0" }, { name = "smolagents", marker = "extra == 'smolagents'", specifier = ">=1.21.3" }, + { name = "stnd", marker = "extra == 'disco'", git = "https://github.com/arubique/stnd.git?rev=0d23b52f7742c08b28be560d2d52d450fcd274b7" }, + { name = "tiktoken", marker = "extra == 'disco'", specifier = ">=0.5.0" }, + { name = "torch", marker = "extra == 'disco'", specifier = "==2.9.1" }, + { name = "torch", marker = "extra == 'mmlu'", specifier = ">=2.0.0" }, + { name = "torchvision", marker = "extra == 'disco'", specifier = ">=0.15.0" }, { name = "tqdm", specifier = ">=4.66.0" }, + { name = "transformers", marker = "extra == 'disco'", specifier = ">=4.37.0,<5.0.0" }, + { name = "transformers", marker = "extra == 'lm-eval'", specifier = ">=4.37.0,<5.0.0" }, + { name = "transformers", marker = "extra == 'mmlu'", specifier = ">=4.37.0" }, { name = "transformers", marker = "extra == 'transformers'", specifier = ">=4.37.0" }, + { name = "typer", marker = "extra == 'disco'", specifier = ">=0.9.0" }, { name = "typing-extensions", marker = "extra == 'examples'", specifier = ">=4.0.0" }, + { name = "umap-learn", marker = "extra == 'disco'", specifier = ">=0.5.0" }, { name = "waitress", marker = "extra == 'multiagentbench'", specifier = ">=3.0.0" }, { name = "wandb", marker = "extra == 'wandb'", specifier = ">=0.15.0" }, ] -provides-extras = ["smolagents", "langgraph", "llamaindex", "camel", "anthropic", "openai", "google-genai", "transformers", "litellm", "wandb", "langfuse", "gaia2", "macs", "multiagentbench", "tau2", "converse", "examples", "all"] +provides-extras = ["smolagents", "langgraph", "llamaindex", "camel", "anthropic", "openai", "google-genai", "transformers", 
"litellm", "wandb", "langfuse", "gaia2", "macs", "multiagentbench", "tau2", "converse", "mmlu", "lm-eval", "disco", "examples", "all"] [package.metadata.requires-dev] dev = [ @@ -3065,6 +3769,81 @@ docs = [ { name = "pymdown-extensions", specifier = ">=10.0.0" }, ] +[[package]] +name = "matplotlib" +version = "3.10.8" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "contourpy", version = "1.3.2", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.11'" }, + { name = "contourpy", version = "1.3.3", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.11'" }, + { name = "cycler" }, + { name = "fonttools" }, + { name = "kiwisolver" }, + { name = "numpy", version = "2.2.6", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.11'" }, + { name = "numpy", version = "2.4.0", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.11'" }, + { name = "packaging" }, + { name = "pillow" }, + { name = "pyparsing" }, + { name = "python-dateutil" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/8a/76/d3c6e3a13fe484ebe7718d14e269c9569c4eb0020a968a327acb3b9a8fe6/matplotlib-3.10.8.tar.gz", hash = "sha256:2299372c19d56bcd35cf05a2738308758d32b9eaed2371898d8f5bd33f084aa3", size = 34806269, upload-time = "2025-12-10T22:56:51.155Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/58/be/a30bd917018ad220c400169fba298f2bb7003c8ccbc0c3e24ae2aacad1e8/matplotlib-3.10.8-cp310-cp310-macosx_10_12_x86_64.whl", hash = "sha256:00270d217d6b20d14b584c521f810d60c5c78406dc289859776550df837dcda7", size = 8239828, upload-time = "2025-12-10T22:55:02.313Z" }, + { url = "https://files.pythonhosted.org/packages/58/27/ca01e043c4841078e82cf6e80a6993dfecd315c3d79f5f3153afbb8e1ec6/matplotlib-3.10.8-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:37b3c1cc42aa184b3f738cfa18c1c1d72fd496d85467a6cf7b807936d39aa656", size = 
8128050, upload-time = "2025-12-10T22:55:04.997Z" }, + { url = "https://files.pythonhosted.org/packages/cb/aa/7ab67f2b729ae6a91bcf9dcac0affb95fb8c56f7fd2b2af894ae0b0cf6fa/matplotlib-3.10.8-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:ee40c27c795bda6a5292e9cff9890189d32f7e3a0bf04e0e3c9430c4a00c37df", size = 8700452, upload-time = "2025-12-10T22:55:07.47Z" }, + { url = "https://files.pythonhosted.org/packages/73/ae/2d5817b0acee3c49b7e7ccfbf5b273f284957cc8e270adf36375db353190/matplotlib-3.10.8-cp310-cp310-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:a48f2b74020919552ea25d222d5cc6af9ca3f4eb43a93e14d068457f545c2a17", size = 9534928, upload-time = "2025-12-10T22:55:10.566Z" }, + { url = "https://files.pythonhosted.org/packages/c9/5b/8e66653e9f7c39cb2e5cab25fce4810daffa2bff02cbf5f3077cea9e942c/matplotlib-3.10.8-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:f254d118d14a7f99d616271d6c3c27922c092dac11112670b157798b89bf4933", size = 9586377, upload-time = "2025-12-10T22:55:12.362Z" }, + { url = "https://files.pythonhosted.org/packages/e2/e2/fd0bbadf837f81edb0d208ba8f8cb552874c3b16e27cb91a31977d90875d/matplotlib-3.10.8-cp310-cp310-win_amd64.whl", hash = "sha256:f9b587c9c7274c1613a30afabf65a272114cd6cdbe67b3406f818c79d7ab2e2a", size = 8128127, upload-time = "2025-12-10T22:55:14.436Z" }, + { url = "https://files.pythonhosted.org/packages/f8/86/de7e3a1cdcfc941483af70609edc06b83e7c8a0e0dc9ac325200a3f4d220/matplotlib-3.10.8-cp311-cp311-macosx_10_12_x86_64.whl", hash = "sha256:6be43b667360fef5c754dda5d25a32e6307a03c204f3c0fc5468b78fa87b4160", size = 8251215, upload-time = "2025-12-10T22:55:16.175Z" }, + { url = "https://files.pythonhosted.org/packages/fd/14/baad3222f424b19ce6ad243c71de1ad9ec6b2e4eb1e458a48fdc6d120401/matplotlib-3.10.8-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:a2b336e2d91a3d7006864e0990c83b216fcdca64b5a6484912902cef87313d78", size = 8139625, upload-time = "2025-12-10T22:55:17.712Z" }, + { url = 
"https://files.pythonhosted.org/packages/8f/a0/7024215e95d456de5883e6732e708d8187d9753a21d32f8ddb3befc0c445/matplotlib-3.10.8-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:efb30e3baaea72ce5928e32bab719ab4770099079d66726a62b11b1ef7273be4", size = 8712614, upload-time = "2025-12-10T22:55:20.8Z" }, + { url = "https://files.pythonhosted.org/packages/5a/f4/b8347351da9a5b3f41e26cf547252d861f685c6867d179a7c9d60ad50189/matplotlib-3.10.8-cp311-cp311-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:d56a1efd5bfd61486c8bc968fa18734464556f0fb8e51690f4ac25d85cbbbbc2", size = 9540997, upload-time = "2025-12-10T22:55:23.258Z" }, + { url = "https://files.pythonhosted.org/packages/9e/c0/c7b914e297efe0bc36917bf216b2acb91044b91e930e878ae12981e461e5/matplotlib-3.10.8-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:238b7ce5717600615c895050239ec955d91f321c209dd110db988500558e70d6", size = 9596825, upload-time = "2025-12-10T22:55:25.217Z" }, + { url = "https://files.pythonhosted.org/packages/6f/d3/a4bbc01c237ab710a1f22b4da72f4ff6d77eb4c7735ea9811a94ae239067/matplotlib-3.10.8-cp311-cp311-win_amd64.whl", hash = "sha256:18821ace09c763ec93aef5eeff087ee493a24051936d7b9ebcad9662f66501f9", size = 8135090, upload-time = "2025-12-10T22:55:27.162Z" }, + { url = "https://files.pythonhosted.org/packages/89/dd/a0b6588f102beab33ca6f5218b31725216577b2a24172f327eaf6417d5c9/matplotlib-3.10.8-cp311-cp311-win_arm64.whl", hash = "sha256:bab485bcf8b1c7d2060b4fcb6fc368a9e6f4cd754c9c2fea281f4be21df394a2", size = 8012377, upload-time = "2025-12-10T22:55:29.185Z" }, + { url = "https://files.pythonhosted.org/packages/9e/67/f997cdcbb514012eb0d10cd2b4b332667997fb5ebe26b8d41d04962fa0e6/matplotlib-3.10.8-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:64fcc24778ca0404ce0cb7b6b77ae1f4c7231cdd60e6778f999ee05cbd581b9a", size = 8260453, upload-time = "2025-12-10T22:55:30.709Z" }, + { url = 
"https://files.pythonhosted.org/packages/7e/65/07d5f5c7f7c994f12c768708bd2e17a4f01a2b0f44a1c9eccad872433e2e/matplotlib-3.10.8-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:b9a5ca4ac220a0cdd1ba6bcba3608547117d30468fefce49bb26f55c1a3d5c58", size = 8148321, upload-time = "2025-12-10T22:55:33.265Z" }, + { url = "https://files.pythonhosted.org/packages/3e/f3/c5195b1ae57ef85339fd7285dfb603b22c8b4e79114bae5f4f0fcf688677/matplotlib-3.10.8-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:3ab4aabc72de4ff77b3ec33a6d78a68227bf1123465887f9905ba79184a1cc04", size = 8716944, upload-time = "2025-12-10T22:55:34.922Z" }, + { url = "https://files.pythonhosted.org/packages/00/f9/7638f5cc82ec8a7aa005de48622eecc3ed7c9854b96ba15bd76b7fd27574/matplotlib-3.10.8-cp312-cp312-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:24d50994d8c5816ddc35411e50a86ab05f575e2530c02752e02538122613371f", size = 9550099, upload-time = "2025-12-10T22:55:36.789Z" }, + { url = "https://files.pythonhosted.org/packages/57/61/78cd5920d35b29fd2a0fe894de8adf672ff52939d2e9b43cb83cd5ce1bc7/matplotlib-3.10.8-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:99eefd13c0dc3b3c1b4d561c1169e65fe47aab7b8158754d7c084088e2329466", size = 9613040, upload-time = "2025-12-10T22:55:38.715Z" }, + { url = "https://files.pythonhosted.org/packages/30/4e/c10f171b6e2f44d9e3a2b96efa38b1677439d79c99357600a62cc1e9594e/matplotlib-3.10.8-cp312-cp312-win_amd64.whl", hash = "sha256:dd80ecb295460a5d9d260df63c43f4afbdd832d725a531f008dad1664f458adf", size = 8142717, upload-time = "2025-12-10T22:55:41.103Z" }, + { url = "https://files.pythonhosted.org/packages/f1/76/934db220026b5fef85f45d51a738b91dea7d70207581063cd9bd8fafcf74/matplotlib-3.10.8-cp312-cp312-win_arm64.whl", hash = "sha256:3c624e43ed56313651bc18a47f838b60d7b8032ed348911c54906b130b20071b", size = 8012751, upload-time = "2025-12-10T22:55:42.684Z" }, + { url = 
"https://files.pythonhosted.org/packages/3d/b9/15fd5541ef4f5b9a17eefd379356cf12175fe577424e7b1d80676516031a/matplotlib-3.10.8-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:3f2e409836d7f5ac2f1c013110a4d50b9f7edc26328c108915f9075d7d7a91b6", size = 8261076, upload-time = "2025-12-10T22:55:44.648Z" }, + { url = "https://files.pythonhosted.org/packages/8d/a0/2ba3473c1b66b9c74dc7107c67e9008cb1782edbe896d4c899d39ae9cf78/matplotlib-3.10.8-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:56271f3dac49a88d7fca5060f004d9d22b865f743a12a23b1e937a0be4818ee1", size = 8148794, upload-time = "2025-12-10T22:55:46.252Z" }, + { url = "https://files.pythonhosted.org/packages/75/97/a471f1c3eb1fd6f6c24a31a5858f443891d5127e63a7788678d14e249aea/matplotlib-3.10.8-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:a0a7f52498f72f13d4a25ea70f35f4cb60642b466cbb0a9be951b5bc3f45a486", size = 8718474, upload-time = "2025-12-10T22:55:47.864Z" }, + { url = "https://files.pythonhosted.org/packages/01/be/cd478f4b66f48256f42927d0acbcd63a26a893136456cd079c0cc24fbabf/matplotlib-3.10.8-cp313-cp313-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:646d95230efb9ca614a7a594d4fcacde0ac61d25e37dd51710b36477594963ce", size = 9549637, upload-time = "2025-12-10T22:55:50.048Z" }, + { url = "https://files.pythonhosted.org/packages/5d/7c/8dc289776eae5109e268c4fb92baf870678dc048a25d4ac903683b86d5bf/matplotlib-3.10.8-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:f89c151aab2e2e23cb3fe0acad1e8b82841fd265379c4cecd0f3fcb34c15e0f6", size = 9613678, upload-time = "2025-12-10T22:55:52.21Z" }, + { url = "https://files.pythonhosted.org/packages/64/40/37612487cc8a437d4dd261b32ca21fe2d79510fe74af74e1f42becb1bdb8/matplotlib-3.10.8-cp313-cp313-win_amd64.whl", hash = "sha256:e8ea3e2d4066083e264e75c829078f9e149fa119d27e19acd503de65e0b13149", size = 8142686, upload-time = "2025-12-10T22:55:54.253Z" }, + { url = 
"https://files.pythonhosted.org/packages/66/52/8d8a8730e968185514680c2a6625943f70269509c3dcfc0dcf7d75928cb8/matplotlib-3.10.8-cp313-cp313-win_arm64.whl", hash = "sha256:c108a1d6fa78a50646029cb6d49808ff0fc1330fda87fa6f6250c6b5369b6645", size = 8012917, upload-time = "2025-12-10T22:55:56.268Z" }, + { url = "https://files.pythonhosted.org/packages/b5/27/51fe26e1062f298af5ef66343d8ef460e090a27fea73036c76c35821df04/matplotlib-3.10.8-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:ad3d9833a64cf48cc4300f2b406c3d0f4f4724a91c0bd5640678a6ba7c102077", size = 8305679, upload-time = "2025-12-10T22:55:57.856Z" }, + { url = "https://files.pythonhosted.org/packages/2c/1e/4de865bc591ac8e3062e835f42dd7fe7a93168d519557837f0e37513f629/matplotlib-3.10.8-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:eb3823f11823deade26ce3b9f40dcb4a213da7a670013929f31d5f5ed1055b22", size = 8198336, upload-time = "2025-12-10T22:55:59.371Z" }, + { url = "https://files.pythonhosted.org/packages/c6/cb/2f7b6e75fb4dce87ef91f60cac4f6e34f4c145ab036a22318ec837971300/matplotlib-3.10.8-cp313-cp313t-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:d9050fee89a89ed57b4fb2c1bfac9a3d0c57a0d55aed95949eedbc42070fea39", size = 8731653, upload-time = "2025-12-10T22:56:01.032Z" }, + { url = "https://files.pythonhosted.org/packages/46/b3/bd9c57d6ba670a37ab31fb87ec3e8691b947134b201f881665b28cc039ff/matplotlib-3.10.8-cp313-cp313t-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:b44d07310e404ba95f8c25aa5536f154c0a8ec473303535949e52eb71d0a1565", size = 9561356, upload-time = "2025-12-10T22:56:02.95Z" }, + { url = "https://files.pythonhosted.org/packages/c0/3d/8b94a481456dfc9dfe6e39e93b5ab376e50998cddfd23f4ae3b431708f16/matplotlib-3.10.8-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:0a33deb84c15ede243aead39f77e990469fff93ad1521163305095b77b72ce4a", size = 9614000, upload-time = "2025-12-10T22:56:05.411Z" }, + { url = 
"https://files.pythonhosted.org/packages/bd/cd/bc06149fe5585ba800b189a6a654a75f1f127e8aab02fd2be10df7fa500c/matplotlib-3.10.8-cp313-cp313t-win_amd64.whl", hash = "sha256:3a48a78d2786784cc2413e57397981fb45c79e968d99656706018d6e62e57958", size = 8220043, upload-time = "2025-12-10T22:56:07.551Z" }, + { url = "https://files.pythonhosted.org/packages/e3/de/b22cf255abec916562cc04eef457c13e58a1990048de0c0c3604d082355e/matplotlib-3.10.8-cp313-cp313t-win_arm64.whl", hash = "sha256:15d30132718972c2c074cd14638c7f4592bd98719e2308bccea40e0538bc0cb5", size = 8062075, upload-time = "2025-12-10T22:56:09.178Z" }, + { url = "https://files.pythonhosted.org/packages/3c/43/9c0ff7a2f11615e516c3b058e1e6e8f9614ddeca53faca06da267c48345d/matplotlib-3.10.8-cp314-cp314-macosx_10_13_x86_64.whl", hash = "sha256:b53285e65d4fa4c86399979e956235deb900be5baa7fc1218ea67fbfaeaadd6f", size = 8262481, upload-time = "2025-12-10T22:56:10.885Z" }, + { url = "https://files.pythonhosted.org/packages/6f/ca/e8ae28649fcdf039fda5ef554b40a95f50592a3c47e6f7270c9561c12b07/matplotlib-3.10.8-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:32f8dce744be5569bebe789e46727946041199030db8aeb2954d26013a0eb26b", size = 8151473, upload-time = "2025-12-10T22:56:12.377Z" }, + { url = "https://files.pythonhosted.org/packages/f1/6f/009d129ae70b75e88cbe7e503a12a4c0670e08ed748a902c2568909e9eb5/matplotlib-3.10.8-cp314-cp314-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:4cf267add95b1c88300d96ca837833d4112756045364f5c734a2276038dae27d", size = 9553896, upload-time = "2025-12-10T22:56:14.432Z" }, + { url = "https://files.pythonhosted.org/packages/f5/26/4221a741eb97967bc1fd5e4c52b9aa5a91b2f4ec05b59f6def4d820f9df9/matplotlib-3.10.8-cp314-cp314-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:2cf5bd12cecf46908f286d7838b2abc6c91cda506c0445b8223a7c19a00df008", size = 9824193, upload-time = "2025-12-10T22:56:16.29Z" }, + { url = 
"https://files.pythonhosted.org/packages/1f/f3/3abf75f38605772cf48a9daf5821cd4f563472f38b4b828c6fba6fa6d06e/matplotlib-3.10.8-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:41703cc95688f2516b480f7f339d8851a6035f18e100ee6a32bc0b8536a12a9c", size = 9615444, upload-time = "2025-12-10T22:56:18.155Z" }, + { url = "https://files.pythonhosted.org/packages/93/a5/de89ac80f10b8dc615807ee1133cd99ac74082581196d4d9590bea10690d/matplotlib-3.10.8-cp314-cp314-win_amd64.whl", hash = "sha256:83d282364ea9f3e52363da262ce32a09dfe241e4080dcedda3c0db059d3c1f11", size = 8272719, upload-time = "2025-12-10T22:56:20.366Z" }, + { url = "https://files.pythonhosted.org/packages/69/ce/b006495c19ccc0a137b48083168a37bd056392dee02f87dba0472f2797fe/matplotlib-3.10.8-cp314-cp314-win_arm64.whl", hash = "sha256:2c1998e92cd5999e295a731bcb2911c75f597d937341f3030cc24ef2733d78a8", size = 8144205, upload-time = "2025-12-10T22:56:22.239Z" }, + { url = "https://files.pythonhosted.org/packages/68/d9/b31116a3a855bd313c6fcdb7226926d59b041f26061c6c5b1be66a08c826/matplotlib-3.10.8-cp314-cp314t-macosx_10_13_x86_64.whl", hash = "sha256:b5a2b97dbdc7d4f353ebf343744f1d1f1cca8aa8bfddb4262fcf4306c3761d50", size = 8305785, upload-time = "2025-12-10T22:56:24.218Z" }, + { url = "https://files.pythonhosted.org/packages/1e/90/6effe8103f0272685767ba5f094f453784057072f49b393e3ea178fe70a5/matplotlib-3.10.8-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:3f5c3e4da343bba819f0234186b9004faba952cc420fbc522dc4e103c1985908", size = 8198361, upload-time = "2025-12-10T22:56:26.787Z" }, + { url = "https://files.pythonhosted.org/packages/d7/65/a73188711bea603615fc0baecca1061429ac16940e2385433cc778a9d8e7/matplotlib-3.10.8-cp314-cp314t-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:5f62550b9a30afde8c1c3ae450e5eb547d579dd69b25c2fc7a1c67f934c1717a", size = 9561357, upload-time = "2025-12-10T22:56:28.953Z" }, + { url = 
"https://files.pythonhosted.org/packages/f4/3d/b5c5d5d5be8ce63292567f0e2c43dde9953d3ed86ac2de0a72e93c8f07a1/matplotlib-3.10.8-cp314-cp314t-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:495672de149445ec1b772ff2c9ede9b769e3cb4f0d0aa7fa730d7f59e2d4e1c1", size = 9823610, upload-time = "2025-12-10T22:56:31.455Z" }, + { url = "https://files.pythonhosted.org/packages/4d/4b/e7beb6bbd49f6bae727a12b270a2654d13c397576d25bd6786e47033300f/matplotlib-3.10.8-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:595ba4d8fe983b88f0eec8c26a241e16d6376fe1979086232f481f8f3f67494c", size = 9614011, upload-time = "2025-12-10T22:56:33.85Z" }, + { url = "https://files.pythonhosted.org/packages/7c/e6/76f2813d31f032e65f6f797e3f2f6e4aab95b65015924b1c51370395c28a/matplotlib-3.10.8-cp314-cp314t-win_amd64.whl", hash = "sha256:25d380fe8b1dc32cf8f0b1b448470a77afb195438bafdf1d858bfb876f3edf7b", size = 8362801, upload-time = "2025-12-10T22:56:36.107Z" }, + { url = "https://files.pythonhosted.org/packages/5d/49/d651878698a0b67f23aa28e17f45a6d6dd3d3f933fa29087fa4ce5947b5a/matplotlib-3.10.8-cp314-cp314t-win_arm64.whl", hash = "sha256:113bb52413ea508ce954a02c10ffd0d565f9c3bc7f2eddc27dfe1731e71c7b5f", size = 8192560, upload-time = "2025-12-10T22:56:38.008Z" }, + { url = "https://files.pythonhosted.org/packages/f5/43/31d59500bb950b0d188e149a2e552040528c13d6e3d6e84d0cccac593dcd/matplotlib-3.10.8-pp310-pypy310_pp73-macosx_10_15_x86_64.whl", hash = "sha256:f97aeb209c3d2511443f8797e3e5a569aebb040d4f8bc79aa3ee78a8fb9e3dd8", size = 8237252, upload-time = "2025-12-10T22:56:39.529Z" }, + { url = "https://files.pythonhosted.org/packages/0c/2c/615c09984f3c5f907f51c886538ad785cf72e0e11a3225de2c0f9442aecc/matplotlib-3.10.8-pp310-pypy310_pp73-macosx_11_0_arm64.whl", hash = "sha256:fb061f596dad3a0f52b60dc6a5dec4a0c300dec41e058a7efe09256188d170b7", size = 8124693, upload-time = "2025-12-10T22:56:41.758Z" }, + { url = 
"https://files.pythonhosted.org/packages/91/e1/2757277a1c56041e1fc104b51a0f7b9a4afc8eb737865d63cababe30bc61/matplotlib-3.10.8-pp310-pypy310_pp73-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:12d90df9183093fcd479f4172ac26b322b1248b15729cb57f42f71f24c7e37a3", size = 8702205, upload-time = "2025-12-10T22:56:43.415Z" }, + { url = "https://files.pythonhosted.org/packages/04/30/3afaa31c757f34b7725ab9d2ba8b48b5e89c2019c003e7d0ead143aabc5a/matplotlib-3.10.8-pp311-pypy311_pp73-macosx_10_15_x86_64.whl", hash = "sha256:6da7c2ce169267d0d066adcf63758f0604aa6c3eebf67458930f9d9b79ad1db1", size = 8249198, upload-time = "2025-12-10T22:56:45.584Z" }, + { url = "https://files.pythonhosted.org/packages/48/2f/6334aec331f57485a642a7c8be03cb286f29111ae71c46c38b363230063c/matplotlib-3.10.8-pp311-pypy311_pp73-macosx_11_0_arm64.whl", hash = "sha256:9153c3292705be9f9c64498a8872118540c3f4123d1a1c840172edf262c8be4a", size = 8136817, upload-time = "2025-12-10T22:56:47.339Z" }, + { url = "https://files.pythonhosted.org/packages/73/e4/6d6f14b2a759c622f191b2d67e9075a3f56aaccb3be4bb9bb6890030d0a0/matplotlib-3.10.8-pp311-pypy311_pp73-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:1ae029229a57cd1e8fe542485f27e7ca7b23aa9e8944ddb4985d0bc444f1eca2", size = 8713867, upload-time = "2025-12-10T22:56:48.954Z" }, +] + [[package]] name = "matplotlib-inline" version = "0.2.1" @@ -3077,6 +3856,18 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/af/33/ee4519fa02ed11a94aef9559552f3b17bb863f2ecfe1a35dc7f548cde231/matplotlib_inline-0.2.1-py3-none-any.whl", hash = "sha256:d56ce5156ba6085e00a9d54fead6ed29a9c47e215cd1bba2e976ef39f5710a76", size = 9516, upload-time = "2025-10-23T09:00:20.675Z" }, ] +[[package]] +name = "mbstrdecoder" +version = "1.1.4" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "chardet" }, +] +sdist = { url = 
"https://files.pythonhosted.org/packages/31/ab/05ae008357c8bdb6245ebf8a101d99f26c096e0ea20800b318153da23796/mbstrdecoder-1.1.4.tar.gz", hash = "sha256:8105ef9cf6b7d7d69fe7fd6b68a2d8f281ca9b365d7a9b670be376b2e6c81b21", size = 14527, upload-time = "2025-01-18T10:07:31.089Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/30/ac/5ce64a1d4cce00390beab88622a290420401f1cabf05caf2fc0995157c21/mbstrdecoder-1.1.4-py3-none-any.whl", hash = "sha256:03dae4ec50ec0d2ff4743e63fdbd5e0022815857494d35224b60775d3d934a8c", size = 7933, upload-time = "2025-01-18T10:07:29.562Z" }, +] + [[package]] name = "mcp" version = "1.25.0" @@ -3330,6 +4121,15 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/81/06/c5f8deba7d2cbdfa7967a716ae801aa9ca5f734b8f54fd473ef77a088dbe/mkdocstrings_python-2.0.1-py3-none-any.whl", hash = "sha256:66ecff45c5f8b71bf174e11d49afc845c2dfc7fc0ab17a86b6b337e0f24d8d90", size = 105055, upload-time = "2025-12-03T14:26:10.184Z" }, ] +[[package]] +name = "more-itertools" +version = "10.8.0" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/ea/5d/38b681d3fce7a266dd9ab73c66959406d565b3e85f21d5e66e1181d93721/more_itertools-10.8.0.tar.gz", hash = "sha256:f638ddf8a1a0d134181275fb5d58b086ead7c6a72429ad725c67503f13ba30bd", size = 137431, upload-time = "2025-09-02T15:23:11.018Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/a4/8e/469e5a4a2f5855992e425f3cb33804cc07bf18d48f2db061aec61ce50270/more_itertools-10.8.0-py3-none-any.whl", hash = "sha256:52d4362373dcf7c52546bc4af9a86ee7c4579df9a8dc268be0a2f949d376cc9b", size = 69667, upload-time = "2025-09-02T15:23:09.635Z" }, +] + [[package]] name = "mpmath" version = "1.3.0" @@ -3593,10 +4393,12 @@ version = "3.6.1" source = { registry = "https://pypi.org/simple" } resolution-markers = [ "python_full_version >= '3.14' and sys_platform == 'linux'", - "python_full_version >= '3.12' and python_full_version < '3.14' and sys_platform == 
'linux'", + "python_full_version == '3.13.*' and sys_platform == 'linux'", + "python_full_version == '3.12.*' and sys_platform == 'linux'", "python_full_version == '3.11.*' and sys_platform == 'linux'", "python_full_version >= '3.14' and sys_platform != 'linux'", - "python_full_version >= '3.12' and python_full_version < '3.14' and sys_platform != 'linux'", + "python_full_version == '3.13.*' and sys_platform != 'linux'", + "python_full_version == '3.12.*' and sys_platform != 'linux'", "python_full_version == '3.11.*' and sys_platform != 'linux'", ] sdist = { url = "https://files.pythonhosted.org/packages/6a/51/63fe664f3908c97be9d2e4f1158eb633317598cfa6e1fc14af5383f17512/networkx-3.6.1.tar.gz", hash = "sha256:26b7c357accc0c8cde558ad486283728b65b6a95d85ee1cd66bafab4c8168509", size = 2517025, upload-time = "2025-12-08T17:02:39.908Z" } @@ -3656,6 +4458,107 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/f9/33/bd5b9137445ea4b680023eb0469b2bb969d61303dedb2aac6560ff3d14a1/notebook_shim-0.2.4-py3-none-any.whl", hash = "sha256:411a5be4e9dc882a074ccbcae671eda64cceb068767e9a3419096986560e1cef", size = 13307, upload-time = "2024-02-14T23:35:16.286Z" }, ] +[[package]] +name = "numba" +version = "0.64.0" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "llvmlite" }, + { name = "numpy", version = "2.2.6", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.11'" }, + { name = "numpy", version = "2.4.0", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.11'" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/23/c9/a0fb41787d01d621046138da30f6c2100d80857bf34b3390dd68040f27a3/numba-0.64.0.tar.gz", hash = "sha256:95e7300af648baa3308127b1955b52ce6d11889d16e8cfe637b4f85d2fca52b1", size = 2765679, upload-time = "2026-02-18T18:41:20.974Z" } +wheels = [ + { url = 
"https://files.pythonhosted.org/packages/4c/5e/604fed821cd7e3426bb3bc99a7ed6ac0bcb489f4cd93052256437d082f95/numba-0.64.0-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:cc09b79440952e3098eeebea4bf6e8d2355fb7f12734fcd9fc5039f0dca90727", size = 2683250, upload-time = "2026-02-18T18:40:45.829Z" }, + { url = "https://files.pythonhosted.org/packages/4f/9f/9275a723d050b5f1a9b1c7fb7dbfce324fef301a8e50c5f88338569db06c/numba-0.64.0-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:1afe3a80b8c2f376b211fb7a49e536ef9eafc92436afc95a2f41ea5392f8cc65", size = 3742168, upload-time = "2026-02-18T18:40:48.066Z" }, + { url = "https://files.pythonhosted.org/packages/e2/d1/97ca7dddaa36b16f4c46319bdb6b4913ba15d0245317d0d8ccde7b2d7d92/numba-0.64.0-cp310-cp310-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:23804194b93b8cd416c6444b5fbc4956082a45fed2d25436ef49c594666e7f7e", size = 3449103, upload-time = "2026-02-18T18:40:49.905Z" }, + { url = "https://files.pythonhosted.org/packages/52/0a/b9e137ad78415373e3353564500e8bf29dbce3c0d73633bb384d4e5d7537/numba-0.64.0-cp310-cp310-win_amd64.whl", hash = "sha256:e2a9fe998bb2cf848960b34db02c2c3b5e02cf82c07a26d9eef3494069740278", size = 2749950, upload-time = "2026-02-18T18:40:51.536Z" }, + { url = "https://files.pythonhosted.org/packages/89/a3/1a4286a1c16136c8896d8e2090d950e79b3ec626d3a8dc9620f6234d5a38/numba-0.64.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:766156ee4b8afeeb2b2e23c81307c5d19031f18d5ce76ae2c5fb1429e72fa92b", size = 2682938, upload-time = "2026-02-18T18:40:52.897Z" }, + { url = "https://files.pythonhosted.org/packages/19/16/aa6e3ba3cd45435c117d1101b278b646444ed05b7c712af631b91353f573/numba-0.64.0-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:d17071b4ffc9d39b75d8e6c101a36f0c81b646123859898c9799cb31807c8f78", size = 3747376, upload-time = "2026-02-18T18:40:54.925Z" }, + { url = 
"https://files.pythonhosted.org/packages/c0/f1/dd2f25e18d75fdf897f730b78c5a7b00cc4450f2405564dbebfaf359f21f/numba-0.64.0-cp311-cp311-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:4ead5630434133bac87fa67526eacb264535e4e9a2d5ec780e0b4fc381a7d275", size = 3453292, upload-time = "2026-02-18T18:40:56.818Z" }, + { url = "https://files.pythonhosted.org/packages/31/29/e09d5630578a50a2b3fa154990b6b839cf95327aa0709e2d50d0b6816cd1/numba-0.64.0-cp311-cp311-win_amd64.whl", hash = "sha256:f2b1fd93e7aaac07d6fbaed059c00679f591f2423885c206d8c1b55d65ca3f2d", size = 2749824, upload-time = "2026-02-18T18:40:58.392Z" }, + { url = "https://files.pythonhosted.org/packages/70/a6/9fc52cb4f0d5e6d8b5f4d81615bc01012e3cf24e1052a60f17a68deb8092/numba-0.64.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:69440a8e8bc1a81028446f06b363e28635aa67bd51b1e498023f03b812e0ce68", size = 2683418, upload-time = "2026-02-18T18:40:59.886Z" }, + { url = "https://files.pythonhosted.org/packages/9b/89/1a74ea99b180b7a5587b0301ed1b183a2937c4b4b67f7994689b5d36fc34/numba-0.64.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:f13721011f693ba558b8dd4e4db7f2640462bba1b855bdc804be45bbeb55031a", size = 3804087, upload-time = "2026-02-18T18:41:01.699Z" }, + { url = "https://files.pythonhosted.org/packages/91/e1/583c647404b15f807410510fec1eb9b80cb8474165940b7749f026f21cbc/numba-0.64.0-cp312-cp312-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:e0b180b1133f2b5d8b3f09d96b6d7a9e51a7da5dda3c09e998b5bcfac85d222c", size = 3504309, upload-time = "2026-02-18T18:41:03.252Z" }, + { url = "https://files.pythonhosted.org/packages/85/23/0fce5789b8a5035e7ace21216a468143f3144e02013252116616c58339aa/numba-0.64.0-cp312-cp312-win_amd64.whl", hash = "sha256:e63dc94023b47894849b8b106db28ccb98b49d5498b98878fac1a38f83ac007a", size = 2752740, upload-time = "2026-02-18T18:41:05.097Z" }, + { url = 
"https://files.pythonhosted.org/packages/52/80/2734de90f9300a6e2503b35ee50d9599926b90cbb7ac54f9e40074cd07f1/numba-0.64.0-cp313-cp313-macosx_12_0_arm64.whl", hash = "sha256:3bab2c872194dcd985f1153b70782ec0fbbe348fffef340264eacd3a76d59fd6", size = 2683392, upload-time = "2026-02-18T18:41:06.563Z" }, + { url = "https://files.pythonhosted.org/packages/42/e8/14b5853ebefd5b37723ef365c5318a30ce0702d39057eaa8d7d76392859d/numba-0.64.0-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:703a246c60832cad231d2e73c1182f25bf3cc8b699759ec8fe58a2dbc689a70c", size = 3812245, upload-time = "2026-02-18T18:41:07.963Z" }, + { url = "https://files.pythonhosted.org/packages/8a/a2/f60dc6c96d19b7185144265a5fbf01c14993d37ff4cd324b09d0212aa7ce/numba-0.64.0-cp313-cp313-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:7e2e49a7900ee971d32af7609adc0cfe6aa7477c6f6cccdf6d8138538cf7756f", size = 3511328, upload-time = "2026-02-18T18:41:09.504Z" }, + { url = "https://files.pythonhosted.org/packages/9c/2a/fe7003ea7e7237ee7014f8eaeeb7b0d228a2db22572ca85bab2648cf52cb/numba-0.64.0-cp313-cp313-win_amd64.whl", hash = "sha256:396f43c3f77e78d7ec84cdfc6b04969c78f8f169351b3c4db814b97e7acf4245", size = 2752668, upload-time = "2026-02-18T18:41:11.455Z" }, + { url = "https://files.pythonhosted.org/packages/3d/8a/77d26afe0988c592dd97cb8d4e80bfb3dfc7dbdacfca7d74a7c5c81dd8c2/numba-0.64.0-cp314-cp314-macosx_12_0_arm64.whl", hash = "sha256:f565d55eaeff382cbc86c63c8c610347453af3d1e7afb2b6569aac1c9b5c93ce", size = 2683590, upload-time = "2026-02-18T18:41:12.897Z" }, + { url = "https://files.pythonhosted.org/packages/8e/4b/600b8b7cdbc7f9cebee9ea3d13bb70052a79baf28944024ffcb59f0712e3/numba-0.64.0-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:9b55169b18892c783f85e9ad9e6f5297a6d12967e4414e6b71361086025ff0bb", size = 3781163, upload-time = "2026-02-18T18:41:15.377Z" }, + { url = 
"https://files.pythonhosted.org/packages/ff/73/53f2d32bfa45b7175e9944f6b816d8c32840178c3eee9325033db5bf838e/numba-0.64.0-cp314-cp314-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:196bcafa02c9dd1707e068434f6d5cedde0feb787e3432f7f1f0e993cc336c4c", size = 3481172, upload-time = "2026-02-18T18:41:17.281Z" }, + { url = "https://files.pythonhosted.org/packages/b5/00/aebd2f7f1e11e38814bb96e95a27580817a7b340608d3ac085fdbab83174/numba-0.64.0-cp314-cp314-win_amd64.whl", hash = "sha256:213e9acbe7f1c05090592e79020315c1749dd52517b90e94c517dca3f014d4a1", size = 2754700, upload-time = "2026-02-18T18:41:19.277Z" }, +] + +[[package]] +name = "numexpr" +version = "2.14.1" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "numpy", version = "2.2.6", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.11'" }, + { name = "numpy", version = "2.4.0", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.11'" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/cb/2f/fdba158c9dbe5caca9c3eca3eaffffb251f2fb8674bf8e2d0aed5f38d319/numexpr-2.14.1.tar.gz", hash = "sha256:4be00b1086c7b7a5c32e31558122b7b80243fe098579b170967da83f3152b48b", size = 119400, upload-time = "2025-10-13T16:17:27.351Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/db/91/ccd504cbe5b88d06987c77f42ba37a13ef05065fdab4afe6dcfeb2961faf/numexpr-2.14.1-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:d0fab3fd06a04f6b86102552b26aa5d85e20ac7d8296c15764c726eeabae6cc8", size = 163200, upload-time = "2025-10-13T16:16:25.47Z" }, + { url = "https://files.pythonhosted.org/packages/f3/89/6b07977baf2af75fb6692f9e7a1fb612a15f600fc921f3f565366de01f4a/numexpr-2.14.1-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:64ae5dfd62d74a3ef82fe0b37f80527247f3626171ad82025900f46ffca4b39a", size = 152085, upload-time = "2025-10-13T16:16:29.508Z" }, + { url = 
"https://files.pythonhosted.org/packages/28/c2/c5775541256c4bf16b4d88fa1cffa74a0126703e513093c8774d911b0bb7/numexpr-2.14.1-cp310-cp310-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:955c92b064f9074d2970cf3138f5e3b965be673b82024962ed526f39bc25a920", size = 449435, upload-time = "2025-10-13T16:13:16.257Z" }, + { url = "https://files.pythonhosted.org/packages/34/d4/d1a410901c620f7a6a3c5c2b1fc9dab22170be05a89d2c02ae699e27bd3f/numexpr-2.14.1-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:75440c54fc01e130396650fdf307aa9d41a67dc06ddbfb288971b591c13a395b", size = 440197, upload-time = "2025-10-13T16:14:44.109Z" }, + { url = "https://files.pythonhosted.org/packages/ac/c8/fa85f0cc5c39db587ba4927b862a92477c017ee8476e415e8120a100457b/numexpr-2.14.1-cp310-cp310-musllinux_1_2_aarch64.whl", hash = "sha256:dde9fa47ed319e1e1728940a539df3cb78326b7754bc7c6ab3152afc91808f9b", size = 1414125, upload-time = "2025-10-13T16:13:19.882Z" }, + { url = "https://files.pythonhosted.org/packages/08/72/a58ddc05e0eabb3fa8d3fcd319f3d97870e6b41520832acfd04a6734c2c0/numexpr-2.14.1-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:76db0bc6267e591ab9c4df405ffb533598e4c88239db7338d11ae9e4b368a85a", size = 1463041, upload-time = "2025-10-13T16:14:47.502Z" }, + { url = "https://files.pythonhosted.org/packages/c4/c5/bdd1862302bb71a78dba941eaf7060e1274f1cf6af2d1b0f1880bfcb289b/numexpr-2.14.1-cp310-cp310-win32.whl", hash = "sha256:0d1dcbdc4d0374c0d523cee2f94f06b001623cbc1fd163612841017a3495427c", size = 166833, upload-time = "2025-10-13T16:17:03.543Z" }, + { url = "https://files.pythonhosted.org/packages/18/af/26773a246716922794388786529e5640676399efabb0ee217ce034df9d27/numexpr-2.14.1-cp310-cp310-win_amd64.whl", hash = "sha256:823cd82c8e7937981339f634e7a9c6a92cb2d0b9d0a5cf627a5e394fffc05377", size = 160068, upload-time = "2025-10-13T16:17:05.191Z" }, + { url = 
"https://files.pythonhosted.org/packages/b2/a3/67999bdd1ed1f938d38f3fedd4969632f2f197b090e50505f7cc1fa82510/numexpr-2.14.1-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:2d03fcb4644a12f70a14d74006f72662824da5b6128bf1bcd10cc3ed80e64c34", size = 163195, upload-time = "2025-10-13T16:16:31.212Z" }, + { url = "https://files.pythonhosted.org/packages/25/95/d64f680ea1fc56d165457287e0851d6708800f9fcea346fc1b9957942ee6/numexpr-2.14.1-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:2773ee1133f77009a1fc2f34fe236f3d9823779f5f75450e183137d49f00499f", size = 152088, upload-time = "2025-10-13T16:16:33.186Z" }, + { url = "https://files.pythonhosted.org/packages/0e/7f/3bae417cb13ae08afd86d08bb0301c32440fe0cae4e6262b530e0819aeda/numexpr-2.14.1-cp311-cp311-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:ebe4980f9494b9f94d10d2e526edc29e72516698d3bf95670ba79415492212a4", size = 451126, upload-time = "2025-10-13T16:13:22.248Z" }, + { url = "https://files.pythonhosted.org/packages/4c/1a/edbe839109518364ac0bd9e918cf874c755bb2c128040e920f198c494263/numexpr-2.14.1-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:2a381e5e919a745c9503bcefffc1c7f98c972c04ec58fc8e999ed1a929e01ba6", size = 442012, upload-time = "2025-10-13T16:14:51.416Z" }, + { url = "https://files.pythonhosted.org/packages/66/b1/be4ce99bff769a5003baddac103f34681997b31d4640d5a75c0e8ed59c78/numexpr-2.14.1-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:d08856cfc1b440eb1caaa60515235369654321995dd68eb9377577392020f6cb", size = 1415975, upload-time = "2025-10-13T16:13:26.088Z" }, + { url = "https://files.pythonhosted.org/packages/e7/33/b33b8fdc032a05d9ebb44a51bfcd4b92c178a2572cd3e6c1b03d8a4b45b2/numexpr-2.14.1-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:03130afa04edf83a7b590d207444f05a00363c9b9ea5d81c0f53b1ea13fad55a", size = 1464683, upload-time = "2025-10-13T16:14:58.87Z" }, + { url = 
"https://files.pythonhosted.org/packages/d0/b2/ddcf0ac6cf0a1d605e5aecd4281507fd79a9628a67896795ab2e975de5df/numexpr-2.14.1-cp311-cp311-win32.whl", hash = "sha256:db78fa0c9fcbaded3ae7453faf060bd7a18b0dc10299d7fcd02d9362be1213ed", size = 166838, upload-time = "2025-10-13T16:17:06.765Z" }, + { url = "https://files.pythonhosted.org/packages/64/72/4ca9bd97b2eb6dce9f5e70a3b6acec1a93e1fb9b079cb4cba2cdfbbf295d/numexpr-2.14.1-cp311-cp311-win_amd64.whl", hash = "sha256:e9b2f957798c67a2428be96b04bce85439bed05efe78eb78e4c2ca43737578e7", size = 160069, upload-time = "2025-10-13T16:17:08.752Z" }, + { url = "https://files.pythonhosted.org/packages/9d/20/c473fc04a371f5e2f8c5749e04505c13e7a8ede27c09e9f099b2ad6f43d6/numexpr-2.14.1-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:91ebae0ab18c799b0e6b8c5a8d11e1fa3848eb4011271d99848b297468a39430", size = 162790, upload-time = "2025-10-13T16:16:34.903Z" }, + { url = "https://files.pythonhosted.org/packages/45/93/b6760dd1904c2a498e5f43d1bb436f59383c3ddea3815f1461dfaa259373/numexpr-2.14.1-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:47041f2f7b9e69498fb311af672ba914a60e6e6d804011caacb17d66f639e659", size = 152196, upload-time = "2025-10-13T16:16:36.593Z" }, + { url = "https://files.pythonhosted.org/packages/72/94/cc921e35593b820521e464cbbeaf8212bbdb07f16dc79fe283168df38195/numexpr-2.14.1-cp312-cp312-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:d686dfb2c1382d9e6e0ee0b7647f943c1886dba3adbf606c625479f35f1956c1", size = 452468, upload-time = "2025-10-13T16:13:29.531Z" }, + { url = "https://files.pythonhosted.org/packages/d9/43/560e9ba23c02c904b5934496486d061bcb14cd3ebba2e3cf0e2dccb6c22b/numexpr-2.14.1-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:eee6d4fbbbc368e6cdd0772734d6249128d957b3b8ad47a100789009f4de7083", size = 443631, upload-time = "2025-10-13T16:15:02.473Z" }, + { url = 
"https://files.pythonhosted.org/packages/7b/6c/78f83b6219f61c2c22d71ab6e6c2d4e5d7381334c6c29b77204e59edb039/numexpr-2.14.1-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:3a2839efa25f3c8d4133252ea7342d8f81226c7c4dda81f97a57e090b9d87a48", size = 1417670, upload-time = "2025-10-13T16:13:33.464Z" }, + { url = "https://files.pythonhosted.org/packages/0e/bb/1ccc9dcaf46281568ce769888bf16294c40e98a5158e4b16c241de31d0d3/numexpr-2.14.1-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:9f9137f1351b310436662b5dc6f4082a245efa8950c3b0d9008028df92fefb9b", size = 1466212, upload-time = "2025-10-13T16:15:12.828Z" }, + { url = "https://files.pythonhosted.org/packages/31/9f/203d82b9e39dadd91d64bca55b3c8ca432e981b822468dcef41a4418626b/numexpr-2.14.1-cp312-cp312-win32.whl", hash = "sha256:36f8d5c1bd1355df93b43d766790f9046cccfc1e32b7c6163f75bcde682cda07", size = 166996, upload-time = "2025-10-13T16:17:10.369Z" }, + { url = "https://files.pythonhosted.org/packages/1f/67/ffe750b5452eb66de788c34e7d21ec6d886abb4d7c43ad1dc88ceb3d998f/numexpr-2.14.1-cp312-cp312-win_amd64.whl", hash = "sha256:fdd886f4b7dbaf167633ee396478f0d0aa58ea2f9e7ccc3c6431019623e8d68f", size = 160187, upload-time = "2025-10-13T16:17:11.974Z" }, + { url = "https://files.pythonhosted.org/packages/73/b4/9f6d637fd79df42be1be29ee7ba1f050fab63b7182cb922a0e08adc12320/numexpr-2.14.1-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:09078ba73cffe94745abfbcc2d81ab8b4b4e9d7bfbbde6cac2ee5dbf38eee222", size = 162794, upload-time = "2025-10-13T16:16:38.291Z" }, + { url = "https://files.pythonhosted.org/packages/35/ae/d58558d8043de0c49f385ea2fa789e3cfe4d436c96be80200c5292f45f15/numexpr-2.14.1-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:dce0b5a0447baa7b44bc218ec2d7dcd175b8eee6083605293349c0c1d9b82fb6", size = 152203, upload-time = "2025-10-13T16:16:39.907Z" }, + { url = 
"https://files.pythonhosted.org/packages/13/65/72b065f9c75baf8f474fd5d2b768350935989d4917db1c6c75b866d4067c/numexpr-2.14.1-cp313-cp313-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:06855053de7a3a8425429bd996e8ae3c50b57637ad3e757e0fa0602a7874be30", size = 455860, upload-time = "2025-10-13T16:13:35.811Z" }, + { url = "https://files.pythonhosted.org/packages/fc/f9/c9457652dfe28e2eb898372da2fe786c6db81af9540c0f853ee04a0699cc/numexpr-2.14.1-cp313-cp313-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:05f9366d23a2e991fd5a8b5e61a17558f028ba86158a4552f8f239b005cdf83c", size = 446574, upload-time = "2025-10-13T16:15:17.367Z" }, + { url = "https://files.pythonhosted.org/packages/b6/99/8d3879c4d67d3db5560cf2de65ce1778b80b75f6fa415eb5c3e7bd37ba27/numexpr-2.14.1-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:c5f1b1605695778896534dfc6e130d54a65cd52be7ed2cd0cfee3981fd676bf5", size = 1417306, upload-time = "2025-10-13T16:13:42.813Z" }, + { url = "https://files.pythonhosted.org/packages/ea/05/6bddac9f18598ba94281e27a6943093f7d0976544b0cb5d92272c64719bd/numexpr-2.14.1-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:a4ba71db47ea99c659d88ee6233fa77b6dc83392f1d324e0c90ddf617ae3f421", size = 1466145, upload-time = "2025-10-13T16:15:27.464Z" }, + { url = "https://files.pythonhosted.org/packages/24/5d/cbeb67aca0c5a76ead13df7e8bd8dd5e0d49145f90da697ba1d9f07005b0/numexpr-2.14.1-cp313-cp313-win32.whl", hash = "sha256:638dce8320f4a1483d5ca4fda69f60a70ed7e66be6e68bc23fb9f1a6b78a9e3b", size = 166996, upload-time = "2025-10-13T16:17:13.803Z" }, + { url = "https://files.pythonhosted.org/packages/cc/23/9281bceaeb282cead95f0aa5f7f222ffc895670ea689cc1398355f6e3001/numexpr-2.14.1-cp313-cp313-win_amd64.whl", hash = "sha256:9fdcd4735121658a313f878fd31136d1bfc6a5b913219e7274e9fca9f8dac3bb", size = 160189, upload-time = "2025-10-13T16:17:15.417Z" }, + { url = 
"https://files.pythonhosted.org/packages/f3/76/7aac965fd93a56803cbe502aee2adcad667253ae34b0badf6c5af7908b6c/numexpr-2.14.1-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:557887ad7f5d3c2a40fd7310e50597045a68e66b20a77b3f44d7bc7608523b4b", size = 163524, upload-time = "2025-10-13T16:16:42.213Z" }, + { url = "https://files.pythonhosted.org/packages/58/65/79d592d5e63fbfab3b59a60c386853d9186a44a3fa3c87ba26bdc25b6195/numexpr-2.14.1-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:af111c8fe6fc55d15e4c7cab11920fc50740d913636d486545b080192cd0ad73", size = 152919, upload-time = "2025-10-13T16:16:44.229Z" }, + { url = "https://files.pythonhosted.org/packages/84/78/3c8335f713d4aeb99fa758d7c62f0be1482d4947ce5b508e2052bb7aeee9/numexpr-2.14.1-cp313-cp313t-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:33265294376e7e2ae4d264d75b798a915d2acf37b9dd2b9405e8b04f84d05cfc", size = 465972, upload-time = "2025-10-13T16:13:45.061Z" }, + { url = "https://files.pythonhosted.org/packages/35/81/9ee5f69b811e8f18746c12d6f71848617684edd3161927f95eee7a305631/numexpr-2.14.1-cp313-cp313t-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:83647d846d3eeeb9a9255311236135286728b398d0d41d35dedb532dca807fe9", size = 456953, upload-time = "2025-10-13T16:15:31.186Z" }, + { url = "https://files.pythonhosted.org/packages/6d/39/9b8bc6e294d85cbb54a634e47b833e9f3276a8bdf7ce92aa808718a0212d/numexpr-2.14.1-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:6e575fd3ad41ddf3355d0c7ef6bd0168619dc1779a98fe46693cad5e95d25e6e", size = 1426199, upload-time = "2025-10-13T16:13:48.231Z" }, + { url = "https://files.pythonhosted.org/packages/1e/ce/0d4fcd31ab49319740d934fba1734d7dad13aa485532ca754e555ca16c8b/numexpr-2.14.1-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:67ea4771029ce818573b1998f5ca416bd255156feea017841b86176a938f7d19", size = 1474214, upload-time = "2025-10-13T16:15:38.893Z" }, + { url = 
"https://files.pythonhosted.org/packages/b7/47/b2a93cbdb3ba4e009728ad1b9ef1550e2655ea2c86958ebaf03b9615f275/numexpr-2.14.1-cp313-cp313t-win32.whl", hash = "sha256:15015d47d3d1487072d58c0e7682ef2eb608321e14099c39d52e2dd689483611", size = 167676, upload-time = "2025-10-13T16:17:17.351Z" }, + { url = "https://files.pythonhosted.org/packages/86/99/ee3accc589ed032eea68e12172515ed96a5568534c213ad109e1f4411df1/numexpr-2.14.1-cp313-cp313t-win_amd64.whl", hash = "sha256:94c711f6d8f17dfb4606842b403699603aa591ab9f6bf23038b488ea9cfb0f09", size = 161096, upload-time = "2025-10-13T16:17:19.174Z" }, + { url = "https://files.pythonhosted.org/packages/ac/36/9db78dfbfdfa1f8bf0872993f1a334cdd8fca5a5b6567e47dcb128bcb7c2/numexpr-2.14.1-cp314-cp314-macosx_10_13_x86_64.whl", hash = "sha256:ede79f7ff06629f599081de644546ce7324f1581c09b0ac174da88a470d39c21", size = 162848, upload-time = "2025-10-13T16:16:46.216Z" }, + { url = "https://files.pythonhosted.org/packages/13/c1/a5c78ae637402c5550e2e0ba175275d2515d432ec28af0cdc23c9b476e65/numexpr-2.14.1-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:2eac7a5a2f70b3768c67056445d1ceb4ecd9b853c8eda9563823b551aeaa5082", size = 152270, upload-time = "2025-10-13T16:16:47.92Z" }, + { url = "https://files.pythonhosted.org/packages/9a/ed/aabd8678077848dd9a751c5558c2057839f5a09e2a176d8dfcd0850ee00e/numexpr-2.14.1-cp314-cp314-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:5aedf38d4c0c19d3cecfe0334c3f4099fb496f54c146223d30fa930084bc8574", size = 455918, upload-time = "2025-10-13T16:13:50.338Z" }, + { url = "https://files.pythonhosted.org/packages/88/e1/3db65117f02cdefb0e5e4c440daf1c30beb45051b7f47aded25b7f4f2f34/numexpr-2.14.1-cp314-cp314-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:439ec4d57b853792ebe5456e3160312281c3a7071ecac5532ded3278ede614de", size = 446512, upload-time = "2025-10-13T16:15:42.313Z" }, + { url = 
"https://files.pythonhosted.org/packages/9a/fb/7ceb9ee55b5f67e4a3e4d73d5af4c7e37e3c9f37f54bee90361b64b17e3f/numexpr-2.14.1-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:e23b87f744e04e302d82ac5e2189ae20a533566aec76a46885376e20b0645bf8", size = 1417845, upload-time = "2025-10-13T16:13:53.836Z" }, + { url = "https://files.pythonhosted.org/packages/45/2d/9b5764d0eafbbb2889288f80de773791358acf6fad1a55767538d8b79599/numexpr-2.14.1-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:44f84e0e5af219dbb62a081606156420815890e041b87252fbcea5df55214c4c", size = 1466211, upload-time = "2025-10-13T16:15:48.985Z" }, + { url = "https://files.pythonhosted.org/packages/5d/21/204db708eccd71aa8bc55bcad55bc0fc6c5a4e01ad78e14ee5714a749386/numexpr-2.14.1-cp314-cp314-win32.whl", hash = "sha256:1f1a5e817c534539351aa75d26088e9e1e0ef1b3a6ab484047618a652ccc4fc3", size = 168835, upload-time = "2025-10-13T16:17:20.82Z" }, + { url = "https://files.pythonhosted.org/packages/4f/3e/d83e9401a1c3449a124f7d4b3fb44084798e0d30f7c11e60712d9b94cf11/numexpr-2.14.1-cp314-cp314-win_amd64.whl", hash = "sha256:587c41509bc373dfb1fe6086ba55a73147297247bedb6d588cda69169fc412f2", size = 162608, upload-time = "2025-10-13T16:17:22.228Z" }, + { url = "https://files.pythonhosted.org/packages/7f/d6/ec947806bb57836d6379a8c8a253c2aeaa602b12fef2336bfd2462bb4ed5/numexpr-2.14.1-cp314-cp314t-macosx_10_13_x86_64.whl", hash = "sha256:ec368819502b64f190c3f71be14a304780b5935c42aae5bf22c27cc2cbba70b5", size = 163525, upload-time = "2025-10-13T16:16:50.133Z" }, + { url = "https://files.pythonhosted.org/packages/0d/77/048f30dcf661a3d52963a88c29b52b6d5ce996d38e9313a56a922451c1e0/numexpr-2.14.1-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:7e87f6d203ac57239de32261c941e9748f9309cbc0da6295eabd0c438b920d3a", size = 152917, upload-time = "2025-10-13T16:16:52.055Z" }, + { url = 
"https://files.pythonhosted.org/packages/9e/d3/956a13e628d722d649fbf2fded615134a308c082e122a48bad0e90a99ce9/numexpr-2.14.1-cp314-cp314t-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:dd72d8c2a165fe45ea7650b16eb8cc1792a94a722022006bb97c86fe51fd2091", size = 466242, upload-time = "2025-10-13T16:13:55.795Z" }, + { url = "https://files.pythonhosted.org/packages/d6/dd/abe848678d82486940892f2cacf39e82eec790e8930d4d713d3f9191063b/numexpr-2.14.1-cp314-cp314t-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:70d80fcb418a54ca208e9a38e58ddc425c07f66485176b261d9a67c7f2864f73", size = 457149, upload-time = "2025-10-13T16:15:52.036Z" }, + { url = "https://files.pythonhosted.org/packages/fd/bb/797b583b5fb9da5700a5708ca6eb4f889c94d81abb28de4d642c0f4b3258/numexpr-2.14.1-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:edea2f20c2040df8b54ee8ca8ebda63de9545b2112872466118e9df4d0ae99f3", size = 1426493, upload-time = "2025-10-13T16:13:59.244Z" }, + { url = "https://files.pythonhosted.org/packages/77/c4/0519ab028fdc35e3e7ee700def7f2b4631b175cd9e1202bd7966c1695c33/numexpr-2.14.1-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:790447be6879a6c51b9545f79612d24c9ea0a41d537a84e15e6a8ddef0b6268e", size = 1474413, upload-time = "2025-10-13T16:15:59.211Z" }, + { url = "https://files.pythonhosted.org/packages/d4/4a/33044878c8f4a75213cfe9c11d4c02058bb710a7a063fe14f362e8de1077/numexpr-2.14.1-cp314-cp314t-win32.whl", hash = "sha256:538961096c2300ea44240209181e31fae82759d26b51713b589332b9f2a4117e", size = 169502, upload-time = "2025-10-13T16:17:23.829Z" }, + { url = "https://files.pythonhosted.org/packages/41/a2/5a1a2c72528b429337f49911b18c302ecd36eeab00f409147e1aa4ae4519/numexpr-2.14.1-cp314-cp314t-win_amd64.whl", hash = "sha256:a40b350cd45b4446076fa11843fa32bbe07024747aeddf6d467290bf9011b392", size = 163589, upload-time = "2025-10-13T16:17:25.696Z" }, +] + [[package]] name = "numpy" version = "2.2.6" @@ -3728,10 +4631,12 @@ version = "2.4.0" source = 
{ registry = "https://pypi.org/simple" } resolution-markers = [ "python_full_version >= '3.14' and sys_platform == 'linux'", - "python_full_version >= '3.12' and python_full_version < '3.14' and sys_platform == 'linux'", + "python_full_version == '3.13.*' and sys_platform == 'linux'", + "python_full_version == '3.12.*' and sys_platform == 'linux'", "python_full_version == '3.11.*' and sys_platform == 'linux'", "python_full_version >= '3.14' and sys_platform != 'linux'", - "python_full_version >= '3.12' and python_full_version < '3.14' and sys_platform != 'linux'", + "python_full_version == '3.13.*' and sys_platform != 'linux'", + "python_full_version == '3.12.*' and sys_platform != 'linux'", "python_full_version == '3.11.*' and sys_platform != 'linux'", ] sdist = { url = "https://files.pythonhosted.org/packages/a4/7a/6a3d14e205d292b738db449d0de649b373a59edb0d0b4493821d0a3e8718/numpy-2.4.0.tar.gz", hash = "sha256:6e504f7b16118198f138ef31ba24d985b124c2c469fe8467007cf30fd992f934", size = 20685720, upload-time = "2025-12-20T16:18:19.023Z" } @@ -3929,10 +4834,10 @@ wheels = [ [[package]] name = "nvidia-nvshmem-cu12" -version = "3.4.5" +version = "3.3.20" source = { registry = "https://pypi.org/simple" } wheels = [ - { url = "https://files.pythonhosted.org/packages/b5/09/6ea3ea725f82e1e76684f0708bbedd871fc96da89945adeba65c3835a64c/nvidia_nvshmem_cu12-3.4.5-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:042f2500f24c021db8a06c5eec2539027d57460e1c1a762055a6554f72c369bd", size = 139103095, upload-time = "2025-09-06T00:32:31.266Z" }, + { url = "https://files.pythonhosted.org/packages/3b/6c/99acb2f9eb85c29fc6f3a7ac4dccfd992e22666dd08a642b303311326a97/nvidia_nvshmem_cu12-3.3.20-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:d00f26d3f9b2e3c3065be895e3059d6479ea5c638a3f38c9fec49b1b9dd7c1e5", size = 124657145, upload-time = "2025-08-04T20:25:19.995Z" }, ] [[package]] @@ -3943,6 +4848,31 @@ wheels = [ { url = 
"https://files.pythonhosted.org/packages/a2/eb/86626c1bbc2edb86323022371c39aa48df6fd8b0a1647bc274577f72e90b/nvidia_nvtx_cu12-12.8.90-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:5b17e2001cc0d751a5bc2c6ec6d26ad95913324a4adb86788c944f8ce9ba441f", size = 89954, upload-time = "2025-03-07T01:42:44.131Z" }, ] +[[package]] +name = "oauth2client" +version = "4.1.3" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "httplib2" }, + { name = "pyasn1" }, + { name = "pyasn1-modules" }, + { name = "rsa" }, + { name = "six" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/a6/7b/17244b1083e8e604bf154cf9b716aecd6388acd656dd01893d0d244c94d9/oauth2client-4.1.3.tar.gz", hash = "sha256:d486741e451287f69568a4d26d70d9acd73a2bbfa275746c535b4209891cccc6", size = 155910, upload-time = "2018-09-07T21:38:18.036Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/95/a9/4f25a14d23f0786b64875b91784607c2277eff25d48f915e39ff0cff505a/oauth2client-4.1.3-py2.py3-none-any.whl", hash = "sha256:b8a81cc5d60e2d364f0b1b98f958dbd472887acaf1a5b05e21c28c31a2d6d3ac", size = 98206, upload-time = "2018-09-07T21:38:16.742Z" }, +] + +[[package]] +name = "oauthlib" +version = "3.3.1" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/0b/5f/19930f824ffeb0ad4372da4812c50edbd1434f678c90c2733e1188edfc63/oauthlib-3.3.1.tar.gz", hash = "sha256:0f0f8aa759826a193cf66c12ea1af1637f87b9b4622d46e866952bb022e538c9", size = 185918, upload-time = "2025-06-19T22:48:08.269Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/be/9c/92789c596b8df838baa98fa71844d84283302f7604ed565dafe5a6b5041a/oauthlib-3.3.1-py3-none-any.whl", hash = "sha256:88119c938d2b8fb88561af5f6ee0eec8cc8d552b7bb1f712743136eb7523b7a1", size = 160065, upload-time = "2025-06-19T22:48:06.508Z" }, +] + [[package]] name = "openai" version = "1.109.1" @@ -4283,6 +5213,15 @@ wheels = [ { url = 
"https://files.pythonhosted.org/packages/cc/20/ff623b09d963f88bfde16306a54e12ee5ea43e9b597108672ff3a408aad6/pathspec-0.12.1-py3-none-any.whl", hash = "sha256:a0d503e138a4c123b27490a4f7beda6a01c6f288df0e4a8b79c7eb0dc7b4cc08", size = 31191, upload-time = "2023-12-10T22:30:43.14Z" }, ] +[[package]] +name = "pathvalidate" +version = "3.3.1" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/fa/2a/52a8da6fe965dea6192eb716b357558e103aea0a1e9a8352ad575a8406ca/pathvalidate-3.3.1.tar.gz", hash = "sha256:b18c07212bfead624345bb8e1d6141cdcf15a39736994ea0b94035ad2b1ba177", size = 63262, upload-time = "2025-06-15T09:07:20.736Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/9a/70/875f4a23bfc4731703a5835487d0d2fb999031bd415e7d17c0ae615c18b7/pathvalidate-3.3.1-py3-none-any.whl", hash = "sha256:5263baab691f8e1af96092fa5137ee17df5bdfbd6cff1fcac4d6ef4bc2e1735f", size = 24305, upload-time = "2025-06-15T09:07:19.117Z" }, +] + [[package]] name = "pdfminer-six" version = "20231228" @@ -4296,6 +5235,28 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/eb/9c/e46fe7502b32d7db6af6e36a9105abb93301fa1ec475b5ddcba8b35ae23a/pdfminer.six-20231228-py3-none-any.whl", hash = "sha256:e8d3c3310e6fbc1fe414090123ab01351634b4ecb021232206c4c9a8ca3e3b8f", size = 5614515, upload-time = "2023-12-28T21:25:30.329Z" }, ] +[[package]] +name = "peft" +version = "0.18.1" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "accelerate" }, + { name = "huggingface-hub" }, + { name = "numpy", version = "2.2.6", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.11'" }, + { name = "numpy", version = "2.4.0", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.11'" }, + { name = "packaging" }, + { name = "psutil" }, + { name = "pyyaml" }, + { name = "safetensors" }, + { name = "torch" }, + { name = "tqdm" }, + { name = "transformers" }, 
+] +sdist = { url = "https://files.pythonhosted.org/packages/d8/48/147b3ea999560b40a34fd78724c7777aa9d18409c2250bdcaf9c4f2db7fc/peft-0.18.1.tar.gz", hash = "sha256:2dd0d6bfce936d1850e48aaddbd250941c5c02fc8ef3237cd8fd5aac35e0bae2", size = 635030, upload-time = "2026-01-09T13:08:01.136Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/b3/14/b4e3f574acf349ae6f61f9c000a77f97a3b315b4bb6ad03791e79ae4a568/peft-0.18.1-py3-none-any.whl", hash = "sha256:0bf06847a3551e3019fc58c440cffc9a6b73e6e2962c95b52e224f77bbdb50f1", size = 556960, upload-time = "2026-01-09T13:07:55.865Z" }, +] + [[package]] name = "pexpect" version = "4.9.0" @@ -4407,6 +5368,18 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/ce/ad/bf3db68d30ac798ca31c80624709a0c03aa890e2e20e5ca987d7e55fcfc2/polars_lts_cpu-1.33.1-cp39-abi3-win_arm64.whl", hash = "sha256:c99ab56b059cee6bcabe9fb89e97f5813be1012a2251bf77f76e15c2d1cba934", size = 35445244, upload-time = "2025-09-09T08:37:22.97Z" }, ] +[[package]] +name = "portalocker" +version = "3.2.0" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "pywin32", marker = "sys_platform == 'win32'" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/5e/77/65b857a69ed876e1951e88aaba60f5ce6120c33703f7cb61a3c894b8c1b6/portalocker-3.2.0.tar.gz", hash = "sha256:1f3002956a54a8c3730586c5c77bf18fae4149e07eaf1c29fc3faf4d5a3f89ac", size = 95644, upload-time = "2025-06-14T13:20:40.03Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/4b/a6/38c8e2f318bf67d338f4d629e93b0b4b9af331f455f0390ea8ce4a099b26/portalocker-3.2.0-py3-none-any.whl", hash = "sha256:3cdc5f565312224bc570c49337bd21428bba0ef363bbcf58b9ef4a9f11779968", size = 22424, upload-time = "2025-06-14T13:20:38.083Z" }, +] + [[package]] name = "pre-commit" version = "4.5.1" @@ -4558,6 +5531,18 @@ wheels = [ { url = 
"https://files.pythonhosted.org/packages/5b/5a/bc7b4a4ef808fa59a816c17b20c4bef6884daebbdf627ff2a161da67da19/propcache-0.4.1-py3-none-any.whl", hash = "sha256:af2a6052aeb6cf17d3e46ee169099044fd8224cbaf75c76a2ef596e8163e2237", size = 13305, upload-time = "2025-10-08T19:49:00.792Z" }, ] +[[package]] +name = "proto-plus" +version = "1.27.2" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "protobuf" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/81/0d/94dfe80193e79d55258345901acd2917523d56e8381bc4dee7fd38e3868a/proto_plus-1.27.2.tar.gz", hash = "sha256:b2adde53adadf75737c44d3dcb0104fde65250dfc83ad59168b4aa3e574b6a24", size = 57204, upload-time = "2026-03-26T22:18:57.174Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/84/f3/1fba73eeffafc998a25d59703b63f8be4fe8a5cb12eaff7386a0ba0f7125/proto_plus-1.27.2-py3-none-any.whl", hash = "sha256:6432f75893d3b9e70b9c412f1d2f03f65b11fb164b793d14ae2ca01821d22718", size = 50450, upload-time = "2026-03-26T22:13:42.927Z" }, +] + [[package]] name = "protobuf" version = "6.33.2" @@ -4755,6 +5740,15 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/47/8d/d529b5d697919ba8c11ad626e835d4039be708a35b0d22de83a269a6682c/pyasn1_modules-0.4.2-py3-none-any.whl", hash = "sha256:29253a9207ce32b64c3ac6600edc75368f98473906e8fd1043bd6b5b1de2c14a", size = 181259, upload-time = "2025-03-28T02:41:19.028Z" }, ] +[[package]] +name = "pybind11" +version = "3.0.2" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/a5/98/9118a0659646f1628c592ef9bb48e0056efa6bf27c951fd12a178e0136fb/pybind11-3.0.2.tar.gz", hash = "sha256:432f01aeb68e361a3a7fc7575c2c7f497595bf640f747acd909ff238dd766e06", size = 577131, upload-time = "2026-02-17T04:46:52.556Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/88/c5/e98d9c51f3d5300d5e40ad9037dd6b3b60736fd02ab68dcc98c96be7592d/pybind11-3.0.2-py3-none-any.whl", hash = 
"sha256:f8a6500548919cc33bcd220d5f984688326f574fa97f1107f2f4fdb4c6fb019f", size = 310158, upload-time = "2026-02-17T04:46:49.91Z" }, +] + [[package]] name = "pycparser" version = "2.23" @@ -4911,6 +5905,22 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/c1/60/5d4751ba3f4a40a6891f24eec885f51afd78d208498268c734e256fb13c4/pydantic_settings-2.12.0-py3-none-any.whl", hash = "sha256:fddb9fd99a5b18da837b29710391e945b1e30c135477f484084ee513adb93809", size = 51880, upload-time = "2025-11-10T14:25:45.546Z" }, ] +[[package]] +name = "pydrive2" +version = "1.21.3" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "cryptography" }, + { name = "google-api-python-client" }, + { name = "oauth2client" }, + { name = "pyopenssl" }, + { name = "pyyaml" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/3f/dc/92b0beba58f09441219bb6720bebdb895317632db4778cfe1d21532d27e5/pydrive2-1.21.3.tar.gz", hash = "sha256:649b84d60c637bc7146485039535aa8f1254ad156423739f07e5d32507447c13", size = 63348, upload-time = "2024-11-29T09:49:53.556Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/8c/de/eef2e2661371b02d4231c5cacbb758a52ea9ea98cb5f52d69298641e2631/PyDrive2-1.21.3-py3-none-any.whl", hash = "sha256:843a304f500e71508162807001f5e19487f272e8ff5648f43582bd24c6250200", size = 47972, upload-time = "2024-11-29T09:49:51.254Z" }, +] + [[package]] name = "pygments" version = "2.19.2" @@ -4956,6 +5966,45 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/7c/4c/ad33b92b9864cbde84f259d5df035a6447f91891f5be77788e2a3892bce3/pymysql-1.1.2-py3-none-any.whl", hash = "sha256:e6b1d89711dd51f8f74b1631fe08f039e7d76cf67a42a323d3178f0f25762ed9", size = 45300, upload-time = "2025-08-24T12:55:53.394Z" }, ] +[[package]] +name = "pynndescent" +version = "0.6.0" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "joblib" }, + { name = "llvmlite" }, + { name = "numba" }, + { name = "scikit-learn", version = 
"1.7.2", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.11'" }, + { name = "scikit-learn", version = "1.8.0", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.11'" }, + { name = "scipy", version = "1.15.3", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.11'" }, + { name = "scipy", version = "1.17.0", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.11'" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/4a/fb/7f58c397fb31666756457ee2ac4c0289ef2daad57f4ae4be8dec12f80b03/pynndescent-0.6.0.tar.gz", hash = "sha256:7ffde0fb5b400741e055a9f7d377e3702e02250616834231f6c209e39aac24f5", size = 2992987, upload-time = "2026-01-08T21:29:58.943Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/b2/e6/94145d714402fd5ade00b5661f2d0ab981219e07f7db9bfa16786cdb9c04/pynndescent-0.6.0-py3-none-any.whl", hash = "sha256:dc8c74844e4c7f5cbd1e0cd6909da86fdc789e6ff4997336e344779c3d5538ef", size = 73511, upload-time = "2026-01-08T21:29:57.306Z" }, +] + +[[package]] +name = "pyopenssl" +version = "24.2.1" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "cryptography" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/5d/70/ff56a63248562e77c0c8ee4aefc3224258f1856977e0c1472672b62dadb8/pyopenssl-24.2.1.tar.gz", hash = "sha256:4247f0dbe3748d560dcbb2ff3ea01af0f9a1a001ef5f7c4c647956ed8cbf0e95", size = 184323, upload-time = "2024-07-20T17:26:31.252Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/d9/dd/e0aa7ebef5168c75b772eda64978c597a9129b46be17779054652a7999e4/pyOpenSSL-24.2.1-py3-none-any.whl", hash = "sha256:967d5719b12b243588573f39b0c677637145c7a1ffedcd495a487e58177fbb8d", size = 58390, upload-time = "2024-07-20T17:26:29.057Z" }, +] + +[[package]] +name = "pyparsing" +version = "3.3.2" +source = { registry = "https://pypi.org/simple" } +sdist = { url = 
"https://files.pythonhosted.org/packages/f3/91/9c6ee907786a473bf81c5f53cf703ba0957b23ab84c264080fb5a450416f/pyparsing-3.3.2.tar.gz", hash = "sha256:c777f4d763f140633dcb6d8a3eda953bf7a214dc4eff598413c070bcdc117cbc", size = 6851574, upload-time = "2026-01-21T03:57:59.36Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/10/bd/c038d7cc38edc1aa5bf91ab8068b63d4308c66c4c8bb3cbba7dfbc049f9c/pyparsing-3.3.2-py3-none-any.whl", hash = "sha256:850ba148bd908d7e2411587e247a1e4f0327839c40e2e5e6d05a007ecc69911d", size = 122781, upload-time = "2026-01-21T03:57:55.912Z" }, +] + [[package]] name = "pypdf2" version = "3.0.1" @@ -4965,6 +6014,33 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/8e/5e/c86a5643653825d3c913719e788e41386bee415c2b87b4f955432f2de6b2/pypdf2-3.0.1-py3-none-any.whl", hash = "sha256:d16e4205cfee272fbdc0568b68d82be796540b1537508cef59388f839c191928", size = 232572, upload-time = "2022-12-31T10:36:10.327Z" }, ] +[[package]] +name = "pysocks" +version = "1.7.1" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/bd/11/293dd436aea955d45fc4e8a35b6ae7270f5b8e00b53cf6c024c83b657a11/PySocks-1.7.1.tar.gz", hash = "sha256:3f8804571ebe159c380ac6de37643bb4685970655d3bba243530d6558b799aa0", size = 284429, upload-time = "2019-09-20T02:07:35.714Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/8d/59/b4572118e098ac8e46e399a1dd0f2d85403ce8bbaad9ec79373ed6badaf9/PySocks-1.7.1-py3-none-any.whl", hash = "sha256:2725bd0a9925919b9b51739eea5f9e2bae91e83288108a9ad338b2e3a4435ee5", size = 16725, upload-time = "2019-09-20T02:06:22.938Z" }, +] + +[[package]] +name = "pytablewriter" +version = "1.2.1" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "dataproperty" }, + { name = "mbstrdecoder" }, + { name = "pathvalidate" }, + { name = "setuptools" }, + { name = "tabledata" }, + { name = "tcolorpy" }, + { name = "typepy", extra = ["datetime"] }, +] +sdist = 
{ url = "https://files.pythonhosted.org/packages/f6/a1/617730f290f04d347103ab40bf67d317df6691b14746f6e1ea039fb57062/pytablewriter-1.2.1.tar.gz", hash = "sha256:7bd0f4f397e070e3b8a34edcf1b9257ccbb18305493d8350a5dbc9957fced959", size = 619241, upload-time = "2025-01-01T15:37:00.04Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/21/4c/c199512f01c845dfe5a7840ab3aae6c60463b5dc2a775be72502dfd9170a/pytablewriter-1.2.1-py3-none-any.whl", hash = "sha256:e906ff7ff5151d70a5f66e0f7b75642a7f2dce8d893c265b79cc9cf6bc04ddb4", size = 91083, upload-time = "2025-01-01T15:36:55.63Z" }, +] + [[package]] name = "pytest" version = "9.0.2" @@ -5456,6 +6532,24 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/1e/db/4254e3eabe8020b458f1a747140d32277ec7a271daf1d235b70dc0b4e6e3/requests-2.32.5-py3-none-any.whl", hash = "sha256:2462f94637a34fd532264295e186976db0f5d453d1cdd31473c85a6a161affb6", size = 64738, upload-time = "2025-08-18T20:46:00.542Z" }, ] +[package.optional-dependencies] +socks = [ + { name = "pysocks" }, +] + +[[package]] +name = "requests-oauthlib" +version = "2.0.0" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "oauthlib" }, + { name = "requests" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/42/f2/05f29bc3913aea15eb670be136045bf5c5bbf4b99ecb839da9b422bb2c85/requests-oauthlib-2.0.0.tar.gz", hash = "sha256:b3dffaebd884d8cd778494369603a9e7b58d29111bf6b41bdc2dcd87203af4e9", size = 55650, upload-time = "2024-03-22T20:32:29.939Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/3b/5d/63d4ae3b9daea098d5d6f5da83984853c1bbacd5dc826764b249fe119d24/requests_oauthlib-2.0.0-py2.py3-none-any.whl", hash = "sha256:7dd8a5c40426b779b0868c404bdef9768deccf22749cde15852df527e6269b36", size = 24179, upload-time = "2024-03-22T20:32:28.055Z" }, +] + [[package]] name = "requests-toolbelt" version = "1.0.0" @@ -5526,6 +6620,19 @@ wheels = [ { url = 
"https://files.pythonhosted.org/packages/25/7a/b0178788f8dc6cafce37a212c99565fa1fe7872c70c6c9c1e1a372d9d88f/rich-14.2.0-py3-none-any.whl", hash = "sha256:76bc51fe2e57d2b1be1f96c524b890b816e334ab4c1e45888799bfaab0021edd", size = 243393, upload-time = "2025-10-09T14:16:51.245Z" }, ] +[[package]] +name = "rouge-score" +version = "0.1.2" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "absl-py" }, + { name = "nltk" }, + { name = "numpy", version = "2.2.6", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.11'" }, + { name = "numpy", version = "2.4.0", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.11'" }, + { name = "six" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/e2/c5/9136736c37022a6ad27fea38f3111eb8f02fe75d067f9a985cc358653102/rouge_score-0.1.2.tar.gz", hash = "sha256:c7d4da2683e68c9abf0135ef915d63a46643666f848e558a1b9f7ead17ff0f04", size = 17400, upload-time = "2022-07-22T22:46:22.909Z" } + [[package]] name = "rpds-py" version = "0.30.0" @@ -5695,6 +6802,24 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/74/31/b0e29d572670dca3674eeee78e418f20bdf97fa8aa9ea71380885e175ca0/ruff-0.14.10-py3-none-win_arm64.whl", hash = "sha256:e51d046cf6dda98a4633b8a8a771451107413b0f07183b2bef03f075599e44e6", size = 13729839, upload-time = "2025-12-18T19:28:48.636Z" }, ] +[[package]] +name = "sacrebleu" +version = "2.6.0" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "colorama" }, + { name = "lxml" }, + { name = "numpy", version = "2.2.6", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.11'" }, + { name = "numpy", version = "2.4.0", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.11'" }, + { name = "portalocker" }, + { name = "regex" }, + { name = "tabulate" }, +] +sdist = { url = 
"https://files.pythonhosted.org/packages/d3/ed/d7acddcff74d690c56fe26a1f7828bdde548262828d0743414ea916c40c1/sacrebleu-2.6.0.tar.gz", hash = "sha256:91499b6cd46138d95154fff1e863c2f9be57e82f0c719d8dd718d0006cf6c566", size = 1893419, upload-time = "2026-01-12T17:17:20.799Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/06/f2/6c90ccf3ad1d09a7d662a405b274f3c93b92df59c8d6a025d26aaf34d302/sacrebleu-2.6.0-py3-none-any.whl", hash = "sha256:3edc1531575cfe4ad04ce53491a9307e234af1c3f805a1f491cbec844229a8a8", size = 100785, upload-time = "2026-01-12T17:17:18.868Z" }, +] + [[package]] name = "safetensors" version = "0.7.0" @@ -5775,10 +6900,12 @@ version = "1.8.0" source = { registry = "https://pypi.org/simple" } resolution-markers = [ "python_full_version >= '3.14' and sys_platform == 'linux'", - "python_full_version >= '3.12' and python_full_version < '3.14' and sys_platform == 'linux'", + "python_full_version == '3.13.*' and sys_platform == 'linux'", + "python_full_version == '3.12.*' and sys_platform == 'linux'", "python_full_version == '3.11.*' and sys_platform == 'linux'", "python_full_version >= '3.14' and sys_platform != 'linux'", - "python_full_version >= '3.12' and python_full_version < '3.14' and sys_platform != 'linux'", + "python_full_version == '3.13.*' and sys_platform != 'linux'", + "python_full_version == '3.12.*' and sys_platform != 'linux'", "python_full_version == '3.11.*' and sys_platform != 'linux'", ] dependencies = [ @@ -5893,10 +7020,12 @@ version = "1.17.0" source = { registry = "https://pypi.org/simple" } resolution-markers = [ "python_full_version >= '3.14' and sys_platform == 'linux'", - "python_full_version >= '3.12' and python_full_version < '3.14' and sys_platform == 'linux'", + "python_full_version == '3.13.*' and sys_platform == 'linux'", + "python_full_version == '3.12.*' and sys_platform == 'linux'", "python_full_version == '3.11.*' and sys_platform == 'linux'", "python_full_version >= '3.14' and sys_platform != 'linux'", - 
"python_full_version >= '3.12' and python_full_version < '3.14' and sys_platform != 'linux'", + "python_full_version == '3.13.*' and sys_platform != 'linux'", + "python_full_version == '3.12.*' and sys_platform != 'linux'", "python_full_version == '3.11.*' and sys_platform != 'linux'", ] dependencies = [ @@ -6039,6 +7168,15 @@ version = "1.0.0" source = { registry = "https://pypi.org/simple" } sdist = { url = "https://files.pythonhosted.org/packages/9e/bd/3704a8c3e0942d711c1299ebf7b9091930adae6675d7c8f476a7ce48653c/sgmllib3k-1.0.0.tar.gz", hash = "sha256:7868fb1c8bfa764c1ac563d3cf369c381d1325d36124933a726f29fcdaa812e9", size = 5750, upload-time = "2010-08-24T14:33:52.445Z" } +[[package]] +name = "shellingham" +version = "1.5.4" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/58/15/8b3609fd3830ef7b27b655beb4b4e9c62313a4e8da8c676e142cc210d58e/shellingham-1.5.4.tar.gz", hash = "sha256:8dbca0739d487e5bd35ab3ca4b36e11c4078f3a234bfce294b0a0291363404de", size = 10310, upload-time = "2023-10-24T04:13:40.426Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/e0/f9/0595336914c5619e5f28a1fb793285925a8cd4b432c9da0a987836c7f822/shellingham-1.5.4-py2.py3-none-any.whl", hash = "sha256:7ecfff8f2fd72616f7481040475a65b2bf8af90a56c89140852d1120324e8686", size = 9755, upload-time = "2023-10-24T04:13:38.866Z" }, +] + [[package]] name = "six" version = "1.17.0" @@ -6146,6 +7284,12 @@ asyncio = [ { name = "greenlet" }, ] +[[package]] +name = "sqlitedict" +version = "2.1.0" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/12/9a/7620d1e9dcb02839ed6d4b14064e609cdd7a8ae1e47289aa0456796dd9ca/sqlitedict-2.1.0.tar.gz", hash = "sha256:03d9cfb96d602996f1d4c2db2856f1224b96a9c431bdd16e78032a72940f9e8c", size = 21846, upload-time = "2022-12-03T13:39:13.102Z" } + [[package]] name = "sse-starlette" version = "3.0.4" @@ -6186,6 +7330,19 @@ wheels = [ { url = 
"https://files.pythonhosted.org/packages/d9/52/1064f510b141bd54025f9b55105e26d1fa970b9be67ad766380a3c9b74b0/starlette-0.50.0-py3-none-any.whl", hash = "sha256:9e5391843ec9b6e472eed1365a78c8098cfceb7a74bfd4d6b1c0c0095efb3bca", size = 74033, upload-time = "2025-11-01T15:25:25.461Z" }, ] +[[package]] +name = "stnd" +version = "1.0.0" +source = { git = "https://github.com/arubique/stnd.git?rev=0d23b52f7742c08b28be560d2d52d450fcd274b7#0d23b52f7742c08b28be560d2d52d450fcd274b7" } +dependencies = [ + { name = "filelock" }, + { name = "gspread" }, + { name = "pandas" }, + { name = "psutil" }, + { name = "pydrive2" }, + { name = "wandb" }, +] + [[package]] name = "sympy" version = "1.14.0" @@ -6198,6 +7355,37 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/a2/09/77d55d46fd61b4a135c444fc97158ef34a095e5681d0a6c10b75bf356191/sympy-1.14.0-py3-none-any.whl", hash = "sha256:e091cc3e99d2141a0ba2847328f5479b05d94a6635cb96148ccb3f34671bd8f5", size = 6299353, upload-time = "2025-04-27T18:04:59.103Z" }, ] +[[package]] +name = "tabledata" +version = "1.3.4" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "dataproperty" }, + { name = "typepy" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/b2/35/171c8977162f1163368406deddde4c59673b62bd0cb2f34948a02effb075/tabledata-1.3.4.tar.gz", hash = "sha256:e9649cab129d718f3bff4150083b77f8a78c30f6634a30caf692b10fdc60cb97", size = 25074, upload-time = "2024-12-31T14:12:31.198Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/08/64/fa4160151976ee4b2cf0c1217a99443ffaeb991956feddfeac9eee9952f8/tabledata-1.3.4-py3-none-any.whl", hash = "sha256:1f56e433bfdeb89f4487abfa48c4603a3b07c5d3a3c7e05ff73dd018c24bd0d4", size = 11820, upload-time = "2024-12-31T14:12:28.584Z" }, +] + +[[package]] +name = "tabulate" +version = "0.10.0" +source = { registry = "https://pypi.org/simple" } +sdist = { url = 
"https://files.pythonhosted.org/packages/46/58/8c37dea7bbf769b20d58e7ace7e5edfe65b849442b00ffcdd56be88697c6/tabulate-0.10.0.tar.gz", hash = "sha256:e2cfde8f79420f6deeffdeda9aaec3b6bc5abce947655d17ac662b126e48a60d", size = 91754, upload-time = "2026-03-04T18:55:34.402Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/99/55/db07de81b5c630da5cbf5c7df646580ca26dfaefa593667fc6f2fe016d2e/tabulate-0.10.0-py3-none-any.whl", hash = "sha256:f0b0622e567335c8fabaaa659f1b33bcb6ddfe2e496071b743aa113f8774f2d3", size = 39814, upload-time = "2026-03-04T18:55:31.284Z" }, +] + +[[package]] +name = "tcolorpy" +version = "0.1.7" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/80/cc/44f2d81d8f9093aad81c3467a5bf5718d2b5f786e887b6e4adcfc17ec6b9/tcolorpy-0.1.7.tar.gz", hash = "sha256:0fbf6bf238890bbc2e32662aa25736769a29bf6d880328f310c910a327632614", size = 299437, upload-time = "2024-12-29T15:24:23.847Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/05/a2/ed023f2edd1e011b4d99b6727bce8253842d66c3fbf9ed0a26fc09a92571/tcolorpy-0.1.7-py3-none-any.whl", hash = "sha256:26a59d52027e175a37e0aba72efc99dda43f074db71f55b316d3de37d3251378", size = 8096, upload-time = "2024-12-29T15:24:21.33Z" }, +] + [[package]] name = "tenacity" version = "9.1.2" @@ -6360,10 +7548,9 @@ wheels = [ [[package]] name = "torch" -version = "2.10.0" +version = "2.9.1" source = { registry = "https://pypi.org/simple" } dependencies = [ - { name = "cuda-bindings", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" }, { name = "filelock" }, { name = "fsspec" }, { name = "jinja2" }, @@ -6390,38 +7577,75 @@ dependencies = [ { name = "typing-extensions" }, ] wheels = [ - { url = "https://files.pythonhosted.org/packages/5b/30/bfebdd8ec77db9a79775121789992d6b3b75ee5494971294d7b4b7c999bc/torch-2.10.0-2-cp310-none-macosx_11_0_arm64.whl", hash = "sha256:2b980edd8d7c0a68c4e951ee1856334a43193f98730d97408fbd148c1a933313", size 
= 79411457, upload-time = "2026-02-10T21:44:59.189Z" }, - { url = "https://files.pythonhosted.org/packages/0f/8b/4b61d6e13f7108f36910df9ab4b58fd389cc2520d54d81b88660804aad99/torch-2.10.0-2-cp311-none-macosx_11_0_arm64.whl", hash = "sha256:418997cb02d0a0f1497cf6a09f63166f9f5df9f3e16c8a716ab76a72127c714f", size = 79423467, upload-time = "2026-02-10T21:44:48.711Z" }, - { url = "https://files.pythonhosted.org/packages/d3/54/a2ba279afcca44bbd320d4e73675b282fcee3d81400ea1b53934efca6462/torch-2.10.0-2-cp312-none-macosx_11_0_arm64.whl", hash = "sha256:13ec4add8c3faaed8d13e0574f5cd4a323c11655546f91fbe6afa77b57423574", size = 79498202, upload-time = "2026-02-10T21:44:52.603Z" }, - { url = "https://files.pythonhosted.org/packages/ec/23/2c9fe0c9c27f7f6cb865abcea8a4568f29f00acaeadfc6a37f6801f84cb4/torch-2.10.0-2-cp313-none-macosx_11_0_arm64.whl", hash = "sha256:e521c9f030a3774ed770a9c011751fb47c4d12029a3d6522116e48431f2ff89e", size = 79498254, upload-time = "2026-02-10T21:44:44.095Z" }, - { url = "https://files.pythonhosted.org/packages/0c/1a/c61f36cfd446170ec27b3a4984f072fd06dab6b5d7ce27e11adb35d6c838/torch-2.10.0-cp310-cp310-manylinux_2_28_aarch64.whl", hash = "sha256:5276fa790a666ee8becaffff8acb711922252521b28fbce5db7db5cf9cb2026d", size = 145992962, upload-time = "2026-01-21T16:24:14.04Z" }, - { url = "https://files.pythonhosted.org/packages/b5/60/6662535354191e2d1555296045b63e4279e5a9dbad49acf55a5d38655a39/torch-2.10.0-cp310-cp310-manylinux_2_28_x86_64.whl", hash = "sha256:aaf663927bcd490ae971469a624c322202a2a1e68936eb952535ca4cd3b90444", size = 915599237, upload-time = "2026-01-21T16:23:25.497Z" }, - { url = "https://files.pythonhosted.org/packages/40/b8/66bbe96f0d79be2b5c697b2e0b187ed792a15c6c4b8904613454651db848/torch-2.10.0-cp310-cp310-win_amd64.whl", hash = "sha256:a4be6a2a190b32ff5c8002a0977a25ea60e64f7ba46b1be37093c141d9c49aeb", size = 113720931, upload-time = "2026-01-21T16:24:23.743Z" }, - { url = 
"https://files.pythonhosted.org/packages/76/bb/d820f90e69cda6c8169b32a0c6a3ab7b17bf7990b8f2c680077c24a3c14c/torch-2.10.0-cp310-none-macosx_11_0_arm64.whl", hash = "sha256:35e407430795c8d3edb07a1d711c41cc1f9eaddc8b2f1cc0a165a6767a8fb73d", size = 79411450, upload-time = "2026-01-21T16:25:30.692Z" }, - { url = "https://files.pythonhosted.org/packages/78/89/f5554b13ebd71e05c0b002f95148033e730d3f7067f67423026cc9c69410/torch-2.10.0-cp311-cp311-manylinux_2_28_aarch64.whl", hash = "sha256:3282d9febd1e4e476630a099692b44fdc214ee9bf8ee5377732d9d9dfe5712e4", size = 145992610, upload-time = "2026-01-21T16:25:26.327Z" }, - { url = "https://files.pythonhosted.org/packages/ae/30/a3a2120621bf9c17779b169fc17e3dc29b230c29d0f8222f499f5e159aa8/torch-2.10.0-cp311-cp311-manylinux_2_28_x86_64.whl", hash = "sha256:a2f9edd8dbc99f62bc4dfb78af7bf89499bca3d753423ac1b4e06592e467b763", size = 915607863, upload-time = "2026-01-21T16:25:06.696Z" }, - { url = "https://files.pythonhosted.org/packages/6f/3d/c87b33c5f260a2a8ad68da7147e105f05868c281c63d65ed85aa4da98c66/torch-2.10.0-cp311-cp311-win_amd64.whl", hash = "sha256:29b7009dba4b7a1c960260fc8ac85022c784250af43af9fb0ebafc9883782ebd", size = 113723116, upload-time = "2026-01-21T16:25:21.916Z" }, - { url = "https://files.pythonhosted.org/packages/61/d8/15b9d9d3a6b0c01b883787bd056acbe5cc321090d4b216d3ea89a8fcfdf3/torch-2.10.0-cp311-none-macosx_11_0_arm64.whl", hash = "sha256:b7bd80f3477b830dd166c707c5b0b82a898e7b16f59a7d9d42778dd058272e8b", size = 79423461, upload-time = "2026-01-21T16:24:50.266Z" }, - { url = "https://files.pythonhosted.org/packages/cc/af/758e242e9102e9988969b5e621d41f36b8f258bb4a099109b7a4b4b50ea4/torch-2.10.0-cp312-cp312-manylinux_2_28_aarch64.whl", hash = "sha256:5fd4117d89ffd47e3dcc71e71a22efac24828ad781c7e46aaaf56bf7f2796acf", size = 145996088, upload-time = "2026-01-21T16:24:44.171Z" }, - { url = 
"https://files.pythonhosted.org/packages/23/8e/3c74db5e53bff7ed9e34c8123e6a8bfef718b2450c35eefab85bb4a7e270/torch-2.10.0-cp312-cp312-manylinux_2_28_x86_64.whl", hash = "sha256:787124e7db3b379d4f1ed54dd12ae7c741c16a4d29b49c0226a89bea50923ffb", size = 915711952, upload-time = "2026-01-21T16:23:53.503Z" }, - { url = "https://files.pythonhosted.org/packages/6e/01/624c4324ca01f66ae4c7cd1b74eb16fb52596dce66dbe51eff95ef9e7a4c/torch-2.10.0-cp312-cp312-win_amd64.whl", hash = "sha256:2c66c61f44c5f903046cc696d088e21062644cbe541c7f1c4eaae88b2ad23547", size = 113757972, upload-time = "2026-01-21T16:24:39.516Z" }, - { url = "https://files.pythonhosted.org/packages/c9/5c/dee910b87c4d5c0fcb41b50839ae04df87c1cfc663cf1b5fca7ea565eeaa/torch-2.10.0-cp312-none-macosx_11_0_arm64.whl", hash = "sha256:6d3707a61863d1c4d6ebba7be4ca320f42b869ee657e9b2c21c736bf17000294", size = 79498198, upload-time = "2026-01-21T16:24:34.704Z" }, - { url = "https://files.pythonhosted.org/packages/c9/6f/f2e91e34e3fcba2e3fc8d8f74e7d6c22e74e480bbd1db7bc8900fdf3e95c/torch-2.10.0-cp313-cp313-manylinux_2_28_aarch64.whl", hash = "sha256:5c4d217b14741e40776dd7074d9006fd28b8a97ef5654db959d8635b2fe5f29b", size = 146004247, upload-time = "2026-01-21T16:24:29.335Z" }, - { url = "https://files.pythonhosted.org/packages/98/fb/5160261aeb5e1ee12ee95fe599d0541f7c976c3701d607d8fc29e623229f/torch-2.10.0-cp313-cp313-manylinux_2_28_x86_64.whl", hash = "sha256:6b71486353fce0f9714ca0c9ef1c850a2ae766b409808acd58e9678a3edb7738", size = 915716445, upload-time = "2026-01-21T16:22:45.353Z" }, - { url = "https://files.pythonhosted.org/packages/6a/16/502fb1b41e6d868e8deb5b0e3ae926bbb36dab8ceb0d1b769b266ad7b0c3/torch-2.10.0-cp313-cp313-win_amd64.whl", hash = "sha256:c2ee399c644dc92ef7bc0d4f7e74b5360c37cdbe7c5ba11318dda49ffac2bc57", size = 113757050, upload-time = "2026-01-21T16:24:19.204Z" }, - { url = 
"https://files.pythonhosted.org/packages/1a/0b/39929b148f4824bc3ad6f9f72a29d4ad865bcf7ebfc2fa67584773e083d2/torch-2.10.0-cp313-cp313t-macosx_14_0_arm64.whl", hash = "sha256:3202429f58309b9fa96a614885eace4b7995729f44beb54d3e4a47773649d382", size = 79851305, upload-time = "2026-01-21T16:24:09.209Z" }, - { url = "https://files.pythonhosted.org/packages/d8/14/21fbce63bc452381ba5f74a2c0a959fdf5ad5803ccc0c654e752e0dbe91a/torch-2.10.0-cp313-cp313t-manylinux_2_28_aarch64.whl", hash = "sha256:aae1b29cd68e50a9397f5ee897b9c24742e9e306f88a807a27d617f07adb3bd8", size = 146005472, upload-time = "2026-01-21T16:22:29.022Z" }, - { url = "https://files.pythonhosted.org/packages/54/fd/b207d1c525cb570ef47f3e9f836b154685011fce11a2f444ba8a4084d042/torch-2.10.0-cp313-cp313t-manylinux_2_28_x86_64.whl", hash = "sha256:6021db85958db2f07ec94e1bc77212721ba4920c12a18dc552d2ae36a3eb163f", size = 915612644, upload-time = "2026-01-21T16:21:47.019Z" }, - { url = "https://files.pythonhosted.org/packages/36/53/0197f868c75f1050b199fe58f9bf3bf3aecac9b4e85cc9c964383d745403/torch-2.10.0-cp313-cp313t-win_amd64.whl", hash = "sha256:ff43db38af76fda183156153983c9a096fc4c78d0cd1e07b14a2314c7f01c2c8", size = 113997015, upload-time = "2026-01-21T16:23:00.767Z" }, - { url = "https://files.pythonhosted.org/packages/0e/13/e76b4d9c160e89fff48bf16b449ea324bda84745d2ab30294c37c2434c0d/torch-2.10.0-cp313-none-macosx_11_0_arm64.whl", hash = "sha256:cdf2a523d699b70d613243211ecaac14fe9c5df8a0b0a9c02add60fb2a413e0f", size = 79498248, upload-time = "2026-01-21T16:23:09.315Z" }, - { url = "https://files.pythonhosted.org/packages/4f/93/716b5ac0155f1be70ed81bacc21269c3ece8dba0c249b9994094110bfc51/torch-2.10.0-cp314-cp314-macosx_14_0_arm64.whl", hash = "sha256:bf0d9ff448b0218e0433aeb198805192346c4fd659c852370d5cc245f602a06a", size = 79464992, upload-time = "2026-01-21T16:23:05.162Z" }, - { url = 
"https://files.pythonhosted.org/packages/69/2b/51e663ff190c9d16d4a8271203b71bc73a16aa7619b9f271a69b9d4a936b/torch-2.10.0-cp314-cp314-manylinux_2_28_aarch64.whl", hash = "sha256:233aed0659a2503b831d8a67e9da66a62c996204c0bba4f4c442ccc0c68a3f60", size = 146018567, upload-time = "2026-01-21T16:22:23.393Z" }, - { url = "https://files.pythonhosted.org/packages/5e/cd/4b95ef7f293b927c283db0b136c42be91c8ec6845c44de0238c8c23bdc80/torch-2.10.0-cp314-cp314-manylinux_2_28_x86_64.whl", hash = "sha256:682497e16bdfa6efeec8cde66531bc8d1fbbbb4d8788ec6173c089ed3cc2bfe5", size = 915721646, upload-time = "2026-01-21T16:21:16.983Z" }, - { url = "https://files.pythonhosted.org/packages/56/97/078a007208f8056d88ae43198833469e61a0a355abc0b070edd2c085eb9a/torch-2.10.0-cp314-cp314-win_amd64.whl", hash = "sha256:6528f13d2a8593a1a412ea07a99812495bec07e9224c28b2a25c0a30c7da025c", size = 113752373, upload-time = "2026-01-21T16:22:13.471Z" }, - { url = "https://files.pythonhosted.org/packages/d8/94/71994e7d0d5238393df9732fdab607e37e2b56d26a746cb59fdb415f8966/torch-2.10.0-cp314-cp314t-macosx_14_0_arm64.whl", hash = "sha256:f5ab4ba32383061be0fb74bda772d470140a12c1c3b58a0cfbf3dae94d164c28", size = 79850324, upload-time = "2026-01-21T16:22:09.494Z" }, - { url = "https://files.pythonhosted.org/packages/e2/65/1a05346b418ea8ccd10360eef4b3e0ce688fba544e76edec26913a8d0ee0/torch-2.10.0-cp314-cp314t-manylinux_2_28_aarch64.whl", hash = "sha256:716b01a176c2a5659c98f6b01bf868244abdd896526f1c692712ab36dbaf9b63", size = 146006482, upload-time = "2026-01-21T16:22:18.42Z" }, - { url = "https://files.pythonhosted.org/packages/1d/b9/5f6f9d9e859fc3235f60578fa64f52c9c6e9b4327f0fe0defb6de5c0de31/torch-2.10.0-cp314-cp314t-manylinux_2_28_x86_64.whl", hash = "sha256:d8f5912ba938233f86361e891789595ff35ca4b4e2ac8fe3670895e5976731d6", size = 915613050, upload-time = "2026-01-21T16:20:49.035Z" }, - { url = 
"https://files.pythonhosted.org/packages/66/4d/35352043ee0eaffdeff154fad67cd4a31dbed7ff8e3be1cc4549717d6d51/torch-2.10.0-cp314-cp314t-win_amd64.whl", hash = "sha256:71283a373f0ee2c89e0f0d5f446039bdabe8dbc3c9ccf35f0f784908b0acd185", size = 113995816, upload-time = "2026-01-21T16:22:05.312Z" }, + { url = "https://files.pythonhosted.org/packages/5f/56/9577683b23072075ed2e40d725c52c2019d71a972fab8e083763da8e707e/torch-2.9.1-cp310-cp310-manylinux_2_28_aarch64.whl", hash = "sha256:1cc208435f6c379f9b8fdfd5ceb5be1e3b72a6bdf1cb46c0d2812aa73472db9e", size = 104207681, upload-time = "2025-11-12T15:19:56.48Z" }, + { url = "https://files.pythonhosted.org/packages/38/45/be5a74f221df8f4b609b78ff79dc789b0cc9017624544ac4dd1c03973150/torch-2.9.1-cp310-cp310-manylinux_2_28_x86_64.whl", hash = "sha256:9fd35c68b3679378c11f5eb73220fdcb4e6f4592295277fbb657d31fd053237c", size = 899794036, upload-time = "2025-11-12T15:21:01.886Z" }, + { url = "https://files.pythonhosted.org/packages/67/95/a581e8a382596b69385a44bab2733f1273d45c842f5d4a504c0edc3133b6/torch-2.9.1-cp310-cp310-win_amd64.whl", hash = "sha256:2af70e3be4a13becba4655d6cc07dcfec7ae844db6ac38d6c1dafeb245d17d65", size = 110969861, upload-time = "2025-11-12T15:21:30.145Z" }, + { url = "https://files.pythonhosted.org/packages/ad/51/1756dc128d2bf6ea4e0a915cb89ea5e730315ff33d60c1ff56fd626ba3eb/torch-2.9.1-cp310-none-macosx_11_0_arm64.whl", hash = "sha256:a83b0e84cc375e3318a808d032510dde99d696a85fe9473fc8575612b63ae951", size = 74452222, upload-time = "2025-11-12T15:20:46.223Z" }, + { url = "https://files.pythonhosted.org/packages/15/db/c064112ac0089af3d2f7a2b5bfbabf4aa407a78b74f87889e524b91c5402/torch-2.9.1-cp311-cp311-manylinux_2_28_aarch64.whl", hash = "sha256:62b3fd888277946918cba4478cf849303da5359f0fb4e3bfb86b0533ba2eaf8d", size = 104220430, upload-time = "2025-11-12T15:20:31.705Z" }, + { url = 
"https://files.pythonhosted.org/packages/56/be/76eaa36c9cd032d3b01b001e2c5a05943df75f26211f68fae79e62f87734/torch-2.9.1-cp311-cp311-manylinux_2_28_x86_64.whl", hash = "sha256:d033ff0ac3f5400df862a51bdde9bad83561f3739ea0046e68f5401ebfa67c1b", size = 899821446, upload-time = "2025-11-12T15:20:15.544Z" }, + { url = "https://files.pythonhosted.org/packages/47/cc/7a2949e38dfe3244c4df21f0e1c27bce8aedd6c604a587dd44fc21017cb4/torch-2.9.1-cp311-cp311-win_amd64.whl", hash = "sha256:0d06b30a9207b7c3516a9e0102114024755a07045f0c1d2f2a56b1819ac06bcb", size = 110973074, upload-time = "2025-11-12T15:21:39.958Z" }, + { url = "https://files.pythonhosted.org/packages/1e/ce/7d251155a783fb2c1bb6837b2b7023c622a2070a0a72726ca1df47e7ea34/torch-2.9.1-cp311-none-macosx_11_0_arm64.whl", hash = "sha256:52347912d868653e1528b47cafaf79b285b98be3f4f35d5955389b1b95224475", size = 74463887, upload-time = "2025-11-12T15:20:36.611Z" }, + { url = "https://files.pythonhosted.org/packages/0f/27/07c645c7673e73e53ded71705045d6cb5bae94c4b021b03aa8d03eee90ab/torch-2.9.1-cp312-cp312-manylinux_2_28_aarch64.whl", hash = "sha256:da5f6f4d7f4940a173e5572791af238cb0b9e21b1aab592bd8b26da4c99f1cd6", size = 104126592, upload-time = "2025-11-12T15:20:41.62Z" }, + { url = "https://files.pythonhosted.org/packages/19/17/e377a460603132b00760511299fceba4102bd95db1a0ee788da21298ccff/torch-2.9.1-cp312-cp312-manylinux_2_28_x86_64.whl", hash = "sha256:27331cd902fb4322252657f3902adf1c4f6acad9dcad81d8df3ae14c7c4f07c4", size = 899742281, upload-time = "2025-11-12T15:22:17.602Z" }, + { url = "https://files.pythonhosted.org/packages/b1/1a/64f5769025db846a82567fa5b7d21dba4558a7234ee631712ee4771c436c/torch-2.9.1-cp312-cp312-win_amd64.whl", hash = "sha256:81a285002d7b8cfd3fdf1b98aa8df138d41f1a8334fd9ea37511517cedf43083", size = 110940568, upload-time = "2025-11-12T15:21:18.689Z" }, + { url = 
"https://files.pythonhosted.org/packages/6e/ab/07739fd776618e5882661d04c43f5b5586323e2f6a2d7d84aac20d8f20bd/torch-2.9.1-cp312-none-macosx_11_0_arm64.whl", hash = "sha256:c0d25d1d8e531b8343bea0ed811d5d528958f1dcbd37e7245bc686273177ad7e", size = 74479191, upload-time = "2025-11-12T15:21:25.816Z" }, + { url = "https://files.pythonhosted.org/packages/20/60/8fc5e828d050bddfab469b3fe78e5ab9a7e53dda9c3bdc6a43d17ce99e63/torch-2.9.1-cp313-cp313-manylinux_2_28_aarch64.whl", hash = "sha256:c29455d2b910b98738131990394da3e50eea8291dfeb4b12de71ecf1fdeb21cb", size = 104135743, upload-time = "2025-11-12T15:21:34.936Z" }, + { url = "https://files.pythonhosted.org/packages/f2/b7/6d3f80e6918213babddb2a37b46dbb14c15b14c5f473e347869a51f40e1f/torch-2.9.1-cp313-cp313-manylinux_2_28_x86_64.whl", hash = "sha256:524de44cd13931208ba2c4bde9ec7741fd4ae6bfd06409a604fc32f6520c2bc9", size = 899749493, upload-time = "2025-11-12T15:24:36.356Z" }, + { url = "https://files.pythonhosted.org/packages/a6/47/c7843d69d6de8938c1cbb1eba426b1d48ddf375f101473d3e31a5fc52b74/torch-2.9.1-cp313-cp313-win_amd64.whl", hash = "sha256:545844cc16b3f91e08ce3b40e9c2d77012dd33a48d505aed34b7740ed627a1b2", size = 110944162, upload-time = "2025-11-12T15:21:53.151Z" }, + { url = "https://files.pythonhosted.org/packages/28/0e/2a37247957e72c12151b33a01e4df651d9d155dd74d8cfcbfad15a79b44a/torch-2.9.1-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:5be4bf7496f1e3ffb1dd44b672adb1ac3f081f204c5ca81eba6442f5f634df8e", size = 74830751, upload-time = "2025-11-12T15:21:43.792Z" }, + { url = "https://files.pythonhosted.org/packages/4b/f7/7a18745edcd7b9ca2381aa03353647bca8aace91683c4975f19ac233809d/torch-2.9.1-cp313-cp313t-manylinux_2_28_aarch64.whl", hash = "sha256:30a3e170a84894f3652434b56d59a64a2c11366b0ed5776fab33c2439396bf9a", size = 104142929, upload-time = "2025-11-12T15:21:48.319Z" }, + { url = 
"https://files.pythonhosted.org/packages/f4/dd/f1c0d879f2863ef209e18823a988dc7a1bf40470750e3ebe927efdb9407f/torch-2.9.1-cp313-cp313t-manylinux_2_28_x86_64.whl", hash = "sha256:8301a7b431e51764629208d0edaa4f9e4c33e6df0f2f90b90e261d623df6a4e2", size = 899748978, upload-time = "2025-11-12T15:23:04.568Z" }, + { url = "https://files.pythonhosted.org/packages/1f/9f/6986b83a53b4d043e36f3f898b798ab51f7f20fdf1a9b01a2720f445043d/torch-2.9.1-cp313-cp313t-win_amd64.whl", hash = "sha256:2e1c42c0ae92bf803a4b2409fdfed85e30f9027a66887f5e7dcdbc014c7531db", size = 111176995, upload-time = "2025-11-12T15:22:01.618Z" }, + { url = "https://files.pythonhosted.org/packages/40/60/71c698b466dd01e65d0e9514b5405faae200c52a76901baf6906856f17e4/torch-2.9.1-cp313-none-macosx_11_0_arm64.whl", hash = "sha256:2c14b3da5df416cf9cb5efab83aa3056f5b8cd8620b8fde81b4987ecab730587", size = 74480347, upload-time = "2025-11-12T15:21:57.648Z" }, + { url = "https://files.pythonhosted.org/packages/48/50/c4b5112546d0d13cc9eaa1c732b823d676a9f49ae8b6f97772f795874a03/torch-2.9.1-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:1edee27a7c9897f4e0b7c14cfc2f3008c571921134522d5b9b5ec4ebbc69041a", size = 74433245, upload-time = "2025-11-12T15:22:39.027Z" }, + { url = "https://files.pythonhosted.org/packages/81/c9/2628f408f0518b3bae49c95f5af3728b6ab498c8624ab1e03a43dd53d650/torch-2.9.1-cp314-cp314-manylinux_2_28_aarch64.whl", hash = "sha256:19d144d6b3e29921f1fc70503e9f2fc572cde6a5115c0c0de2f7ca8b1483e8b6", size = 104134804, upload-time = "2025-11-12T15:22:35.222Z" }, + { url = "https://files.pythonhosted.org/packages/28/fc/5bc91d6d831ae41bf6e9e6da6468f25330522e92347c9156eb3f1cb95956/torch-2.9.1-cp314-cp314-manylinux_2_28_x86_64.whl", hash = "sha256:c432d04376f6d9767a9852ea0def7b47a7bbc8e7af3b16ac9cf9ce02b12851c9", size = 899747132, upload-time = "2025-11-12T15:23:36.068Z" }, + { url = 
"https://files.pythonhosted.org/packages/63/5d/e8d4e009e52b6b2cf1684bde2a6be157b96fb873732542fb2a9a99e85a83/torch-2.9.1-cp314-cp314-win_amd64.whl", hash = "sha256:d187566a2cdc726fc80138c3cdb260970fab1c27e99f85452721f7759bbd554d", size = 110934845, upload-time = "2025-11-12T15:22:48.367Z" }, + { url = "https://files.pythonhosted.org/packages/bd/b2/2d15a52516b2ea3f414643b8de68fa4cb220d3877ac8b1028c83dc8ca1c4/torch-2.9.1-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:cb10896a1f7fedaddbccc2017ce6ca9ecaaf990f0973bdfcf405439750118d2c", size = 74823558, upload-time = "2025-11-12T15:22:43.392Z" }, + { url = "https://files.pythonhosted.org/packages/86/5c/5b2e5d84f5b9850cd1e71af07524d8cbb74cba19379800f1f9f7c997fc70/torch-2.9.1-cp314-cp314t-manylinux_2_28_aarch64.whl", hash = "sha256:0a2bd769944991c74acf0c4ef23603b9c777fdf7637f115605a4b2d8023110c7", size = 104145788, upload-time = "2025-11-12T15:23:52.109Z" }, + { url = "https://files.pythonhosted.org/packages/a9/8c/3da60787bcf70add986c4ad485993026ac0ca74f2fc21410bc4eb1bb7695/torch-2.9.1-cp314-cp314t-manylinux_2_28_x86_64.whl", hash = "sha256:07c8a9660bc9414c39cac530ac83b1fb1b679d7155824144a40a54f4a47bfa73", size = 899735500, upload-time = "2025-11-12T15:24:08.788Z" }, + { url = "https://files.pythonhosted.org/packages/db/2b/f7818f6ec88758dfd21da46b6cd46af9d1b3433e53ddbb19ad1e0da17f9b/torch-2.9.1-cp314-cp314t-win_amd64.whl", hash = "sha256:c88d3299ddeb2b35dcc31753305612db485ab6f1823e37fb29451c8b2732b87e", size = 111163659, upload-time = "2025-11-12T15:23:20.009Z" }, +] + +[[package]] +name = "torchvision" +version = "0.24.1" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "numpy", version = "2.2.6", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.11'" }, + { name = "numpy", version = "2.4.0", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.11'" }, + { name = "pillow" }, + { name = "torch" }, +] +wheels = [ + { url 
= "https://files.pythonhosted.org/packages/f7/09/d51aadf8591138e08b74c64a6eb783630c7a31ca2634416277115a9c3a2b/torchvision-0.24.1-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:ded5e625788572e4e1c4d155d1bbc48805c113794100d70e19c76e39e4d53465", size = 1891441, upload-time = "2025-11-12T15:25:01.687Z" }, + { url = "https://files.pythonhosted.org/packages/6b/49/a35df863e7c153aad82af7505abd8264a5b510306689712ef86bea862822/torchvision-0.24.1-cp310-cp310-manylinux_2_28_aarch64.whl", hash = "sha256:54ed17c3d30e718e08d8da3fd5b30ea44b0311317e55647cb97077a29ecbc25b", size = 2386226, upload-time = "2025-11-12T15:25:05.449Z" }, + { url = "https://files.pythonhosted.org/packages/49/20/f2d7cd1eea052887c1083afff0b8df5228ec93b53e03759f20b1a3c6d22a/torchvision-0.24.1-cp310-cp310-manylinux_2_28_x86_64.whl", hash = "sha256:f476da4e085b7307aaab6f540219617d46d5926aeda24be33e1359771c83778f", size = 8046093, upload-time = "2025-11-12T15:25:09.425Z" }, + { url = "https://files.pythonhosted.org/packages/d8/cf/0ff4007c09903199307da5f53a192ff5d62b45447069e9ef3a19bdc5ff12/torchvision-0.24.1-cp310-cp310-win_amd64.whl", hash = "sha256:fbdbdae5e540b868a681240b7dbd6473986c862445ee8a138680a6a97d6c34ff", size = 3696202, upload-time = "2025-11-12T15:25:10.657Z" }, + { url = "https://files.pythonhosted.org/packages/e7/69/30f5f03752aa1a7c23931d2519b31e557f3f10af5089d787cddf3b903ecf/torchvision-0.24.1-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:056c525dc875f18fe8e9c27079ada166a7b2755cea5a2199b0bc7f1f8364e600", size = 1891436, upload-time = "2025-11-12T15:25:04.3Z" }, + { url = "https://files.pythonhosted.org/packages/0c/69/49aae86edb75fe16460b59a191fcc0f568c2378f780bb063850db0fe007a/torchvision-0.24.1-cp311-cp311-manylinux_2_28_aarch64.whl", hash = "sha256:1e39619de698e2821d71976c92c8a9e50cdfd1e993507dfb340f2688bfdd8283", size = 2387757, upload-time = "2025-11-12T15:25:06.795Z" }, + { url = 
"https://files.pythonhosted.org/packages/11/c9/1dfc3db98797b326f1d0c3f3bb61c83b167a813fc7eab6fcd2edb8c7eb9d/torchvision-0.24.1-cp311-cp311-manylinux_2_28_x86_64.whl", hash = "sha256:a0f106663e60332aa4fcb1ca2159ef8c3f2ed266b0e6df88de261048a840e0df", size = 8047682, upload-time = "2025-11-12T15:25:21.125Z" }, + { url = "https://files.pythonhosted.org/packages/fa/bb/cfc6a6f6ccc84a534ed1fdf029ae5716dd6ff04e57ed9dc2dab38bf652d5/torchvision-0.24.1-cp311-cp311-win_amd64.whl", hash = "sha256:a9308cdd37d8a42e14a3e7fd9d271830c7fecb150dd929b642f3c1460514599a", size = 4037588, upload-time = "2025-11-12T15:25:14.402Z" }, + { url = "https://files.pythonhosted.org/packages/f0/af/18e2c6b9538a045f60718a0c5a058908ccb24f88fde8e6f0fc12d5ff7bd3/torchvision-0.24.1-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:e48bf6a8ec95872eb45763f06499f87bd2fb246b9b96cb00aae260fda2f96193", size = 1891433, upload-time = "2025-11-12T15:25:03.232Z" }, + { url = "https://files.pythonhosted.org/packages/9d/43/600e5cfb0643d10d633124f5982d7abc2170dfd7ce985584ff16edab3e76/torchvision-0.24.1-cp312-cp312-manylinux_2_28_aarch64.whl", hash = "sha256:7fb7590c737ebe3e1c077ad60c0e5e2e56bb26e7bccc3b9d04dbfc34fd09f050", size = 2386737, upload-time = "2025-11-12T15:25:08.288Z" }, + { url = "https://files.pythonhosted.org/packages/93/b1/db2941526ecddd84884132e2742a55c9311296a6a38627f9e2627f5ac889/torchvision-0.24.1-cp312-cp312-manylinux_2_28_x86_64.whl", hash = "sha256:66a98471fc18cad9064123106d810a75f57f0838eee20edc56233fd8484b0cc7", size = 8049868, upload-time = "2025-11-12T15:25:13.058Z" }, + { url = "https://files.pythonhosted.org/packages/69/98/16e583f59f86cd59949f59d52bfa8fc286f86341a229a9d15cbe7a694f0c/torchvision-0.24.1-cp312-cp312-win_amd64.whl", hash = "sha256:4aa6cb806eb8541e92c9b313e96192c6b826e9eb0042720e2fa250d021079952", size = 4302006, upload-time = "2025-11-12T15:25:16.184Z" }, + { url = 
"https://files.pythonhosted.org/packages/e4/97/ab40550f482577f2788304c27220e8ba02c63313bd74cf2f8920526aac20/torchvision-0.24.1-cp313-cp313-macosx_12_0_arm64.whl", hash = "sha256:8a6696db7fb71eadb2c6a48602106e136c785642e598eb1533e0b27744f2cce6", size = 1891435, upload-time = "2025-11-12T15:25:28.642Z" }, + { url = "https://files.pythonhosted.org/packages/30/65/ac0a3f9be6abdbe4e1d82c915d7e20de97e7fd0e9a277970508b015309f3/torchvision-0.24.1-cp313-cp313-manylinux_2_28_aarch64.whl", hash = "sha256:db2125c46f9cb25dc740be831ce3ce99303cfe60439249a41b04fd9f373be671", size = 2338718, upload-time = "2025-11-12T15:25:26.19Z" }, + { url = "https://files.pythonhosted.org/packages/10/b5/5bba24ff9d325181508501ed7f0c3de8ed3dd2edca0784d48b144b6c5252/torchvision-0.24.1-cp313-cp313-manylinux_2_28_x86_64.whl", hash = "sha256:f035f0cacd1f44a8ff6cb7ca3627d84c54d685055961d73a1a9fb9827a5414c8", size = 8049661, upload-time = "2025-11-12T15:25:22.558Z" }, + { url = "https://files.pythonhosted.org/packages/5c/ec/54a96ae9ab6a0dd66d4bba27771f892e36478a9c3489fa56e51c70abcc4d/torchvision-0.24.1-cp313-cp313-win_amd64.whl", hash = "sha256:16274823b93048e0a29d83415166a2e9e0bf4e1b432668357b657612a4802864", size = 4319808, upload-time = "2025-11-12T15:25:17.318Z" }, + { url = "https://files.pythonhosted.org/packages/d5/f3/a90a389a7e547f3eb8821b13f96ea7c0563cdefbbbb60a10e08dda9720ff/torchvision-0.24.1-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:e3f96208b4bef54cd60e415545f5200346a65024e04f29a26cd0006dbf9e8e66", size = 2005342, upload-time = "2025-11-12T15:25:11.871Z" }, + { url = "https://files.pythonhosted.org/packages/a9/fe/ff27d2ed1b524078164bea1062f23d2618a5fc3208e247d6153c18c91a76/torchvision-0.24.1-cp313-cp313t-manylinux_2_28_aarch64.whl", hash = "sha256:f231f6a4f2aa6522713326d0d2563538fa72d613741ae364f9913027fa52ea35", size = 2341708, upload-time = "2025-11-12T15:25:25.08Z" }, + { url = 
"https://files.pythonhosted.org/packages/b1/b9/d6c903495cbdfd2533b3ef6f7b5643ff589ea062f8feb5c206ee79b9d9e5/torchvision-0.24.1-cp313-cp313t-manylinux_2_28_x86_64.whl", hash = "sha256:1540a9e7f8cf55fe17554482f5a125a7e426347b71de07327d5de6bfd8d17caa", size = 8177239, upload-time = "2025-11-12T15:25:18.554Z" }, + { url = "https://files.pythonhosted.org/packages/4f/2b/ba02e4261369c3798310483028495cf507e6cb3f394f42e4796981ecf3a7/torchvision-0.24.1-cp313-cp313t-win_amd64.whl", hash = "sha256:d83e16d70ea85d2f196d678bfb702c36be7a655b003abed84e465988b6128938", size = 4251604, upload-time = "2025-11-12T15:25:34.069Z" }, + { url = "https://files.pythonhosted.org/packages/42/84/577b2cef8f32094add5f52887867da4c2a3e6b4261538447e9b48eb25812/torchvision-0.24.1-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:cccf4b4fec7fdfcd3431b9ea75d1588c0a8596d0333245dafebee0462abe3388", size = 2005319, upload-time = "2025-11-12T15:25:23.827Z" }, + { url = "https://files.pythonhosted.org/packages/5f/34/ecb786bffe0159a3b49941a61caaae089853132f3cd1e8f555e3621f7e6f/torchvision-0.24.1-cp314-cp314-manylinux_2_28_aarch64.whl", hash = "sha256:1b495edd3a8f9911292424117544f0b4ab780452e998649425d1f4b2bed6695f", size = 2338844, upload-time = "2025-11-12T15:25:32.625Z" }, + { url = "https://files.pythonhosted.org/packages/51/99/a84623786a6969504c87f2dc3892200f586ee13503f519d282faab0bb4f0/torchvision-0.24.1-cp314-cp314-manylinux_2_28_x86_64.whl", hash = "sha256:ab211e1807dc3e53acf8f6638df9a7444c80c0ad050466e8d652b3e83776987b", size = 8175144, upload-time = "2025-11-12T15:25:31.355Z" }, + { url = "https://files.pythonhosted.org/packages/6d/ba/8fae3525b233e109317ce6a9c1de922ab2881737b029a7e88021f81e068f/torchvision-0.24.1-cp314-cp314-win_amd64.whl", hash = "sha256:18f9cb60e64b37b551cd605a3d62c15730c086362b40682d23e24b616a697d41", size = 4234459, upload-time = "2025-11-12T15:25:19.859Z" }, + { url = 
"https://files.pythonhosted.org/packages/50/33/481602c1c72d0485d4b3a6b48c9534b71c2957c9d83bf860eb837bf5a620/torchvision-0.24.1-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:ec9d7379c519428395e4ffda4dbb99ec56be64b0a75b95989e00f9ec7ae0b2d7", size = 2005336, upload-time = "2025-11-12T15:25:27.225Z" }, + { url = "https://files.pythonhosted.org/packages/d0/7f/372de60bf3dd8f5593bd0d03f4aecf0d1fd58f5bc6943618d9d913f5e6d5/torchvision-0.24.1-cp314-cp314t-manylinux_2_28_aarch64.whl", hash = "sha256:af9201184c2712d808bd4eb656899011afdfce1e83721c7cb08000034df353fe", size = 2341704, upload-time = "2025-11-12T15:25:29.857Z" }, + { url = "https://files.pythonhosted.org/packages/36/9b/0f3b9ff3d0225ee2324ec663de0e7fb3eb855615ca958ac1875f22f1f8e5/torchvision-0.24.1-cp314-cp314t-manylinux_2_28_x86_64.whl", hash = "sha256:9ef95d819fd6df81bc7cc97b8f21a15d2c0d3ac5dbfaab5cbc2d2ce57114b19e", size = 8177422, upload-time = "2025-11-12T15:25:37.357Z" }, + { url = "https://files.pythonhosted.org/packages/d6/ab/e2bcc7c2f13d882a58f8b30ff86f794210b075736587ea50f8c545834f8a/torchvision-0.24.1-cp314-cp314t-win_amd64.whl", hash = "sha256:480b271d6edff83ac2e8d69bbb4cf2073f93366516a50d48f140ccfceedb002e", size = 4335190, upload-time = "2025-11-12T15:25:35.745Z" }, ] [[package]] @@ -6455,6 +7679,19 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/d0/30/dc54f88dd4a2b5dc8a0279bdd7270e735851848b762aeb1c1184ed1f6b14/tqdm-4.67.1-py3-none-any.whl", hash = "sha256:26445eca388f82e72884e0d580d5464cd801a3ea01e63e5601bdff9ba6a48de2", size = 78540, upload-time = "2024-11-24T20:12:19.698Z" }, ] +[[package]] +name = "tqdm-multiprocess" +version = "0.0.11" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "colorama" }, + { name = "tqdm" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/b4/1e/de81bd0f6cb2b61d6ee7ccbf304d99a42a0f53879481536dfb3288ee9a87/tqdm-multiprocess-0.0.11.tar.gz", hash = 
"sha256:a74002a1222ea9cbe8cdc9bd460108c6009be359621fbee9b92d0515d4d180f7", size = 8082, upload-time = "2020-10-27T06:57:54.313Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/25/7e/0d889fc6c84e3df6b69aaafe893fc77f69b3d968ac9ce574d1c62c688050/tqdm_multiprocess-0.0.11-py3-none-any.whl", hash = "sha256:3ebdf03e7a675150fa0bbceaa9c3c64b8cb556e9ffafa4fe6c078e51820524aa", size = 9817, upload-time = "2020-10-27T06:57:53.167Z" }, +] + [[package]] name = "traitlets" version = "5.14.3" @@ -6488,16 +7725,16 @@ wheels = [ [[package]] name = "triton" -version = "3.6.0" +version = "3.5.1" source = { registry = "https://pypi.org/simple" } wheels = [ - { url = "https://files.pythonhosted.org/packages/8c/f7/f1c9d3424ab199ac53c2da567b859bcddbb9c9e7154805119f8bd95ec36f/triton-3.6.0-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:a6550fae429e0667e397e5de64b332d1e5695b73650ee75a6146e2e902770bea", size = 188105201, upload-time = "2026-01-20T16:00:29.272Z" }, - { url = "https://files.pythonhosted.org/packages/e0/12/b05ba554d2c623bffa59922b94b0775673de251f468a9609bc9e45de95e9/triton-3.6.0-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:e8e323d608e3a9bfcc2d9efcc90ceefb764a82b99dea12a86d643c72539ad5d3", size = 188214640, upload-time = "2026-01-20T16:00:35.869Z" }, - { url = "https://files.pythonhosted.org/packages/ab/a8/cdf8b3e4c98132f965f88c2313a4b493266832ad47fb52f23d14d4f86bb5/triton-3.6.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:74caf5e34b66d9f3a429af689c1c7128daba1d8208df60e81106b115c00d6fca", size = 188266850, upload-time = "2026-01-20T16:00:43.041Z" }, - { url = "https://files.pythonhosted.org/packages/f9/0b/37d991d8c130ce81a8728ae3c25b6e60935838e9be1b58791f5997b24a54/triton-3.6.0-cp313-cp313-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:10c7f76c6e72d2ef08df639e3d0d30729112f47a56b0c81672edc05ee5116ac9", size = 188289450, upload-time = 
"2026-01-20T16:00:49.136Z" }, - { url = "https://files.pythonhosted.org/packages/35/f8/9c66bfc55361ec6d0e4040a0337fb5924ceb23de4648b8a81ae9d33b2b38/triton-3.6.0-cp313-cp313t-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:d002e07d7180fd65e622134fbd980c9a3d4211fb85224b56a0a0efbd422ab72f", size = 188400296, upload-time = "2026-01-20T16:00:56.042Z" }, - { url = "https://files.pythonhosted.org/packages/df/3d/9e7eee57b37c80cec63322c0231bb6da3cfe535a91d7a4d64896fcb89357/triton-3.6.0-cp314-cp314-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:a17a5d5985f0ac494ed8a8e54568f092f7057ef60e1b0fa09d3fd1512064e803", size = 188273063, upload-time = "2026-01-20T16:01:07.278Z" }, - { url = "https://files.pythonhosted.org/packages/f6/56/6113c23ff46c00aae423333eb58b3e60bdfe9179d542781955a5e1514cb3/triton-3.6.0-cp314-cp314t-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:46bd1c1af4b6704e554cad2eeb3b0a6513a980d470ccfa63189737340c7746a7", size = 188397994, upload-time = "2026-01-20T16:01:14.236Z" }, + { url = "https://files.pythonhosted.org/packages/fd/6e/676ab5019b4dde8b9b7bab71245102fc02778ef3df48218b298686b9ffd6/triton-3.5.1-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:5fc53d849f879911ea13f4a877243afc513187bc7ee92d1f2c0f1ba3169e3c94", size = 170320692, upload-time = "2025-11-11T17:40:46.074Z" }, + { url = "https://files.pythonhosted.org/packages/b0/72/ec90c3519eaf168f22cb1757ad412f3a2add4782ad3a92861c9ad135d886/triton-3.5.1-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:61413522a48add32302353fdbaaf92daaaab06f6b5e3229940d21b5207f47579", size = 170425802, upload-time = "2025-11-11T17:40:53.209Z" }, + { url = "https://files.pythonhosted.org/packages/f2/50/9a8358d3ef58162c0a415d173cfb45b67de60176e1024f71fbc4d24c0b6d/triton-3.5.1-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:d2c6b915a03888ab931a9fd3e55ba36785e1fe70cbea0b40c6ef93b20fc85232", size = 
170470207, upload-time = "2025-11-11T17:41:00.253Z" }, + { url = "https://files.pythonhosted.org/packages/27/46/8c3bbb5b0a19313f50edcaa363b599e5a1a5ac9683ead82b9b80fe497c8d/triton-3.5.1-cp313-cp313-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:f3f4346b6ebbd4fad18773f5ba839114f4826037c9f2f34e0148894cd5dd3dba", size = 170470410, upload-time = "2025-11-11T17:41:06.319Z" }, + { url = "https://files.pythonhosted.org/packages/37/92/e97fcc6b2c27cdb87ce5ee063d77f8f26f19f06916aa680464c8104ef0f6/triton-3.5.1-cp313-cp313t-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:0b4d2c70127fca6a23e247f9348b8adde979d2e7a20391bfbabaac6aebc7e6a8", size = 170579924, upload-time = "2025-11-11T17:41:12.455Z" }, + { url = "https://files.pythonhosted.org/packages/a4/e6/c595c35e5c50c4bc56a7bac96493dad321e9e29b953b526bbbe20f9911d0/triton-3.5.1-cp314-cp314-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:d0637b1efb1db599a8e9dc960d53ab6e4637db7d4ab6630a0974705d77b14b60", size = 170480488, upload-time = "2025-11-11T17:41:18.222Z" }, + { url = "https://files.pythonhosted.org/packages/16/b5/b0d3d8b901b6a04ca38df5e24c27e53afb15b93624d7fd7d658c7cd9352a/triton-3.5.1-cp314-cp314t-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:bac7f7d959ad0f48c0e97d6643a1cc0fd5786fe61cb1f83b537c6b2d54776478", size = 170582192, upload-time = "2025-11-11T17:41:23.963Z" }, ] [[package]] @@ -6525,6 +7762,40 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/df/ca/4201ed5cb2af73912663d0c6ded927c28c28b3c921c9348aa8d2cfef4853/ty-0.0.5-py3-none-win_arm64.whl", hash = "sha256:83bea5a5296caac20d52b790ded2b830a7ff91c4ed9f36730fe1f393ceed6654", size = 9566474, upload-time = "2025-12-20T21:19:22.518Z" }, ] +[[package]] +name = "typepy" +version = "1.3.4" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "mbstrdecoder" }, +] +sdist = { url = 
"https://files.pythonhosted.org/packages/79/59/4c39942077d7de285f762a91024dbda731be693591732977358f77d120fb/typepy-1.3.4.tar.gz", hash = "sha256:89c1f66de6c6133209c43a94d23431d320ba03ef5db18f241091ea594035d9de", size = 39558, upload-time = "2024-12-29T09:18:15.774Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/ee/31/e393c3830bdedd01735bd195c85ac3034b6bcaf6c18142bab60a4047ca36/typepy-1.3.4-py3-none-any.whl", hash = "sha256:d5ed3e0c7f49521bff0603dd08cf8d453371cf68d65a29d3d0038552ccc46e2e", size = 31449, upload-time = "2024-12-29T09:18:13.135Z" }, +] + +[package.optional-dependencies] +datetime = [ + { name = "packaging" }, + { name = "python-dateutil" }, + { name = "pytz" }, +] + +[[package]] +name = "typer" +version = "0.24.1" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "annotated-doc" }, + { name = "click" }, + { name = "rich" }, + { name = "shellingham" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/f5/24/cb09efec5cc954f7f9b930bf8279447d24618bb6758d4f6adf2574c41780/typer-0.24.1.tar.gz", hash = "sha256:e39b4732d65fbdcde189ae76cf7cd48aeae72919dea1fdfc16593be016256b45", size = 118613, upload-time = "2026-02-21T16:54:40.609Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/4a/91/48db081e7a63bb37284f9fbcefda7c44c277b18b0e13fbc36ea2335b71e6/typer-0.24.1-py3-none-any.whl", hash = "sha256:112c1f0ce578bfb4cab9ffdabc68f031416ebcc216536611ba21f04e9aa84c9e", size = 56085, upload-time = "2026-02-21T16:54:41.616Z" }, +] + [[package]] name = "typing-extensions" version = "4.15.0" @@ -6568,6 +7839,26 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/c7/b0/003792df09decd6849a5e39c28b513c06e84436a54440380862b5aeff25d/tzdata-2025.3-py2.py3-none-any.whl", hash = "sha256:06a47e5700f3081aab02b2e513160914ff0694bce9947d6b76ebd6bf57cfc5d1", size = 348521, upload-time = "2025-12-13T17:45:33.889Z" }, ] +[[package]] +name = "umap-learn" +version = "0.5.11" +source = { registry = 
"https://pypi.org/simple" } +dependencies = [ + { name = "numba" }, + { name = "numpy", version = "2.2.6", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.11'" }, + { name = "numpy", version = "2.4.0", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.11'" }, + { name = "pynndescent" }, + { name = "scikit-learn", version = "1.7.2", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.11'" }, + { name = "scikit-learn", version = "1.8.0", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.11'" }, + { name = "scipy", version = "1.15.3", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.11'" }, + { name = "scipy", version = "1.17.0", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.11'" }, + { name = "tqdm" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/94/9a/a1e4a257a9aa979dac4f6d5781dac929cbb0949959e2003ed82657d10b0f/umap_learn-0.5.11.tar.gz", hash = "sha256:31566ffd495fbf05d7ab3efcba703861c0f5e6fc6998a838d0e2becdd00e54f5", size = 96409, upload-time = "2026-01-12T20:44:47.553Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/43/d2/fcf7192dd1cd8c090b6cfd53fa223c4fb2887a17c47e06bc356d44f40dfb/umap_learn-0.5.11-py3-none-any.whl", hash = "sha256:cb17adbde9d544ba79481b3ab4d81ac222e940f3d9219307bea6044f869af3cc", size = 90890, upload-time = "2026-01-12T20:44:46.511Z" }, +] + [[package]] name = "uri-template" version = "1.3.0" @@ -6577,6 +7868,15 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/e7/00/3fca040d7cf8a32776d3d81a00c8ee7457e00f80c649f1e4a863c8321ae9/uri_template-1.3.0-py3-none-any.whl", hash = "sha256:a44a133ea12d44a0c0f06d7d42a52d71282e77e2f937d8abd5655b8d56fc1363", size = 11140, upload-time = "2023-06-21T01:49:03.467Z" }, ] +[[package]] +name = "uritemplate" +version = "4.2.0" +source = { registry = 
"https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/98/60/f174043244c5306c9988380d2cb10009f91563fc4b31293d27e17201af56/uritemplate-4.2.0.tar.gz", hash = "sha256:480c2ed180878955863323eea31b0ede668795de182617fef9c6ca09e6ec9d0e", size = 33267, upload-time = "2025-06-02T15:12:06.318Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/a9/99/3ae339466c9183ea5b8ae87b34c0b897eda475d2aec2307cae60e5cd4f29/uritemplate-4.2.0-py3-none-any.whl", hash = "sha256:962201ba1c4edcab02e60f9a0d3821e82dfc5d2d6662a21abd533879bdb8a686", size = 11488, upload-time = "2025-06-02T15:12:03.405Z" }, +] + [[package]] name = "urllib3" version = "2.6.2" @@ -6830,6 +8130,12 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/3f/0e/fa3b193432cfc60c93b42f3be03365f5f909d2b3ea410295cf36df739e31/widgetsnbextension-4.0.15-py3-none-any.whl", hash = "sha256:8156704e4346a571d9ce73b84bee86a29906c9abfd7223b7228a28899ccf3366", size = 2196503, upload-time = "2025-11-01T21:15:53.565Z" }, ] +[[package]] +name = "word2number" +version = "1.1" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/4a/29/a31940c848521f0725f0df6b25dca8917f13a2025b0e8fcbe5d0457e45e6/word2number-1.1.zip", hash = "sha256:70e27a5d387f67b04c71fbb7621c05930b19bfd26efd6851e6e0f9969dcde7d0", size = 9723, upload-time = "2017-06-02T15:45:14.488Z" } + [[package]] name = "wrapt" version = "1.17.3" From 9a61aafd5b0da82e9ddeebca0cf96ccf10095184 Mon Sep 17 00:00:00 2001 From: cemde <42615086+cemde@users.noreply.github.com> Date: Sat, 28 Mar 2026 03:14:10 +0100 Subject: [PATCH 19/23] improved testing --- CHANGELOG.md | 1 - tests/test_benchmarks/test_mmlu/conftest.py | 74 +++ .../test_mmlu/test_data_integrity.py | 243 ++++++++++ .../test_mmlu/test_mmlu_unit.py | 456 ++++++++++++++++++ tests/test_core/test_callback_handler.py | 45 ++ .../test_callbacks/test_file_result_logger.py | 88 ++++ tests/test_core/test_message_history.py | 97 ++++ 
.../test_message_tracing_callback.py | 22 + tests/test_core/test_system_info.py | 104 ++++ .../test_huggingface_scorer.py | 282 +++++++++++ 10 files changed, 1411 insertions(+), 1 deletion(-) create mode 100644 tests/test_benchmarks/test_mmlu/conftest.py create mode 100644 tests/test_benchmarks/test_mmlu/test_data_integrity.py create mode 100644 tests/test_benchmarks/test_mmlu/test_mmlu_unit.py create mode 100644 tests/test_core/test_callback_handler.py create mode 100644 tests/test_core/test_system_info.py create mode 100644 tests/test_interface/test_model_integration/test_huggingface_scorer.py diff --git a/CHANGELOG.md b/CHANGELOG.md index 321067ad..c49edc6a 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -42,7 +42,6 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 **Core** - Added `InformativeSubsetQueue` and `DISCOQueue` to `maseval.core.task` for subset-based evaluation (e.g., anchor-point selection for DISCO). `DISCOQueue` accepts `anchor_points_path` to load indices from a `.json`/`.pkl` file via `DISCOQueue.load_anchor_points()`. Available via `from maseval import DISCOQueue, InformativeSubsetQueue`. (PR: #34) -- Added `get_with_assert()` utility in `maseval.core.exceptions` for strict dictionary access that raises `KeyError` instead of silently returning a default. Supports nested key lookups. (PR: #34) - Added `ModelScorer` abstract base class in `maseval.core.scorer` for log-likelihood scoring, with `loglikelihood()`, `loglikelihood_batch()`, and `loglikelihood_choices()` methods. 
(PR: #34) - Added `SeedGenerator` abstract base class and `DefaultSeedGenerator` implementation for reproducible benchmark runs via SHA-256-based seed derivation (PR: #24) - Added `seed` and `seed_generator` parameters to `Benchmark.__init__` for enabling reproducibility (PR: #24) diff --git a/tests/test_benchmarks/test_mmlu/conftest.py b/tests/test_benchmarks/test_mmlu/conftest.py new file mode 100644 index 00000000..04b71a2e --- /dev/null +++ b/tests/test_benchmarks/test_mmlu/conftest.py @@ -0,0 +1,74 @@ +"""Shared fixtures for MMLU benchmark tests.""" + +import json +from typing import Any, Dict, List, Optional + +import pytest + +from maseval.core.task import Task + + +def make_mmlu_task( + query: str = "What is 2+2?", + gold: int = 0, + choices: Optional[List[str]] = None, + doc_id: int = 0, + full_prompt: str = "Few-shot examples...\n\nWhat is 2+2?", +) -> Task: + """Create a Task with MMLU-shaped data.""" + if choices is None: + choices = ["A", "B", "C", "D"] + return Task( + query=query, + id=f"mmlu_{doc_id}", + environment_data={ + "choices": choices, + "full_prompt": full_prompt, + "example": query, + }, + evaluation_data={"gold": gold}, + metadata={"doc_id": doc_id, "task_type": "mmlu"}, + ) + + +def make_mmlu_json_item( + query: str = "What is 2+2?", + gold: int = 0, + choices: Optional[List[str]] = None, + full_prompt: str = "Few-shot examples...\n\nWhat is 2+2?", +) -> Dict[str, Any]: + """Create a raw JSON dict matching the MMLU prompts file format.""" + if choices is None: + choices = ["A", "B", "C", "D"] + return { + "query": query, + "gold": gold, + "choices": choices, + "full_prompt": full_prompt, + "example": query, + } + + +@pytest.fixture +def sample_mmlu_task(): + """Single MMLU task with gold=0 (answer A).""" + return make_mmlu_task() + + +@pytest.fixture +def sample_mmlu_tasks(): + """Three MMLU tasks with different gold answers.""" + return [ + make_mmlu_task(query="Q1", gold=0, doc_id=0), + make_mmlu_task(query="Q2", gold=1, doc_id=1), + 
make_mmlu_task(query="Q3", gold=2, doc_id=2), + ] + + +@pytest.fixture +def mmlu_json_path(tmp_path): + """Write a 5-item MMLU JSON file and return its path.""" + items = [make_mmlu_json_item(query=f"Question {i}", gold=i % 4, full_prompt=f"Full prompt {i}") for i in range(5)] + path = tmp_path / "mmlu_prompts.json" + path.write_text(json.dumps(items)) + return path diff --git a/tests/test_benchmarks/test_mmlu/test_data_integrity.py b/tests/test_benchmarks/test_mmlu/test_data_integrity.py new file mode 100644 index 00000000..297907df --- /dev/null +++ b/tests/test_benchmarks/test_mmlu/test_data_integrity.py @@ -0,0 +1,243 @@ +"""Tier 2 tests for MMLU benchmark — real data from HuggingFace. + +Downloads the real MMLU dataset and validates data integrity, load_tasks +pipeline, and the evaluation pipeline against real task structure. + +Run with:: + + pytest -m "mmlu and live" tests/test_benchmarks/test_mmlu/test_data_integrity.py -v +""" + +from unittest.mock import MagicMock + +import pytest + +pytestmark = [ + pytest.mark.live, + pytest.mark.slow, + pytest.mark.benchmark, + pytest.mark.mmlu, +] + + +# --------------------------------------------------------------------------- +# Session-scoped fixtures — download once, reuse across tests +# --------------------------------------------------------------------------- + + +@pytest.fixture(scope="session") +def ensure_mmlu_data(): + """Download real MMLU data from HuggingFace once per session. + + Uses hf_hub_download with built-in caching — skips download when cached. + Returns the local path to the downloaded JSON file. 
+ """ + hf_hub = pytest.importorskip("huggingface_hub") + local_path = hf_hub.hf_hub_download( + repo_id="arubique/flattened-MMLU", + filename="mmlu_prompts_examples.json", + repo_type="dataset", + ) + return local_path + + +@pytest.fixture(scope="session") +def real_mmlu_tasks(ensure_mmlu_data): + """Load all real MMLU tasks from the downloaded dataset.""" + from maseval.benchmark.mmlu.mmlu import load_tasks + + return load_tasks(data_path=ensure_mmlu_data) + + +@pytest.fixture(scope="session") +def real_mmlu_tasks_small(ensure_mmlu_data): + """Load a small subset (50 tasks) for faster integration tests.""" + from maseval.benchmark.mmlu.mmlu import load_tasks + + return load_tasks(data_path=ensure_mmlu_data, limit=50) + + +# --------------------------------------------------------------------------- +# Data integrity — validate the real dataset +# --------------------------------------------------------------------------- + + +class TestMMLUDataIntegrity: + def test_dataset_loads_from_huggingface(self, ensure_mmlu_data): + """Downloaded file exists and is reachable.""" + from pathlib import Path + + assert Path(ensure_mmlu_data).exists() + + def test_total_task_count(self, real_mmlu_tasks): + """MMLU full dataset has >14 000 tasks.""" + assert len(real_mmlu_tasks) > 14_000 + + def test_task_schema(self, real_mmlu_tasks): + """Every task has required fields with correct types.""" + for task in real_mmlu_tasks: + assert isinstance(task.query, str) and len(task.query) > 0 + assert isinstance(task.evaluation_data["gold"], int) + assert 0 <= task.evaluation_data["gold"] <= 3 + choices = task.environment_data["choices"] + assert isinstance(choices, list) and len(choices) == 4 + assert isinstance(task.metadata["doc_id"], int) + + def test_gold_answer_distribution(self, real_mmlu_tasks): + """All four answer indices (0-3) appear as gold answers.""" + golds = {task.evaluation_data["gold"] for task in real_mmlu_tasks} + assert golds == {0, 1, 2, 3} + + def 
test_choices_are_abcd(self, real_mmlu_tasks): + """Every task's choices are [A, B, C, D].""" + for task in real_mmlu_tasks: + assert task.environment_data["choices"] == ["A", "B", "C", "D"] + + def test_full_prompt_present(self, real_mmlu_tasks): + """Every task has a non-empty full_prompt.""" + for task in real_mmlu_tasks: + assert len(task.environment_data.get("full_prompt", "")) > 0 + + def test_doc_ids_unique(self, real_mmlu_tasks): + """No duplicate doc_id values.""" + doc_ids = [task.metadata["doc_id"] for task in real_mmlu_tasks] + assert len(doc_ids) == len(set(doc_ids)) + + +# --------------------------------------------------------------------------- +# load_tasks with real data +# --------------------------------------------------------------------------- + + +class TestMMLULoadTasksWithRealData: + def test_load_with_limit(self, ensure_mmlu_data): + from maseval.benchmark.mmlu.mmlu import load_tasks + + tasks = load_tasks(data_path=ensure_mmlu_data, limit=10) + assert len(tasks) == 10 + + def test_load_returns_sequential_queue(self, ensure_mmlu_data): + from maseval.benchmark.mmlu.mmlu import load_tasks + from maseval.core.task import SequentialTaskQueue + + tasks = load_tasks(data_path=ensure_mmlu_data, limit=5) + assert isinstance(tasks, SequentialTaskQueue) + + def test_tasks_have_correct_id_format(self, real_mmlu_tasks_small): + for i, task in enumerate(real_mmlu_tasks_small): + assert task.id == f"mmlu_{i}" + + +# --------------------------------------------------------------------------- +# Real data pipeline — real tasks, no GPU +# --------------------------------------------------------------------------- + + +class TestMMLURealDataPipeline: + def test_environment_setup_with_real_task(self, real_mmlu_tasks_small): + from maseval.benchmark.mmlu.mmlu import MMLUEnvironment + + task = real_mmlu_tasks_small[0] + task_data = { + "query": task.query, + "environment_data": {**task.environment_data, "use_full_prompt": True}, + } + env = 
MMLUEnvironment(task_data) + assert "choices" in env.state + assert "full_prompt" in env.state + assert env.get_prompt() == task.environment_data["full_prompt"] + + def test_evaluator_with_real_task(self, real_mmlu_tasks_small): + from maseval.benchmark.mmlu.mmlu import MMLUEnvironment, MMLUEvaluator + + task = real_mmlu_tasks_small[0] + task_data = { + "query": task.query, + "environment_data": {**task.environment_data, "use_full_prompt": False}, + } + env = MMLUEnvironment(task_data) + evaluator = MMLUEvaluator(task, env) + assert 0 <= evaluator.gold <= 3 + assert evaluator.choices == ["A", "B", "C", "D"] + + def test_evaluate_real_task_correct(self, real_mmlu_tasks_small): + from maseval.benchmark.mmlu.mmlu import MMLUEnvironment, MMLUEvaluator + + task = real_mmlu_tasks_small[0] + task_data = { + "query": task.query, + "environment_data": {**task.environment_data, "use_full_prompt": False}, + } + env = MMLUEnvironment(task_data) + evaluator = MMLUEvaluator(task, env) + gold_letter = ["A", "B", "C", "D"][evaluator.gold] + result = evaluator({"messages": []}, final_answer=gold_letter) + assert result["acc"] == 1.0 + assert result["correct"] is True + + def test_evaluate_real_task_incorrect(self, real_mmlu_tasks_small): + from maseval.benchmark.mmlu.mmlu import MMLUEnvironment, MMLUEvaluator + + task = real_mmlu_tasks_small[0] + task_data = { + "query": task.query, + "environment_data": {**task.environment_data, "use_full_prompt": False}, + } + env = MMLUEnvironment(task_data) + evaluator = MMLUEvaluator(task, env) + wrong_letter = ["A", "B", "C", "D"][(evaluator.gold + 1) % 4] + result = evaluator({"messages": []}, final_answer=wrong_letter) + assert result["acc"] == 0.0 + assert result["correct"] is False + + def test_full_pipeline_single_task_with_mock_scorer(self, real_mmlu_tasks_small): + """Run the full MMLU pipeline on a real task with a stub scorer (no GPU).""" + from maseval.benchmark.mmlu.mmlu import MMLUBenchmark, MMLUEnvironment, _ScorerBackedAdapter + 
from maseval.core.seeding import DefaultSeedGenerator + + task = real_mmlu_tasks_small[0] + gold = task.evaluation_data["gold"] + + # Build a concrete benchmark subclass with a stub scorer + class StubMMLUBenchmark(MMLUBenchmark): + def __init__(self): + super().__init__(use_full_prompt=True) + self._stub_scorer = MagicMock() + + def setup_agents(self, agent_data, environment, task, user, seed_generator): + adapter = _ScorerBackedAdapter(self._stub_scorer, "stub_agent") + return [adapter], {"stub_agent": adapter} + + def get_model_adapter(self, model_id, **kwargs): + raise NotImplementedError + + def run_agents(self, agents, task, environment, query=""): + env = environment + prompt = env.get_prompt() + choices = env.state["choices"] + # Return fixed logprobs that pick the gold answer + logprobs = [-5.0] * 4 + logprobs[gold] = -0.1 + agent = agents[0] + answer = choices[logprobs.index(max(logprobs))] + agent.record_message({"role": "user", "content": prompt}) + agent.record_message({"role": "assistant", "content": answer, "logprobs": logprobs}) + return answer + + benchmark = StubMMLUBenchmark() + + # Run the pipeline components manually + seed_gen = DefaultSeedGenerator() + env = benchmark.setup_environment({}, task, seed_gen) + assert isinstance(env, MMLUEnvironment) + + agents_list, agents_dict = benchmark.setup_agents({}, env, task, None, seed_gen) + answer = benchmark.run_agents(agents_list, task, env) + assert answer == ["A", "B", "C", "D"][gold] + + evaluators = benchmark.setup_evaluators(env, task, agents_list, None, seed_gen) + traces = {"agents": {name: {"messages": list(a.get_messages())} for name, a in agents_dict.items()}} + results = benchmark.evaluate(evaluators, agents_dict, answer, traces) + assert len(results) == 1 + assert results[0]["correct"] is True + assert results[0]["doc_id"] == task.metadata["doc_id"] diff --git a/tests/test_benchmarks/test_mmlu/test_mmlu_unit.py b/tests/test_benchmarks/test_mmlu/test_mmlu_unit.py new file mode 100644 
index 00000000..0e2243ca --- /dev/null +++ b/tests/test_benchmarks/test_mmlu/test_mmlu_unit.py @@ -0,0 +1,456 @@ +"""Unit tests for MMLU benchmark components (Tier 1: offline, no real data). + +Tests MMLUEnvironment, MMLUEvaluator, _ScorerBackedAdapter, load_tasks, +compute_benchmark_metrics, MMLUBenchmark, and DefaultMMLUBenchmark. +""" + +import json +from unittest.mock import MagicMock, patch + +import pytest + +from maseval.core.history import MessageHistory +from maseval.core.seeding import DefaultSeedGenerator +from maseval.core.task import DISCOQueue, SequentialTaskQueue + +from .conftest import make_mmlu_task + +pytestmark = pytest.mark.benchmark + + +# --------------------------------------------------------------------------- +# MMLUEnvironment +# --------------------------------------------------------------------------- + + +class TestMMLUEnvironment: + def _make_env(self, *, use_full_prompt: bool = True): + from maseval.benchmark.mmlu.mmlu import MMLUEnvironment + + task = make_mmlu_task(full_prompt="Full prompt here") + task_data = { + "query": task.query, + "environment_data": {**task.environment_data, "use_full_prompt": use_full_prompt}, + } + return MMLUEnvironment(task_data) + + def test_setup_state_extracts_fields(self): + env = self._make_env() + for key in ("query", "choices", "full_prompt", "use_full_prompt"): + assert key in env.state + + def test_create_tools_returns_empty_dict(self): + env = self._make_env() + assert env.tools == {} + + def test_get_prompt_full_prompt_true(self): + env = self._make_env(use_full_prompt=True) + assert env.get_prompt() == "Full prompt here" + + def test_get_prompt_full_prompt_false(self): + env = self._make_env(use_full_prompt=False) + assert env.get_prompt() == "What is 2+2?" 
+ + +# --------------------------------------------------------------------------- +# MMLUEvaluator._parse_answer +# --------------------------------------------------------------------------- + + +class TestMMLUEvaluatorParseAnswer: + @pytest.fixture + def evaluator(self, sample_mmlu_task): + from maseval.benchmark.mmlu.mmlu import MMLUEnvironment, MMLUEvaluator + + task_data = { + "query": sample_mmlu_task.query, + "environment_data": {**sample_mmlu_task.environment_data, "use_full_prompt": False}, + } + env = MMLUEnvironment(task_data) + return MMLUEvaluator(sample_mmlu_task, env) + + @pytest.mark.parametrize( + "response, expected", + [ + # Direct letter + ("A", 0), + ("B", 1), + ("C", 2), + ("D", 3), + # With period + ("A.", 0), + ("B.", 1), + # Sentence patterns + ("The answer is A", 0), + ("ANSWER IS C", 2), + ("ANSWER: D", 3), + # Last character + ("I think it's B", 1), + # Empty / unparseable + ("", -1), + ("random text", -1), + ], + ) + def test_parse_answer(self, evaluator, response, expected): + assert evaluator._parse_answer(response) == expected + + +# --------------------------------------------------------------------------- +# MMLUEvaluator.__call__ +# --------------------------------------------------------------------------- + + +class TestMMLUEvaluatorCall: + @pytest.fixture + def evaluator(self, sample_mmlu_task): + from maseval.benchmark.mmlu.mmlu import MMLUEnvironment, MMLUEvaluator + + task_data = { + "query": sample_mmlu_task.query, + "environment_data": {**sample_mmlu_task.environment_data, "use_full_prompt": False}, + } + env = MMLUEnvironment(task_data) + return MMLUEvaluator(sample_mmlu_task, env) + + def test_correct_answer_scores_1(self, evaluator): + result = evaluator({"messages": []}, final_answer="A") + assert result["acc"] == 1.0 + assert result["correct"] is True + assert result["predicted"] == 0 + + def test_incorrect_answer_scores_0(self, evaluator): + result = evaluator({"messages": []}, final_answer="B") + assert 
result["acc"] == 0.0 + assert result["correct"] is False + + def test_extracts_logprobs_from_traces(self, evaluator): + traces = {"messages": [{"role": "assistant", "content": "A", "logprobs": [-1.0, -2.0, -3.0, -4.0]}]} + result = evaluator(traces, final_answer="A") + assert result["logprobs"] == [-1.0, -2.0, -3.0, -4.0] + + def test_filter_traces_with_agents(self, evaluator): + traces = {"agents": {"agent1": {"messages": [{"role": "user", "content": "hi"}]}}} + filtered = evaluator.filter_traces(traces) + assert filtered["messages"] == [{"role": "user", "content": "hi"}] + + def test_filter_traces_empty(self, evaluator): + assert evaluator.filter_traces({})["messages"] == [] + + +# --------------------------------------------------------------------------- +# _ScorerBackedAdapter +# --------------------------------------------------------------------------- + + +class TestScorerBackedAdapter: + def test_record_and_get_messages(self): + from maseval.benchmark.mmlu.mmlu import _ScorerBackedAdapter + + adapter = _ScorerBackedAdapter(scorer=MagicMock(), name="test") + adapter.record_message({"role": "user", "content": "hello"}) + adapter.record_message({"role": "assistant", "content": "world"}) + history = adapter.get_messages() + assert isinstance(history, MessageHistory) + assert len(history) == 2 + + def test_run_agent_raises(self): + from maseval.benchmark.mmlu.mmlu import _ScorerBackedAdapter + + adapter = _ScorerBackedAdapter(scorer=MagicMock(), name="test") + with pytest.raises(NotImplementedError): + adapter._run_agent("query") + + +# --------------------------------------------------------------------------- +# load_tasks +# --------------------------------------------------------------------------- + + +class TestLoadTasks: + def test_basic_load(self, mmlu_json_path): + from maseval.benchmark.mmlu.mmlu import load_tasks + + tasks = load_tasks(data_path=mmlu_json_path) + assert isinstance(tasks, SequentialTaskQueue) + assert len(tasks) == 5 + + def 
test_with_limit(self, mmlu_json_path): + from maseval.benchmark.mmlu.mmlu import load_tasks + + tasks = load_tasks(data_path=mmlu_json_path, limit=2) + assert len(tasks) == 2 + + def test_uses_example_field(self, tmp_path): + from maseval.benchmark.mmlu.mmlu import load_tasks + + item = {"example": "Example question", "gold": 0, "choices": ["A", "B", "C", "D"]} + path = tmp_path / "data.json" + path.write_text(json.dumps([item])) + tasks = load_tasks(data_path=path) + assert tasks[0].query == "Example question" + + def test_missing_gold_raises(self, tmp_path): + from maseval.benchmark.mmlu.mmlu import load_tasks + + path = tmp_path / "data.json" + path.write_text(json.dumps([{"query": "Q", "choices": ["A", "B", "C", "D"]}])) + with pytest.raises(ValueError, match="gold"): + load_tasks(data_path=path) + + def test_missing_choices_raises(self, tmp_path): + from maseval.benchmark.mmlu.mmlu import load_tasks + + path = tmp_path / "data.json" + path.write_text(json.dumps([{"query": "Q", "gold": 0}])) + with pytest.raises(ValueError, match="choices"): + load_tasks(data_path=path) + + def test_missing_query_and_example_raises(self, tmp_path): + from maseval.benchmark.mmlu.mmlu import load_tasks + + path = tmp_path / "data.json" + path.write_text(json.dumps([{"gold": 0, "choices": ["A", "B", "C", "D"]}])) + with pytest.raises(ValueError, match="neither"): + load_tasks(data_path=path) + + def test_with_anchor_points_path(self, mmlu_json_path, tmp_path): + from maseval.benchmark.mmlu.mmlu import load_tasks + + # Write a JSON anchor points file with indices [0, 2] + anchor_path = tmp_path / "anchors.json" + anchor_path.write_text(json.dumps([0, 2])) + tasks = load_tasks(data_path=mmlu_json_path, anchor_points_path=anchor_path) + assert isinstance(tasks, DISCOQueue) + assert len(tasks) == 2 + + +# --------------------------------------------------------------------------- +# compute_benchmark_metrics +# 
--------------------------------------------------------------------------- + + +def _success_result(acc: float = 1.0, acc_norm: float = 1.0, correct: bool = True): + return { + "status": "success", + "eval": [{"acc": acc, "acc_norm": acc_norm, "correct": correct}], + } + + +def _error_result(): + return {"status": "error", "eval": None} + + +class TestComputeBenchmarkMetrics: + @pytest.mark.parametrize( + "results, expected_acc, expected_correct", + [ + ([], 0.0, 0), + ([_success_result(acc=1.0)], 1.0, 1), + ([_success_result(acc=1.0), _success_result(acc=0.0, correct=False)], 0.5, 1), + ([_error_result()], 0.0, 0), + ], + ids=["empty", "all_correct", "mixed", "error_skipped"], + ) + def test_compute_metrics(self, results, expected_acc, expected_correct): + from maseval.benchmark.mmlu.mmlu import compute_benchmark_metrics + + metrics = compute_benchmark_metrics(results) + assert metrics["acc"] == pytest.approx(expected_acc) + if results: + assert metrics["correct_count"] == expected_correct + + +# --------------------------------------------------------------------------- +# MMLUBenchmark (base class) +# --------------------------------------------------------------------------- + + +class TestMMLUBenchmarkBase: + @pytest.fixture + def benchmark(self): + from maseval.benchmark.mmlu.mmlu import MMLUBenchmark + + class ConcreteMMLU(MMLUBenchmark): + def setup_agents(self, agent_data, environment, task, user, seed_generator): + adapter = MagicMock() + adapter.name = "mock_agent" + adapter.run.return_value = "A" + adapter.get_messages.return_value = MessageHistory([]) + adapter.gather_traces.return_value = {} + adapter.gather_config.return_value = {} + adapter.callbacks = [] + return [adapter], {"mock_agent": adapter} + + def get_model_adapter(self, model_id, **kwargs): + return MagicMock() + + return ConcreteMMLU(use_full_prompt=False) + + def test_setup_environment_returns_mmlu_env(self, benchmark, sample_mmlu_task): + from maseval.benchmark.mmlu.mmlu import 
MMLUEnvironment + + env = benchmark.setup_environment({}, sample_mmlu_task, DefaultSeedGenerator()) + assert isinstance(env, MMLUEnvironment) + assert "choices" in env.state + + def test_setup_evaluators_returns_mmlu_evaluator(self, benchmark, sample_mmlu_task): + from maseval.benchmark.mmlu.mmlu import MMLUEnvironment, MMLUEvaluator + + task_data = { + "query": sample_mmlu_task.query, + "environment_data": {**sample_mmlu_task.environment_data, "use_full_prompt": False}, + } + env = MMLUEnvironment(task_data) + evaluators = benchmark.setup_evaluators(env, sample_mmlu_task, [], None, DefaultSeedGenerator()) + assert len(evaluators) == 1 + assert isinstance(evaluators[0], MMLUEvaluator) + + def test_run_agents_calls_agent_run(self, benchmark, sample_mmlu_task): + from maseval.benchmark.mmlu.mmlu import MMLUEnvironment + + task_data = { + "query": sample_mmlu_task.query, + "environment_data": {**sample_mmlu_task.environment_data, "use_full_prompt": False}, + } + env = MMLUEnvironment(task_data) + agents, _ = benchmark.setup_agents({}, env, sample_mmlu_task, None, DefaultSeedGenerator()) + result = benchmark.run_agents(agents, sample_mmlu_task, env, query="") + assert result == "A" + agents[0].run.assert_called_once() + + def test_evaluate_delegates_to_evaluators(self, benchmark, sample_mmlu_task): + from maseval.benchmark.mmlu.mmlu import MMLUEnvironment + + task_data = { + "query": sample_mmlu_task.query, + "environment_data": {**sample_mmlu_task.environment_data, "use_full_prompt": False}, + } + env = MMLUEnvironment(task_data) + evaluators = benchmark.setup_evaluators(env, sample_mmlu_task, [], None, DefaultSeedGenerator()) + results = benchmark.evaluate(evaluators, {}, "A", {"agents": {}}) + assert len(results) == 1 + assert results[0]["correct"] is True + + +# --------------------------------------------------------------------------- +# DefaultMMLUBenchmark +# --------------------------------------------------------------------------- + + +class 
TestDefaultMMLUBenchmark: + @pytest.fixture + def mock_scorer(self): + scorer = MagicMock() + scorer.model_id = "test-model" + scorer.loglikelihood_choices.return_value = [-1.0, -2.0, -3.0, -4.0] + scorer.gather_traces.return_value = {} + scorer.gather_config.return_value = {"model_id": "test-model"} + return scorer + + @pytest.fixture + def benchmark(self, mock_scorer): + with patch( + "maseval.interface.inference.huggingface_scorer.HuggingFaceModelScorer", + return_value=mock_scorer, + ): + from maseval.benchmark.mmlu.mmlu import DefaultMMLUBenchmark + + return DefaultMMLUBenchmark(model_id="test-model", device="cpu") + + def test_setup_agents_returns_scorer_backed_adapter(self, benchmark, sample_mmlu_task): + from maseval.benchmark.mmlu.mmlu import MMLUEnvironment, _ScorerBackedAdapter + + task_data = { + "query": sample_mmlu_task.query, + "environment_data": {**sample_mmlu_task.environment_data, "use_full_prompt": True}, + } + env = MMLUEnvironment(task_data) + agents, agents_dict = benchmark.setup_agents({}, env, sample_mmlu_task, None, DefaultSeedGenerator()) + assert len(agents) == 1 + assert isinstance(agents[0], _ScorerBackedAdapter) + + def test_run_agents_with_precomputed_logprobs(self, benchmark, sample_mmlu_task, mock_scorer): + from maseval.benchmark.mmlu.mmlu import MMLUEnvironment, _ScorerBackedAdapter + + task_data = { + "query": sample_mmlu_task.query, + "environment_data": {**sample_mmlu_task.environment_data, "use_full_prompt": True}, + } + env = MMLUEnvironment(task_data) + adapter = _ScorerBackedAdapter(mock_scorer, "agent") + + benchmark._precomputed_logprobs = {0: [-0.5, -1.5, -2.5, -3.5]} + answer = benchmark.run_agents([adapter], sample_mmlu_task, env) + assert answer == "A" # index 0 has highest logprob + mock_scorer.loglikelihood_choices.assert_not_called() + + def test_run_agents_without_precomputed(self, benchmark, sample_mmlu_task, mock_scorer): + from maseval.benchmark.mmlu.mmlu import MMLUEnvironment, _ScorerBackedAdapter + + 
task_data = { + "query": sample_mmlu_task.query, + "environment_data": {**sample_mmlu_task.environment_data, "use_full_prompt": True}, + } + env = MMLUEnvironment(task_data) + adapter = _ScorerBackedAdapter(mock_scorer, "agent") + + benchmark._precomputed_logprobs = None + answer = benchmark.run_agents([adapter], sample_mmlu_task, env) + assert answer == "A" # index 0 has highest logprob (-1.0) + mock_scorer.loglikelihood_choices.assert_called_once() + + def test_get_model_adapter_raises(self, benchmark): + with pytest.raises(NotImplementedError): + benchmark.get_model_adapter("test-model") + + def test_precompute_all_logprobs_lmeval(self, benchmark, sample_mmlu_tasks, mock_scorer): + """Test precompute_all_logprobs_lmeval with mocked lm-evaluation-harness.""" + import sys + from types import ModuleType + + # Build mock lm-evaluation-harness modules + mock_hflm_mod = ModuleType("lm_eval.models.huggingface") + mock_instance_mod = ModuleType("lm_eval.api.instance") + mock_lm_top = ModuleType("lm_eval") + mock_lm_models = ModuleType("lm_eval.models") + mock_lm_api = ModuleType("lm_eval.api") + + # FakeInstance stores keyword args as attributes + class FakeInstance: + def __init__(self, **kwargs): + for k, v in kwargs.items(): + setattr(self, k, v) + + mock_instance_mod.Instance = FakeInstance + + # FakeHFLM returns (logprob, is_greedy) tuples + class FakeHFLM: + def __init__(self, **kwargs): + pass + + def loglikelihood(self, instances): + return [(-float(i), True) for i in range(len(instances))] + + mock_hflm_mod.HFLM = FakeHFLM + + tasks = sample_mmlu_tasks + with patch.dict( + sys.modules, + { + "lm_eval": mock_lm_top, + "lm_eval.models": mock_lm_models, + "lm_eval.models.huggingface": mock_hflm_mod, + "lm_eval.api": mock_lm_api, + "lm_eval.api.instance": mock_instance_mod, + }, + ): + doc_logprobs = benchmark.precompute_all_logprobs_lmeval(tasks) + + # 3 tasks, each with 4 choices + assert len(doc_logprobs) == 3 + for doc_id in [0, 1, 2]: + assert doc_id in 
doc_logprobs + assert len(doc_logprobs[doc_id]) == 4 + + # Verify stored for later use + assert benchmark._precomputed_logprobs is doc_logprobs diff --git a/tests/test_core/test_callback_handler.py b/tests/test_core/test_callback_handler.py new file mode 100644 index 00000000..d3c94f1c --- /dev/null +++ b/tests/test_core/test_callback_handler.py @@ -0,0 +1,45 @@ +"""Tests for maseval.core.callback_handler.CallbackHandler.""" + +import pytest + +from maseval.core.callback_handler import CallbackHandler + +pytestmark = pytest.mark.core + + +class TestCallbackHandler: + def test_register_and_invoke(self): + handler = CallbackHandler() + results = [] + handler.register(lambda x: results.append(x)) + handler.invoke("hello") + assert results == ["hello"] + + def test_invoke_multiple_callbacks(self): + handler = CallbackHandler() + log = [] + handler.register(lambda: log.append("a")) + handler.register(lambda: log.append("b")) + handler.invoke() + assert log == ["a", "b"] + + def test_deregister(self): + handler = CallbackHandler() + log = [] + cb = lambda: log.append("x") # noqa: E731 + handler.register(cb) + handler.deregister(cb) + handler.invoke() + assert log == [] + + def test_deregister_nonexistent_raises(self): + handler = CallbackHandler() + with pytest.raises(ValueError): + handler.deregister(lambda: None) + + def test_invoke_passes_kwargs(self): + handler = CallbackHandler() + captured = {} + handler.register(lambda **kw: captured.update(kw)) + handler.invoke(key="val") + assert captured == {"key": "val"} diff --git a/tests/test_core/test_callbacks/test_file_result_logger.py b/tests/test_core/test_callbacks/test_file_result_logger.py index 54cfdbba..4fa53bcc 100644 --- a/tests/test_core/test_callbacks/test_file_result_logger.py +++ b/tests/test_core/test_callbacks/test_file_result_logger.py @@ -175,3 +175,91 @@ def test_file_result_logger_overwrite_true_allows_overwriting(tmp_path): obj = json.loads(lines[0]) assert obj["task_id"] == report["task_id"] assert 
obj["repeat_idx"] == report["repeat_idx"] + + +@pytest.mark.core +def test_file_result_logger_writes_metadata(tmp_path): + """Test that FileResultLogger writes a .meta.json file on finalization.""" + out_dir = tmp_path / "results" + out_dir.mkdir() + + logger = FileResultLogger(output_dir=out_dir, write_metadata=True) + benchmark = MockBenchmark(n_tasks=2, n_repeats=1) + logger.on_run_start(benchmark) # type: ignore[arg-type] + + for i, task_id in enumerate(benchmark.task_ids): + report = {"task_id": task_id, "repeat_idx": 0, "status": "success"} + logger.on_task_repeat_end(benchmark, report) # type: ignore[arg-type] + + logger.on_run_end(benchmark, []) # type: ignore[arg-type] + + meta_files = list(out_dir.glob("*.meta.json")) + assert len(meta_files) == 1 + meta = json.loads(meta_files[0].read_text()) + assert meta["n_tasks"] == 2 + assert meta["lines_written"] == 2 + assert "timestamp" in meta + + +@pytest.mark.core +def test_file_result_logger_validate_detects_duplicates(tmp_path): + """Test that validation detects duplicate iterations in the JSONL file.""" + out_dir = tmp_path / "results" + out_dir.mkdir() + + logger = FileResultLogger(output_dir=out_dir, validate_on_completion=False) + benchmark = MockBenchmark(n_tasks=1, n_repeats=1) + logger.on_run_start(benchmark) # type: ignore[arg-type] + + report = {"task_id": benchmark.task_ids[0], "repeat_idx": 0, "status": "success"} + logger.on_task_repeat_end(benchmark, report) # type: ignore[arg-type] + + # Manually write a duplicate line to the file + logger._file_handle.write(json.dumps(report) + "\n") + logger._file_handle.flush() + logger._lines_written += 1 + + logger.finalize() + # Validation should fail due to duplicate + assert logger.validate() is False + + +@pytest.mark.core +def test_file_result_logger_non_atomic_writes(tmp_path): + """Test that non-atomic writes work correctly.""" + out_dir = tmp_path / "results" + out_dir.mkdir() + + logger = FileResultLogger(output_dir=out_dir, atomic_writes=False) + 
benchmark = MockBenchmark(n_tasks=1, n_repeats=1) + logger.on_run_start(benchmark) # type: ignore[arg-type] + + report = {"task_id": benchmark.task_ids[0], "repeat_idx": 0, "status": "success"} + logger.on_task_repeat_end(benchmark, report) # type: ignore[arg-type] + logger.on_run_end(benchmark, [report]) # type: ignore[arg-type] + + jsonl_files = list(out_dir.glob("*.jsonl")) + assert len(jsonl_files) == 1 + lines = jsonl_files[0].read_text().strip().splitlines() + assert len(lines) == 1 + + +@pytest.mark.core +def test_report_validation_errors(tmp_path, capsys): + """Test _report_validation_errors reports missing and extra iterations.""" + out_dir = tmp_path / "results" + out_dir.mkdir() + + logger = FileResultLogger(output_dir=out_dir, validate_on_completion=False) + benchmark = MockBenchmark(n_tasks=1, n_repeats=1) + logger.on_run_start(benchmark) # type: ignore[arg-type] + + # Set up state to simulate missing iterations + logger._expected_iterations = {("task_1", 0), ("task_2", 0)} + logger._logged_iterations = {("task_1", 0), ("task_3", 0)} + + logger._report_validation_errors() + output = capsys.readouterr().out + assert "Validation failed" in output + assert "Missing 1 iterations" in output + assert "Unexpected 1 iterations" in output diff --git a/tests/test_core/test_message_history.py b/tests/test_core/test_message_history.py index 68b1dd25..c4c8f78e 100644 --- a/tests/test_core/test_message_history.py +++ b/tests/test_core/test_message_history.py @@ -241,3 +241,100 @@ def test_message_history_iteration_doesnt_modify(self): # Verify still has 2 messages assert len(history) == 2 + + def test_clear(self): + """Test that clear removes all messages.""" + history = MessageHistory() + history.add_message("user", "Hello") + history.add_message("assistant", "Hi") + assert len(history) == 2 + history.clear() + assert len(history) == 0 + assert not history + + def test_filter_by_role(self): + """Test filtering messages by role.""" + history = MessageHistory() + 
history.add_message("user", "U1") + history.add_message("assistant", "A1") + history.add_message("user", "U2") + history.add_message("system", "S1") + user_msgs = history.filter_by_role("user") + assert len(user_msgs) == 2 + assert all(m["role"] == "user" for m in user_msgs) + assert history.filter_by_role("system") == [history[3]] + assert history.filter_by_role("tool") == [] + + def test_get_last_message(self): + """Test get_last_message returns last message or None.""" + history = MessageHistory() + assert history.get_last_message() is None + history.add_message("user", "First") + history.add_message("assistant", "Second") + assert history.get_last_message()["content"] == "Second" + + def test_to_openai_format_strips_metadata_and_timestamps(self): + """Test to_openai_format strips metadata and timestamps.""" + history = MessageHistory() + history.add_message("user", "Hello", metadata={"key": "val"}) + history.add_message("assistant", "Hi", name="bot") + openai_msgs = history.to_openai_format() + assert len(openai_msgs) == 2 + assert "metadata" not in openai_msgs[0] + assert "timestamp" not in openai_msgs[0] + assert openai_msgs[0] == {"role": "user", "content": "Hello"} + assert openai_msgs[1] == {"role": "assistant", "content": "Hi", "name": "bot"} + + def test_to_openai_format_preserves_tool_calls(self): + """Test to_openai_format preserves tool_calls and tool_call_id.""" + history = MessageHistory() + tc = [{"id": "c1", "type": "function", "function": {"name": "f", "arguments": "{}"}}] + history.add_tool_call(tool_calls=tc) + history.add_tool_response(tool_call_id="c1", content="result") + openai_msgs = history.to_openai_format() + assert openai_msgs[0]["tool_calls"] == tc + assert openai_msgs[1]["tool_call_id"] == "c1" + + def test_add_tool_call_without_content(self): + """Test add_tool_call without content omits content key.""" + history = MessageHistory() + history.add_tool_call(tool_calls=[{"id": "c1"}]) + assert "content" not in history[0] + + def 
test_add_tool_call_with_content(self): + """Test add_tool_call with content includes it.""" + history = MessageHistory() + history.add_tool_call(tool_calls=[{"id": "c1"}], content="thinking...") + assert history[0]["content"] == "thinking..." + + +@pytest.mark.core +class TestToolInvocationHistory: + """Tests for ToolInvocationHistory.""" + + def test_add_and_len(self): + from maseval.core.history import ToolInvocationHistory + + h = ToolInvocationHistory() + assert len(h) == 0 + h.add_invocation(inputs={"x": 1}, outputs={"y": 2}, status="success") + assert len(h) == 1 + + def test_to_list(self): + from maseval.core.history import ToolInvocationHistory + + h = ToolInvocationHistory() + h.add_invocation(inputs="in", outputs="out", status="ok") + records = h.to_list() + assert len(records) == 1 + assert records[0]["inputs"] == "in" + assert records[0]["outputs"] == "out" + assert records[0]["status"] == "ok" + assert "id" in records[0] + assert "timestamp" in records[0] + + def test_init_with_records(self): + from maseval.core.history import ToolInvocationHistory + + h = ToolInvocationHistory(records=[{"id": "x", "inputs": 1, "outputs": 2, "status": "ok"}]) + assert len(h) == 1 diff --git a/tests/test_core/test_message_tracing_callback.py b/tests/test_core/test_message_tracing_callback.py index bfd6b9ed..be212c79 100644 --- a/tests/test_core/test_message_tracing_callback.py +++ b/tests/test_core/test_message_tracing_callback.py @@ -290,3 +290,25 @@ def test_repr(self): repr_str = repr(callback) assert "MessageTracingAgentCallback" in repr_str assert "conversations=0" in repr_str + + def test_verbose_mode_prints(self, capsys): + """Test that verbose=True produces output.""" + callback = MessageTracingAgentCallback(verbose=True) + agent = DummyAgent() + adapter = TracingTestAgentAdapter(agent_instance=agent, name="verbose_agent", callbacks=[callback]) + adapter.run("Hello") + captured = capsys.readouterr().out + assert len(captured) > 0 + + def 
test_get_conversations_by_agent_nonexistent(self): + """Test that querying a nonexistent agent returns empty list.""" + callback = MessageTracingAgentCallback() + assert callback.get_conversations_by_agent("nonexistent") == [] + + def test_get_statistics_empty(self): + """Test statistics on empty callback.""" + callback = MessageTracingAgentCallback() + stats = callback.get_statistics() + assert stats["total_conversations"] == 0 + assert stats["total_messages"] == 0 + assert stats["agents"] == [] diff --git a/tests/test_core/test_system_info.py b/tests/test_core/test_system_info.py new file mode 100644 index 00000000..ff08ea6e --- /dev/null +++ b/tests/test_core/test_system_info.py @@ -0,0 +1,104 @@ +"""Tests for maseval.core.utils.system_info module.""" + +from unittest.mock import patch +import os + +import pytest + +from maseval.core.utils.system_info import ( + gather_benchmark_config, + get_environment_variables, + get_git_info, + get_package_versions, + get_python_info, + get_system_info, +) + +pytestmark = pytest.mark.core + + +class TestGetGitInfo: + def test_returns_commit_hash(self): + info = get_git_info() + # We're running inside a git repo, so this should succeed + assert "commit_hash" in info + assert "branch" in info + + def test_error_path_for_invalid_repo(self, tmp_path): + info = get_git_info(repo_path=str(tmp_path / "nonexistent")) + assert "error" in info + assert "error_type" in info + + +class TestGetPythonInfo: + def test_contains_expected_fields(self): + info = get_python_info() + assert "version" in info + assert "executable" in info + assert "implementation" in info + assert "version_info" in info + assert info["version_info"]["major"] >= 3 + + +class TestGetSystemInfo: + def test_contains_expected_fields(self): + info = get_system_info() + assert "hostname" in info + assert "platform" in info + assert "system" in info + assert "machine" in info + + +class TestGetPackageVersions: + def test_returns_dict(self): + # This actually runs pip 
freeze, so just verify it returns a dict + versions = get_package_versions() + assert isinstance(versions, dict) + + def test_handles_subprocess_error(self): + import subprocess + + with patch("subprocess.run", side_effect=subprocess.CalledProcessError(1, "pip")): + versions = get_package_versions() + assert versions == {} + + +class TestGetEnvironmentVariables: + def test_excludes_sensitive_keys(self): + with patch.dict(os.environ, {"MY_API_KEY": "secret", "CUDA_VISIBLE_DEVICES": "0"}, clear=False): + env = get_environment_variables() + assert "MY_API_KEY" not in env + assert "CUDA_VISIBLE_DEVICES" in env + + def test_excludes_token_and_password(self): + with patch.dict(os.environ, {"OPENAI_API_TOKEN": "tok", "DB_PASSWORD": "pw", "OPENAI_ORG": "org"}, clear=False): + env = get_environment_variables() + assert "OPENAI_API_TOKEN" not in env + assert "DB_PASSWORD" not in env + + def test_custom_patterns(self): + with patch.dict(os.environ, {"MY_CUSTOM_VAR": "val", "OTHER_VAR": "other"}, clear=False): + env = get_environment_variables(include_patterns=["MY_CUSTOM"]) + assert "MY_CUSTOM_VAR" in env + assert "OTHER_VAR" not in env + + +class TestGatherBenchmarkConfig: + def test_includes_all_sections(self): + config = gather_benchmark_config() + assert "timestamp" in config + assert "git" in config + assert "python" in config + assert "system" in config + assert "packages" in config + assert "environment" in config + + def test_excludes_packages(self): + config = gather_benchmark_config(include_packages=False) + assert "packages" not in config + assert "python" in config + + def test_excludes_env_vars(self): + config = gather_benchmark_config(include_env_vars=False) + assert "environment" not in config + assert "python" in config diff --git a/tests/test_interface/test_model_integration/test_huggingface_scorer.py b/tests/test_interface/test_model_integration/test_huggingface_scorer.py new file mode 100644 index 00000000..96b6df1c --- /dev/null +++ 
b/tests/test_interface/test_model_integration/test_huggingface_scorer.py @@ -0,0 +1,282 @@ +"""Unit tests for HuggingFaceModelScorer (mocked transformers + torch). + +Tests model loading, tokenisation, log-likelihood computation, and the +single-token optimisation path without requiring a GPU or real model weights. +""" + +from unittest.mock import MagicMock, patch + +import pytest + +torch = pytest.importorskip("torch") + +pytestmark = pytest.mark.interface + + +# --------------------------------------------------------------------------- +# Mock helpers +# --------------------------------------------------------------------------- + + +def _make_mock_tokenizer(*, pad_token=None, eos_token="</s>"): + """Create a mock tokenizer that returns predictable token IDs.""" + tok = MagicMock() + tok.padding_side = "right" # will be overwritten to "left" + tok.pad_token = pad_token + tok.eos_token = eos_token + + def _encode(text, add_special_tokens=True): + # Deterministic: each character → its ord value. + # This is intentionally simple so tests can predict the split.
+ return [ord(c) for c in text] + + tok.encode = _encode + return tok + + +class _FakeModelOutput: + """Simple container for model output with logits attribute.""" + + def __init__(self, logits): + self.logits = logits + + +class _FakeCausalLM: + """Minimal fake causal LM that returns uniform logits.""" + + def __init__(self, vocab_size=256): + self._vocab_size = vocab_size + self.call_count = 0 + + def to(self, device): + return self + + def eval(self): + pass + + def __call__(self, input_ids): + self.call_count += 1 + seq_len = input_ids.shape[1] + logits = torch.zeros(1, seq_len, self._vocab_size) + return _FakeModelOutput(logits) + + def reset_mock(self): + self.call_count = 0 + + +def _make_mock_model(vocab_size=256): + """Create a fake model whose forward pass returns uniform logits.""" + return _FakeCausalLM(vocab_size=vocab_size) + + +# --------------------------------------------------------------------------- +# Fixtures +# --------------------------------------------------------------------------- + + +@pytest.fixture +def mock_tokenizer(): + return _make_mock_tokenizer() + + +@pytest.fixture +def mock_model(): + return _make_mock_model() + + +@pytest.fixture +def scorer(mock_model, mock_tokenizer): + """HuggingFaceModelScorer with mocked transformers — model pre-injected.""" + from maseval.interface.inference.huggingface_scorer import HuggingFaceModelScorer + + s = HuggingFaceModelScorer(model_id="test-model", device="cpu") + # Bypass lazy loading by injecting mocks directly + s._model = mock_model + s._tokenizer = mock_tokenizer + mock_tokenizer.padding_side = "left" + yield s + + +# --------------------------------------------------------------------------- +# TestInit +# --------------------------------------------------------------------------- + + +class TestInit: + def test_model_id_property(self): + from maseval.interface.inference.huggingface_scorer import HuggingFaceModelScorer + + s = HuggingFaceModelScorer(model_id="my/model", device="cpu") + 
assert s.model_id == "my/model" + + def test_lazy_loading(self): + from maseval.interface.inference.huggingface_scorer import HuggingFaceModelScorer + + s = HuggingFaceModelScorer(model_id="my/model", device="cpu") + assert s._model is None + assert s._tokenizer is None + + def test_gather_config(self, scorer): + config = scorer.gather_config() + assert config["model_id"] == "test-model" + assert config["device"] == "cpu" + assert "trust_remote_code" in config + + +# --------------------------------------------------------------------------- +# TestLoadModel +# --------------------------------------------------------------------------- + + +class TestLoadModel: + def test_loads_on_first_call(self, mock_model, mock_tokenizer): + with patch.dict("sys.modules", {"transformers": MagicMock()}) as _: + import sys + + transformers_mock = sys.modules["transformers"] + transformers_mock.AutoModelForCausalLM.from_pretrained.return_value = mock_model + transformers_mock.AutoTokenizer.from_pretrained.return_value = mock_tokenizer + + from maseval.interface.inference.huggingface_scorer import HuggingFaceModelScorer + + s = HuggingFaceModelScorer(model_id="test-model", device="cpu") + s._load_model() + transformers_mock.AutoModelForCausalLM.from_pretrained.assert_called_once() + transformers_mock.AutoTokenizer.from_pretrained.assert_called_once() + + def test_caches_model(self, mock_model, mock_tokenizer): + with patch.dict("sys.modules", {"transformers": MagicMock()}) as _: + import sys + + transformers_mock = sys.modules["transformers"] + transformers_mock.AutoModelForCausalLM.from_pretrained.return_value = mock_model + transformers_mock.AutoTokenizer.from_pretrained.return_value = mock_tokenizer + + from maseval.interface.inference.huggingface_scorer import HuggingFaceModelScorer + + s = HuggingFaceModelScorer(model_id="test-model", device="cpu") + s._load_model() + s._load_model() + assert transformers_mock.AutoModelForCausalLM.from_pretrained.call_count == 1 + + def 
test_sets_padding_left(self, scorer, mock_tokenizer): + assert mock_tokenizer.padding_side == "left" + + def test_sets_pad_token_from_eos(self, mock_model): + tok = _make_mock_tokenizer(pad_token=None, eos_token="</s>") + with patch.dict("sys.modules", {"transformers": MagicMock()}) as _: + import sys + + transformers_mock = sys.modules["transformers"] + transformers_mock.AutoModelForCausalLM.from_pretrained.return_value = mock_model + transformers_mock.AutoTokenizer.from_pretrained.return_value = tok + + from maseval.interface.inference.huggingface_scorer import HuggingFaceModelScorer + + s = HuggingFaceModelScorer(model_id="test-model", device="cpu") + s._load_model() + assert tok.pad_token == "</s>" + + +# --------------------------------------------------------------------------- +# TestEncodePair +# --------------------------------------------------------------------------- + + +class TestEncodePair: + def test_basic_split(self, scorer): + ctx_enc, cont_enc = scorer._encode_pair("abc", "de") + # "abcde" → [97,98,99,100,101], "abc" → [97,98,99] + assert ctx_enc == [97, 98, 99] + assert cont_enc == [100, 101] + + def test_trailing_spaces_transfer(self, scorer): + ctx_enc, cont_enc = scorer._encode_pair("abc ", "de") + # Space transfers: context becomes "abc", continuation becomes " de" + # "abc de" → [97,98,99,32,100,101], "abc" → [97,98,99] + assert ctx_enc == [97, 98, 99] + assert cont_enc == [32, 100, 101] + + +# --------------------------------------------------------------------------- +# TestLoglikelihood +# --------------------------------------------------------------------------- + + +class TestLoglikelihood: + def test_computes_logprob_sum(self, scorer): + """With uniform logits (vocab_size=256), log_softmax = -log(256) per token.""" + import math + + result = scorer.loglikelihood("ab", "c") + # Continuation "c" is 1 token → expected = -log(256) ≈ -5.545 + expected = -math.log(256) + assert result == pytest.approx(expected, rel=1e-4) + + def
test_logged_via_public_api(self, scorer): + scorer.loglikelihood("hello", " world") + assert len(scorer.logs) == 1 + assert scorer.logs[0]["status"] == "success" + + +# --------------------------------------------------------------------------- +# TestLoglikelihoodChoices +# --------------------------------------------------------------------------- + + +class TestLoglikelihoodChoices: + def test_single_token_path(self, mock_model): + """When all continuations are 1 token, model is called once (single-token optimisation).""" + from maseval.interface.inference.huggingface_scorer import HuggingFaceModelScorer + + # Use a tokenizer where " A", " B", " C", " D" each map to a single + # token beyond the context, so _encode_pair yields 1-token continuations. + tok = MagicMock() + tok.padding_side = "left" + tok.pad_token = "</s>" + tok.eos_token = "</s>" + ctx_tokens = [1, 2, 3] + + def _single_tok_encode(text, add_special_tokens=True): + # Context alone → [1, 2, 3] + # Context + " X" → [1, 2, 3, extra_id] + if text == "ctx": + return ctx_tokens + # Anything longer than ctx is ctx + one extra token + return ctx_tokens + [10 + len(text)] + + tok.encode = _single_tok_encode + + s = HuggingFaceModelScorer(model_id="test", device="cpu") + s._model = _FakeCausalLM(vocab_size=256) + s._tokenizer = tok + + s._model.reset_mock() + results = s.loglikelihood_choices("ctx", ["A", "B", "C", "D"]) + assert len(results) == 4 + assert all(isinstance(r, float) for r in results) + # Single-token path: one forward call for the shared context + assert s._model.call_count == 1 + + def test_multi_token_fallback(self, scorer, mock_model): + """When continuations have different lengths, falls back to per-choice scoring.""" + original_encode = scorer._tokenizer.encode + + def _varied_encode(text, add_special_tokens=True): + base = original_encode(text, add_special_tokens=add_special_tokens) + # Add extra token for text containing "long" to create length mismatch + if "long" in text: + return base + [50] +
return base + + scorer._tokenizer.encode = _varied_encode + mock_model.reset_mock() + results = scorer.loglikelihood_choices("context", ["A", "long answer"]) + assert len(results) == 2 + # Multi-token path: one forward call per choice + assert mock_model.call_count == 2 + + def test_returns_correct_shape(self, scorer): + results = scorer.loglikelihood_choices("ctx", ["A", "B"]) + assert len(results) == 2 + assert all(isinstance(r, float) for r in results) From 7a0fe3054f9be0034ef000932531912ad7c817bf Mon Sep 17 00:00:00 2001 From: cemde <42615086+cemde@users.noreply.github.com> Date: Sat, 28 Mar 2026 03:16:47 +0100 Subject: [PATCH 20/23] fixed typing errors in tests --- tests/test_benchmarks/test_mmlu/test_mmlu_unit.py | 4 ++-- tests/test_core/test_callbacks/test_file_result_logger.py | 1 + tests/test_core/test_message_history.py | 4 +++- 3 files changed, 6 insertions(+), 3 deletions(-) diff --git a/tests/test_benchmarks/test_mmlu/test_mmlu_unit.py b/tests/test_benchmarks/test_mmlu/test_mmlu_unit.py index 0e2243ca..5602f122 100644 --- a/tests/test_benchmarks/test_mmlu/test_mmlu_unit.py +++ b/tests/test_benchmarks/test_mmlu/test_mmlu_unit.py @@ -421,7 +421,7 @@ def __init__(self, **kwargs): for k, v in kwargs.items(): setattr(self, k, v) - mock_instance_mod.Instance = FakeInstance + setattr(mock_instance_mod, "Instance", FakeInstance) # FakeHFLM returns (logprob, is_greedy) tuples class FakeHFLM: @@ -431,7 +431,7 @@ def __init__(self, **kwargs): def loglikelihood(self, instances): return [(-float(i), True) for i in range(len(instances))] - mock_hflm_mod.HFLM = FakeHFLM + setattr(mock_hflm_mod, "HFLM", FakeHFLM) tasks = sample_mmlu_tasks with patch.dict( diff --git a/tests/test_core/test_callbacks/test_file_result_logger.py b/tests/test_core/test_callbacks/test_file_result_logger.py index 4fa53bcc..0f072a16 100644 --- a/tests/test_core/test_callbacks/test_file_result_logger.py +++ b/tests/test_core/test_callbacks/test_file_result_logger.py @@ -215,6 +215,7 @@ def 
test_file_result_logger_validate_detects_duplicates(tmp_path): logger.on_task_repeat_end(benchmark, report) # type: ignore[arg-type] # Manually write a duplicate line to the file + assert logger._file_handle is not None logger._file_handle.write(json.dumps(report) + "\n") logger._file_handle.flush() logger._lines_written += 1 diff --git a/tests/test_core/test_message_history.py b/tests/test_core/test_message_history.py index c4c8f78e..8e67c912 100644 --- a/tests/test_core/test_message_history.py +++ b/tests/test_core/test_message_history.py @@ -271,7 +271,9 @@ def test_get_last_message(self): assert history.get_last_message() is None history.add_message("user", "First") history.add_message("assistant", "Second") - assert history.get_last_message()["content"] == "Second" + last = history.get_last_message() + assert last is not None + assert last["content"] == "Second" def test_to_openai_format_strips_metadata_and_timestamps(self): """Test to_openai_format strips metadata and timestamps.""" From 0d5c2218bd978329315fd9cf2fa63a7bdd518d06 Mon Sep 17 00:00:00 2001 From: cemde <42615086+cemde@users.noreply.github.com> Date: Sat, 28 Mar 2026 03:26:23 +0100 Subject: [PATCH 21/23] fixed potential scientific integrity issues --- maseval/benchmark/mmlu/mmlu.py | 38 +++++++++++++------ .../interface/inference/huggingface_scorer.py | 8 ++++ .../test_mmlu/test_mmlu_unit.py | 4 +- 3 files changed, 36 insertions(+), 14 deletions(-) diff --git a/maseval/benchmark/mmlu/mmlu.py b/maseval/benchmark/mmlu/mmlu.py index b367288a..2ec50b5c 100644 --- a/maseval/benchmark/mmlu/mmlu.py +++ b/maseval/benchmark/mmlu/mmlu.py @@ -105,12 +105,20 @@ def setup_state(self, task_data: Dict[str, Any]) -> Dict[str, Any]: (dict with ``"choices"``, ``"full_prompt"``, ``"use_full_prompt"``). 
""" env_data = task_data["environment_data"] - return { + use_full_prompt = env_data["use_full_prompt"] + if use_full_prompt and "full_prompt" not in env_data: + raise ValueError( + "use_full_prompt=True but 'full_prompt' is missing from environment_data. " + "Ensure the dataset includes few-shot prompts or set use_full_prompt=False." + ) + state: Dict[str, Any] = { "query": task_data["query"], "choices": env_data["choices"], - "full_prompt": env_data["full_prompt"], - "use_full_prompt": env_data["use_full_prompt"], + "use_full_prompt": use_full_prompt, } + if "full_prompt" in env_data: + state["full_prompt"] = env_data["full_prompt"] + return state def create_tools(self) -> Dict[str, Any]: """MMLU doesn't use tools.""" @@ -179,13 +187,14 @@ def __call__(self, traces: Dict[str, Any], final_answer: Optional[str] = None) - final_answer: The model's final answer. Returns: - Dict with acc, acc_norm, predicted, gold, correct, and optionally logprobs fields. + Dict with acc, acc_norm, predicted, gold, correct, parse_failed, and optionally logprobs fields. """ # Parse the model's answer predicted = self._parse_answer(final_answer or "") + parse_failed = predicted is None - # Check if correct - correct = predicted == self.gold + # Check if correct — unparseable responses are never correct + correct = (not parse_failed) and predicted == self.gold result = { "acc": 1.0 if correct else 0.0, @@ -193,6 +202,7 @@ def __call__(self, traces: Dict[str, Any], final_answer: Optional[str] = None) - "predicted": predicted, "gold": self.gold, "correct": correct, + "parse_failed": parse_failed, "doc_id": self.task.metadata["doc_id"], } @@ -205,7 +215,7 @@ def __call__(self, traces: Dict[str, Any], final_answer: Optional[str] = None) - return result - def _parse_answer(self, response: str) -> int: + def _parse_answer(self, response: str) -> Optional[int]: """Parse model response to extract answer choice. 
Handles various formats: @@ -217,10 +227,10 @@ def _parse_answer(self, response: str) -> int: response: Model's response string. Returns: - Index of the predicted choice (0-3), or -1 if unparseable. + Index of the predicted choice (0-3), or None if unparseable. """ if not response: - return -1 + return None response = response.strip().upper() @@ -242,7 +252,7 @@ def _parse_answer(self, response: str) -> int: if last_char == choice: return i - return -1 + return None # ============================================================================= @@ -591,8 +601,8 @@ def load_tasks( id=f"mmlu_{i}", environment_data={ "choices": item["choices"], - "full_prompt": item.get("full_prompt", ""), - "example": item.get("example", ""), + **({"full_prompt": item["full_prompt"]} if "full_prompt" in item else {}), + **({"example": item["example"]} if "example" in item else {}), }, evaluation_data={ "gold": item["gold"], @@ -628,6 +638,7 @@ def compute_benchmark_metrics(results: List[Dict[str, Any]]) -> Dict[str, Any]: total_tasks = len(results) correct_count = 0 + parse_failed_count = 0 acc_sum = 0.0 acc_norm_sum = 0.0 @@ -641,10 +652,13 @@ def compute_benchmark_metrics(results: List[Dict[str, Any]]) -> Dict[str, Any]: acc_norm_sum += entry["acc_norm"] if entry["correct"]: correct_count += 1 + if entry.get("parse_failed", False): + parse_failed_count += 1 return { "total_tasks": total_tasks, "correct_count": correct_count, + "parse_failed_count": parse_failed_count, "acc": acc_sum / total_tasks if total_tasks > 0 else 0.0, "acc_norm": acc_norm_sum / total_tasks if total_tasks > 0 else 0.0, } diff --git a/maseval/interface/inference/huggingface_scorer.py b/maseval/interface/inference/huggingface_scorer.py index 53d43dad..d52b9f99 100644 --- a/maseval/interface/inference/huggingface_scorer.py +++ b/maseval/interface/inference/huggingface_scorer.py @@ -168,6 +168,10 @@ def _loglikelihood_impl(self, context: str, continuation: str) -> float: logits = model(input_ids).logits[0] inplen 
= len(input_tokens) contlen = len(continuation_enc) + assert inplen >= contlen, ( + f"Context tokens ({inplen}) fewer than continuation tokens ({contlen}). " + f"Tokenisation produced an unexpected result for context={context!r}, continuation={continuation!r}" + ) selected = logits[inplen - contlen : inplen] log_probs = torch.nn.functional.log_softmax(selected, dim=-1) @@ -235,6 +239,10 @@ def _score_single_token( logits = model(input_ids).logits[0] inplen = len(input_tokens) contlen = len(first_cont_enc) + assert inplen >= contlen, ( + f"Context tokens ({inplen}) fewer than continuation tokens ({contlen}). " + "Tokenisation produced an unexpected result." + ) selected_logits = logits[inplen - contlen : inplen] log_probs = torch.nn.functional.log_softmax(selected_logits, dim=-1) diff --git a/tests/test_benchmarks/test_mmlu/test_mmlu_unit.py b/tests/test_benchmarks/test_mmlu/test_mmlu_unit.py index 5602f122..4744ee17 100644 --- a/tests/test_benchmarks/test_mmlu/test_mmlu_unit.py +++ b/tests/test_benchmarks/test_mmlu/test_mmlu_unit.py @@ -87,8 +87,8 @@ def evaluator(self, sample_mmlu_task): # Last character ("I think it's B", 1), # Empty / unparseable - ("", -1), - ("random text", -1), + ("", None), + ("random text", None), ], ) def test_parse_answer(self, evaluator, response, expected): From 524fee58e9b9218b20e756e5fc571b18b118de1b Mon Sep 17 00:00:00 2001 From: cemde <42615086+cemde@users.noreply.github.com> Date: Sat, 28 Mar 2026 03:43:02 +0100 Subject: [PATCH 22/23] small bug fix, formatting issues --- CHANGELOG.md | 3 ++- maseval/core/utils/system_info.py | 8 +++++++- maseval/interface/inference/huggingface_scorer.py | 3 +-- 3 files changed, 10 insertions(+), 4 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 6bca13c1..941548d2 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -11,7 +11,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 **Core** -- Fixed `MessageHistory.to_list()` returning a reference to the internal 
list instead of a copy, causing simulator logs to contain future conversation messages that hadn't occurred at the time of logging. (PR: #PR_NUMBER_PLACEHOLDER)
+- Fixed `MessageHistory.to_list()` returning a reference to the internal list instead of a copy, causing simulator logs to contain future conversation messages that hadn't occurred at the time of logging. (PR: #48)
+- Fixed `get_git_info()` crashing on detached HEAD (e.g. in CI checkout); it now returns `detached@<short-hash>` as the branch name. (PR: #41)
 
 **Interface**
 
diff --git a/maseval/core/utils/system_info.py b/maseval/core/utils/system_info.py
index 583fc127..10b1bb89 100644
--- a/maseval/core/utils/system_info.py
+++ b/maseval/core/utils/system_info.py
@@ -36,6 +36,12 @@ def get_git_info(repo_path: Optional[str] = None) -> Dict[str, Any]:
     # Get current commit
     commit = repo.head.commit
 
+    # Get branch name (active_branch raises TypeError on a detached HEAD, e.g. in CI)
+    try:
+        branch = repo.active_branch.name
+    except TypeError:
+        branch = f"detached@{commit.hexsha[:7]}"
+
     # Get remote URL (if available)
     remote_url = None
     try:
@@ -46,7 +52,7 @@
     return {
         "commit_hash": commit.hexsha,
         "commit_hash_short": commit.hexsha[:7],
-        "branch": repo.active_branch.name,
+        "branch": branch,
         "is_dirty": repo.is_dirty(),
         "untracked_files": len(repo.untracked_files),
         "remote_url": remote_url,
diff --git a/maseval/interface/inference/huggingface_scorer.py b/maseval/interface/inference/huggingface_scorer.py
index d52b9f99..f6138f31 100644
--- a/maseval/interface/inference/huggingface_scorer.py
+++ b/maseval/interface/inference/huggingface_scorer.py
@@ -240,8 +240,7 @@ def _score_single_token(
         inplen = len(input_tokens)
         contlen = len(first_cont_enc)
         assert inplen >= contlen, (
-            f"Context tokens ({inplen}) fewer than continuation tokens ({contlen}). "
-            "Tokenisation produced an unexpected result."
+            f"Context tokens ({inplen}) fewer than continuation tokens ({contlen}). 
Tokenisation produced an unexpected result." ) selected_logits = logits[inplen - contlen : inplen] log_probs = torch.nn.functional.log_softmax(selected_logits, dim=-1) From 9b2696a8a6060a67e586edc0315ea0a7e518d697 Mon Sep 17 00:00:00 2001 From: cemde <42615086+cemde@users.noreply.github.com> Date: Sat, 28 Mar 2026 05:17:26 +0100 Subject: [PATCH 23/23] fixed changelog and docs --- CHANGELOG.md | 67 ++++++++++++++++++++-------------------------------- mkdocs.yml | 1 - 2 files changed, 25 insertions(+), 43 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 941548d2..e9f5a23b 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -25,16 +25,30 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - Usage and cost tracking via `Usage` and `TokenUsage` data classes. `ModelAdapter` tracks token usage automatically after each `chat()` call. Components that implement `UsageTrackableMixin` are collected via `gather_usage()`. Live totals available during benchmark runs via `benchmark.usage` (grand total) and `benchmark.usage_by_component` (per-component breakdowns). Post-hoc analysis via `UsageReporter.from_reports(benchmark.reports)` with breakdowns by task, component, or model. (PR: #45) - Pluggable cost calculation via `CostCalculator` protocol. `StaticPricingCalculator` computes cost from user-supplied per-token rates. `LiteLLMCostCalculator` in `maseval.interface.usage` for automatic pricing via LiteLLM's model database (supports `custom_pricing` overrides and `model_id_map`; requires `litellm`). Pass a `cost_calculator` to `ModelAdapter` or `AgentAdapter` to compute `Usage.cost`. Provider-reported cost always takes precedence. (PR: #45) - `AgentAdapter` now accepts `cost_calculator` and `model_id` parameters. For smolagents, CAMEL, and LlamaIndex, both are auto-detected from the framework's agent object (`LiteLLMCostCalculator` if litellm is installed). LangGraph requires explicit `model_id` since graphs can contain multiple models. 
Explicit parameters always override auto-detection. (PR: #45) - - `Task.freeze()` and `Task.unfreeze()` methods to make task data read-only during benchmark runs, preventing accidental mutation of `environment_data`, `user_data`, `evaluation_data`, and `metadata` (including nested dicts). Attribute reassignment is also blocked while frozen. Check state with `Task.is_frozen`. (PR: #42) - `TaskFrozenError` exception in `maseval.core.exceptions`, raised when attempting to modify a frozen task. (PR: #42) +- Added `InformativeSubsetQueue` and `DISCOQueue` to `maseval.core.task` for subset-based evaluation (e.g., anchor-point selection for DISCO). `DISCOQueue` accepts `anchor_points_path` to load indices from a `.json`/`.pkl` file via `DISCOQueue.load_anchor_points()`. Available via `from maseval import DISCOQueue, InformativeSubsetQueue`. (PR: #34 and #41) +- Added `ModelScorer` abstract base class in `maseval.core.scorer` for log-likelihood scoring, with `loglikelihood()`, `loglikelihood_batch()`, and `loglikelihood_choices()` methods. 
(PR: #34 and #41) +- Added `SeedGenerator` abstract base class and `DefaultSeedGenerator` implementation for reproducible benchmark runs via SHA-256-based seed derivation (PR: #24) +- Added `seed` and `seed_generator` parameters to `Benchmark.__init__` for enabling reproducibility (PR: #24) +- Added `seed_generator` parameter to all benchmark setup methods (`setup_environment`, `setup_user`, `setup_agents`, `setup_evaluators`) (PR: #24) +- Added `seed` parameter to `ModelAdapter.__init__` for deterministic model inference (PR: #24) +- Added `SeedingError` exception for providers that don't support seeding (Anthropic models raise this if seed is provided) (PR: #24) +- Added `UserExhaustedError` exception in `maseval.core.exceptions` for flow control when a user's turns are exhausted (PR: #39) -**Benchmarks** +**Interface** + +- Added seed support to interface adapters: `OpenAIModelAdapter`, `GoogleGenAIModelAdapter`, `LiteLLMModelAdapter`, `HuggingFacePipelineModelAdapter` pass seeds to underlying APIs (PR: #24) +- Added `HuggingFaceModelScorer` in `maseval.interface.inference` — log-likelihood scorer backed by a HuggingFace `AutoModelForCausalLM`, with single-token optimisation for MCQ evaluation. Implements the `ModelScorer` interface. (PR: #34 and #41) +- CAMEL-AI integration: `CamelAgentAdapter` and `CamelLLMUser` for evaluating CAMEL-AI ChatAgent-based systems (PR: #22) + - Added `CamelAgentUser` for using a CAMEL ChatAgent as the user in agent-to-agent evaluation (PR: #22) + - Added `camel_role_playing_execution_loop()` for benchmarks using CAMEL's RolePlaying semantics (PR: #22) + - Added `CamelRolePlayingTracer` and `CamelWorkforceTracer` for capturing orchestration-level traces from CAMEL's multi-agent systems (PR: #22) -- MMLU Benchmark with DISCO support: Integration for evaluating language models on MMLU (Massive Multitask Language Understanding) multiple-choice questions, compatible with DISCO anchor-point methodology. 
Includes `MMLUBenchmark`, `DefaultMMLUBenchmark`, `MMLUEnvironment`, `MMLUEvaluator`, `load_tasks()`, and `compute_benchmark_metrics()`. Install with `pip install maseval[mmlu]`. Optional extras: `lm-eval` (for `DefaultMMLUBenchmark.precompute_all_logprobs_lmeval`), `disco` (for DISCO prediction in the example). (PR: #34) +**Benchmarks** +- MMLU Benchmark with DISCO support: Integration for evaluating language models on MMLU (Massive Multitask Language Understanding) multiple-choice questions, compatible with DISCO anchor-point methodology. `MMLUBenchmark` is a framework-agnostic base class (`setup_agents()` and `get_model_adapter()` must be implemented by subclasses); `DefaultMMLUBenchmark` provides a ready-made HuggingFace implementation. Also includes `MMLUEnvironment`, `MMLUEvaluator`, `load_tasks()`, and `compute_benchmark_metrics()`. Install with `pip install maseval[mmlu]`. Optional extras: `lm-eval` (for `DefaultMMLUBenchmark.precompute_all_logprobs_lmeval`), `disco` (for DISCO prediction in the example). (PR: #34 and #41) - CONVERSE benchmark for contextual safety evaluation in adversarial agent-to-agent conversations, including `ConverseBenchmark`, `DefaultAgentConverseBenchmark`, `ConverseEnvironment`, `ConverseExternalAgent`, `PrivacyEvaluator`, `SecurityEvaluator`, and `load_tasks()` utilities for `travel`, `real_estate`, and `insurance` domains. Benchmark source files are now downloaded on first use via `ensure_data_exists()` instead of being bundled in the package. 
(PR: #28) - - GAIA2 Benchmark: Integration with Meta's ARE (Agent Research Environments) platform for evaluating LLM-based agents on dynamic, multi-step scenarios (PR: #26) - `Gaia2Benchmark`, `Gaia2Environment`, `Gaia2Evaluator` components for framework-agnostic evaluation with ARE simulation (PR: #26) - `DefaultAgentGaia2Benchmark` with ReAct-style agent for direct comparison with ARE reference implementation (PR: #26) @@ -44,7 +58,6 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - Metrics: `compute_gaia2_metrics()` for GSR (Goal Success Rate) computation by capability type (PR: #26) - Support for 5 capability dimensions: execution, search, adaptability, time, ambiguity (PR: #26, #30) - Added `gaia2` optional dependency: `pip install maseval[gaia2]` (PR: #26) - - MultiAgentBench Benchmark: Integration with MARBLE MultiAgentBench for evaluating multi-agent collaboration across all 6 paper-defined domains: research, bargaining, coding, database, werewolf, and minecraft (PR: #25, #30) - `MultiAgentBenchBenchmark` abstract base class for framework-agnostic multi-agent evaluation with seeding support for evaluators and agents (PR: #25) - `MarbleMultiAgentBenchBenchmark` for exact MARBLE reproduction mode using native MARBLE agents (note: MARBLE's internal LLM calls bypass MASEval seeding) (PR: #25) @@ -55,9 +68,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 **Examples** - Added usage tracking to the 5-A-Day benchmark: `five_a_day_benchmark.ipynb` (section 2.7) and `five_a_day_benchmark.py` (post-run usage summary with per-component and per-task breakdowns). (PR: #45) - -- MMLU benchmark example at `examples/mmlu_benchmark/` for evaluating HuggingFace models on MMLU with optional DISCO prediction (`--disco_model_path`, `--disco_transform_path`). Supports local data, HuggingFace dataset repos, and DISCO weights from .pkl/.npz or HF repos. 
(PR: #34) -- MMLU benchmark documentation at `docs/benchmark/mmlu.md` with installation, quick start, and API reference. (PR: #34) +- MMLU benchmark example at `examples/mmlu_benchmark/` for evaluating HuggingFace models on MMLU with optional DISCO prediction (`--disco_model_path`, `--disco_transform_path`). Supports local data, HuggingFace dataset repos, and DISCO weights from .pkl/.npz or HF repos. (PR: #34 and #41) - Added a dedicated runnable CONVERSE default benchmark example at `examples/converse_benchmark/default_converse_benchmark.py` for quick start with `DefaultAgentConverseBenchmark`. (PR: #28) - Gaia2 benchmark example with Google GenAI and OpenAI model support (PR: #26) @@ -65,28 +76,6 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - Usage & Cost Tracking guide (`docs/guides/usage-tracking.md`) and API reference (`docs/reference/usage.md`). (PR: #45) -**Core** - -- Added `InformativeSubsetQueue` and `DISCOQueue` to `maseval.core.task` for subset-based evaluation (e.g., anchor-point selection for DISCO). `DISCOQueue` accepts `anchor_points_path` to load indices from a `.json`/`.pkl` file via `DISCOQueue.load_anchor_points()`. Available via `from maseval import DISCOQueue, InformativeSubsetQueue`. (PR: #34) -- Added `ModelScorer` abstract base class in `maseval.core.scorer` for log-likelihood scoring, with `loglikelihood()`, `loglikelihood_batch()`, and `loglikelihood_choices()` methods. 
(PR: #34) -- Added `SeedGenerator` abstract base class and `DefaultSeedGenerator` implementation for reproducible benchmark runs via SHA-256-based seed derivation (PR: #24) -- Added `seed` and `seed_generator` parameters to `Benchmark.__init__` for enabling reproducibility (PR: #24) -- Added `seed_generator` parameter to all benchmark setup methods (`setup_environment`, `setup_user`, `setup_agents`, `setup_evaluators`) (PR: #24) -- Added `seed` parameter to `ModelAdapter.__init__` for deterministic model inference (PR: #24) -- Added `SeedingError` exception for providers that don't support seeding (Anthropic models raise this if seed is provided) (PR: #24) -- Added seed support to interface adapters: `OpenAIModelAdapter`, `GoogleGenAIModelAdapter`, `LiteLLMModelAdapter`, `HuggingFacePipelineModelAdapter` pass seeds to underlying APIs (PR: #24) -- Added `UserExhaustedError` exception in `maseval.core.exceptions` for flow control when a user's turns are exhausted (PR: #39) - -**Interface** - -- Added `HuggingFaceModelScorer` in `maseval.interface.inference` — log-likelihood scorer backed by a HuggingFace `AutoModelForCausalLM`, with single-token optimisation for MCQ evaluation. Implements the `ModelScorer` interface. (PR: #34) -- Renamed `HuggingFaceModelAdapter` → `HuggingFacePipelineModelAdapter` to distinguish it from the new scorer. 
(PR: #34) - -- CAMEL-AI integration: `CamelAgentAdapter` and `CamelLLMUser` for evaluating CAMEL-AI ChatAgent-based systems (PR: #22) - - Added `CamelAgentUser` for using a CAMEL ChatAgent as the user in agent-to-agent evaluation (PR: #22) - - Added `camel_role_playing_execution_loop()` for benchmarks using CAMEL's RolePlaying semantics (PR: #22) - - Added `CamelRolePlayingTracer` and `CamelWorkforceTracer` for capturing orchestration-level traces from CAMEL's multi-agent systems (PR: #22) - **Testing** - Composable pytest markers (`live`, `credentialed`, `slow`, `smoke`) for fine-grained test selection; default runs exclude slow, credentialed, and smoke tests (PR: #29) @@ -98,7 +87,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - Live API round-trip tests for all model adapters (`-m credentialed`) (PR: #29) - CI jobs for slow tests (with benchmark data caching) and credentialed tests (behind GitHub Environment approval) (PR: #29) - Added `respx` dev dependency for HTTP-level mocking (PR: #29) -- pytest marker `mmlu` for tests that require the MMLU benchmark (HuggingFace + DISCO). (PR: #34) +- pytest marker `mmlu` for tests that require the MMLU benchmark (HuggingFace + DISCO). (PR: #34 and #41) ### Changed @@ -115,30 +104,24 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - `LlamaIndexAgentAdapter`: Added `max_iterations` constructor parameter, forwarded to `AgentWorkflow.run()`. Fixes silent swallowing of `max_steps` by `FunctionAgent.__init__`. (PR: #39) - `SmolAgentAdapter`: New `_determine_step_status()` detects crashed steps where `AgentGenerationError` was raised before `step.error` was set, preventing false "success" status on empty steps. (PR: #39) - `GoogleGenAIModelAdapter`: Consecutive tool-response messages are now merged into a single `contents` entry, fixing Google API errors when multiple tool results are returned in one turn. 
(PR: #39) +- Renamed framework-specific user classes to reflect the new `LLMUser` base (PR: #22): + - `SmolAgentUser` → `SmolAgentLLMUser` + - `LangGraphUser` → `LangGraphLLMUser` + - `LlamaIndexUser` → `LlamaIndexLLMUser` **Benchmarks** -- `MMLUBenchmark` is now a framework-agnostic base class — `setup_agents()` and `get_model_adapter()` must be implemented by subclasses. Use `DefaultMMLUBenchmark` for HuggingFace models. Missing required fields now raise immediately instead of falling back to silent defaults. Install with `pip install maseval[mmlu]`. (PR: #34) -- `DISCOQueue` moved from `maseval.benchmark.mmlu` to `maseval.core.task` and now extends `SequentialTaskQueue`. Removed `MMLUModelAgent`, `MMLUAgentAdapter`, and `AnchorPointsTaskQueue`. (PR: #34) - `MACSBenchmark` and `Tau2Benchmark` benchmarks now actively use the seeding system by deriving seeds for model adapters. Seeds are passed to agents, user simulators, tool simulators, and LLM-based evaluators for reproducible runs. (PR: #26) - `Gaia2Benchmark`: Seeds `agents/gaia2_agent`, `evaluators/judge` - `MACSBenchmark`: Seeds `environment/tools/tool_{name}`, `simulators/user`, `evaluators/user_gsr`, `evaluators/system_gsr` - `Tau2Benchmark`: Seeds `simulators/user`, `agents/default_agent` +- All benchmarks except MACS are now labeled as **Beta** in docs, BENCHMARKS.md, and benchmark index, with a warning that results have not yet been validated against original implementations. (PR: #39) **User** - Refactored `User` class into abstract base class defining the interface (`get_initial_query()`, `respond()`, `is_done()`) with `LLMUser` as the concrete LLM-driven implementation. This enables non-LLM user implementations (scripted, human-in-the-loop, agent-based). 
(PR: #22) - Renamed `AgenticUser` → `AgenticLLMUser` for consistency with the new hierarchy (PR: #22) -**Interface** - -- Renamed framework-specific user classes to reflect the new `LLMUser` base (PR: #22): - - `SmolAgentUser` → `SmolAgentLLMUser` - - `LangGraphUser` → `LangGraphLLMUser` - - `LlamaIndexUser` → `LlamaIndexLLMUser` - -- All benchmarks except MACS are now labeled as **Beta** in docs, BENCHMARKS.md, and benchmark index, with a warning that results have not yet been validated against original implementations. (PR: #39) - **Testing** - Coverage script (`scripts/coverage_by_feature.py`) now accepts `--exclude` flag to skip additional markers; always excludes `credentialed` and `smoke` by default (PR: #29) diff --git a/mkdocs.yml b/mkdocs.yml index 813fe947..6a6edf98 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -131,7 +131,6 @@ nav: - LiteLLM: interface/inference/litellm.md - OpenAI: interface/inference/openai.md - Benchmarks: - - Overview: benchmark/index.md - ConVerse: benchmark/converse.md - GAIA2: benchmark/gaia2.md - MACS: benchmark/macs.md
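The parse-failure handling introduced in PATCH 21 amounts to a small contract: unparseable responses yield `None` rather than the old `-1` sentinel, are never counted as correct, and surface in the aggregate as `parse_failed_count`. A standalone sketch of that contract follows — the names `parse_answer` and `summarize` are hypothetical simplifications, not the actual `MMLUEvaluator._parse_answer` / `compute_benchmark_metrics` implementations:

```python
import re
from typing import Any, Dict, List, Optional

CHOICES = ["A", "B", "C", "D"]


def parse_answer(response: str) -> Optional[int]:
    """Simplified stand-in for the patched parser: return a choice index
    0-3, or None when the response is unparseable (heuristic, as in the
    docstring: leading letter, then last-character fallback)."""
    if not response:
        return None
    response = response.strip().upper()
    # Leading letter, e.g. "B", "B.", "B) because ..."
    m = re.match(r"([ABCD])\b", response)
    if m:
        return CHOICES.index(m.group(1))
    # Fall back to the last character, e.g. "I think it's B"
    if response[-1] in CHOICES:
        return CHOICES.index(response[-1])
    return None


def summarize(results: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Simplified stand-in for the patched aggregation: parse failures are
    counted separately instead of silently scoring as wrong answers."""
    total = len(results)
    return {
        "total_tasks": total,
        "correct_count": sum(1 for r in results if r["correct"]),
        "parse_failed_count": sum(1 for r in results if r.get("parse_failed", False)),
        "acc": (sum(r["acc"] for r in results) / total) if total else 0.0,
    }
```

Returning `None` instead of `-1` forces callers to handle the unparseable case explicitly (`Optional[int]` fails type-checking if compared carelessly), whereas `-1` could flow into downstream indexing or comparisons without complaint.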