Overview
Add the ability to filter and group trend analysis results by system prompt and MCP tool schema versions. This enables users to correlate performance changes with configuration changes over time.
Background
MCProbe now captures system prompts and MCP tool schemas with each test run (implemented in #24, #25, #27), including SHA-256 hashes for quick comparison. Because every run carries these hashes, trend results can be filtered and grouped by configuration version.
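For context, here is a minimal sketch of how such hashes could be derived. The function names `hash_prompt` and `hash_schema` are hypothetical, and the canonical JSON serialization (sorted keys, compact separators) is an assumption; the implementation in #24, #25, #27 may normalize its input differently.

```python
import hashlib
import json


def hash_prompt(prompt: str) -> str:
    """Return the SHA-256 hex digest of a system prompt (UTF-8 encoded)."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()


def hash_schema(tools: list[dict]) -> str:
    """Return the SHA-256 hex digest of a list of tool schemas.

    Assumption: schemas are serialized as JSON with sorted keys, so two
    schemas that differ only in key order produce the same hash.
    """
    canonical = json.dumps(tools, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

With a stable serialization, any change to a tool name or description yields a new hash, which is what makes the hash usable as a version identifier.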
Proposed Features
1. Filter by Prompt Version
- Filter trend data to show only runs with a specific prompt hash
- Compare performance metrics across different prompt versions
- Identify which prompt version performed best for a given scenario
2. Filter by Schema Version
- Filter trend data to show only runs with a specific schema hash
- Track how tool description changes affect tool usage patterns
- Correlate schema changes with pass/fail rate changes
3. Group by Configuration
- Group trend results by prompt hash to see performance distribution per prompt version
- Group by schema hash to compare tool effectiveness across schema versions
- Group by prompt hash and schema hash together to see performance for each full configuration
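The filtering and grouping described in features 1 to 3 could look roughly like the sketch below. `RunRecord`, `filter_runs`, `group_runs`, and the grouping keys `"schema"` and `"config"` are illustrative names, not part of MCProbe's existing API; the stored run format may differ.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean
from typing import Optional


@dataclass(frozen=True)
class RunRecord:
    """Hypothetical minimal view of one stored test run."""
    prompt_hash: str
    schema_hash: str
    score: float
    passed: bool


def filter_runs(runs, prompt_hash: Optional[str] = None,
                schema_hash: Optional[str] = None):
    """Keep only runs matching the given hashes; None means 'any'."""
    return [
        r for r in runs
        if (prompt_hash is None or r.prompt_hash == prompt_hash)
        and (schema_hash is None or r.schema_hash == schema_hash)
    ]


def group_runs(runs, by: str = "prompt"):
    """Summarize runs per prompt hash, schema hash, or full configuration."""
    key_funcs = {
        "prompt": lambda r: r.prompt_hash,
        "schema": lambda r: r.schema_hash,
        "config": lambda r: (r.prompt_hash, r.schema_hash),
    }
    key = key_funcs[by]
    groups = defaultdict(list)
    for run in runs:
        groups[key(run)].append(run)
    return {
        k: {
            "runs": len(v),
            "mean_score": mean(r.score for r in v),
            "pass_rate": sum(r.passed for r in v) / len(v),
        }
        for k, v in groups.items()
    }
```

Returning a per-group summary (run count, mean score, pass rate) keeps the result directly comparable across versions, which covers the "which prompt version performed best" question.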
4. CLI Support
# Filter trends by prompt version
mcprobe trends --prompt-hash abc123
# Filter by schema version
mcprobe trends --schema-hash xyz789
# Show trends grouped by prompt version
mcprobe trends --group-by prompt
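As a rough sketch of how these options might be parsed, the example below uses `argparse` purely to stay self-contained; MCProbe's actual CLI framework may differ. The `--prompt-hash`, `--schema-hash`, and `--group-by prompt` options come from the commands above, while the `schema` and `config` choices and the function name `build_trends_parser` are assumptions.

```python
import argparse


def build_trends_parser() -> argparse.ArgumentParser:
    """Build a hypothetical parser for the proposed `trends` options."""
    parser = argparse.ArgumentParser(prog="mcprobe trends")
    parser.add_argument(
        "--prompt-hash",
        help="only include runs whose system prompt hash matches",
    )
    parser.add_argument(
        "--schema-hash",
        help="only include runs whose tool schema hash matches",
    )
    parser.add_argument(
        "--group-by",
        # "schema" and "config" are assumed values, not in the proposal text.
        choices=["prompt", "schema", "config"],
        help="group trend results by configuration dimension",
    )
    return parser
```

All three options default to `None`, so omitting them leaves the current `trends` behavior unchanged.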
5. HTML Report Enhancements
- Add filter controls in trend visualizations
- Show prompt/schema version timeline
- Highlight configuration change points on trend graphs
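To highlight configuration change points, the report needs the positions in the run timeline where the prompt or schema hash changes. A minimal sketch, assuming runs are already sorted chronologically and each is reduced to a `(prompt_hash, schema_hash)` pair; `config_change_points` is a hypothetical helper name.

```python
from typing import Sequence


def config_change_points(configs: Sequence[tuple[str, str]]) -> list[int]:
    """Return indices of runs whose configuration differs from the previous run.

    Each element of `configs` is a (prompt_hash, schema_hash) pair, ordered
    by run time. The first run is never reported as a change point.
    """
    return [
        i for i in range(1, len(configs))
        if configs[i] != configs[i - 1]
    ]
```

The returned indices can then be mapped to run timestamps and drawn as markers on the trend graph.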
Use Cases
- A/B Testing: Compare test results between two different prompt versions
- Regression Analysis: Quickly see if a prompt change caused a score drop
- Optimization Tracking: Track improvements as prompts are iterated
- Configuration Audit: Review which configurations were tested and when
Related