Conversation
…arator instances

- Add _MetricDefinition class to separate the metric type (definition) from the metric value (measurement)
- Implement comparator caching via _get_shared_comparator() so that only one comparator object exists per unique (method, target, epsilon) combination; see the sketch below
- All solutions using the same comparison method now share the same comparator instance, reducing memory usage
- Maintain full backward compatibility: no changes to the Metric class API
- Remove the TODO comment, as the refactoring addresses the concern about mixing metric and metric-value concerns
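A minimal sketch of the caching idea described above, assuming a simple memoized factory; the `_Comparator` class and its fields are illustrative stand-ins, while `_get_shared_comparator()` and the (method, target, epsilon) key come from the commit message:

```python
from dataclasses import dataclass
from functools import lru_cache
from typing import Optional


@dataclass(frozen=True)
class _Comparator:
    """Illustrative stand-in for the comparator object shared between metrics."""
    method: str
    target: Optional[float]
    epsilon: float


@lru_cache(maxsize=None)
def _get_shared_comparator(method: str, target: Optional[float], epsilon: float) -> _Comparator:
    # lru_cache keys on the (method, target, epsilon) argument tuple, so repeated
    # calls with the same combination return the same comparator instance.
    return _Comparator(method, target, epsilon)


# Two metrics configured identically end up holding the same comparator object.
assert _get_shared_comparator("higher_is_better", None, 1e-9) is _get_shared_comparator("higher_is_better", None, 1e-9)
```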
- Add epsilon to the __eq__ method so that metrics with different epsilon values are correctly differentiated
- Add epsilon to the __hash__ method to maintain the hash contract (the hash must include all fields used in __eq__); a simplified illustration follows below
- Fix a redundant condition in the Metric.__gt__ method

This fixes critical bugs identified in code review that could cause incorrect metric comparisons.
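As a rough illustration of the hash-contract point, here is a simplified Metric with only name, value, and epsilon fields (the real class has more fields and a richer API):

```python
class Metric:
    """Simplified sketch; the real Metric class has additional fields and methods."""

    def __init__(self, name: str, value: float, epsilon: float = 1e-9):
        self.name = name
        self.value = value
        self.epsilon = epsilon

    def __eq__(self, other):
        if not isinstance(other, Metric):
            return NotImplemented
        # epsilon participates in equality, so metrics that differ only in
        # tolerance are no longer considered equal.
        return (self.name, self.value, self.epsilon) == (other.name, other.value, other.epsilon)

    def __hash__(self):
        # The hash is computed from the same fields used in __eq__; otherwise
        # equal objects could hash differently and break dict/set behavior.
        return hash((self.name, self.value, self.epsilon))
```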
- Reformat code to comply with black formatting standards
- Fixes CI formatting check failure
Greptile Overview

Greptile Summary

This PR enhances sample data generation to use LLM-based realistic values instead of simple synthetic patterns. The main functional change is in the dataset generation tooling in `plexe/tools/datasets.py`.

The implementation is solid with proper error handling, though the hardcoded provider string could be made configurable to align with patterns used elsewhere in the codebase.

Confidence Score: 4/5
Important Files Changed

File Analysis
```python
SampleModel = type("SampleModel", (BaseModel,), schema_fields)

# Use LLM to generate sensible sample values based on field names and types
provider = Provider("openai/gpt-4o-mini")
```
**style:** hardcoded `openai/gpt-4o-mini` provider prevents users from configuring model choice

other tools in this module receive the `llm_to_use` parameter (see `get_training_code_generation_tool`, `get_select_target_metric`); consider adding a provider parameter or getting it from config
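One way the reviewer's suggestion could look, sketched under the assumption that this tool follows the same factory pattern as the other `get_*_tool` helpers; apart from `llm_to_use` (which the review cites), the names below are hypothetical:

```python
# Sketch only: Provider is the class already imported in plexe/tools/datasets.py
# (import omitted here); the factory name and inner function are illustrative.
def get_dataset_sample_generation_tool(llm_to_use: str = "openai/gpt-4o-mini"):
    """Hypothetical factory mirroring the llm_to_use pattern used by other tools."""

    def generate_samples(schema_fields: dict, n_rows: int):
        # The model choice now comes from the caller (or a config default)
        # instead of being hardcoded inside the tool body.
        provider = Provider(llm_to_use)
        ...

    return generate_samples
```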
| provider = Provider("openai/gpt-4o-mini") | |
| # Use LLM to generate sensible sample values based on field names and types | |
| # TODO: make provider configurable via parameter | |
| provider = Provider("openai/gpt-4o-mini") |
Prompt To Fix With AI
This is a comment left during a code review.
Path: plexe/tools/datasets.py
Line: 135:135
Comment:
**style:** hardcoded `openai/gpt-4o-mini` provider prevents users from configuring model choice
other tools in this module receive `llm_to_use` parameter (see `get_training_code_generation_tool`, `get_select_target_metric`). consider adding provider parameter or getting from config
```suggestion
# Use LLM to generate sensible sample values based on field names and types
# TODO: make provider configurable via parameter
provider = Provider("openai/gpt-4o-mini")
```
How can I resolve this? If you propose a fix, please make it concise.

- Fix black formatting issues for CI compliance
- Format both files to match project style standards
Completed a TODO where the LLM is called for dataset generation, keeping the original manual logic there as a fallback.
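As a rough illustration of that change, here is a minimal sketch assuming Pydantic v2; `build_sample_rows`, `generate_with_llm`, and `_manual_sample_row` are hypothetical names, not the actual plexe functions:

```python
from pydantic import create_model


def build_sample_rows(schema_fields: dict, n_rows: int, generate_with_llm) -> list[dict]:
    """Sketch of LLM-backed sample generation with the manual logic kept as a fallback."""
    # Build a dynamic Pydantic model from the schema, analogous to the
    # SampleModel construction shown in the diff above.
    SampleModel = create_model("SampleModel", **schema_fields)

    try:
        # generate_with_llm stands in for the provider call that asks the LLM
        # for realistic values matching the field names and types.
        rows = generate_with_llm(SampleModel, n_rows)
        return [SampleModel(**row).model_dump() for row in rows]
    except Exception:
        # If the LLM path fails, fall back to simple synthetic values, since
        # the original manual generation logic is retained as a fallback.
        return [_manual_sample_row(SampleModel) for _ in range(n_rows)]


def _manual_sample_row(model_cls: type) -> dict:
    # Trivial synthetic placeholder values keyed off each field's annotation.
    defaults = {int: 0, float: 0.0, str: "sample", bool: False}
    return {name: defaults.get(field.annotation, None) for name, field in model_cls.model_fields.items()}
```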