test: add regression tests for lexical baseline models #42

Merged
Mattdl merged 2 commits into techwolf-ai:main from federetyk:feat/regression-tests-lexical-baselines
Feb 23, 2026
@federetyk
Contributor

Addresses #41

Description

This PR adds regression tests for all configuration variants of the four lexical baseline models introduced in #36: BM25Model, TfIdfModel, EditDistanceModel, and RandomRankingModel. Each variant is evaluated on the English-only split of the JobTitleSimilarityRanking task, and the resulting metrics are compared against pre-recorded expected values with a small tolerance window. The full regression suite runs in ~12 seconds on a mid-range laptop CPU.

These tests complement the existing unit tests in test_lexical_baselines.py, which verify output shapes and types but do not exercise the evaluation pipeline or assert metric correctness. This PR was suggested by @Mattdl in #36.
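
The comparison pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code in tests/test_lexical_baselines_regression.py: `fake_evaluate`, `find_mismatches`, `check_variant`, the variant name, the tolerance, and the expected values are all hypothetical placeholders, and the real tests run the models through the evaluation pipeline instead of a stub.

```python
import math

# Hypothetical tolerance; the PR description only says "a small tolerance window".
ABS_TOL = 1e-4

# Placeholder values for illustration only; NOT the metrics recorded in this PR.
EXPECTED = {
    "variant_a": {"map": 0.5, "mrr": 0.75},
}


def fake_evaluate(variant: str) -> dict[str, float]:
    """Stand-in for evaluating one model variant on the ranking task."""
    # Simulate tiny numerical drift relative to the recorded values.
    return {"map": 0.5 + 2e-6, "mrr": 0.75 - 1e-6}


def find_mismatches(
    actual: dict[str, float], expected: dict[str, float], abs_tol: float
) -> list[str]:
    """Return the names of metrics that differ from expected by more than abs_tol."""
    return [
        name
        for name, want in expected.items()
        if not math.isclose(actual[name], want, rel_tol=0.0, abs_tol=abs_tol)
    ]


def check_variant(variant: str) -> None:
    """Fail loudly if any metric of the variant drifted outside the tolerance."""
    mismatches = find_mismatches(fake_evaluate(variant), EXPECTED[variant], ABS_TOL)
    assert not mismatches, f"{variant}: metrics drifted: {mismatches}"


check_variant("variant_a")
```

Under pytest, the same check is usually written with `pytest.approx(want, abs=ABS_TOL)` and one parametrized test case per variant, so a failure names the variant that regressed.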

Changes:

  • Add tests/test_lexical_baselines_regression.py

Checklist

  • Added new tests for new functionality
  • Tested locally with example tasks
  • Code follows project style guidelines
  • Documentation updated
  • No new warnings introduced

Evaluate all 9 lexical baseline model variants on the
JobTitleSimilarityRanking task (English, test split, 105 queries x
2,619 targets) and assert that MAP, RP@5, RP@10, and MRR match
pre-recorded expected values within abs=1e-6 tolerance.

Covers BM25 (lower/cased), TfIdf (word-lower/word-cased/char-lower/
char-cased), EditDistance (lower/cased), and RandomRanking (seed=42).

Addresses techwolf-ai#41
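
For readers unfamiliar with the four metrics, here is a small self-contained sketch on toy data. It assumes the standard definitions of average precision and reciprocal rank, and it assumes RP@k means the number of relevant items in the top k divided by min(k, number of relevant items); the project's own metric code may differ in details. All function names and the toy queries are hypothetical.

```python
def average_precision(ranked: list[str], relevant: set[str]) -> float:
    """Mean of precision values at each rank where a relevant item appears."""
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def reciprocal_rank(ranked: list[str], relevant: set[str]) -> float:
    """1 / rank of the first relevant item, or 0.0 if none is retrieved."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def r_precision_at_k(ranked: list[str], relevant: set[str], k: int) -> float:
    """Relevant items in the top k, normalised by min(k, number of relevant items)."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / min(k, len(relevant))


def mean_over_queries(metric, queries, **kwargs) -> float:
    """Average a per-query metric over (ranking, relevant set) pairs."""
    return sum(metric(ranked, rel, **kwargs) for ranked, rel in queries) / len(queries)


# Toy data: two queries, each with a ranked target list and a relevant set.
queries = [
    (["a", "b", "c", "d"], {"a", "c"}),
    (["x", "y", "z"], {"z"}),
]

map_score = mean_over_queries(average_precision, queries)
mrr_score = mean_over_queries(reciprocal_rank, queries)
rp_at_2 = mean_over_queries(r_precision_at_k, queries, k=2)
```

MAP and MRR are the per-query values averaged over all queries, which is why a single changed ranking among the 105 queries is enough to move the recorded numbers.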
…stability

BM25 and TfIdf scoring may produce slightly different floating-point
results across Python versions and numpy builds. The previous abs=1e-6
tolerance was too tight for reproducibility.
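
One general reason such differences can arise is that floating-point addition is not associative, so a library that sums the same terms in a different order may return a slightly different result. The snippet below is a generic illustration of that effect and of why a tolerance absorbs it; it is not taken from the BM25 or TfIdf code, and the tolerance shown is an arbitrary example value.

```python
import math

# The same three terms, summed in two different orders.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

# Exact equality fails because each addition rounds differently.
exactly_equal = left_to_right == right_to_left

# A tolerance-based comparison treats the two results as the same value.
close_enough = math.isclose(left_to_right, right_to_left, rel_tol=0.0, abs_tol=1e-9)
```

A regression test wants the tolerance wide enough to absorb this kind of rounding noise, yet narrow enough that a real change in ranking behaviour still fails the assertion.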
Collaborator

@Mattdl left a comment

Looks great! Closing PR.

@Mattdl merged commit 97d799b into techwolf-ai:main on Feb 23, 2026
2 checks passed
@federetyk deleted the feat/regression-tests-lexical-baselines branch on February 23, 2026, 23:21