Skip to content

Add default_action_fraction to epsilon greedy#136

Open
galih112 wants to merge 4 commits intodevelopfrom
feature/eps_greedy_mixed
Open

Add default_action_fraction to epsilon greedy#136
galih112 wants to merge 4 commits intodevelopfrom
feature/eps_greedy_mixed

Conversation

@galih112
Copy link
Copy Markdown

@galih112 galih112 commented Apr 26, 2026

Changes:

  • Added PositiveFloat01 type alias to base.py — float constrained to (0, 1]
  • Added default_action_fraction: Optional[PositiveFloat01] field to BaseMab for probabilistic mixing of default_action and a uniformly-random arm on the epsilon-greedy explore branch
  • Added validation guards requiring both epsilon and default_action when default_action_fraction is set; reordered checks so the specific error surfaces first
  • Added TestDefaultActionFraction test class with parametrized and named-constant tests covering all branches

Summary by CodeRabbit

  • New Features

    • Added optional default_action_fraction parameter to control exploration behavior in epsilon-greedy selection. When enabled, customizes the balance between selecting the default action and random exploration during the explore phase.
  • Tests

    • Added comprehensive test suite validating parameter constraints and behavioral correctness across multiple configuration scenarios.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 26, 2026

📝 Walkthrough

Walkthrough

Adds a default_action_fraction option to BaseMab, validates it during model init, and changes epsilon-greedy exploration so the explore branch can probabilistically choose the default_action vs. a uniform random action. Also adds a constrained type PositiveFloat01.

Changes

Cohort / File(s) Summary
Type Definition
pybandits/base.py
Adds PositiveFloat01 = NewType("PositiveFloat01", confloat(gt=0, le=1)) for fractions strictly >0 and ≤1.
MAB Implementation
pybandits/mab.py
Adds BaseMab.default_action_fraction field; threads default_action_fraction through __init__ and cold_start; adds model_post_init validation to require epsilon and default_action when set; updates _select_epsilon_greedy_action so exploration performs an additional Bernoulli/binomial draw to choose between default_action and a uniform random action when default_action_fraction is set, preserving legacy behavior when unset.
Tests
tests/test_mab.py
Adds TestDefaultActionFraction tests: config validation (missing epsilon or default_action, invalid zero), deterministic behavior when fraction is None/1.0, empirical mixing tests by mocking RNG, and a quantitative-model selection shape check.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 I twitch a whisker, toss a tiny bet,
A default hop gets voted — sometimes, not yet.
With binomial luck and epsilon's curious art,
I mix the known and new with a probabilistic heart. 🥕

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically summarizes the main change: adding a new default_action_fraction parameter to the epsilon-greedy action selection strategy.
Description check ✅ Passed The description covers all required sections: Changes (with feature details, tests, and validation), addressing the template structure completely and providing sufficient implementation details.
Docstring Coverage ✅ Passed Docstring coverage is 80.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feature/eps_greedy_mixed

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@galih112 galih112 requested a review from shaharbar1 April 26, 2026 09:00
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (3)
tests/test_mab.py (2)

238-254: Test only covers the patched-random.choice path, not the "default could still be drawn" nuance.

With default_action_fraction=0.0, the implementation routes to random.choice(list(p.keys())), which uniformly includes the default action "a1". This test asserts "a2" only because random.choice is patched. That's fine for verifying the branch dispatch, but the test's docstring ("must always pick a random arm") could be misread as "never returns default". Consider tightening the docstring to "must always go through the random branch" to avoid that implication.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_mab.py` around lines 238 - 254, The test name/test docstring is
misleading: it asserts the explore branch is taken (via a patched random.choice)
but doesn't guarantee the default action won't be drawn; update the docstring in
test_default_action_fraction_zero_matches_random_only to say it "must always go
through the random branch" (or similar) and mention the patched random.choice
path, referencing the test function
test_default_action_fraction_zero_matches_random_only,
DummyMab._select_epsilon_greedy_action, default_action_fraction=0.0, and the
patched random.choice to make the intent explicit.

184-184: Escape . in pytest.raises(match=...) regex patterns (RUF043).

The match= argument is a regex; the literal . matches any character, so the assertion is laxer than the displayed message implies and Ruff (RUF043) flags it. Use a raw string with the dot escaped, or wrap with re.escape.

Proposed fix
 def test_default_action_fraction_requires_epsilon():
-    with pytest.raises(AttributeError, match="default_action_fraction requires epsilon to be defined."):
+    with pytest.raises(AttributeError, match=r"default_action_fraction requires epsilon to be defined\."):
         DummyMab(
             actions={"a1": Beta(), "a2": Beta()},
             strategy=ClassicBandit(),
             epsilon=None,
             default_action_fraction=0.5,
         )


 def test_default_action_fraction_requires_default_action():
-    with pytest.raises(AttributeError, match="default_action_fraction requires default_action to be defined."):
+    with pytest.raises(AttributeError, match=r"default_action_fraction requires default_action to be defined\."):
         DummyMab(
             actions={"a1": Beta(), "a2": Beta()},
             strategy=ClassicBandit(),
             epsilon=0.1,
             default_action=None,
             default_action_fraction=0.5,
         )

Also applies to: 194-194

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_mab.py` at line 184, The pytest.raises calls that assert the
message "default_action_fraction requires epsilon to be defined." use an
unescaped dot in the match regex; update the two occurrences of the
pytest.raises(match=...) assertions in tests/test_mab.py (the pytest.raises
calls around the message) to use a proper regex by escaping the final dot (e.g.,
use a raw string with the dot escaped like r"default_action_fraction requires
epsilon to be defined\." or wrap the literal with re.escape) so the test matches
the exact message.
pybandits/mab.py (1)

175-178: Validation order leaves a slightly misleading error in one combo.

If a user passes default_action_fraction together with default_action but no epsilon, the pre-existing check at lines 173-174 fires first ("A default action should only be defined when epsilon is defined.") instead of the new, more specific "default_action_fraction requires epsilon to be defined.". Functionally correct (still rejected) but the error doesn't mention the new field. Optional polish: surface the default_action_fraction requirement first when it's the user-supplied parameter.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pybandits/mab.py` around lines 175 - 178, Reorder the validations so the
checks that reference default_action_fraction run before the generic
default_action check: first verify that if self.default_action_fraction is not
None then self.epsilon is truthy (raise "default_action_fraction requires
epsilon to be defined.") and then verify that if self.default_action_fraction is
not None then self.default_action is not None (raise "default_action_fraction
requires default_action to be defined."); only after those should you run the
existing check that raises "A default action should only be defined when epsilon
is defined." for self.default_action alone, ensuring the more specific
default_action_fraction errors surface when that parameter was provided.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@pybandits/mab.py`:
- Around line 175-178: Reorder the validations so the checks that reference
default_action_fraction run before the generic default_action check: first
verify that if self.default_action_fraction is not None then self.epsilon is
truthy (raise "default_action_fraction requires epsilon to be defined.") and
then verify that if self.default_action_fraction is not None then
self.default_action is not None (raise "default_action_fraction requires
default_action to be defined."); only after those should you run the existing
check that raises "A default action should only be defined when epsilon is
defined." for self.default_action alone, ensuring the more specific
default_action_fraction errors surface when that parameter was provided.

In `@tests/test_mab.py`:
- Around line 238-254: The test name/test docstring is misleading: it asserts
the explore branch is taken (via a patched random.choice) but doesn't guarantee
the default action won't be drawn; update the docstring in
test_default_action_fraction_zero_matches_random_only to say it "must always go
through the random branch" (or similar) and mention the patched random.choice
path, referencing the test function
test_default_action_fraction_zero_matches_random_only,
DummyMab._select_epsilon_greedy_action, default_action_fraction=0.0, and the
patched random.choice to make the intent explicit.
- Line 184: The pytest.raises calls that assert the message
"default_action_fraction requires epsilon to be defined." use an unescaped dot
in the match regex; update the two occurrences of the pytest.raises(match=...)
assertions in tests/test_mab.py (the pytest.raises calls around the message) to
use a proper regex by escaping the final dot (e.g., use a raw string with the
dot escaped like r"default_action_fraction requires epsilon to be defined\." or
wrap the literal with re.escape) so the test matches the exact message.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0336677b-0836-4941-896a-4d2c812aa3fd

📥 Commits

Reviewing files that changed from the base of the PR and between 7af2bb0 and 8535d09.

📒 Files selected for processing (3)
  • README.md
  • pybandits/mab.py
  • tests/test_mab.py

@shaharbar1
Copy link
Copy Markdown
Collaborator

Commit message of your PR should follow this pattern:

<Title>

Changes:

Then,
PR name would be title and description would be the changes.

Comment thread pybandits/mab.py
Comment thread pybandits/mab.py Outdated
Comment thread tests/test_mab.py Outdated
Comment thread tests/test_mab.py Outdated
Comment thread README.md Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pybandits/mab.py`:
- Around line 527-531: Update the cold_start/default_action_fraction docstring
to match the actual type and validation: change the annotated type from
Optional[Float01] to Optional[PositiveFloat01] (use the PositiveFloat01 alias)
and reword the description to state that values must be in the open-closed
interval (0, 1] — explicitly note that 0.0 is invalid and will raise
ValidationError — and that 1.0 corresponds to "default-only" while None
preserves legacy behavior; ensure the field declaration default_action_fraction,
the __init__ parameter, and the class-level docstring for the MAB class use the
same wording and range to avoid contradictions.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 03cea33f-e7d7-49f8-bfa4-7f8cfaa3193b

📥 Commits

Reviewing files that changed from the base of the PR and between 8535d09 and 7bfc2c0.

📒 Files selected for processing (3)
  • pybandits/base.py
  • pybandits/mab.py
  • tests/test_mab.py
✅ Files skipped from review due to trivial changes (1)
  • pybandits/base.py

Comment thread pybandits/mab.py Outdated
@galih112 galih112 changed the title feat(mab): add default_action_fraction for mixed epsilon-greedy exploration feat(mab): Add default_action_fraction to epsilon greedy Apr 27, 2026
@galih112 galih112 changed the title feat(mab): Add default_action_fraction to epsilon greedy Add default_action_fraction to epsilon greedy Apr 27, 2026
@galih112 galih112 force-pushed the feature/eps_greedy_mixed branch from 8f17fbe to 7bfc2c0 Compare April 27, 2026 11:58
@galih112 galih112 added the enhancement New feature or request label Apr 27, 2026
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (3)
pybandits/mab.py (2)

185-188: Optional: tighten the epsilon precondition check.

not self.epsilon is truthy for both None and 0.0, but the error message says "requires epsilon to be defined." If a user passes epsilon=0.0, default_action_fraction=0.5, they'll get a message that contradicts what they wrote. Functionally 0.0 makes the explore branch unreachable anyway, so rejecting it is fine — but the message could be clearer (or the check could be self.epsilon is None to mirror the field's optionality).

♻️ Possible refinement
-        if self.default_action_fraction is not None and not self.epsilon:
-            raise AttributeError("default_action_fraction requires epsilon to be defined.")
+        if self.default_action_fraction is not None and not self.epsilon:
+            raise AttributeError(
+                "default_action_fraction requires a positive epsilon (the explore branch never fires otherwise)."
+            )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pybandits/mab.py` around lines 185 - 188, The check for epsilon uses "not
self.epsilon" which treats 0.0 the same as None and yields a misleading error;
update the precondition in the constructor/validation so it checks "self.epsilon
is None" (or alternatively leave the boolean check but change the message to say
"requires a non-None epsilon (0.0 disables exploration)" if you want to keep
rejecting 0.0), and keep the other validation for default_action; reference the
attributes default_action_fraction, epsilon, and default_action when making this
change.

98-105: Minor docstring nit: incomplete boundary semantics.

The "Boundary semantics" section lists only 1.0 and None. Consider adding the intermediate-value description (e.g., "any value in (0, 1) → Bernoulli mix between default_action and a uniformly-random arm on each explore step") so readers don't have to infer it from the surrounding text. Also, line 101's parenthetical "(always default when set)" is redundant with the None bullet on line 104 and reads slightly awkwardly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pybandits/mab.py` around lines 98 - 105, Update the docstring for the
parameter default_action_fraction (in the mab module) to explicitly document
intermediate values: state that any value x in (0,1) yields a Bernoulli mix on
each explore step where with probability x the explore branch returns
default_action and with probability 1-x it returns a uniformly-random arm (this
behavior is only meaningful when epsilon and default_action are set). Also
remove or reword the redundant parenthetical "(always default when set)" so the
None bullet and the 1.0 bullet are not repetitive.
tests/test_mab.py (1)

245-246: Optional: hoist the in-method imports to the top.

from unittest.mock import MagicMock (Lines 245, 270) and from pybandits.quantitative_model import Zooming (Line 272) are imported inside test methods. There's no circular-import reason here, so moving them next to the existing imports at the top of the file would be more idiomatic and avoid repeating the MagicMock import.

Also applies to: 270-272

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_mab.py` around lines 245 - 246, Move the in-method imports to the
module top: hoist "from unittest.mock import MagicMock" and "from
pybandits.quantitative_model import Zooming" out of the test functions and place
them alongside the existing file-level imports; then remove the duplicate
in-function import lines (the ones inside the test methods around where
MagicMock is imported on lines 245 and 270 and Zooming on line 272) so tests
reference MagicMock and Zooming from the top-level imports.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@pybandits/mab.py`:
- Around line 185-188: The check for epsilon uses "not self.epsilon" which
treats 0.0 the same as None and yields a misleading error; update the
precondition in the constructor/validation so it checks "self.epsilon is None"
(or alternatively leave the boolean check but change the message to say
"requires a non-None epsilon (0.0 disables exploration)" if you want to keep
rejecting 0.0), and keep the other validation for default_action; reference the
attributes default_action_fraction, epsilon, and default_action when making this
change.
- Around line 98-105: Update the docstring for the parameter
default_action_fraction (in the mab module) to explicitly document intermediate
values: state that any value x in (0,1) yields a Bernoulli mix on each explore
step where with probability x the explore branch returns default_action and with
probability 1-x it returns a uniformly-random arm (this behavior is only
meaningful when epsilon and default_action are set). Also remove or reword the
redundant parenthetical "(always default when set)" so the None bullet and the
1.0 bullet are not repetitive.

In `@tests/test_mab.py`:
- Around line 245-246: Move the in-method imports to the module top: hoist "from
unittest.mock import MagicMock" and "from pybandits.quantitative_model import
Zooming" out of the test functions and place them alongside the existing
file-level imports; then remove the duplicate in-function import lines (the ones
inside the test methods around where MagicMock is imported on lines 245 and 270
and Zooming on line 272) so tests reference MagicMock and Zooming from the
top-level imports.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: addb1e6f-8ce9-4369-9eea-01ed17906519

📥 Commits

Reviewing files that changed from the base of the PR and between 7bfc2c0 and 1806630.

📒 Files selected for processing (3)
  • pybandits/base.py
  • pybandits/mab.py
  • tests/test_mab.py
✅ Files skipped from review due to trivial changes (1)
  • pybandits/base.py

@galih112 galih112 requested a review from shaharbar1 May 4, 2026 11:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants