Skip to content

Eval cleanups - timing and env.step() #308

Merged
kantneel merged 13 commits intomainfrom
eval-cleanups
Aug 27, 2025
Merged

Eval cleanups - timing and env.step() #308
kantneel merged 13 commits intomainfrom
eval-cleanups

Conversation

@kantneel
Copy link
Copy Markdown
Collaborator

@kantneel kantneel commented Aug 19, 2025

Refactor Environment Timing and Pause/Unpause System

This PR implements a comprehensive refactor of the Factorio Learning Environment's timing system, removes the problematic value accrual mechanism, and adds robust pause/unpause functionality.

Key Changes

Removed Value Accrual Time from Environment

  • Eliminated value_accrual_time parameter from FactorioGymEnv.__init__()
  • Removed artificial sleep delays that were causing timing issues and making episodes unnecessarily slow
  • Reordered env.step() logic to calculate rewards immediately after action execution rather than after waiting
  • Actions now execute and return rewards instantly, dramatically improving episode speed

Enhanced Instance Pause/Unpause System

  • Added comprehensive pause/unpause functionality to FactorioInstance
  • Separated speed and pause controls - speed settings are preserved when pausing/unpausing
  • Added convenience methods:
    • set_speed_and_unpause() - common use case for starting actions
    • pause_at_speed() - set speed but pause immediately
    • toggle_pause() - switch between paused/unpaused states
    • is_paused() - check current pause state
  • Added pause_after_action parameter to FactorioGymEnv - enables pausing between trajectory steps
  • Visual feedback - pause/unpause actions now display timestamped messages in-game

Server Admin Configuration

  • Added server-adminlist.json to enable /editor mode and other admin commands
  • Includes "client_master" as admin to allow full server control during development/testing

Comprehensive Test Suite Additions

  • Added test_elapsed_ticks.py with 15+ new tests covering:
    • Sleep timing at different game speeds
    • Move, craft, and harvest timing validation
    • Real-world sleep duration accuracy
    • Tick accumulation across multiple actions
    • Edge cases like very low speeds
  • Enhanced existing tests with proper pause/unpause handling
  • Added game_info observation tests to verify tick tracking in observations

Achievements System Cleanup

  • Simplified achievements calculation by removing AchievementTracker class
  • Flattened achievement logic into a single calculate_achievements() function
  • Removed achievements from task metadata to reduce complexity
  • More maintainable and straightforward achievement processing

Tool Timing Improvements

  • Updated all action tools (move_to, craft_item, harvest_resource, connect_entities) to:
    • Track elapsed ticks accurately
    • Sleep for appropriate real-world time based on game speed
    • Provide timing feedback via console output
  • Redesigned sleep() tool to work with standard tick calculations rather than polling
  • Fixed movement calculations to use consistent character speed (0.15 tiles/tick)

API Improvements

  • Made num_agents optional in task configurations (defaults to 1)
  • Added command-line argument parsing to run_eval.py
  • Updated observation formatting to include game timing information

Migration Notes

  • value_accrual_time parameter removed from FactorioGymEnv - remove from instantiation calls
  • Use set_speed_and_unpause() instead of separate set_speed() calls in most cases
  • Tests may need pause_after_action=False for immediate execution scenarios

This refactor significantly improves the environment's usability for both training and evaluation while maintaining full backward compatibility for core functionality.

@kiankyars
Copy link
Copy Markdown
Collaborator

I'm going to finnish off what I started on #305 but I see you're modifying achievements so I'll wait until you finish this.

@kantneel kantneel changed the title Eval cleanups Eval cleanups - timing and env.step() Aug 22, 2025
@kantneel kantneel marked this pull request as ready for review August 22, 2025 15:50
@JackHopkins
Copy link
Copy Markdown
Owner

Nice work!

  • Why is there a set_speed_and_unpause as well as set_speed - it seems that the latter method is sufficient? What is the motivation for 2 methods?
  • I'm not sold on the toggle, I think we should rely on set_speed (to zero or not) to explicitly control this state instead (or use your new method).
  • Are the tests in test_elapsed_ticks added to the CI/CD suite? If not, it would be a good idea to include them.

@hrshtt
Copy link
Copy Markdown
Collaborator

hrshtt commented Aug 25, 2025

is it possible to parametrize the new tests?
in the following manner:

@pytest.mark.parametrize("speed", range(1, 10)) # 10 independent items

sleeping tests done on a single worker starve the other workers in distributed testing, parameterizing as well as using --dist loadscope ensures they are (naively) distributed across workers.

@kiankyars
Copy link
Copy Markdown
Collaborator

Nice work!

  • Why is there a set_speed_and_unpause as well as set_speed - it seems that the latter method is sufficient? What is the motivation for 2 methods?

seems to be a convenience method

Copy link
Copy Markdown
Collaborator

@kiankyars kiankyars left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reviewed the API improvements only and the other changes look great for QOL from the surface.

Comment thread fle/env/gym_env/run_eval.py Outdated
@kantneel kantneel merged commit 1fb9188 into main Aug 27, 2025
3 checks passed
JackHopkins pushed a commit that referenced this pull request Sep 29, 2025
* clean up achievements; fix value accrual time; report flows better

* use pause, remove value accrual time

* make clients sleep correct time, add more speed and pausing methods to instance, add tests

* server adminlist

* clean up code, add more Instance methods, render pause message, tests passing'

* add tests for elapsed ticks

* fix run_eval

* game control;

* tests

* tests

* task info

* game control and medium electric poles

* change prints, max achieved throughput
JackHopkins pushed a commit that referenced this pull request Sep 29, 2025
* clean up achievements; fix value accrual time; report flows better

* use pause, remove value accrual time

* make clients sleep correct time, add more speed and pausing methods to instance, add tests

* server adminlist

* clean up code, add more Instance methods, render pause message, tests passing'

* add tests for elapsed ticks

* fix run_eval

* game control;

* tests

* tests

* task info

* game control and medium electric poles

* change prints, max achieved throughput
JackHopkins pushed a commit that referenced this pull request Sep 29, 2025
* clean up achievements; fix value accrual time; report flows better

* use pause, remove value accrual time

* make clients sleep correct time, add more speed and pausing methods to instance, add tests

* server adminlist

* clean up code, add more Instance methods, render pause message, tests passing'

* add tests for elapsed ticks

* fix run_eval

* game control;

* tests

* tests

* task info

* game control and medium electric poles

* change prints, max achieved throughput
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants