rai-sukant/grid_agent

Q-Learning Grid World Solver

This project implements a Reinforcement Learning agent that learns to navigate a 6×6 grid. The agent must find the optimal path from a starting position to a goal while avoiding static obstacles.

🎮 The Environment

The world is a coordinate-based grid where:

  • Grid Size: 6×6

  • Start Position (S): (5, 0) (Bottom-Left)

  • Goal Position (G): (0, 5) (Top-Right)

  • Obstacles (X): Static blocks at fixed coordinates; hitting one incurs the -10 penalty listed under Reward Structure.
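The coordinate convention above can be sketched as follows: positions are (row, col) pairs with row 0 at the top, which is why (5, 0) is the bottom-left cell. The obstacle coordinates and the `render_grid` helper are hypothetical, since this README does not list the actual obstacle cells.

```python
GRID_SIZE = 6
START = (5, 0)   # (row, col): bottom-left
GOAL = (0, 5)    # (row, col): top-right
# Hypothetical obstacle cells; the README does not list the real ones.
OBSTACLES = {(1, 1), (2, 3), (4, 2)}


def render_grid(size=GRID_SIZE, start=START, goal=GOAL, obstacles=OBSTACLES):
    """Return the grid as text: S = start, G = goal, X = obstacle, . = free."""
    rows = []
    for r in range(size):
        cells = []
        for c in range(size):
            if (r, c) == start:
                cells.append("S")
            elif (r, c) == goal:
                cells.append("G")
            elif (r, c) in obstacles:
                cells.append("X")
            else:
                cells.append(".")
        rows.append(" ".join(cells))
    return "\n".join(rows)


print(render_grid())
```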

🧠 Reinforcement Learning Logic

The agent uses the Q-Learning algorithm to populate a 3D Q-Table of shape (6, 6, 4): one Q-value for each of the 4 possible actions (Up, Right, Down, Left) in every grid cell.
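As a rough illustration, the table and the standard Q-learning update might look like the sketch below. It uses nested lists to stay dependency-free; the repository may well use a NumPy array instead. The names `make_q_table` and `q_update` are illustrative and not taken from the project.

```python
ACTIONS = ["Up", "Right", "Down", "Left"]  # action indices 0..3, in the README's order
ALPHA = 0.3    # learning rate
GAMMA = 0.95   # discount factor


def make_q_table(size=6, n_actions=4):
    """Zero-initialised table indexed as q[row][col][action]."""
    return [[[0.0] * n_actions for _ in range(size)] for _ in range(size)]


def q_update(q, state, action, reward, next_state, done, alpha=ALPHA, gamma=GAMMA):
    """Apply one Q-learning update in place and return the new value."""
    r, c = state
    nr, nc = next_state
    # A terminal transition has no future value to bootstrap from.
    best_next = 0.0 if done else max(q[nr][nc])
    target = reward + gamma * best_next
    q[r][c][action] += alpha * (target - q[r][c][action])
    return q[r][c][action]


q = make_q_table()
# Moving Right from (0, 4) into the goal at (0, 5) with reward +100:
# the new value is 0 + 0.3 * (100 - 0).
print(q_update(q, (0, 4), 1, 100.0, (0, 5), done=True))
```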

Key Hyperparameters

  • Alpha (α): 0.3 (Learning Rate)

  • Gamma (γ): 0.95 (Discount Factor)

  • Epsilon (ϵ): Starts at 0.9 and decays over time to balance exploration and exploitation.
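The epsilon schedule is only described qualitatively here, so the sketch below assumes a multiplicative per-episode decay with a floor. The decay factor 0.999 and the floor 0.05 are assumptions, as are the helper names.

```python
import random

EPSILON_START = 0.9
EPSILON_MIN = 0.05     # assumed floor, not stated in the README
EPSILON_DECAY = 0.999  # assumed per-episode multiplier, not stated in the README


def choose_action(q_values, epsilon, rng=random):
    """Epsilon-greedy: a random action with probability epsilon, else the argmax."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def decay_epsilon(epsilon, decay=EPSILON_DECAY, minimum=EPSILON_MIN):
    """Shrink epsilon multiplicatively, never going below the floor."""
    return max(minimum, epsilon * decay)


epsilon = EPSILON_START
for _ in range(1000):
    epsilon = decay_epsilon(epsilon)
print(round(epsilon, 3))  # exploration rate after 1,000 episodes
```

With these assumed values the agent still explores in roughly a third of its moves after 1,000 episodes and reaches the floor well before 20,000.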

Reward Structure

  • Goal Reach: +100

  • Obstacle/Boundary Hit: -10

  • Each Step: -0.5 (Encourages the shortest path)
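One way to express this reward structure is a single `step` function, sketched below. The obstacle cells are hypothetical, and the sketch assumes that an illegal move (off the grid or into an obstacle) leaves the agent in place and does not end the episode; the repository may handle that case differently.

```python
GRID_SIZE = 6
GOAL = (0, 5)
OBSTACLES = {(1, 1), (2, 3), (4, 2)}  # hypothetical; the README gives no coordinates
# Row/column deltas for Up, Right, Down, Left (row 0 is the top row).
MOVES = [(-1, 0), (0, 1), (1, 0), (0, -1)]


def step(state, action):
    """Return (next_state, reward, done) for one move.

    Assumption: an illegal move (off-grid or into an obstacle) leaves the
    agent where it was, costs -10, and the episode continues.
    """
    row, col = state
    d_row, d_col = MOVES[action]
    nxt = (row + d_row, col + d_col)
    off_grid = not (0 <= nxt[0] < GRID_SIZE and 0 <= nxt[1] < GRID_SIZE)
    if off_grid or nxt in OBSTACLES:
        return state, -10.0, False
    if nxt == GOAL:
        return nxt, 100.0, True
    return nxt, -0.5, False


print(step((5, 0), 0))  # legal move Up from the start cell
print(step((5, 0), 3))  # Left from the start cell hits the boundary
```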

🚀 How to Run

  1. Initialize the Q-Table with zeros.

  2. Run the train_agent() function for 20,000 episodes.

  3. Use visualize_best_grid(q_table) to view the learned policy in the console.
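The steps above can be sketched end to end as shown below. This is a minimal stand-in, not the repository's code: the obstacle layout, the epsilon decay schedule, the per-episode step cap, and the `greedy_path` helper are all assumptions, and `train_agent` here takes parameters that the original function may not have.

```python
import random

SIZE, START, GOAL = 6, (5, 0), (0, 5)
OBSTACLES = {(1, 1), (2, 3), (4, 2)}          # hypothetical layout
MOVES = [(-1, 0), (0, 1), (1, 0), (0, -1)]    # Up, Right, Down, Left
ALPHA, GAMMA = 0.3, 0.95


def step(state, action):
    """Illegal moves keep the agent in place (an assumption) and cost -10."""
    nxt = (state[0] + MOVES[action][0], state[1] + MOVES[action][1])
    if not (0 <= nxt[0] < SIZE and 0 <= nxt[1] < SIZE) or nxt in OBSTACLES:
        return state, -10.0, False
    if nxt == GOAL:
        return nxt, 100.0, True
    return nxt, -0.5, False


def train_agent(episodes=20_000, max_steps=200, seed=0):
    """Tabular Q-learning; returns a table indexed as q[row][col][action]."""
    rng = random.Random(seed)
    # Step 1: initialise the Q-table with zeros.
    q = [[[0.0] * 4 for _ in range(SIZE)] for _ in range(SIZE)]
    epsilon = 0.9
    for _episode in range(episodes):
        state = START
        for _ in range(max_steps):
            values = q[state[0]][state[1]]
            if rng.random() < epsilon:
                action = rng.randrange(4)            # explore
            else:
                action = values.index(max(values))   # exploit
            nxt, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[nxt[0]][nxt[1]])
            values[action] += ALPHA * (reward + GAMMA * best_next - values[action])
            state = nxt
            if done:
                break
        epsilon = max(0.05, epsilon * 0.999)         # assumed decay schedule
    return q


def greedy_path(q, max_steps=50):
    """Follow the highest-valued action from START until GOAL or the step cap."""
    state, path = START, [START]
    for _ in range(max_steps):
        values = q[state[0]][state[1]]
        state, _reward, done = step(state, values.index(max(values)))
        path.append(state)
        if done:
            break
    return path


# Fewer episodes than the 20,000 used in the README, to keep this check quick.
q_table = train_agent(episodes=5_000)
print(greedy_path(q_table))
```

The fixed seed makes runs repeatable; dropping it gives a different exploration history on each run.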

Final Learned Path

Below is the visual representation of the agent's optimal policy after training. Arrows indicate the action with the highest Q-value for each state.
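A policy view of this kind can be produced from any Q-table with a few lines. The sketch below assumes the table is a nested list indexed as `q_table[row][col][action]`; `policy_arrows` is an illustrative name rather than the project's `visualize_best_grid`, and the demo table is hand-filled, not trained output.

```python
ARROWS = ["↑", "→", "↓", "←"]  # same order as the actions: Up, Right, Down, Left


def policy_arrows(q_table, goal=(0, 5), obstacles=frozenset()):
    """Render one arrow per cell for the action with the highest Q-value."""
    lines = []
    for r, row in enumerate(q_table):
        cells = []
        for c, values in enumerate(row):
            if (r, c) == goal:
                cells.append("G")
            elif (r, c) in obstacles:
                cells.append("X")
            else:
                cells.append(ARROWS[values.index(max(values))])
        lines.append(" ".join(cells))
    return "\n".join(lines)


# Hand-filled demo table (not trained values): prefer Right on the top row, Up elsewhere.
demo = [[[0.0, 1.0, 0.0, 0.0] if r == 0 else [1.0, 0.0, 0.0, 0.0]
         for _ in range(6)] for r in range(6)]
print(policy_arrows(demo, obstacles={(2, 3)}))
```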


🛠 Features

  • Epsilon-Greedy Policy: Ensures the agent explores the grid thoroughly before settling on a path.

  • Boundary Protection: The is_valid_state function prevents the agent from leaving the 6×6 area.

  • Detailed Visualization: Formatted console output using :7.2f for aligned and readable Q-values.
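The boundary check and the `:7.2f` formatting might look like the sketch below. The signature of `is_valid_state` is an assumption (the README names the function but not its parameters), and `format_q_values` is a hypothetical helper.

```python
GRID_SIZE = 6


def is_valid_state(row, col, size=GRID_SIZE):
    """True when (row, col) lies inside the size x size grid."""
    return 0 <= row < size and 0 <= col < size


def format_q_values(values):
    """Format each Q-value 7 characters wide with 2 decimals so columns align."""
    return " ".join(f"{v:7.2f}" for v in values)


print(is_valid_state(5, 0), is_valid_state(-1, 0), is_valid_state(0, 6))
print(format_q_values([59.3, -10.0, 0.0, 100.0]))
```

A width of 7 leaves room for a sign, up to three integer digits, the decimal point, and two decimals, which covers the -10 to +100 reward range.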
