A reinforcement learning project for autonomous drone navigation in indoor environments using ROS, Gazebo, and simulation-based control.
This repository explores two reinforcement learning approaches for UAV navigation:
- Q-Learning for discrete state and action spaces
- DDPG for continuous control
Autonomous navigation for unmanned aerial vehicles (UAVs) is an important problem in robotics, especially in indoor environments where path planning and control must work together reliably.
This project studies reinforcement learning for drone navigation in simulation.
The main objective is to move a UAV from a start position to a target position inside a Gazebo environment, learning navigation behavior from interaction and rewards.
The repository includes:
- a Q-Learning implementation for discrete navigation
- an initial DDPG implementation for continuous control
- a simulation workflow based on ROS and Gazebo
The Q-Learning implementation is designed for discrete navigation.
Key characteristics:
- State space: Discrete
- Action space: Discrete
- Navigation setup: 5 × 5 grid-based action space
- Control strategy: PID-based movement with Q-Learning for decision making
In this setup, the UAV learns to move toward a goal location in a simulated indoor environment based on a reward policy.
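As an illustration of such a reward policy (this exact shaping is an assumption for the sketch, not taken from the repository), a distance-based reward might penalize remaining distance each step and pay a bonus on arrival:

```python
import math

def distance_reward(pos, goal, goal_radius=0.2):
    """Illustrative reward shaping: penalize distance to goal, bonus on arrival."""
    d = math.dist(pos, goal)          # Euclidean distance (Python 3.8+)
    if d < goal_radius:
        return 100.0                  # reached the target
    return -d                         # closer positions receive higher reward
```

Because the per-step reward grows as the UAV approaches the goal, any value-based learner is pushed toward goal-directed motion even before it first reaches the target.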
The repository also includes a Deep Deterministic Policy Gradient (DDPG) implementation.
This part of the project is intended for:
- continuous action-space navigation
- learning smoother control policies for drone motion
Note: the DDPG part is currently an early-stage implementation and still under development.
```
.
├── DDPG/
│   └── DDPG.py
├── Q_learning_algorithm/
│   ├── Q-Learning.py
│   └── drone_qlearning.gif
└── README.md
```
Path: Q_learning_algorithm/Q-Learning.py
This module implements Q-Learning for autonomous UAV navigation in a discrete environment.
Path: DDPG/DDPG.py
This module contains the DDPG-based implementation for continuous-control navigation experiments.
This project was originally developed with the following setup:
- Ubuntu 16.04
- ROS Kinetic
- Gazebo 7
- Python 2.7
- TensorFlow 1.1.0
- OpenAI Gym 0.9.3
- `ardrone_autonomy` ROS package
Because this is a legacy ROS and TensorFlow project, it is best reproduced in a compatible Ubuntu 16.04 / ROS Kinetic environment.
```bash
mkdir -p ~/catkin_ws/src
cd ~/catkin_ws
catkin_make

cd ~/catkin_ws/src
git clone https://github.com/Ilyas-Raza1214/UAVs_Q_learning_algorithm-for-drone-using-ROS.git
```

Install the required ROS, Gazebo, and Python dependencies for your environment.
Typical setup includes:
- ROS Kinetic
- Gazebo 7
- TensorFlow 1.1.0
- OpenAI Gym 0.9.3
- `ardrone_autonomy` package
```bash
cd ~/catkin_ws
catkin_make
source devel/setup.bash
```

The Q-Learning workflow follows these steps:
- Initialize the UAV in the Gazebo environment
- Observe the current discrete state
- Select an action from the discrete grid-based action space
- Move the UAV using PID-based control
- Receive a reward based on movement toward the goal
- Update the Q-values
- Repeat until the target is reached or the episode ends
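The steps above can be sketched as an episode loop. The stub environment below stands in for the Gazebo/ROS simulation, and all hyperparameters are illustrative, not taken from the repository:

```python
import random
from collections import defaultdict

class StubEnv:
    """Tiny 1-D stand-in for the Gazebo environment (illustrative only)."""
    def __init__(self, goal=4):
        self.goal = goal
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):  # action: 1 = toward goal, 0 = away from goal
        delta = 1 if action == 1 else -1
        self.state = max(0, min(self.goal, self.state + delta))
        done = self.state == self.goal
        return self.state, (1.0 if done else -0.1), done

def run_episode(env, Q, alpha=0.1, gamma=0.9, eps=0.1, max_steps=50):
    """One Q-Learning episode: observe, act (epsilon-greedy), reward, update."""
    s = env.reset()
    for _ in range(max_steps):
        if random.random() < eps:
            a = random.randrange(2)                   # explore
        else:
            a = max((0, 1), key=lambda x: Q[(s, x)])  # exploit current Q-values
        nxt, r, done = env.step(a)
        best_next = max(Q[(nxt, 0)], Q[(nxt, 1)])
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])  # TD update
        s = nxt
        if done:
            break
    return s
```

With `Q = defaultdict(float)`, repeated calls to `run_episode(StubEnv(), Q)` drive the greedy policy toward the goal state; in the actual project, `StubEnv.step` would be replaced by the PID-controlled motion in Gazebo.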
The DDPG workflow is intended to:
- Observe the environment state
- Use an actor network to output continuous actions
- Execute drone motion in simulation
- Learn a control policy using actor-critic updates
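Two pieces of the actor-critic update above can be sketched in miniature (the hyperparameter values are illustrative, not from the repository): the critic regresses toward the TD target y = r + gamma * Q'(s', mu'(s')), and both target networks track the online networks via Polyak averaging:

```python
import numpy as np

GAMMA, TAU = 0.99, 0.005   # illustrative discount and soft-update rate

def critic_target(reward, next_q, done, gamma=GAMMA):
    """TD target for the critic; bootstrapping is cut off at terminal states."""
    return reward + gamma * next_q * (1.0 - done)

def soft_update(target_params, online_params, tau=TAU):
    """DDPG-style Polyak averaging: target <- tau * online + (1 - tau) * target."""
    return [tau * w + (1.0 - tau) * t
            for w, t in zip(online_params, target_params)]
```

The slow-moving target networks are what keep the bootstrapped critic target stable while the online actor and critic change every step.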
This type of project can be useful for:
- autonomous drone navigation
- indoor path planning
- reinforcement learning in robotics
- simulation-based UAV control
- research on discrete and continuous control methods
Possible future improvements include:
- migrating the code to Python 3
- migrating the stack to ROS Noetic or ROS 2
- modernizing the implementation for newer TensorFlow or PyTorch versions
- adding evaluation metrics and training curves
- documenting launch files and runtime commands in detail
- adding a demo for the DDPG implementation
- improving reproducibility with a full dependency file
- The Q-Learning part is the main demonstrated implementation in this repository.
- The DDPG part is included as a continuous-control extension.
- This repository is best viewed as a reinforcement learning UAV navigation project rather than a single-algorithm implementation.
Muhammad Ilyas Raza
Machine Learning Engineer | Computer Vision | Robotics | Applied AI
This project is distributed under the license included in this repository.
