Reinforcement Learning for Autonomous UAV Navigation using ROS

A reinforcement learning project for autonomous drone navigation in indoor environments, built on ROS and Gazebo simulation.

This repository explores two reinforcement learning approaches for UAV navigation:

  • Q-Learning for discrete state and action spaces
  • DDPG for continuous control

Overview

Autonomous navigation for unmanned aerial vehicles (UAVs) is an important problem in robotics, especially in indoor environments where path planning and control must work together reliably.

This project studies reinforcement learning for drone navigation in simulation.
The main focus is to move a UAV from a start position to a target position inside a Gazebo environment while learning navigation behavior from interaction and rewards.

The repository includes:

  • a Q-Learning implementation for discrete navigation
  • an initial DDPG implementation for continuous control
  • a simulation workflow based on ROS and Gazebo

Methods Included

1. Q-Learning

The Q-Learning implementation is designed for discrete navigation.

Key characteristics:

  • State space: Discrete
  • Action space: Discrete
  • Navigation setup: 5 × 5 grid-based action space
  • Control strategy: PID-based movement with Q-Learning for decision making

In this setup, the UAV learns to move toward a goal location in a simulated indoor environment based on a reward policy.
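The core of this setup is the tabular Q-value update. A minimal sketch of that update is shown below; the state encoding, action names, and hyperparameter values are illustrative assumptions, not the repository's exact code:

```python
# Tabular Q-learning update (sketch; hyperparameters are illustrative)
ALPHA = 0.1   # learning rate
GAMMA = 0.9   # discount factor

def update_q(q_table, state, action, reward, next_state):
    """Apply the standard Q-learning Bellman update to one (state, action) entry."""
    best_next = max(q_table[next_state].values())
    q_table[state][action] += ALPHA * (
        reward + GAMMA * best_next - q_table[state][action]
    )
```

Here `q_table` is a plain dict of dicts mapping each discrete state to per-action value estimates; the PID controller is only responsible for executing the chosen action, while Q-Learning handles the decision making.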

2. DDPG

The repository also includes a Deep Deterministic Policy Gradient (DDPG) implementation.

This part of the project is intended for:

  • continuous action-space navigation
  • learning smoother control policies for drone motion

Note: the DDPG part is an early, under-development implementation.

Demo

Q-Learning navigation demo: see Q_learning_algorithm/drone_qlearning.gif

Repository Structure

.
├── DDPG/
│   └── DDPG.py
├── Q_learning_algorithm/
│   ├── Q-Learning.py
│   └── drone_qlearning.gif
└── README.md

Project Components

Q-Learning Module

Path: Q_learning_algorithm/Q-Learning.py

This module implements Q-Learning for autonomous UAV navigation in a discrete environment.

DDPG Module

Path: DDPG/DDPG.py

This module contains the DDPG-based implementation for continuous-control navigation experiments.

Environment and Dependencies

This project was originally developed with the following setup:

  • Ubuntu 16.04
  • ROS Kinetic
  • Gazebo 7
  • Python 2.7
  • TensorFlow 1.1.0
  • OpenAI Gym 0.9.3
  • ArDrone Autonomy ROS Package

Because this is a legacy ROS and TensorFlow project, it is best reproduced in a compatible Ubuntu 16.04 / ROS Kinetic environment.

Installation

1. Create a catkin workspace

mkdir -p ~/catkin_ws/src
cd ~/catkin_ws
catkin_make

2. Clone this repository

cd ~/catkin_ws/src
git clone https://github.com/Ilyas-Raza1214/UAVs_Q_learning_algorithm-for-drone-using-ROS.git

3. Install dependencies

Install the required ROS, Gazebo, and Python dependencies for your environment.

Typical setup includes:

  • ROS Kinetic
  • Gazebo 7
  • TensorFlow 1.1.0
  • OpenAI Gym 0.9.3
  • ArDrone Autonomy package

4. Build the workspace

cd ~/catkin_ws
catkin_make
source devel/setup.bash

How It Works

Q-Learning Navigation

The Q-Learning workflow follows these steps:

  1. Initialize the UAV in the Gazebo environment
  2. Observe the current discrete state
  3. Select an action from the discrete grid-based action space
  4. Move the UAV using PID-based control
  5. Receive a reward based on movement toward the goal
  6. Update the Q-values
  7. Repeat until the target is reached or the episode ends

DDPG Navigation

The DDPG workflow is intended to:

  1. Observe the environment state
  2. Use an actor network to output continuous actions
  3. Execute drone motion in simulation
  4. Learn a control policy using actor-critic updates
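The actor-critic mechanics behind these steps can be sketched with tiny linear stand-ins for the networks. This is an illustration only: the repository's DDPG/DDPG.py uses TensorFlow 1.x networks, and a complete implementation also needs a replay buffer, exploration noise, and gradient-based weight updates, all omitted here:

```python
import numpy as np

rng = np.random.default_rng(0)
STATE_DIM, ACTION_DIM = 3, 1      # assumed dimensions for illustration
TAU, GAMMA = 0.005, 0.99          # soft-update rate, discount factor

# Tiny linear "networks" stand in for the actor and critic.
W_actor = rng.normal(size=(ACTION_DIM, STATE_DIM))
W_critic = rng.normal(size=(1, STATE_DIM + ACTION_DIM))
W_actor_tgt, W_critic_tgt = W_actor.copy(), W_critic.copy()

def actor(W, s):
    """Deterministic policy: map a state to a continuous, bounded action."""
    return np.tanh(W @ s)

def critic(W, s, a):
    """Q(s, a) estimate for a state-action pair."""
    return (W @ np.concatenate([s, a])).item()

def td_target(r, s_next, done):
    """Bootstrapped target y = r + gamma * Q'(s', mu'(s')) from the target nets."""
    if done:
        return r
    a_next = actor(W_actor_tgt, s_next)
    return r + GAMMA * critic(W_critic_tgt, s_next, a_next)

def soft_update(tgt, src):
    """Polyak averaging: slowly track the learned weights in the target nets."""
    return TAU * src + (1.0 - TAU) * tgt
```

The critic is trained to match `td_target`, the actor is updated to maximize the critic's estimate, and `soft_update` keeps the target networks slowly trailing the learned ones, which is what stabilizes the bootstrapped targets.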

Applications

This type of project can be useful for:

  • autonomous drone navigation
  • indoor path planning
  • reinforcement learning in robotics
  • simulation-based UAV control
  • research on discrete and continuous control methods

Future Improvements

Possible future improvements include:

  • migrating the code to Python 3
  • migrating the stack to ROS Noetic or ROS 2
  • modernizing the implementation for newer TensorFlow or PyTorch versions
  • adding evaluation metrics and training curves
  • documenting launch files and runtime commands in detail
  • adding a demo for the DDPG implementation
  • improving reproducibility with a full dependency file

Notes

  • The Q-Learning part is the main demonstrated implementation in this repository.
  • The DDPG part is included as a continuous-control extension.
  • This repository is best viewed as a general reinforcement learning UAV navigation project rather than a single-algorithm implementation.

Author

Muhammad Ilyas Raza
Machine Learning Engineer | Computer Vision | Robotics | Applied AI

License

This project is distributed under the license included in this repository.
