Desktop Automation System

A robust, modular voice-controlled automation suite designed for Windows environments. This system integrates speech recognition with low-level system controls, web automation, and file management to streamline daily desktop workflows.

📋 Overview

This project serves as a central interface for controlling a Windows workstation hands-free. Unlike standard voice assistants, this tool focuses on local execution and system automation, allowing users to manipulate windows, organize files, launch applications, and query specific web tools through custom command handlers.

It features a lightweight Tkinter GUI for visual status feedback (Listening, Processing, Idle) without consuming significant system resources.

🚀 Key Capabilities

🖥️ System & Power Control

Window Management: Minimize specific windows or clear the desktop via voice.
Audio Control: Adjust system volume or mute instantly.
Power Functions: Distinct voice commands shut down, restart, or hibernate the Windows PC; separate commands shut down or restart only the assistant script itself.
Battery Monitoring: Real-time feedback on battery health and percentage.
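
As a rough illustration of the power functions above, the sketch below maps spoken phrases to the standard Windows shutdown command. The phrase "restart computer", the POWER_COMMANDS table, and both function names are assumptions made for this example, not code taken from logic.py. It defaults to a dry run so that nothing is executed by accident.

```python
import subprocess

# Hypothetical phrase-to-command table; the real project may use different phrases.
POWER_COMMANDS = {
    "shutdown computer": ["shutdown", "/s", "/t", "0"],   # power off immediately
    "restart computer": ["shutdown", "/r", "/t", "0"],    # reboot immediately
    "hibernate computer": ["shutdown", "/h"],             # hibernate
}


def power_argv(spoken):
    """Return the argument list for a recognized power phrase, or None."""
    return POWER_COMMANDS.get(spoken.strip().lower())


def run_power_command(spoken, dry_run=True):
    """Run the matching power command; return False if the phrase is unknown."""
    argv = power_argv(spoken)
    if argv is None:
        return False
    if not dry_run:
        # Only reached when the caller explicitly opts in.
        subprocess.run(argv, check=True)
    return True
```

Keeping PC power phrases in their own table, apart from phrases such as "Shutdown assistant", makes it harder for a command aimed at the script to power off the machine.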

📂 Intelligent File Management

Auto-Organizer: One command (clean downloads) automatically sorts files in the Downloads directory into subfolders (Images, Documents, Installers, Audio, Video) based on file extensions.
File Creation: Generate folders and empty files instantly using voice commands.
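
The Auto-Organizer described above could look roughly like the following. This is a minimal sketch, not the project's implementation: the extension lists, the CATEGORIES table, and the function names are assumptions chosen for illustration. Files with unrecognized extensions, and files whose name already exists in the target folder, are left where they are.

```python
import shutil
from pathlib import Path

# Illustrative extension map; the real lists in the project may differ.
CATEGORIES = {
    "Images": {".jpg", ".jpeg", ".png", ".gif"},
    "Documents": {".pdf", ".docx", ".txt"},
    "Installers": {".exe", ".msi"},
    "Audio": {".mp3", ".wav"},
    "Video": {".mp4", ".mkv"},
}


def category_for(path):
    """Return the subfolder name for a file, or None if the extension is unknown."""
    ext = path.suffix.lower()
    for folder, extensions in CATEGORIES.items():
        if ext in extensions:
            return folder
    return None


def clean_downloads(downloads):
    """Move recognized files into category subfolders; return {filename: folder}."""
    moved = {}
    for item in list(downloads.iterdir()):
        if not item.is_file():
            continue  # skip existing subfolders
        folder = category_for(item)
        if folder is None:
            continue  # unknown extension: leave the file in place
        target_dir = downloads / folder
        target_dir.mkdir(exist_ok=True)
        target = target_dir / item.name
        if target.exists():
            continue  # do not overwrite a file with the same name
        shutil.move(str(item), str(target))
        moved[item.name] = folder
    return moved
```

Calling clean_downloads(Path(DOWNLOADS_PATH)) with the path from config.py would apply it to the real Downloads folder.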

🌐 Web & Application Integration

Universal Launcher: Uses pyautogui to interact with the Windows Start Menu, allowing the launching of any indexed application (e.g., "Open VS Code", "Launch Spotify").
Browser Automation: Configurable to work with specific browsers (Brave, Chrome, Edge, etc.).
Direct Portal Access: Hardcoded shortcuts for developer tools (GitHub, Claude, Gemini) and media (YouTube Music).
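
One way the search shortcuts above might turn a spoken phrase into a URL is sketched below. The prefix strings mirror the command examples in this README, but the SEARCH_TEMPLATES table, the build_search_url function, and the choice of Google for generic searches are assumptions for illustration.

```python
from urllib.parse import quote_plus

# Hypothetical prefix-to-URL templates; {query} is replaced with the encoded search text.
SEARCH_TEMPLATES = {
    "search github for ": "https://github.com/search?q={query}",
    "youtube search ": "https://www.youtube.com/results?search_query={query}",
    "search for ": "https://www.google.com/search?q={query}",
}


def build_search_url(spoken):
    """Return a search URL for a recognized phrase, or None if nothing matches."""
    text = spoken.strip().lower()
    for prefix, template in SEARCH_TEMPLATES.items():
        if text.startswith(prefix):
            query = text[len(prefix):].strip()
            if query:
                return template.format(query=quote_plus(query))
    return None
```

The resulting URL could then be passed to the browser configured in BROWSER_PATH, or to the standard library's webbrowser.open for the system default.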

🛠️ Utilities

Push-to-Talk Security: Listens only when a specific key is held, preventing accidental triggers.
Timer, Date & Time: Built-in countdown timers and current time/date retrieval.
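
A countdown timer like the one listed above can be built on the standard library alone. The sketch below is an assumption about how such a feature might work, not the project's code: parse_timer and start_timer are hypothetical names, and the parser only handles digits, so a recognizer that returns "fifteen" instead of "15" would need an extra conversion step.

```python
import re
import threading

UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600}


def parse_timer(spoken):
    """Extract a duration in seconds from text like 'set timer for 15 minutes'."""
    match = re.search(r"(\d+)\s*(second|minute|hour)s?", spoken.lower())
    if match is None:
        return None
    return int(match.group(1)) * UNIT_SECONDS[match.group(2)]


def start_timer(spoken, on_done):
    """Start a background timer that calls on_done when it expires."""
    seconds = parse_timer(spoken)
    if seconds is None:
        return None
    timer = threading.Timer(seconds, on_done)
    timer.daemon = True  # do not keep the process alive just for the timer
    timer.start()
    return timer
```

Because threading.Timer runs in its own thread, the assistant can keep listening for commands while the countdown is in progress.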

📂 Project Structure

main.py: The entry point. Initializes the GUI thread and the Logic thread concurrently.
logic.py: The core engine. Handles Speech-to-Text, Text-to-Speech, and command execution.
gui.py: Defines the Face class, a reactive Tkinter interface that visualizes the assistant's state.
config.py: Central configuration file for paths, API keys, and user preferences.
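
The relationship between the logic thread and the GUI can be pictured as a worker that publishes status strings to a queue the interface reads. The sketch below is a simplified, GUI-free stand-in: logic_worker, drain, and run_once are hypothetical names, and the real Face class is not reproduced here. In a Tkinter program the drain step would typically be scheduled from the GUI's own thread, since Tkinter widgets are generally expected to be updated from the thread that created them.

```python
import queue
import threading


def logic_worker(states, stop):
    """Stand-in for the logic thread: publishes status changes for the GUI."""
    for state in ("Listening", "Processing", "Idle"):
        if stop.is_set():
            break
        states.put(state)


def drain(states):
    """Collect every pending status without blocking (what a GUI poll would do)."""
    seen = []
    while True:
        try:
            seen.append(states.get_nowait())
        except queue.Empty:
            return seen


def run_once():
    """Start the worker, wait for it to finish, and return the statuses it sent."""
    states = queue.Queue()
    stop = threading.Event()
    worker = threading.Thread(target=logic_worker, args=(states, stop), daemon=True)
    worker.start()
    worker.join()
    return drain(states)
```

A queue keeps the two sides decoupled: the logic code never touches widgets directly, and the GUI never blocks on speech recognition.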

⚙️ Setup & Installation

Prerequisites

Windows 10 or 11 (Required for pypiwin32 and system calls).
Python 3.8+.
A working microphone.

Installation Steps

1. Clone the repository:
   git clone https://github.com/starJeet000/Desktop-Automation-System.git
   cd Desktop-Automation-System
2. Install dependencies:
   pip install -r requirements.txt
   Note: pypiwin32 is critical for audio driver access on Windows.
3. Configure the environment: open config.py and update the following:
   - BROWSER_PATH: Point this to your actual browser executable (e.g., chrome.exe, brave.exe).
   - DOWNLOADS_PATH: Verify the path to your Downloads folder.
   - PTT_KEY: Set your preferred Push-to-Talk key (default: Right Alt).
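
For orientation, a filled-in config.py might resemble the fragment below. Only the three setting names come from this README; the values are placeholders, the key-name format "right alt" depends on the keyboard library the project uses, and config_problems is a hypothetical helper added here to catch bad paths early.

```python
# config.py (illustrative values only; adjust to your machine)
from pathlib import Path

BROWSER_PATH = r"C:\Program Files\Google\Chrome\Application\chrome.exe"
DOWNLOADS_PATH = str(Path.home() / "Downloads")
PTT_KEY = "right alt"  # assumed key-name format


def config_problems():
    """Return a list of human-readable problems with the configured paths."""
    problems = []
    if not Path(BROWSER_PATH).is_file():
        problems.append("BROWSER_PATH does not point to an existing file")
    if not Path(DOWNLOADS_PATH).is_dir():
        problems.append("DOWNLOADS_PATH is not an existing folder")
    return problems
```

Running config_problems() once at startup gives a clear message instead of a failure later, when the first browser or file command is spoken.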

▶️ Usage

Run the main script:
python main.py
The GUI dashboard will appear.
Hold the configured Push-to-Talk key (e.g., Right Alt).
Speak a command (see examples below).
Release the key to execute.

🗣️ Command Syntax Examples

Context	Command Example	Function
Launcher	"Open Notepad"	Opens app via Start Menu
Search	"Search for Python documentation"	Opens query in default browser
Files	"Clean downloads"	Sorts files into subdirectories
App Power	"Shutdown assistant"	Safely closes the AI script
PC Power	"Hibernate computer"	Puts Windows into hibernation
PC Power	"Shutdown computer"	Shuts down the Windows PC
Dev	"Search GitHub for react native"	Searches GitHub repositories
Media	"YouTube search lo-fi beats"	Searches and opens YouTube
Utils	"Set timer for 15 minutes"	Starts a background timer
Info	"What is the time"	Speaks current time

🧩 Extension

To add a new command, open logic.py and find the execute_command method. Add a new elif block that checks for your keyword string and runs the corresponding Python logic.
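
The elif pattern described above might look like the sketch below. It is written as a standalone function returning the text to speak, which is an assumption: the real execute_command is a method in logic.py and its signature is not shown in this README. The "say hello" branch stands in for a newly added command.

```python
from datetime import datetime


def execute_command(command):
    """Match a keyword in the recognized text and return the reply to speak."""
    text = command.strip().lower()
    if "timer" in text:
        # Checked before "time" because "timer" contains "time".
        return "Starting a timer."
    elif "time" in text:
        return datetime.now().strftime("The time is %H:%M")
    elif "say hello" in text:
        # New command: add one elif with a keyword and its logic.
        return "Hello!"
    else:
        return "Sorry, I did not understand that."
```

With substring matching, branch order matters: a more specific keyword should be tested before any shorter keyword it contains.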
