Jarvis – Voice Controlled Desktop AI Assistant

A fully voice-controlled personal desktop assistant built in Python, inspired by Jarvis from the Iron Man films. The assistant can listen, speak, execute system actions, open apps, control volume and brightness, search YouTube, play songs on Spotify, interact with ChatGPT, and display a fully animated UI.

Important

Operating System: This project relies on system-level calls (e.g., os.system, ctypes, PowerShell) that are designed specifically for Windows 10/11. It will not function correctly on macOS or Linux without modification.

Jarvis uses:

Speech Recognition
Text-To-Speech
OpenAI API
Tkinter animated GUI
Local OS automation
Custom command execution engine

🎯 About the Project

This project started from a very basic speech-to-text experiment using PyAudio & SpeechRecognition. No AI, no UI — just a simple goal: make the computer listen and respond.

Later, the project evolved step-by-step:

✔️ Phase 1 — Basic Voice Input

Started with PyAudio + speech_recognition
Recognized simple speech commands
Executed local Python functions
No AI used at this stage

✔️ Phase 2 — System Commands

Added OS-level control using os, webbrowser, ctypes, pyautogui
Could open apps, browsers, control volume/brightness

✔️ Phase 3 — Text-to-Speech

Added pyttsx3
Jarvis started speaking responses
Basic conversational ability

✔️ Phase 4 — Advanced Commands

YouTube search + auto-play
Spotify search
Open Google and search
Open any website
Open system apps
Shutdown, restart, sleep, lock
Close all apps

✔️ Phase 5 — GUI (Big Upgrade)

Built a full Tkinter UI
Added animated 3D globe effect
Added audio visualizer bars
Added typewriter text animation
Added fade-out effects
Clean “Jarvis-like” theme

✔️ Phase 6 — ChatGPT Integration

Added direct OpenAI API calls
Built a custom JSON action interpreter
Jarvis could now:
- Understand natural language
- Answer general questions
- Use AI fallback when speech commands fail
- Execute system actions through AI JSON

✔️ Phase 7 — Error Fixes & Enhancements

Fixed YouTube search
Fixed repeated answers
Fixed Spotify handling
Added command normalization
Added ambient noise reduction
Improved streaming text display
Added multi-threaded listening
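
As an illustration of the command normalization step, here is a minimal sketch. The filler-word list and the function name `normalize_command` are assumptions made for this example; the project's actual rules are not shown in this README.

```python
import re

# Hypothetical filler words; the real list used by Jarvis may differ.
FILLER_WORDS = {"hey", "jarvis", "please", "can", "could", "you"}


def normalize_command(text: str) -> str:
    """Lowercase, strip punctuation, drop filler words, collapse whitespace."""
    text = text.lower()
    # Replace anything that is not an ASCII letter, digit, or whitespace with a space.
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    words = [word for word in text.split() if word not in FILLER_WORDS]
    return " ".join(words)
```

With this sketch, "Hey Jarvis, please OPEN Chrome!" and "open chrome" reduce to the same string, so one matching rule can cover both. Note that the regex drops non-ASCII characters, which would need adjusting for non-English commands.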

The project has now evolved into a full-featured desktop AI assistant.

🚀 Features

🎤 Voice Recognition

Continuous listening mode
Google Speech Recognition
Ambient noise cancellation
Real-time display of what the user said

🔊 Text-to-Speech

pyttsx3 engine
Professional Jarvis-style narration
Adjustable speed + volume
UI typewriter animation

💻 System Control

Open apps:
- CMD
- PowerShell
- File Explorer
- Notepad
- Calculator
- Chrome
- Spotify
- YouTube
- Settings
- Task Manager
System power actions:
- Shutdown
- Restart
- Sleep
- Lock
Volume control:
- Increase
- Decrease
- Mute
Brightness control:
- Increase
- Decrease
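
The power actions above can be sketched as a lookup table of Windows commands. This is an assumed layout, not the project's actual code: the names `POWER_COMMANDS`, `power_command`, and `run_power_action` are hypothetical, and sleep is left out because its behaviour depends on the machine's hibernation settings.

```python
import subprocess
import sys

# Assumed mapping from action names to Windows command lines.
POWER_COMMANDS = {
    "shutdown": ["shutdown", "/s", "/t", "0"],   # power off immediately
    "restart": ["shutdown", "/r", "/t", "0"],    # reboot immediately
    "lock": ["rundll32.exe", "user32.dll,LockWorkStation"],
}


def power_command(action: str) -> list:
    """Return a copy of the command line for a power action."""
    try:
        return list(POWER_COMMANDS[action])
    except KeyError:
        raise ValueError(f"unknown power action: {action!r}") from None


def run_power_action(action: str, dry_run: bool = True) -> list:
    """Run the action, or only return its command when dry_run is True."""
    cmd = power_command(action)
    if not dry_run:
        if sys.platform != "win32":
            raise OSError("power actions in this sketch are Windows-only")
        subprocess.run(cmd, check=True)
    return cmd
```

Defaulting to `dry_run=True` is a deliberate choice for a sketch: a misheard command returns the command line instead of shutting the machine down.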

🌐 Internet & App Control

Open YouTube
Search + auto-play top YouTube result
Spotify play/search
ChatGPT website
ChatGPT voice mode

🔍 AI Features (OpenAI Integration)

General questions answered by GPT
Fallback to local commands if API fails
JSON action extraction
Unified command execution
Custom system prompt for Jarvis personality

🎨 Graphical UI (Tkinter)

Fully animated 3D rotating globe
Audio visualizer
Fade-out effects
Typewriter text rendering
Smooth UX
Modern neon theme
Central status panel (“Listening… / Processing…”)
Real-time user text + AI response

⚙️ Internal Architecture

Background listening thread
Robust command interpreter
Clean error handling
YouTube ID regex extraction
URL encoding
OS-level process control
App protection list (python, VSCode)
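
To show how an app protection list might work with "close all apps", here is a small sketch. The entries in `PROTECTED` and the helper name `select_closable` are assumptions for illustration only.

```python
# Hypothetical protection list; the README only names python and VSCode.
PROTECTED = {"python.exe", "pythonw.exe", "code.exe"}


def select_closable(process_names, protected=PROTECTED):
    """Return process names that are safe to close, without duplicates.

    Comparison is case-insensitive because Windows process names are.
    """
    protected_lower = {name.lower() for name in protected}
    seen = set()
    closable = []
    for name in process_names:
        key = name.lower()
        if key in protected_lower or key in seen:
            continue
        seen.add(key)
        closable.append(name)
    return closable
```

In a real run the names could come from `psutil.process_iter(["name"])`, with each selected process ended via its `terminate()` method. Keeping the filter as a pure function makes it easy to check without closing anything.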

🏗️ Project Structure

Jarvis Assistant/
│
├── jarvis.py                # Main assistant code  
├── requirements.txt         # Dependencies  
├── README.md                # Documentation  
└── assets/                  # (Optional) icons, images

🛠️ Installation Guide

1. Clone the repo

git clone https://github.com/yourusername/jarvis-assistant
cd jarvis-assistant

2. Install dependencies

pip install -r requirements.txt

Required Libraries:

speechrecognition
pyttsx3
pyautogui
psutil
screen_brightness_control
openai
requests
tkinter (built-in)
pyaudio

3. Install PyAudio (Windows fix)

If PyAudio fails:

pip install pipwin
pipwin install pyaudio

4. Set OpenAI API Key

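Either set it inside the code (do not commit a real key to version control):

OPENAI_KEY = "your-key-here"

Or, preferably, use an environment variable. Note that setx only affects new terminal sessions, so open a new terminal before running Jarvis: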

setx OPENAI_API_KEY "your-key-here"
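
When the environment variable is used, the key can be read at startup along these lines. This is a sketch; `load_api_key` is a hypothetical helper, not a function from the project.

```python
import os


def load_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Read the API key from the environment and fail early if it is missing."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before starting Jarvis"
        )
    return key
```

Failing at startup gives a clearer error than a rejected API request in the middle of a voice session.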

5. Run the assistant

python jarvis.py

🎙️ Available Voice Commands

App Commands

open chrome
open calculator
open file explorer
open chatgpt
open youtube
play despacito on youtube
play songs on spotify

System Commands

shutdown
restart
lock system
increase volume
decrease brightness
close all apps

AI Questions

who is sundar pichai?
what is quantum computing?
explain python in simple words

UI/Exit Commands

bye
stop
exit

🧠 How the Assistant Works (Architecture Explanation)

🔹 1. Speech Input

Speech → Microphone → Google STT → Text
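
The pipeline above, combined with the background listening thread, could look roughly like this. It is a sketch under assumptions: `listen_once` and `start_listener` are hypothetical names, and the 0.5 second calibration time is an arbitrary choice.

```python
import queue
import threading
from typing import Callable, Optional


def listen_once() -> Optional[str]:
    """Capture one phrase from the microphone and send it to Google STT."""
    # Imported lazily so the threading part can run without a microphone.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        # Calibrate the energy threshold against background noise.
        recognizer.adjust_for_ambient_noise(source, duration=0.5)
        audio = recognizer.listen(source)
    try:
        return recognizer.recognize_google(audio)
    except (sr.UnknownValueError, sr.RequestError):
        # Speech was unintelligible or the service was unreachable.
        return None


def start_listener(
    recognize: Callable[[], Optional[str]],
    out_queue: "queue.Queue[str]",
    stop_event: threading.Event,
) -> threading.Thread:
    """Run recognize() in a daemon thread and queue every recognized phrase."""

    def worker() -> None:
        while not stop_event.is_set():
            text = recognize()
            if text:
                out_queue.put(text)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread
```

A caller would pass `listen_once` as `recognize` and poll the queue from the UI loop, so the Tkinter window stays responsive while the microphone is open.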

🔹 2. Local Command Engine

Jarvis checks if your command matches:

system actions
app open
YouTube
Spotify
volume/brightness

If matched → executes instantly.
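
One simple way to structure such a matcher is a list of prefix rules checked in order. The rule table and handler names below are hypothetical, and the handlers only return a description string instead of performing the action.

```python
from typing import Callable, List, Optional, Tuple

Handler = Callable[[str], str]


def _open(target: str) -> str:
    # Stand-in for launching an app or website.
    return f"open:{target}"


def _play(target: str) -> str:
    # Stand-in for a YouTube or Spotify search.
    return f"play:{target}"


# Assumed rule table: (command prefix, handler for the remaining text).
RULES: List[Tuple[str, Handler]] = [
    ("open ", _open),
    ("play ", _play),
    ("increase volume", lambda rest: "volume:up"),
    ("decrease volume", lambda rest: "volume:down"),
]


def run_local(command: str) -> Optional[str]:
    """Return the handler result, or None when no local rule matches."""
    for prefix, handler in RULES:
        if command.startswith(prefix):
            return handler(command[len(prefix):].strip())
    return None
```

A `None` result is the signal to hand the command over to the AI step.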

🔹 3. AI Processing (If needed)

If the command is not recognized locally:

It is sent to ChatGPT
GPT responds with either:
- A normal text answer
- Or a JSON command like:

{
  "action": { "type":"open_app", "params":{ "app":"calculator" } },
  "speak": "Opening calculator"
}
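
A reply in either form could be handled along these lines. This is a minimal sketch that assumes the JSON object, when present, is the only brace-delimited part of the reply; `parse_reply` is a hypothetical name.

```python
import json
from typing import Optional, Tuple


def parse_reply(reply: str) -> Tuple[Optional[dict], str]:
    """Split a model reply into (action, text to speak).

    action is None when the reply is a plain-text answer.
    """
    start = reply.find("{")
    end = reply.rfind("}")
    if start != -1 and end > start:
        try:
            data = json.loads(reply[start:end + 1])
        except json.JSONDecodeError:
            data = None
        if isinstance(data, dict) and isinstance(data.get("action"), dict):
            return data["action"], str(data.get("speak", ""))
    # No usable JSON: treat the whole reply as the spoken answer.
    return None, reply.strip()
```

Falling back to plain text on malformed JSON means a bad reply is read aloud rather than crashing the assistant.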

🔹 4. UI Rendering

Typewriter text
Animated globe
Audio bars
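
The typewriter effect can be built on Tkinter's `after` scheduling. The sketch below assumes the text is shown in a widget that accepts `config(text=...)`, such as a `tkinter.Label`; the function name and the 30 ms delay are illustrative.

```python
def typewriter(widget, text, delay_ms=30, _index=0):
    """Reveal text one character at a time without blocking the UI loop."""
    # Show the first _index characters.
    widget.config(text=text[:_index])
    if _index < len(text):
        # Schedule the next character; after() returns immediately.
        widget.after(delay_ms, typewriter, widget, text, delay_ms, _index + 1)
```

Usage would be `typewriter(label, "Opening calculator")` after the label is created. Because each step is scheduled with `after` rather than `time.sleep`, the globe and visualizer animations can keep running while text appears.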

🧩 Troubleshooting

Speech issues?

If PyAudio is missing or fails to load, reinstall it:

pipwin install pyaudio

Also make sure the correct microphone is selected as the default input device.

Repeated AI answers?

Fixed by:

Improved JSON extraction
Stripping duplicate replies

YouTube not opening?

Fixed using:

Regex video ID extractor
Correct search URL
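
For reference, the two fixes could be sketched as below. The `"videoId":"..."` pattern is an assumption about how YouTube embeds IDs in its results page, and that markup can change without notice; the function names are hypothetical.

```python
import re
from typing import Optional
from urllib.parse import quote_plus

# Assumed pattern: YouTube video IDs are 11 URL-safe characters.
VIDEO_ID_RE = re.compile(r'"videoId":"([A-Za-z0-9_-]{11})"')


def search_url(query: str) -> str:
    """Build a results-page URL with the query safely URL-encoded."""
    return "https://www.youtube.com/results?search_query=" + quote_plus(query)


def first_video_url(results_html: str) -> Optional[str]:
    """Return a watch URL for the first video ID found, or None."""
    match = VIDEO_ID_RE.search(results_html)
    if match is None:
        return None
    return "https://www.youtube.com/watch?v=" + match.group(1)
```

If `first_video_url` returns `None`, opening the plain search URL in the browser is a reasonable fallback.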

🏆 Future Improvements (Optional)

Offline STT with Whisper
Real-time wake word (“Hey Jarvis”)
Better 3D UI
Add system tray mode
Add weather, news APIs
Add email automation
Add reminders + notes

📌 License

Free to use for personal projects.
