Swathi-88/JARVIS-AI

Jarvis – Voice Controlled Desktop AI Assistant

A fully voice-controlled personal desktop assistant built in Python, inspired by Jarvis from the Iron Man films. The assistant can listen, speak, execute system actions, open apps, control volume and brightness, search YouTube, play songs on Spotify, interact with ChatGPT, and display an animated UI.

Important

Operating System: This project relies on system-level commands (e.g., os.system, ctypes, powershell) written specifically for Windows 10/11. It will not function correctly on macOS or Linux without modification.

Jarvis uses:

  • Speech Recognition
  • Text-To-Speech
  • OpenAI API
  • Tkinter animated GUI
  • Local OS automation
  • Custom command execution engine

🎯 About the Project

This project started from a very basic speech-to-text experiment using PyAudio & SpeechRecognition. No AI, no UI — just a simple goal: make the computer listen and respond.

Later, the project evolved step-by-step:

✔️ Phase 1 — Basic Voice Input

  • Started with PyAudio + speech_recognition
  • Recognized simple speech commands
  • Executed local Python functions
  • No AI used at this stage

✔️ Phase 2 — System Commands

  • Added OS-level control using os, webbrowser, ctypes, pyautogui
  • Could open apps, browsers, control volume/brightness

✔️ Phase 3 — Text-to-Speech

  • Added pyttsx3
  • Jarvis started speaking responses
  • Basic conversational ability

✔️ Phase 4 — Advanced Commands

  • YouTube search + auto-play
  • Spotify search
  • Open Google and run a search
  • Open any website
  • Open system apps
  • Shutdown, restart, sleep, lock
  • Close all apps

✔️ Phase 5 — GUI (Big Upgrade)

  • Built a full Tkinter UI
  • Added animated 3D globe effect
  • Added audio visualizer bars
  • Added typewriter text animation
  • Added fade-out effects
  • Clean “Jarvis-like” theme

✔️ Phase 6 — ChatGPT Integration

  • Added direct OpenAI API calls

  • Built a custom JSON action interpreter

  • Jarvis could now:

    • Understand natural language
    • Answer general questions
    • Use AI fallback when speech commands fail
    • Execute system actions through AI JSON

✔️ Phase 7 — Error Fixes & Enhancements

  • Fixed YouTube search
  • Fixed repeated answers
  • Fixed Spotify handling
  • Added command normalization
  • Added ambient noise reduction
  • Improved streaming text display
  • Added multi-threaded listening

The project has now evolved into a full-featured desktop AI assistant.


🚀 Features

🎤 Voice Recognition

  • Continuous listening mode
  • Google Speech Recognition
  • Ambient noise cancellation
  • Real-time display of what the user said

🔊 Text-to-Speech

  • pyttsx3 engine
  • Professional Jarvis-style narration
  • Adjustable speed + volume
  • UI typewriter animation

💻 System Control

  • Open apps:

    • CMD
    • PowerShell
    • File Explorer
    • Notepad
    • Calculator
    • Chrome
    • Spotify
    • YouTube
    • Settings
    • Task Manager
  • System power actions:

    • Shutdown
    • Restart
    • Sleep
    • Lock
  • Volume control:

    • Increase
    • Decrease
    • Mute
  • Brightness control:

    • Increase
    • Decrease

🌐 Internet & App Control

  • Open YouTube
  • Search + auto-play top YouTube result
  • Spotify play/search
  • ChatGPT website
  • ChatGPT voice mode

🔍 AI Features (OpenAI Integration)

  • General questions answered by GPT
  • Fallback to local commands if API fails
  • JSON action extraction
  • Unified command execution
  • Custom system prompt for Jarvis personality

🎨 Graphical UI (Tkinter)

  • Fully animated 3D rotating globe
  • Audio visualizer
  • Fade-out effects
  • Typewriter text rendering
  • Smooth UX
  • Modern neon theme
  • Central status panel (“Listening… / Processing…”)
  • Real-time user text + AI response

⚙️ Internal Architecture

  • Background listening thread
  • Robust command interpreter
  • Clean error handling
  • YouTube ID regex extraction
  • URL encoding
  • OS-level process control
  • App protection list (python, VSCode)
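The background listening thread can be sketched roughly as below. This is a minimal illustration, not the code in jarvis.py: `start_listener`, the `listen_once` callable, and the queue-based hand-off are all assumed names and design choices, chosen so that the Tkinter main loop is never blocked while waiting for speech.

```python
import queue
import threading
from typing import Callable, Optional


def start_listener(
    listen_once: Callable[[], Optional[str]],
    commands: "queue.Queue[str]",
    stop: threading.Event,
) -> threading.Thread:
    """Run listen_once() in a daemon thread and queue every recognized phrase.

    listen_once is a hypothetical callable that blocks until it has heard
    something and returns the recognized text, or None if nothing was understood.
    """

    def worker() -> None:
        while not stop.is_set():
            text = listen_once()
            if text:  # skip None and empty strings
                commands.put(text)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread
```

The UI side would then poll `commands` (for example with `get_nowait()` from a periodic Tkinter `after` callback) and call `stop.set()` on exit.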

🏗️ Project Structure

Jarvis Assistant/
│
├── jarvis.py                # Main assistant code  
├── requirements.txt         # Dependencies  
├── README.md                # Documentation  
└── assets/                  # (Optional) icons, images  

🛠️ Installation Guide

1. Clone the repo

git clone https://github.com/yourusername/jarvis-assistant
cd jarvis-assistant

2. Install dependencies

pip install -r requirements.txt

Required Libraries:

speechrecognition
pyttsx3
pyautogui
psutil
screen_brightness_control
openai
requests
tkinter (built-in)
pyaudio

3. Install PyAudio (Windows fix)

If PyAudio fails:

pip install pipwin
pipwin install pyaudio

4. Set OpenAI API Key

Inside the code:

OPENAI_KEY = "your-key-here"

Or use an environment variable (setx only takes effect in terminals opened afterwards):

setx OPENAI_API_KEY "your-key-here"
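Reading the key from the environment keeps it out of the source file. The helper below is a sketch under that assumption; `load_openai_key` is a hypothetical name and is not taken from jarvis.py.

```python
import os


def load_openai_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    Raises RuntimeError with a hint if the variable is missing or blank.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set. Set it with setx and open a new terminal."
        )
    return key
```

With this in place, the hardcoded assignment could become `OPENAI_KEY = load_openai_key()`.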

5. Run the assistant

python jarvis.py

🎙️ Available Voice Commands

App Commands

open chrome
open calculator
open file explorer
open chatgpt
open youtube
play despacito on youtube
play songs on spotify

System Commands

shutdown
restart
lock system
increase volume
decrease brightness
close all apps

AI Questions

who is sundar pichai?
what is quantum computing?
explain python in simple words

UI/Exit Commands

bye
stop
exit

🧠 How the Assistant Works (Architecture Explanation)

🔹 1. Speech Input

Speech → Microphone → Google STT → Text
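A rough sketch of this step using the SpeechRecognition package is shown below. The function names `listen_once` and `normalize_command`, the timeout values, and the exact normalization rules are assumptions for illustration; the actual implementation may differ.

```python
import re
from typing import Optional


def normalize_command(text: str) -> str:
    """Lowercase the text, drop punctuation, and collapse repeated whitespace."""
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return re.sub(r"\s+", " ", text).strip()


def listen_once(timeout: float = 5.0, phrase_time_limit: float = 8.0) -> Optional[str]:
    """Capture one phrase from the default microphone and return normalized text.

    Returns None if nothing was heard, the audio was unintelligible,
    or the Google recognition service could not be reached.
    """
    import speech_recognition as sr  # imported lazily; needs PyAudio for the microphone

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        # Calibrate the energy threshold against background noise first.
        recognizer.adjust_for_ambient_noise(source, duration=0.5)
        try:
            audio = recognizer.listen(
                source, timeout=timeout, phrase_time_limit=phrase_time_limit
            )
        except sr.WaitTimeoutError:
            return None
    try:
        return normalize_command(recognizer.recognize_google(audio))
    except (sr.UnknownValueError, sr.RequestError):
        return None
```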

🔹 2. Local Command Engine

Jarvis checks if your command matches:

  • system actions
  • app open
  • YouTube
  • Spotify
  • volume/brightness

If matched → executes instantly.
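One way to picture the local matching step is a small table of patterns tried in order. The sketch below is illustrative only: the rule set, the `dispatch` function, and the string results are assumptions, and the handlers return a description of the action instead of actually touching the system.

```python
import re
from typing import Callable, List, Optional, Tuple

Handler = Callable[["re.Match[str]"], str]

# Hypothetical rules; each handler only describes the action it would perform.
RULES: List[Tuple["re.Pattern[str]", Handler]] = [
    (re.compile(r"^play (?P<query>.+) on youtube$"), lambda m: f"youtube:{m['query']}"),
    (re.compile(r"^play (?P<query>.+) on spotify$"), lambda m: f"spotify:{m['query']}"),
    (re.compile(r"^open (?P<app>.+)$"), lambda m: f"open_app:{m['app']}"),
    (
        re.compile(r"^(?P<direction>increase|decrease) (?P<target>volume|brightness)$"),
        lambda m: f"{m['target']}:{m['direction']}",
    ),
    (
        re.compile(r"^(?P<action>shutdown|restart|sleep|lock)( system)?$"),
        lambda m: f"power:{m['action']}",
    ),
]


def dispatch(command: str) -> Optional[str]:
    """Return the result of the first matching rule, or None if nothing matches.

    A None result is the signal to fall through to the AI step.
    """
    for pattern, handler in RULES:
        match = pattern.match(command)
        if match:
            return handler(match)
    return None
```

For example, `dispatch("play despacito on youtube")` returns `"youtube:despacito"`, while a question such as `"what is quantum computing"` returns `None`.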

🔹 3. AI Processing (If needed)

If the command is NOT recognized locally:

  • Sent to ChatGPT

  • GPT responds with:

    • A normal text answer
    • OR a JSON command like:
{
  "action": { "type":"open_app", "params":{ "app":"calculator" } },
  "speak": "Opening calculator"
}
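Telling the two reply types apart could look something like the sketch below. `parse_reply` is a hypothetical helper, and the approach (take the outermost braces and try to parse them as JSON) is one simple strategy assumed here, not necessarily the one the project uses.

```python
import json
from typing import Optional, Tuple


def parse_reply(reply: str) -> Tuple[Optional[dict], str]:
    """Split a model reply into (action, text_to_speak).

    If the reply contains a JSON object with an "action" dict, return that
    action and its "speak" text. Otherwise return (None, reply) so the caller
    treats it as a plain answer.
    """
    start = reply.find("{")
    end = reply.rfind("}")
    if start != -1 and end > start:
        try:
            data = json.loads(reply[start : end + 1])
        except json.JSONDecodeError:
            data = None
        if isinstance(data, dict) and isinstance(data.get("action"), dict):
            return data["action"], str(data.get("speak", ""))
    return None, reply.strip()
```

The returned action dict (with its `type` and `params` keys) can then be handed to the same command execution path as locally matched commands.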

🔹 4. UI Rendering

  • Typewriter text
  • Animated globe
  • Audio bars
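The typewriter effect can be approximated with Tkinter's `after` scheduling, as in the sketch below. The `typewriter` and `demo` functions, the delay, and the font are illustrative assumptions; the globe and audio bars are not covered here.

```python
def typewriter(widget, text: str, delay_ms: int = 30, index: int = 0) -> None:
    """Reveal text one character at a time on a Label-like widget.

    The widget only needs config(text=...) and after(ms, func, *args),
    which tkinter.Label provides.
    """
    widget.config(text=text[:index])
    if index < len(text):
        # Schedule the next frame instead of sleeping, so the UI stays responsive.
        widget.after(delay_ms, typewriter, widget, text, delay_ms, index + 1)


def demo() -> None:
    """Open a small window and animate one line (requires a display)."""
    import tkinter as tk

    root = tk.Tk()
    label = tk.Label(root, font=("Consolas", 14))
    label.pack(padx=20, pady=20)
    typewriter(label, "Opening calculator")
    root.mainloop()
```

Calling `demo()` from a script shows the effect. Scheduling frames through `after` avoids `time.sleep`, which would freeze the window.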

🧩 Troubleshooting

Speech issues?

Try:

pipwin install pyaudio

Make sure the microphone is selected as the input device.

Repeated AI answers?

Fixed by:

  • Improved JSON extraction
  • Stripping duplicate replies

YouTube not opening?

Fixed using:

  • Regex video ID extractor
  • Correct search URL
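The two fixes above might be sketched as follows. This assumes the YouTube results page embeds entries of the form `"videoId":"<11 characters>"` in its HTML, which can change without notice; `search_url` and `first_video_url` are hypothetical helper names.

```python
import re
from typing import Optional
from urllib.parse import quote_plus

# Assumed pattern: video IDs are 11 characters from [A-Za-z0-9_-].
VIDEO_ID_RE = re.compile(r'"videoId":"([A-Za-z0-9_-]{11})"')


def search_url(query: str) -> str:
    """Build a URL-encoded YouTube search URL for the query."""
    return "https://www.youtube.com/results?search_query=" + quote_plus(query)


def first_video_url(html: str) -> Optional[str]:
    """Return a watch URL for the first video ID found in the page HTML, if any."""
    match = VIDEO_ID_RE.search(html)
    if match is None:
        return None
    return "https://www.youtube.com/watch?v=" + match.group(1)
```

The page HTML would be fetched separately (for example with `requests.get(search_url(query)).text`), and the resulting watch URL opened with `webbrowser.open`.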

🏆 Future Improvements (Optional)

  • Offline STT with Whisper
  • Real-time wake word (“Hey Jarvis”)
  • Better 3D UI
  • Add system tray mode
  • Add weather, news APIs
  • Add email automation
  • Add reminders + notes

📌 License

Free to use for personal projects.

About

A voice-controlled desktop AI assistant for Windows featuring OpenAI integration, system automation, and an animated GUI.
