👁️ EyeGuard

Real-time eye-tracking focus monitor for Linux. EyeGuard watches your gaze via webcam and triggers an escalating audible alarm when you look away from the screen — helping you stay focused during deep work sessions.

Built with Python, OpenCV, MediaPipe, and PyQt6.


Features

  • Real-time gaze tracking — combines Eye Aspect Ratio (EAR), Iris Position Ratio (IPR), and head pose estimation
  • Smart debounce — hysteresis state machine prevents false alarms on brief glances away
  • Escalating alerts — alarm intensifies if you remain unfocused (soft → loud after 15s)
  • Eyes closed / no face detection — separate warnings for closed eyes and leaving the frame
  • System tray integration — color-coded status icon, pause/resume, live settings dialog
  • Headless mode — run without GUI via --headless (great for servers or Wayland without tray support)
  • Live configuration — TOML config at ~/.config/eyeguard/config.toml with thread-safe runtime updates
  • Multi-threaded — capture, processing, and UI in separate threads for smooth performance
  • Docker support — run in a container with webcam and audio passthrough

Quick Start

# Clone and install
git clone <repo-url> && cd eyeguard
pip install -r requirements.txt

# Run with system tray
python -m eyeguard

# Run headless with debug logging
python -m eyeguard --headless --debug

# Or install as a package
pip install -e .
eyeguard

CLI Options

Flag Description
--headless Run without system tray UI
--debug Enable verbose debug logging

Docker

Run EyeGuard in a container with webcam and audio passthrough.

Quick Start (Docker Compose)

docker compose up --build

This builds the image and runs the container in headless mode with webcam and PulseAudio access.

Manual Docker Build & Run

# Build
docker build -t eyeguard .

# Run (headless with webcam + audio)
docker run --rm \
  --device /dev/video0:/dev/video0 \
  -v ${XDG_RUNTIME_DIR}/pulse:/run/user/1000/pulse:ro \
  -e PULSE_SERVER=unix:/run/user/1000/pulse/native \
  eyeguard

# Run with debug logging
docker run --rm \
  --device /dev/video0:/dev/video0 \
  -v ${XDG_RUNTIME_DIR}/pulse:/run/user/1000/pulse:ro \
  -e PULSE_SERVER=unix:/run/user/1000/pulse/native \
  eyeguard --headless --debug

Persist Configuration

Mount your host config directory so settings survive container restarts:

docker run --rm \
  --device /dev/video0:/dev/video0 \
  -v ${XDG_RUNTIME_DIR}/pulse:/run/user/1000/pulse:ro \
  -e PULSE_SERVER=unix:/run/user/1000/pulse/native \
  -v ~/.config/eyeguard:/root/.config/eyeguard \
  eyeguard

Notes

  • The container runs in headless mode by default (no GUI). Override with --entrypoint if you need X11 forwarding.
  • Webcam: the host device /dev/video0 is passed through. Change the path if your camera uses a different device.
  • Audio: PulseAudio socket is shared read-only from the host. Works with both PulseAudio and PipeWire (via its PulseAudio compatibility layer).

Architecture

High-Level Overview

The application follows a capture → detect → analyze → decide → alert pipeline across three threads:

flowchart LR
    subgraph Capture["🎥 Capture Thread"]
        CAM[Webcam] --> CT[CaptureThread]
    end

    subgraph Processing["⚙️ Processing Thread"]
        FD[FaceDetector\nMediaPipe Face Mesh] --> GA[GazeEstimator\nEAR · IPR · Head Pose]
        GA --> DE[DecisionEngine\nFinite State Machine]
        DE --> AM[AlertManager\nSound Playback]
    end

    subgraph UI["🖥️ Main Thread"]
        TRAY[System Tray\nPyQt6]
        SD[Settings Dialog]
        TRAY --> SD
    end

    CT -->|Frame Queue| FD
    DE -.->|pyqtSignal| TRAY
    SD -.->|LiveConfig| DE
    AM -.->|simpleaudio| SOUND[🔔 Alarm]

    style Capture fill:#e8f5e9,stroke:#4caf50
    style Processing fill:#e3f2fd,stroke:#2196f3
    style UI fill:#fff3e0,stroke:#ff9800

Data Processing Pipeline

Each frame flows through detection, then gaze analysis (three parallel metrics), and finally into the decision engine:

flowchart TB
    subgraph Input
        FRAME["Video Frame\n(BGR, 640×480)"]
    end

    subgraph Detection
        MP["MediaPipe Face Mesh\n478 Landmarks"]
    end

    subgraph Gaze["Gaze Analysis"]
        direction LR
        EAR["Eye Aspect Ratio\n(EAR)\nOpen vs Closed"]
        IPR["Iris Position Ratio\n(IPR)\nGaze Direction"]
        HP["Head Pose\nsolvePnP\nYaw & Pitch"]
    end

    subgraph Decision
        GR["GazeResult\n11 float metrics"]
        FSM["Focus State Machine"]
    end

    subgraph Output
        EVT["FocusEvent"]
        direction LR
        ALARM["🔔 Trigger Alarm"]
        ESC["⚠️ Escalate"]
        CANCEL["✅ Cancel"]
    end

    FRAME --> MP
    MP --> EAR & IPR & HP
    EAR & IPR & HP --> GR
    GR --> FSM
    FSM --> EVT
    EVT --> ALARM & ESC & CANCEL

    style Input fill:#f3e5f5,stroke:#9c27b0
    style Detection fill:#e8f5e9,stroke:#4caf50
    style Gaze fill:#e3f2fd,stroke:#2196f3
    style Decision fill:#fff3e0,stroke:#ff9800
    style Output fill:#ffebee,stroke:#f44336

Focus Decision State Machine

The FocusDecisionEngine uses a finite state machine with hysteresis debounce to avoid false alarms. States transition based on gaze metrics and configurable timeouts:

stateDiagram-v2
    [*] --> FOCUSED

    FOCUSED --> POSSIBLY_UNFOCUSED : Gaze off-screen
    FOCUSED --> EYES_CLOSED : EAR < threshold
    FOCUSED --> NO_FACE : No face detected

    POSSIBLY_UNFOCUSED --> FOCUSED : Gaze returns
    POSSIBLY_UNFOCUSED --> UNFOCUSED : After 3s timeout

    UNFOCUSED --> FOCUSED : Gaze returns\n→ CANCEL_ALARM
    UNFOCUSED --> ALARM_ACTIVE : TRIGGER_ALARM

    ALARM_ACTIVE --> FOCUSED : Gaze returns\n→ CANCEL_ALARM
    ALARM_ACTIVE --> ALARM_ESCALATED : After 15s\n→ ESCALATE_ALARM

    ALARM_ESCALATED --> FOCUSED : Gaze returns\n→ CANCEL_ALARM

    EYES_CLOSED --> FOCUSED : Eyes open
    EYES_CLOSED --> UNFOCUSED : After 2s\n→ WARN_EYES_CLOSED

    NO_FACE --> FOCUSED : Face detected
    NO_FACE --> UNFOCUSED : After 5s\n→ WARN_NO_FACE
State Description Tray Color
FOCUSED User is looking at the screen 🟢 Green
POSSIBLY_UNFOCUSED Gaze drifted, within tolerance window 🟠 Orange
UNFOCUSED Confirmed off-screen after timeout 🔴 Red
ALARM_ACTIVE Alarm is sounding 🔴 Red
ALARM_ESCALATED Alarm escalated to loud mode 🔴 Red
EYES_CLOSED Eyes detected as closed 🟠 Orange
NO_FACE No face detected in frame ⚪ Grey
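
The hysteresis debounce above can be sketched as a small state machine. This is a minimal illustration of the FOCUSED → POSSIBLY_UNFOCUSED → UNFOCUSED leg only, not the project's actual FocusDecisionEngine; the class and method names are made up for the example:

```python
import time
from enum import Enum, auto

class State(Enum):
    FOCUSED = auto()
    POSSIBLY_UNFOCUSED = auto()
    UNFOCUSED = auto()

class HysteresisDebounce:
    """Off-screen gaze must persist for `timeout` seconds before it
    counts as UNFOCUSED; any on-screen frame snaps back to FOCUSED."""

    def __init__(self, timeout=3.0, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock          # injectable clock makes this testable
        self.state = State.FOCUSED
        self._since = None          # when the gaze first drifted off-screen

    def update(self, looking_at_screen: bool) -> State:
        now = self.clock()
        if looking_at_screen:
            self.state, self._since = State.FOCUSED, None
        elif self.state is State.FOCUSED:
            self.state, self._since = State.POSSIBLY_UNFOCUSED, now
        elif (self.state is State.POSSIBLY_UNFOCUSED
              and now - self._since >= self.timeout):
            self.state = State.UNFOCUSED
        return self.state
```

Because the clock is injected, the 3-second timeout can be exercised in tests without sleeping.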

Threading & Sequence

Three threads coordinate via a frame queue and Qt signals:

sequenceDiagram
    participant W as Webcam
    participant CT as Capture Thread
    participant Q as Frame Queue
    participant PT as Processing Thread
    participant DE as DecisionEngine
    participant AM as AlertManager
    participant SIG as pyqtSignal
    participant UI as Main Thread (Qt)

    loop Every ~66ms (15 FPS)
        W->>CT: Read frame
        CT->>Q: Put frame (drop oldest)
    end

    loop Every ~66ms (15 FPS)
        Q->>PT: Get latest frame
        PT->>PT: Detect face + Analyze gaze
        PT->>DE: process(gaze_result)
        DE->>PT: FocusEvent

        alt TRIGGER_ALARM
            PT->>AM: trigger_alarm()
            AM->>AM: Play sound 🔔
        else CANCEL_ALARM
            PT->>AM: cancel_alarm()
        end

        PT->>SIG: emit status update
        SIG->>UI: Update tray icon
    end

Thread-safety mechanisms:

  • LiveConfig uses threading.RLock() for atomic config reads/writes
  • AlertManager uses threading.Lock() for alarm state
  • TrayStatusUpdater bridges threads via pyqtSignal
  • Frame queue uses a drop-oldest policy (no blocking)
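
A drop-oldest queue like the one described above can be built on the standard library's `queue.Queue`. This is a sketch of the policy, not the project's actual implementation; the class name is illustrative:

```python
import queue

class DropOldestQueue:
    """Bounded queue that discards the stalest item instead of blocking
    the producer, so the capture thread never stalls behind a slow
    processing thread."""

    def __init__(self, maxsize=2):
        self._q = queue.Queue(maxsize=maxsize)

    def put(self, item):
        while True:
            try:
                self._q.put_nowait(item)
                return
            except queue.Full:
                try:
                    self._q.get_nowait()  # drop the oldest frame
                except queue.Empty:
                    pass  # a consumer raced us; retry the put

    def get(self, timeout=None):
        return self._q.get(timeout=timeout)
```

With `maxsize=2`, a burst of frames 1, 2, 3 leaves 2 and 3 in the queue: the consumer always sees near-latest frames at the cost of dropped ones.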

Module Dependency Graph

graph TD
    MAIN["__main__.py"] --> APP["app.py\nEyeGuardApp"]
    APP --> CAP["capture.py\nCaptureThread"]
    APP --> DET["detector.py\nFaceDetector"]
    APP --> GAZ["gaze.py\nGazeEstimator"]
    APP --> DEC["decision.py\nDecisionEngine"]
    APP --> ALR["alert.py\nAlertManager"]
    APP --> UI["ui.py\nEyeGuardTray"]
    APP --> CFG["config.py\nLiveConfig"]

    UI --> CFG
    DEC --> CFG
    GAZ --> CONST["constants.py\nLandmark Indices"]
    DET --> CONST
    ALR --> UTIL["utils.py\nHelpers"]
    APP --> UTIL

    CAP -.->|opencv| EXT1["OpenCV"]
    DET -.->|mediapipe| EXT2["MediaPipe"]
    GAZ -.->|numpy + cv2| EXT3["NumPy"]
    ALR -.->|simpleaudio| EXT4["SimpleAudio"]
    UI -.->|PyQt6| EXT5["PyQt6"]

    style MAIN fill:#e8eaf6,stroke:#3f51b5
    style APP fill:#e8eaf6,stroke:#3f51b5
    style EXT1 fill:#f5f5f5,stroke:#9e9e9e
    style EXT2 fill:#f5f5f5,stroke:#9e9e9e
    style EXT3 fill:#f5f5f5,stroke:#9e9e9e
    style EXT4 fill:#f5f5f5,stroke:#9e9e9e
    style EXT5 fill:#f5f5f5,stroke:#9e9e9e

Key Algorithms

Eye Aspect Ratio (EAR)

Detects whether eyes are open or closed using 6 eye contour landmarks:

$$\text{EAR} = \frac{|p_2 - p_6| + |p_3 - p_5|}{2 \times |p_1 - p_4|}$$

  • EAR > 0.21 → eyes open
  • EAR ≤ 0.21 → eyes closed
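
A minimal NumPy sketch of the EAR formula above. The p1..p6 ordering and the sample coordinates are illustrative, not the project's actual landmark mapping:

```python
import numpy as np

def eye_aspect_ratio(pts: np.ndarray) -> float:
    """EAR from six eye-contour points ordered p1..p6: p1/p4 are the
    horizontal corners; (p2, p6) and (p3, p5) are the two top/bottom
    pairs giving the vertical distances."""
    v1 = np.linalg.norm(pts[1] - pts[5])  # |p2 - p6|
    v2 = np.linalg.norm(pts[2] - pts[4])  # |p3 - p5|
    h = np.linalg.norm(pts[0] - pts[3])   # |p1 - p4|
    return (v1 + v2) / (2.0 * h)

# Synthetic "open eye": corners 30 px apart, lids 8 px apart
open_eye = np.array([[0, 0], [10, -4], [20, -4],
                     [30, 0], [20, 4], [10, 4]], dtype=float)
ear = eye_aspect_ratio(open_eye)  # ≈ 0.27, above the 0.21 "open" threshold
```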

Iris Position Ratio (IPR)

Measures horizontal and vertical iris displacement relative to eye bounds:

$$\text{IPR}_x = \frac{iris_{center.x} - left_{corner.x}}{right_{corner.x} - left_{corner.x}} \quad \in [0, 1]$$

  • IPR ≈ 0.5 → looking straight ahead
  • IPR < 0.3 or > 0.7 → looking away
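
The horizontal IPR is a straightforward normalization; a sketch with the thresholds from above (function names are illustrative):

```python
def iris_position_ratio(iris_x: float, left_x: float, right_x: float) -> float:
    """Horizontal iris position normalized to [0, 1] within the eye
    corners; 0.5 means the iris is centered (looking straight ahead)."""
    return (iris_x - left_x) / (right_x - left_x)

def looking_away(ipr: float, lo: float = 0.3, hi: float = 0.7) -> bool:
    """True when the iris sits outside the central tolerance band."""
    return ipr < lo or ipr > hi
```

For example, an iris centered between corners at x = 0.0 and x = 1.0 yields an IPR of 0.5, which `looking_away` classifies as on-screen.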

Head Pose Estimation

Uses cv2.solvePnP with 6 facial anchor points mapped to a 3D reference model, decomposed into Euler angles:

  • Yaw (left/right head turn) — threshold: ±30°
  • Pitch (up/down tilt) — threshold: ±25°
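
After solvePnP yields a rotation vector, cv2.Rodrigues converts it to a rotation matrix from which yaw and pitch can be extracted. The decomposition step can be sketched with NumPy alone; the ZYX Euler convention here is an assumption, as is the threshold check:

```python
import numpy as np

def yaw_pitch_from_rotation(rot: np.ndarray) -> tuple[float, float]:
    """Decompose a 3x3 rotation matrix (e.g. cv2.Rodrigues output after
    solvePnP) into yaw and pitch in degrees, ZYX Euler convention."""
    pitch = np.degrees(np.arctan2(rot[2, 1], rot[2, 2]))
    yaw = np.degrees(np.arctan2(-rot[2, 0], np.hypot(rot[2, 1], rot[2, 2])))
    return float(yaw), float(pitch)

def pose_focused(yaw: float, pitch: float,
                 yaw_max: float = 30.0, pitch_max: float = 25.0) -> bool:
    """Head counts as facing the screen while both angles stay in range."""
    return abs(yaw) <= yaw_max and abs(pitch) <= pitch_max

# A pure 20° rotation about the Y axis should decompose to yaw=20, pitch=0:
a = np.radians(20.0)
ry = np.array([[np.cos(a), 0, np.sin(a)],
               [0, 1, 0],
               [-np.sin(a), 0, np.cos(a)]])
yaw, pitch = yaw_pitch_from_rotation(ry)
```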

Configuration

Config is stored at ~/.config/eyeguard/config.toml (auto-created on first run). Settings can be changed live via the system tray Settings dialog.

Timing

Setting Default Description
timing.unfocused_timeout 3.0 s Seconds of off-screen gaze before alarm
timing.eyes_closed_timeout 2.0 s Seconds of closed eyes before warning
timing.no_face_timeout 5.0 s Seconds of missing face before warning
timing.escalation_timeout 15.0 s Seconds before alarm escalates

Detection

Setting Default Description
detection.ear_threshold 0.21 Eye Aspect Ratio threshold for "open"
detection.head_yaw_max 30 ° Max head yaw before "not focused"
detection.head_pitch_max 25 ° Max head pitch before "not focused"

Camera

Setting Default Description
camera.device_id 0 Webcam device index
camera.fps 15 Target capture frame rate

Alarm

Setting Default Description
alarm.volume 0.8 Alarm volume (0.0 to 1.0)

Example config.toml

[timing]
unfocused_timeout = 3.0
eyes_closed_timeout = 2.0
no_face_timeout = 5.0
escalation_timeout = 15.0

[detection]
ear_threshold = 0.21
head_yaw_max = 30
head_pitch_max = 25

[camera]
device_id = 0
fps = 15

[alarm]
volume = 0.8

Project Structure

├── Dockerfile           # Container image definition
├── docker-compose.yml   # Docker Compose orchestration
├── .dockerignore        # Docker build exclusions
├── .gitignore           # Git ignored files
├── requirements.txt     # Python dependencies
├── setup.py             # Package setup
├── README.md
└── eyeguard/
    ├── __main__.py      # Entry point, CLI argument parsing
    ├── app.py           # EyeGuardApp — orchestrates all components
    ├── capture.py       # CaptureThread — webcam frame capture (daemon thread)
    ├── detector.py      # FaceDetector — MediaPipe Face Mesh wrapper
    ├── gaze.py          # Gaze analysis: EAR, IPR, head pose estimation
    ├── decision.py      # FocusDecisionEngine — FSM with hysteresis
    ├── alert.py         # AlertManager — sound playback & escalation
    ├── config.py        # LiveConfig — thread-safe TOML configuration
    ├── constants.py     # MediaPipe landmark indices
    ├── ui.py            # EyeGuardTray, SettingsDialog (PyQt6)
    ├── utils.py         # Shared helpers
    ├── resources/
    │   ├── alarm_default.wav
    │   └── alarm_escalated.wav
    └── tests/
        ├── test_config.py     # Config loading & live updates
        ├── test_decision.py   # State machine transitions & timeouts
        └── test_gaze.py       # EAR, IPR, GazeResult calculations

Running Tests

pip install pytest
python -m pytest eyeguard/tests/ -v

Tests cover:

  • Config — default values, live updates, TOML parsing, unknown key handling
  • Decision engine — all state transitions, timeout behavior, pause/resume, alarm escalation
  • Gaze math — EAR for open/closed eyes, IPR for centered/off-center iris, is_looking_at_screen() logic
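
The style of these tests can be illustrated with a small parametrized example. This is illustrative pytest usage only; the real test modules in eyeguard/tests/ exercise the project's actual APIs, and the helper here is made up for the example:

```python
import pytest

def ear_open(ear: float, threshold: float = 0.21) -> bool:
    """Eyes count as open strictly above the EAR threshold."""
    return ear > threshold

@pytest.mark.parametrize("ear,expected", [
    (0.30, True),   # wide open
    (0.22, True),   # just above threshold
    (0.21, False),  # at the threshold counts as closed
    (0.10, False),  # closed
])
def test_ear_threshold(ear, expected):
    assert ear_open(ear) == expected
```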

System Requirements

  • OS: Linux (tested on Ubuntu 22.04+)
  • Python: 3.10+
  • Hardware: Webcam (USB or integrated)
  • Display: X11 or Wayland (for system tray; use --headless otherwise)
  • Docker (optional): Docker Engine 20.10+ and Docker Compose v2 for containerized usage

Dependencies

Package Purpose
opencv-python ≥ 4.8 Webcam capture, image processing, solvePnP
mediapipe ≥ 0.10 Face mesh detection (478 landmarks)
numpy ≥ 1.24 Array math for gaze calculations
simpleaudio ≥ 1.0.4 Cross-platform WAV playback
PyQt6 ≥ 6.5 System tray icon and settings dialog

License

See LICENSE for details.
