👁️ EyeGuard

Real-time eye-tracking focus monitor for Linux. EyeGuard watches your gaze via webcam and triggers an escalating audible alarm when you look away from the screen — helping you stay focused during deep work sessions.

Built with Python, OpenCV, MediaPipe, and PyQt6.


Features

  • Real-time gaze tracking — combines Eye Aspect Ratio (EAR), Iris Position Ratio (IPR), and head pose estimation
  • Smart debounce — hysteresis state machine prevents false alarms on brief glances away
  • Escalating alerts — alarm intensifies if you remain unfocused (soft → loud after 15s)
  • Eyes closed / no face detection — separate warnings for closed eyes and leaving the frame
  • System tray integration — color-coded status icon, pause/resume, live settings dialog
  • Headless mode — run without GUI via --headless (great for servers or Wayland without tray support)
  • Live configuration — TOML config at ~/.config/eyeguard/config.toml with thread-safe runtime updates
  • Multi-threaded — capture, processing, and UI in separate threads for smooth performance
  • Docker support — run in a container with webcam and audio passthrough

Quick Start

# Clone and install
git clone <repo-url> && cd eyeguard
pip install -r requirements.txt

# Run with system tray
python -m eyeguard

# Run headless with debug logging
python -m eyeguard --headless --debug

# Or install as a package
pip install -e .
eyeguard

CLI Options

Flag Description
--headless Run without system tray UI
--debug Enable verbose debug logging

Docker

Run EyeGuard in a container with webcam and audio passthrough.

Quick Start (Docker Compose)

docker compose up --build

This builds the image and runs the container in headless mode with webcam and PulseAudio access.

Manual Docker Build & Run

# Build
docker build -t eyeguard .

# Run (headless with webcam + audio)
docker run --rm \
  --device /dev/video0:/dev/video0 \
  -v ${XDG_RUNTIME_DIR}/pulse:/run/user/1000/pulse:ro \
  -e PULSE_SERVER=unix:/run/user/1000/pulse/native \
  eyeguard

# Run with debug logging
docker run --rm \
  --device /dev/video0:/dev/video0 \
  -v ${XDG_RUNTIME_DIR}/pulse:/run/user/1000/pulse:ro \
  -e PULSE_SERVER=unix:/run/user/1000/pulse/native \
  eyeguard --headless --debug

Persist Configuration

Mount your host config directory so settings survive container restarts:

docker run --rm \
  --device /dev/video0:/dev/video0 \
  -v ${XDG_RUNTIME_DIR}/pulse:/run/user/1000/pulse:ro \
  -e PULSE_SERVER=unix:/run/user/1000/pulse/native \
  -v ~/.config/eyeguard:/root/.config/eyeguard \
  eyeguard

Notes

  • The container runs in headless mode by default (no GUI). Override with --entrypoint if you need X11 forwarding.
  • Webcam: the host device /dev/video0 is passed through. Change the path if your camera uses a different device.
  • Audio: PulseAudio socket is shared read-only from the host. Works with both PulseAudio and PipeWire (via its PulseAudio compatibility layer).

Architecture

High-Level Overview

The application follows a capture → detect → analyze → decide → alert pipeline across three threads:

flowchart LR
    subgraph Capture["🎥 Capture Thread"]
        CAM[Webcam] --> CT[CaptureThread]
    end

    subgraph Processing["⚙️ Processing Thread"]
        FD[FaceDetector\nMediaPipe Face Mesh] --> GA[GazeEstimator\nEAR · IPR · Head Pose]
        GA --> DE[DecisionEngine\nFinite State Machine]
        DE --> AM[AlertManager\nSound Playback]
    end

    subgraph UI["🖥️ Main Thread"]
        TRAY[System Tray\nPyQt6]
        SD[Settings Dialog]
        TRAY --> SD
    end

    CT -->|Frame Queue| FD
    DE -.->|pyqtSignal| TRAY
    SD -.->|LiveConfig| DE
    AM -.->|simpleaudio| SOUND[🔔 Alarm]

    style Capture fill:#e8f5e9,stroke:#4caf50
    style Processing fill:#e3f2fd,stroke:#2196f3
    style UI fill:#fff3e0,stroke:#ff9800

Data Processing Pipeline

Each frame flows through detection, then gaze analysis (three parallel metrics), and finally into the decision engine:

flowchart TB
    subgraph Input
        FRAME["Video Frame\n(BGR, 640×480)"]
    end

    subgraph Detection
        MP["MediaPipe Face Mesh\n478 Landmarks"]
    end

    subgraph Gaze["Gaze Analysis"]
        direction LR
        EAR["Eye Aspect Ratio\n(EAR)\nOpen vs Closed"]
        IPR["Iris Position Ratio\n(IPR)\nGaze Direction"]
        HP["Head Pose\nsolvePnP\nYaw & Pitch"]
    end

    subgraph Decision
        GR["GazeResult\n11 float metrics"]
        FSM["Focus State Machine"]
    end

    subgraph Output
        EVT["FocusEvent"]
        direction LR
        ALARM["🔔 Trigger Alarm"]
        ESC["⚠️ Escalate"]
        CANCEL["✅ Cancel"]
    end

    FRAME --> MP
    MP --> EAR & IPR & HP
    EAR & IPR & HP --> GR
    GR --> FSM
    FSM --> EVT
    EVT --> ALARM & ESC & CANCEL

    style Input fill:#f3e5f5,stroke:#9c27b0
    style Detection fill:#e8f5e9,stroke:#4caf50
    style Gaze fill:#e3f2fd,stroke:#2196f3
    style Decision fill:#fff3e0,stroke:#ff9800
    style Output fill:#ffebee,stroke:#f44336

Focus Decision State Machine

The FocusDecisionEngine uses a finite state machine with hysteresis debounce to avoid false alarms. States transition based on gaze metrics and configurable timeouts:

stateDiagram-v2
    [*] --> FOCUSED

    FOCUSED --> POSSIBLY_UNFOCUSED : Gaze off-screen
    FOCUSED --> EYES_CLOSED : EAR < threshold
    FOCUSED --> NO_FACE : No face detected

    POSSIBLY_UNFOCUSED --> FOCUSED : Gaze returns
    POSSIBLY_UNFOCUSED --> UNFOCUSED : After 3s timeout

    UNFOCUSED --> FOCUSED : Gaze returns\n→ CANCEL_ALARM
    UNFOCUSED --> ALARM_ACTIVE : TRIGGER_ALARM

    ALARM_ACTIVE --> FOCUSED : Gaze returns\n→ CANCEL_ALARM
    ALARM_ACTIVE --> ALARM_ESCALATED : After 15s\n→ ESCALATE_ALARM

    ALARM_ESCALATED --> FOCUSED : Gaze returns\n→ CANCEL_ALARM

    EYES_CLOSED --> FOCUSED : Eyes open
    EYES_CLOSED --> UNFOCUSED : After 2s\n→ WARN_EYES_CLOSED

    NO_FACE --> FOCUSED : Face detected
    NO_FACE --> UNFOCUSED : After 5s\n→ WARN_NO_FACE
State Description Tray Color
FOCUSED User is looking at the screen 🟢 Green
POSSIBLY_UNFOCUSED Gaze drifted, within tolerance window 🟠 Orange
UNFOCUSED Confirmed off-screen after timeout 🔴 Red
ALARM_ACTIVE Alarm is sounding 🔴 Red
ALARM_ESCALATED Alarm escalated to loud mode 🔴 Red
EYES_CLOSED Eyes detected as closed 🟠 Orange
NO_FACE No face detected in frame ⚪ Grey
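
The hysteresis debounce above can be sketched as a small state machine. This is a minimal illustration of the FOCUSED → POSSIBLY_UNFOCUSED → UNFOCUSED leg only, not the project's actual FocusDecisionEngine; the class and method names are made up for the example:

```python
import time
from enum import Enum, auto

class State(Enum):
    FOCUSED = auto()
    POSSIBLY_UNFOCUSED = auto()
    UNFOCUSED = auto()

class HysteresisDebounce:
    """Off-screen gaze must persist for `timeout` seconds before it
    counts as UNFOCUSED; any on-screen frame snaps back to FOCUSED."""

    def __init__(self, timeout=3.0, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock          # injectable clock makes this testable
        self.state = State.FOCUSED
        self._since = None          # when the gaze first drifted off-screen

    def update(self, looking_at_screen: bool) -> State:
        now = self.clock()
        if looking_at_screen:
            self.state, self._since = State.FOCUSED, None
        elif self.state is State.FOCUSED:
            self.state, self._since = State.POSSIBLY_UNFOCUSED, now
        elif (self.state is State.POSSIBLY_UNFOCUSED
              and now - self._since >= self.timeout):
            self.state = State.UNFOCUSED
        return self.state
```

Because the clock is injected, the 3-second timeout can be exercised in tests without sleeping.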

Threading & Sequence

Three threads coordinate via a frame queue and Qt signals:

sequenceDiagram
    participant W as Webcam
    participant CT as Capture Thread
    participant Q as Frame Queue
    participant PT as Processing Thread
    participant DE as DecisionEngine
    participant AM as AlertManager
    participant SIG as pyqtSignal
    participant UI as Main Thread (Qt)

    loop Every ~66ms (15 FPS)
        W->>CT: Read frame
        CT->>Q: Put frame (drop oldest)
    end

    loop Every ~66ms (15 FPS)
        Q->>PT: Get latest frame
        PT->>PT: Detect face + Analyze gaze
        PT->>DE: process(gaze_result)
        DE->>PT: FocusEvent

        alt TRIGGER_ALARM
            PT->>AM: trigger_alarm()
            AM->>AM: Play sound 🔔
        else CANCEL_ALARM
            PT->>AM: cancel_alarm()
        end

        PT->>SIG: emit status update
        SIG->>UI: Update tray icon
    end

Thread-safety mechanisms:

  • LiveConfig uses threading.RLock() for atomic config reads/writes
  • AlertManager uses threading.Lock() for alarm state
  • TrayStatusUpdater bridges threads via pyqtSignal
  • Frame queue uses a drop-oldest policy (no blocking)
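
A drop-oldest queue like the one described above can be built on the standard library's `queue.Queue`. This is a sketch of the policy, not the project's actual implementation; the class name is illustrative:

```python
import queue

class DropOldestQueue:
    """Bounded queue that discards the stalest item instead of blocking
    the producer, so the capture thread never stalls behind a slow
    processing thread."""

    def __init__(self, maxsize=2):
        self._q = queue.Queue(maxsize=maxsize)

    def put(self, item):
        while True:
            try:
                self._q.put_nowait(item)
                return
            except queue.Full:
                try:
                    self._q.get_nowait()  # drop the oldest frame
                except queue.Empty:
                    pass  # a consumer raced us; retry the put

    def get(self, timeout=None):
        return self._q.get(timeout=timeout)
```

With `maxsize=2`, a burst of frames 1, 2, 3 leaves 2 and 3 in the queue: the consumer always sees near-latest frames at the cost of dropped ones.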

Module Dependency Graph

graph TD
    MAIN["__main__.py"] --> APP["app.py\nEyeGuardApp"]
    APP --> CAP["capture.py\nCaptureThread"]
    APP --> DET["detector.py\nFaceDetector"]
    APP --> GAZ["gaze.py\nGazeEstimator"]
    APP --> DEC["decision.py\nDecisionEngine"]
    APP --> ALR["alert.py\nAlertManager"]
    APP --> UI["ui.py\nEyeGuardTray"]
    APP --> CFG["config.py\nLiveConfig"]

    UI --> CFG
    DEC --> CFG
    GAZ --> CONST["constants.py\nLandmark Indices"]
    DET --> CONST
    ALR --> UTIL["utils.py\nHelpers"]
    APP --> UTIL

    CAP -.->|opencv| EXT1["OpenCV"]
    DET -.->|mediapipe| EXT2["MediaPipe"]
    GAZ -.->|numpy + cv2| EXT3["NumPy"]
    ALR -.->|simpleaudio| EXT4["SimpleAudio"]
    UI -.->|PyQt6| EXT5["PyQt6"]

    style MAIN fill:#e8eaf6,stroke:#3f51b5
    style APP fill:#e8eaf6,stroke:#3f51b5
    style EXT1 fill:#f5f5f5,stroke:#9e9e9e
    style EXT2 fill:#f5f5f5,stroke:#9e9e9e
    style EXT3 fill:#f5f5f5,stroke:#9e9e9e
    style EXT4 fill:#f5f5f5,stroke:#9e9e9e
    style EXT5 fill:#f5f5f5,stroke:#9e9e9e

Key Algorithms

Eye Aspect Ratio (EAR)

Detects whether eyes are open or closed using 6 eye contour landmarks:

$$\text{EAR} = \frac{|p_2 - p_6| + |p_3 - p_5|}{2 \times |p_1 - p_4|}$$

  • EAR > 0.21 → eyes open
  • EAR ≤ 0.21 → eyes closed
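
A minimal NumPy sketch of the EAR formula above. The p1..p6 ordering and the sample coordinates are illustrative, not the project's actual landmark mapping:

```python
import numpy as np

def eye_aspect_ratio(pts: np.ndarray) -> float:
    """EAR from six eye-contour points ordered p1..p6: p1/p4 are the
    horizontal corners; (p2, p6) and (p3, p5) are the two top/bottom
    pairs giving the vertical distances."""
    v1 = np.linalg.norm(pts[1] - pts[5])  # |p2 - p6|
    v2 = np.linalg.norm(pts[2] - pts[4])  # |p3 - p5|
    h = np.linalg.norm(pts[0] - pts[3])   # |p1 - p4|
    return (v1 + v2) / (2.0 * h)

# Synthetic "open eye": corners 30 px apart, lids 8 px apart
open_eye = np.array([[0, 0], [10, -4], [20, -4],
                     [30, 0], [20, 4], [10, 4]], dtype=float)
ear = eye_aspect_ratio(open_eye)  # ≈ 0.27, above the 0.21 "open" threshold
```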

Iris Position Ratio (IPR)

Measures horizontal and vertical iris displacement relative to eye bounds:

$$\text{IPR}_x = \frac{iris_{center.x} - left_{corner.x}}{right_{corner.x} - left_{corner.x}} \quad \in [0, 1]$$

  • IPR ≈ 0.5 → looking straight ahead
  • IPR < 0.3 or > 0.7 → looking away
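
The horizontal IPR is a straightforward normalization; a sketch with the thresholds from above (function names are illustrative):

```python
def iris_position_ratio(iris_x: float, left_x: float, right_x: float) -> float:
    """Horizontal iris position normalized to [0, 1] within the eye
    corners; 0.5 means the iris is centered (looking straight ahead)."""
    return (iris_x - left_x) / (right_x - left_x)

def looking_away(ipr: float, lo: float = 0.3, hi: float = 0.7) -> bool:
    """True when the iris sits outside the central tolerance band."""
    return ipr < lo or ipr > hi
```

For example, an iris centered between corners at x = 0.0 and x = 1.0 yields an IPR of 0.5, which `looking_away` classifies as on-screen.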

Head Pose Estimation

Uses cv2.solvePnP with 6 facial anchor points mapped to a 3D reference model, decomposed into Euler angles:

  • Yaw (left/right head turn) — threshold: ±30°
  • Pitch (up/down tilt) — threshold: ±25°
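
After solvePnP yields a rotation vector, cv2.Rodrigues converts it to a rotation matrix from which yaw and pitch can be extracted. The decomposition step can be sketched with NumPy alone; the ZYX Euler convention here is an assumption, as is the threshold check:

```python
import numpy as np

def yaw_pitch_from_rotation(rot: np.ndarray) -> tuple[float, float]:
    """Decompose a 3x3 rotation matrix (e.g. cv2.Rodrigues output after
    solvePnP) into yaw and pitch in degrees, ZYX Euler convention."""
    pitch = np.degrees(np.arctan2(rot[2, 1], rot[2, 2]))
    yaw = np.degrees(np.arctan2(-rot[2, 0], np.hypot(rot[2, 1], rot[2, 2])))
    return float(yaw), float(pitch)

def pose_focused(yaw: float, pitch: float,
                 yaw_max: float = 30.0, pitch_max: float = 25.0) -> bool:
    """Head counts as facing the screen while both angles stay in range."""
    return abs(yaw) <= yaw_max and abs(pitch) <= pitch_max

# A pure 20° rotation about the Y axis should decompose to yaw=20, pitch=0:
a = np.radians(20.0)
ry = np.array([[np.cos(a), 0, np.sin(a)],
               [0, 1, 0],
               [-np.sin(a), 0, np.cos(a)]])
yaw, pitch = yaw_pitch_from_rotation(ry)
```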

Configuration

Config is stored at ~/.config/eyeguard/config.toml (auto-created on first run). Settings can be changed live via the system tray Settings dialog.

Timing

Setting Default Description
timing.unfocused_timeout 3.0 s Seconds of off-screen gaze before alarm
timing.eyes_closed_timeout 2.0 s Seconds of closed eyes before warning
timing.no_face_timeout 5.0 s Seconds of missing face before warning
timing.escalation_timeout 15.0 s Seconds before alarm escalates

Detection

Setting Default Description
detection.ear_threshold 0.21 Eye Aspect Ratio threshold for "open"
detection.head_yaw_max 30 ° Max head yaw before "not focused"
detection.head_pitch_max 25 ° Max head pitch before "not focused"

Camera

Setting Default Description
camera.device_id 0 Webcam device index
camera.fps 15 Target capture frame rate

Alarm

Setting Default Description
alarm.volume 0.8 Alarm volume (0.0 to 1.0)

Example config.toml

[timing]
unfocused_timeout = 3.0
eyes_closed_timeout = 2.0
no_face_timeout = 5.0
escalation_timeout = 15.0

[detection]
ear_threshold = 0.21
head_yaw_max = 30
head_pitch_max = 25

[camera]
device_id = 0
fps = 15

[alarm]
volume = 0.8

Project Structure

├── Dockerfile           # Container image definition
├── docker-compose.yml   # Docker Compose orchestration
├── .dockerignore        # Docker build exclusions
├── .gitignore           # Git ignored files
├── requirements.txt     # Python dependencies
├── setup.py             # Package setup
├── README.md
└── eyeguard/
    ├── __main__.py      # Entry point, CLI argument parsing
    ├── app.py           # EyeGuardApp — orchestrates all components
    ├── capture.py       # CaptureThread — webcam frame capture (daemon thread)
    ├── detector.py      # FaceDetector — MediaPipe Face Mesh wrapper
    ├── gaze.py          # Gaze analysis: EAR, IPR, head pose estimation
    ├── decision.py      # FocusDecisionEngine — FSM with hysteresis
    ├── alert.py         # AlertManager — sound playback & escalation
    ├── config.py        # LiveConfig — thread-safe TOML configuration
    ├── constants.py     # MediaPipe landmark indices
    ├── ui.py            # EyeGuardTray, SettingsDialog (PyQt6)
    ├── utils.py         # Shared helpers
    ├── resources/
    │   ├── alarm_default.wav
    │   └── alarm_escalated.wav
    └── tests/
        ├── test_config.py     # Config loading & live updates
        ├── test_decision.py   # State machine transitions & timeouts
        └── test_gaze.py       # EAR, IPR, GazeResult calculations

Running Tests

pip install pytest
python -m pytest eyeguard/tests/ -v

Tests cover:

  • Config — default values, live updates, TOML parsing, unknown key handling
  • Decision engine — all state transitions, timeout behavior, pause/resume, alarm escalation
  • Gaze math — EAR for open/closed eyes, IPR for centered/off-center iris, is_looking_at_screen() logic
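
The style of these tests can be illustrated with a small parametrized example. This is illustrative pytest usage only; the real test modules in eyeguard/tests/ exercise the project's actual APIs, and the helper here is made up for the example:

```python
import pytest

def ear_open(ear: float, threshold: float = 0.21) -> bool:
    """Eyes count as open strictly above the EAR threshold."""
    return ear > threshold

@pytest.mark.parametrize("ear,expected", [
    (0.30, True),   # wide open
    (0.22, True),   # just above threshold
    (0.21, False),  # at the threshold counts as closed
    (0.10, False),  # closed
])
def test_ear_threshold(ear, expected):
    assert ear_open(ear) == expected
```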

System Requirements

  • OS: Linux (tested on Ubuntu 22.04+)
  • Python: 3.10+
  • Hardware: Webcam (USB or integrated)
  • Display: X11 or Wayland (for system tray; use --headless otherwise)
  • Docker (optional): Docker Engine 20.10+ and Docker Compose v2 for containerized usage

Dependencies

Package Purpose
opencv-python ≥ 4.8 Webcam capture, image processing, solvePnP
mediapipe ≥ 0.10 Face mesh detection (478 landmarks)
numpy ≥ 1.24 Array math for gaze calculations
simpleaudio ≥ 1.0.4 Cross-platform WAV playback
PyQt6 ≥ 6.5 System tray icon and settings dialog

License

See LICENSE for details.
