LinkForensics - Friendly Link Checker

A Chrome extension backed by a Flask ML API that detects phishing and malicious links in real time. LinkForensics combines rule-based analysis with a LightGBM model to flag risky links while you browse.

🎯 Key Features

Dual Detection Engine: Rule-based heuristics + LightGBM AI model for binary classification (Safe/Phishing)
Real-time Link Scanning: Automatically scans all links on pages, including dynamic content in SPAs
Smart Warnings: Visual alerts for risky links with detailed threat analysis
Quick Lookup: Paste any URL in the popup for instant AI prediction
Dashboard: Full statistics and analytics of scanned links by severity
Local Processing: All analysis runs on your machine; the extension only talks to the ML API on localhost, and no data is sent to external servers
Whitelist Support: Trusted domain whitelist for safe sites
Chrome Extension v5: Modern Manifest v3 architecture

Chrome Web Store: LinkForensics - Friendly Link Checker

Dataset: Kaggle - malicious-url-classification-706k

📁 Project Structure

LinkForensics/
├── server.js                          # Express web server (port 9091) for dashboard
├── model_api.py                       # Flask ML API (port 5001) for predictions
├── requirements.txt                   # Python dependencies
├── package.json                       # Node.js dependencies
├── download_whitelist.py              # Utility to manage trusted domain whitelist
│
├── extension/                         # Chrome Extension files (Manifest v3)
│   ├── manifest.json                  # Extension config
│   ├── background.js                  # Service worker for API bridge & network monitoring
│   ├── content-script.js              # Injected script for link scanning
│   ├── popup.html/.js/.css            # Extension popup interface
│   ├── dashboard.html/.js             # Full dashboard with ML analysis
│   ├── warning.html/.js               # Safe landing page for risky links
│   └── README.md                      # Extension-specific documentation
│
├── public/                            # Web dashboard static files
│   ├── index.html
│   ├── app.js
│   └── style.css
│
└── artifacts/                         # Pre-trained ML models & features
    ├── models/
    │   ├── RUN6_LGB_LEX_4CLASS.joblib            # 4-class classifier
    │   ├── RUN7_LGB_LEX_TFIDF_BINARY.joblib      # Binary classifier (active)
    │   └── RUN8_LGB_LEX_TFIDF_4CLASS.joblib      # 4-class with TF-IDF
    ├── tfidf/
    │   ├── tfidf_vectorizer.joblib               # TF-IDF feature extraction
    │   └── svd_tfidf.joblib                      # TF-IDF SVD transformation
    └── lex/
        ├── lexical_feature_columns_binary.json   # Binary model features
        ├── lexical_feature_columns_4class.json   # 4-class model features
        └── whitelist.txt                         # Trusted domains list

🚀 Quick Start

Prerequisites

Python 3.8+ (for ML API)
Node.js 14+ (for web dashboard)
Chrome browser (for extension)

Step 1: Install Python Dependencies

pip install -r requirements.txt

Required packages:

flask - REST API framework
flask-cors - Cross-origin requests
lightgbm - LightGBM ML models
scikit-learn - Feature processing
numpy, joblib - Array handling and loading the .joblib model artifacts

Step 2: Start the ML API Server

python model_api.py

Expected output:

==================================================
  LinkForensics Model API
  http://localhost:5001
  POST /predict  { url: '...' }
  GET  /inspect
==================================================

The API will:

Load pre-trained LightGBM models and artifacts
Listen on http://localhost:5001 for prediction requests
Serve POST /predict for URL classification and GET /inspect for health and model information

Step 3: Install Node Dependencies & Start Web Dashboard

npm install
npm start

The web dashboard is then available at http://localhost:9091

Step 4: Load the Chrome Extension

1. Open Chrome and go to chrome://extensions/
2. Enable Developer mode (toggle, top-right)
3. Click Load unpacked
4. Select the extension/ folder
5. Pin the extension to your toolbar for easy access

🔧 Configuration & API Endpoints

ML API (`http://localhost:5001`)

POST /predict

Request:

{ "url": "https://example.com" }

Response:

{
  "url": "https://example.com",
  "risk_score": 0.15,
  "label": "safe",
  "confidence": 0.92,
  "features_used": 42
}
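
A minimal client sketch for the endpoint above, using only the Python standard library. The helper names `build_predict_request` and `predict` are illustrative and not part of the project; the request and response shapes are taken from the example above, and the API is assumed to be running on port 5001.

```python
import json
import urllib.request

API_URL = "http://localhost:5001/predict"  # address given in this README


def build_predict_request(url: str, api_url: str = API_URL) -> urllib.request.Request:
    """Build the POST request carrying {"url": ...} as a JSON body."""
    body = json.dumps({"url": url}).encode("utf-8")
    return urllib.request.Request(
        api_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(url: str, timeout: float = 5.0) -> dict:
    """Send the request and decode the JSON response (requires the API to be running)."""
    with urllib.request.urlopen(build_predict_request(url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the API running, `predict("https://example.com")` should return a dictionary shaped like the response shown above, so `result["label"]` and `result["risk_score"]` can be read directly.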

GET /inspect

Returns API health and loaded model information.

Web Dashboard (`http://localhost:9091`)

Serves the static dashboard from public/ with link statistics and analytics.

📊 How It Works

Detection Pipeline

1. Content Script → Scans all links on the page, including dynamically added ones
2. Rule-based Analysis → Checks the domain against the whitelist and applies pattern matching
3. Feature Extraction → Generates lexical and TF-IDF features from the URL
4. LightGBM Model → Classifies the link as Safe or Phishing
5. Risk Score → Combines the rule-based result with the ML confidence
6. Visual Alert → Shows a warning for high-risk links
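
The rule and scoring steps above can be sketched roughly as follows. This README does not document the actual rules or how the two signals are combined, so everything here is an assumption for illustration: the whitelist entries, the three heuristics and their weights, the warning threshold, and the choice to take the maximum of the rule score and the ML score.

```python
from urllib.parse import urlparse

# Hypothetical values; the project keeps its trusted domains in whitelist.txt.
WHITELIST = {"example.com", "github.com"}
WARN_THRESHOLD = 0.5  # assumed cut-off for showing a warning


def is_whitelisted(url: str, whitelist=WHITELIST) -> bool:
    """True if the host equals a trusted domain or is a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in whitelist)


def rule_score(url: str) -> float:
    """Toy heuristics (assumed, not the project's rules); each one adds to the score."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    score = 0.0
    if parsed.scheme != "https":
        score += 0.2  # unencrypted link
    if "@" in parsed.netloc:
        score += 0.6  # userinfo trick, e.g. https://bank.com@evil.test
    if host.replace(".", "").isdigit():
        score += 0.4  # raw IPv4 address instead of a domain name
    return min(score, 1.0)


def combined_risk(url: str, ml_score: float) -> float:
    """Whitelisted hosts short-circuit to 0; otherwise take the stronger signal."""
    if is_whitelisted(url):
        return 0.0
    return max(rule_score(url), ml_score)


def should_warn(url: str, ml_score: float) -> bool:
    """Decide whether a link deserves a visual alert."""
    return combined_risk(url, ml_score) >= WARN_THRESHOLD
```

Matching on the parsed hostname rather than the raw string matters: a lookalike such as `github.com.evil.test` does not pass the whitelist check, because it neither equals a trusted domain nor ends with `.` followed by one.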

Model Details

Active Model: RUN7_LGB_LEX_TFIDF_BINARY.joblib
Classification: Binary (Safe / Phishing)
Features: 42 lexical + TF-IDF features
Framework: LightGBM (fast, interpretable, low overhead)
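
To give a feel for what "lexical features" means here, the sketch below computes a few of them from the URL string alone. The model's real feature set is defined by lexical_feature_columns_binary.json and is not reproduced in this README, so the feature names below are invented for illustration, and `to_vector` only assumes that the model expects features in a fixed column order.

```python
import re
from urllib.parse import urlparse


def lexical_features(url: str) -> dict:
    """Compute a handful of illustrative lexical features from the URL text."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots": host.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        "num_hyphens": url.count("-"),
        "has_ip_host": int(bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host))),
        "uses_https": int(parsed.scheme == "https"),
        "path_depth": len([seg for seg in parsed.path.split("/") if seg]),
    }


def to_vector(features: dict, columns: list) -> list:
    """Order the features by a fixed column list before passing them to a model."""
    return [features[name] for name in columns]
```

Keeping the column order in a separate file, as the project does under artifacts/lex/, means the same ordering can be applied at training time and at prediction time.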

🛡️ Privacy & Security

✅ All processing is local - No data sent to external servers
✅ Offline capable - Fallback to rule-based detection if API is unavailable
✅ No tracking - Extension doesn't collect or store user history
✅ Open source - Transparent detection logic
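
The fallback behaviour mentioned above can be sketched as a small wrapper. The extension itself is JavaScript; the logic is shown in Python here only to illustrate the idea, and the function name, the injected callables, and the added `source` field are all hypothetical.

```python
from typing import Callable


def classify_with_fallback(
    url: str,
    ml_predict: Callable[[str], dict],
    rule_predict: Callable[[str], dict],
) -> dict:
    """Try the ML API first; if it cannot be reached, use the rule-based result."""
    try:
        result = dict(ml_predict(url))
        result["source"] = "ml"
    except OSError:
        # urllib.error.URLError, ConnectionError and TimeoutError
        # are all subclasses of OSError, so this covers an unreachable API.
        result = dict(rule_predict(url))
        result["source"] = "rules"
    return result
```

Passing the two predictors in as arguments keeps the wrapper independent of how each one is implemented and makes the fallback path easy to exercise without a running server.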

📝 License

See LICENSE file for details.

🤝 Contributing

Contributions are welcome! Areas for improvement:

Integration of the 4-class models
Whitelist expansion
Rule refinements
Performance optimization

📚 Additional Documentation

Extension README - Detailed extension setup and usage
Chrome Web Store - Official listing
