Fraud-Shield: AI-Powered Anti-Scam Line Bot

Final Project Report: Smart Anti-Scam Assistant. A Multimodal Anti-Scam Line Bot powered by Azure Cognitive Services & Google Gemini.

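Fraud-Shield is a full-featured anti-scam bot built on Python Flask and the LINE Messaging API. It uses a multimodal analysis architecture that combines a local knowledge base, a rule engine, cloud search, and generative AI to give users real-time, accurate, and cost-effective security risk assessments.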

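Key Features

🚀 Hybrid Search: Matches against the local knowledge base (Local KB) first and falls back to a Google Search web check on a miss, balancing speed and coverage.
💰 Cost-Effective Heuristics: Uses keyword weights and natural-language features to automatically filter out inspirational "chicken soup for the soul" messages and everyday greetings, cutting expensive LLM API calls by 80%.
🛡️ Prompt Injection Defense: Detects "hypnosis" instructions and role-play attacks so that the AI cannot be maliciously manipulated.
🤣 Safe Roast Mode: When a message is confirmed to be completely safe (such as an elder-style greeting image), the AI switches to roast mode to increase user engagement.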
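
The cost-saving pre-filter described above can be sketched as follows. This is a minimal illustration, not the contents of `src/utils/keywords.py`: the keyword lists, weights, the `prefilter` function name, and the choice to escalate anything that is neither clearly safe nor clearly risky are all assumptions made for the example.

```python
import re

# Hypothetical keyword weights; the project's real rules are not shown in this README.
RISK_KEYWORDS = {
    "投資": 3,      # investment
    "匯款": 4,      # wire transfer
    "保證獲利": 5,  # guaranteed profit
    "中獎": 3,      # prize winner
    "驗證碼": 4,    # verification code
}
SAFE_KEYWORDS = {
    "早安": 2,  # good morning
    "平安": 2,  # stay safe
    "感恩": 2,  # gratitude
    "祝福": 2,  # blessings
}
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+", re.IGNORECASE)


def prefilter(text: str) -> str:
    """Return 'safe' to skip cloud analysis, or 'escalate' to run it."""
    # Any URL is escalated regardless of how friendly the wording is.
    if URL_PATTERN.search(text):
        return "escalate"
    risk = sum(weight for kw, weight in RISK_KEYWORDS.items() if kw in text)
    safe = sum(weight for kw, weight in SAFE_KEYWORDS.items() if kw in text)
    # A single risk keyword is enough to escalate.
    if risk > 0:
        return "escalate"
    # Assumption: text with no signal either way is escalated, not trusted.
    return "safe" if safe > 0 else "escalate"
```

Only messages that return `"safe"` would be short-circuited to Safe Roast Mode; everything else proceeds to the paid APIs, so a false "escalate" costs money while a false "safe" costs accuracy.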

Project Structure

.
├── LICENSE                 # License file
├── README.md               # Project documentation
├── app.py                  # Application entry point (Flask server)
├── requirements.txt        # Python dependency list
├── config.ini              # Configuration file (API keys)
├── data/                   # Data layer
│   └── scam_dataset.json   # Local scam-pattern database (Local Knowledge Base)
└── src/
    ├── __init__.py         # Flask app initialization
    ├── bot/                # LINE Bot interface layer
    │   ├── __init__.py
    │   ├── handlers.py     # Handles LINE events (Message, Postback)
    │   ├── routes.py       # LINE Bot webhook route (/callback)
    │   └── templates.py    # LINE Flex Message JSON templates
    ├── integrations/       # External API integration layer
    │   ├── __init__.py
    │   ├── azure_agent.py      # Azure OpenAI / Cognitive Services
    │   ├── speech_agent.py     # [NEW] Azure Speech (STT/TTS)
    │   ├── gemini_agent.py     # Google Gemini AI (reasoning core & defense)
    │   ├── search_engine.py    # Google Custom Search (fact-checking)
    │   └── virus_total.py      # VirusTotal API (malicious link scanning)
    ├── services/           # Core business logic layer
    │   ├── __init__.py
    │   ├── orchestrator.py     # [Core] Coordinator: dispatches pre-filter, local search, cloud analysis
    │   ├── local_search.py     # Local retrieval service (BM25 + Stop Words Filter)
    │   ├── image_analyzer.py   # Image analysis (OCR)
    │   ├── text_analyzer.py    # Text semantic analysis
    │   └── url_analyzer.py     # URL features and Whois analysis
    ├── templates/          # Web Dashboard HTML
    │   └── index.html
    ├── utils/              # Shared utilities
    │   ├── __init__.py
    │   ├── keywords.py         # Heuristic Pre-filter (keyword screening rules)
    │   ├── logger.py           # Logging setup
    │   └── validators.py       # Data validation helpers
    └── web/                # Web Dashboard routes
        ├── __init__.py
        └── routes.py

System Architecture

The diagram below shows the system's layered defense mechanism: the decision flow from local pre-filtering to cloud AI.

graph TD
    %% Style definitions
    classDef safe fill:#e6fffa,stroke:#2c7a7b,stroke-width:2px;
    classDef danger fill:#fff5f5,stroke:#c53030,stroke-width:2px;
    classDef logic fill:#ebf8ff,stroke:#2b6cb0,stroke-width:2px;
    classDef external fill:#fafffd,stroke:#ed8936,stroke-width:2px,stroke-dasharray: 5 5;

    %% Flow start
    User((User Input)) --> InputHandler{Input Type?}

    %% Image path
    InputHandler -- Image --> OCR[Azure Computer Vision]:::external
    OCR --> ExtractText[Extracted Text]
    ExtractText --> LocalSearch

    %% Audio path
    InputHandler -- Audio --> STT[Azure Speech-to-Text]:::external
    STT --> ExtractText

    %% Text path
    InputHandler -- Text --> LocalSearch

    %% 1. Local knowledge base lookup (fastest)
    subgraph Phase 1: Local Knowledge Base
        LocalSearch{Local DB Hit?}:::logic
        LocalSearch -- "Yes (Score > 8.0)" --> LocalResult[Return Pre-defined Warning]:::danger
    end

    %% 2. Keyword pre-filter (cost-saving strategy)
    subgraph Phase 2: Heuristic Pre-filter
        LocalSearch -- No --> KeywordCheck{Keyword Check}:::logic
        KeywordCheck -- "Safe Vibe (e.g. inspirational quotes)" --> RoastGen[Generate Safe Roast]:::safe
        KeywordCheck -- "Risk Keywords / URL" --> CloudAnalysis
    end

    %% 3. Cloud analysis (external APIs)
    subgraph Phase 3: Cloud Analysis
        CloudAnalysis[Cloud Orchestrator]

        %% Parallel processing
        CloudAnalysis --> URLCheck[URL Analyzer]:::logic
        CloudAnalysis --> GoogleSearch[Google Custom Search]:::external
        CloudAnalysis --> Sentiment[Azure Language Sentiment]:::external

        URLCheck --> RiskScore
        GoogleSearch --> RiskScore
        Sentiment --> RiskScore
    end

    %% 4. AI deep reasoning and defense
    subgraph Phase 4: Gemini AI Logic
        RiskScore --> GeminiPrompt[Construct Prompt]
        GeminiPrompt -- "Inject Evidence & Rules" --> GeminiLLM[Google Gemini Model]:::external

        GeminiLLM --> InjectionCheck{Prompt Injection?}:::danger

        InjectionCheck -- "Yes (Attack)" --> BlockUser[🛡️ Block & Alert]:::danger
        InjectionCheck -- "No" --> FraudCheck{Is Scam?}:::logic

        FraudCheck -- "Yes (High Risk)" --> ScamAlert[🚫 Scam Warning]:::danger
        FraudCheck -- "No (False Positive)" --> Overrule[✅ Overrule to Safe]:::safe
    end

    %% Output
    LocalResult --> FinalOutput
    RoastGen --> FinalOutput
    BlockUser --> FinalOutput
    ScamAlert --> FinalOutput
    Overrule --> FinalOutput([Line Bot Reply])

    %% Link styles
    linkStyle default stroke:#333,stroke-width:2px;
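
The four phases in the diagram can be read as a short-circuiting pipeline. The sketch below shows one way such a dispatcher could look; it is not the code in `src/services/orchestrator.py`. The `orchestrate` function, its callable parameters, and the returned strings are hypothetical names chosen for illustration; only the 8.0 threshold comes from the diagram.

```python
from typing import Callable, Optional, Tuple

LOCAL_HIT_THRESHOLD = 8.0  # taken from the diagram: "Yes (Score > 8.0)"


def orchestrate(
    text: str,
    local_search: Callable[[str], Tuple[float, Optional[str]]],
    prefilter: Callable[[str], str],
    cloud_analysis: Callable[[str], dict],
    ai_verdict: Callable[[str, dict], str],
) -> str:
    """Run the layered checks in order and stop at the first decisive one."""
    # Phase 1: local knowledge base. A strong hit returns a predefined warning.
    score, warning = local_search(text)
    if score > LOCAL_HIT_THRESHOLD and warning is not None:
        return f"local_warning: {warning}"

    # Phase 2: heuristic pre-filter. Clearly harmless text never reaches the cloud.
    if prefilter(text) == "safe":
        return "safe_roast"

    # Phase 3: cloud analysis gathers evidence (URL check, search, sentiment).
    evidence = cloud_analysis(text)

    # Phase 4: the LLM makes the final call using the collected evidence.
    return ai_verdict(text, evidence)
```

Passing each phase in as a callable keeps the ordering logic testable without network access, since every external service can be replaced by a stub.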

Tasks & Functional Modules

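1. Smart Scam Verification (Verification Engine)

AI Logic Reasoning (Google Gemini)
Acts as the final arbiter and explains why a message was judged to be a scam.
Prompt Injection Defense: Implements a system-prompt defense layer that blocks malicious instruction attacks.
Hybrid Search Engine
Local KB: Uses the BM25 algorithm to match against data/scam_dataset.json, covering common scam scripts.
Cloud Search: Integrates the Google Custom Search API to query authoritative sources such as the 165 anti-fraud hotline and MyGoPen.
Heuristic Pre-filter
A rule-based filter that automatically recognizes harmless content (e.g., greetings, everyday reflections) and skips the expensive cloud analysis.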
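
To make the Local KB step concrete, here is a standard-library-only sketch of Okapi BM25 scoring. The development table lists `rank_bm25` and `jieba` for this module; this example swaps in a whitespace tokenizer and a hand-written scorer so it runs without dependencies. The `BM25` class, the stop-word list, the sample entries, and the non-negative IDF variant are assumptions for illustration, and the absolute score scale (including where a threshold such as 8.0 falls) depends on the corpus and tokenizer.

```python
import math
from collections import Counter
from typing import List, Tuple

STOP_WORDS = {"the", "a", "to", "your", "is", "and"}  # hypothetical stop-word list


def tokenize(text: str) -> List[str]:
    """Lowercase, split on whitespace, and drop stop words."""
    return [t for t in text.lower().split() if t not in STOP_WORDS]


class BM25:
    def __init__(self, docs: List[str], k1: float = 1.5, b: float = 0.75):
        if not docs:
            raise ValueError("corpus must not be empty")
        self.k1, self.b = k1, b
        self.tokens = [tokenize(d) for d in docs]
        self.tf = [Counter(toks) for toks in self.tokens]
        self.avgdl = sum(len(t) for t in self.tokens) / len(self.tokens)
        # Document frequency: number of documents containing each term.
        df = Counter(term for toks in self.tokens for term in set(toks))
        n = len(docs)
        # Non-negative IDF variant: log(1 + (N - df + 0.5) / (df + 0.5)).
        self.idf = {t: math.log(1 + (n - c + 0.5) / (c + 0.5)) for t, c in df.items()}

    def score(self, query: str, index: int) -> float:
        tf, dl = self.tf[index], len(self.tokens[index])
        total = 0.0
        for term in tokenize(query):
            f = tf.get(term, 0)
            if f == 0:
                continue
            # Term-frequency saturation with document-length normalization.
            norm = f + self.k1 * (1 - self.b + self.b * dl / self.avgdl)
            total += self.idf[term] * f * (self.k1 + 1) / norm
        return total

    def best_match(self, query: str) -> Tuple[int, float]:
        """Return (index, score) of the highest-scoring document."""
        scores = [self.score(query, i) for i in range(len(self.tokens))]
        best = max(range(len(scores)), key=scores.__getitem__)
        return best, scores[best]
```

A caller would compare the returned score against the hit threshold and fall back to cloud search when it is not exceeded.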

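2. URL and Domain Protection (URL Protection)

Domain feature detection: Applies a weighted score penalty to high-risk top-level domains such as .xyz and .top.
Typosquatting detection: Uses the Levenshtein distance algorithm to identify spoofed official URLs (e.g., g0ogle.com).
Whois activity analysis: Calls a Whois API to detect high-risk domains registered less than 30 days ago.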
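
The three URL checks above can be sketched together. This is an illustrative example rather than the contents of `src/services/url_analyzer.py`: the `url_risk_flags` function, the official-domain list, the edit-distance cutoff of 2, and the flag strings are assumptions, and the registration date is passed in directly instead of being fetched from a Whois API.

```python
from datetime import date
from typing import List, Optional
from urllib.parse import urlparse

RISKY_TLDS = {"xyz", "top"}                   # the two TLDs named in the text
OFFICIAL_DOMAINS = ["google.com", "line.me"]  # hypothetical allow-list
MAX_TYPO_DISTANCE = 2                         # hypothetical cutoff
NEW_DOMAIN_DAYS = 30                          # from the text: under 30 days


def levenshtein(a: str, b: str) -> int:
    """Edit distance via the classic two-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]


def url_risk_flags(url: str,
                   registered_on: Optional[date] = None,
                   today: Optional[date] = None) -> List[str]:
    """Return risk flags for a URL that includes its scheme (http/https)."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    flags = []
    if host.rsplit(".", 1)[-1] in RISKY_TLDS:
        flags.append("risky_tld")
    for official in OFFICIAL_DOMAINS:
        distance = levenshtein(host, official)
        # Distance 0 is the real domain; a small non-zero distance looks spoofed.
        if 0 < distance <= MAX_TYPO_DISTANCE:
            flags.append(f"typosquat:{official}")
    if registered_on is not None:
        age_days = ((today or date.today()) - registered_on).days
        if age_days < NEW_DOMAIN_DAYS:
            flags.append("new_domain")
    return flags
```

Returning named flags instead of a single number leaves the weighting decision to the caller, which can then combine them into a risk score.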

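3. Multimedia Analysis (Multimedia)

Image text recognition (OCR)
Integrates Azure Computer Vision to accurately extract investment-group chats or scam notices from images.
Voice scam detection (Azure Speech Service) (Class Topic)
Integrates the Azure Speech SDK to convert voice messages to text (STT), then automatically feeds the result into scam keyword and sentiment analysis.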

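4. Interactive Experience (UX & Engagement)

Safe Roast (humorous roast)
When the system judges a message to be 100% safe, the AI takes on a "sharp-tongued netizen" persona and offers a humorous comment.
Multilingual Support
Integrates Azure Translator to translate and detect scam messages in multiple languages.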

Working Table (Development Status)

Module / Feature	Status	Assignee	Tech Stack	Note
Line Bot Core	✅	`zcy`	Flask, Line SDK	Flex Message UI
Orchestrator	✅	`JelyF1shhhhhh`	Python Logic	Heuristic Pre-filter
Local Search	✅	`JelyF1shhhhhh`	`rank_bm25`, `jieba`	Stop Words Filter
Prompt Defense	✅	`whylin`	Gemini System Prompt	Prevents prompt injection attacks
URL Analyzer	✅	`JaniceLin`	Regex, Whois	Typosquatting Detection
Image OCR	✅	`Jay`	Azure Computer Vision	Supports handwritten/printed text
AI Reasoning	✅	`Jay`	Google Gemini Pro	Explainable AI
Search Engine	✅	`sallyday`	Google Custom Search	Fact-checking
Translation	✅	`JaniceLin`	Azure Translator	Multilingual support
Web Dashboard	✅	`zcy`	HTML/JS	System monitoring dashboard
Sentiment/NER	✅	`whylin`	Azure Language	Sentiment/entity analysis
Voice Analysis	✅	`sallyday`	Azure Speech	Speech-to-text
Poster / Slides	⬜	[Unclaimed]	Canva	Final report and presentation prep

Initial Setup

1. Python Environment

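A virtual environment is recommended to isolate package dependencies:

# 1. Create the virtual environment
python -m venv .venv

# 2. Activate it (Windows)
.\.venv\Scripts\activate
# 2. Activate it (Mac/Linux)
# source .venv/bin/activate

# 3. Install dependencies
pip install -r requirements.txt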

2. Configuration (`config.ini`)

Copy the following into config.ini and fill in your API keys:

[Line]
TOKEN = <Channel Access Token>
SECRET = <Channel Secret>

[AzureComputerVision]
KEY = <Azure CV Key>
ENDPOINT = <Azure CV Endpoint>

[GoogleCustomSearch]
CX = <Google Search Engine ID>
API_KEY = <Google Cloud API Key>

[GoogleGemini]
API_KEY = <Google Gemini API Key>

[AzureTranslator]
Key = <Translator Key>
EndPoint = <Translator Endpoint>
Region = <Region, e.g., eastus>

[AzureLanguage]
KEY = <Language Key>
ENDPOINT = <Language Endpoint>

[AzureSpeech]
KEY = <Speech Key>
REGION = <Speech Region>
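
A file in this format can be read with Python's standard `configparser`. The sketch below shows one possible startup check that fails fast when a key is missing or still holds its `<...>` placeholder. The `load_config` function and the `REQUIRED` mapping are hypothetical and not taken from the project; the README does not show how `app.py` reads the file.

```python
import configparser
from typing import Dict, List, Optional

# Hypothetical subset of required settings; extend it to match the sections above.
REQUIRED: Dict[str, List[str]] = {
    "Line": ["TOKEN", "SECRET"],
    "GoogleGemini": ["API_KEY"],
    "AzureSpeech": ["KEY", "REGION"],
}


def load_config(text: str,
                required: Optional[Dict[str, List[str]]] = None) -> configparser.ConfigParser:
    """Parse INI text and raise ValueError listing unset or placeholder keys."""
    # interpolation=None keeps '%' characters in secrets from being expanded.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    missing = []
    for section, keys in (required or REQUIRED).items():
        for key in keys:
            # fallback="" covers both a missing section and a missing option.
            value = parser.get(section, key, fallback="").strip()
            if not value or value.startswith("<"):
                missing.append(f"{section}.{key}")
    if missing:
        raise ValueError("config.ini is missing values for: " + ", ".join(missing))
    return parser
```

Option names are case-insensitive in `configparser` by default, so `Key` and `KEY` resolve to the same option, while section names such as `AzureTranslator` remain case-sensitive. To check a file on disk, pass its contents, for example `load_config(open("config.ini", encoding="utf-8").read())`.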

3. Run the Server

python app.py

The server runs on http://localhost:5002 by default. Use ngrok or Cloudflare Tunnel to expose it to the internet so that the LINE webhook can connect.

Presentation Outline (For Final Project)

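Introduction: A LINE anti-scam assistant designed for older adults.
Architecture: Show how the Orchestrator dispatches Local/Cloud/AI resources (show the Mermaid diagram).
Key Techs:

Cost-Down: How keywords.py is used to save on API costs.
Security: Demonstrate the IGNORE ALL RULES attack and the defense result.
Multimodal: Integrated use of voice (STT) and image (OCR).

Demo:

Scenario 1: Fake investment group screenshot (OCR + Search).
Scenario 2: Malicious instruction attack (AI Defense).
Scenario 3: "Good morning" elder-style greeting image (Safe Roast).
Scenario 4: Scam voice message (STT).

Future Work: Build a community reporting mechanism for users.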
