A two-stage ML classifier for network attack detection, trained on the NSL-KDD dataset (or a synthetic fallback if you don't have the dataset locally).
| Stage | Task | Model |
|---|---|---|
| 1 | Binary: attack vs. benign | Random Forest |
| 2 | Multi-class: attack category (DoS / Probing / R2L / U2R) | Gradient Boosting |
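The two-stage design can be sketched as follows. This is a minimal illustration, not the project's actual `train.py`: the feature matrix and labels here are randomly generated stand-ins for NSL-KDD records, and the `classify` helper is hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier

# Stand-in data: rows are flow records, columns are NSL-KDD-style numeric features.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 10))
is_attack = (X[:, 0] + X[:, 1] > 0).astype(int)   # Stage-1 labels: 0 = benign, 1 = attack
category = rng.integers(0, 4, size=400)           # Stage-2 labels: 0..3 = DoS/Probe/R2L/U2R

# Stage 1: binary attack-vs-benign detector (Random Forest).
stage1 = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, is_attack)

# Stage 2: attack-category classifier (Gradient Boosting), trained only on attack rows.
mask = is_attack == 1
stage2 = GradientBoostingClassifier(random_state=0).fit(X[mask], category[mask])

def classify(x):
    """Return 'benign' or the predicted attack category for one record."""
    if stage1.predict(x.reshape(1, -1))[0] == 0:
        return "benign"
    return ["DoS", "Probe", "R2L", "U2R"][stage2.predict(x.reshape(1, -1))[0]]
```

The key design point: Stage 2 is only fit on (and only consulted for) records Stage 1 flags as attacks, so it never has to learn the benign class.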
```
06_network_ids/
├── data/
│   └── README.md            # how to grab NSL-KDD
├── models/
│   ├── train.py
│   └── *.pkl                # generated
├── evaluation/
│   ├── evaluate.py
│   └── adversarial_eval.py
├── results/
├── requirements.txt
└── README.md
```
```
cd 06_network_ids
python -m venv .venv
source .venv/bin/activate         # Linux / macOS
# .venv\Scripts\Activate.ps1      # Windows PowerShell
pip install -r requirements.txt
```

Download the dataset:
```
# Linux / macOS
wget https://iscxdownloads.cs.unb.ca/iscxdownloads/NSL-KDD/NSL-KDD.zip
unzip NSL-KDD.zip -d data/

# Windows (PowerShell)
Invoke-WebRequest https://iscxdownloads.cs.unb.ca/iscxdownloads/NSL-KDD/NSL-KDD.zip -OutFile NSL-KDD.zip
Expand-Archive NSL-KDD.zip data\
```

You should now have `data/KDDTrain+.txt`.
If data/KDDTrain+.txt is not present, the training script automatically
falls back to a small synthetic NSL-KDD-shaped dataset so the project still
runs end-to-end:
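The auto-detection logic can be sketched like this. It is a simplified illustration, not the script's actual code: `load_nsl_kdd` is a hypothetical loader for the real file, and the synthetic shapes and label rule are invented for the example.

```python
import os
import numpy as np

def load_training_data(path="data/KDDTrain+.txt", seed=0):
    """Load NSL-KDD if present; otherwise return a synthetic NSL-KDD-shaped set."""
    if os.path.exists(path):
        return load_nsl_kdd(path)  # hypothetical real-data loader (not shown)
    print("warning: NSL-KDD not found, falling back to synthetic data")
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(1000, 10))                    # fake numeric features
    y = (X[:, 0] + rng.normal(scale=0.5, size=1000) > 0).astype(int)  # noisy labels
    return X, y
```

Because the fallback is deterministic given the seed, repeated smoke-test runs produce comparable numbers.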
```
python models/train.py                  # auto-detects whether NSL-KDD is present
python evaluation/evaluate.py
python evaluation/adversarial_eval.py
```

You can also force the synthetic mode explicitly:

```
python models/train.py --synthetic
```

Sample output from `adversarial_eval.py` (synthetic data):
```
Epsilon    Detection rate    Bypass rate
--------------------------------------------
0.00       89.92%            10.08%
0.50       83.13%            16.87%
1.00       76.95%            23.05%
2.00       68.52%            31.48%
```
Detection rate falls roughly linearly as feature noise grows — useful as a baseline for evasion-robustness work.
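The evaluation idea — perturb attack samples with epsilon-scaled feature noise and measure how many the detector still flags — can be sketched as below. This is a self-contained toy, assuming a simple noise-based evasion model rather than the repository's actual `adversarial_eval.py`; data and labels are synthetic.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Toy detector trained on synthetic data (1 = attack).
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 8))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

attacks = X[y == 1]
rates = []
for eps in [0.0, 0.5, 1.0, 2.0]:
    # Epsilon scales Gaussian noise added to every feature of each attack sample.
    perturbed = attacks + eps * rng.normal(size=attacks.shape)
    detected = clf.predict(perturbed).mean()   # fraction still classified as attack
    rates.append(detected)
    print(f"eps={eps:4.2f}  detection={detected:6.2%}  bypass={1 - detected:6.2%}")
```

At `eps=0` the detector sees its own training points, so detection is near-perfect; as epsilon grows, perturbed samples drift across the decision boundary and the bypass rate climbs.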
The training script uses SMOTE for class balancing when `imblearn` (the import name of the `imbalanced-learn` package) is installed. If it's not available, the script skips SMOTE and prints a warning. That is fine for a smoke test; install `imbalanced-learn` for the full pipeline.