Skip to content

yeshaswiniarjula/IDS_project

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

14 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

# SentinelNet IDS

SentinelNet IDS is a machine learning-based Intrusion Detection System built as a mini project:

**"SentinelNet — Smart Network Intrusion Detection System."**

This version implements the core modules end-to-end:

- Network traffic classification using supervised ML (SVM with PCA)

- Anomaly detection using unsupervised ML (Isolation Forest)

- Synthetic data generation using CTGAN for class balancing

- Interactive Streamlit dashboard for real-time predictions

- Trained models saved and loaded via joblib for fast inference

- KDD Cup dataset (train/test) for benchmarking

---

## Tech Stack

- Python 3.x

- Streamlit

- Scikit-learn (SVM, Isolation Forest, PCA)

- CTGAN (synthetic data generation)

- Pandas, NumPy

- Matplotlib, Seaborn

- Joblib (model persistence)

- PyArrow (parquet file support)

---

## Project Structure


IDS\_project/

├── appli.py                         # Streamlit web application

├── train\_models.py                  # Script to train SVM and Isolation Forest models

├── convert.py                       # Data conversion utility (CSV to Parquet)

├── sentinelNet.ipynb                # Main ML notebook (supervised approach)

├── Unsupervised(sentinelNet).ipynb  # Unsupervised/anomaly detection notebook

├── KDDTrain.parquet                 # Training dataset (KDD Cup)

├── KDDTest.csv                      # Test dataset (KDD Cup)

├── KDDTest.parquet                  # Test dataset (Parquet format)

├── ctgan\_synthetic\_data.csv         # CTGAN-generated synthetic samples

├── final\_attack\_predictions.csv     # Output predictions file

├── svm\_model.joblib                 # Trained SVM model

├── svm\_scaler.joblib                # Scaler for SVM input features

├── svm\_pca.joblib                   # PCA transformer for SVM

├── iso\_model.joblib                 # Trained Isolation Forest model

├── iso\_scaler.joblib                # Scaler for Isolation Forest

├── label\_encoders.joblib            # Label encoders for categorical features

├── requirements.txt                 # Python dependencies

├── runtime.txt                      # Python runtime version

└── .streamlit/                      # Streamlit configuration

---

## Run Locally

\# 1. Clone the repository

git clone https://github.com/yeshaswiniarjula/IDS\_project.git

cd IDS\_project



\# 2. Create a virtual environment

python3 -m venv .venv

source .venv/bin/activate        # On Windows: .venv\\Scripts\\activate



\# 3. Install dependencies

pip install -r requirements.txt



\# 4. Run the Streamlit app

streamlit run appli.py

Then open http://localhost:8501 in your browser.

---

## Train Models (Optional)

If you want to retrain the models from scratch:

python3 train\_models.py

This will regenerate the following saved model files:

- svm\_model.joblib

- svm\_scaler.joblib

- svm\_pca.joblib

- iso\_model.joblib

- iso\_scaler.joblib

- label\_encoders.joblib

Once models exist, the Streamlit app auto-loads them for predictions.

---

## ML Models

### Supervised — SVM Classifier

- Algorithm: Support Vector Machine (SVM)

- Preprocessing: Label Encoding → Standard Scaling → PCA

- Task: Multi-class attack type classification

- Dataset: KDD Cup 99 (Train/Test)

### Unsupervised — Isolation Forest

- Algorithm: Isolation Forest

- Task: Anomaly detection (normal vs. attack traffic)

- Used when labeled data is unavailable

### Synthetic Data — CTGAN

- Used to generate synthetic network traffic samples

- Helps balance minority attack classes in training data

---

## Dataset

This project uses the **KDD Cup 1999** dataset — a standard benchmark for intrusion detection research.

- KDDTrain.parquet — Training split

- KDDTest.csv / KDDTest.parquet — Test split

Features include network connection attributes such as protocol type, service, flag, byte counts, and more.

---

## About

SentinelNet IDS is a machine learning-based intrusion detection system that identifies malicious network activity using both supervised (SVM) and unsupervised (Isolation Forest) approaches, with a Streamlit-based interactive interface for visualization and prediction.

About

SentinelNet IDS is a machine learning-based intrusion detection system

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors