
# SentinelNet IDS

SentinelNet IDS is a machine learning-based Intrusion Detection System built as a mini project:

**"SentinelNet — Smart Network Intrusion Detection System."**

This version implements the core modules end-to-end:

- Network traffic classification using supervised ML (SVM with PCA)

- Anomaly detection using unsupervised ML (Isolation Forest)

- Synthetic data generation using CTGAN for class balancing

- Interactive Streamlit dashboard for real-time predictions

- Trained models saved and loaded via joblib for fast inference

- KDD Cup dataset (train/test) for benchmarking

---

## Tech Stack

- Python 3.x

- Streamlit

- Scikit-learn (SVM, Isolation Forest, PCA)

- CTGAN (synthetic data generation)

- Pandas, NumPy

- Matplotlib, Seaborn

- Joblib (model persistence)

- PyArrow (parquet file support)

---

## Project Structure


    IDS_project/
    ├── appli.py                         # Streamlit web application
    ├── train_models.py                  # Script to train SVM and Isolation Forest models
    ├── convert.py                       # Data conversion utility (CSV to Parquet)
    ├── sentinelNet.ipynb                # Main ML notebook (supervised approach)
    ├── Unsupervised(sentinelNet).ipynb  # Unsupervised/anomaly detection notebook
    ├── KDDTrain.parquet                 # Training dataset (KDD Cup)
    ├── KDDTest.csv                      # Test dataset (KDD Cup)
    ├── KDDTest.parquet                  # Test dataset (Parquet format)
    ├── ctgan_synthetic_data.csv         # CTGAN-generated synthetic samples
    ├── final_attack_predictions.csv     # Output predictions file
    ├── svm_model.joblib                 # Trained SVM model
    ├── svm_scaler.joblib                # Scaler for SVM input features
    ├── svm_pca.joblib                   # PCA transformer for SVM
    ├── iso_model.joblib                 # Trained Isolation Forest model
    ├── iso_scaler.joblib                # Scaler for Isolation Forest
    ├── label_encoders.joblib            # Label encoders for categorical features
    ├── requirements.txt                 # Python dependencies
    ├── runtime.txt                      # Python runtime version
    └── .streamlit/                      # Streamlit configuration

---

## Run Locally

    # 1. Clone the repository
    git clone https://github.com/yeshaswiniarjula/IDS_project.git
    cd IDS_project

    # 2. Create a virtual environment
    python3 -m venv .venv
    source .venv/bin/activate        # On Windows: .venv\Scripts\activate

    # 3. Install dependencies
    pip install -r requirements.txt

    # 4. Run the Streamlit app
    streamlit run appli.py

Then open http://localhost:8501 in your browser.

---

## Train Models (Optional)

If you want to retrain the models from scratch:

    python3 train_models.py

This will regenerate the following saved model files:

- svm\_model.joblib

- svm\_scaler.joblib

- svm\_pca.joblib

- iso\_model.joblib

- iso\_scaler.joblib

- label\_encoders.joblib

Once models exist, the Streamlit app auto-loads them for predictions.
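
The save-and-reload cycle described above can be sketched roughly as follows. This is an illustration only, not the contents of `train_models.py`: the toy data, the hyperparameters, and the temporary output directory are assumptions, and only the file names are taken from the list above.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Toy stand-in for the encoded KDD features (assumption: 200 rows, 10 numeric columns).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Fit the three SVM-side artifacts in the order they are applied at inference time.
scaler = StandardScaler().fit(X)
pca = PCA(n_components=5).fit(scaler.transform(X))
svm = SVC().fit(pca.transform(scaler.transform(X)), y)

# Persist each artifact under the file names used in this repository.
out_dir = Path(tempfile.mkdtemp())
joblib.dump(scaler, out_dir / "svm_scaler.joblib")
joblib.dump(pca, out_dir / "svm_pca.joblib")
joblib.dump(svm, out_dir / "svm_model.joblib")

# Reload the artifacts and predict, as an app could do at startup.
loaded_scaler = joblib.load(out_dir / "svm_scaler.joblib")
loaded_pca = joblib.load(out_dir / "svm_pca.joblib")
loaded_svm = joblib.load(out_dir / "svm_model.joblib")
reloaded_pred = loaded_svm.predict(loaded_pca.transform(loaded_scaler.transform(X)))
original_pred = svm.predict(pca.transform(scaler.transform(X)))
```

Saving the scaler and PCA alongside the classifier matters because new inputs must pass through exactly the transforms that were fitted on the training data.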

---

## ML Models

### Supervised — SVM Classifier

- Algorithm: Support Vector Machine (SVM)

- Preprocessing: Label Encoding → Standard Scaling → PCA

- Task: Multi-class attack type classification

- Dataset: KDD Cup 1999 (train/test splits)
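
The preprocessing chain listed above (label encoding, standard scaling, PCA, then SVM) might look like the sketch below in scikit-learn. The toy traffic, the class names `dos`, `probe`, and `normal`, the RBF kernel, and the number of PCA components are all assumptions made for illustration; the project's actual settings live in its notebooks and training script.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(42)
n = 300

# Hypothetical toy traffic: one categorical column and two numeric columns.
protocol = rng.choice(["tcp", "udp", "icmp"], size=n)
src_bytes = rng.exponential(scale=500.0, size=n)
dst_bytes = rng.exponential(scale=300.0, size=n)

# Hypothetical three-class labels derived from the features so the task is learnable.
labels = np.where(src_bytes > 800, "dos", np.where(protocol == "icmp", "probe", "normal"))

# Step 1: label-encode the categorical feature and the target.
protocol_encoder = LabelEncoder()
target_encoder = LabelEncoder()
X = np.column_stack([protocol_encoder.fit_transform(protocol), src_bytes, dst_bytes])
y = target_encoder.fit_transform(labels)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Steps 2-4: standard scaling, PCA, then a multi-class SVM.
model = make_pipeline(StandardScaler(), PCA(n_components=2), SVC(kernel="rbf"))
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
# Map integer predictions back to readable class names.
predicted_names = target_encoder.inverse_transform(model.predict(X_test))
print(f"toy accuracy: {accuracy:.2f}")
```

Wrapping the scaler, PCA, and SVM in one pipeline keeps the fitted transforms tied to the classifier, so the test split is never used when fitting the scaler or PCA.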

### Unsupervised — Isolation Forest

- Algorithm: Isolation Forest

- Task: Anomaly detection (normal vs. attack traffic)

- Used when labeled data is unavailable
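
As a rough sketch of the unsupervised path, an Isolation Forest can be fitted on scaled features without any labels and then asked to flag outliers. The toy data and the `contamination` value below are assumptions, not settings taken from this project.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(7)

# Hypothetical toy data: a dense cluster of "normal" connections plus a few far-away rows.
normal_traffic = rng.normal(loc=0.0, scale=1.0, size=(500, 4))
odd_traffic = rng.normal(loc=8.0, scale=1.0, size=(10, 4))
X = np.vstack([normal_traffic, odd_traffic])

# Scale the features, then fit without any labels.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# contamination is an assumed guess at the share of anomalies, not a project value.
iso = IsolationForest(n_estimators=100, contamination=0.02, random_state=0)
iso.fit(X_scaled)

# predict() returns +1 for inliers and -1 for anomalies.
flags = iso.predict(X_scaled)
is_attack = flags == -1
print(f"flagged {is_attack.sum()} of {len(X)} rows as anomalous")
```

The model only separates "unusual" from "usual" rows, so its output is a binary normal-versus-anomaly flag rather than an attack type.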

### Synthetic Data — CTGAN

- Used to generate synthetic network traffic samples

- Helps balance minority attack classes in training data
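
To make the balancing goal concrete, the sketch below counts how many synthetic rows each class would need to match the largest class. The helper `synthetic_rows_needed` and the toy label list are hypothetical; generating the rows themselves is left to CTGAN and is not shown here.

```python
from collections import Counter


def synthetic_rows_needed(labels, target=None):
    """Return how many extra rows each class needs to reach `target` rows.

    If `target` is None, the size of the largest class is used.
    """
    counts = Counter(labels)
    goal = max(counts.values()) if target is None else target
    return {cls: max(goal - n, 0) for cls, n in counts.items()}


# Hypothetical label column with a heavy imbalance.
labels = ["normal"] * 6 + ["dos"] * 3 + ["u2r"] * 1
deficit = synthetic_rows_needed(labels)
print(deficit)  # {'normal': 0, 'dos': 3, 'u2r': 5}
```

Classes already at or above the target get a deficit of zero, so only minority classes receive synthetic samples.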

---

## Dataset

This project uses the **KDD Cup 1999** dataset — a standard benchmark for intrusion detection research.

- KDDTrain.parquet — Training split

- KDDTest.csv / KDDTest.parquet — Test split

Features include network connection attributes such as protocol type, service, flag, byte counts, and more.
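
Categorical attributes such as protocol type, service, and flag need to be turned into numbers before scaling. A minimal sketch, assuming KDD-style column names and a few made-up rows, fits one `LabelEncoder` per column and keeps the encoders in a dictionary so the same mapping can be reapplied later:

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical rows using KDD-style column names; the values are illustrative only.
df = pd.DataFrame(
    {
        "protocol_type": ["tcp", "udp", "tcp", "icmp"],
        "service": ["http", "domain_u", "ftp", "ecr_i"],
        "flag": ["SF", "SF", "S0", "SF"],
        "src_bytes": [181, 105, 0, 1032],
        "dst_bytes": [5450, 146, 0, 0],
    }
)

# Fit one encoder per categorical column and keep it for encoding new rows identically.
categorical_columns = ["protocol_type", "service", "flag"]
encoders = {}
for column in categorical_columns:
    encoder = LabelEncoder()
    df[column] = encoder.fit_transform(df[column])
    encoders[column] = encoder

# LabelEncoder assigns integer codes in sorted order of the class names.
print(df)
print(list(encoders["protocol_type"].classes_))
```

A dictionary of fitted encoders like this is one plausible shape for a file such as `label_encoders.joblib`, though the repository's actual format is not confirmed here.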

---

## About

SentinelNet IDS is a machine learning-based intrusion detection system that identifies malicious network activity using both supervised (SVM) and unsupervised (Isolation Forest) approaches, with a Streamlit-based interactive interface for visualization and prediction.
