This project focuses on detecting phishing (spam) emails using Natural Language Processing (NLP) and Deep Learning (LSTM). The system analyzes email text and classifies it as Safe Email or Phishing Email.
- ✅ Detect phishing emails from text input
- ✅ Batch prediction using CSV file
- ✅ ZIP folder upload (multiple emails)
- ✅ Deep learning model (LSTM) for better accuracy
- ✅ User-friendly interface using Streamlit
- Python
- NLP (Text preprocessing, Tokenization)
- TensorFlow / Keras (LSTM Model)
- Streamlit (Web App)
- Pandas, NumPy
project/
│
├── app.py              # Streamlit app
├── utils.py            # Prediction function
├── model/
│   ├── model.keras     # Trained LSTM model
│   ├── tokenizer.pkl   # Tokenizer
│   └── config.json     # Max length
│
├── dataset/
│   └── emails.csv
│
└── README.md
- Safe Emails: 11,322
- Phishing Emails: 7,328
- Total Emails: 18,650
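
The counts above imply a moderately imbalanced dataset. The short sketch below works out each class's share and, as an illustration only, one common inverse-frequency weighting. The weighting is an assumption for the example; the project does not state that it uses class weights during training.

```python
# Class counts taken from the dataset summary above.
counts = {"Safe Email": 11_322, "Phishing Email": 7_328}
total = sum(counts.values())

# Share of each class in the dataset.
shares = {label: n / total for label, n in counts.items()}

# Inverse-frequency weights: total / (num_classes * class_count).
# Assumption: illustrative heuristic, not the project's documented setup.
weights = {label: total / (len(counts) * n) for label, n in counts.items()}

for label in counts:
    print(f"{label}: {shares[label]:.1%} of data, weight {weights[label]:.2f}")
# Safe Email: 60.7% of data, weight 0.82
# Phishing Email: 39.3% of data, weight 1.27
```

Roughly 61% of the emails are safe, so a model that always answers "Safe Email" would already reach about 61% accuracy; that is a useful baseline to keep in mind when reading accuracy figures.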
git clone https://github.com/RajiReddy15/Email_phishing
cd project
pip install -r requirements.txt
streamlit run app.py

- Enter email text
- Click Check
- Get prediction (Spam / Safe)
- Upload CSV file
- Select text column
- Get predictions for all rows
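
The CSV steps above can be sketched as follows. This is a minimal, dependency-free illustration using the standard `csv` module; the app itself may well use Pandas for this. `predict_email` and `predict_csv` are hypothetical names, and the keyword rule inside `predict_email` is only a stand-in so the sketch runs without the trained LSTM model.

```python
import csv
import io


def predict_email(text: str) -> str:
    """Hypothetical stand-in for the project's prediction function.

    A simple keyword rule replaces the trained model so the sketch is runnable.
    """
    suspicious = ("free prize", "verify your account", "click here")
    lowered = text.lower()
    return "Phishing Email" if any(k in lowered for k in suspicious) else "Safe Email"


def predict_csv(csv_text: str, text_column: str) -> list[dict]:
    """Return every CSV row with an added 'prediction' field."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames is None or text_column not in reader.fieldnames:
        raise ValueError(f"Column {text_column!r} not found in CSV")
    rows = []
    for row in reader:
        # Treat a missing cell as empty text instead of failing.
        row["prediction"] = predict_email(row[text_column] or "")
        rows.append(row)
    return rows


sample = "id,body\n1,Congratulations! You won a free prize.\n2,Meeting moved to 3 pm.\n"
results = predict_csv(sample, text_column="body")
for r in results:
    print(r["id"], r["prediction"])
```

Validating the selected column before looping gives the user a clear error when the wrong column name is chosen.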
- Upload ZIP containing .txt emails
- Get predictions for each file
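
A rough sketch of the ZIP flow above, using only the standard `zipfile` module. `predict_zip` is a hypothetical helper, and `predict_email` is again a keyword-based stand-in for the real model so that the example runs on its own. Decoding as UTF-8 with replacement characters is an assumption about how the uploaded files are encoded.

```python
import io
import zipfile


def predict_email(text: str) -> str:
    """Hypothetical stand-in for the trained model (keyword rule only)."""
    suspicious = ("free prize", "verify your account", "click here")
    lowered = text.lower()
    return "Phishing Email" if any(k in lowered for k in suspicious) else "Safe Email"


def predict_zip(zip_bytes: bytes) -> dict[str, str]:
    """Classify every .txt member of a ZIP archive, keyed by file name."""
    predictions = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            # Skip directories and anything that is not a .txt file.
            if info.is_dir() or not info.filename.lower().endswith(".txt"):
                continue
            text = archive.read(info).decode("utf-8", errors="replace")
            predictions[info.filename] = predict_email(text)
    return predictions


# Build a small archive in memory to exercise the function.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("a.txt", "Congratulations! You won a free prize.")
    archive.writestr("b.txt", "Lunch at noon?")
    archive.writestr("notes.md", "not an email, ignored")

zip_predictions = predict_zip(buffer.getvalue())
for name, label in zip_predictions.items():
    print(name, "->", label)
```

Reading from an in-memory `BytesIO` mirrors how an uploaded file typically arrives in a web app, without writing anything to disk.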
Text → Tokenization → Padding → Embedding → LSTM → Dense → Output
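
To make the first two stages of this pipeline concrete, here is a minimal pure-Python sketch of tokenization and padding. It is not the project's code: the project stores a fitted tokenizer in tokenizer.pkl and a maximum length in config.json, whereas `build_vocab` and `encode` below are hypothetical helpers with an assumed word pattern. The choice to pad and truncate on the left is also an assumption; the README does not say which side the project uses.

```python
import re

WORD_PATTERN = re.compile(r"[a-z0-9']+")  # assumed tokenization rule


def build_vocab(texts):
    """Map each word to an integer id, starting at 1 (0 is reserved for padding)."""
    vocab = {}
    for text in texts:
        for word in WORD_PATTERN.findall(text.lower()):
            vocab.setdefault(word, len(vocab) + 1)
    return vocab


def encode(text, vocab, max_len):
    """Turn text into a fixed-length list of ids, left-padded with zeros."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    # Words that were never seen when building the vocabulary are dropped.
    ids = [vocab[w] for w in WORD_PATTERN.findall(text.lower()) if w in vocab]
    ids = ids[-max_len:]  # keep only the last max_len tokens
    return [0] * (max_len - len(ids)) + ids


corpus = ["Congratulations! You won a free prize.", "Meeting moved to 3 pm."]
vocab = build_vocab(corpus)

print(encode("You won a prize", vocab, max_len=6))   # [0, 0, 2, 3, 4, 6]
print(encode("Congratulations! You won a free prize.", vocab, max_len=4))  # [3, 4, 5, 6]
```

The fixed-length integer sequences produced here are what the Embedding layer consumes; the LSTM and Dense layers then turn the embedded sequence into the final prediction.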
Input:
Congratulations! You won a free prize.
Output:
Phishing Email
This project demonstrates how NLP and an LSTM-based deep learning model can be used to detect phishing emails. Automating this classification can support email security and reduce the manual effort of reviewing suspicious messages.
- Use BERT/Transformers for higher accuracy
- Multilingual phishing detection
- Real-time email filtering system
- Integration with email services
- Pardhu
- Mahesh
- Raji Reddy
This project is for academic purposes.