Twitter Sentiment Analysis

Real-time tweet sentiment classifier using NLP and ML.

Overview

Classifies tweets as positive or negative using a pipeline of text preprocessing and ML models trained on 1.6 million tweets (Sentiment140 dataset).

Results

Model	Train Accuracy	Test Accuracy	Train/Test Gap (percentage points)
Logistic Regression	79.87%	77.67%	2.2
Bernoulli Naive Bayes	81.45%	76.48%	5.0
LinearSVC	86.23%	76.97%	9.3

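Logistic Regression was selected as the final model. It has the highest test accuracy (77.67%) and the smallest train/test gap (2.2 percentage points, versus 9.3 for LinearSVC), which indicates the least overfitting and better generalization to unseen data.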
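The gap figures are train accuracy minus test accuracy. A minimal sketch that reproduces them from the numbers in the table; the "smallest gap wins" rule is only an illustration of the selection described here, not code taken from the notebook.

```python
# Accuracies (in percent) copied from the results table: (train, test).
results = {
    "Logistic Regression": (79.87, 77.67),
    "Bernoulli Naive Bayes": (81.45, 76.48),
    "LinearSVC": (86.23, 76.97),
}

# Gap = train accuracy - test accuracy, in percentage points.
gaps = {name: round(train - test, 1) for name, (train, test) in results.items()}

# Illustrative selection rule: pick the model with the smallest gap.
best = min(gaps, key=gaps.get)

for name, gap in gaps.items():
    print(f"{name}: gap = {gap:.1f} points")
print("Selected:", best)
```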

Live Prediction Demo

=======================================================
Tweet                                    Sentiment
=======================================================
I love this product, it works amazingly  Positive 😊
This is the worst experience I have ever Negative 😞
Just got promoted at work, so happy righ Positive 😊
My phone broke and customer service was  Negative 😞

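Note: The model struggles with negation; for example, "I am not happy" is classified as Positive. This is a known limitation of bag-of-words TF-IDF features, which discard word order, and stopword removal can make it worse if the stopword list includes negation words such as "not". A possible future improvement is a context-aware model (LSTM, BERT) that takes word order into account.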
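To illustrate why word order matters, here is a small sketch using only the standard library. The two sentences are invented for the example: they contain the same words and so produce identical bag-of-words counts, even though they mean opposite things. Adjacent word pairs (bigrams) keep some local order; this is shown only as a possible mitigation and is not something the project is known to use.

```python
from collections import Counter

def bag_of_words(text):
    """Return unordered token counts, as a bag-of-words model sees the text."""
    return Counter(text.lower().split())

def bigrams(text):
    """Return adjacent word pairs, which keep a little local word order."""
    tokens = text.lower().split()
    return list(zip(tokens, tokens[1:]))

a = "I am happy not sad"
b = "I am sad not happy"

# Same word counts, opposite meaning: the two bags are indistinguishable.
print(bag_of_words(a) == bag_of_words(b))

# Bigrams differ: ("not", "sad") appears only in the first sentence.
print(("not", "sad") in bigrams(a), ("not", "sad") in bigrams(b))
```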

Pipeline

Raw Tweet → Clean Text → Tokenize → Remove Stopwords → Stem → TF-IDF Vectors → ML Model → Sentiment Label
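The stages above could be wired together roughly as follows. This is a sketch under stated assumptions, not the notebook's code: the `clean_tweet` helper, its regular expressions, and the toy tweets are hypothetical; scikit-learn's built-in English stopword list stands in for whatever list the notebook uses; and the stemming step is left out to keep the example short (a stemmer such as NLTK's `PorterStemmer` could be applied in a custom tokenizer).

```python
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def clean_tweet(text):
    """Hypothetical cleaner: lowercase, drop URLs, @mentions and non-letters."""
    text = text.lower()
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # URLs
    text = re.sub(r"@\w+", " ", text)                    # user mentions
    text = re.sub(r"[^a-z\s]", " ", text)                # digits, punctuation, emoji
    return re.sub(r"\s+", " ", text).strip()

# Toy training data invented for illustration (1 = positive, 0 = negative).
tweets = [
    "I love this phone, it is great!",
    "What a wonderful day, so happy",
    "Love the new update, great work @devteam",
    "This is awful, I hate it",
    "Worst service ever, so sad http://example.com",
    "I hate waiting, awful experience",
]
labels = [1, 1, 1, 0, 0, 0]

# Clean -> tokenize -> remove stopwords -> TF-IDF -> classifier.
model = make_pipeline(
    TfidfVectorizer(preprocessor=clean_tweet, stop_words="english"),
    LogisticRegression(max_iter=1000),
)
model.fit(tweets, labels)

predictions = model.predict(["I love it", "this is awful"])
print(predictions)
```

Passing the cleaner as `preprocessor` keeps cleaning and vectorization in one object, so the same steps run at training and prediction time.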

Tech Stack

Python, Scikit-learn, NLTK, Pandas, NumPy, Matplotlib, Seaborn, Swifter

Dataset

Sentiment140 — 1.6M tweets (800K positive, 800K negative), sourced from Kaggle.
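A minimal sketch of reading rows in the Sentiment140 layout. It assumes the commonly distributed CSV format: no header row, six columns (target, id, date, flag, user, text), and raw targets coded 0 for negative and 4 for positive. These details should be checked against the downloaded file, which may also need a non-UTF-8 encoding such as latin-1 when opened from disk. The two rows are made up.

```python
import csv
import io

# Two made-up rows in the assumed layout: target, id, date, flag, user, text.
sample = io.StringIO(
    '"0","1","Mon Apr 06 22:19:45 PDT 2009","NO_QUERY","user_a","my phone broke"\n'
    '"4","2","Mon Apr 06 22:19:49 PDT 2009","NO_QUERY","user_b","so happy today"\n'
)

# Assumed coding of the raw target column.
LABELS = {"0": "Negative", "4": "Positive"}

# Keep only the label and the tweet text; the other columns are ignored.
rows = [
    (LABELS[target], text)
    for target, _id, _date, _flag, _user, text in csv.reader(sample)
]

for label, text in rows:
    print(f"{label}: {text}")
```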

Project Structure

twitter-sentiment-analysis/
├── notebooks/
│   └── TSA.ipynb
├── README.md
└── requirements.txt

Run

Open notebooks/TSA.ipynb in Google Colab or Jupyter and run all cells.

Kaggle API credentials (kaggle.json) are required to download the dataset. See the Kaggle API docs for setup.
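As a rough sketch of the credential setup, the helper below copies a downloaded kaggle.json into a `.kaggle` folder under the home directory, the location the Kaggle API client conventionally reads, and restricts the file to its owner. The function name and the `home` parameter are hypothetical and exist only to make the example easy to try; the Kaggle API docs remain the authoritative reference.

```python
import os
import shutil
from pathlib import Path

def install_kaggle_credentials(src, home=None):
    """Copy kaggle.json into <home>/.kaggle and restrict its permissions."""
    home = Path(home) if home is not None else Path.home()
    target_dir = home / ".kaggle"
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / "kaggle.json"
    shutil.copyfile(src, target)
    os.chmod(target, 0o600)  # owner read/write only
    return target
```

For example, `install_kaggle_credentials("kaggle.json")` would place the file under the current user's home directory. In Google Colab the file has to be uploaded to the session first.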
