This project performs sentiment analysis on IMDB movie reviews using both traditional machine learning (Logistic Regression) and deep learning (LSTM) approaches.

The repository contains the following files:

- data_preprocessing.py: Preprocesses the raw IMDB dataset
- data_analysis1.py: Implements the Logistic Regression model
- data_analysis2.py: Implements the LSTM model
- data_visualization.py: Creates visualizations of the results
- app.py: Flask application for real-time sentiment prediction
- model1.pickle: Saved Logistic Regression model
- model2.pth: Saved LSTM model
- vectorizer.pickle: Saved TF-IDF vectorizer
- vocab.npy: Vocabulary for the LSTM model
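
The saved Logistic Regression artifacts listed above could be reused for prediction roughly as sketched below. This is a minimal sketch, not the code in app.py. It assumes that vectorizer.pickle and model1.pickle were written with Python's `pickle` module and hold scikit-learn-style objects exposing `transform` and `predict`. The helper names `load_pickle` and `predict_sentiment` are hypothetical, and the label encoding of the predictions is not stated in this README.

```python
import pickle
from pathlib import Path


def load_pickle(path):
    """Load a single pickled object from disk (hypothetical helper).

    Only unpickle files you trust: unpickling can execute arbitrary code.
    """
    with Path(path).open("rb") as f:
        return pickle.load(f)


def predict_sentiment(texts, vectorizer, model):
    """Vectorize raw review strings and return the model's predictions as a list.

    Assumes `vectorizer.transform` and `model.predict` exist, as they do on
    scikit-learn vectorizers and classifiers.
    """
    features = vectorizer.transform(texts)
    return list(model.predict(features))


# Example usage, assuming the artifacts are in the current directory:
# vectorizer = load_pickle("vectorizer.pickle")
# model = load_pickle("model1.pickle")
# print(predict_sentiment(["A moving, beautifully acted film."], vectorizer, model))
```

The vectorizer must be the same fitted object that was used during training, which is presumably why it is saved alongside the model.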

To set up and run the project:

- Clone this repository
- Install required packages: pip install -r requirements.txt
- Download the IMDB dataset from Kaggle and rename it to "imdb_dataset.csv"
- Place the dataset in the project root directory
- Run data preprocessing: python data_preprocessing.py
- Train and evaluate models: python data_analysis1.py, then python data_analysis2.py
- Visualize results: python data_visualization.py
- Run the Flask app: python app.py
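
The preprocessing, training, and visualization steps above can also be chained in a small driver script. The following is a sketch under stated assumptions, not part of the repository: the names `PIPELINE`, `build_commands`, and `run_pipeline` are hypothetical, and it assumes each script runs without arguments from the project root. app.py is left out because a Flask server blocks until it is stopped.

```python
import subprocess
import sys
from pathlib import Path

# Script order taken from the steps listed in this README.
PIPELINE = [
    "data_preprocessing.py",
    "data_analysis1.py",
    "data_analysis2.py",
    "data_visualization.py",
]


def build_commands(scripts=PIPELINE, python=sys.executable):
    """Return one argument list per script, in pipeline order."""
    return [[python, script] for script in scripts]


def run_pipeline(root=".", dataset="imdb_dataset.csv"):
    """Check that the dataset is present, then run each script in order.

    Stops at the first script that exits with a non-zero status.
    """
    root = Path(root)
    if not (root / dataset).is_file():
        raise FileNotFoundError(
            f"{dataset} not found in {root.resolve()}; download it first"
        )
    for cmd in build_commands():
        subprocess.run(cmd, cwd=root, check=True)


# Example usage from the project root:
# run_pipeline()
```

Checking for the dataset up front gives a clear error message before any script starts, and `check=True` keeps a failed step from silently feeding stale outputs into the next one.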

Authors: Shilong Luo, Yuxuan Liu

This project is licensed under the MIT License - see the LICENSE.md file for details.