This project combines two functionalities: predicting video game sales based on various features and recommending similar games based on user-selected titles. It leverages machine learning and natural language processing techniques.
- Objective: Predict global sales (`Global_Sales`) of video games using features such as platform, genre, publisher, critic score, and user score.
- Techniques Used:
  - Data preprocessing (handling missing values, standardization, one-hot encoding)
  - Linear regression model
- Key Libraries: `pandas`, `scikit-learn`
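The three preprocessing steps listed above can be sketched with scikit-learn as follows. This is a minimal illustration, not the project's actual code: the tiny DataFrame is made up, the column names `Critic_Score` and `User_Score` are assumptions about the dataset, and the imputation strategies (median, most frequent) are one reasonable choice among several.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Tiny made-up frame; the values are illustrative, not taken from Video_Games.csv.
df = pd.DataFrame({
    "Platform": ["Wii", "PS4", "Wii", "X360"],
    "Genre": ["Sports", "Action", "Racing", "Action"],
    "Critic_Score": [76.0, np.nan, 82.0, 90.0],  # assumed column name
    "User_Score": [8.0, 7.5, np.nan, 6.5],       # assumed column name
})

numeric_cols = ["Critic_Score", "User_Score"]
categorical_cols = ["Platform", "Genre"]

# Numeric columns: fill missing values with the median, then standardize.
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Categorical columns: fill missing values with the mode, then one-hot encode.
categorical_steps = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("onehot", OneHotEncoder(handle_unknown="ignore")),
])

preprocessor = ColumnTransformer([
    ("num", numeric_steps, numeric_cols),
    ("cat", categorical_steps, categorical_cols),
])

X = preprocessor.fit_transform(df)
print(X.shape)  # 2 scaled numeric columns + 3 platform + 3 genre indicator columns
```

Setting `handle_unknown="ignore"` keeps the encoder from failing when a category appears at prediction time that was not present during fitting.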
- Objective: Suggest similar video games based on genre, platform, and publisher.
- Techniques Used:
  - TF-IDF (Term Frequency-Inverse Document Frequency) vectorization
  - Cosine similarity for recommendation
- Key Libraries: `pandas`, `scikit-learn`
- Clone this repository: `git clone https://github.com/bukkybyte/video_games_machine_learning.git`
- Navigate to the project directory: `cd video_games_machine_learning`
- Install required libraries: `pip install -r requirements.txt`
- The prediction pipeline preprocesses numeric and categorical data using a `Pipeline` and a `ColumnTransformer`.
- To train and evaluate the model:
  - Run the provided notebook (`video_games.ipynb`), which splits the data into training and testing sets, trains the model, and evaluates it using Mean Squared Error (MSE) and Root Mean Squared Error (RMSE).
- Output:
  - MSE and RMSE values for the test set.
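The train-and-evaluate flow described above might look roughly like the sketch below. It runs on synthetic data so that it is self-contained; the generated target, the `Critic_Score` and `User_Score` column names, the 80/20 split, and the `random_state` value are all assumptions for illustration rather than the notebook's actual settings.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in for Video_Games.csv so the sketch runs on its own.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "Platform": rng.choice(["Wii", "PS4", "X360"], size=n),
    "Genre": rng.choice(["Action", "Sports", "Racing"], size=n),
    "Critic_Score": rng.uniform(40, 100, size=n),  # assumed column name
    "User_Score": rng.uniform(2, 10, size=n),      # assumed column name
})
# Made-up target: sales loosely driven by critic score plus noise.
df["Global_Sales"] = 0.05 * df["Critic_Score"] + rng.normal(0, 0.5, size=n)

X = df.drop(columns="Global_Sales")
y = df["Global_Sales"]

# Scale numeric features and one-hot encode categorical ones.
preprocessor = ColumnTransformer([
    ("num", StandardScaler(), ["Critic_Score", "User_Score"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["Platform", "Genre"]),
])

# Chain preprocessing and the regressor so both are fit on training data only.
model = Pipeline([
    ("preprocess", preprocessor),
    ("regress", LinearRegression()),
])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
model.fit(X_train, y_train)

# RMSE is the square root of MSE, so it is in the same units as Global_Sales.
mse = mean_squared_error(y_test, model.predict(X_test))
rmse = np.sqrt(mse)
print(f"Mean Squared Error (MSE): {mse:.5f}")
print(f"Root Mean Squared Error (RMSE): {rmse:.5f}")
```

Wrapping the preprocessing inside the `Pipeline` means the scaler and encoder never see the test rows during fitting, which avoids leaking test-set statistics into the model.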
- Preprocess the data to create a `combined_features` column using `Genre`, `Platform`, and `Publisher`.
- Use TF-IDF and cosine similarity to compute similarity between games.
- Run the recommendation system:
  - Pass a game title to the `get_recommendations` function to receive a list of 10 similar games.
  - Example: `get_recommendations("Game Title")`
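One way the recommendation steps above could fit together is sketched below. The rows are made up, the `Name` column and the `top_n` parameter are assumptions, and this `get_recommendations` is an illustrative stand-in that may differ from the function in the notebook.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Made-up rows for illustration; the real data would come from Video_Games.csv.
games = pd.DataFrame({
    "Name": ["Kart Racer", "Kart Racer 2", "Space Shooter", "Soccer Stars"],
    "Genre": ["Racing", "Racing", "Shooter", "Sports"],
    "Platform": ["Wii", "Wii", "PS4", "PS4"],
    "Publisher": ["Nintendo", "Nintendo", "Activision", "Electronic Arts"],
})

# Join the three text fields into one string per game.
games["combined_features"] = (
    games["Genre"] + " " + games["Platform"] + " " + games["Publisher"]
)

# Vectorize the combined text and compute all pairwise cosine similarities.
tfidf_matrix = TfidfVectorizer().fit_transform(games["combined_features"])
similarity = cosine_similarity(tfidf_matrix)


def get_recommendations(title, top_n=10):
    """Return up to top_n game names most similar to the given title."""
    matches = games.index[games["Name"] == title]
    if len(matches) == 0:
        return []  # unknown title: nothing to recommend
    idx = matches[0]
    # Drop the query game itself, then rank the rest by similarity.
    scores = pd.Series(similarity[idx], index=games.index).drop(idx)
    top = scores.sort_values(ascending=False).head(top_n)
    return games.loc[top.index, "Name"].tolist()


print(get_recommendations("Kart Racer", top_n=2))
```

Because "Kart Racer" and "Kart Racer 2" share genre, platform, and publisher in this toy data, the sequel ranks first. On the full dataset the dense similarity matrix grows quadratically with the number of games, so computing one row at a time may be preferable.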
- Sample output after training:

      Mean Squared Error (MSE): 0.12345
      Root Mean Squared Error (RMSE): 0.35135
- Sample recommendations for `Super Mario`:

      Recommendations for Super Mario:
      1. Mario Kart
      2. Super Mario Bros
      3. Mario Party
      4. Super Smash Bros
      5. Yoshi's Island
      ...
- `Video_Games.csv`: Dataset used for prediction and recommendation.
- `video_games.ipynb`: Main notebook to run both functionalities.
- `requirements.txt`: List of required Python packages.
Contributions are welcome! If you'd like to improve this project, follow these steps:
- Fork the repository.
- Create a feature branch (`git checkout -b feature-name`).
- Commit your changes (`git commit -m "Add feature-name"`).
- Push to the branch (`git push origin feature-name`).
- Open a Pull Request.
This project is licensed under the MIT License. See the LICENSE file for details.
- The dataset is sourced from Kaggle.
- Special thanks to contributors and the open-source community.