By Shaokang Xie, Jiazhi Sun
This folder contains scripts for downloading and processing the Steam games dataset.
- Install the required Python libraries:
    python -m pip install -r data_processing/requirements.txt
- Download the raw dataset (saves to data_processing/raw/):
    python data_processing/download_py.py
- Preprocess the raw CSV into cleaned outputs:
    python data_processing/preprocess.py \
        --input data_processing/raw/games.csv \
        --json_reviews data_processing/raw/games.json \
        --out data_processing/steam_project_ready.csv \
        --agg_out /tmp/steam_project_agg.csv \
        --sample_out /tmp/steam_project_sample.csv
- Generate a slimmed JSON for the frontend:
    python data_processing/make_slim_json.py
- Plot the analysis figures:
    python data_processing/steam_analysis_slide_figure.py

The steps above read or produce the following files:
- data_processing/raw/games.csv and data_processing/raw/games.json (raw dataset from Kaggle)
- data_processing/steam_project_ready.csv (cleaned dataset for slide figures)
- frontend/public/games_slim.json (frontend-ready slim JSON)
- data_processing/images/genre_value_bar.png (analysis figure used in slides)
- data_processing/images/peak_ccu_by_price_band.png (analysis figure used in slides)
- data_processing/images/price_trend_composite.png (analysis figure used in slides)
- data_processing/images/price_vs_popularity.png (analysis figure used in slides)
- data_processing/images/spiral_plot.png (analysis figure used in slides)
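
As a quick sanity check after running the steps, the short sketch below reports which of the files named above are still missing. It is not part of the repository: the names `EXPECTED_FILES` and `missing_files` are made up for illustration, and it assumes it is run from the repository root.

```python
from pathlib import Path

# Paths copied from this README, relative to the repository root.
EXPECTED_FILES = [
    "data_processing/raw/games.csv",
    "data_processing/raw/games.json",
    "data_processing/steam_project_ready.csv",
    "frontend/public/games_slim.json",
    "data_processing/images/genre_value_bar.png",
    "data_processing/images/peak_ccu_by_price_band.png",
    "data_processing/images/price_trend_composite.png",
    "data_processing/images/price_vs_popularity.png",
    "data_processing/images/spiral_plot.png",
]


def missing_files(root="."):
    """Return the expected files that do not exist under `root` (hypothetical helper)."""
    root = Path(root)
    return [p for p in EXPECTED_FILES if not (root / p).is_file()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing files:")
        for path in missing:
            print("  " + path)
    else:
        print("All expected files are present.")
```

If frontend/public/games_slim.json shows up as missing, the make_slim_json.py step has probably not been run yet.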

To install dependencies and start the frontend:
    cd frontend
    npm install
    npm start
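
The Python data-processing commands in this README can also be chained from a single script. The sketch below is one possible way to do that and is not shipped with the project: `STEPS`, `run_pipeline`, and the `--run` flag are hypothetical names, and the commands are assumed to be run from the repository root. By default it only prints the commands; pass `--run` to execute them.

```python
import subprocess
import sys

# Commands copied from this README, in order.
# sys.executable stands in for `python` so the current interpreter is reused.
STEPS = [
    [sys.executable, "-m", "pip", "install", "-r", "data_processing/requirements.txt"],
    [sys.executable, "data_processing/download_py.py"],
    [sys.executable, "data_processing/preprocess.py",
     "--input", "data_processing/raw/games.csv",
     "--json_reviews", "data_processing/raw/games.json",
     "--out", "data_processing/steam_project_ready.csv",
     "--agg_out", "/tmp/steam_project_agg.csv",
     "--sample_out", "/tmp/steam_project_sample.csv"],
    [sys.executable, "data_processing/make_slim_json.py"],
    [sys.executable, "data_processing/steam_analysis_slide_figure.py"],
]


def run_pipeline(steps=STEPS, dry_run=True):
    """Run each step in order and stop at the first failure.

    With dry_run=True (the default) the commands are only printed.
    """
    for cmd in steps:
        print("$ " + " ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical flag: nothing is executed unless --run is given.
    run_pipeline(dry_run="--run" not in sys.argv[1:])
```

Stopping at the first failing step avoids plotting figures from a stale or partial steam_project_ready.csv.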