Analysis of Spotify chart data, combining chart rankings with audio features to explore music trends.
| Folder | Dataset |
|---|---|
| `data/kaggle1/` | Spotify Charts: daily/weekly top-200 and viral-50 charts by country |
| `data/kaggle2/` | Spotify Dataset 1921-2020 (600k+ Tracks): track audio features and artist metadata |
| `data/european_countries.csv` | List of European countries used to filter the charts |
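
As a rough illustration of how the country list could be used to filter the charts, here is a minimal sketch using only the standard library. It reads the chart file row by row, so the whole file never has to fit in memory. The column names `country` and `region` are assumptions, not taken from this repository, and the helper names are hypothetical; check the CSV headers and adjust them.

```python
import csv


def load_regions(path, column="country"):
    """Read the set of country names from a CSV file.

    The column name "country" is an assumption; adjust it to the real header.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        return {row[column].strip() for row in csv.DictReader(handle)}


def filter_chart_rows(chart_path, regions, region_column="region"):
    """Yield chart rows whose region is in `regions`, one row at a time.

    The column name "region" is also an assumption about charts.csv.
    """
    with open(chart_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            if row[region_column] in regions:
                yield row
```

For example, `filter_chart_rows("data/kaggle1/charts.csv", load_regions("data/european_countries.csv"))` would yield only the rows for the listed countries, provided the assumed column names match.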
Note: To run the notebooks you must first download `charts.csv` from the Spotify Charts dataset on Kaggle and place it in `data/kaggle1/`. The file is ~3.2 GB, which exceeds GitHub's file-size limit, so it is not included in this repository.
Note: You must also download `tracks.csv` and `artists.csv` from the Spotify Dataset 1921-2020 (600k+ Tracks) dataset on Kaggle and place them in `data/kaggle2/`. These files exceed GitHub's file-size limit, so they are not included in this repository.
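
Before opening the notebooks, it can help to confirm that the downloaded files are where they are expected. The following is a small sketch, not part of the repository; the function name `missing_data_files` is hypothetical, and the paths are the ones named in the table and notes above.

```python
from pathlib import Path

# Paths taken from the table and notes above, relative to the repository root.
REQUIRED_FILES = [
    "data/kaggle1/charts.csv",
    "data/kaggle2/tracks.csv",
    "data/kaggle2/artists.csv",
    "data/european_countries.csv",
]


def missing_data_files(repo_root="."):
    """Return the required data files that are not present under repo_root."""
    root = Path(repo_root)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

Calling `missing_data_files()` from the repository root returns an empty list when everything is in place, and otherwise the list of paths that still need to be downloaded.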
The notebooks in the `notebooks/` directory are meant to be run in the numerical order of their file-name prefixes (e.g., `00_`, `01_`, `02_`), because each notebook may depend on outputs generated by the previous ones.
Start with the lowest-numbered notebook and proceed sequentially.
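
To see the intended order at a glance, a sketch like the one below lists the notebooks sorted by their numeric prefix. The function name `ordered_notebooks` is hypothetical, and the code assumes the prefix is a run of digits followed by an underscore, as in the examples above.

```python
import re
from pathlib import Path


def ordered_notebooks(directory="notebooks"):
    """Return notebooks sorted by the integer prefix of their file names."""
    numbered = []
    for path in Path(directory).glob("*.ipynb"):
        match = re.match(r"(\d+)_", path.name)
        if match:  # skip notebooks without a numeric prefix
            numbered.append((int(match.group(1)), path.name, path))
    return [path for _, _, path in sorted(numbered)]
```

Sorting on the integer value rather than the raw string keeps `10_` after `02_` even if the prefixes are not zero-padded. One option for running the notebooks non-interactively is to pass each path in turn to `jupyter nbconvert --to notebook --execute`.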
- Clone the repository.
- Create a virtual environment (recommended).
- Install the required Python packages using the `requirements.txt` file: `pip install -r requirements.txt`