Extract, Transform, Load (ETL) is one of the most important tools in a data scientist's toolkit. It lets datasets of all shapes, types, and sizes flow into a single data warehouse, where they can be used together. This project builds a simple ETL pipeline so the reader can grasp the underlying concepts and start imagining how to adapt ETL to their own work.
Reference: https://blog.det.life/build-an-etl-data-pipeline-using-python-139c6875b046
Dataset: https://www.kaggle.com/datasets/unitednations/global-food-agriculture-statistics
- Download the dataset and use the fao_data_crops_data.csv file for this demo.
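
To make the three ETL stages concrete, here is a minimal sketch using only the Python standard library. It is not the pipeline from the linked tutorial. The function names, the `crops` table name, the SQLite target standing in for a data warehouse, and the assumption that the CSV has a numeric column called `value` are all illustrative choices, so check them against the actual file before relying on them.

```python
import csv
import sqlite3


def extract(csv_path):
    """Extract: read every CSV row into a dict keyed by header name."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def transform(rows, value_column="value"):
    """Transform: keep rows whose value_column parses as a number.

    The column name "value" is an assumption about the dataset.
    """
    cleaned = []
    for row in rows:
        try:
            number = float(row[value_column])
        except (KeyError, TypeError, ValueError):
            continue  # skip blanks, footnote rows, and non-numeric values
        # Drop the None key that DictReader adds for overlong rows.
        record = {k: v for k, v in row.items() if k is not None}
        record[value_column] = number
        cleaned.append(record)
    return cleaned


def load(rows, db_path, table="crops"):
    """Load: write rows into a SQLite table, replacing any earlier copy."""
    if not rows:
        return 0
    columns = list(rows[0].keys())
    col_defs = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(f'DROP TABLE IF EXISTS "{table}"')
            conn.execute(f'CREATE TABLE "{table}" ({col_defs})')
            conn.executemany(
                f'INSERT INTO "{table}" VALUES ({placeholders})',
                [tuple(r[c] for c in columns) for r in rows],
            )
    finally:
        conn.close()
    return len(rows)


def run_pipeline(csv_path, db_path):
    """Run extract, transform, and load in order; return rows loaded."""
    return load(transform(extract(csv_path)), db_path)
```

With the dataset downloaded, something like `run_pipeline("fao_data_crops_data.csv", "warehouse.db")` would populate the table, where `warehouse.db` is a hypothetical output path. Keeping each stage as its own function makes it easy to swap the source, the cleaning rules, or the destination independently.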