This project implements a Dockerized data system for storing environmental sensor data in batches. It was developed as part of the Data Engineering portfolio project (DLBDSEDE02) at IU International University of Applied Sciences.
The goal is to design and deploy a portable, scalable data storage solution that ingests environmental sensor data (temperature, humidity, CO2, etc.) in batches and makes it accessible to future front-end and analytical applications.
- Source: Kaggle IoT Telemetry Dataset
- Type: CSV with temperature, humidity, and other sensor readings
- Database: MongoDB (Docker containerized)
- Programming Language: Python
- Tools: Docker, Docker Compose, PyMongo
To run the project locally:
```bash
# Clone the repository
git clone https://github.com/fatimagulomova/sensor-batch-storage-system.git
cd sensor-batch-storage-system

# Build and run the containers
docker-compose up --build
```
Once the containers are up, open Mongo Express in your browser at http://localhost:8081 and log in with:

- Username: admin
- Password: admin123
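The Mongo Express interface and the credentials above are defined in the Compose configuration. A minimal sketch of what the docker-compose.yml might contain is shown below; the service names, image tags, and the app service definition are assumptions for illustration, not the project's exact file:

```yaml
services:
  mongodb:
    image: mongo:6
    ports:
      - "27017:27017"

  mongo-express:
    image: mongo-express
    ports:
      - "8081:8081"                            # web UI exposed on localhost:8081
    environment:
      ME_CONFIG_MONGODB_SERVER: mongodb        # reach MongoDB by service name
      ME_CONFIG_BASICAUTH_USERNAME: admin      # matches the login above
      ME_CONFIG_BASICAUTH_PASSWORD: admin123
    depends_on:
      - mongodb

  app:
    build: .                                   # Python environment from the Dockerfile
    depends_on:
      - mongodb
```

Keeping MongoDB, Mongo Express, and the app in one Compose file means a single `docker-compose up --build` brings up the whole stack on a shared network where services address each other by name.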
```
sensor-batch-storage-system/
│
├── data/
│   ├── iot_telemetry_data.csv        # Raw dataset
│   └── sample_cleaned_sensors.csv    # Cleaned and reduced version
│
├── scripts/
│   ├── clean_csv.py                  # Cleans and filters raw sensor data
│   └── load_data.py                  # Loads cleaned data into MongoDB
│
├── Dockerfile                        # Environment setup for Python
├── docker-compose.yml                # MongoDB + Mongo Express + app service
└── README.md                         # Project description
```

- clean_csv.py processes the raw CSV and generates a cleaned version
- load_data.py connects to the running MongoDB container and loads sample_cleaned_sensors.csv into the sensor_data collection
- Mongo Express provides a web interface to verify data loading
Fotimakhon Gulomova – LinkedIn