MuhammadYasir85a/Probabilistic-Weather-Prediction


Overview

This statistical machine learning project forecasts rain using conditional probability and a Naive Bayes classifier. Weather prediction is a complex and widely used application of statistical modeling in atmospheric science. The project shows how a probabilistic model can estimate the chance of rainfall from relationships among atmospheric variables.

The model is trained on real-world weather data and estimates the likelihood of rain from four atmospheric measurements: temperature, humidity, air pressure, and wind speed.


Key Features

  • Statistical machine learning model based on Naive Bayes classification
  • Conditional probability calculations for rainfall prediction
  • Analysis of four atmospheric variables (temperature, humidity, pressure, wind speed)
  • Complete data preprocessing pipeline
  • Exploratory data analysis (EDA) with statistical insights
  • Rich data visualizations using Matplotlib and Seaborn
  • Comprehensive project report documenting methodology and findings
  • Reproducible Python implementation

Tech Stack

Programming Language:

  • Python 3.8+

Data Science Libraries:

  • NumPy — Efficient numerical computations
  • Pandas — Data manipulation and analysis
  • SciPy — Advanced statistical operations and probability modeling
  • Matplotlib — Static visualizations
  • Seaborn — Statistical data visualization

Machine Learning:

  • Naive Bayes Classification
  • Conditional Probability Models
  • Statistical Inference

Documentation:

  • Microsoft Word for project report
  • Markdown for repository documentation

Project Structure

Probabilistic-Weather-Prediction/
│
├── DATASET.csv          # Real-world weather dataset
├── Python code          # Main implementation script
├── Report.docx          # Comprehensive project report
└── README.md            # Project documentation

Methodology

1. Data Collection

The project uses a real-world weather dataset containing historical atmospheric measurements including temperature, humidity, air pressure, and wind speed.

2. Data Preprocessing

  • Handling missing values
  • Feature scaling and normalization
  • Encoding categorical variables (rain / no rain)
  • Splitting data into training and testing sets
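
The preprocessing steps above could be sketched roughly as follows. This is a minimal illustration, not the repository's actual script: the column names (`temperature`, `humidity`, `pressure`, `wind_speed`, `rain`), the median imputation, the 75/25 split, and the `preprocess` helper are all assumptions, and a small synthetic frame stands in for DATASET.csv because its schema is not documented here.

```python
import numpy as np
import pandas as pd

# Hypothetical column names; the real DATASET.csv schema may differ.
FEATURES = ["temperature", "humidity", "pressure", "wind_speed"]

def preprocess(df, test_fraction=0.25, seed=0):
    """Impute missing values, encode the label, split, and standardize."""
    df = df.copy()
    # Fill missing feature values with the column median (for brevity this
    # uses the whole frame; a stricter pipeline would use training rows only).
    df[FEATURES] = df[FEATURES].fillna(df[FEATURES].median())
    # Encode the categorical label: "rain" -> 1, "no rain" -> 0.
    y = (df["rain"].str.strip().str.lower() == "rain").astype(int).to_numpy()
    X = df[FEATURES].to_numpy(dtype=float)

    # Shuffle row indices and hold out a fraction for testing.
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(df))
    n_test = int(round(test_fraction * len(df)))
    test_idx, train_idx = order[:n_test], order[n_test:]

    # Standardize with training statistics only, to avoid test-set leakage.
    mean = X[train_idx].mean(axis=0)
    std = X[train_idx].std(axis=0)
    std[std == 0] = 1.0
    X_train = (X[train_idx] - mean) / std
    X_test = (X[test_idx] - mean) / std
    return X_train, y[train_idx], X_test, y[test_idx]

# Tiny synthetic frame standing in for the real dataset.
demo = pd.DataFrame({
    "temperature": [30.0, 22.0, np.nan, 18.0, 25.0, 20.0, 28.0, 19.0],
    "humidity": [40.0, 85.0, 90.0, 95.0, 50.0, 88.0, 45.0, 92.0],
    "pressure": [1015.0, 1002.0, 1000.0, 998.0, 1012.0, 1001.0, 1014.0, 999.0],
    "wind_speed": [5.0, 12.0, 15.0, np.nan, 6.0, 11.0, 4.0, 14.0],
    "rain": ["no rain", "rain", "rain", "rain", "no rain", "rain", "no rain", "rain"],
})
X_train, y_train, X_test, y_test = preprocess(demo)
print(X_train.shape, X_test.shape)
```

With the real data, `demo` would be replaced by `pd.read_csv("DATASET.csv")` and `FEATURES` adjusted to the actual column names.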

3. Exploratory Data Analysis

  • Statistical summary of all variables
  • Distribution analysis of atmospheric features
  • Correlation analysis between variables and rainfall
  • Visualization of patterns and trends
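
As a rough illustration of the analysis listed above, the sketch below computes a statistical summary, per-class feature means, and each feature's correlation with rainfall. The data is synthetic and the column names are assumptions; rain is made to depend on humidity purely so the output has something to show.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 200
# Synthetic stand-in data; column names and distributions are assumptions.
humidity = rng.uniform(30, 100, n)
df = pd.DataFrame({
    "temperature": rng.normal(24, 5, n),
    "humidity": humidity,
    "pressure": rng.normal(1008, 6, n),
    "wind_speed": rng.gamma(2.0, 3.0, n),
    # Rain is made more likely at high humidity, for illustration only.
    "rain": (humidity + rng.normal(0, 10, n) > 75).astype(int),
})

summary = df.describe()               # count, mean, std, quartiles per column
by_class = df.groupby("rain").mean()  # feature means for no rain (0) vs rain (1)
# Pearson correlation of each feature with the binary rain label.
corr_with_rain = df.corr()["rain"].drop("rain").sort_values(ascending=False)

print(summary.round(2))
print(by_class.round(2))
print(corr_with_rain.round(2))
```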

4. Model Implementation

  • Application of Naive Bayes classification algorithm
  • Conditional probability calculations using Bayes' Theorem
  • Probability estimation for rainfall events
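
One way to implement this step is a Gaussian Naive Bayes classifier, sketched below with NumPy only. The README does not say which Naive Bayes variant the project uses, so the Gaussian likelihood is an assumption, as are the `GaussianNaiveBayes` class name and the synthetic feature values.

```python
import numpy as np

class GaussianNaiveBayes:
    """Minimal Gaussian Naive Bayes: one normal distribution per feature per class."""

    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        self.classes_ = np.unique(y)
        # Class priors P(class), stored as logs.
        self.log_prior_ = np.log(np.array([np.mean(y == c) for c in self.classes_]))
        # Per-class, per-feature mean and variance, each of shape (k, d).
        self.mean_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        # A small floor keeps every variance strictly positive.
        self.var_ = np.array([X[y == c].var(axis=0) for c in self.classes_]) + 1e-9
        return self

    def _joint_log_likelihood(self, X):
        X = np.asarray(X, dtype=float)[:, None, :]  # shape (n, 1, d)
        # Log of the normal density for every sample, class, and feature.
        log_pdf = -0.5 * (np.log(2 * np.pi * self.var_) + (X - self.mean_) ** 2 / self.var_)
        # Naive assumption: sum the per-feature log-likelihoods, then add the log prior.
        return self.log_prior_ + log_pdf.sum(axis=2)  # shape (n, k)

    def predict_proba(self, X):
        jll = self._joint_log_likelihood(X)
        jll = jll - jll.max(axis=1, keepdims=True)  # subtract the max for numerical stability
        p = np.exp(jll)
        return p / p.sum(axis=1, keepdims=True)     # normalize, i.e. divide by the evidence

    def predict(self, X):
        return self.classes_[np.argmax(self._joint_log_likelihood(X), axis=1)]

# Synthetic data: columns are temperature, humidity, pressure, wind speed.
rng = np.random.default_rng(0)
dry = rng.normal([28, 45, 1014, 6], [3, 8, 3, 2], size=(100, 4))
wet = rng.normal([21, 88, 1001, 12], [3, 6, 3, 3], size=(100, 4))
X = np.vstack([dry, wet])
y = np.array([0] * 100 + [1] * 100)

model = GaussianNaiveBayes().fit(X, y)
proba = model.predict_proba([[22.0, 90.0, 1000.0, 11.0]])
print(f"P(rain | features) = {proba[0, 1]:.3f}")
```

Working in log space avoids underflow when several small likelihoods are multiplied. scikit-learn's `GaussianNB` offers the same model with a similar `fit` / `predict_proba` interface.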

5. Visualization

  • Histograms showing variable distributions
  • Scatter plots for variable relationships
  • Heatmaps for correlation analysis
  • Bar charts for prediction results
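
A minimal sketch of three of these plot types, using Matplotlib on synthetic data. The column names, the synthetic relationships, and the output filename `weather_plots.png` are assumptions; the heatmap is drawn with `imshow` here, and Seaborn's `heatmap` function is a common alternative.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in data; column names are assumptions.
rng = np.random.default_rng(1)
n = 300
humidity = rng.uniform(30, 100, n)
df = pd.DataFrame({
    "temperature": rng.normal(24, 5, n),
    "humidity": humidity,
    "pressure": 1020 - 0.2 * humidity + rng.normal(0, 3, n),
    "wind_speed": rng.gamma(2.0, 3.0, n),
    "rain": (humidity + rng.normal(0, 10, n) > 75).astype(int),
})

fig, axes = plt.subplots(1, 3, figsize=(15, 4))

# Histogram: humidity distribution, split by class.
for label, name in [(0, "no rain"), (1, "rain")]:
    axes[0].hist(df.loc[df["rain"] == label, "humidity"], bins=20, alpha=0.6, label=name)
axes[0].set(title="Humidity distribution", xlabel="humidity (%)", ylabel="count")
axes[0].legend()

# Scatter plot: relationship between two variables, colored by class.
axes[1].scatter(df["humidity"], df["pressure"], c=df["rain"], cmap="coolwarm", s=12)
axes[1].set(title="Humidity vs pressure", xlabel="humidity (%)", ylabel="pressure")

# Heatmap: pairwise correlation matrix of all columns.
corr = df.corr()
im = axes[2].imshow(corr.to_numpy(), cmap="coolwarm", vmin=-1, vmax=1)
axes[2].set_xticks(range(len(corr.columns)))
axes[2].set_xticklabels(corr.columns, rotation=45, ha="right")
axes[2].set_yticks(range(len(corr.columns)))
axes[2].set_yticklabels(corr.columns)
axes[2].set_title("Correlation heatmap")
fig.colorbar(im, ax=axes[2])

fig.tight_layout()
fig.savefig("weather_plots.png", dpi=120)
```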

6. Evaluation

  • Assessment of model performance
  • Analysis of probability-based predictions
  • Documentation of insights in the project report
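
The README does not name the metrics used, so the sketch below shows one plausible evaluation for a probabilistic rain classifier: accuracy, precision, and recall at a 0.5 threshold, plus the Brier score, which scores the predicted probabilities themselves. The `evaluate` helper and the example values are hypothetical.

```python
import numpy as np

def evaluate(y_true, p_rain, threshold=0.5):
    """Score predicted rain probabilities against 0/1 outcomes."""
    y_true = np.asarray(y_true, dtype=int)
    p_rain = np.asarray(p_rain, dtype=float)
    y_pred = (p_rain >= threshold).astype(int)

    tp = int(np.sum((y_pred == 1) & (y_true == 1)))
    tn = int(np.sum((y_pred == 0) & (y_true == 0)))
    fp = int(np.sum((y_pred == 1) & (y_true == 0)))
    fn = int(np.sum((y_pred == 0) & (y_true == 1)))

    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        # Brier score: mean squared error of the probabilities (lower is better).
        "brier": float(np.mean((p_rain - y_true) ** 2)),
        # Rows are true classes (no rain, rain); columns are predicted classes.
        "confusion": [[tn, fp], [fn, tp]],
    }

# Hypothetical outcomes and predicted probabilities for eight days.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
p_rain = [0.9, 0.2, 0.6, 0.4, 0.1, 0.7, 0.8, 0.3]
metrics = evaluate(y_true, p_rain)
print(metrics)
```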

Atmospheric Variables Analyzed

| Variable | Description | Role in Prediction |
|---|---|---|
| Temperature | Air temperature in degrees | Influences moisture capacity |
| Humidity | Relative humidity percentage | Direct indicator of moisture |
| Air Pressure | Atmospheric pressure measurement | Indicates weather system changes |
| Wind Speed | Wind velocity measurement | Affects weather pattern movement |

Mathematical Foundation

The project applies Bayes' Theorem for conditional probability:

P(Rain | Weather Features) = [P(Weather Features | Rain) × P(Rain)] / P(Weather Features)

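This formula gives the probability of rain given specific atmospheric conditions and forms the foundation of the Naive Bayes classifier used in this project. The "naive" part is the additional assumption that the features are conditionally independent given the class, so P(Weather Features | Rain) factorizes into a product of per-feature likelihoods.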
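
A small worked example may make the formula concrete. The counts below are hypothetical and use a single binary feature ("humid" or not) rather than the project's four measurements.

```python
# Hypothetical counts from 100 observed days (not taken from DATASET.csv).
days_total = 100
days_rain = 30
humid_and_rain = 24  # humid days among the 30 rainy days
humid_and_dry = 14   # humid days among the 70 dry days

p_rain = days_rain / days_total                               # P(Rain)
p_humid_given_rain = humid_and_rain / days_rain               # P(Humid | Rain)
p_humid_given_dry = humid_and_dry / (days_total - days_rain)  # P(Humid | No Rain)

# Law of total probability gives the evidence term P(Humid).
p_humid = p_humid_given_rain * p_rain + p_humid_given_dry * (1 - p_rain)

# Bayes' Theorem: P(Rain | Humid) = P(Humid | Rain) * P(Rain) / P(Humid).
p_rain_given_humid = p_humid_given_rain * p_rain / p_humid
print(f"P(Rain | Humid) = {p_rain_given_humid:.3f}")
```

The result matches the direct count: 24 of the 38 humid days were rainy.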


Installation and Setup

Prerequisites

  • Python 3.8 or higher
  • pip package manager

Setup Steps

git clone https://github.com/MuhammadYasir85a/Probabilistic-Weather-Prediction.git
cd Probabilistic-Weather-Prediction

Install required dependencies:

pip install numpy pandas scipy matplotlib seaborn scikit-learn jupyter

Run the analysis:

python "Python code"

Or open in Jupyter Notebook for interactive exploration:

jupyter notebook

Usage

  1. Load the dataset (DATASET.csv) into the Python environment
  2. Run the preprocessing steps
  3. Execute the Naive Bayes model training
  4. View statistical visualizations
  5. Generate predictions on new weather data
  6. Refer to Report.docx for detailed methodology and results

Visualizations Included

  • Distribution plots for each atmospheric variable
  • Correlation heatmap showing relationships
  • Box plots comparing rain vs no-rain conditions
  • Scatter plots for variable interactions
  • Probability distribution charts
  • Prediction confidence visualizations

Use Cases

  • Educational tool for understanding probabilistic ML
  • Foundation for weather forecasting research
  • Demonstration of Bayesian inference in real-world applications
  • Reference implementation for atmospheric data analysis
  • Starting point for advanced meteorological models

Future Improvements

  • Integration of additional atmospheric variables (cloud cover, dew point)
  • Comparison with other ML algorithms (Random Forest, Neural Networks)
  • Real-time weather data API integration
  • Multi-class classification (light rain, moderate rain, heavy rain)
  • Time series forecasting using LSTM
  • Web-based prediction interface
  • Geographic location-specific models

Performance Notes

The project applies statistical machine learning to real-world weather data. Naive Bayes gives an interpretable baseline, but it treats the features as conditionally independent given the class, so models that capture feature interactions and non-linear relationships in atmospheric data may achieve higher accuracy.


References

  • Bayes' Theorem and Conditional Probability
  • Naive Bayes Classifier theory and applications
  • Probabilistic meteorology research papers
  • SciPy and Scikit-learn documentation

Project Status

Status: Completed

This is an academic project completed as part of statistical learning coursework. The implementation, dataset, and report are all available in the repository for review and learning purposes.


Author

Muhammad Yasir

Computer Science Undergraduate at Namal University Mianwali
Aspiring AI and Computer Vision Engineer


Acknowledgments

  • Namal University Mianwali for academic guidance
  • Open-source data science community for tools and resources
  • Researchers in probabilistic meteorology for foundational work
  • Python data science ecosystem (NumPy, Pandas, SciPy, Matplotlib, Seaborn)

License

This project is licensed under the MIT License.

