lucas-levy/compstats

Computational Statistics

Labs from the Master MVA Computational Statistics class, taught by Prof. Stéphanie Allassonnière. Each lab assignment explores a core topic in statistical estimation and Bayesian inference, combining theoretical derivations with practical Python simulations.

TP1. Estimators and SGD Algorithm

In the first exercise, we compare two estimators for the uniform model $\mathcal{U}([0,\theta])$: the method-of-moments estimator and the maximum likelihood estimator (MLE). The MLE is shown to have a strictly lower quadratic risk for $n\ge 2$.
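The comparison can be checked numerically with a short Monte Carlo experiment. The sketch below is not the lab's code: it assumes the method-of-moments estimator is twice the sample mean and the MLE is the uncorrected sample maximum, and the function name `empirical_risks` and its default values are illustrative.

```python
import numpy as np

def empirical_risks(theta=1.0, n=20, n_rep=20000, seed=0):
    """Monte Carlo estimate of the quadratic risk of both estimators."""
    rng = np.random.default_rng(seed)
    # n_rep independent samples of size n from U([0, theta])
    x = rng.uniform(0.0, theta, size=(n_rep, n))
    theta_mom = 2.0 * x.mean(axis=1)  # method of moments: E[X] = theta / 2
    theta_mle = x.max(axis=1)         # MLE (assumed uncorrected): largest observation
    risk_mom = np.mean((theta_mom - theta) ** 2)
    risk_mle = np.mean((theta_mle - theta) ** 2)
    return risk_mom, risk_mle

risk_mom, risk_mle = empirical_risks()
print(f"MoM risk: {risk_mom:.5f}, MLE risk: {risk_mle:.5f}")
```

Rerunning with different values of `n` shows how the gap between the two risks changes with the sample size.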

In the second part, we implement the Stochastic Gradient Descent (SGD) algorithm from scratch to learn a linear classifier, then study how observation noise degrades estimation quality.
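A minimal sketch of SGD for a linear classifier is given below. The README does not state which loss the lab uses, so the logistic loss, the labels in {-1, +1}, the decaying step size `lr0 / sqrt(t)`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def sgd_linear_classifier(X, y, n_epochs=20, lr0=0.5, seed=0):
    """SGD on the logistic loss log(1 + exp(-y * <w, x>)), labels in {-1, +1}."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    t = 0
    for _ in range(n_epochs):
        for i in rng.permutation(n):  # one pass over the data in random order
            t += 1
            margin = y[i] * (X[i] @ w)
            # Gradient of the per-sample logistic loss with respect to w
            grad = -y[i] * X[i] / (1.0 + np.exp(margin))
            w -= lr0 / np.sqrt(t) * grad  # decaying step size (assumed schedule)
    return w

# Synthetic data: labels come from a noisy linear rule (noise level is illustrative)
rng = np.random.default_rng(1)
w_true = np.array([2.0, -1.0])
X = rng.normal(size=(500, 2))
y = np.sign(X @ w_true + 0.1 * rng.normal(size=500))

w_hat = sgd_linear_classifier(X, y)
accuracy = np.mean(np.sign(X @ w_hat) == y)
print(f"training accuracy: {accuracy:.3f}")
```

Increasing the noise scale in the label-generating line is one simple way to observe how noisier observations affect the recovered direction `w_hat`.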

[Figure: visualization_tp1]

Finally, the SGD classifier is applied to the UCI Heart Disease dataset, reaching over 70% accuracy.

TP2. EM Algorithm for GMMs

This second lab focuses on estimating the parameters of a Gaussian Mixture Model (GMM) with the Expectation-Maximization (EM) algorithm. We first implement a sampler for a GMM, then implement the EM algorithm for this model.
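Both steps can be sketched in a few lines for the one-dimensional case. This is an illustrative sketch rather than the lab's implementation: the function names `sample_gmm` and `em_gmm_1d`, the quantile-based initialization, and the fixed iteration count are assumptions.

```python
import numpy as np

def sample_gmm(n, weights, means, stds, rng):
    """Draw n points from a 1D GMM: pick a component, then sample from it."""
    means, stds = np.asarray(means), np.asarray(stds)
    z = rng.choice(len(weights), size=n, p=weights)
    return rng.normal(means[z], stds[z]), z

def em_gmm_1d(x, k, n_iter=200):
    """EM for a 1D GMM with k components."""
    weights = np.full(k, 1.0 / k)
    means = np.quantile(x, (np.arange(k) + 0.5) / k)  # assumed initialization
    stds = np.full(k, x.std())
    for _ in range(n_iter):
        # E-step: responsibilities (the shared -0.5*log(2*pi) term cancels)
        log_p = (np.log(weights) - np.log(stds)
                 - 0.5 * ((x[:, None] - means) / stds) ** 2)
        log_p -= log_p.max(axis=1, keepdims=True)  # numerical stability
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: weighted updates of the parameters
        nk = resp.sum(axis=0)
        weights = nk / len(x)
        means = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk
        stds = np.sqrt(np.maximum(var, 1e-12))  # guard against collapse
    return weights, means, stds

rng = np.random.default_rng(0)
x, _ = sample_gmm(2000, [0.4, 0.6], [-3.0, 3.0], [1.0, 0.5], rng)
weights, means, stds = em_gmm_1d(x, k=2)
print(weights, means, stds)
```

On this well-separated example the estimates should land close to the parameters used for sampling; harder mixtures are more sensitive to the initialization.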

[Figures: visualization_tp2_1, visualization_tp2_2]

TP3. Metropolis-Hastings and Gibbs Samplers

In the first exercise, we study a hierarchical population model for longitudinal data, such as disease progression measurements, and we try to estimate the model's parameters. Because direct sampling from the posterior is not possible, we implement the Stochastic Approximation EM (SAEM) algorithm using a Metropolis-Hastings (MH) sampler for the latent variables.
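The structure of SAEM with an MH simulation step can be illustrated on a much simpler model than the one studied in the lab. The sketch below assumes a toy random-effects model, $z_i \sim \mathcal{N}(\mu, 1)$ and $y_{ij} \mid z_i \sim \mathcal{N}(z_i, 1)$, with only $\mu$ unknown; the model, the function names, and the step-size schedule are all assumptions made for illustration.

```python
import numpy as np

def log_post(z, y, mu):
    """log p(z_i | y_i, mu) up to a constant, vectorized over individuals."""
    return -0.5 * (z - mu) ** 2 - 0.5 * ((y - z[:, None]) ** 2).sum(axis=1)

def saem_toy(y, n_iter=400, n_burn=100, step=0.5, seed=0):
    rng = np.random.default_rng(seed)
    n = y.shape[0]
    z = y.mean(axis=1)  # initialize latent variables at individual means
    mu, s = 0.0, 0.0    # parameter and running sufficient statistic
    for k in range(1, n_iter + 1):
        # Simulation step: one random-walk MH move per latent variable
        prop = z + step * rng.normal(size=n)
        log_ratio = log_post(prop, y, mu) - log_post(z, y, mu)
        accept = np.log(rng.uniform(size=n)) < log_ratio
        z = np.where(accept, prop, z)
        # Stochastic approximation step: gamma = 1 during burn-in, then decreasing
        gamma = 1.0 if k <= n_burn else 1.0 / (k - n_burn)
        s += gamma * (z.mean() - s)
        # Maximization step: in this toy model the update is mu = s
        mu = s
    return mu

# Synthetic longitudinal data: 200 individuals, 5 measurements each
rng = np.random.default_rng(1)
z_true = rng.normal(2.0, 1.0, size=200)
y = z_true[:, None] + rng.normal(size=(200, 5))
mu_hat = saem_toy(y)
print(f"SAEM estimate: {mu_hat:.3f}, overall data mean: {y.mean():.3f}")
```

In this toy model the maximum likelihood estimate of $\mu$ is the overall mean of the observations, which gives a direct check on the SAEM output.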

The second exercise explores data augmentation with Markov chain Monte Carlo (MCMC). We construct a bivariate Markov chain and use a Gibbs sampler to approximate a specific density.
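The README does not specify the target density, so the sketch below uses a standard bivariate normal with correlation `rho` as a stand-in target. It only illustrates the Gibbs mechanism of alternating draws from the two full conditionals; the function name and defaults are hypothetical.

```python
import numpy as np

def gibbs_bivariate_normal(rho, n_samples=20000, n_burn=1000, seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho."""
    rng = np.random.default_rng(seed)
    cond_sd = np.sqrt(1.0 - rho ** 2)  # std of each full conditional
    x, y = 0.0, 0.0
    out = np.empty((n_samples, 2))
    for t in range(n_samples + n_burn):
        x = rng.normal(rho * y, cond_sd)  # draw x | y ~ N(rho * y, 1 - rho^2)
        y = rng.normal(rho * x, cond_sd)  # draw y | x ~ N(rho * x, 1 - rho^2)
        if t >= n_burn:
            out[t - n_burn] = (x, y)
    return out

samples = gibbs_bivariate_normal(rho=0.8)
print("empirical correlation:", np.corrcoef(samples.T)[0, 1])
```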

[Figure: visualization_tp3_1]

TP4. Improving the Metropolis-Hastings Algorithm

In this final lab, we explore advanced techniques to overcome common limitations of the standard MH algorithm. First, we tackle the difficulty of tuning the proposal distribution's parameters by implementing an adaptive MH-within-Gibbs sampler. This algorithm adjusts the variances of the proposal distributions on the fly to target an optimal acceptance rate.
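One way such an adaptation can be written is sketched below. The batch size of 50, the 0.44 target acceptance rate, and the shrinking adjustment `min(0.01, 1/sqrt(batch index))` are commonly used choices assumed here, not values confirmed by the README; the anisotropic Gaussian target is also illustrative.

```python
import numpy as np

def adaptive_mh_within_gibbs(log_target, x0, n_iter=20000, batch=50,
                             target_rate=0.44, seed=0):
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    d = x.size
    log_sd = np.zeros(d)  # log proposal std, one per coordinate
    n_acc = np.zeros(d)   # acceptances in the current batch
    chain = np.empty((n_iter, d))
    lp = log_target(x)
    for t in range(1, n_iter + 1):
        for j in range(d):  # update one coordinate at a time
            prop = x.copy()
            prop[j] += np.exp(log_sd[j]) * rng.normal()
            lp_prop = log_target(prop)
            if np.log(rng.uniform()) < lp_prop - lp:
                x, lp = prop, lp_prop
                n_acc[j] += 1
        chain[t - 1] = x
        if t % batch == 0:
            # Widen proposals that accept too often, narrow the others
            delta = min(0.01, 1.0 / np.sqrt(t // batch))
            log_sd += np.where(n_acc / batch > target_rate, delta, -delta)
            n_acc[:] = 0
    return chain, np.exp(log_sd)

SCALES = np.array([1.0, 10.0])  # illustrative target: very different scales

def log_gauss(x):
    return -0.5 * np.sum((x / SCALES) ** 2)

chain, proposal_sds = adaptive_mh_within_gibbs(log_gauss, x0=[0.0, 0.0])
print("adapted proposal stds:", proposal_sds)
```

Starting from identical proposal scales, the sampler should end up with a much wider proposal on the coordinate whose target scale is larger, without any manual tuning.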

Next, we address the challenge of sampling from highly multimodal distributions, where standard MCMC often gets stuck in a single local mode. We demonstrate this failure on a toy target distribution defined as a mixture of 20 well-separated Gaussians. To solve this, we implement Parallel Tempering, which runs multiple Markov chains in parallel at different "temperatures". Hotter chains cross between modes more easily, and swapping states between chains lets the chain at the target temperature explore all the modes and sample correctly from the target distribution.
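A compact sketch of Parallel Tempering is shown below on a smaller problem than the lab's: a one-dimensional, equal-weight mixture of four unit-variance Gaussians. The target, the geometric temperature ladder, the proposal scales, and the random adjacent-pair swap rule are all assumptions chosen to keep the example short.

```python
import numpy as np

MODES = np.array([-15.0, -5.0, 5.0, 15.0])  # illustrative well-separated modes

def log_target(x):
    """Unnormalized log density of an equal-weight mixture of N(m, 1)."""
    return np.logaddexp.reduce(-0.5 * (x - MODES) ** 2)

def parallel_tempering(n_iter=20000, temps=(1.0, 5.0, 25.0, 125.0), seed=0):
    rng = np.random.default_rng(seed)
    temps = np.asarray(temps, dtype=float)
    k = len(temps)
    x = np.full(k, MODES[0])  # every chain starts in the first mode
    lp = np.array([log_target(v) for v in x])
    samples = np.empty(n_iter)
    for t in range(n_iter):
        # Random-walk MH within each chain, targeting pi^(1/T)
        for i in range(k):
            prop = x[i] + np.sqrt(temps[i]) * rng.normal()
            lp_prop = log_target(prop)
            if np.log(rng.uniform()) < (lp_prop - lp[i]) / temps[i]:
                x[i], lp[i] = prop, lp_prop
        # Propose swapping the states of one random pair of adjacent chains
        if k > 1:
            i = rng.integers(k - 1)
            log_alpha = (lp[i + 1] - lp[i]) * (1.0 / temps[i] - 1.0 / temps[i + 1])
            if np.log(rng.uniform()) < log_alpha:
                x[[i, i + 1]] = x[[i + 1, i]]
                lp[[i, i + 1]] = lp[[i + 1, i]]
        samples[t] = x[0]  # keep only the chain at temperature 1
    return samples

samples = parallel_tempering()
nearest = np.abs(samples[:, None] - MODES).argmin(axis=1)
occupancy = np.bincount(nearest, minlength=len(MODES)) / len(samples)
print("fraction of samples near each mode:", occupancy)
```

Only the chain at temperature 1 targets the original distribution, which is why the sketch records `x[0]` alone; the hotter chains serve as a source of states from other modes.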

[Figure: visualization_tp4_1]
