This repository demonstrates ensemble learning with fine-tuned CNN models for classifying traditional Polish Christmas dishes. The project was developed as part of an image classification hackathon.
The dataset consists of images from the following categories:
- Mushroom Soup (Zupa Grzybowa)
- Cheesecake (Sernik)
- Dumplings (Pierogi)
- Gingerbread (Pierniki)
- Poppy Seed Cake (Makowiec)
- Kutia (Kutia)
- Hunter's Stew (Bigos)
- Beetroot Soup (Barszcz)
The dataset was collected manually and supplemented with images downloaded from websites using automated tools, to ensure a diverse representation of traditional Polish Christmas dishes.
The models used in this project are:
| Model | Number of Parameters | PyTorch Implementation | Related Paper |
|---|---|---|---|
| GhostNet 100 | 5,200,000 | GhostNet 100 (Hugging Face) | https://arxiv.org/abs/1911.11907 |
| EfficientNet-B0 | 5,288,548 | EfficientNet-B0 (torchvision) | https://arxiv.org/abs/1905.11946 |
| MobileNetV3 Large | 5,483,032 | MobileNetV3 Large (torchvision) | https://arxiv.org/abs/1905.02244 |
| ViT-Tiny | 5,717,416 | ViT-Tiny (TIMM) | https://arxiv.org/abs/2207.10666 |
| MNASNet1.3 | 6,282,256 | MNASNet1.3 (torchvision) | https://arxiv.org/abs/1807.11626 |
| RegNetY-800MF | 6,432,512 | RegNetY-800MF (torchvision) | https://arxiv.org/abs/2003.13678 |
| ShuffleNetV2 X2.0 | 7,393,996 | ShuffleNetV2 X2.0 (torchvision) | https://arxiv.org/abs/1807.11164 |
| EfficientNet-B1 | 7,794,184 | EfficientNet-B1 (torchvision) | https://arxiv.org/abs/1905.11946 |
| ResNet-18 | 11,689,512 | ResNet-18 (torchvision) | https://arxiv.org/abs/1512.03385 |
The learning curves for two models are shown below:

[Figure: learning curves]
This project combines the predictions of multiple fine-tuned models through weighted voting. Each model's vote is weighted according to its validation performance.
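The weighted voting described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the repository's implementation: the function names, the validation scores, and the rule "weight = validation score raised to a power" are assumptions made for the example.

```python
from collections import defaultdict


def weights_from_scores(val_scores, power=1):
    """Turn validation scores into voting weights (assumed rule: score ** power)."""
    return {name: score ** power for name, score in val_scores.items()}


def weighted_vote(predictions, weights):
    """Return the label with the largest total weight for a single image.

    predictions: dict mapping model name -> predicted label
    weights:     dict mapping model name -> non-negative voting weight
    """
    totals = defaultdict(float)
    for model_name, label in predictions.items():
        totals[label] += weights[model_name]
    # Iterating over sorted labels makes ties resolve alphabetically.
    return max(sorted(totals), key=totals.get)


# Hypothetical validation scores and per-model predictions for one image.
val_scores = {"resnet18": 0.90, "efficientnet_b0": 0.85, "ghostnet_100": 0.80}
predictions = {
    "resnet18": "Bigos",
    "efficientnet_b0": "Pierogi",
    "ghostnet_100": "Pierogi",
}

vote_p1 = weighted_vote(predictions, weights_from_scores(val_scores, power=1))
vote_p10 = weighted_vote(predictions, weights_from_scores(val_scores, power=10))
print(vote_p1, vote_p10)  # prints "Pierogi Bigos"
```

With a power of 1 the two weaker models outvote the strongest one, while a power of 10 shrinks the weaker weights enough for the strongest model to win. Raising the power therefore shifts the ensemble from something close to majority voting toward trusting the best single model.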
The four models in the final ensemble were selected automatically by testing every combination of models, with weight powers (the exponents used in the weighted voting scheme) ranging from 1 to 10. The combination with the highest F1 score on the test set was chosen as the final ensemble. The models whose training curves are shown above were included in the final ensemble.
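The selection procedure above amounts to an exhaustive search over model subsets and weight powers. The sketch below shows one way to write it in plain Python. All names are hypothetical, and two details are assumptions because the text does not specify them: the F1 score is macro-averaged, and each weight is the model's validation score raised to the power under test.

```python
from itertools import combinations


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_values = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        f1_values.append(2 * tp / denom if denom else 0.0)
    return sum(f1_values) / len(f1_values)


def ensemble_predict(model_names, per_model_preds, val_scores, power):
    """Weighted vote per sample, using val_score ** power as each model's weight."""
    n_samples = len(per_model_preds[model_names[0]])
    voted = []
    for i in range(n_samples):
        totals = {}
        for name in model_names:
            label = per_model_preds[name][i]
            totals[label] = totals.get(label, 0.0) + val_scores[name] ** power
        # Sorted labels make ties resolve alphabetically.
        voted.append(max(sorted(totals), key=totals.get))
    return voted


def search_best_ensemble(per_model_preds, val_scores, y_true, size=4, powers=range(1, 11)):
    """Try every `size`-model subset and every power; return (f1, models, power)."""
    best = None
    for combo in combinations(sorted(per_model_preds), size):
        for power in powers:
            y_pred = ensemble_predict(combo, per_model_preds, val_scores, power)
            score = macro_f1(y_true, y_pred)
            # Strict comparison keeps the first combination found on ties.
            if best is None or score > best[0]:
                best = (score, combo, power)
    return best
```

Here `per_model_preds` maps each model name to its list of predicted labels on the evaluation set, and `y_true` holds the matching ground-truth labels. With nine candidate models and subsets of four, the search evaluates 126 combinations times 10 powers, which is cheap because it reuses cached predictions instead of re-running the models.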
The dataset was preprocessed and augmented with a transform pipeline. Examples of transformations include resizing, rotation, and color adjustments.
To clone the repository, run:

    git clone https://github.com/DzmitryPihulski/ensemble-finetuned-models.git




