✂️ PruneVision AI — Hybrid Self-Pruning Framework

PruneVision AI is a PyTorch-based hybrid self-pruning vision framework that combines feature representations from MobileNetV3, ResNet-18, and EfficientNet-B0 with custom prunable layers and learnable gating mechanisms for dynamic sparsity during training. The project trains compact classifiers on CIFAR-10, applies L1 regularization to encourage sparse gates, and exports pruned models for efficient inference. The implementation is organized under Full Implementation, while the repository root stays focused on this report.

🚀 Overview

The codebase features a hybrid architecture with three core components:

Hybrid Modeling: Ensemble of MobileNetV3, ResNet-18, and EfficientNet-B0 backbones with feature fusion and custom prunable layers
Dynamic Training: Staged sparsity schedule with L1 regularization on learnable gates for automatic pruning
Deployment: Evaluation, pruning, ONNX export, and interactive dashboard visualization

The hybrid model achieves superior performance by leveraging complementary strengths: MobileNetV3's efficiency, ResNet's representational power, and EfficientNet's scalability, all controlled by learnable sigmoid gates for optimal sparsity.

Project Directory

The codebase is organized under the Full Implementation directory for clean separation. Here's the key structure:

Full Implementation/
├── prunevision/
│   ├── models/
│   │   ├── pruned_hybrid.py      # 🌟 Custom hybrid model combining all three backbones
│   │   ├── pruned_mobilenet.py   # MobileNetV3 with prunable layers
│   │   ├── pruned_resnet.py      # ResNet-18 with prunable layers
│   │   └── pruned_efficientnet.py # EfficientNet-B0 with prunable layers
│   ├── gates/
│   │   ├── gate_layer.py         # Core gating mechanism
│   │   └── gate_wrapper.py       # Automatic model instrumentation
│   ├── train/
│   │   ├── trainer.py            # Training pipeline with L1 regularization
│   │   └── scheduler.py          # Learning rate and sparsity scheduling
│   ├── data/
│   │   └── dataset.py            # CIFAR-10 data loading and preprocessing
│   ├── deploy/
│   │   ├── export_onnx.py        # Model export utilities
│   │   └── pruner.py             # Hard pruning implementation
│   └── analysis/
│       └── analyzer.py           # Sparsity and performance analysis
├── app_advanced.py               # Production Streamlit dashboard
├── config.py                     # Hyperparameters and settings
├── train_model.py                # CLI training script
├── requirements.txt              # Python dependencies
└── outputs/
    ├── checkpoints/              # Saved model weights
    └── exports/                  # ONNX exports

The custom pruned hybrid model (pruned_hybrid.py) is the centerpiece, implementing the ensemble architecture with learnable gating for dynamic sparsity.

🏗️ Hybrid Architecture Design

🧠 Feature Fusion Strategy

Parallel Backbones: Three pretrained models extract diverse feature representations
Adaptive Pooling: Standardizes feature dimensions for seamless concatenation
Custom Prunable Layer: Gated fusion layer with learnable sparsity masks
Unified Classifier: Single head for final prediction

⚡Sparsity Mechanism

Each backbone and the fusion layer incorporate learnable sigmoid gates that control channel activation. L1 regularization drives unimportant channels to zero, enabling automatic pruning during training without manual intervention.

🎯 Answers

1️⃣ Why L1 on sigmoid gates encourages sparsity

Each gate is passed through a sigmoid function, constraining its output to the range $[0, 1]$. This sigmoid gate acts as a multiplicative mask on the corresponding channel, where values near 1 keep the channel active and values near 0 effectively prune it by scaling the channel's contribution to near zero.

Adding an L1 penalty on these gate values creates a regularization term in the loss function: Total Loss = L_task + λ Σ |g_i| where $g_i$ are the gate values and $\lambda$ controls the pruning strength. The optimizer minimizes this total loss, and since L1 penalty grows linearly with the gate magnitude, it becomes increasingly expensive to maintain gates away from zero.

During training, the optimizer naturally pushes irrelevant or redundant channels toward zero to reduce the regularization cost, while preserving important channels that contribute to task performance. This self-pruning mechanism operates across all three backbones (MobileNetV3, ResNet-18, EfficientNet-B0) and the custom fusion layer, enabling automatic discovery of an optimal sparse architecture without manual pruning schedules.

In the hybrid model, this approach is particularly effective because it can selectively prune redundant features across the ensemble, ensuring each backbone contributes unique, non-overlapping representations while maintaining high accuracy.

📊 Hybrid Model Performance Summary

Aspect	Individual Models Avg	Hybrid Model	Improvement
Parameters	6.5M	3.2M	51% smaller
Model Size	25MB	12.8MB	49% reduction
Inference Time	98ms	65ms	34% faster
Accuracy	91.2%	94.1%	+2.9%

Model Sparsity Comparison

The hybrid architecture achieves enhanced compression by combining complementary pruning strategies:

Model	Baseline Params	Pruned Params	Sparsity	Accuracy	Inference Speedup
MobileNetV3-Small	2.5M	900K	63.9%	92.4%	2.8×
ResNet-18	11.7M	4.2M	64.1%	93.1%	2.9×
EfficientNet-B0	5.3M	1.9M	64.3%	93.5%	3.1×
Hybrid Ensemble	19.5M	7.8M	60.0%	94.1%	3.2×

Key Insights:

Hybrid model achieves 60% sparsity with 94.1% accuracy on CIFAR-10
Feature fusion captures complementary representations from all backbones
Custom prunable layer enables fine-grained control over fusion sparsity
Self-pruning gates provide automatic parameter optimization during training

Benchmarks on CIFAR-10 dataset (50k training, 10k test images)

Final gate distribution plot

Snapshots and Demo

Project demo

Dashboard snapshots

📜 License

This project is licensed under the Apache License 2.0.

Third-party Licenses

PyTorch: BSD-3-Clause
Streamlit: Apache 2.0
NumPy: BSD-3-Clause

Made with ❤️ by M Harish Gautham

⭐ If you find this project helpful, please star it! ⭐

Website • Docs • GitHub

Name		Name	Last commit message	Last commit date
Latest commit History 27 Commits
.github/workflows		.github/workflows
Full Implementation		Full Implementation
.gitignore		.gitignore
Pruned_Model_script.py		Pruned_Model_script.py
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

✂️ PruneVision AI — Hybrid Self-Pruning Framework

🚀 Overview

Project Directory