Efficient Superpixel Segmentation

CS-7641 Machine Learning Project, Georgia Institute of Technology

This repository implements a novel approach to efficient semantic segmentation that decouples superpixel clustering from a Vision Transformer (ViT). The project addresses the computational cost of running standard ViTs on dense inputs by combining unsupervised superpixel clustering (SLIC) with transformer-based semantic segmentation.

Project Structure

Root Directory Files

README.md: Main project documentation and overview

pyproject.toml: Python project configuration and dependency management using modern PEP 517/518 standards

requirements.txt: Python package dependencies (torch, torchvision, scikit-image, timm, etc.)

/src/: Source Code Directory

Main implementation directory containing all model code and utilities.

src/trainer.py: Main training script with AdamW optimizer, training/validation loops, and checkpoint management. Implements differential learning rates for backbone vs decoder, and includes metrics computation (mIoU, pixel accuracy, confusion matrix).
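The metrics named above (confusion matrix, pixel accuracy, mIoU) can be sketched in a few lines of NumPy. This is a minimal illustration, not the code in src/trainer.py: the function names, the ignore index of 255, and the toy arrays are assumptions made for the example.

```python
import numpy as np

def confusion_matrix(pred, target, num_classes, ignore_index=255):
    """Return a (num_classes, num_classes) matrix: rows are ground truth, columns are predictions."""
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    valid = target != ignore_index  # skip pixels that carry no evaluated label
    idx = target[valid].astype(np.int64) * num_classes + pred[valid].astype(np.int64)
    counts = np.bincount(idx, minlength=num_classes * num_classes)
    return counts.reshape(num_classes, num_classes)

def pixel_accuracy(cm):
    """Fraction of valid pixels whose predicted class matches the ground truth."""
    return cm.trace() / cm.sum()

def mean_iou(cm):
    """Mean of per-class intersection-over-union, over classes that appear at all."""
    tp = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp
    present = union > 0
    return (tp[present] / union[present]).mean()

# Toy 2x3 example with three classes; the last pixel is ignored.
target = np.array([[0, 0, 1], [1, 2, 255]])
pred = np.array([[0, 1, 1], [1, 2, 0]])
cm = confusion_matrix(pred, target, num_classes=3)
print(cm)
print(pixel_accuracy(cm), mean_iou(cm))
```

Accumulating one confusion matrix over the whole validation set and deriving both metrics from it avoids averaging per-image scores, which would weight small images and rare classes differently.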

/src/models/: Model Architecture Components

Neural network modules implementing the decoupled superpixel segmentation pipeline.

src/models/__init__.py: Package initialization for models module

src/models/decoupled_superpixel_vit.py: Main end-to-end model combining all pipeline components (backbone, tokenizer, classifier, associator) into a single DecoupledSuperpixelViT module

src/models/feature_extractor.py: CNN backbone (ResNet-based) for extracting dense pixel-level features with configurable output stride and optional backbone freezing

src/models/superpixel_tokenizer.py: SLIC-based superpixel generation and feature pooling to convert dense pixel features into compact superpixel tokens

src/models/superpixel_classifier.py: Vision Transformer encoder for processing superpixel tokens with multi-head self-attention and MLP layers for semantic classification

src/models/superpixel_associator.py: Association module for mapping superpixel-level predictions back to dense pixel-level segmentation masks
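To make the tokenizer's "feature pooling" step concrete, the sketch below averages dense features inside each superpixel to produce one token per region. Mean pooling is an assumption here (the repository does not state which pooling it uses), and the function name and toy label map are hypothetical. In practice the label map could come from a SLIC implementation such as `skimage.segmentation.slic`.

```python
import numpy as np

def pool_superpixel_features(features, labels, num_superpixels):
    """Average-pool dense features of shape (H, W, C) into (num_superpixels, C) tokens.

    labels is an (H, W) integer map assigning each pixel to a superpixel index.
    """
    h, w, c = features.shape
    flat_feats = features.reshape(h * w, c)
    flat_labels = labels.reshape(h * w)
    sums = np.zeros((num_superpixels, c), dtype=np.float64)
    np.add.at(sums, flat_labels, flat_feats)  # scatter-add features per superpixel
    counts = np.bincount(flat_labels, minlength=num_superpixels)
    return sums / np.maximum(counts, 1)[:, None]  # guard against empty superpixels

# Toy example: a 2x3 feature map with 2 channels and 3 superpixels.
labels = np.array([[0, 0, 1], [2, 2, 1]])
features = np.arange(12, dtype=np.float64).reshape(2, 3, 2)
tokens = pool_superpixel_features(features, labels, num_superpixels=3)
print(tokens)
```

The output has one row per superpixel regardless of image size, which is what lets the transformer operate on a short token sequence.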

/src/data/: Dataset and Data Loading

Data loading and preprocessing utilities.

src/data/__init__.py: Package initialization for data module

src/data/dataset_loader.py: Cityscapes dataset loader with preprocessing, label mapping (from 34 to 19 classes), and augmentation transforms (resize, normalization)
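The 34-to-19 label mapping can be illustrated with a lookup table. The sketch below follows the standard Cityscapes evaluation convention, in which 19 label IDs receive train IDs 0 to 18 and every other label is sent to an ignore value. The ignore value of 255 and the helper names are assumptions, and the loader in this repository may implement the mapping differently.

```python
import numpy as np

IGNORE_INDEX = 255  # assumed ignore value for labels outside the 19 evaluated classes

# Cityscapes label IDs that are evaluated, listed in train-ID order (0..18).
EVAL_LABEL_IDS = [7, 8, 11, 12, 13, 17, 19, 20, 21, 22,
                  23, 24, 25, 26, 27, 28, 31, 32, 33]

def build_lookup_table():
    """Return a 256-entry table mapping raw label IDs to train IDs."""
    lut = np.full(256, IGNORE_INDEX, dtype=np.uint8)
    for train_id, label_id in enumerate(EVAL_LABEL_IDS):
        lut[label_id] = train_id
    return lut

def map_labels(label_ids, lut):
    """Apply the lookup table to a uint8 label image of any shape."""
    return lut[np.asarray(label_ids, dtype=np.uint8)]

lut = build_lookup_table()
raw = np.array([[7, 8, 0], [26, 33, 4]], dtype=np.uint8)  # road, sidewalk, unlabeled, car, bicycle, static
mapped = map_labels(raw, lut)
print(mapped)
```

A table lookup handles the whole image in one indexing operation, so the mapping adds little overhead per sample.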

/dataset/: Dataset Storage

Directory for storing training and validation datasets.

/dataset/Cityscapes/: Cityscapes Dataset

Semantic segmentation dataset with urban street scenes.

dataset/Cityscapes/README.md: Instructions for downloading and setting up the Cityscapes dataset, including directory structure requirements

/docs/: Documentation and Reports

Project documentation, proposals, and progress reports.

docs/index.md: Project homepage with team information and navigation for the Jekyll website

docs/proposal.md: Initial project proposal describing problem definition, background literature, methodology, and dataset

docs/midterm.md: Midterm progress report with preliminary results and implementation updates

Key Features

Efficient Segmentation: Shrinks the transformer input from ~10^6 pixels to ~2048 superpixel tokens
Unsupervised Clustering: Uses the SLIC algorithm, which needs no training or learned parameters, for superpixel generation
Transformer Architecture: Multi-head self-attention on superpixel tokens for semantic understanding
Dense Predictions: Association module maps superpixel classifications back to pixel-level predictions
Cityscapes Dataset: Trained on 19-class urban scene segmentation task
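A rough calculation shows why the token reduction matters: self-attention compares every token with every other token, so its cost grows with the square of the sequence length. The numbers below assume a full-resolution 1024 x 2048 Cityscapes frame, which is an assumption for illustration; the training code may resize images first.

```python
def attention_pairs(num_tokens):
    """Number of query-key pairs in one full self-attention layer."""
    return num_tokens * num_tokens

height, width = 1024, 2048   # assumed full-resolution Cityscapes frame
num_pixels = height * width  # one token per pixel in the dense case
num_superpixels = 2048       # token count quoted in the feature list

token_reduction = num_pixels // num_superpixels
pair_reduction = attention_pairs(num_pixels) // attention_pairs(num_superpixels)

print(f"pixels: {num_pixels}, superpixel tokens: {num_superpixels}")
print(f"sequence is {token_reduction}x shorter")
print(f"attention matrix has {pair_reduction}x fewer entries")
```

Under these assumptions the sequence is about a thousand times shorter, and the attention matrix about a million times smaller.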

Model Pipeline

Feature Extraction (PixelFeatureExtractor): ResNet backbone extracts dense features
Superpixel Tokenization (SuperpixelTokenization): SLIC clustering + feature pooling
Classification (SuperpixelClassifier): Vision Transformer processes superpixel tokens
Association (SuperpixelAssociation): Maps predictions back to dense pixel space
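The last step of the pipeline can be sketched as an index lookup: each pixel takes the class predicted for the superpixel it belongs to. This is a minimal sketch assuming hard assignment (one superpixel per pixel). The actual SuperpixelAssociation module may use a softer scheme, and the function name and toy inputs are hypothetical.

```python
import numpy as np

def associate(superpixel_logits, labels):
    """Map per-superpixel class scores back to a dense class mask.

    superpixel_logits: (N, K) scores for N superpixels and K classes.
    labels: (H, W) integer map with values in [0, N).
    Returns an (H, W) array of class indices.
    """
    superpixel_classes = superpixel_logits.argmax(axis=1)  # (N,) best class per superpixel
    return superpixel_classes[labels]                      # broadcast to every pixel

# Toy example: 3 superpixels, 3 classes, a 2x3 image.
labels = np.array([[0, 0, 1], [2, 2, 1]])
logits = np.array([
    [0.1, 2.0, 0.3],  # superpixel 0 -> class 1
    [1.5, 0.2, 0.1],  # superpixel 1 -> class 0
    [0.0, 0.1, 3.0],  # superpixel 2 -> class 2
])
mask = associate(logits, labels)
print(mask)
```

Because the mask is recovered by indexing, the dense output costs one gather over the label map no matter how many classes there are.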
