A PyTorch-based U-Net autoencoder designed to compress and analyze astrophysical images stored in FITS files. The project uses deep learning to extract compact latent representations of astronomical observations while preserving spatial information. These latent representations can be used for downstream tasks such as clustering and anomaly detection.
This repository contains an end-to-end pipeline for:
- Data Processing: Loading and normalizing astrophysical FITS images
- Model Training: U-Net encoder-decoder architecture with an 8-dimensional latent space
- Evaluation: Reconstruction metrics (MSE, SSIM, and MS-SSIM)
- Analysis: Latent space visualization using PCA and t-SNE
- Inference: Reconstruction and export of astronomical data
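As an illustration of the Data Processing step, here is a minimal sketch of percentile clipping followed by rescaling to [0, 1]. The function name `normalize_image` and the 1st/99th percentile limits are assumptions for this example, not values taken from the notebook. It is a plain percentile clip rather than a full Z-scale interval, and it works on an in-memory array, so FITS loading and resizing are left out.

```python
import numpy as np

def normalize_image(image, lower_pct=1.0, upper_pct=99.0):
    """Clip to a percentile range, then rescale to [0, 1].

    Hypothetical helper; the percentile limits are illustrative defaults.
    """
    data = np.asarray(image, dtype=np.float64)
    # Compute the limits from finite pixels only (FITS data often holds NaNs).
    finite = data[np.isfinite(data)]
    if finite.size == 0:
        return np.zeros_like(data)
    lo, hi = np.percentile(finite, [lower_pct, upper_pct])
    if hi <= lo:
        # Constant image: nothing to stretch.
        return np.zeros_like(data)
    # Replace bad pixels with the clip limits, then clip everything else.
    cleaned = np.nan_to_num(data, nan=lo, posinf=hi, neginf=lo)
    clipped = np.clip(cleaned, lo, hi)
    return (clipped - lo) / (hi - lo)

# Synthetic stand-in for one 187x187 FITS image.
rng = np.random.default_rng(0)
raw = rng.normal(loc=100.0, scale=5.0, size=(187, 187))
raw[0, 0] = np.nan  # simulate a bad pixel
norm = normalize_image(raw)
```

Scaling to [0, 1] matters here because a BCE loss expects targets in that range.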
┌─────────────────────────────────────────────────────────────────┐
│ ASTRO-AUTOENCODER PIPELINE │
└─────────────────────────────────────────────────────────────────┘
│
▼
Install Libraries
(numpy, torch, etc.)
│
▼
Data Preparation
• Load FITS files
• Resize & Normalize (187×187)
│
▼
Model Definition
• U-Net Encoder
• Latent Space (8D)
• U-Net Decoder
│
▼
Training Phase
• 80/20 Train/Validation Split
• BCE Loss + Adam Optimizer
• 150 Epochs
│
▼
Evaluation Metrics
• MSE, SSIM, MS-SSIM
• Visualize Reconstructions
│
▼
Latent Space Analysis
• Extract Latent Vectors
• PCA/t-SNE Visualization
│
▼
Inference & Export
• Test Data Reconstruction
• Export Metrics & Results
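The Latent Space Analysis stage in the diagram above can be sketched with plain NumPy. The example below projects 8-D latent vectors to 2-D with PCA computed via SVD. The name `pca_project` and the random stand-in latents are assumptions for illustration; the notebook may well use a library implementation instead, and t-SNE is not covered here.

```python
import numpy as np

def pca_project(latents, n_components=2):
    """Project row vectors onto their top principal components (hypothetical helper)."""
    X = np.asarray(latents, dtype=np.float64)
    X_centered = X - X.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, s, vt = np.linalg.svd(X_centered, full_matrices=False)
    projected = X_centered @ vt[:n_components].T
    explained_ratio = (s ** 2) / np.sum(s ** 2)
    return projected, explained_ratio[:n_components]

# Random stand-in for encoder outputs: 200 images, 8 latent dimensions each.
rng = np.random.default_rng(0)
latents = rng.normal(size=(200, 8))
coords, ratio = pca_project(latents)
```

`coords` can then be passed to a scatter plot, and `ratio` indicates how much variance the two plotted axes retain.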
- Encoder: Progressive downsampling with skip connections
  - Input: 187×187 grayscale images
  - Channels: 1 → 16 → 32 → 64 → 128
  - Output: 8-dimensional latent vector
- Decoder: Progressive upsampling with skip connections
  - Reconstructs from 8D latent space
  - Restores spatial dimensions to 187×187
  - Output: Sigmoid-normalized grayscale image
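The encoder and decoder described above could look roughly like the following PyTorch sketch. It follows the stated channel progression, latent size, and sigmoid output, but everything else is an assumption: the class name `TinyUNetAE`, one 3×3 convolution per level, 2×2 max pooling, nearest-neighbour upsampling, and linear layers around the latent vector. With 2×2 pooling the spatial size goes 187 → 93 → 46 → 23 → 11, which is where `bottleneck_hw=11` comes from.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_block(in_ch, out_ch, dropout=0.1):
    # Conv -> BatchNorm -> ReLU -> Dropout; the layer layout is an assumption.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
        nn.Dropout2d(dropout),
    )

class TinyUNetAE(nn.Module):
    """Hypothetical U-Net-style autoencoder with an 8-D latent vector."""

    def __init__(self, latent_dim=8, channels=(16, 32, 64, 128), bottleneck_hw=11):
        super().__init__()
        channels = tuple(channels)
        self.bottleneck_hw = bottleneck_hw
        self.bottleneck_ch = channels[-1]
        in_chs = (1,) + channels[:-1]
        self.enc = nn.ModuleList(conv_block(i, o) for i, o in zip(in_chs, channels))
        flat = channels[-1] * bottleneck_hw * bottleneck_hw
        self.to_latent = nn.Linear(flat, latent_dim)
        self.from_latent = nn.Linear(latent_dim, flat)
        rev = channels[::-1]                 # (128, 64, 32, 16)
        out_chs = rev[1:] + (rev[-1],)       # (64, 32, 16, 16)
        # Each decoder block sees upsampled features concatenated with a skip.
        self.dec = nn.ModuleList(conv_block(2 * c, o) for c, o in zip(rev, out_chs))
        self.head = nn.Conv2d(rev[-1], 1, kernel_size=1)

    def encode(self, x):
        skips = []
        for block in self.enc:
            x = block(x)
            skips.append(x)                  # keep pre-pooling features as skips
            x = F.max_pool2d(x, 2)
        return self.to_latent(x.flatten(1)), skips

    def decode(self, z, skips):
        hw = self.bottleneck_hw
        x = self.from_latent(z).view(-1, self.bottleneck_ch, hw, hw)
        for block, skip in zip(self.dec, reversed(skips)):
            # Resize to the skip's exact size so odd dimensions (187, 93, 23) line up.
            x = F.interpolate(x, size=skip.shape[-2:], mode="nearest")
            x = block(torch.cat([x, skip], dim=1))
        return torch.sigmoid(self.head(x))

model = TinyUNetAE().eval()
with torch.no_grad():
    x = torch.rand(2, 1, 187, 187)
    z, skips = model.encode(x)
    recon = model.decode(z, skips)
```

Note that in this sketch the decoder receives the skip feature maps as well as the latent vector, so a reconstruction does not depend on the 8 latent values alone.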
- Framework: PyTorch
- Optimizer: Adam
- Loss Function: Binary Cross Entropy (BCE)
- Batch Normalization: Applied throughout
- Regularization: Dropout (0.1)
- Input Normalization: Z-scale normalization with percentile clipping
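To show how the settings above fit together, here is a minimal training-loop sketch with BCE loss and Adam. The small convolutional `model` is a stand-in so the snippet runs on its own; it is not the project's U-Net. The 32×32 random batch and the five iterations are likewise placeholders chosen to keep the example fast.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in model (assumption): the real network is the U-Net autoencoder.
model = nn.Sequential(
    nn.Conv2d(1, 4, kernel_size=3, padding=1),
    nn.BatchNorm2d(4),
    nn.ReLU(),
    nn.Dropout2d(0.1),
    nn.Conv2d(4, 1, kernel_size=3, padding=1),
    nn.Sigmoid(),  # BCELoss expects predictions in [0, 1]
)
criterion = nn.BCELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Placeholder batch; real inputs would be normalized FITS images in [0, 1].
batch = torch.rand(8, 1, 32, 32)

model.train()
losses = []
for _ in range(5):
    optimizer.zero_grad()
    recon = model(batch)
    loss = criterion(recon, batch)  # an autoencoder's target is its own input
    loss.backward()
    optimizer.step()
    losses.append(loss.item())
```

`nn.BCELoss` requires both predictions and targets to lie in [0, 1], which is why the sigmoid output and the input normalization go together.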
- MSE (Mean Squared Error): Pixel-level reconstruction accuracy
- SSIM (Structural Similarity Index): Perceptual quality preservation
- MS-SSIM (Multi-Scale SSIM): Multi-scale structural similarity
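For intuition about the first two metrics, the sketch below computes MSE and a simplified SSIM in NumPy. The SSIM here uses whole-image statistics instead of the sliding local windows that library implementations typically use, so its values will differ from what the notebook reports. The function names are hypothetical, and MS-SSIM is not covered.

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two images."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(np.mean((a - b) ** 2))

def global_ssim(a, b, data_range=1.0):
    """SSIM from whole-image statistics (simplification: no local windows)."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    # Standard stabilizing constants with K1 = 0.01 and K2 = 0.03.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_a, mu_b = a.mean(), b.mean()
    var_a, var_b = a.var(), b.var()
    cov = np.mean((a - mu_a) * (b - mu_b))
    num = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    den = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return float(num / den)

# Synthetic image and a noisy copy standing in for a reconstruction.
rng = np.random.default_rng(0)
img = rng.random((187, 187))
noisy = np.clip(img + rng.normal(scale=0.1, size=img.shape), 0.0, 1.0)

mse_value = mse(img, noisy)
ssim_value = global_ssim(img, noisy)
```

Lower MSE and higher SSIM both indicate a closer reconstruction; an identical pair gives an MSE of 0 and an SSIM of 1.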
- Clone the repository (or set up the local directory):
    git clone https://github.com/yourusername/astro-autoencoder.git
    cd astro-autoencoder
- Create a virtual environment:
    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
- Install Jupyter (if not already installed):
    pip install jupyter
- Launch Jupyter:
    jupyter notebook astro-autoencoder.ipynb
- Execute cells sequentially:
  - Cell 1: Configure data directories
  - Cell 2: Install required libraries
  - Cell 3: Visualize sample FITS images
  - Cell 4-5: Define model architecture
  - Cell 6-7: Train the autoencoder
  - Cell 8-9: Evaluate performance
  - Cell 10-11: Analyze latent space
  - Cell 12-13: Run inference on test data
In the notebook, you can customize:
# Training parameters
batch_size = 32
num_epochs = 150
learning_rate = 1e-3
latent_dim = 8
dropout_rate = 0.1
# Validation split
train_split = 0.8
val_split = 0.2
# Model architecture
# Encoder channels: 1 → 16 → 32 → 64 → 128
# Decoder channels: 128 → 64 → 32 → 16 → 1
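As one way to apply `train_split`, the following sketch shuffles sample indices and divides them 80/20. The helper name `split_indices`, the fixed seed, and the sample count of 100 are assumptions for illustration; the notebook may split the dataset differently.

```python
import numpy as np

def split_indices(n_samples, train_split=0.8, seed=0):
    """Return shuffled, non-overlapping train and validation index arrays."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_train = int(round(train_split * n_samples))
    return order[:n_train], order[n_train:]

# Example with 100 images: 80 go to training, 20 to validation.
train_idx, val_idx = split_indices(100, train_split=0.8)
```

Fixing the seed keeps the split reproducible across runs, so validation metrics stay comparable.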