[ICML 2026] Demystifying When Pruning Works via Representation Hierarchies

Shwai He¹, Guoheng Sun¹, Haichao Zhang², Yun Fu², Ang Li¹
¹University of Maryland, College Park, ²Northeastern University


Codebase for representation-hierarchy analysis of pruning in LLMs.

Figure 1: Overview. This repo studies pruning through a representation hierarchy (`h → z → p`) and compares dense vs dropped/pruned behaviors.

Pruned models often keep their scores on non-generative metrics, yet they can diverge much more from the dense model during generation. This repo studies that discrepancy through a representation hierarchy:

Embedding space (h): hidden states
Logit space (z): pre-softmax outputs
Probability space (p): post-softmax distributions

We provide analysis code for both inter-layer dropping (layer/block drop) and intra-layer sparsification (Wanda/SparseGPT), and paper-aligned scripts that quantify how pruning perturbs h → z → p across layers and decoding steps.
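As a rough illustration of the hierarchy, the sketch below maps a toy hidden state to logits and then to probabilities. The matrix `W`, the sizes, and the helper names `unembed` and `softmax` are made up for this example and are not taken from the repository; real models typically also apply a final normalization before the output head.

```python
import math

def softmax(z, T=1.0):
    """Numerically stable softmax of logits z at temperature T."""
    m = max(z)
    exps = [math.exp((v - m) / T) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def unembed(h, W):
    """Map a hidden state h (length d) to logits z (length V) via z = W h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]

# Toy sizes (assumption): hidden dim d = 3, vocabulary size V = 4.
W = [
    [1.0, 0.0, 0.5],
    [0.0, 1.0, -0.5],
    [0.5, 0.5, 0.0],
    [-1.0, 0.2, 0.3],
]

h = [0.8, -0.1, 0.4]   # embedding space: hidden state
z = unembed(h, W)      # logit space: pre-softmax outputs
p = softmax(z)         # probability space: post-softmax distribution
```

Pruning perturbs `h`; the perturbation then propagates linearly to `z` and nonlinearly (through the softmax) to `p`, which is why the three spaces can react differently.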

What You Can Run Here

Inter-layer pruning (layer / block drop)
Intra-layer pruning (Wanda / SparseGPT)
Representation-level analysis in dropped and pruned modes

Environment and Repository Structure

Install from requirements.txt (recommended, pinned versions):

pip install -r requirements.txt

inter-layer/: layer/block dropping pipeline.
intra-layer/: intra-layer sparsification (Wanda / SparseGPT).
representation-analysis/: paper-aligned analysis scripts for representation hierarchy.

Empirical Results

Generative vs Non-generative Discrepancy

Non-generative metrics are often stable after pruning

Figure 3: Pruning often preserves non-generative metrics (single-step / fixed-target evaluations).

Generative quality can degrade after pruning as representation-space differences affect decoding

Figure 4: Pruning can hurt generative quality because representation-space differences become exposed during autoregressive decoding.

Figure 5: After pruning, generation can degrade qualitatively when probability-space shifts alter the autoregressive trajectory.

Distinct Observations Across Representation Spaces

Layerwise cosine similarity under pruning across embedding/logit/probability spaces (panels: Attention, MLP)

Figure 2: Representation hierarchy under pruning. Layerwise representation similarity trends differ across embedding/logit/probability spaces (left: Attention, right: MLP).

Layerwise transition analysis

representation-analysis/transition_layerwise_compare.py

# Dropped model
python transition_layerwise_compare.py \
  --analysis_mode dropped \
  --model_name Qwen/Qwen2.5-7B-Instruct \
  --dropped_root_path /path/to/dropped_results \
  --target_layer attn \
  --drop_n 8

# Pruned model
python transition_layerwise_compare.py \
  --analysis_mode pruned \
  --model_name /path/to/dense_model \
  --pruned_model_name /path/to/pruned_model

Purpose:

Compare attn/mlp sublayer transitions at the same layer and same context.
Log transition metrics in embedding/logit/probability spaces. For example:
- Embedding/hidden space (h): cosine similarity cos(h_dense, h_pruned), and the parallel/orthogonal decomposition of Δh = h_pruned - h_dense w.r.t. h_dense.
- Logit space (z): cosine similarity cos(z_dense, z_pruned), plus the parallel/orthogonal decomposition of Δz = z_pruned - z_dense w.r.t. z_dense.
- Probability space (p): cosine similarity cos(p_dense, p_pruned) where p = softmax(z/T), and KL(p_pruned || p_dense) (reported as REAL_KL in logs).
- Second-order estimates (paper-aligned): KL_estimate and 1-cos_estimate computed from weighted variance terms with the 1/(2T^2) scaling.
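The metrics listed above can be sketched on toy logits as follows. This is a minimal illustration, not the repository's implementation: the helper names are hypothetical, the logit values are invented, and the specific weightings (variance under `p` for the KL estimate, variance under weights proportional to `p²` for the cosine estimate) are our assumption, obtained from the standard second-order expansion of the softmax.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    return dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))

def decompose(ref, other):
    """Split delta = other - ref into parts parallel / orthogonal to ref."""
    delta = [o - r for r, o in zip(ref, other)]
    scale = dot(delta, ref) / dot(ref, ref)
    parallel = [scale * r for r in ref]
    orthogonal = [d - q for d, q in zip(delta, parallel)]
    return parallel, orthogonal

def softmax(z, T=1.0):
    m = max(z)
    exps = [math.exp((v - m) / T) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def kl(p, q):
    """KL(p || q) in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def weighted_var(x, w):
    """Variance of x under non-negative (possibly unnormalized) weights w."""
    total = sum(w)
    mean = sum(wi * xi for wi, xi in zip(w, x)) / total
    return sum(wi * (xi - mean) ** 2 for wi, xi in zip(w, x)) / total

T = 1.0
z_dense = [2.0, 1.0, 0.5, -1.0]        # invented values
z_pruned = [2.05, 0.95, 0.55, -1.02]   # small pruning-induced perturbation
dz = [b - a for a, b in zip(z_dense, z_pruned)]

# Logit space: cosine plus parallel/orthogonal split of dz w.r.t. z_dense.
cos_z = cosine(z_dense, z_pruned)
dz_par, dz_orth = decompose(z_dense, z_pruned)

# Probability space: exact values and second-order estimates.
p_dense, p_pruned = softmax(z_dense, T), softmax(z_pruned, T)
real_kl = kl(p_pruned, p_dense)
kl_estimate = weighted_var(dz, p_dense) / (2 * T**2)
one_minus_cos = 1.0 - cosine(p_dense, p_pruned)
one_minus_cos_estimate = weighted_var(dz, [v * v for v in p_dense]) / (2 * T**2)
```

For perturbations this small the estimates should track the exact values closely; the gap grows as `dz` grows, since the expansion drops third-order terms.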

Theoretical Results

Theorem 1 (Local Deviation Induced by Pruning)

For cosine similarity in any representation space, the deviation induced by pruning can be approximately characterized via a second-order Taylor expansion (see Appendix C.1 in the paper).
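For orientation, the standard leading-order form of such an expansion is shown here. The notation is ours and the paper's exact statement (Appendix C.1) may differ: $x$ is the dense representation, $\Delta x$ the pruning-induced change, and $\Delta x_\perp$ its component orthogonal to $x$.

$$
1 - \cos(x,\, x + \Delta x) \;\approx\; \frac{\lVert \Delta x_\perp \rVert^2}{2\,\lVert x \rVert^2},
\qquad
\Delta x_\perp = \Delta x - \frac{\langle \Delta x,\, x \rangle}{\lVert x \rVert^2}\, x .
$$

Only the orthogonal component matters at this order: a change parallel to $x$ rescales the vector without rotating it.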

Theorem 2 (Sensitivity of Probability Space to Logit Perturbations)

To compare probability-space and logit-space deviations on the same footing, we rewrite probability-space deviation in terms of the logit variable $z$ (rather than applying Theorem 1 directly). Using a second-order Taylor expansion (see Appendix C.2), the probability-space cosine similarity admits a tractable approximation.
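One form consistent with this description, written in our own notation and derived from the first-order softmax response $\Delta p_i \approx p_i\,(\Delta z_i - \mathbb{E}_p[\Delta z])/T$, is given below. Treat it as a sketch; the paper's exact statement (Appendix C.2) may differ.

$$
1 - \cos\big(p,\, p'\big) \;\approx\; \frac{1}{2T^2}\,\mathrm{Var}_{q}(\Delta z),
\qquad
q_i = \frac{p_i^2}{\sum_j p_j^2},
$$

where $p = \mathrm{softmax}(z/T)$, $p' = \mathrm{softmax}((z + \Delta z)/T)$, and $\mathrm{Var}_q$ is the variance of $\Delta z$ under the weights $q$. Because $q$ concentrates on high-probability tokens, logit changes on those tokens dominate the probability-space deviation.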

Theorem 3 (Distributional Shift under Pruning)

In probability space, KL divergence is a standard measure of distributional shift under pruning. Based on the derivation in Appendix B, the pruning-induced KL can be approximated in a closed form.
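The standard second-order approximation of the KL divergence between two softmax distributions gives a sense of what such a closed form looks like. The notation is ours, and the paper's exact statement may differ.

$$
\mathrm{KL}\big(p_{\text{pruned}} \,\|\, p_{\text{dense}}\big) \;\approx\; \frac{1}{2T^2}\,\mathrm{Var}_{p}(\Delta z)
= \frac{1}{2T^2}\Big(\sum_i p_i\,\Delta z_i^2 - \big(\sum_i p_i\,\Delta z_i\big)^2\Big),
$$

with $p = \mathrm{softmax}(z_{\text{dense}}/T)$ and $\Delta z = z_{\text{pruned}} - z_{\text{dense}}$. A constant shift of all logits leaves the variance, and hence the KL, unchanged.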

Empirical Support and Key Findings

Matching Theorems to Observations

Cosine similarity at a representative Attention layer (embedding/logit/probability); panels: Angular Deviation, KL Divergence

Figure 6: Example layerwise signals. Cosine similarity and KL divergence can show different sensitivity across spaces at the same layer (illustrative Attention layer).

Top Tokens vs. Option Subspaces

Top-token distribution changes under pruning (panel: Top Tokens)

Answer-option subspace robustness under pruning (panel: Categorical Tokens)

Figure 7: Subspace vs global behavior. Comparing answer-option subspaces with full-vocabulary behavior reveals why some non-generative scores remain stable.

Task subspace analysis (MCQ)

representation-analysis/compare_mcq_subspace_metrics.py

# Dropped model
python compare_mcq_subspace_metrics.py \
  --analysis_mode dropped \
  --model_name Qwen/Qwen2.5-7B-Instruct \
  --dropped_root_path /path/to/dropped_results \
  --target_layer attn \
  --drop_n 8

# Pruned model
python compare_mcq_subspace_metrics.py \
  --analysis_mode pruned \
  --model_name /path/to/dense_model \
  --pruned_model_name /path/to/pruned_model

Purpose:

Compare global vocabulary-space behavior vs answer-option subspace behavior.
Mirrors the non-generative subspace robustness discussion in the paper.
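The contrast between full-vocabulary and answer-option behavior can be illustrated with a toy example. Everything here is hypothetical: the 8-token vocabulary, the logit values, the choice of ids 0 to 3 as stand-ins for the option tokens A/B/C/D, and the helper names. It is a sketch of the idea, not the script's logic.

```python
import math

def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def kl(p, q):
    """KL(p || q) in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def option_distribution(z, option_ids):
    """Renormalize over the answer-option token ids only."""
    return softmax([z[i] for i in option_ids])

def argmax(values):
    return max(range(len(values)), key=values.__getitem__)

# Toy vocabulary of 8 tokens; ids 0-3 play the role of A/B/C/D (assumption).
option_ids = [0, 1, 2, 3]
z_dense = [3.0, 1.0, 0.5, 0.2, 2.0, 1.5, 0.0, -1.0]
z_pruned = [2.8, 0.9, 0.6, 0.1, 0.5, 2.6, 1.0, -0.5]

# Global view: distribution shift over the whole vocabulary.
kl_full = kl(softmax(z_pruned), softmax(z_dense))

# Subspace view: shift after restricting to the option tokens.
kl_options = kl(option_distribution(z_pruned, option_ids),
                option_distribution(z_dense, option_ids))

# An MCQ score only depends on the argmax within the option subspace.
same_answer = (argmax([z_dense[i] for i in option_ids])
               == argmax([z_pruned[i] for i in option_ids]))
```

In this example the full-vocabulary distribution moves substantially while the option subspace barely changes and the predicted option stays the same, which is the kind of situation in which a non-generative score remains stable.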

Pruning-Induced Errors During Autoregressive Decoding

Final-step similarity in embedding/logit spaces (panel: Embedding and Logits)

Final-step similarity in probability/vocabulary space (panel: Probability Space)

Figure 8: Step-wise representation comparison during autoregressive decoding. Embedding/logit similarity can remain high while probability-space similarity (vocabulary distribution) shows larger deviation.

Generation-time divergence analysis

representation-analysis/compare_generation_metrics.py

# Dropped model
python compare_generation_metrics.py \
  --analysis_mode dropped \
  --model_name Qwen/Qwen2.5-7B-Instruct \
  --dropped_root_path /path/to/dropped_results \
  --target_layer attn \
  --drop_n 8

# Pruned model
python compare_generation_metrics.py \
  --analysis_mode pruned \
  --model_name /path/to/dense_model \
  --pruned_model_name /path/to/pruned_model

Purpose:

Compare dense vs target trajectories across decoding steps.
Report cosine/KL and second-order estimates tied to the paper’s Section 6 formulas.
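To make the trajectory comparison concrete, here is a small sketch with two toy next-token tables standing in for the dense and pruned models. The tables, the function names, and the greedy decoding setup are all assumptions for illustration; the actual script works on real model outputs.

```python
import math

def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def kl(p, q):
    """KL(p || q) in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical "models": next-token logits depend only on the last token.
DENSE = {
    0: [0.1, 2.0, 0.3, 0.0],
    1: [0.2, 0.1, 1.5, 1.4],   # narrow margin between tokens 2 and 3
    2: [1.8, 0.0, 0.2, 0.5],
    3: [0.0, 0.3, 0.1, 2.2],
}
PRUNED = {
    0: [0.15, 1.9, 0.3, 0.05],
    1: [0.2, 0.1, 1.4, 1.5],   # small shift flips the top token
    2: [1.7, 0.1, 0.2, 0.5],
    3: [0.0, 0.3, 0.2, 2.1],
}

def greedy_decode(table, prompt, steps):
    """Greedy decoding; returns only the newly generated tokens."""
    tokens = list(prompt)
    for _ in range(steps):
        z = table[tokens[-1]]
        tokens.append(max(range(len(z)), key=z.__getitem__))
    return tokens[len(prompt):]

def first_divergence(a, b):
    """Index of the first step where two token sequences differ, else None."""
    for t, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return t
    return None

prompt, steps = [0], 5
dense_tokens = greedy_decode(DENSE, prompt, steps)
pruned_tokens = greedy_decode(PRUNED, prompt, steps)
diverged_at = first_divergence(dense_tokens, pruned_tokens)

# Per-step KL(p_pruned || p_dense), both evaluated on the dense trajectory.
contexts = (prompt + dense_tokens)[:steps]
step_kl = [kl(softmax(PRUNED[c]), softmax(DENSE[c])) for c in contexts]
```

Each per-step KL is small, yet one flipped token at a narrow margin sends the pruned model down a different trajectory for all later steps. Single-step metrics do not see this compounding effect.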

Acknowledgements

Inter-layer layer/block dropping is adapted from LLM-Drop
Intra-layer pruning builds on Wanda and SparseGPT

Citation

If this repository helps your research, please cite the corresponding paper:

BibTeX:

@misc{he2026demystifyingpruningworksrepresentation,
      title={Demystifying When Pruning Works via Representation Hierarchies}, 
      author={Shwai He and Guoheng Sun and Haichao Zhang and Yun Fu and Ang Li},
      year={2026},
      eprint={2603.24652},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2603.24652}, 
}

📬 Contact

Shwai He: shwaihe@umd.edu
