Here are 34 public repositories matching this topic.
RWKV (pronounced RwaKuv) is an RNN with great LLM performance that can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". It combines the best of RNN and transformer: great performance, linear time, constant space (no KV cache), fast training, infinite ctx_len, and free sentence embedding.
Updated Feb 12, 2026 · Python
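As a rough illustration of the "linear time, constant space (no KV cache)" claim above, here is a minimal kernelized linear-attention recurrence in NumPy. This is a generic sketch, not RWKV's actual formulation (RWKV uses its own time-mixing and channel-mixing blocks); all function names, variable names, and the feature map phi are placeholders invented for this example.

```python
# Minimal sketch (not RWKV's actual method): a kernelized linear-attention
# recurrence that keeps a fixed-size state per head instead of a growing KV cache.
import numpy as np

def linear_attention_step(state, norm, q, k, v,
                          phi=lambda x: np.maximum(x, 0.0) + 1e-6):
    """One decoding step with constant memory.

    state: (d_k, d_v) running sum of phi(k) outer v
    norm:  (d_k,)     running sum of phi(k)
    q, k, v: per-token vectors of shapes (d_k,), (d_k,), (d_v,)
    """
    fk = phi(k)
    state = state + np.outer(fk, v)          # accumulate key-value associations
    norm = norm + fk                          # accumulate the normalizer
    fq = phi(q)
    out = fq @ state / (fq @ norm + 1e-6)     # attention output for this token
    return state, norm, out

# Usage: memory stays constant no matter how many tokens are processed.
d_k, d_v = 8, 8
state, norm = np.zeros((d_k, d_v)), np.zeros(d_k)
for _ in range(1000):
    q, k, v = np.random.randn(d_k), np.random.randn(d_k), np.random.randn(d_v)
    state, norm, out = linear_attention_step(state, norm, q, k, v)
```

Because each update only touches the fixed-size state, per-token decoding cost and memory stay constant regardless of sequence length, which is the property the description above refers to.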
Speed Always Wins: A Survey on Efficient Architectures for Large Language Models
SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse–Linear Attention
Updated Jan 17, 2026 · Python
[NeurIPS 2024] Official code of "LION: Linear Group RNN for 3D Object Detection in Point Clouds"
Updated Dec 16, 2025 · Python
CARE Transformer: Mobile-Friendly Linear Visual Transformer via Decoupled Dual Interaction
Updated Dec 3, 2025 · Python
MHLA: Restoring Expressivity of Linear Attention via Token-Level Multi-Head (ICLR 2026)
Updated Feb 6, 2026 · Python
Semantic segmentation of remote sensing images
Updated Jul 29, 2022 · Python
[NeurIPS 2025 Oral] Official Code for Exploring Diffusion Transformer Designs via Grafting
Updated Jan 9, 2026 · Jupyter Notebook
Official implementation of "MetaLA: Unified Optimal Linear Approximation to Softmax Attention Map" (NeurIPS 2024 Oral)
Updated Jan 18, 2025 · Python
Reference implementation of "Self-Attention at Constant Cost per Token via Symmetry-Aware Taylor Approximation" (Heinsen and Kozachkov, 2026)
Updated Feb 8, 2026 · Python
Reference implementation of "Softmax Attention with Constant Cost per Token" (Heinsen, 2024)
Updated Jun 6, 2024 · Python
Code for the paper "Cottention: Linear Transformers With Cosine Attention"
Updated Nov 15, 2025 · Cuda
Official Implementation of SEA: Sparse Linear Attention with Estimated Attention Mask (ICLR 2024)
Updated Jun 20, 2025 · Python
RWKV Wiki website (archived, please visit official wiki)
Updated Mar 26, 2023 · Shell
[ICML 2024] Official implementation of "LeaPformer: Enabling Linear Transformers for Autoregressive and Simultaneous Tasks via Learned Proportions."
Updated Nov 12, 2024 · Python
Efficient generative adversarial networks using linear additive-attention Transformers
Updated Jan 21, 2026 · Python
A Curated Collection of Frontier Language Model Architectures
LEAP: Linear Explainable Attention in Parallel for causal language modeling with O(1) path length and O(1) inference
Updated Jun 18, 2023 · Jupyter Notebook