Here are 34 public repositories matching this topic.
RWKV (pronounced RwaKuv) is an RNN with great LLM performance that can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". It combines the best of RNN and transformer: great performance, linear time, constant space (no KV cache), fast training, infinite ctx_len, and free sentence embedding.
Updated Feb 12, 2026 · Python
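As a rough illustration of the "linear time, constant space (no KV cache)" claim above, here is a minimal kernelized linear-attention recurrence in NumPy. This is a generic sketch, not RWKV's actual formulation (RWKV uses its own time-mixing and channel-mixing blocks); all function names, variable names, and the feature map phi are placeholders invented for this example.

```python
# Minimal sketch (not RWKV's actual method): a kernelized linear-attention
# recurrence that keeps a fixed-size state per head instead of a growing KV cache.
import numpy as np

def linear_attention_step(state, norm, q, k, v,
                          phi=lambda x: np.maximum(x, 0.0) + 1e-6):
    """One decoding step with constant memory.

    state: (d_k, d_v) running sum of phi(k) outer v
    norm:  (d_k,)     running sum of phi(k)
    q, k, v: per-token vectors of shapes (d_k,), (d_k,), (d_v,)
    """
    fk = phi(k)
    state = state + np.outer(fk, v)          # accumulate key-value associations
    norm = norm + fk                          # accumulate the normalizer
    fq = phi(q)
    out = fq @ state / (fq @ norm + 1e-6)     # attention output for this token
    return state, norm, out

# Usage: memory stays constant no matter how many tokens are processed.
d_k, d_v = 8, 8
state, norm = np.zeros((d_k, d_v)), np.zeros(d_k)
for _ in range(1000):
    q, k, v = np.random.randn(d_k), np.random.randn(d_k), np.random.randn(d_v)
    state, norm, out = linear_attention_step(state, norm, q, k, v)
```

Because each update only touches the fixed-size state, per-token decoding cost and memory stay constant regardless of sequence length, which is the property the description above refers to.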
Speed Always Wins: A Survey on Efficient Architectures for Large Language Models
SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse–Linear Attention
Updated Jan 17, 2026 · Python
[NeurIPS 2024] Official code of "LION: Linear Group RNN for 3D Object Detection in Point Clouds"
Updated Dec 16, 2025 · Python
CARE Transformer: Mobile-Friendly Linear Visual Transformer via Decoupled Dual Interaction
Updated Dec 3, 2025 · Python
MHLA: Restoring Expressivity of Linear Attention via Token-Level Multi-Head (ICLR 2026)
Updated Feb 6, 2026 · Python
Semantic segmentation of remote sensing images
Updated Jul 29, 2022 · Python
[NeurIPS 2025 Oral] Official Code for Exploring Diffusion Transformer Designs via Grafting
Updated Jan 9, 2026 · Jupyter Notebook
Official implementation of "MetaLA: Unified Optimal Linear Approximation to Softmax Attention Map" (NeurIPS 2024 Oral)
Updated Jan 18, 2025 · Python
Reference implementation of "Self-Attention at Constant Cost per Token via Symmetry-Aware Taylor Approximation" (Heinsen and Kozachkov, 2026)
Updated Feb 8, 2026 · Python
Reference implementation of "Softmax Attention with Constant Cost per Token" (Heinsen, 2024)
Updated Jun 6, 2024 · Python
Code for the paper "Cottention: Linear Transformers With Cosine Attention"
Updated Nov 15, 2025 · Cuda
Official Implementation of SEA: Sparse Linear Attention with Estimated Attention Mask (ICLR 2024)
Updated Jun 20, 2025 · Python
RWKV Wiki website (archived, please visit official wiki)
Updated Mar 26, 2023 · Shell
[ICML 2024] Official implementation of "LeaPformer: Enabling Linear Transformers for Autoregressive and Simultaneous Tasks via Learned Proportions."
Updated Nov 12, 2024 · Python
Efficient generative adversarial networks using linear additive-attention Transformers
Updated Jan 21, 2026 · Python
A Curated Collection of Frontier Language Model Architectures
LEAP: Linear Explainable Attention in Parallel for causal language modeling with O(1) path length and O(1) inference
Updated Jun 18, 2023 · Jupyter Notebook