This repo implements the decoder-only variant of the Transformer architecture introduced in Attention Is All You Need (Vaswani et al.). The code is heavily commented, and the comments answer the questions I had when first learning about the Transformer.
chenneking/transformer-explained
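
For readers new to the term, what makes a Transformer "decoder-only" is mostly the causal mask in self-attention: position i may only attend to positions j <= i. Below is a minimal sketch of that one idea in plain Python. It is not taken from this repo's code; the function name `causal_self_attention` is hypothetical, and it assumes the queries, keys, and values are already given (no learned projections, no multiple heads, no batching).

```python
import math


def causal_self_attention(q, k, v):
    """Single-head scaled dot-product attention with a causal mask.

    q, k, v are lists of equal-length rows (one row per position).
    Returns (outputs, weights); weights[i][j] is how much position i
    attends to position j, and is exactly 0.0 for every j > i.
    """
    d_k = len(q[0])
    seq_len = len(q)
    weights = []
    for i in range(seq_len):
        # Scaled dot products against the visible keys only (j <= i).
        scores = [
            sum(a * b for a, b in zip(q[i], k[j])) / math.sqrt(d_k)
            for j in range(i + 1)
        ]
        # Softmax over the visible positions; subtracting the max keeps exp() stable.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        row = [e / total for e in exps]
        # Future positions are masked out, so they get zero weight.
        weights.append(row + [0.0] * (seq_len - i - 1))

    # Each output row is the attention-weighted sum of the value rows.
    d_v = len(v[0])
    outputs = [
        [sum(w * v[j][c] for j, w in enumerate(row)) for c in range(d_v)]
        for row in weights
    ]
    return outputs, weights


q = k = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
outputs, weights = causal_self_attention(q, k, v)
```

Because the first position can only see itself, its output is just the first value row. Every later position mixes the value rows up to and including its own.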