GitHub - mytechnotalent/mechanistic_interpretability

README_files
LICENSE
README.md
corpus.json
mechanistic_interpretability.ipynb

About

Mechanistic Interpretability (MI) is a subfield of AI alignment and safety research focused on reverse-engineering neural networks to understand their internal computational mechanisms by discovering the actual algorithms and circuits they learn.

Topics: model, transformers, neural-networks, gpt, model-interpretation, mechanistic-interpretability, generative-pre-trained-transformer

MIT license