CircuitsVis
Mechanistic Interpretability Visualizations using React (by alan-cooney)
TransformerLens
A library for mechanistic interpretability of GPT-style language models (by TransformerLensOrg)
| | CircuitsVis | TransformerLens |
|---|---|---|
| Mentions | 1 | 3 |
| Stars | 140 | 993 |
| Growth | - | 9.8% |
| Activity | 4.7 | 9.2 |
| Latest commit | about 2 months ago | 1 day ago |
| Language | Jupyter Notebook | Python |
| License | MIT License | MIT License |
The number of mentions indicates the total number of mentions we've tracked, plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
CircuitsVis
Posts with mentions or reviews of CircuitsVis. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-04-21.
- Intermediate Activations – The Forward Hook
For those interested in playing with or doing research using model internals, the TransformerLens [1] project appears to be the leading open-source tooling in this area. It allows for loading dozens of different models, adding hooks, displaying activations in a format compatible with CircuitsVis, and other (mechanistic) interpretability work.
[1] https://github.com/neelnanda-io/TransformerLens
[2] https://github.com/alan-cooney/CircuitsVis
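The forward-hook pattern that the post title refers to can be sketched in plain Python. The `Module` and `Double` classes below are toy stand-ins (not from TransformerLens or PyTorch); they only mirror the shape of `torch.nn.Module.register_forward_hook`, which is the mechanism TransformerLens builds on for capturing intermediate activations.

```python
# Minimal sketch of the forward-hook pattern, assuming a toy Module
# class. In PyTorch the equivalent is nn.Module.register_forward_hook;
# TransformerLens wraps this idea in named HookPoints.

class Module:
    def __init__(self):
        self._forward_hooks = []

    def register_forward_hook(self, hook):
        # Each hook is called as hook(module, input, output)
        # after every forward pass.
        self._forward_hooks.append(hook)

    def __call__(self, x):
        out = self.forward(x)
        for hook in self._forward_hooks:
            hook(self, x, out)
        return out

class Double(Module):
    """Toy 'layer' that doubles its input."""
    def forward(self, x):
        return 2 * x

# Capture intermediate activations without modifying the layer itself.
activations = []
layer = Double()
layer.register_forward_hook(lambda mod, inp, out: activations.append(out))

print(layer(3))      # 6
print(activations)   # [6] -- the activation was recorded by the hook
```

The same idea scales up: a hook attached to a real transformer layer can stash attention patterns or residual-stream activations, which is the data CircuitsVis-style visualizations consume.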
TransformerLens
Posts with mentions or reviews of TransformerLens. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-04-21.
- I made an open-source Python package, TorchLens, that can visualize the structure of any PyTorch model and extract any intermediate activations you want in one line of code.
Very cool! How does the functionality compare to TransformerLens?
- Show HN: Visual intuitive explanations of LLM concepts (LLM University)
Two additional ones that come to mind now are:
Transformer Feed-Forward Layers Are Key-Value Memories https://arxiv.org/abs/2012.14913
The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns via Spotlights of Attention https://arxiv.org/abs/2202.05798
https://github.com/neelnanda-io/TransformerLens