Project #2: LLM Visualization
So I created a web-page to visualize a small LLM, of the sort that's behind ChatGPT. Rendered in 3D, it shows all the steps to run a single token inference. (link in bio)
Dec 2, 2023 · 8:09 PM UTC
It also contains a walkthrough/guide of the steps, as well as a few interactive elements to play with.
Why, you ask? For what purpose did I put all the time & effort into this project?
There's a real advantage to unpacking a set of abstractions, flattening them out. Abstractions can be useful for terseness and management, but they can be a real blocker to seeing the big picture.
With this, you can see the whole thing at once. You can see where the computation takes place, its complexity, and relative sizes of the tensors & weights.
The model with all the animations is tiiny, to make it tractable. For comparison, I threw in a few of the larger models (GPT-2, GPT-3), render-only.
And when you see what it takes to just produce a single value in a mat-mul, the sheer scale of these things becomes apparent.
What about understanding what each layer does? Uhh, sorry, won't be much help.
The project just came out of "Let's build a 3D viz!", so the scope is a bit limited. It's more: here's a way to learn & digest the algorithm, and perhaps think about how to optimize the process.
As for what I got out of creating this: before I made it, I mostly knew how image convolution nets worked, but language-based models seemed kinda magical in comparison.
Well, now I know them in a fair amount of detail!
I also learnt a good amount of GL (dF/dx, fwidth, ubos, instancing), and animation approaches. So, uhh, even if no-one sees this, the project definitely has some value to me.
Oh yeah, the link is here: bbycroft.net/llm/ Works best on desktop (sorry mobile). Left-click drag, right-click rotate, scroll to zoom. And hover over the tensor cells. Blue cells are weights/parameters, green cells are intermediate values. Each cell is a single number!
Well, I hope you find it interesting. Let me know your thoughts! And if someone makes it through the walkthrough and finds it a little ~incomplete towards the end I might even getting around to fix it (my attention has largely turned to other projects oops)