
The Illustrated Transformer is fantastic, but I would suggest that anyone going into it first read the earlier articles in the series to build a foundation, and then the later articles that cover GPT and BERT. Here's the list:

A Visual and Interactive Guide to the Basics of Neural Networks - https://jalammar.github.io/visual-interactive-guide-basics-n...

A Visual And Interactive Look at Basic Neural Network Math - https://jalammar.github.io/feedforward-neural-networks-visua...

Visualizing A Neural Machine Translation Model (Mechanics of Seq2seq Models With Attention) - https://jalammar.github.io/visualizing-neural-machine-transl...

The Illustrated Transformer - https://jalammar.github.io/illustrated-transformer/

The Illustrated BERT, ELMo, and co. (How NLP Cracked Transfer Learning) - https://jalammar.github.io/illustrated-bert/

The Illustrated GPT-2 (Visualizing Transformer Language Models) - https://jalammar.github.io/illustrated-gpt2/

How GPT3 Works - Visualizations and Animations - https://jalammar.github.io/how-gpt3-works-visualizations-ani...

The Illustrated Retrieval Transformer - https://jalammar.github.io/illustrated-retrieval-transformer...

The Illustrated Stable Diffusion - https://jalammar.github.io/illustrated-stable-diffusion/

If you want to learn how to code these models, this book is great: https://d2l.ai/chapter_attention-mechanisms-and-transformers...

