
Sunday Apr 07, 2024
Episode 11.83: More attention than you need; Query, Key and Value embeddings almost explained.
Going beyond mere token input embeddings to the attention, self-attention, and multi-head attention process that lies at the heart of the transformer architecture and generative prompt/completion models. The central section of the episode makes minor additions to the first and third.
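For anyone who wants the mechanism in code rather than audio, here is a minimal sketch of single-head scaled dot-product self-attention, the Query/Key/Value process the episode describes. The names (W_q, W_k, W_v) and all shapes are illustrative assumptions, not taken from the episode itself.

```python
# Minimal sketch of single-head self-attention over token embeddings.
# All matrix names and dimensions are illustrative assumptions.
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(X, W_q, W_k, W_v):
    """Single-head self-attention.

    X:   (seq_len, d_model) token input embeddings
    W_q, W_k, W_v: learned projections producing the Query,
                   Key and Value embeddings.
    """
    Q = X @ W_q            # queries: what each token is looking for
    K = X @ W_k            # keys:    what each token offers to match on
    V = X @ W_v            # values:  the content that gets mixed together
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # pairwise relevance, scaled
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V                  # weighted blend of values

# Toy usage: 4 tokens, model width 8, head width 4 (arbitrary sizes).
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
W_q, W_k, W_v = (rng.normal(size=(8, 4)) for _ in range(3))
print(attention(X, W_q, W_k, W_v).shape)  # (4, 4)
```

Multi-head attention simply runs several such heads in parallel, each with its own projection matrices, and concatenates their outputs before a final linear projection.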