
Sunday Apr 07, 2024
Episode 11.83: More attention than you need; Query, Key and Value embeddings almost explained.
Going beyond mere token input embeddings to the attention, self-attention, and multi-head attention process that lies at the heart of the transformer architecture and generative prompt/completion models. The central section of the episode makes minor additions to the first and third.
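For anyone who wants the mechanism in code rather than audio, here is a minimal sketch of single-head scaled dot-product self-attention, the Query/Key/Value process the episode describes. The names (W_q, W_k, W_v) and all shapes are illustrative assumptions, not taken from the episode itself.

```python
# Minimal sketch of single-head self-attention over token embeddings.
# All matrix names and dimensions are illustrative assumptions.
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(X, W_q, W_k, W_v):
    """Single-head self-attention.

    X:   (seq_len, d_model) token input embeddings
    W_q, W_k, W_v: learned projections producing the Query,
                   Key and Value embeddings.
    """
    Q = X @ W_q            # queries: what each token is looking for
    K = X @ W_k            # keys:    what each token offers to match on
    V = X @ W_v            # values:  the content that gets mixed together
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # pairwise relevance, scaled
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V                  # weighted blend of values

# Toy usage: 4 tokens, model width 8, head width 4 (arbitrary sizes).
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
W_q, W_k, W_v = (rng.normal(size=(8, 4)) for _ in range(3))
print(attention(X, W_q, W_k, W_v).shape)  # (4, 4)
```

Multi-head attention simply runs several such heads in parallel, each with its own projection matrices, and concatenates their outputs before a final linear projection.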