Nebius’ Post


🔀 How transformers, RNNs and SSMs are more alike than you think

Recent research has exposed deep connections between different architectures: transformers, recurrent networks (RNNs), state space models (SSMs) and matrix mixers. This is exciting because it lets ideas transfer from one architecture to another.
In the next installment of our AI research series, we'll mainly follow papers like "Transformers are RNNs" and Mamba-2, getting elbow-deep in algebra to understand how:

* Transformers may sometimes be RNNs.
* State space models may hide inside the mask of the self-attention mechanism.
* Mamba may sometimes be rewritten as masked self-attention (a toy sketch follows below).

Read the article on our blog: https://lnkd.in/dQsyEnV5

#transformers #RNN #SSM #research #papers
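To make these equivalences concrete, here is a minimal NumPy sketch (our illustration, not code from the article): causal linear attention with a multiplicative decay mask, a 1-semiseparable "matrix mixer" in Mamba-2's terminology, produces exactly the same outputs as a gated linear RNN/SSM recurrence over an outer-product state. With all decays set to 1 it reduces to the plain linear-attention-as-RNN identity from "Transformers are RNNs". The feature map and normalization of real linear attention, and Mamba-2's full selective parameterization, are omitted for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 6, 4  # toy sequence length and head dimension

Q = rng.normal(size=(T, d))
K = rng.normal(size=(T, d))
V = rng.normal(size=(T, d))
a = rng.uniform(0.5, 1.0, size=T)  # per-step scalar decay (SSM/Mamba-style gate)

# --- Attention view: a masked "matrix mixer" ---
# Linear (softmax-free) attention scores, multiplied elementwise by a causal
# decay mask L with L[t, s] = a[s+1] * ... * a[t] for s <= t (1 on the diagonal).
L = np.zeros((T, T))
for t in range(T):
    for s in range(t + 1):
        L[t, s] = np.prod(a[s + 1 : t + 1])  # empty product = 1 when s == t
Y_attn = (L * (Q @ K.T)) @ V

# --- Recurrent view: a gated linear RNN / SSM ---
# State update S_t = a_t * S_{t-1} + k_t v_t^T, readout y_t = q_t^T S_t.
S = np.zeros((d, d))
Y_rnn = np.zeros((T, d))
for t in range(T):
    S = a[t] * S + np.outer(K[t], V[t])
    Y_rnn[t] = Q[t] @ S

print(np.allclose(Y_attn, Y_rnn))  # True: both views compute the same sequence map
```

The two loops unroll to the same sum, y_t = Σ_s L[t, s] (q_t · k_s) v_s, which is why the decay structure of the mask is exactly where the SSM "hides" inside self-attention.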
