Machine Learning Week Europe’s Post

Today’s LLMs such as #ChatGPT show impressive performance and have the potential to revolutionize our daily lives. All of these LLMs are based on the Transformer architecture with the Attention mechanism at its core. Because Attention scales quadratically with context length, processing long sequences is very expensive. In this talk, Maximilian Beck presents xLSTM, a novel architecture for #LLMs that scales only linearly in context length while still outperforming Transformers on language modeling. https://ow.ly/3Q1x50T7v0F #mlweek #machinelearning
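
As a rough illustration of the scaling argument (a minimal sketch, not the xLSTM implementation or any code from the talk), the Python snippet below contrasts self-attention, whose pairwise score matrix grows as O(T²) in sequence length T, with an LSTM-style recurrent update that touches each token once and therefore scales as O(T). All names and shapes here are illustrative assumptions.

```python
import numpy as np

def attention(q, k, v):
    # q, k, v: (T, d). The score matrix is (T, T), so compute and
    # memory both scale quadratically in the sequence length T.
    scores = q @ k.T / np.sqrt(q.shape[-1])        # (T, T)
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)      # softmax over keys
    return weights @ v                             # (T, d)

def recurrent_scan(x, W, U):
    # One fixed-size hidden state updated per token: O(T) in the
    # sequence length, with constant-size state (illustrative cell,
    # not the actual xLSTM update rule).
    h = np.zeros(W.shape[0])
    outputs = []
    for x_t in x:                                  # one step per token
        h = np.tanh(W @ h + U @ x_t)
        outputs.append(h)
    return np.stack(outputs)

T, d = 8, 4
rng = np.random.default_rng(0)
x = rng.normal(size=(T, d))
_ = attention(x, x, x)                             # builds a T x T matrix
_ = recurrent_scan(x, rng.normal(size=(d, d)), rng.normal(size=(d, d)))
```

Doubling T roughly quadruples the work in the attention path but only doubles it in the recurrent path, which is the efficiency gap the talk addresses.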
