Today we are proud to release two new research models: MathΣtral and Codestral Mamba. https://lnkd.in/gMy2aepz https://lnkd.in/gchwK-9D
Linked inside the MathΣtral article: "We’re contributing Mathstral to the science community to bolster efforts in advanced mathematical problems requiring complex, multi-step logical reasoning. The Mathstral release is part of our broader effort to support academic projects—it was produced in the context of our collaboration with Project Numina." (Schafer Gizel!) https://projectnumina.ai/
Awesome! Looking forward to your next open-source, medium-sized general model (7B to 56B). Europe needs you!
Ollama, you guys know what to do. Thank you on behalf of the community.
Does Mistral AI have a secret lab full of wizards 🧙♂️ helping to build these LLMs? Mistral won ❤️ again! Eager to try out MathΣtral and Codestral Mamba. Btw... Mistral AI, do you guys ever sleep? 😅
Soma Dhavala Bharat Shetty B we were discussing this last weekend!
The designer is on fire
Been excited for their Mamba2 model ever since I heard they were going to build it. I was also experimenting with Mamba for music generation not too long ago and saw big qualitative and performance gains compared to transformers.
I ♥️ Mistral. And finally, the first major release of a Mamba-based language model 😍
The Mamba2 model is incredible: 5x faster, with a massive context size. That said, I don't think creating a dedicated model for math is the way to go. It's extremely difficult to teach an LLM math because it lacks chain-of-thought reasoning. It's much better for the model to write some Python code and then execute it. That's a simple and effective solution, and it doesn't need absurd amounts of compute for pre-training such models.
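To make the "let the model write code" idea concrete, here is a minimal, hypothetical sketch: instead of asking an LLM to do arithmetic in text, you take the Python snippet it emits and execute it, then read back the computed value. The model call itself is mocked (the `llm_output` string is an assumed example answer, not a real API response); only the execution step is shown.

```python
def run_generated_code(snippet: str):
    """Execute a model-generated snippet and return its `result` variable.

    NOTE: a real system would sandbox this (separate process, resource
    limits, timeouts); calling exec() on untrusted model output is unsafe
    and is shown here only for illustration.
    """
    namespace: dict = {}
    # Empty __builtins__ is a token gesture at safety, not real sandboxing.
    exec(snippet, {"__builtins__": {}}, namespace)
    return namespace.get("result")


# Pretend the LLM was asked "What is 17 * 23 + 5?" and replied with code:
llm_output = "result = 17 * 23 + 5"
print(run_generated_code(llm_output))  # → 396
```

The appeal of this pattern is that correctness comes from the Python interpreter, not from the model's weights, so the model only has to translate the problem into code.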