Andy Savage’s Post


Principal Software Engineer - [Golang, Linux, AWS, Cybersecurity, VPN Tech, Git IaC, Software Development, Backend Architect]

Give Groq a try: it's a free, OpenAI-compatible API provider hosting Meta's Llama 3 models. The thing is, they design their own inference chips (LPUs, not GPUs), and on inference speed they completely trounce Nvidia GPUs. If that doesn't mean much, just sign up and give it a try: run the same query against GPT-4 and Groq's Llama 3 70B and see the difference in speed. It's amazing to see results come back almost instantly, rather than line by line.
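If you'd rather try it from code than the playground, the OpenAI-compatible API means you can call it with nothing but the standard library. A minimal sketch, assuming Groq's documented endpoint URL, the `llama3-70b-8192` model ID, and a `GROQ_API_KEY` environment variable (check the Groq console for the current values):

```python
import json
import os
import time
import urllib.request

# Assumed Groq endpoint; the API mirrors OpenAI's chat/completions shape.
GROQ_URL = "https://api.groq.com/openai/v1/chat/completions"

def tokens_per_second(completion_tokens: int, elapsed_s: float) -> float:
    """Throughput: tokens generated divided by wall-clock time."""
    return completion_tokens / elapsed_s

def query_groq(prompt: str, model: str = "llama3-70b-8192") -> tuple[str, float]:
    """Send one chat completion to Groq; return (answer, tokens/sec)."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    req = urllib.request.Request(
        GROQ_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ['GROQ_API_KEY']}",
        },
    )
    start = time.perf_counter()
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    elapsed = time.perf_counter() - start
    answer = data["choices"][0]["message"]["content"]
    return answer, tokens_per_second(data["usage"]["completion_tokens"], elapsed)
```

Because the request body is the same shape as OpenAI's, swapping the URL, key, and model name lets you run the identical prompt against GPT-4, which is what makes the side-by-side speed comparison so easy.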

Alex Banks

Building a better future with AI

I'm amazed by this. Llama 3 on Groq makes GPT-4 look like a grandpa: I asked both models to list all the prime numbers from 1 to 1000. Llama 3 hit over 830 tokens per second(!) and had generated the entire sequence a second time whilst GPT-4 was still inferencing. In an arena as competitive as LLMs, speed like this really sets you apart and opens up hundreds of use cases. You must give it a go. If you enjoy insights like this, follow me, Alex Banks, for more on AI.
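The arithmetic behind that figure is worth spelling out: at ~830 tokens per second, even a long completion finishes in a couple of seconds. A quick sketch (the 2,000-token answer length is a hypothetical example, not a measured value):

```python
def completion_time_s(tokens: int, tokens_per_second: float = 830.0) -> float:
    """Seconds needed to generate `tokens` at a given throughput."""
    return tokens / tokens_per_second

# A hypothetical 2,000-token answer at the quoted ~830 tok/s
# comes back in roughly two and a half seconds.
```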

