Mark Huang’s Post

Co-Founder at Gradient. Enterprise Agentic Automation

Really excited to share some of our learnings from integrating Amazon Web Services (AWS) Inferentia instances into our evals stack. I know a lot of our customers have trouble obtaining GPUs for quick eval iteration cycles in their development process, so we've worked to remove that friction. #AWS #GenerativeAI #LLMs #Evals

Gradient

We’re excited to share our open-source framework that lets you score generative language models across a variety of evaluation tasks and benchmarks - the same harness that powers leaderboards such as the Hugging Face Open LLM Leaderboard. While working with the Amazon Web Services (AWS) team to train our models on AWS Trainium, we realized that the mainstream tool for LLM evaluation left us constrained by both VRAM and the availability of GPU instances. Our open-source solution overcomes these challenges by integrating AWS Neuron, the SDK behind AWS Inferentia and Trainium, into lm-evaluation-harness.

Take a look at:
✅ How We Broke Down Our Tests
✅ The Challenges We Encountered
✅ An Example of Using the Testing Harness on AWS Inferentia

A huge thank you to Michael Feil and Jim Burtoft for the partnership and collaboration in giving back to the developer community. #Gradient #GradientAI #AWS #LLM #OpenSource #LLMEvaluation
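To give a feel for the workflow, here is a minimal sketch of scoring a model on an Inferentia instance through lm-evaluation-harness's Python API. The "neuronx" backend name, the checkpoint, and the task below are illustrative assumptions based on the integration described above, not an excerpt from our code.

```python
# Minimal sketch (illustrative only): running an lm-evaluation-harness
# benchmark against a Neuron-compiled model on AWS Inferentia.
# The "neuronx" model type, checkpoint, and task are assumptions for
# demonstration; swap in the model and benchmarks you care about.
import lm_eval

results = lm_eval.simple_evaluate(
    model="neuronx",                  # Neuron-backed model type (Inferentia/Trainium)
    model_args="pretrained=gpt2",     # illustrative Hugging Face checkpoint
    tasks=["lambada_openai"],         # illustrative benchmark task
    batch_size=1,
)
print(results["results"])             # per-task metrics for the run
```

The full walkthrough, including the CLI equivalent and instance setup, is in the AWS blog post linked below.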

Gradient makes LLM benchmarking cost-effective and effortless with AWS Inferentia | Amazon Web Services

aws.amazon.com
