One more step toward cheaper or faster LLM inference on AWS? Also, what about Neuron support on Lambda and ECS Fargate? (Cold start is an issue, I know.) https://lnkd.in/gyZgY-53
Gael Chardon’s Post
-
Manager II - Machine Learning Product Discovery @ Bed Bath & Beyond | Machine Learning, Computer Vision, NLP, LLMs
Deploying Llama 2 models can be expensive, but AWS offers purpose-built hardware in the Inferentia family that supports cost-effective deployment. The blog post discusses deploying Llama 2 on Amazon EC2 Inf2 instances, powered by AWS Inferentia2, for inference. It provides detailed steps for creating, compiling, and deploying the Llama-2 model using the latest AWS Neuron SDK release, achieving high performance at low cost. https://lnkd.in/gDib5znW
PyTorch
pytorch.org
-
Principal Solutions Architect @ AWS ☁ AI/ML Specialist ☁ GenAI Focused ☁ Leader in Tech ☁ 8x AWS Certified ☁ Cloud Computing ☁ Digital Dexterity ☁ Growth Hacking ☁ Scalable Solutions ☁ Design Thinking ☁ Game Tech
Accelerate LLM training with #Meta Llama 3 and #AWSTrainium. 🦙⚡ https://go.aws/3RTFged In this post, you'll learn best practices for training LLMs on AWS Trainium, scaling the training on a cluster with over 100 nodes, improving efficiency of recovery from system and hardware failures, improving training stability, & achieving convergence. #AWS
End-to-end LLM training on instance clusters with over 100 nodes using AWS Trainium | Amazon Web Services
aws.amazon.com
-
Deploy low-latency, high-throughput inference using AWS Graviton or AWS Inferentia on Amazon EKS. Read this guidance page to learn how to use prebuilt AWS solutions to deploy an ML inference workload. The guidance walks through how to pack thousands of unique PyTorch deep learning models into a scalable architecture using Graviton or Inferentia on Amazon EKS.
Guidance for Low Latency, High Throughput Inference using Efficient Compute on Amazon EKS
aws.amazon.com
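A common building block when serving thousands of models on shared compute is a lazy-loading model cache that keeps only the hottest models in memory. The sketch below illustrates that idea with a small LRU cache; it is not code from the AWS guidance, and `load_fn` is a hypothetical stand-in for whatever actually loads a compiled PyTorch model.

```python
from collections import OrderedDict

class ModelCache:
    """Minimal LRU cache that lazily loads models on first request and
    evicts the least recently used model when capacity is reached.
    Illustrative sketch only; `load_fn` stands in for a real model
    loader (e.g. torch.jit.load)."""

    def __init__(self, load_fn, capacity=3):
        self.load_fn = load_fn
        self.capacity = capacity
        self.models = OrderedDict()  # insertion order tracks recency

    def get(self, model_id):
        if model_id in self.models:
            self.models.move_to_end(model_id)    # mark as recently used
        else:
            if len(self.models) >= self.capacity:
                self.models.popitem(last=False)  # evict the LRU model
            self.models[model_id] = self.load_fn(model_id)
        return self.models[model_id]

# Usage with a dummy loader so the sketch runs anywhere:
cache = ModelCache(load_fn=lambda mid: f"model:{mid}", capacity=2)
cache.get("resnet-a")
cache.get("bert-b")
cache.get("resnet-a")        # cache hit, refreshes recency
cache.get("vit-c")           # evicts "bert-b"
print(sorted(cache.models))  # ['resnet-a', 'vit-c']
```

In a real deployment the eviction policy would also account for model memory footprint and per-model request rates, but the access pattern is the same.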
-
Check out this awesome blog post from my colleagues Jianying Lang, Fei Chen, et al. on training SOTA LLMs on AWS Trainium, our AWS ML chips.
Excellent work by our WWSO Advanced Computing and Service teams in scaling training workloads on Trainium. https://lnkd.in/gje7QHKg #hpc #genai #AWS
End-to-end LLM training on instance clusters with over 100 nodes using AWS Trainium | Amazon Web Services
aws.amazon.com
-
AWS Lead Instructor | Train the Trainer | AWS reStart Accredited Instructor | Training Program Curriculum Designer | Freelancer
Course here: https://lnkd.in/gpQf5B85 Take the course now; it's FREE and includes an AWS environment, so all your experiments are covered! DeepLearning.AI Amazon Web Services (AWS) #Serverless #GenAI #llms #amazonbedrock #aws
Serverless LLM apps with Amazon Bedrock
deeplearning.ai
-
Interested in GenAI prompt chaining on AWS? Then check out my two patterns on Serverlessland. Here I demonstrate two simple examples using Amazon API Gateway (REST) https://lnkd.in/gtFg_hmq and AWS AppSync (GraphQL) https://lnkd.in/g_K9JRjM Both invoke an Express state machine synchronously and use AWS Step Functions intrinsic functions to chain two prompts, which are then used to invoke the Amazon Bedrock language model. These no-code examples showcase how the results from the first prompt can provide context for the second prompt, allowing the language model to deliver a highly curated response. By chaining these prompts, the system can leverage the capabilities of the LLM to generate more meaningful and contextual outputs.
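The chaining pattern itself is small: the first model response is interpolated into the second prompt. A minimal Python sketch of that flow is below; `invoke` is a hypothetical stand-in for a call to the Bedrock InvokeModel API (the patterns above do this with Step Functions intrinsic functions rather than application code).

```python
def chain_prompts(invoke, first_prompt):
    """Two-step prompt chain: the first response becomes context for
    the second prompt. Illustrative sketch, not the Serverlessland
    pattern itself; `invoke` stands in for a Bedrock model call."""
    summary = invoke(first_prompt)
    second_prompt = (
        "Using the following context, write a detailed answer.\n"
        f"Context: {summary}"
    )
    return invoke(second_prompt)

# Stub model so the sketch runs without AWS credentials:
def stub_model(prompt):
    return f"[model output for: {prompt.splitlines()[0]}]"

print(chain_prompts(stub_model, "Summarize AWS Step Functions."))
# [model output for: Using the following context, write a detailed answer.]
```

In the Step Functions version, the same interpolation is done declaratively with `States.Format` inside the state machine definition, which is what makes the pattern no-code.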
-
The Alarm Context Tool (ACT) enhances AWS CloudWatch Alarms by providing additional context to aid in troubleshooting and analysis. By leveraging AWS services such as Lambda, CloudWatch, X-Ray, and Amazon Bedrock, this solution aggregates and analyzes metrics, logs, and traces to generate meaningful insights.
GitHub - aws-samples/alarm-context-tool: The Alarm Context Tool (ACT) enhances AWS CloudWatch Alarms by providing additional context to aid in troubleshooting and analysis.
github.com
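The starting point for this kind of tool is parsing the alarm notification to find out which metric, namespace, and dimensions to gather context for. The sketch below shows that first step against a payload shaped like the documented CloudWatch alarm SNS message format; it is an assumption-laden illustration, not code from the ACT repository.

```python
import json

def extract_alarm_context(sns_message):
    """Pull the fields needed to fetch related metrics, logs, and
    traces out of a CloudWatch alarm notification. Minimal sketch of
    the parsing an alarm-context Lambda would perform."""
    alarm = json.loads(sns_message)
    trigger = alarm["Trigger"]
    return {
        "alarm_name": alarm["AlarmName"],
        "namespace": trigger["Namespace"],
        "metric": trigger["MetricName"],
        "dimensions": {d["name"]: d["value"] for d in trigger["Dimensions"]},
        "reason": alarm["NewStateReason"],
    }

# Sample payload shaped like a CloudWatch alarm SNS notification:
sample = json.dumps({
    "AlarmName": "HighCPU",
    "NewStateReason": "Threshold Crossed: 1 datapoint > 80.0",
    "Trigger": {
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"name": "InstanceId", "value": "i-0abc123"}],
    },
})
print(extract_alarm_context(sample)["metric"])  # CPUUtilization
```

From there, the extracted namespace and dimensions would feed calls such as `get_metric_data` and log queries, with the aggregated context handed to a Bedrock model for summarization.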
-
Cloud Architect/ 5x AWS Certified/ Terraform Certified Specialist / Team Leader /Information Technology Management / Technical Pre-Sales / Product and Service design
A very enriching Amazon Web Services (AWS) Skill Builder course for getting to know, and practicing in hands-on labs, the steps in a Machine Learning Pipeline!
-
AWS Graviton Weekly # 98 https://lnkd.in/eWT6bJVE
Highlights of the week
🗞️ Amazon Web Services (AWS) Graviton-based EC2 instances now support hibernation
🗞️ New Amazon CloudWatch dimensions for Amazon EC2 On-Demand Capacity Reservations
🗞️ Amazon EC2 Fleet and EC2 Auto Scaling groups now support aliases for Amazon Machine Images (AMIs)
⛏️ RAG solution on Amazon Bedrock - Part 9: Optimizing ECS and EKS Infrastructure with AWS Graviton, by Vivek V
⛏️ RISE with SAP on AWS delivers consumer goods innovation, by Justin Honaman
⛏️ Amazon EC2 R8g Instances with AWS Graviton4 Processors Generally Available, by Steef-Jan Wiggers
⛏️ AMD vs Intel vs Graviton: Comparison between Modern Processors, by Visak Krishnakumar
⛏️ Using the MAQAO framework to analyze application performance across instances, by Hugo Bolloré, Cédric Valensi, and William Jalby
⛏️ Cloudera Data Engineering 1.22 for Public Cloud: Support for AWS Graviton, Spark 3.5 + Iceberg 1.4, and More!
⛏️ Top 6 Strategies for AWS EMR (Elastic MapReduce) Cost Optimization, by Sanika Kotgire
⛏️ Databricks Runtime ML on AWS Graviton instances
⛏️ Get a 50% Price-Performance Boost With StarRocks on AWS Graviton3, by Andy Ye
⛏️ Read how Simon Phillips (CTO at SecureAck) migrated its platform to AWS Graviton
⛏️ Exploring AWS Lambda: Performance and Cost Analysis of Different Architectures and Deployment Methods, by Evgeny Lukashov
⛏️ Move workloads on x86-based instances to AWS Graviton | The Keys to AWS Optimization | S10 E11, with Steph Gooch, Rem Baumann, and Hahnara Hyun
⛏️ Harness the power of Karpenter to scale, optimize & upgrade Kubernetes | Architecting on AWS
⛏️ Watch Jeff Underhill and Sunita Nadampalli demonstrate blazing-fast performance of llama3.1 on Graviton, using the highly optimized llama.cpp framework
⛏️ 269: Crowdstrike: Does Anyone Know the Graviton of this Situation? by Justin Brodley, Jonathan Baker, Ryan Lucas, and Matthew K.
⛏️ SUSE x Arm at SUSECON 2024, with Robert Sirchia and Andrew Wafaa
⛏️ Applying FinOps to GenAI: Perspectives and Best Practices from Grammarly and Intuit, by Jason Rhoades and Josh Collier
#aws #awsgraviton #finops #kubernetes
AWS Graviton Weekly # 98
awsgravitonweekly.com