Baseten

Software Development

San Francisco, CA · 4,151 followers

Fast, scalable inference in our cloud or yours

About us

At Baseten we provide all the infrastructure you need to deploy and serve ML models performantly, scalably, and cost-efficiently. Get started in minutes without getting tangled in complex deployment processes. Deploy best-in-class open-source models and take advantage of optimized serving for your own models. Our horizontally scalable services take you from prototype to production, with light-speed inference on infra that autoscales with your traffic. Best in class doesn't mean breaking the bank: run your models on the best infrastructure without running up costs with our scale-to-zero feature.

Website
https://www.baseten.co/
Industry
Software Development
Company size
11-50 employees
Headquarters
San Francisco, CA
Type
Privately Held
Specialties
developer tools and software engineering

Updates

  • 🎉 We’re excited to announce Baseten Self-hosted for unparalleled control over AI model deployments! 👉🏻 Check out our announcement blog to learn more: https://lnkd.in/gVR6GhQ6

    After working with countless AI builders across different industries, we consistently heard the need for a high-performance inference solution running in their VPC to:
    • Meet strict data residency requirements
    • Align with organizational and industry compliance standards
    • Leverage existing cloud commitments and resources
    • Customize hardware and GPU usage

    Both Baseten Cloud and Baseten Self-hosted offer enterprise-grade security, performance, and reliability. Baseten Self-hosted is designed for companies and enterprises that need enhanced control over infrastructure and data while gaining the performance, reliability, and scale we specialize in.

    🥇 Baseten Self-hosted enables you to run inference in your own VPC with the same user experience as our Cloud offering. Model inference inputs and outputs go directly to your compute—they never touch our premises.

    💚 We love to support our customers with state-of-the-art AI inference. If Baseten Self-hosted can help you meet your security and compliance needs, provide necessary control over hardware, or leverage your existing resources, get in touch! https://lnkd.in/gSQWwH5m

  • Bland AI announced $22M of funding and launched on Product Hunt today! 🎊 We’re so pumped to support the future of AI phone calling that we decided to throw an end-of-summer party with them. 🍸 Check the comments for the registration link. With Baseten, Bland reduced end-to-end call latency from 3 seconds to under 400 milliseconds and gained seamless traffic-based autoscaling to meet customer demands—with 50x growth in usage and 100% uptime to date. Check out the story, support their Product Hunt launch, and come celebrate with us!

  • toby founders Lucas Campa 🤌 and Vincent Wilmet 🤌 came to Baseten one week away from their startup’s Product Hunt launch. Their AI-powered real-time translation service allows people to have a live video call while speaking different languages. After working with our engineers, Vincent and Lucas migrated from their development infrastructure to an ultra-low-latency production-ready deployment on Baseten—and reached #3 on Product Hunt on launch day, with zero minutes of downtime. 🔥 Read their story: https://lnkd.in/efz2_DKb

  • You love building robust systems and processes? Join us as an SRE. ⚙ Optimizing AI models? Join our model performance team. 🚀 Engaging with potential customers? Join our sales team as an SDR! 💪 We're thrilled to welcome many new team members, but we're not stopping there! We're hiring for 9 open roles; take a look: https://lnkd.in/eMHByrHz 📣 Share or tag someone you know would be a great fit!

  • Using open-source ML models offers a few advantages: 🎛️ Control (over model inputs, outputs, and environment) 📊 Custom optimizations 💰 Predictable spend 👉 With so many open-source models to pick from, Philip Kiely put together a guide on how to choose the right model for your use case—take a look: https://lnkd.in/eduzEivF And as always, you can launch all of these models from our model library. 🔥 https://lnkd.in/eKJebzGs

  • Philip Kiely got tired of waiting 8-10 seconds for Stable Diffusion XL to generate images on an A100, so he set out to make it faster. 🏎 Using 5 different optimizations, he first made it 5x faster: SDXL inference took only 1.92 seconds 💪 (see how: https://lnkd.in/e2ABQxX8). Then, by adding TensorRT to the mix, Philip Kiely and Pankaj Gupta decreased latency by another 40%! Take a look: https://lnkd.in/ePqpa6Hj 🏅 Optimizing model performance is one of our specialties. If you're looking to optimize your own models in production, give us a shout!
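
    As a quick back-of-the-envelope check of how those speedups compound (the measured numbers come from the linked posts; the arithmetic here is just illustrative):

        # Compounding the reported SDXL speedups; measured values are from
        # the linked benchmark posts, the math here is only a sanity check.
        baseline_s = 1.92 * 5                    # ~9.6 s, within the original 8-10 s range
        after_five_opts_s = 1.92                 # 5x faster after the first five optimizations
        after_tensorrt_s = after_five_opts_s * (1 - 0.40)  # TensorRT cuts latency another 40%
        print(f"{baseline_s:.2f} s -> {after_five_opts_s:.2f} s -> {after_tensorrt_s:.2f} s")
        # 9.60 s -> 1.92 s -> 1.15 s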

  • What precision format do you use for LLM serving? 🤔 LLMs have billions of parameters that translate to billions of numbers needing to be stored, read, and processed when they're run. FP16 has been a common default format, but it's increasingly common to serve LLMs using FP8—and for good reasons. FP8 can massively improve inference speed and decrease operational costs, with less output quality degradation compared to other techniques. 💡 Learn more about FP8 quantization in Philip Kiely's article: https://lnkd.in/eKvQzsni Tell us: what precision formats do you use for your models? 🧮
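
    A rough sketch of why FP8 helps (the 70B parameter count below is a hypothetical example, not a Baseten benchmark): halving the bytes per parameter halves the weights that must be stored and read on every forward pass.

        # FP16 stores each parameter in 2 bytes; FP8 stores it in 1 byte,
        # halving the memory (and memory bandwidth) needed to serve the model.
        params = 70e9                       # hypothetical 70B-parameter LLM
        fp16_gb = params * 2 / 1e9          # 140 GB of weights in FP16
        fp8_gb = params * 1 / 1e9           # 70 GB of weights in FP8
        print(f"FP16: {fp16_gb:.0f} GB, FP8: {fp8_gb:.0f} GB")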

  • 🛠 We built Truss, an open-source model packaging framework, to give developers unparalleled control and simplicity for serving ML models. Model serving requires iterative development; Truss addresses this need with live reload. With Truss, the upload-build-deploy loop is practically instantaneous ⚡️; without it, each iteration can take anywhere from 3 to 30 minutes! 🐌 🧠 Our Co-Founder Pankaj Gupta wrote a technical deep-dive on Truss' live reload feature on our blog; check it out: https://lnkd.in/e6XasSbc ⭐ Or take a look at Truss on GitHub: https://lnkd.in/gAivnGWz
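
    For a feel of the packaging format, here's a minimal sketch of a Truss model based on its documented load/predict interface (the "model" itself is a stand-in stub; see the docs for real examples):

        # model/model.py in a Truss — load() runs once at startup (and again
        # on live reload), predict() runs per request. The model is a stub.
        class Model:
            def __init__(self, **kwargs):
                self._model = None

            def load(self):
                # Load weights here; live reload re-runs this on code changes.
                self._model = lambda text: text[::-1]  # placeholder "model"

            def predict(self, model_input):
                # model_input is the deserialized request body.
                return {"output": self._model(model_input["text"])}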

  • Using Medusa, we achieved a 94% to 122% increase in tokens per second for Llama 3! 🤯 Medusa is a method for generating multiple tokens per forward pass during LLM inference. Once more fundamental optimizations (like quantization, H100 GPUs, and TensorRT-LLM) are in place, further speedups require cutting-edge inference techniques like Medusa. Check out Philip Kiely and Abu Qader's new article to learn how Medusa works, how it performs on different benchmarks, and how you can use a Medusa-optimized LLM in production! 💪 https://lnkd.in/eK9i3hTu
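
    A toy sketch of the draft-and-verify idea behind Medusa (stub functions, not our implementation; in real Medusa, all candidate tokens are verified in a single batched forward pass with tree attention):

        # Extra "Medusa heads" guess several future tokens per step; the base
        # model keeps the longest prefix of guesses it agrees with.
        def base_model_next(tokens):
            return (tokens[-1] + 1) % 50          # stub for the base LLM

        def draft_heads(tokens, k=4):
            return [(tokens[-1] + i) % 50 for i in range(1, k + 1)]  # stub heads

        def medusa_step(tokens):
            accepted = []
            for guess in draft_heads(tokens):
                if base_model_next(tokens + accepted) == guess:
                    accepted.append(guess)        # guess verified, keep going
                else:
                    break
            if not accepted:                      # always emit at least one token
                accepted = [base_model_next(tokens)]
            return tokens + accepted

        seq = [0]
        for _ in range(3):
            seq = medusa_step(seq)
        print(seq)  # several tokens accepted per step instead of one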
