It's that time again! We're back with the latest updates here at #VastAI, aimed at bringing you the best possible GPU rental platform experience. Last month we rolled out numerous template updates and added a new guide to our Docs on serving Infinity Embeddings. https://lnkd.in/grZzpS-i
Vast.ai
Software Development
Los Angeles, California 1,478 followers
Peer GPU rental: One simple interface to search, compare and utilize GPU computing at the best prices.
About us
Vast.ai is the market leader for low-cost GPU rentals. The service connects data centers and professionals running the Vast hosting software with users who can quickly find the best deals for compute according to their specific requirements. Vast.ai GPU rentals are ~3-5X cheaper than current alternatives. Consumer computers, and consumer GPUs in particular, are considerably more cost-effective than equivalent enterprise hardware. We are helping the millions of underutilized consumer GPUs around the world enter the cloud computing market for the first time.
- Website: https://vast.ai
- Industry: Software Development
- Company size: 2-10 employees
- Headquarters: Los Angeles, California
- Type: Privately Held
- Founded: 2018
Locations
- Primary: 6600 W Sunset Blvd, STE 256, Los Angeles, California 90028, US
Updates
-
Medusa is slightly different than other types of speculative decoding in that it adds a piece of the original model to do the speculation. TGI is the first major serving framework for large language models that enables Medusa-style speculative decoding.
Serving Online Inference with TGI and Medusa on Vast.ai
vast.ai
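As a rough illustration of the serving setup the post above refers to, the sketch below queries a running TGI endpoint with the huggingface_hub client. The URL, prompt, and generation settings are placeholder assumptions, not values from the guide.

```python
# Minimal sketch: querying a running text-generation-inference (TGI) endpoint.
# The URL, prompt, and generation settings below are placeholder assumptions;
# substitute the address of your own Vast.ai instance and deployed model.
from huggingface_hub import InferenceClient

client = InferenceClient("http://localhost:8080")  # assumed TGI address

# Medusa-style speculation, if enabled on the server, is transparent here:
# the client just sends a prompt and reads back generated text.
output = client.text_generation(
    "Explain speculative decoding in one sentence.",
    max_new_tokens=64,
)
print(output)
```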
-
As the year winds down, rumors are intensifying around NVIDIA's highly anticipated GeForce RTX 5090 GPU. Industry insiders are divided on the release date, with some sources suggesting a launch just in time for Christmas, while other reports point to a formal announcement at CES 2025 in the new year.
NVIDIA RTX 5090: Out by Christmas? A Look at the Latest Rumors
vast.ai
-
In the complex landscape of data center operations, understanding and adhering to various compliance standards is crucial.
Navigating Data Center Compliance: Understanding Tier 2/3 and HIPAA/ISO 27001 Standards
vast.ai
-
Medusa is a method of speculative decoding. Speculative decoding speeds up inference of large language models by having a smaller model draft multiple tokens and letting the larger model simply verify them. Verification by the large model is cheaper than generating the tokens itself, so if the smaller model is accurate enough, the overall cost of generating tokens goes down. Medusa is slightly different than other types of speculative decoding in that it adds a piece of the original model to do the speculation.
Serving Online Inference with TGI and Medusa on Vast.ai
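To make the draft-and-verify idea concrete, here is a toy, model-free sketch of speculative decoding: a cheap draft function proposes a few tokens and an expensive target function only checks them, keeping the longest prefix it agrees with plus one corrected token. It is purely illustrative; the token functions are made up and it does not reflect TGI's or Medusa's actual implementation, where the target model verifies all drafted positions in a single forward pass.

```python
# Toy illustration of speculative decoding (not TGI's or Medusa's real code).
# A cheap draft proposes k tokens; the expensive target model then checks them
# and keeps the longest prefix it agrees with, plus one corrected token.

def target_model_next(context):
    """Expensive 'target' model: the single true next token for a context."""
    return (context[-1] * 3 + 1) % 10

def draft_model(context, k):
    """Cheap 'draft' model: imitates the target but errs at step 2."""
    out, ctx = [], list(context)
    for i in range(k):
        tok = (ctx[-1] * 3 + 1) % 10
        if i == 2:                     # deliberately wrong, to show a rejection
            tok = (tok + 5) % 10
        out.append(tok)
        ctx.append(tok)
    return out

def speculative_step(context, k=4):
    """One draft-and-verify round; returns the tokens accepted this round."""
    proposal = draft_model(context, k)
    # A real serving stack verifies all k positions in ONE target forward pass;
    # here we compare token by token for clarity.
    accepted = []
    for tok in proposal:
        true_tok = target_model_next(context + accepted)
        if tok == true_tok:
            accepted.append(tok)       # draft agreed with the target: keep it
        else:
            accepted.append(true_tok)  # disagreement: take the target's token, stop
            break
    return accepted

sequence = [7]
for _ in range(4):
    step = speculative_step(sequence)
    print("accepted this round:", step)
    sequence += step
print("final sequence:", sequence)
```

In this toy run each round accepts two drafted tokens plus one correction from the target, which is where the speedup comes from when drafting is much cheaper than target generation.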
-
This guide will show you how to set up SGLang to serve a language model on Vast.
Serving sglang on Vast
vast.ai
-
The L40S was developed to meet the surging demand for GPUs that can handle the intense computational requirements of machine learning training and inference. How does it stack up against the L40 -- and which one do you need?
Comparing NVIDIA L40 vs. L40s – and More
vast.ai
-
SGLang provides an OpenAI-compatible server, allowing you to easily integrate it into chatbots and other applications. As companies develop their AI products, they often face challenges like rate limits and high costs when using these models. With SGLang on Vast, you can run your own models in the form factor you need, at a much more affordable price point. As inference demand grows with agents and complex workflows, SGLang on Vast excels in performance and affordability where it matters most.
Serving sglang on Vast
vast.ai
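Because the server speaks the OpenAI API, an existing client can be pointed at it by swapping the base URL. The sketch below assumes a placeholder address, API key, and model name; substitute the values of your own SGLang instance on Vast.

```python
# Minimal sketch: calling an SGLang OpenAI-compatible endpoint with the
# standard openai client. base_url, api_key, and model are assumed
# placeholders; use the address and model of your own Vast instance.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:30000/v1",  # assumed SGLang server address
    api_key="EMPTY",                        # self-hosted servers typically ignore the key
)

response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # whichever model the server loaded
    messages=[{"role": "user", "content": "Summarize why dynamic batching helps throughput."}],
    max_tokens=64,
)
print(response.choices[0].message.content)
```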
-
When deciding between the A100 and H100, consider your specific workload requirements. If you need top-tier double-precision performance and superior memory bandwidth, or you're dealing with next-gen HPC at datacenter scale and trillion-parameter AI, the H100 is the clear winner. For a more versatile and cost-effective solution that still delivers powerful AI performance, the A100 is a solid choice.
H100 vs A100: Comparing Two Powerhouse GPUs
vast.ai
-
vLLM is now more flexible than ever, as it also supports embedding models. This brings vLLM's dynamic batching and PagedAttention to embedding models for much faster throughput, all from the Docker image that developers are used to. This guide will show you how to set up vLLM to serve embedding models on Vast.
Serving vLLM Embeddings on Vast.ai
vast.ai
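As a minimal sketch of what the guide covers, the snippet below requests embeddings from a vLLM server through its OpenAI-compatible /v1/embeddings route. The base URL and model name are assumptions for illustration; point them at the instance and embedding model you actually deploy.

```python
# Minimal sketch: requesting embeddings from a vLLM server via its
# OpenAI-compatible /v1/embeddings route. base_url and model are placeholder
# assumptions; point them at your own Vast.ai instance and embedding model.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumed vLLM server address
    api_key="EMPTY",
)

result = client.embeddings.create(
    model="BAAI/bge-base-en-v1.5",  # example embedding model name
    input=[
        "Vast.ai makes GPU compute affordable.",
        "vLLM batches embedding requests dynamically.",
    ],
)
print(len(result.data), "embeddings, dimension", len(result.data[0].embedding))
```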