Baseten’s Post

Philip Kiely got tired of waiting 8-10 seconds for Stable Diffusion XL to generate images on an A100, so he set out to make it faster. 🏎 Using five different optimizations, he first made it 5x faster, bringing SDXL inference down to just 1.92 seconds 💪 (see how: https://lnkd.in/e2ABQxX8). Then, by adding TensorRT to the mix, Philip Kiely and Pankaj Gupta cut latency by another 40%! Take a look: https://lnkd.in/ePqpa6Hj 🏅 Optimizing model performance is one of our specialties. If you're looking to optimize your own models in production, give us a shout!
