Fireworks AI

Software Development

Redwood City, CA 15,003 followers

Generative AI platform empowering developers and businesses to scale at high speeds

About us

Fireworks.ai offers a generative AI platform as a service. We optimize for rapid product iteration on top of generative AI while minimizing cost to serve. https://fireworks.ai/careers

Website
http://fireworks.ai
Industry
Software Development
Company size
11-50 employees
Headquarters
Redwood City, CA
Type
Privately Held
Founded
2022
Specialties
LLMs and Generative AI


Updates

  • Fireworks AI reposted this

    Lin Qiao, CEO and cofounder of Fireworks AI:

    🔥 Announcing FireOptimizer/Multi-LoRA 🔥 I didn't expect that what I considered a small feature when we launched it last year would deliver such a powerful impact for our customers. I'm excited to announce Multi-LoRA, an important component of FireOptimizer. Personalized experiences are critical to driving greater usage, retention, and customer satisfaction for your product. Without Multi-LoRA, deploying hundreds of fine-tuned models on separate GPUs would be prohibitively expensive. With Multi-LoRA, you can now deliver personalized experiences across thousands of users and use cases without scaling your costs! More specifically, Multi-LoRA delivers the following benefits:
    -- Fine-tune and serve hundreds of personalized LoRA models at the same cost as a single base model, which is just $0.2/1M tokens for Llama3.1 8B
    -- 100x cost-efficiency compared to serving 100 fine-tuned models without Multi-LoRA on other platforms with per-GPU pricing
    -- Convenient deployment on Fireworks Serverless with per-token pricing and competitive inference speeds, or on Fireworks On-Demand and Reserved for larger workloads
    Multi-LoRA is part of FireOptimizer, our adaptation engine designed to customize and enhance AI model performance for your unique use cases and workloads. FireOptimizer capabilities include Adaptive Speculative Execution (https://lnkd.in/ejdD-wGG), which enables up to 3x latency improvements; Customizable Quantization (https://lnkd.in/dwpTU233), to precisely balance speed and quality; and LoRA Fine-Tuning (https://lnkd.in/et2UFzDy), to customize and improve model performance.
    ⚡ Cresta uses Multi-LoRA to personalize its Knowledge Assist feature for each individual customer on the Fireworks enterprise platform. "Fireworks' Multi-LoRA capabilities align with Cresta's strategy to deploy custom AI through fine-tuning cutting-edge base models. It helps unleash the potential of AI on private enterprise data." - Tim Shi, Co-Founder and CTO of Cresta
    ⚡ Brainiac Labs helps businesses leverage their proprietary data to fine-tune and deploy models using Multi-LoRA on the Fireworks self-serve platform. "Using Fireworks, clients with limited AI expertise can successfully maintain and improve the solutions I provide. Additionally, students in my course are able to complete real-world fine-tuning projects, dedicating just a few hours per week to the process." - Scott Kramer, CEO of Brainiac Labs
    👉 Read more in our blog post: https://lnkd.in/d3_HGRqy
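    The cost model above (hundreds of adapters served at base-model per-token pricing) shows up in practice as little more than a model-id switch per request. A minimal Python sketch, assuming Fireworks' OpenAI-compatible chat-completions endpoint; the account and adapter names below are placeholders, not real deployments:

    ```python
    import json

    # Fireworks exposes an OpenAI-compatible chat-completions API; with Multi-LoRA,
    # each request simply names its adapter while all adapters share one base model.
    FIREWORKS_CHAT_URL = "https://api.fireworks.ai/inference/v1/chat/completions"

    def lora_request(adapter_model_id: str, user_message: str) -> dict:
        """Build a chat-completions payload targeting one LoRA adapter."""
        return {
            "model": adapter_model_id,  # placeholder id, e.g. per-customer adapter
            "messages": [{"role": "user", "content": user_message}],
            "max_tokens": 256,
        }

    # Per-customer personalization: one adapter per customer, same serving cost.
    payloads = [
        lora_request(f"accounts/my-team/models/customer-{i}-lora", "Summarize my ticket")
        for i in range(3)
    ]
    print(json.dumps(payloads[0], indent=2))
    ```

    Each payload would be POSTed to `FIREWORKS_CHAT_URL` with an API key; only the `model` field varies across users.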

  • View organization page for Fireworks AI

    We’re beyond thrilled to share that Fireworks AI has been named #10 on Fast Company’s list of the world’s most innovative AI companies! This recognition highlights our core mission: empowering developers to easily build GenAI applications on state-of-the-art open models. We are honored to be mentioned alongside industry leaders and our partners NVIDIA, Amazon, Google, and Mistral AI. Join our exceptional team of innovators. Find your perfect role on our careers page and become part of our rapidly growing success story 🚀 https://lnkd.in/geVW6EFk

  • Fireworks AI reposted this

    Lin Qiao, CEO and cofounder of Fireworks AI:

    🔥 Fireworks AI matches DeepSeek AI pricing 🔥 After significant performance optimization over the past two months, we are excited to pass our efficiency improvements back to our users. We have launched two additional DeepSeek R1 tiers alongside the current one:
    1. Base: matches the price of the original DeepSeek API on the self-serve platform.
    2. Ultra-fast: optimized through FireOptimizer for specific workloads to maximize speed. Enterprise only.
    Since R1 is an important logical-reasoning model, we have been continuously adding features to make development easy on the Fireworks AI developer cloud: https://lnkd.in/eX5cy7D6 Tell us what other features are on your mind. We will build what you need.

    Lin Qiao, CEO and cofounder of Fireworks AI:

    🔥 Fireworks AI Developer Cloud for DeepSeek AI models 🐳 Fireworks' mission is to provide the best developer toolchain built on open models, for transparency, steerability, control, privacy, low latency, and low cost. Within one month, Fireworks launched a comprehensive AI developer cloud for DeepSeek models. Here is a list of launched product features:
    👉 Launched DeepSeek models on the same day as the open-weights release!
    👉 Latency and cost optimization:
    🪄 Fireworks continuously pushes top performance and cost efficiency with a special version of FireAttention and a distributed inference engine optimized for DeepSeek's unique MLA, MTP, and wide MoE architecture.
    🪄 Controllable reasoning effort: we added shorter and better CoT for R1 via reasoning_effort = low
    👉 Agentic development:
    🪄 Agentic multi-modal workflow: we added vision capability to DeepSeek V3 and R1
    🪄 Agentic tool use: we added function calling to DeepSeek V3 so it can integrate easily with other tools and APIs for agent development
    🪄 Constrained generation: we added JSON mode and Grammar mode to DeepSeek V3 and R1
    👉 Model quality enhancement:
    🪄 Additional DeepSeek derivative models: launched Perplexity R1-1776, with higher accuracy for deep research, and many tuned DeepSeek models in production
    👉 Research reproduction:
    🪄 Reinforcement learning with verifiable rewards using minimal labels
    🪄 Distillation: R1 doing better than human
    We have many more features to launch soon, including a DeepSeek SFT and RL tuning platform as part of the FireOptimizer [https://lnkd.in/ejdD-wGG] stack. We will release many real-world demos to get you up to speed on the DeepSeek developer platform. Stay tuned! Please comment below with your wish list. I would love to hear from you.
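    Two of the features above, controllable reasoning effort and JSON-mode constrained generation, can be sketched as a single request payload. This is a minimal sketch assuming the OpenAI-compatible convention Fireworks follows; the exact spellings of `reasoning_effort` and `response_format` are taken from the post and should be verified against the Fireworks docs:

    ```python
    import json

    # Illustrative chat-completions payload combining the post's R1 features:
    # shorter chain-of-thought (reasoning_effort) plus JSON-constrained output.
    payload = {
        "model": "accounts/fireworks/models/deepseek-r1",  # model id form is an assumption
        "messages": [
            {"role": "user", "content": "Extract the invoice total as JSON."}
        ],
        # Controllable reasoning effort: shorter, tighter CoT for R1.
        "reasoning_effort": "low",
        # Constrained generation (JSON mode): force well-formed JSON output.
        "response_format": {"type": "json_object"},
    }
    print(json.dumps(payload, indent=2))
    ```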

  • View organization page for Fireworks AI

    Fireworks AI matches DeepSeek pricing for R1, with secure deployments in the EU and US. Excited to share the latest enhancements to our DeepSeek R1 offerings:
    💡 Base DeepSeek R1: Cost-effective, high-quality throughput for real-time applications (endpoint: deepseek-r1-basic)
    🚀 Ultra-Fast DeepSeek R1: Up to 130 tokens/sec for lightning-fast interactions on Fireworks Enterprise.
    ⚡ Fast DeepSeek R1: Balanced performance at 90 tokens/sec, optimized for interactive applications on Fireworks Serverless (endpoint: deepseek-r1)
    With specialized versions of FireAttention and tailored distributed inference, we're pushing the envelope for speed, efficiency, and cost-effectiveness in agentic products. More innovations coming soon with Blackwell GPUs! Explore our optimized DeepSeek deployments here: https://lnkd.in/edJh9MXb
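    Switching tiers is a matter of which endpoint id a request names. A small sketch using the endpoint ids from the post; the `accounts/fireworks/models/` prefix is the usual Fireworks model-id form and is an assumption to check against the docs (the ultra-fast tier is an Enterprise deployment rather than a public serverless endpoint, so it is omitted):

    ```python
    # Map a desired tier to the R1 endpoint id named in the announcement.
    def r1_model_id(tier: str) -> str:
        endpoints = {
            "basic": "deepseek-r1-basic",  # cost-effective throughput tier
            "fast": "deepseek-r1",         # ~90 tokens/sec on Fireworks Serverless
        }
        return f"accounts/fireworks/models/{endpoints[tier]}"

    print(r1_model_id("basic"))
    print(r1_model_id("fast"))
    ```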

  • Fireworks AI reposted this

    We're thrilled to announce a groundbreaking integration: Fireworks AI now seamlessly integrates with NVIDIA NIM microservices, powered by NVIDIA AI Enterprise. This means enterprises can rapidly deploy advanced AI models, accelerating innovation and driving competitive advantage like never before. Here's why this is game-changing:
    → Unmatched performance: Supercharge your AI capabilities with industry-leading open-source models like DeepSeek and Llama.
    → Expanded possibilities: Instantly access NVIDIA NIM's extensive AI offerings, including embeddings, video processing, and 3D modeling.
    → Effortless integration: Use powerful NVIDIA Llama Nemotron Reasoning models within Fireworks AI.
    Learn more about how Fireworks AI supports NVIDIA NIM deployments for blazing AI inference: https://lnkd.in/dtndSbhv


  • View organization page for Fireworks AI

    🚀 Announcing DeepSeek R1 & V3 Fine-Tuning on Fireworks AI. Fine-tuning state-of-the-art open models has never been easier. With DeepSeek R1 & V3 fine-tuning now available on Fireworks, you can tailor model behavior to your specific use case, with a seamless path to dedicated deployment. Key benefits of DeepSeek fine-tuning on Fireworks:
    ✅ Quantization-Aware Tuning (QAT): Ensures high accuracy, efficiency, and training speed.
    ✅ Seamless Model Alignment: QAT minimizes discrepancies between training and deployment performance.
    ✅ Optimized for Large-Scale Models: Efficiently manages memory and complexity in Mixture of Experts architectures.
    ✅ Effortless Deployment: Fine-tuned models require dedicated deployments, fully supported on Fireworks.
    👉 With just three lines of code, you can fine-tune and deploy your model with ease. Check out this blog to read more: https://lnkd.in/dfKmRtWq
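    As a rough illustration of what such a fine-tuning job specifies, here is a hypothetical job spec capturing the points above (base model, dataset, LoRA method, QAT). Every field name is illustrative rather than the actual Fireworks fine-tuning API; consult the linked blog for the real three lines:

    ```python
    import json

    # Hypothetical supervised fine-tuning job spec -- field names are illustrative only.
    job_spec = {
        "base_model": "accounts/fireworks/models/deepseek-v3",  # placeholder model id
        "dataset": "my-support-transcripts",       # placeholder dataset id
        "method": "lora",                          # LoRA adapters keep serving cheap
        "quantization_aware_training": True,       # QAT: train in deployed precision
    }
    print(json.dumps(job_spec, indent=2))
    ```

    QAT is the key alignment step: by training in the same precision the model is later served at, the tuned weights see no quality cliff at deployment time.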

  • A smart reasoning LLM is good, but a smart reasoning VLM is better! We're thrilled to share a demo showcasing an "AI Research Assistant" built with DeepSeek R1, which can now reason over both text and images thanks to Fireworks AI's new Document Inlining feature! Check out the demo below and experience the future of multimodal AI reasoning! 👇 Read the detailed blog here: https://lnkd.in/dZbi9ETD Get started building your use cases with the DeepSeek R1 model on Fireworks AI: https://lnkd.in/g9Xt4grp
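    Document Inlining, as described in the Fireworks blog, works by appending `#transform=inline` to a document URL inside a standard multimodal message: the platform transcribes the image or PDF into text that a text-only reasoning model like R1 can consume. A sketch of such a request payload, with placeholder URL and an assumed model-id form:

    ```python
    import json

    # Multimodal R1 request using Document Inlining: the "#transform=inline"
    # fragment on the document URL asks Fireworks to inline the document as text.
    payload = {
        "model": "accounts/fireworks/models/deepseek-r1",  # assumed model id form
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": "Summarize the key findings in this paper."},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/paper.pdf#transform=inline"}},
            ],
        }],
    }
    print(json.dumps(payload, indent=2))
    ```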

  • Building an AI agent with reasoning capability is as simple as 👇 We are excited to share this demo built by Shane Thomas using Mastra and Fireworks AI with the DeepSeek R1 model. Mastra is a TypeScript agentic AI framework that lets you build intelligent agents with persistent memory, robust state management, contextual data integration, and transparent tracking. Check out the project here: https://lnkd.in/dvfk-yqe
