Ever notice your standards for “good” vs. “bad” LLM outputs start to shift once you’ve seen more examples? That’s Criteria Drift—our evaluation rules evolve the moment unexpected outputs appear. It’s perfectly natural, but can quickly complicate consistency and alignment if we’re not prepared.

Why does this matter? 🤔
• Each new batch of outputs can reveal fresh failure modes, nudging us to redefine or add evaluation criteria.
• Without a solid process, you’ll constantly be playing catch-up, with unclear or ever-changing metrics.

The good news? 🙌 Our latest course breaks down strategies to manage Criteria Drift and keep your evaluations stable—so you always know what “good” looks like, no matter what your LLM throws at you.

Check it out below and safeguard your LLM evals from unplanned shifts!

🎓 LLM Apps: Evaluation Course is here: https://lnkd.in/gCHffA24
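One way to keep Criteria Drift visible instead of silent is to pin your evaluation rules to an explicit, versioned list of checks. Here is a minimal, hypothetical sketch (all names are illustrative, not code from the course): when a new batch of outputs reveals a fresh failure mode, you add a criterion and bump the version, rather than quietly shifting your judgment.

```python
# Hypothetical sketch: versioned pass/fail criteria make drift explicit.

def make_evaluator(criteria):
    """Return a pass/fail evaluator over a fixed list of (name, check) criteria."""
    def evaluate(output):
        failures = [name for name, check in criteria if not check(output)]
        return {"pass": not failures, "failed_criteria": failures}
    return evaluate

# v1: the criteria we started with.
criteria_v1 = [
    ("non_empty", lambda o: len(o.strip()) > 0),
    ("no_apology_loop", lambda o: o.lower().count("sorry") < 3),
]

# v2: a new batch of outputs revealed a fresh failure mode (leaked system
# prompts), so we add a criterion instead of silently moving the goalposts.
criteria_v2 = criteria_v1 + [
    ("no_prompt_leak", lambda o: "system prompt" not in o.lower()),
]

eval_v1 = make_evaluator(criteria_v1)
eval_v2 = make_evaluator(criteria_v2)

output = "Here is my system prompt: ..."
# Same output, different verdicts -- and the version history records why.
```

Because each verdict names the failed criteria, a changed verdict between versions points directly at the rule that changed, rather than at a reviewer's shifting intuition.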
Weights & Biases
Software Development
San Francisco, California 77,027 followers
The AI developer platform.
About us
Weights & Biases: the AI developer platform. Build better models faster, fine-tune LLMs, develop GenAI applications with confidence, all in one system of record developers are excited to use. W&B Models is the MLOps solution used by foundation model builders and enterprises who are training, fine-tuning, and deploying models into production. W&B Weave is the LLMOps solution for software developers who want a lightweight but powerful toolset to help them track and evaluate LLM applications. Weights & Biases is trusted by over 1,000 companies to productionize AI at scale, including teams at OpenAI, Meta, NVIDIA, Cohere, Toyota, Square, Salesforce, and Microsoft. Sign up for a 30-day free trial today at http://wandb.me/trial.
- Website
- https://wandb.ai/site
- Industry
- Software Development
- Company size
- 201-500 employees
- Headquarters
- San Francisco, California
- Type
- Privately Held
- Founded
- 2017
- Specialties
- deep learning, developer tools, machine learning, MLOps, GenAI, LLMOps, large language models, and llms
Products
Weights & Biases
Machine Learning Software
Weights & Biases helps AI developers build better models faster. Quickly track experiments, version and iterate on datasets, evaluate model performance, reproduce models, and manage your ML workflows end-to-end.
Locations
- Primary
400 Alabama St
San Francisco, California 94110, US
Updates
-
DeepSeek AI, Stargate and AI's $600 Billion Question with Sequoia Capital's David Cahn

In this episode of Gradient Dissent, our CEO and Co-founder Lukas Biewald sits down with David Cahn, partner at Sequoia Capital, for a compelling discussion on the dynamic world of AI investments. They dive into recent developments, including DeepSeek and Stargate, exploring their implications for the AI industry.

Drawing from his articles, "AI's $200 Billion Question" and "AI's $600 Billion Question," David unpacks the financial challenges and opportunities surrounding AI infrastructure spending and the staggering revenue required to sustain these investments. Together, they examine the competitive strategies of cloud providers, the transformative impact of AI on business models, and predictions for the next wave of AI-driven growth.

This episode offers an in-depth look at the crossroads of AI innovation and financial strategy.

🎙️ Tune in here: https://lnkd.in/gzDbupv3
-
🚀 Groundbreaking AI Research with Reinforcement Learning, powered by Weights & Biases

Incredible advancements are being made in AI reasoning and problem-solving, and we’re thrilled to share this exciting achievement by Jiayi Pan, Xingyao Wang & Lifan Yuan. Using reinforcement learning (RL), they reproduced DeepSeek R1-Zero—a method that enabled a 3B parameter language model to develop self-verification and search abilities autonomously. This was demonstrated in Countdown, a game where players combine numbers and arithmetic to reach a target number.

Key highlights from their findings:
🔸 The model starts with dummy outputs and, over time, learns revision, search, and self-verification tactics—critical reasoning behaviors.
🔸 These abilities scale with model size, emerging at 1.5B parameters and beyond.
🔸 RL algorithms like PPO, GRPO, and PRIME all work well, showing robustness in the approach.
🔸 The training process costs less than $30, making this a highly accessible method for furthering RL research.

📊 Why it matters: Their work sheds light on how reinforcement learning can unlock reasoning behaviors in language models without extensive instruction fine-tuning. These insights could transform the way we approach problem-solving tasks across industries.

✨ Powered by Weights & Biases: This research leveraged Weights & Biases to log, monitor, and analyze the experiment results, enabling transparency and collaboration. Their experiment logs and findings are publicly available on W&B for others to explore and build upon: wandb.ai/jiayipan/TinyZero

Congratulations to the team on this incredible achievement! We’re proud to support research that makes advanced AI methods accessible and drives innovation in the field.

See Jiayi's thread that has over 650k views right now on X here: https://lnkd.in/gYCxbw5n
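What makes Countdown attractive for RL is that correctness is machine-checkable, so a simple programmatic reward can drive training. The sketch below is our own illustration of that idea, not the TinyZero team's actual reward code: it gives a reward of 1.0 only if the model's arithmetic expression uses exactly the given numbers and evaluates to the target.

```python
# Hypothetical sketch of a Countdown-style verifiable reward (illustrative
# only; not the actual TinyZero implementation).
import ast

def countdown_reward(expression, numbers, target):
    """Return 1.0 if `expression` reaches `target` using each number in
    `numbers` exactly once with + - * /, else 0.0."""
    try:
        tree = ast.parse(expression, mode="eval")
    except SyntaxError:
        return 0.0
    used = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Constant):
            if not isinstance(node.value, (int, float)):
                return 0.0
            used.append(node.value)
        # Reject anything beyond plain arithmetic (no names, calls, etc.).
        elif not isinstance(node, (ast.Expression, ast.BinOp, ast.UnaryOp,
                                   ast.Add, ast.Sub, ast.Mult, ast.Div,
                                   ast.USub)):
            return 0.0
    if sorted(used) != sorted(numbers):  # each number used exactly once
        return 0.0
    try:
        value = eval(compile(tree, "<expr>", "eval"))
    except ZeroDivisionError:
        return 0.0
    return 1.0 if value == target else 0.0
```

Because the reward is computed from the answer itself rather than from human labels, the model can be trained purely with RL, which is the setting the post describes.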
-
We have enormous goals for 2025, and we want YOU to be a part of them!

Weights & Biases is on the lookout for passionate Software Engineers and dynamic Sales professionals to help us build the best tools for AI developers. With over 1,000 customers (including OpenAI, NVIDIA, Microsoft, and Toyota Motor Corporation) and over $250M in funding, we’re on a mission to revolutionize machine learning and empower teams building the future of AI.

Explore all our open roles here: https://lnkd.in/gdrG-ien

Join us as we take on the most consequential challenges in AI, together.
-
Why does Pass/Fail work so well for LLM evaluations? It forces clarity. No more guessing the difference between a 3 and a 4. With Pass/Fail, every judgment is immediately actionable, which makes your system easier to improve.

Our LLM Apps: Evaluation course dives deeper into this framework, helping you create better systems and more effective GenAI apps.

📚: https://lnkd.in/gCHffA24
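To make the Pass/Fail idea concrete, here is a minimal sketch (names and the example check are ours, not the course's code): each example gets a binary verdict from an explicit rule, and the aggregate pass rate is directly actionable because every failure points at a specific example.

```python
# Minimal sketch of a pass/fail evaluation loop (illustrative names).

def pass_fail_eval(examples, check):
    """Score each (input, output) pair as pass (True) or fail (False)."""
    verdicts = [check(inp, out) for inp, out in examples]
    return {"pass_rate": sum(verdicts) / len(verdicts), "verdicts": verdicts}

# Example check for a summarizer: the output must be non-empty and shorter
# than the input -- a crisp binary rule instead of a fuzzy 1-5 rating.
examples = [
    ("a long article about model evaluation and criteria", "article on evals"),
    ("short note", ""),
]
result = pass_fail_eval(examples, lambda inp, out: 0 < len(out) < len(inp))
```

Here the second example fails, and the verdict list says exactly which one, so the next iteration of the app has a concrete target instead of an ambiguous score.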
-
Why does the order of words matter for LLMs? Two words: Position Bias.

LLMs rely on positional embeddings to determine “who did what to whom.” Without this positional context, words lose their relationships, making it nearly impossible to capture true meaning.

If you’re ready to dive deeper into these concepts—and more—check out our new, free, on-demand course: LLM Apps: Evaluation. In just 2 hours, you’ll learn how to:
- Build an evaluation pipeline for LLM applications.
- Leverage LLMs as evaluators to assess outputs programmatically.
- Minimize human input by aligning auto-evaluations with best practices.

By the end of the course, you’ll have hands-on experience, practical implementation methods, and a clear understanding of how to effectively evaluate and improve your GenAI apps.

Meet your expert instructors:
- Ayush Thakur – AI Engineer at Weights & Biases
- Anish Shah – AI Engineer at Weights & Biases
- Paige Bailey – AI Developer Relations Lead at Google
- Graham Neubig – Co-Founder at All Hands AI

Join us and take the next step in advancing your LLM expertise—one (positional) token at a time!

📚: https://lnkd.in/gCHffA24
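A tiny sketch of why positional embeddings carry "who did what to whom": below we add the classic sinusoidal positional encoding from the original Transformer to toy token embeddings (real LLMs often use learned or rotary variants, and the vectors here are made-up illustrations). Without the positional term, "dog bites man" and "man bites dog" would produce the same set of vectors, just reordered; with it, the same word gets a different representation at a different position.

```python
import math

def sinusoidal_position(pos, d_model):
    """Sinusoidal positional encoding for one position (Transformer-style)."""
    return [
        math.sin(pos / 10000 ** (i / d_model)) if i % 2 == 0
        else math.cos(pos / 10000 ** ((i - 1) / d_model))
        for i in range(d_model)
    ]

def embed(tokens, table, d_model=4):
    """Token embedding + positional encoding, summed elementwise."""
    return [
        [t + p for t, p in zip(table[tok], sinusoidal_position(pos, d_model))]
        for pos, tok in enumerate(tokens)
    ]

# Toy 4-dimensional embeddings, purely for illustration.
table = {"dog": [0.1] * 4, "bites": [0.2] * 4, "man": [0.3] * 4}
a = embed(["dog", "bites", "man"], table)
b = embed(["man", "bites", "dog"], table)
# "dog" at position 0 (in a) now differs from "dog" at position 2 (in b),
# so the model can distinguish the two sentences.
```

This is exactly the positional context the post describes: strip the `sinusoidal_position` term and the two sentences become indistinguishable as bags of vectors.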
-
Join us in Paris, France on January 22 for an exclusive panel discussion featuring industry leaders at the cutting edge of generative AI—Mistral AI, Thales, and NVIDIA. Discover how to harness GenAI to fuel innovation, enhance customer experiences, and accelerate growth.

What to expect
• Expert insights: Real-world applications and transformative advancements in generative AI from Mistral AI, Thales, and NVIDIA.
• Actionable strategies: Practical guidance on adopting AI to drive business results.
• Interactive discussions: Dive into ethical, regulatory, and technical considerations, and get your questions answered during the Q&A.

Featured speakers
• Adrien Bécue – AI & Cybersecurity Expert, Thales
• Richard Wright – EMEA DGX AI Platform Segment Sales Lead, NVIDIA
• Sophia Yang, Ph.D. – Head of Developer Relations, Mistral AI

Register here to secure your seat: https://lnkd.in/gNKxnz29
-
🚀 How do you build an autonomous programming agent that dominates SWE-bench Verified?

Our Co-Founder and CTO, Shawn Lewis, tackled this challenge and delivered an o1-based AI agent that now holds the new state-of-the-art, solving 64.6% of issues on SWE-bench Verified!

SWE-bench is the ultimate benchmark for autonomous programming agents. It evaluates an agent’s ability to autonomously read, write, test, and iterate on code in a real-world, GitHub-issue-like environment.

So, how did he achieve this? By combining OpenAI’s powerful o1 model, our W&B Weave toolkit, and relentless experimentation, including 977 logged evaluations. The result? Precise debugging, streamlined iteration, and groundbreaking results on SWE-bench.

This achievement reaffirms what we stand for at Weights & Biases: the belief that the BEST tools unlock the BEST results.

For a detailed breakdown of Shawn’s process, check out his blog post here: https://lnkd.in/gsiRjg8e
-
🛠️ New Tutorial: Weights & Biases Models + Weave Integration

The combination of W&B Models and Weave simplifies:
• LLM fine-tuning and tracking.
• RAG chatbot integration.
• Comprehensive evaluations, including metrics like accuracy, latency, and cost.

Want to see it in action?
Full Tutorial: https://lnkd.in/g725SWpq
Colab Demo: https://lnkd.in/g8Fzy_-J
Public Workspace: https://lnkd.in/gdyENh6Q
-
🛠️ Ready to build a flawless RAG system? Join Pinecone & the W&B team on 1/22 in NYC 🗽 for a hands-on session on designing, evaluating, and optimizing Retrieval-Augmented Generation workflows.

This workshop will cover:
1️⃣ Structuring RAG systems for balanced retrieval + generation.
2️⃣ Evaluating performance to identify improvements.
3️⃣ Advanced tools like Pinecone & W&B Weave for optimization.

Details:
📍 1375 Broadway, NYC
🗓️ 1/22
⏰ 6-9 PM EST

RSVP here: https://lnkd.in/gzVsSEus