Together AI

Software Development

San Francisco, California 34,034 followers

The future of AI is open-source. Let's build together.

About us

Together AI is a research-driven artificial intelligence company. We contribute leading open-source research, models, and datasets to advance the frontier of AI. Our decentralized cloud services empower developers and researchers at organizations of all sizes to train, fine-tune, and deploy generative AI models. We believe open and transparent AI systems will drive innovation and create the best outcomes for society.

Website
https://together.ai
Industry
Software Development
Company size
51-200 employees
Headquarters
San Francisco, California
Type
Privately Held
Founded
2022
Specialties
Artificial Intelligence, Cloud Computing, LLM, Open Source, and Decentralized Computing

Locations

  • Primary

    251 Rhode Island St

    Suite 205

    San Francisco, California 94103, US

Updates


    🚀 Announcing the launch of Llama 3.2 and Llama Stack on Together AI, in partnership with AI at Meta. 🎉 We are excited to offer free access to the Llama 3.2 vision model for developers to build and innovate with open source AI. Start building with the Llama-Vision-Free model today: 👉 https://lnkd.in/gWxwdaVd

    ▶ What we are launching:
    - Free Llama 3.2 Vision Model (11B): Develop and experiment with our high-quality Llama-Vision-Free endpoint for multimodal tasks.
    - Together Turbo Inference Endpoints (11B, 90B): High performance and accuracy for tasks like image captioning, visual question answering, and image-text retrieval.
    - New Llama Stack APIs: Standardized APIs to simplify building agentic and retrieval-augmented generation (RAG) conversational apps.

    ▶ Unlock powerful use cases:
    - Interactive Agents: Build AI agents that process both image and text inputs.
    - Image Captioning: Create high-quality image descriptions for e-commerce and digital accessibility.
    - Visual Search: Enable users to search via images, enhancing search efficiency for retail and e-commerce.

    ▶ Industry applications:
    - Healthcare: Accelerate medical image analysis for faster diagnostics.
    - Retail & E-commerce: Revolutionize shopping with image-based search and personalized recommendations.
    - Finance & Legal: Streamline workflows by analyzing visual and textual content to optimize contract reviews and audits.

    💡 Check out napkins.dev: Our open-source demo app uses Llama 3.2 vision to transform sketches and wireframes into React code! Try it out at https://napkins.dev

    🔧 Get started today: Experiment with the Llama-Vision-Free endpoint, or build for production with Llama 3.2 Together Turbo endpoints.

    🌟 Read more on the blog: https://lnkd.in/gDfmXxzu
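As a rough illustration of what calling the free vision endpoint might look like, here is a sketch that builds an OpenAI-compatible multimodal chat payload. The endpoint URL and the exact `meta-llama/Llama-Vision-Free` model string are assumptions based on this post, not verified against the current API docs; check the documentation before relying on them.

```python
import json

# Hypothetical: Together's OpenAI-compatible chat completions endpoint.
API_URL = "https://api.together.xyz/v1/chat/completions"


def build_vision_request(image_url: str, question: str,
                         model: str = "meta-llama/Llama-Vision-Free") -> dict:
    """Build a multimodal chat payload: one text part plus one image part."""
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }],
    }


payload = build_vision_request("https://example.com/chart.png",
                               "What does this chart show?")
print(json.dumps(payload, indent=2))
# To actually call the API you would POST this payload to API_URL
# with an Authorization: Bearer <TOGETHER_API_KEY> header.
```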


    We’re excited to power the AI infrastructure of leading enterprises like Salesforce, The Washington Post, and Zoom. With the Together Enterprise Platform, you can securely run inference, fine-tuning, and training on your own models, in your own environment, while unlocking 2-3x faster inference speeds and reducing operational costs by up to 50%. Learn more about our enterprise offerings at our new page: together.ai/enterprise


    Announcing Together Cookbooks! 👨🍳 A collection of hands-on notebooks showcasing powerful use cases of open-source models with Together AI, including Text RAG, Multimodal Document RAG, Semantic Search, Rerankers, and Structured JSON extraction. Here's a glimpse at what's inside:

    🖼 Multimodal Document RAG with Nvidia Investor Slide Deck: Implement multimodal RAG with ColQwen2 and Llama 3.2 90B Vision, combining text and images for advanced retrieval.
    📊 Embedding Visualization: Visualize vector embeddings to explore structure in high-dimensional spaces.
    🌐 Knowledge Graphs with Structured Outputs: Generate knowledge graphs from LLMs using structured JSON generation.
    🧠 Semantic Search: Boost search precision with BERT-based embedding models for better retrieval.
    🔎 Improving Search with Rerankers: Refine search results with rerankers to enhance relevance across large document corpora.
    📰 Structured Text Extraction from Images: Extract structured text from images, ideal for document digitization and workflow automation.
    📝 Text RAG: Implement text-based Retrieval-Augmented Generation to enrich responses with relevant knowledge.

    🔗 Explore the cookbooks here: https://lnkd.in/g5E4VYqd
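The semantic-search recipe above boils down to ranking documents by embedding similarity. A minimal, self-contained sketch with made-up 3-dimensional "embeddings" (a real notebook would fetch vectors from an embedding model via the API):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Toy document "embeddings" (made up for illustration).
docs = {
    "gpu pricing":   [0.9, 0.1, 0.0],
    "rag tutorial":  [0.1, 0.8, 0.3],
    "vision models": [0.0, 0.3, 0.9],
}
query = [0.2, 0.9, 0.2]  # pretend embedding of "how do I build RAG?"

# Rank documents by similarity to the query embedding.
ranked = sorted(docs, key=lambda d: cosine(query, docs[d]), reverse=True)
print(ranked[0])
```

A reranker would then re-score this shortlist with a stronger (query, document) cross-encoder, as in the rerankers cookbook.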


    Our CEO, Vipul Ved Prakash, joined Clara Shih, CEO of Salesforce AI, on the Ask More of AI podcast to dive into the future of generative AI and how Together AI is leading the charge. In this episode, they break down how Together AI is optimizing AI workloads to make models faster, smarter, and more efficient for real-world applications. Key innovations like FlashAttention, speculative decoding, and model quantization are highlighted, showing how we’re transforming AI workloads for greater speed, scalability, and impact. It’s an exciting look at how we’re helping businesses bring AI from pilot to production at scale. 🎧 Tune in to the conversation here: https://lnkd.in/g8HU3P4B

    Clara Shih

    CEO of Salesforce AI | Founder & Board Chair of Hearsay Systems | TIME 100 AI | WEF YGL

    As generative AI shifts from pilot to production, efficiency, cost, and scalability matter a lot more. Founded 2 years ago as "AWS for Generative AI," Together AI has raised $240M to provide cloud compute optimized for AI workloads. In this week's episode of my #AskMoreOfAI podcast, CEO/founder Vipul Ved Prakash talks about innovations to make models faster and smarter, including:

    🔹 FlashAttention: GPU-aware techniques that reduce the memory needed to compute attention and rearrange calculations to speed up inference.
    🔹 Speculative decoding: Speeds up inference by predicting multiple tokens in advance instead of one at a time, then selecting the best ones and pruning the rest.
    🔹 Model quantization: Reduces model size and speeds up inference by lowering the precision of the numerical representations used in the model without significantly degrading performance. In most LLMs, parameters are stored as 32-bit floating-point numbers, which consume a lot of memory and processing power. Quantization converts these to lower-precision formats, e.g. 16-bit floats or even 8-bit integers.
    🔹 Mixture of Agents: Combines multiple specialized models (agents) that work together, with each agent handling a different aspect of a problem, such as a sales agent, sales manager agent, deal desk agent, and legal contracts agent collaborating.

    Vipul predicts that cloud compute for #GenAI will surpass the traditional hyperscaler business within 2-3 years. Salesforce Ventures is proud to have led the Series A earlier this year, and customers running models on Together can BYOM with Einstein Model Builder. 🎧 Listen or watch here! https://lnkd.in/g6XX4KCR

    • Ask More of AI with Clara Shih, podcast episode featuring Vipul Ved Prakash, CEO and Co-founder of Together AI
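The quantization idea described in the episode can be sketched in a few lines: store weights at low precision with a shared scale, then dequantize on use. This toy symmetric int8 scheme is illustrative only; production quantizers add per-channel scales, zero points, and calibration.

```python
# Toy symmetric int8 quantization of a small weight list (made-up values).
weights = [0.12, -0.5, 0.33, 0.97, -0.81, 0.05]

# One shared scale maps the largest magnitude onto the int8 range.
scale = max(abs(w) for w in weights) / 127
q = [round(w / scale) for w in weights]   # stored as small integers
deq = [qi * scale for qi in q]            # dequantized on use

# Rounding bounds the per-weight error by half the quantization step.
max_err = max(abs(w - d) for w, d in zip(weights, deq))
print(f"max reconstruction error: {max_err:.5f} (step/2 = {scale / 2:.5f})")
```

The memory story is the point: 8-bit integers take a quarter of the space of 32-bit floats, at the cost of a bounded rounding error per weight.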

    📬 Never miss a beat! Today we are introducing "Together We Build", a newsletter with a handpicked selection of news, product launches, novel research, and AI tools from Together AI. Subscribe to keep up with the latest developments in generative AI and LLMs! And don't miss our first issue 👇

    Latest Updates: FREE Llama 3.2 Multimodal & FLUX.1 [schnell], NVIDIA H200s, and Enterprise Platform


    New work on linearizing LLMs! Like subquadratic capabilities? Like modern 7B+ LLMs? But don't have the budget to pre-train billions of parameters on trillions of tokens to get subquadratic, 7B+ LLMs? Then check out LoLCATs, our new work led by Michael Zhang that converts existing Transformers like Llama and Mistral into state-of-the-art subquadratic variants, now for the same cost as a LoRA finetune!

    LoLCATs builds on a simple framework to convert Transformers into subquadratic models:
    1. Swap an LLM's softmax attentions for more efficient alternatives.
    2. Fine-tune the LLM to adapt to these layers and recover pre-trained quality.

    However, to improve linearized LLM quality while drastically reducing the cost of this recovery, we build LoLCATs around two simple findings. First, we can learn to approximate softmax attentions with existing linear attentions. This lets us replace softmax attentions with near-literal drop-in replacements that are still subquadratic to compute. Next, this makes parameter-efficient fine-tuning like LoRA sufficient to adjust for any approximation errors and rapidly recover LM quality.

    The results speak for themselves. LoLCATs-linearized Llama 3 8Bs and Mistral 7Bs significantly outperform both prior linearized LLMs and strong subquadratic LLMs, while training only 0.2% of their parameters on 0.003-0.02% of their training tokens.

    And we did one last thing! Mostly just because we could, we used LoLCATs to linearize the entire Llama 3.1 family, delivering the first linearized 70B and 405B LLMs while significantly improving over baseline quality.

    Learn more on our blog: https://lnkd.in/gQFeZxMY
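The "swap softmax attention for a linear alternative" step rests on a regrouping identity: with a positive feature map phi, (phi(Q) phi(K)^T) V can be computed as phi(Q) (phi(K)^T V), which never materializes the n x n score matrix and is therefore linear in sequence length. A toy sketch of that identity (normalization omitted, feature map made up; LoLCATs actually *learns* the map to match softmax attention):

```python
def matmul(A, B):
    """Dense matrix multiply on lists of lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(M):
    return [list(col) for col in zip(*M)]

def phi(M):
    # Made-up elementwise positive feature map (relu + 1), for illustration.
    return [[max(x, 0.0) + 1.0 for x in row] for row in M]

# Tiny single-head example: n=3 tokens, head dimension d=2.
Q = [[0.5, 1.0], [1.0, -0.5], [0.2, 0.3]]
K = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
V = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

# Quadratic grouping: builds the n x n score matrix first.
quadratic = matmul(matmul(phi(Q), transpose(phi(K))), V)
# Linear grouping: contracts keys with values first (a d x d matrix).
linear = matmul(phi(Q), matmul(transpose(phi(K)), V))

# Same output either way; only the computational cost differs.
assert all(abs(a - b) < 1e-9
           for ra, rb in zip(quadratic, linear)
           for a, b in zip(ra, rb))
```

Because the per-token state is a fixed d x d matrix rather than a growing score row, cost scales linearly in n, which is the "subquadratic" claim in the post.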


    In this event we'll discuss how you can perform RAG over complex PDF documents that contain images, graphs, tables, text, charts, and more! We'll describe in detail:
    - How the new image retriever ColPali works
    - How you can fine-tune ColPali to further improve it for your use case
    - How to leverage multi-vector retrieval to retrieve from PDFs
    - How to use vision language models like the new Llama 3.2 Vision series to perform document RAG

    How to Build Multimodal Document RAG with Llama 3.2 Vision and ColQwen2
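Multi-vector retrievers like ColPali score a page by late interaction: each query-token embedding takes its best dot product over the page's patch embeddings, and those per-token maxima are summed (often called MaxSim). A toy sketch with made-up 2-d vectors:

```python
def maxsim(query_vecs, page_vecs):
    """Late-interaction score: sum over query tokens of the best
    dot product against any page-patch embedding."""
    return sum(
        max(sum(qi * pi for qi, pi in zip(q, p)) for p in page_vecs)
        for q in query_vecs
    )

# Two query-token embeddings and two candidate pages (made-up 2-d vectors;
# real ColPali embeddings are one vector per query token / page patch).
query  = [[1.0, 0.0], [0.0, 1.0]]
page_a = [[0.9, 0.1], [0.2, 0.8]]  # patches that match both query tokens
page_b = [[0.1, 0.1], [0.2, 0.1]]  # mostly irrelevant patches

scores = {"page_a": maxsim(query, page_a), "page_b": maxsim(query, page_b)}
best = max(scores, key=scores.get)
print(best, scores)
```

The retrieved page images are then handed to a vision model such as Llama 3.2 Vision to answer the question, which is the pipeline the event walks through.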


    Congratulations to Braintrust on their Series A! 🎉 We've been partnering with them to showcase the performance of our models in real-world tests. Check out this Braintrust recipe showing that Llama 3.2 vision models running on Together AI are 3x faster with the same accuracy as GPT-4o-mini and GPT-4o. Read here: https://lnkd.in/gq-jDNWq

    Braintrust

    We’re thrilled to announce that we've raised a $36M Series A led by Martin Casado at Andreessen Horowitz to advance the future of AI software engineering, bringing our total funding to $45 million. Through our work with top AI engineering and product teams from Notion, Stripe, Vercel, Airtable, Instacart, Zapier, Coda, The Browser Company, and many others, we’ve had a front-row seat to what it takes to build world-class AI products. Along the way, we’ve learned a few key lessons:
    - Crafting effective prompts requires active iteration.
    - Evaluations are crucial for systematically improving quality over time.
    - Production logs provide a vital feedback loop, generating new data points that drive better evaluations.

    Evals are just the first step to building AI apps. That’s why we’re also excited to introduce functions, the flexible primitive for creating prompts, tools, and scorers that sync between your codebase and the Braintrust UI.
