🚀 Announcing the launch of Llama 3.2 and Llama Stack on Together AI, in partnership with AI at Meta. 🎉 We are excited to offer free access to the Llama 3.2 vision model for developers to build and innovate with open-source AI. Start building with the Llama-Vision-Free model today: 👉 https://lnkd.in/gWxwdaVd
▶ What we are launching:
- Free Llama 3.2 Vision Model (11B): Develop and experiment with our high-quality Llama-Vision-Free endpoint for multimodal tasks.
- Together Turbo Inference Endpoints (11B, 90B): High performance and accuracy for tasks like image captioning, visual question answering, and image-text retrieval.
- New Llama Stack APIs: Standardized APIs to simplify building agentic and retrieval-augmented generation (RAG) conversational apps.
▶ Unlock powerful use cases:
- Interactive Agents: Build AI agents that process both image and text inputs
- Image Captioning: Create high-quality image descriptions for e-commerce and digital accessibility
- Visual Search: Enable users to search via images, enhancing search efficiency for retail and e-commerce
▶ Industry applications:
- Healthcare: Accelerate medical image analysis for faster diagnostics
- Retail & E-commerce: Revolutionize shopping with image-based search and personalized recommendations
- Finance & Legal: Streamline workflows by analyzing visual and textual content to optimize contract reviews and audits
💡 Check out napkins.dev: Our open-source demo app uses Llama 3.2 vision to transform sketches and wireframes into React code! Try it out at https://napkins.dev
🔧 Get started today: Experiment with the Llama-Vision-Free endpoint, or build for production with Llama 3.2 Together Turbo endpoints.
🌟 Read more in the blog: https://lnkd.in/gDfmXxzu
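A hedged sketch of what a multimodal request to the free vision endpoint could look like, assuming an OpenAI-style chat-completions format (the model name comes from the post; the field layout is a common API convention, not confirmed here — check the docs before use):

```python
import json

# Build an OpenAI-style multimodal chat payload: one user message that
# carries both an image URL and a text question.
def build_vision_request(image_url, question,
                         model="meta-llama/Llama-Vision-Free"):
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": image_url}},
                {"type": "text", "text": question},
            ],
        }],
    }

payload = build_vision_request(
    "https://example.com/product.jpg",
    "Write a short alt-text description of this image.",
)
# This dict would be POSTed to the chat completions endpoint.
print(json.dumps(payload, indent=2))
```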
Together AI
Software Development
San Francisco, California 34,203 followers
The future of AI is open-source. Let's build together.
About us
Together AI is a research-driven artificial intelligence company. We contribute leading open-source research, models, and datasets to advance the frontier of AI. Our decentralized cloud services empower developers and researchers at organizations of all sizes to train, fine-tune, and deploy generative AI models. We believe open and transparent AI systems will drive innovation and create the best outcomes for society.
- Website
- https://together.ai
- Industry
- Software Development
- Company size
- 51-200 employees
- Headquarters
- San Francisco, California
- Type
- Privately Held
- Founded
- 2022
- Specialties
- Artificial Intelligence, Cloud Computing, LLM, Open Source, and Decentralized Computing
Locations
-
Primary
251 Rhode Island St
Suite 205
San Francisco, California 94103, US
Employees at Together AI
-
Vipul Ved Prakash
Co-founder & CEO, Together AI
Updates
-
📢 Excited to announce our upcoming webinar on November 6th at 9AM PT: "How We Built LlamaCoder (400k Users) – a Full-Stack AI App with Next.js"! Join Hassan El Mghari from our Dev Rel team as he dives into the story behind LlamaCoder, an open-source code generation tool with 400k users and 3k stars on GitHub. He’ll share exactly how he built it in a single weekend, how he scaled it, and tips on creating your own full-stack AI apps with Next.js. Don’t miss out! Click the link to RSVP 👉 https://lu.ma/euvemy0q
-
Join Hassan El Mghari from the Together AI Dev Rel team as he goes over how he built LlamaCoder – an open-source code generation tool where you can build small apps with a single prompt. It currently has over 400k users and 3k stars on GitHub. He’ll go over exactly how he built it in a single weekend, how he scaled it to 400k users, and share advice on building your own full-stack AI apps with Next.js.
How We Built LlamaCoder (400k Users) – A Full-Stack Next.js AI App
-
We’re excited to share that our CEO, Vipul Ved Prakash, represented Together AI on an incredible panel titled "AI Disruption: The Bigger Picture" at #TecShift2024, the aramco Entrepreneurship Summit at Ithra in Dhahran, Saudi Arabia. Joined by fellow AI visionaries Aidan Gomez (CEO of Cohere), Alexander Ratner (CEO of Snorkel AI), and Yuan (Alan) Qi (CEO of INF TECH), Vipul explored AI’s transformative impact on healthcare—from diagnostic advancements to personalized treatments and the evolving role of healthcare practitioners in this technology-driven landscape. A big thank you to Aysar Tayeb of Prosperity7 Ventures for moderating this forward-looking discussion. #TecShift #FII8
-
New Cookbook: An open-source implementation of Contextual RAG from Anthropic! In short, it involves:
1. Using Llama 3.2 3B to efficiently generate context for each chunk
2. Building two data indices: vector and BM25
3. Performing hybrid search using reciprocal rank fusion
4. Reranking hybrid search results
5. Generating with Llama 3.1 405B
Check it out below! https://lnkd.in/gR9MniE5
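Steps 2-4 hinge on reciprocal rank fusion (RRF), which merges the vector and BM25 rankings into one. A minimal pure-Python sketch of the fusion step — the toy document ids and rankings below are stand-ins, not the cookbook's code:

```python
# Reciprocal rank fusion: fuse multiple ranked lists of doc ids.
# Each doc's fused score is the sum over lists of 1 / (k + rank),
# where rank is the 1-based position in that list.
def reciprocal_rank_fusion(rankings, k=60):
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical results: one ranking from vector search, one from BM25.
vector_hits = ["doc_a", "doc_b", "doc_c"]
bm25_hits = ["doc_a", "doc_d", "doc_b"]

fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
print(fused)  # a doc ranked highly by both indices rises to the top
```

In the cookbook's pipeline, the fused list would then go to a reranker before the final generation step.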
-
👨‍🍳 New Cookbook: A step-by-step walkthrough of the PDF to Podcast workflow! In short, it involves:
1. Structured decoding with Llama-3.1-70B on Together AI to extract a JSON script.
2. Cartesia TTS to bring the script to life!
Notebook: https://lnkd.in/g7vS_4nV
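A hedged sketch of the handoff between the two steps: the LLM returns a JSON podcast script, which is validated into typed turns before each line goes to TTS. The schema below is an assumption for illustration, not the cookbook's exact format:

```python
import json
from dataclasses import dataclass

@dataclass
class Turn:
    speaker: str  # e.g. "host" or "guest" (hypothetical labels)
    text: str     # the line to synthesize

def parse_script(raw_json: str) -> list[Turn]:
    """Validate the model's JSON output into typed dialogue turns."""
    data = json.loads(raw_json)
    return [Turn(t["speaker"], t["text"]) for t in data["dialogue"]]

# Stand-in for structured-decoding output from the model:
raw = '{"dialogue": [{"speaker": "host", "text": "Welcome to the show!"}]}'
script = parse_script(raw)
for turn in script:
    # In the cookbook, each turn would be sent to Cartesia TTS here.
    print(f"{turn.speaker}: {turn.text}")
```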
-
We’re excited to power the AI infrastructure of leading enterprises like Salesforce, The Washington Post, and Zoom. With the Together Enterprise Platform, you can securely run inference, fine-tuning, and training on your own models, in your own environment, while unlocking 2-3x faster inference speeds and reducing operational costs by up to 50%. Learn more about our enterprise offerings at our new page: together.ai/enterprise
-
Announcing Together Cookbooks! 👨‍🍳 A collection of hands-on notebooks showcasing powerful use cases of open-source models with Together AI, including Text RAG, Multimodal Document RAG, Semantic Search, Rerankers, and Structured JSON extraction. Here's a glimpse at what’s inside:
🖼 Multimodal Document RAG with Nvidia Investor Slide Deck: Implement multimodal RAG with ColQwen2 and Llama 3.2 90b Vision, combining text and images for advanced retrieval.
📊 Embedding Visualization: Visualize vector embeddings to explore structure in high-dimensional spaces.
🌐 Knowledge Graphs with Structured Outputs: Generate knowledge graphs from LLMs using structured JSON generation.
🧠 Semantic Search: Boost search precision with BERT-based embedding models for better retrieval.
🔎 Improving Search with Rerankers: Refine search results with rerankers to enhance relevance across large document corpora.
📰 Structured Text Extraction from Images: Extract structured text from images, ideal for document digitization and workflow automation.
📝 Text RAG: Implement text-based Retrieval-Augmented Generation to enrich responses with relevant knowledge.
🔗 Explore the cookbooks here: https://lnkd.in/g5E4VYqd
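The semantic-search idea can be sketched in a few lines: embed the query and the documents, then rank by cosine similarity. The tiny hand-written vectors below are stand-ins for real model embeddings (e.g. from a BERT-based embedder):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def semantic_search(query_vec, doc_vecs):
    """Return document indices sorted by similarity to the query."""
    sims = [(i, cosine(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return [i for i, _ in sorted(sims, key=lambda p: p[1], reverse=True)]

# Toy 2-D "embeddings"; real ones would have hundreds of dimensions.
docs = [[1.0, 0.0], [0.7, 0.7], [0.0, 1.0]]
query = [0.9, 0.1]
print(semantic_search(query, docs))  # nearest-direction document first
```

A reranker, as in the cookbooks, would then rescore the top hits with a more expensive model for better relevance.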
-
Our CEO, Vipul Ved Prakash, joined Clara Shih, CEO of Salesforce AI, on the Ask More of AI podcast to dive into the future of generative AI and how Together AI is leading the charge. In this episode, they break down how Together AI is optimizing AI workloads to make models faster, smarter, and more efficient for real-world applications. Key innovations like FlashAttention, speculative decoding, and model quantization are highlighted, showing how we’re transforming AI workloads for greater speed, scalability, and impact. It’s an exciting look at how we’re helping businesses bring AI from pilot to production at scale. 🎧 Tune in to the conversation here: https://lnkd.in/g8HU3P4B
As generative AI shifts from pilot to production, efficiency, cost, and scalability matter a lot more. Founded 2 years ago as "AWS for Generative AI," Together AI has raised $240M to provide cloud compute optimized for AI workloads. In this week's episode of my #AskMoreOfAI podcast, CEO/founder Vipul Ved Prakash talks about innovations to make models faster and smarter, including:
🔹 FlashAttention: GPU-aware tricks that reduce the memory needed to compute attention and rearrange calculations to speed up inference.
🔹 Speculative decoding: Speeds up inference by drafting multiple tokens in advance instead of one at a time, then verifying and keeping the best ones while discarding the rest.
🔹 Model quantization: Reduces model size and speeds up inference by lowering the precision of the numerical representations used in models without significantly degrading performance. In most LLMs, parameters are stored as 32-bit floating-point numbers, which consume a lot of memory and processing power. Quantization converts these to lower-precision formats, e.g. 16-bit floats or even 8-bit integers.
🔹 Mixture of Agents: Combines multiple specialized models (agents), each handling a different aspect of a problem — for example, a sales agent, a sales manager agent, a deal desk agent, and a legal contracts agent collaborating on one deal.
Vipul predicts that cloud compute for #GenAI will surpass the traditional hyperscaler business within 2-3 years. Salesforce Ventures is proud to have led the Series A earlier this year, and customers running models on Together can BYOM with Einstein Model Builder. 🎧 Listen or watch here! https://lnkd.in/g6XX4KCR
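To make the quantization point concrete, here is a toy sketch of one common scheme (absmax int8): scale float weights into the int8 range, then dequantize. This illustrates the general idea only, not Together's implementation:

```python
# Absmax int8 quantization: map each weight w to round(w / scale),
# where scale = max(|w|) / 127, so the largest weight hits +/-127.
def quantize_int8(weights):
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values."""
    return [v * scale for v in q]

w = [0.12, -0.5, 0.33, 1.0]
q, scale = quantize_int8(w)
restored = dequantize(q, scale)
max_err = max(abs(a - b) for a, b in zip(w, restored))
print(q, round(max_err, 4))  # rounding error is bounded by about scale/2
```

The storage drop (32-bit floats to 8-bit ints) is 4x, at the cost of a small, bounded rounding error per weight — which is why quantization speeds up inference without significantly degrading quality.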