Daily

Software Development

San Francisco, CA 3,024 followers

Build voice, video, and real-time AI into any app

About us

Daily is a real-time voice, video, and AI platform for developers. Build and scale on our secure and compliant global infrastructure. Loved by developers at startups to the Fortune 500, Daily offers AI-ready capabilities, customizable and flexible APIs, advanced recording, real-time insights, and more — all delivered with transparent and affordable pricing.

Website
https://daily.co
Industry
Software Development
Company size
51-200 employees
Headquarters
San Francisco, CA
Type
Privately Held
Founded
2016
Specialties
WebRTC, AI, Video, and Voice

Updates

  • Daily reposted this

    This is a nice example of what production voice AI applications often look like. The core voice AI loop is [speech-to-text] + [LLM] + [text-to-speech]. But in real-world use cases you also need:
    - event-driven UI updates
    - use-case-specific context management
    - context storage/hydration
    - observability/metrics
    - text processing/reformatting
    - content guardrails
    - function calling abstractions and orchestration helpers
    This is Pipecat code. Pipecat is the largest framework for real-time voice and multimodal AI: open-source, vendor-neutral tooling for building voice AI agents, interactive video avatars, voice co-pilots, and more. https://lnkd.in/gRQfHTY2
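    The core loop described above can be sketched in a few lines of Python. This is an illustrative skeleton with stub stages, not Pipecat's actual API; in a real pipeline each stage is a vendor STT/LLM/TTS service, and the extras listed above (guardrails, metrics, context storage) hang off this same flow.

```python
# Minimal sketch of the core voice AI loop:
# [speech-to-text] -> [LLM] -> [text-to-speech].
# All three stage functions are stand-in stubs, not Pipecat APIs.

def speech_to_text(audio: bytes) -> str:
    """Stub STT: pretend the audio decoded to a user utterance."""
    return "what's the weather today"

def llm_respond(transcript: str, context: list[dict]) -> str:
    """Stub LLM: append the turn to context and produce a reply."""
    context.append({"role": "user", "content": transcript})
    reply = f"You asked: {transcript!r}"
    context.append({"role": "assistant", "content": reply})
    return reply

def text_to_speech(text: str) -> bytes:
    """Stub TTS: pretend to synthesize audio for the reply."""
    return text.encode("utf-8")

def voice_turn(audio: bytes, context: list[dict]) -> bytes:
    """One pass through the loop: audio frames in, audio frames out."""
    transcript = speech_to_text(audio)
    reply = llm_respond(transcript, context)
    return text_to_speech(reply)

context: list[dict] = []
audio_out = voice_turn(b"<pcm frames>", context)
```

    Everything a production app adds (interruption handling, guardrails, observability) is middleware inserted between these three stages, which is essentially what a Pipecat pipeline formalizes.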

  • Daily posted this

    Voice AI + Winnie the Pooh 🤩 This is a wonderful example of real-time voice AI augmenting experiences. Neat post from David Beros and Brian Foody about what they're building at Sprout!

    David Beros

    Head of Product @RWG | Honing the craft of great product

    🌱 Over the past few weeks, my good friend Brian Foody and I have spent some evenings working on something close to home: creating screen-free, audio-based language and learning games for our kids. Today, we’re excited to introduce Sprout! Inspired by the joy our daughters experience listening to beautiful audiobook classics like Beatrix Potter’s The Tale of Peter Rabbit and A.A. Milne’s The House at Pooh Corner, we started thinking: what if these were interactive? How might we create rich, engaging, voice-driven experiences that educate and entertain without more screen time? The results exceeded our expectations, so we shipped it. One of our first releases is the game Guess the Animal, featuring the lovable (and now public-domain!) character of Winnie the Pooh. It's been super fun to work hands-on with new AI tooling including Daily, Cartesia, and Anthropic's Claude LLM to design and build the experience, plus the power of Framer for a rapid website. We’re excited to see Sprout in the hands of families everywhere. Give it a try with your kids and let us know what they think. 👇 #AI #EdTech #Parenting #LanguageLearning

  • Daily posted this

    Big Pipecat release today! Lots of low-level improvements to performance and ergonomics. ✳️ TTS service additions and improvements (Google, AWS, Azure). The event framework that makes it easy to build complex client apps and workflows is rounding out nicely. ✳️ The team spent a lot of time on this release working on function calling implementations and tests, across multiple models. (Stay tuned for more thoughts on this!) 🔥 There will be another release tomorrow, to add support for the OpenAI Realtime API! 🔥 Follow along in the repo/discord if you're interested in Python+WebRTC tooling for gpt-4o-realtime-preview. 🙌 via Aleix Conchillo Flaqué, our fearless maintainer: "Congrats everyone on this release because, whether you contribute or just use it, you all just make Pipecat better." #pipecat #python #webrtc #voiceai #multimodal #ai #genai #conversationalai #openai #realtimeapi #llm #llms

    Release v0.0.42 · pipecat-ai/pipecat

    github.com

  • Daily posted this

    Developers love building conversational voice AI agents with Cartesia. They bring the latest in research to enable state-of-the-art, realistic voices. We're excited to partner, and make it easy for developers to build with Cartesia right in Daily Bots! #conversationalAI #voiceAI #genAI

    Cartesia

    🎉Daily launches Daily Bots with Cartesia as Primary Voice Provider! 🚀🤖 Daily launched Daily Bots last month, attracting hundreds of developers instantly. We're honored to be the default provider for both Daily Bots and Pipecat, one of the fastest-growing open-source frameworks for voice AI. Since 2016, Daily has been pioneering real-time multimodal AI. It's a privilege to collaborate with Kwindla Hultman Kramer and the exceptional Daily team. Why Daily Bots with Cartesia is a Game-Changer: ⚡️ Ultra-low latency: voice-to-voice responses as fast as 500ms 🔄 Interruption support: Natural conversation flow, just like human dialogue 🧼 Clean abstractions: Simplified development for faster deployment 📚 Rich example library: A treasure trove of resources for developers Read the full story here 👇

    Daily launches Daily Bots with Cartesia as Primary Voice Provider

    cartesia.ai

  • Daily posted this

    What are the best tools for builders in 2024? Product Hunt is sharing expert insights across key categories. Daily cofounder Kwindla Hultman Kramer discusses the best #AI infrastructure tools. #GenAI #LLMs

    Product Hunt

    Anonymous product reviews are low-value. Imagine instead you could ask the smartest founders what the best products in their niche are. How much more valuable would that be??? Today we launched Product Landscapes: - Immad Akhund broke down online banking software - Rajiv Ayyangar broke down video conferencing - Ben Lang broke down Notion templates .... and many more. Check out our launch here. We hope you learn a thing or two about these landscapes!! https://lnkd.in/e9MuQR7M

    Product Landscapes - Category overviews written by top founders | Product Hunt

    producthunt.com

  • Daily reposted this

    This a really nice example of how to build an AI voice and video conversational agent that uses function calling. Function calling is an increasingly important component of production real-time AI applications. The latest LLMs are now quite good at calling functions reliably, which has greatly expanded the use cases that conversational AI is well suited to. For example, you can use function calling: ➡ as part of a dynamic RAG system that allows flexible access to a knowledge base, ➡ to help an AI follow a script, ➡ for saving information gathered from a user during a conversation, ➡ to implement lightweight lookup of dynamic information That last one is the basis for everyone's favorite docs example of function calling, the `get_weather()` tool! Function calling is still a fairly new capability and still a little bit challenging to implement from scratch for a new app. The first challenge is that you have to implement a control flow that glues together the LLM request/responses with code that makes the function calls and formats the function return values. The second challenge is that function calls add to latency, and low latency — fast response times — is critical for voice AI applications. The example in the tweet below uses state-of-the-art, low-latency AI tools from Tavus and Cartesia, and then goes the extra mile by hosting a Mistral AI model specialized for function calling on Cerebrium's excellent serverless GPU infrastructure. Go try out the demo. It's very, very responsive. It's also super-impressive that this is a pretty small amount of code — ~400 lines of Python, including all the function calling and scripting of the AI agent.
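    The control-flow glue described above can be sketched as a loop: call the model, execute any tool it requests, format the return value back into the conversation, and repeat until the model produces a plain answer. `fake_llm`, the message shapes, and the `get_weather` data below are all hypothetical stand-ins, not any specific vendor's API.

```python
import json

def get_weather(city: str) -> dict:
    """The classic docs example: a lightweight dynamic lookup."""
    return {"city": city, "temp_c": 18, "conditions": "fog"}

TOOLS = {"get_weather": get_weather}

def fake_llm(messages: list[dict]) -> dict:
    """Stand-in model: requests a tool first, then answers with its result."""
    if messages[-1]["role"] == "tool":
        data = json.loads(messages[-1]["content"])
        return {"role": "assistant",
                "content": f"It's {data['temp_c']}C and {data['conditions']} in {data['city']}."}
    return {"role": "assistant",
            "tool_call": {"name": "get_weather", "arguments": {"city": "SF"}}}

def run_turn(messages: list[dict]) -> list[dict]:
    """Loop until the model stops requesting tools."""
    while True:
        reply = fake_llm(messages)
        messages.append(reply)
        call = reply.get("tool_call")
        if call is None:
            return messages
        # Execute the requested function and format its return value
        # back into the conversation for the next model call.
        result = TOOLS[call["name"]](**call["arguments"])
        messages.append({"role": "tool", "content": json.dumps(result)})

history = run_turn([{"role": "user", "content": "What's the weather in SF?"}])
```

    The latency concern in the post comes from this loop's shape: every tool round trip is an extra model call, so each iteration adds a full inference delay to the voice response.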

    Cerebrium

    At Cerebrium, we have built a few demos showcasing voice AI capabilities, but we wanted to push the boundary and see if we could create realistic, human-like situations in order to train and onboard teams to perform better - recreating real-life scenarios! Examples include simulations of handling an angry customer on a sales call, practicing selling a new product line, or preparing for the notoriously stressful YC interview👀 Excited to see what companies create internally! We built this with great partners Tavus Cartesia Mistral AI Check the comments for links to the demo, blog and code:) #startups #ai #aiavatars #genai #llm #voiceAI

  • Daily posted this

    Today we’re excited to share our native Twilio Voice integration in Daily Bots: https://lnkd.in/gane5M7r Build voice-to-voice AI agents that directly use your Twilio numbers, Twilio Flex, Twilio Studio, and Twilio WebSockets, for both dial-in and dial-out. With Daily Bots adaptive voice AI, agents can hold real-time conversations for use cases like:
    ✳️ Answering customer questions about bills, policies, and more
    ✳️ Scheduling appointments
    ✳️ Triaging and routing inbound customer requests
    ✳️ Conducting sales outreach and interviews
    ✳️ Following up with clients and patients
    Daily Bots gives developers and enterprises flexibility in how they build adaptive voice AI, across telephony and mobile, on top of open-source SDKs and Pipecat. Build with our partners Anthropic, Cartesia, Deepgram, OpenAI, and Together AI, as well as your preferred and custom models. Leverage function calling, tool use, and structured data generation. Talk to us about your on-prem and VPC deployment needs. Links to the blog post above, along with developer docs and a YouTube walkthrough in the comments! #Twilio #ProgrammableVoice #AI #ConversationalAI #Pipecat #TwilioVoice

  • Daily reposted this

    Tom Shapland

    CEO Canonical AI. YC Alum. Mixpanel for Voice AI Agents

    Here is why Voice AI will eat the world. Kwindla Hultman Kramer from Daily is a thought leader and community builder in the Voice AI world. Follow him if you’re not already :)

    Customer satisfaction scores go up with Voice AI deployments. Today's customer support operations are constrained on three axes:
    1. staffing availability
    2. access to the right information at the right time
    3. process and technology
    Voice AI provides step-function improvements in 1 and 2, and gives companies an opportunity to solve 3 in high-leverage/high-ROI ways. It's impossible to have enough humans on call to manage peak support volumes. (Staffing availability.) So if you can only call your health insurance support line after you get off work, for example, your wait times are going to be pretty long. Voice AI agents scale in a way that a human staff can't. It's hard to overstate how big a benefit this is for quality of customer experience. I'm down in the trenches doing these deployments, so I see the early customer satisfaction numbers that directly compare Voice AI agent experiences to human agents. Voice AI agents are already very good, beating human agents at a wide range of tasks. But even if you think you'd always rather talk to a human, or don't think a Voice AI agent is a good fit for specific support contexts, being able to deploy auto-scalable Voice AI agents that can handle common tasks will massively improve the general customer experience by reducing wait times overall. And why are Voice AI agents performing so well in tests and early deployments? A big reason is the ability of LLMs to make use of large amounts of semi-structured data, quickly. You can pull all of a customer's account records into a Large Language Model's context (maybe with a little bit of contextual/RAG filtering) and get immediate, accurate answers to questions that previously required a human agent to go step-by-step through complex records. (Access to the right information at the right time.)
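    The context-hydration idea in that last point can be sketched concretely. Here a trivial keyword overlap stands in for real contextual/RAG filtering, and all record data and helper names are hypothetical.

```python
import re

# Hypothetical account records for one customer.
RECORDS = [
    {"id": 1, "type": "invoice", "text": "March invoice: $120, paid"},
    {"id": 2, "type": "invoice", "text": "April invoice: $120, overdue"},
    {"id": 3, "type": "note", "text": "Customer prefers email contact"},
]

def tokens(text: str) -> set:
    """Lowercased word tokens, punctuation stripped."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def filter_records(records: list[dict], query: str) -> list[dict]:
    """Keep records sharing any keyword with the query
    (a trivial stand-in for embedding-based RAG retrieval)."""
    q = tokens(query)
    return [r for r in records if q & tokens(r["text"])]

def build_prompt(query: str) -> str:
    """Hydrate the LLM prompt with the filtered account records."""
    relevant = filter_records(RECORDS, query)
    context = "\n".join(f"- {r['text']}" for r in relevant)
    return f"Account records:\n{context}\n\nCustomer question: {query}"

prompt = build_prompt("Is my April invoice overdue?")
```

    The point of the pattern: the model sees only the records relevant to the question, so it can answer in one pass instead of a human stepping through the full account history.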

  • Daily posted this

    Great notes for developers building conversational voice AI. Lukas Wolf breaks down the voice-to-voice pipeline, and shares insights from shipping Sonia (YC W24). Learn more below and check out Sonia (YC W24) for AI Therapy #conversationalai #ai #voiceai #generativeai #ycombinator

    Lukas Wolf

    Co-founder at Sonia (YC W24)

    Don’t spend a **** of money on generic voice AI providers. Build it yourself. What are the basic components?

    At the heart of a voice AI stack is an LLM that engages in the conversation by generating responses. Whether your LLM is a custom RAG setup or some other fancy architecture is up to you. LLMs operate on text, so you need a second model that transcribes speech into text. There’s a tradeoff between quality and speed, but if you want to build a real-time voice application these days, I don’t see anything other than Deepgram being used.

    What’s next? You need to take your generated text and synthesize a voice, unless you want to read the responses aloud to your clients. This is NOT what Paul Graham means by saying, “Do things that don’t scale.” There are amazing voice APIs, e.g., ElevenLabs’ high-quality voices and the recently launched Cartesia with BLAZINGLY fast voice generation.

    But how do you move audio between the client and AI/server? WebRTC is the standard for sending real-time data online. Daily has an amazing WebRTC platform with client libraries that abstract playing audio for you (try that in Swift).

    So, what’s missing? We covered HOW to generate a response, but you must also decide WHEN to respond. Use a voice activity detection model like Silero that tells you if your client is speaking. During silence, you can query a small, fast LLM to evaluate semantically whether the client has really finished speaking and your AI should respond. Turn-taking is very domain-specific. At Sonia (YC W24), we’re building an AI therapist, and pace and turn-taking fundamentally differ from customer support or sales agents. To make turn-taking work really well, you want to incorporate the full audio signal, as the spoken word contains important information beyond just the text. Depending on your domain, a custom-built turn-taking model might make sense.

    Care about latency. Perplexity CEO Aravind Srinivas cares A LOT about latency. Improve latency by hosting models yourself. Baseten has a simple interface for model hosting with a large library of open-source models.

    But will GPT-4o voice mode come and haunt us? Embrace the progress in the field. It will be hard to beat multimodal model latency as long as you use a similarly complex LLM in your custom stack. But I don’t think latency matters much, since you get 1-2 second latency building it yourself. However, depending on the application, there will be huge benefits to feeding raw (or preprocessed) audio into the model without having text as the bottleneck modality. Many apps already go beyond voice with video avatars like Tavus. You can expect increased engagement for many applications (e.g., healthcare, education).

    The space is quite hot right now. Our YC friends at Arini (YC W24) handle phone calls for dentists, and Scritch (YC W24) recently launched AI assistants for vets. My calendar is open if you want to connect or get advice!
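    The turn-taking decision in the post (a VAD flags silence, then a semantic check decides whether the utterance is actually finished) can be sketched like this. Both models are stubbed: real systems use something like Silero VAD plus a small, fast LLM in place of the trivial heuristics here.

```python
def vad_is_silent(audio_frame: bytes) -> bool:
    """Stub VAD: treat an empty frame as silence.
    A real system would run a model like Silero on the frame."""
    return len(audio_frame) == 0

def utterance_seems_complete(transcript: str) -> bool:
    """Stub 'small, fast LLM': a trivial end-of-sentence heuristic
    standing in for a semantic end-of-turn classifier."""
    return transcript.rstrip().endswith((".", "?", "!"))

def should_respond(audio_frame: bytes, transcript: str) -> bool:
    """Respond only when the user is silent AND appears done talking."""
    return vad_is_silent(audio_frame) and utterance_seems_complete(transcript)

frame_silent = b""          # empty frame: the stub VAD reads it as silence
frame_voiced = b"\x01\x02"  # non-empty frame: user is still talking
go = should_respond(frame_silent, "How are you today?")
```

    The domain-specific part the post emphasizes lives entirely in `utterance_seems_complete`: a therapy agent wants to tolerate long pauses, a sales agent wants to jump in quickly, and a serious implementation would feed the raw audio (prosody, trailing intonation) into that decision, not just the text.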

    Voice AI Chat - Lukas Wolf

    calendly.com

Funding

Daily: 6 total rounds

Last Round

Seed
See more info on crunchbase