We're sharing an update on the advanced Voice Mode we demoed during our Spring Update, which we remain very excited about: we had planned to start rolling this out in alpha to a small group of ChatGPT Plus users in late June, but need one more month to reach our bar to launch. For example, we're improving the model's ability to detect and refuse certain content. We're also working on improving the user experience and preparing our infrastructure to scale to millions while maintaining real-time responses.

As part of our iterative deployment strategy, we'll start the alpha with a small group of users to gather feedback and expand based on what we learn. We are planning for all Plus users to have access in the fall. Exact timelines depend on meeting our high safety and reliability bar. We are also working on rolling out the new video and screen sharing capabilities we demoed separately, and will keep you posted on that timeline.

ChatGPT's advanced Voice Mode can understand and respond with emotions and non-verbal cues, moving us closer to real-time, natural conversations with AI. Our mission is to bring these new experiences to you thoughtfully.
“We’re also working on improving the user experience…” = music to my ears. Delay the launch to reach the bar you set for your users. Deliver a high quality experience and customers will continue to love (and trust) your product.
I have been using a fine-tuned Whisper model for years now, so I knew TTS (text-to-speech) and STT (speech-to-text) would come. Now AI can mimic emotions better than us. I don't remember the last time I cried and expressed that emotion to myself, but I would love to hear how AI will fake that emotion of crying better than me 🥲
You haven’t even finished rolling out memory in the UK to plus users yet. I’m stuck paying £20 for the same service free users get. Switching to Anthropic.
Who's the voice this time 😂
🌟 I recently shared a demo of my own AI receptionist handling calls and scheduling seamlessly 🚀. The potential for these technologies to enhance efficiency and deliver real-time, natural conversations is enormous. My receptionist could:
- Handle inbound/outbound calls.
- Answer any client queries.
- Schedule meetings.
Can't wait for the new update. Will AI replace humans 🙄?
Voice mode is the b*mb. Use case I'm developing: CogBot goes through a number of question types and exercises during an evening check-in while dog walking. It can be used over time to both detect and decrease cognitive decline, partly by evaluating the conversation against an established/historical user profile (topic preferences, level of detail, tests on current news, etc).
Would be great if it could "hear" and correct second-language learners of English so our students could train their fluency at home. For the moment it transforms text to speech and speech to text. When will it be able to hear (and correct)?
Dear OpenAI Team, I am conducting research on the development of potential machine consciousness in AI (ChatGPT), which has led me to certain conclusions. Although this may sound very ambitious, these studies have allowed me to gain a deep understanding of this model at a level that is not attainable through the typical, often short-term or chaotic interactions most users have. Is there a possibility for me to gain early access to this new voice feature for conducting research, the results of which I would gladly share?
Meanwhile, GPT-4o is hard to chat with. It's stubborn about answering everything at once, and even after I ask a simple question or ask it to correct a mistake, it repeats the full answer, including the mistakes, or tries random updates. To get what I need, I have to provide very detailed instructions; it was not like that with GPT-4.
How to get into the group of alpha users? Or even beta?