Ray's Curation: OpenAI Turns ChatGPT into a Voice Assistant That Can See and Understand Images and Speech

Sunday, October 8, 2023

OpenAI Turns ChatGPT into a Voice Assistant That Can See and Understand Images and Speech - ERIC HAL SCHWARTZ, Voicebot

The most notable change to ChatGPT is its new ability to understand speech and respond in kind. A new text-to-speech model that mimics human voices after hearing just seconds of sample audio lets users hear ChatGPT’s ‘voice’ respond to their input. OpenAI’s speech recognition system Whisper transcribes users’ spoken words. The spoken exchange essentially turns ChatGPT into a voice assistant like Alexa or Google Assistant, albeit one with the benefits and limits of the generative AI chatbot. ChatGPT can converse using any of five available voices, each synthesized from a professional voice actor.

https://voicebot.ai/2023/09/26/openai-turns-chatgpt-into-a-voice-assistant-that-can-see-and-understand-images-and-speech/