Gemini Live takes on GPT-4o’s realistic AI speech

GPT-4o wowed the world with its human-like speech, which echoes technology from the movie “Her.” In response, Google launched a similar tool called Gemini Live. 

It’s an AI assistant that uses human-like speech, complete with realistic inflections, “umms,” and other human expressions. You can also choose among different voices, as you can with GPT-4o.

Gemini Live is available in English on the Google Gemini app for Android devices. However, you must pay for a Gemini Advanced subscription, which costs $19.99 monthly. 

What can Gemini Live do?

Gemini Live works like Siri and other voice assistants but with more free-flowing, natural conversations. Ask a question, and it will answer with relevant information instead of requiring you to read sources yourself.

The search engine giant shared the Gemini Live demo on YouTube, showing the voice options Vega, Ursa, and Nova. The demo also recommends a few use cases, such as practicing for interviews, building positive habits, and brainstorming ideas.

If that sounds familiar, it’s because OpenAI unveiled its advanced ChatGPT voice mode in May 2024. However, Gemini Live has a few advantages over OpenAI’s contender.

VentureBeat cites over 3 billion active Android users and 2.2 billion active iOS users. With that reach, Google could bring its AI voice tool to significantly more people.

Sissie Hsiao, Google’s Vice President and General Manager for Gemini experiences and Google Assistant, shared more details on Gemini Live.

Hsiao admits AI unlocks powerful new capabilities but presents new challenges:

“Ironically, using large language models that can better interpret natural language and handle complex tasks often means simple tasks take a moment longer to complete.”

“And while generative AI is flexible enough to complete a wide array of tasks, it can sometimes behave in unexpected ways or provide inaccurate information.” 

That is why the company incorporated models like Gemini 1.5 Flash that provide faster, higher-quality responses. 

Nevertheless, Hsiao believes, “We’re in the early days of discovering all the ways an AI-powered assistant can be helpful.”

“Gemini will just keep getting better.” 
