Microsoft AI Voice Tool Mimics Voices From Three-Second Clips
People have been buzzing about AI-generated text and art. Now, we should look out for the next step of artificial intelligence: AI-generated speech from Microsoft.
The tech giant announced its latest AI creation named VALL-E. It can say anything in your voice based on a three-second sound bite.
At the time of writing, it is not available for public use. Eventually, it could become a mainstream tool and improve over time, similar to ChatGPT and DALL-E.
How does the Microsoft AI VALL-E work?
Microsoft published a GitHub report discussing VALL-E in layman’s terms. The overview says it relies on a text prompt and an acoustic prompt.
The former is the text that the voice should say. Meanwhile, the latter is a three-second recording of the speaker’s voice.
Then, VALL-E uses Neural Codec Language Modeling to turn them into personalized speech. It combines the user’s preferences and machine learning.
As a result, the new Microsoft AI provides voice statements that are nearly indistinguishable from a real person’s voice.
Moreover, the VALL-E researchers say it could “preserve the speaker’s emotion and acoustic environment of the acoustic prompt in synthesis.”
In other words, the samples could have ambient noise, further improving their realism.
Other text-to-speech tools have an unrealistic cadence and eerily absent background noise when “speaking.”
What is the potential real-world impact of VALL-E?
Some people dread the day when artificial intelligence can speak like humans. After all, we have seen many issues with AI-generated content.
Also, the internet has many deepfake clips featuring photos of prominent figures singing or reciting a silly quote.
Of course, we merely laugh at them because they are fake. Imagine if those deepfakes could closely replicate a politician’s or celebrity’s voice.
They could potentially disrupt governments and businesses. After all, the recent Twitter debacle showed the world the potential damage from a single tweet.
For example, pharmaceutical company Eli Lilly’s stock fell shortly after a fake account tweeted that it would offer insulin for free.
If VALL-E falls into the wrong hands, telemarketers could use it to scam more people with realistic automated phone calls.
On the other hand, it could become a boon for people who need to promote their products and services online.
A business owner could use VALL-E to produce a promotional blurb. Then, they could overlay the clip onto their online ad.
Microsoft claims its latest AI can mimic your voice using a three-second clip. The results are allegedly realistic, down to the ambient noise.
At the time of writing, the company does not allow the public to use VALL-E. Still, it could become widely available soon, similar to AI-generated art and text.
In response, you should keep up with the latest digital news and updates. Fortunately, you can start by following Inquirer Tech.