OpenAI is making its voice cloning AI model, called Voice
Engine, available to a limited number of companies and developers.
Voice Engine has been in the making since 2022 and is designed
to generate voice clones based on 15-second reference clips. According to
OpenAI, the model was trained on “a mix of licensed and publicly available
data.” It can read out text prompts on command in both the speaker’s language
and other languages. It also provides the foundation for the preset
voices for the text-to-speech API as well as ChatGPT’s Read Aloud feature.
Companies with early access to Voice Engine span a wide range of sectors, from education tech and storytelling to AI communication, health, and more.
Education technology platform Age of Learning, for instance,
has used Voice Engine to create pre-scripted voice-over content and real-time,
personalized responses that it reads out to students.
Just last month, the Federal Communications Commission
banned robocalls that use AI voices, and OpenAI is therefore making sure to comply
with the new rules. The company and its partners have agreed not to use the voice
generation technology to impersonate people or organizations without their
consent.
There are also requirements to obtain the explicit and informed consent of the original speaker, avoid enabling users to create their own voices, and inform listeners that the voices are AI-generated. The audio clips will also be watermarked so that their source can be traced and their usage monitored.