Giving a personal touch to the text-to-speech voices we all know and loathe, AI voice cloning allows you to type with the sound of your own voice. Today’s algorithms can create a high-quality clone of any person’s voice with very little effort upfront. We put the best to the test below.
What is the best AI voice cloning tool?
Voice cloning software has made great strides this year, keeping pace with artificial intelligence as a whole. Historically, text-to-speech (TTS) has sounded robotic, lacking the emotion, accents, and inflections of what we consider a realistic voice.
The platforms below break that preconception. Armed with these tools, your generated voice will be suitable for voiceovers, podcasts, audiobooks, advertisements, and any other use case requiring pleasant, natural-sounding voices. Try “What is ChatGPT – and what is it used for?” or “Inside ElevenLabs: Understanding their key services and functions” for further reading on AI.
ElevenLabs – 8/10
Still very impressive since the last time I tried it. It offers ‘Voice Library’ and ‘Voice Design’ for free, with paid tools including ‘Instant Voice Cloning’ and ‘Professional Voice Cloning’. Here’s what those words mean:
- Voice Design: Design entirely new voices by adjusting their parameters. Every voice you create is randomly generated and is entirely unique even if the same settings are applied.
- Instant Voice Cloning: Clone a voice from a clean sample recording. Samples should contain one speaker, run longer than 1 minute, and be free of background noise.
- Voice Library: Add a voice from their library.
- Professional Voice Cloning: “Pre-register to create an identical AI-version of your voice. Sequential roll-out starting from July.”
Descript – 7/10
Requires you to download their software, with a steeper learning curve than other options. Honestly, though, I’m not mad at it. The software is packed with useful and bespoke features; it’s essentially a whole non-linear editor (NLE) like Premiere Pro, but free. I’ll have to come back to all these features some other time.
Murf – 0/10 (Disqualified for lack of voice cloning)
The free version of Murf AI includes:
- Access to 120+ AI voices
- Royalty free music, sorted by mood and use
- 10 minutes of voice generation time
- 10 minutes of transcription time
- Shareable links for audio/video output
- Single user login
- No voice cloning, even though Murf ranks on Google for that search term.
Speechify – 6/10
The voice cloning aspect itself was very easy. It took about 2 minutes of reading a pre-set script, which gives the service a consistent set of words to base the sound of your voice on. The quality of the result was reasonably human, enough to use it for marketing or content creation. However, it didn’t sound that much like me. I’m not just balking at the uncanny valley of it all; I mean it sounded just as much like me as it did Elon Musk, and a little more American than expected. The model may have its biases, but overall it is a very useful service.
Rask AI – 0/10 (Disqualified for lack of voice cloning)
When using Rask to transcribe a video, it does come with useful features such as auto-detection for ‘number of speakers’ and ‘language in video’.
- No voice cloning, even though Rask ranks on Google for that search term.
- Vocal dubbing for video correctly identified the language to translate from and performed reasonably well with wind noise. Expressiveness and vocal characteristics of the original video were well maintained. I would tentatively rate what I’ve seen an 8/10 based on one sample experiment.
Elai – 7/10
Another much-requested service at the forefront of generative AI is Elai.
- No voice cloning as part of the free plan.
- Speech synthesis quality is excellent. Pauses, breaths, and intonation are all well implemented. This varies between individual voices, with some showing odd audio artefacts, compression, or other quality degradation. However, I would rate some voices as highly as 9/10 for quality and expressiveness.
Conclusion
If you really care about quality, that will probably come from a premium (pro or enterprise) plan. That said, the list above is useful for anyone just dipping their toes into AI voice generators on a free plan.
Remember, none of the above will provide good results without a decent recording of your own voice. The quality of the audio files you submit to any AI voice generator will affect the end result: background noise in the sample will remain audible in the resulting speech synthesis. A decent quality mic is also recommended, ideally a $50+ dedicated mic as opposed to a built-in laptop or headset mic. Less important (but still a consideration) is the use of WAV files as opposed to MP3, as WAV audio is typically uncompressed while MP3 is lossy.
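If you want to sanity-check a recording before uploading it, a rough sketch like the one below can help. It uses Python's standard `wave` module to read the basic properties of a WAV file and compare its length against a minimum duration. The 60-second default comes from the "over 1 minute" guidance quoted for ElevenLabs above; the function names are made up for illustration, and other services may expect something different. It cannot tell you whether the sample has background noise or more than one speaker.

```python
import wave

# Assumption: 60 seconds mirrors the "over 1 minute" guidance quoted above
# for ElevenLabs. Other services may ask for more or less.
MIN_SECONDS = 60.0


def describe_wav(path):
    """Return basic properties of a PCM WAV file (hypothetical helper)."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),        # 1 = mono, 2 = stereo
            "sample_rate_hz": rate,
            "bit_depth": wav.getsampwidth() * 8,   # bytes per sample -> bits
            "duration_s": frames / rate,
        }


def is_long_enough(info, min_seconds=MIN_SECONDS):
    """True if the recording meets the minimum duration."""
    return info["duration_s"] >= min_seconds
```

Calling `describe_wav` on your sample and passing the result to `is_long_enough` tells you whether the file clears the length bar before you spend an upload on it. An MP3 would need converting to WAV first, since the `wave` module only reads WAV files.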
Voice synthesis services weigh multiple factors under the heading of ‘voice quality’. With AI apps that bring AI voice changer tech to iOS and Android through your smartphone mic, we don’t expect user-generated voices to match the high-quality voices of a voice actor in a professional recording studio. Which is fine! Much of the adoption of this tech comes from casual users having a bit of fun. Deep learning technology combined with natural language processing (NLP) makes custom voice models easy for anyone to create, whether that be of your own voice or that of an actor, politician, or celebrity.