SeamlessExpressive AI translation from Meta breaks new ground

Could Meta take the AI audio crown from ElevenLabs?
Last Updated on December 8, 2023
SeamlessExpressive from Meta AI, the real-time speech translation AI model.

Meta AI continues to release impressive new tools in the wake of Meta Connect this September. Its latest offering is a generative AI audio tool that translates speech from one language to another in real time while maintaining the vocal style, tone, and expressiveness of the original audio. SeamlessStreaming, the open-source code that makes this possible, is already available on GitHub. So how does SeamlessExpressive work, and how could you use it yourself?

What is SeamlessExpressive from Meta AI?

SeamlessExpressive is an AI tool from Meta that maintains the vocal style, tone, and expressiveness of your message when it is translated into another language. Intonation is an important aspect of vocal communication in most, if not all, languages. This aspect has hitherto been missing from translation tools, and not due to complacency or lack of priority: it’s just very tricky to do.

The task of maintaining the inflection in your voice comes at the end of a very long series of other technological challenges, each state-of-the-art in its own right.

First, you have to master text-to-text translation between “almost 100 languages”, as Meta has done, despite the fact that not all languages share individual words with identical meanings and social implications. Then, translating from text to speech, or from speech to text, requires voice recognition technology and a means to translate not only between languages but also between modalities. In fact, SeamlessExpressive itself builds on a predecessor called SeamlessM4T, which handles everything but expressiveness.

In short, it’s very impressive and directly competes with similar technology from ElevenLabs.

How does real-time AI speech translation work?

SeamlessExpressive can translate from text-to-text, speech-to-text, text-to-speech, and speech-to-speech. It does so while maintaining the emotional tone of the original input and with less than two seconds of latency, which makes it fast enough for real-time translation and speech output, regardless of modality.
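As a rough illustration only, the four input and output combinations described here can be enumerated in a few lines of Python. This sketch does not call Meta's model. The short labels (S2ST, S2TT, T2ST, T2TT) follow the abbreviations used in Meta's SeamlessM4T materials, while the function names and the latency check are hypothetical helpers invented for this example, with the two-second figure taken from the paragraph above.

```python
from itertools import product

# The two modalities the article describes.
MODALITIES = ("speech", "text")


def task_code(source: str, target: str) -> str:
    """Build a short label such as 'S2TT' (speech-to-text translation).

    Hypothetical helper for illustration; it is not part of any Meta library.
    """
    if source not in MODALITIES or target not in MODALITIES:
        raise ValueError(f"modalities must be one of {MODALITIES}")
    # First letter of source, '2', first letter of target, 'T' for translation.
    return f"{source[0].upper()}2{target[0].upper()}T"


def within_latency_budget(latency_s: float, budget_s: float = 2.0) -> bool:
    """Return True if a measured latency is under the quoted two-second mark."""
    return latency_s < budget_s


# Every source/target pairing: speech or text in, speech or text out.
ALL_TASKS = {
    (source, target): task_code(source, target)
    for source, target in product(MODALITIES, repeat=2)
}

if __name__ == "__main__":
    for (source, target), code in ALL_TASKS.items():
        print(f"{source}-to-{target}: {code}")
```

Two modalities on each side give four pairings, which is why the list above has exactly four entries.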

Meta is also publicly releasing the metadata of the data set that underpins this technology.

In keeping with our approach to open science, we’re publicly releasing SeamlessM4T under a research license to allow researchers and developers to build on this work. We’re also releasing the metadata of SeamlessAlign, the biggest open multimodal translation dataset to date, totaling 270,000 hours of mined speech and text alignments.

Meta AI

SeamlessExpressive output languages

The Meta AI real-time voice translator is capable of audio output in at least the following languages:

  • English: English
  • Spanish: Español
  • Italian: Italiano
  • German: Deutsch
  • French: Français
  • Japanese: 日本語 (Nihongo)
  • Javanese: ꦧꦱꦗꦮ (Basa Jawa)
  • Croatian: Hrvatski
  • Hungarian: Magyar
  • Kamba: Kikamba
  • Vietnamese: Tiếng Việt
  • Luxembourgish: Lëtzebuergesch
  • Ganda: Luganda
  • Icelandic: Íslenska
  • Luo: Dholuo
  • Maltese: Malti
  • Welsh: Cymraeg
  • Finnish: Suomi

Steve is the AI Content Writer for PC Guide, writing about all things artificial intelligence. He currently leads the AI reviews on the website.