Meta Adds Indigenous Languages to Speech and Translation AI

Meta is seeking to make AI more inclusive with a program to support underserved languages “and help bring their speakers into the digital conversation.” Meta’s Fundamental AI Research (FAIR) unit has teamed with UNESCO to launch the Language Technology Partner Program, which is looking for people who can provide more than 10 hours of speech recordings (with transcriptions) and chunks of written text (200+ sentences, with translation) in diverse languages. “Partners will work with our teams to help integrate these languages into AI-driven speech recognition and machine translation models, which when released will be open sourced,” Meta said.

“Ultimately, our goal is to create intelligent systems that can understand and respond to complex human needs, regardless of language or cultural background,” Meta explained in a blog post. Meta has already onboarded the government of Nunavut, Canada, which has agreed to work with the company to share data on two Inuit languages.

“Complementary to the new program, Meta said that it’s releasing an open source machine translation benchmark to evaluate the performance of language translation models,” TechCrunch points out, explaining that “the benchmark, composed of sentences crafted by linguists, supports seven languages and can be accessed — and contributed to — from the AI development platform Hugging Face.”

While Meta has couched both initiatives as philanthropic, “the company stands to benefit from upgraded speech-recognition and translation models,” TechCrunch notes.

PCMag observes that while Meta’s contributions in the field of translation and transcription are not as omnipresent as Alphabet’s Google Translate, “the company is devoting a lot of attention to it.”

In December 2023, Meta previewed a suite of models under the banner of Seamless Communication that it says can translate speech from 101 different languages, presenting the work as a key step toward a widely available speech-to-speech translation model.

One such model, SeamlessExpressive, has a live demo that lets you “hear how you sound in another language.”

The tech giant says the new outreach is part of its No Language Left Behind (NLLB) project, launched in 2022. Through that effort the company has already collaborated with UNESCO and Hugging Face to build a language translator based on NLLB, which was announced during the United Nations General Assembly in September 2024.

Meta has posted an interest form for those who would like to participate in the Language Technology Partner Program. Applications to join “will be open until March 7, 2025, and the next steps will be discussed no later than April 15, 2025,” PCMag reports.
