Google to develop AI model that can understand 1,000 spoken languages
Google has announced a new project to build a single AI language model that supports the world’s 1,000 most spoken languages.
According to The Verge, as a first step toward this goal, the company has developed a Universal Speech Model (USM) trained on more than 400 languages. Google says it has partnered with communities across the world to source representative speech data.
According to the company, more than 7,000 languages are spoken in the world but only a few are represented online today.
“…traditional approaches to training language models on text from the web fail to capture the diversity of how we communicate globally. This has historically been an obstacle in the pursuit of our mission to make the world’s information universally accessible and useful,” Google’s senior VP Jeff Dean wrote in a blog post.
Traditional language models also have flaws, Dean said, including reproducing harmful societal biases such as racism and xenophobia and failing to comprehend human-oriented language. That is one reason, he said, that Google is building an “AI model that will support the 1,000 most spoken languages, bringing greater inclusion to billions of people in marginalised communities all around the world”.
Because the project is ambitious and depends on input from communities across the world, it could take many years to complete. However, Google is already working on it and says it can see the “path clearly”, the blog said.
The company also shared new research on text-to-video models, a prototype AI writing assistant called Wordcraft, and an update to its AI Test Kitchen app, which gives users access to AI models still under development, such as Imagen, which converts text into images, the blog added.
Google is not the only company bullish on new AI language models. At its Speech AI Summit held on November 2, Nvidia announced a new speech AI ecosystem, developed through a partnership with Mozilla Common Voice. The ecosystem focuses on developing crowdsourced multilingual speech corpora and open-source pretrained models. The companies aim to accelerate the growth of automatic speech recognition models that work for speakers of every language worldwide.
Nvidia said that standard voice assistants, such as Amazon Alexa and Google Home, support fewer than 1% of the world’s spoken languages. To solve this problem, the company aims to improve linguistic inclusion in speech AI and expand the availability of speech data for global and low-resourced languages.