SpeechStew: Google Researchers Mix All Available Voice Data for Better Speech Recognition

A team from Google Research and Google Brain has reportedly used “all” currently available speech recognition data to train a single, very large neural network: SpeechStew.

As reported by VentureBeat, the resulting model achieved speech recognition results that can compete with those of other, more elaborately tuned models.

Speech recognition models are often trained on only a single dataset, since the data within one corpus tends to be very homogeneous in its annotation and, above all, its speech quality. This also simplifies working with the data and optimizing a model.

In the case of SpeechStew, the researchers decided on a completely different approach. According to the report, speech data from the following spoken-language corpora were combined for SpeechStew: “AMI, Broadcast News, Mozilla Common Voice, Librispeech, Switchboard/Fisher, Tedlium, and Wall Street Journal.”

These corpora were simply mixed together, without any special weighting or balancing of the individual datasets. Together, they comprise more than 5,000 hours of annotated speech.
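
To make the idea of unweighted mixing concrete, here is a minimal Python sketch. It is an illustration only, not the researchers' actual pipeline: the corpus names, the toy utterance IDs, and the `mix_corpora` helper are all hypothetical, and real training would stream audio and transcripts rather than short strings.

```python
import random
from collections import Counter


def mix_corpora(corpora, seed=0):
    """Pool every utterance from every corpus and shuffle the result.

    No per-corpus weights are applied, so each corpus ends up represented
    in proportion to its size. (Hypothetical helper for illustration.)
    """
    mixed = [
        (name, utterance)
        for name, utterances in corpora.items()
        for utterance in utterances
    ]
    # A seeded generator keeps the shuffle reproducible.
    random.Random(seed).shuffle(mixed)
    return mixed


# Toy stand-ins for real corpora (assumed names and sizes).
corpora = {
    "corpus_a": [f"a_{i}" for i in range(6)],
    "corpus_b": [f"b_{i}" for i in range(3)],
    "corpus_c": [f"c_{i}" for i in range(1)],
}

mixed = mix_corpora(corpora)
counts = Counter(name for name, _ in mixed)
print(len(mixed), dict(counts))
```

Because the examples are pooled rather than sampled per corpus, a larger corpus contributes proportionally more training examples, which is what "no special weighting" amounts to in this sketch.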

According to the team, SpeechStew matches the recognition accuracy of other modern systems on some benchmarks and in some cases even exceeds it. In addition, the model is said to adapt well to different tasks.

Comparatively little additional training data was sufficient to match the results of specially trained and adapted models. This is likely due to the wide variety of the data the model was originally trained on.

When asked by VentureBeat about the practical application of these findings, the researchers responded cautiously. It is possible that work like SpeechStew could be used in the future as a kind of general-purpose model that serves as the basis for other, more specialized speech recognition tasks.

Avinash A
Meet Avinash, a tech editor with a Master's in Computer Science and a passion for futuristic tech, AI, and Machine Learning. Known for making complex tech easy to understand, he's a respected voice in leading tech publications and podcasts. When he's not deciphering the latest AI trends, Avinash indulges in building robots and dreaming up the next big tech breakthrough.

