Mozilla is building a massive repository of voice recordings for the voice apps of the future, and it wants you to add yours to the collection.
The organization behind the Firefox browser is launching Common Voice, a project to crowdsource audio samples from the public. The goal is to collect about 10,000 hours of audio in various accents and make it publicly available for everyone.
Building a database large enough to power speech recognition in apps like the digital assistants that have become such a big part of our daily lives requires a ton of audio, so Mozilla is asking the public to pitch in. The approach is similar to that of Google, whose AI experiment Quick, Draw! uses a fun sketch game to train its computer vision AI.
Mozilla hopes to hand over the public dataset to independent developers so they can harness the crowdsourced audio to build the next generation of voice-powered apps and speech-to-text programs.
You can also help train the speech-to-text capabilities by validating the recordings already submitted to the project. Just listen to a short clip, and report back whether the text on the screen matches what you heard.
Common Voice will look to become more than a simple dataset. Mozilla says its aim is to expand the tech beyond a standard voice recognition experience by including multiple accents, demographics and eventually languages for more accessible programs.
The first step for the project is to amass 10,000 hours of validated audio, since Mozilla says that’s roughly the amount of data needed to train a production speech-to-text system. Mozilla will then release the full database to the public later this year, and it doesn’t rule out its inclusion in future versions of its products, like Firefox.