Google, Mozilla, And The Race To Make Voice Data For Everyone

A voice-controlled virtual assistant (Siri, Alexa, Cortana, or Google Assistant) is only as good as the data that powers it. Training these programs to understand what you are saying requires a whole lot of real-world examples of human speech.

That gives established voice recognition companies a built-in advantage: they have already amassed giant collections of sample speech data for training their algorithms. A startup hoping to compete in this arena would have to acquire its own set of voice audio files, perhaps from an existing archive such as the roughly 300-hour corpus built from TED Talk transcriptions.

Google’s recordings were collected as part of the AIY do-it-yourself artificial intelligence program, designed to enable makers to experiment with machine learning.

Amazon’s Alexa transmits voice queries from its users to a server, where they’re used to further train the tool. Apple teaches Siri new languages and dialects by hiring speakers to read particular passages of known text, and by having humans transcribe snippets of audio from the service’s speech-to-text dictation mode. Microsoft has reportedly set up simulated apartments around the world to grab audio snippets in a homelike setting to train its Cortana digital assistant.

