In July of last year, Google announced two new APIs for its Google Cloud Platform customers that deal with natural language and speech recognition: the Cloud Natural Language API and the Cloud Speech API. Both were made available in beta form to a limited number of developers. Beyond the restricted access, the functionality of these APIs was also quite limited, as Google was still working to enhance and optimize their capabilities.
The Cloud Speech API is Google’s own Automatic Speech Recognition (ASR) service, and it powers the speech recognition capabilities of a number of Google’s products, such as Google Search, Google Now and Google Assistant. Google adapted that technology to fit the needs of Google Cloud customers, and this was how the Cloud Speech API was born. Earlier this week, Google not only made the API generally available to developers but also announced a big update for it.
This update includes a couple of enhancements to existing features and adds support for some new file types. Some developers had complained that transcription of long-form audio wasn’t very accurate, and Google says this update improves accuracy on that front. The service is also faster in certain cases, with developers seeing it run 3 times as fast as before in batch scenarios. The last highlight of this update is added support for the WAV, Opus and Speex file formats.
Since the launch of the Google Cloud Speech API, the company has seen a couple of popular use cases for the service. Naturally, a number of developers have adopted it to add voice search, voice commands and Interactive Voice Response (IVR) to their products. But Google is also seeing it used for speech analytics, which has proven useful for businesses that are looking for real-time insights from their call centers.