Voice recognition is a must-have on smartphones these days. To improve it further, Google has introduced an all-neural, on-device speech recognition system in Gboard.
Until now, Google relied on cloud-based processing. The older system used a “decoder graph,” a component of the algorithm that matches spoken words to written words, and it consumed 2GB of storage.
Recognition was also slow: whatever users said had to travel to Google's servers and be processed there before any text came back, so latency was pretty high.
The new speech recognizer takes the entire process offline: voice recognition happens on your device and is therefore a whole lot faster.
Using RNN transducer (RNN-T) technology, Google was able to compress its speech recognition model to just 80MB, small enough to fit easily on a smartphone.
For those who don’t know, RNN-T is a type of sequence-to-sequence model. Unlike most such models, which need to process the entire input sequence before producing an output, an RNN-T continuously processes input samples and produces output symbols as it goes.
As a result, the new system recognizes speech character by character (it transcribes as you speak), which makes voice-to-text conversion both quicker and more accurate.
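To make the "transcribes as you speak" idea concrete, here is a minimal Python sketch of a greedy, frame-by-frame transducer loop. It is not Google's implementation: the `Frame` type, the `toy_joint` function (a stand-in for the trained neural networks), and the `microphone` generator are all hypothetical names invented for illustration. The point is only the control flow: for each incoming audio frame, the decoder emits zero or more characters and then moves on, without waiting for the rest of the audio.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List

BLANK = ""  # special symbol meaning "emit nothing more, move to the next frame"


@dataclass
class Frame:
    """Toy audio frame (hypothetical; real frames are acoustic features)."""
    offset: int  # number of characters spoken before this frame
    chars: str   # characters "heard" in this frame (may be empty for silence)


def toy_joint(frame: Frame, history: List[str]) -> str:
    """Hypothetical stand-in for the trained networks.

    Given the current frame and the characters emitted so far,
    return the next character, or BLANK if the frame is exhausted.
    """
    i = len(history) - frame.offset
    return frame.chars[i] if 0 <= i < len(frame.chars) else BLANK


def greedy_transduce(
    frames: Iterable[Frame],
    joint: Callable[[Frame, List[str]], str] = toy_joint,
    max_symbols_per_frame: int = 8,
) -> Iterator[str]:
    """Yield characters as soon as each frame has been processed."""
    history: List[str] = []  # everything emitted so far
    for frame in frames:
        for _ in range(max_symbols_per_frame):
            symbol = joint(frame, history)
            if symbol == BLANK:
                break  # nothing more for this frame; wait for the next one
            history.append(symbol)
            yield symbol  # available to the caller immediately


def microphone() -> Iterator[Frame]:
    """Simulated audio stream: speech, a short silence, more speech."""
    yield Frame(0, "he")
    yield Frame(2, "")
    yield Frame(2, "llo")


if __name__ == "__main__":
    for ch in greedy_transduce(microphone()):
        print(ch, end="", flush=True)
    print()
```

Because `greedy_transduce` is a generator, the first character is handed to the caller after only the first frame has been read. A model that needs the whole utterance before producing anything could not behave this way.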
However, this functionality is currently available only in Gboard on Pixel phones, and it supports only American English.
Google is expected to eventually bring the new voice recognition system to more languages and to non-Pixel smartphones as well.