US Patent:
20170148431, May 25, 2017
Inventors:
- Sunnyvale CA, US
Jingdong Chen - Beijing, CN
Mike Chrzanowski - Sunnyvale CA, US
Erich Elsen - Mountain View CA, US
Jesse Engel - Oakland CA, US
Christopher Fougner - Palo Alto CA, US
Xu Han - Sunnyvale CA, US
Awni Hannun - Palo Alto CA, US
Ryan Prenger - Oakland CA, US
Sanjeev Satheesh - Sunnyvale CA, US
Shubhabrata Sengupta - Menlo Park CA, US
Dani Yogatama - Sunnyvale CA, US
Chong Wang - Redmond WA, US
Jun Zhan - San Jose CA, US
Zhenyao Zhu - Mountain View CA, US
Dario Amodei - San Francisco CA, US
Assignee:
Baidu USA LLC - Sunnyvale CA
International Classification:
G10L 15/06
G10L 25/18
G10L 15/197
G10L 15/16
Abstract:
Embodiments of end-to-end deep learning systems and methods are disclosed for recognizing speech in vastly different languages, such as English or Mandarin Chinese. In embodiments, entire pipelines of hand-engineered components are replaced with neural networks, and end-to-end learning allows the system to handle a wide variety of speech, including speech in noisy environments, accented speech, and speech in different languages. Using a trained embodiment and an embodiment of a batch dispatch technique with GPUs in a data center, an end-to-end deep learning system can be inexpensively deployed in an online setting, delivering low latency when serving users at scale.
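
The batch dispatch idea mentioned in the abstract can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the implementation claimed in the patent: it assumes a single worker (standing in for one GPU), a fixed processing time per batch regardless of batch size, and a cap on batch size. The function name `simulate_batch_dispatch` and its parameters are hypothetical and chosen for illustration only. Whenever the worker becomes free, it takes every request that has already arrived, up to the cap, and runs them together as one batch.

```python
from typing import List


def simulate_batch_dispatch(
    arrivals: List[float], service_time: float, max_batch: int
) -> List[List[int]]:
    """Group request indices into the batches a single worker would run.

    Assumptions (illustrative only): one worker, a fixed service_time per
    batch, and at most max_batch requests per batch.
    """
    # Process requests in order of arrival time.
    order = sorted(range(len(arrivals)), key=lambda i: arrivals[i])
    batches: List[List[int]] = []
    clock = 0.0  # time at which the worker next becomes free
    pos = 0
    while pos < len(order):
        # If nothing is waiting, the worker idles until the next arrival.
        clock = max(clock, arrivals[order[pos]])
        batch: List[int] = []
        # Take every request that has already arrived, up to the cap.
        while (
            pos < len(order)
            and len(batch) < max_batch
            and arrivals[order[pos]] <= clock
        ):
            batch.append(order[pos])
            pos += 1
        batches.append(batch)
        clock += service_time  # worker is busy while the batch runs
    return batches


if __name__ == "__main__":
    arrival_times = [0.0, 1.0, 2.0, 3.0, 12.0, 30.0]
    print(simulate_batch_dispatch(arrival_times, service_time=10.0, max_batch=4))
```

In this sketch, requests that arrive while the worker is busy accumulate and are served together, so batches grow under load and shrink to a single request when traffic is light. The batch-size cap is the knob that trades throughput against per-request latency.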