Description:  Automatic Speech Recognition on Mobile Devices and over Communication Networks

  Z.-H. Tan and B. Lindberg (Eds.)


  Springer-Verlag, London, 2008. Print ISBN: 978-1-84800-142-8, Electronic ISBN: 978-1-84800-143-5.

  The book is available from Springer, Amazon (CA, DE, FR, JP, UK, US), and Barnes & Noble. An online version of this book is also available.




About this book

Remarkable advances in computing and networking have sparked enormous interest in deploying automatic speech recognition on mobile devices and over communication networks. This trend is accelerating.
This book brings together leading academic researchers and industrial practitioners to address the issues in this emerging realm, and presents the reader with a comprehensive introduction to speech recognition in devices and networks. It covers network, distributed and embedded speech recognition systems, which are expected to co-exist in the future. It offers a wide-ranging, unified approach to the topic and its latest developments, and also covers the most up-to-date standards and several off-the-shelf systems.

Table of Contents

List of Contributors

1. Network, Distributed and Embedded Speech Recognition: An Overview

Part I Network Speech Recognition

2. Speech Coding and Packet Loss Effects on Speech and Speaker Recognition
3. Speech Recognition over Mobile Networks
4. Speech Recognition over IP Networks

Part II Distributed Speech Recognition

5. Distributed Speech Recognition Standards
6. Speech Feature Extraction and Reconstruction
7. Quantization of Speech Features: Source Coding
8. Error Recovery: Channel Coding and Packetization
9. Error Concealment

Part III Embedded Speech Recognition

10. Algorithm Optimizations: Low Computational Complexity
11. Algorithm Optimizations: Low Memory Footprint
12. Fixed-Point Arithmetic

Part IV Systems and Applications

13. Software Architectures for Networked Mobile Speech Applications
14. Speech Recognition in Mobile Phones
15. Handheld Speech to Speech Translation System
16. Automotive Speech Recognition
17. Energy Aware Speech Recognition for Mobile Devices




AURORA Project Database: Aurora 2 (the Noisy TI digits database), Aurora 3 (a subset of the SpeechDat-Car database) and Aurora 4 (the Wall Street Journal (WSJ0) database and its noisy versions)

LDC - Linguistic Data Consortium. LDC supports language-related education, research and technology development by creating and sharing linguistic resources: data, tools and standards.

3GPP TS 26.243. ANSI-C code for the Fixed-Point Distributed Speech Recognition Extended Advanced Front-end, December 2004.


Zheng-Hua Tan and Miroslav Novak, "Speech Recognition on Mobile Devices: Distributed and Embedded Solutions," Tutorial at Interspeech 2008, Brisbane, Australia, 22-26 Sept. 2008.

Apple - iPhone 4S - Ask Siri to help you get things done. Voice recognition in the Cloud.

Promptu provides voice recognition-based search and navigation services for mobile phones. Promptu is based on a Distributed Speech Recognition (DSR) architecture. 

vlingo develops voice-enabled applications for mobile devices. vlingo technology includes two key components: speech recognition based on hierarchical language models, and adaptation.

Microsoft Response Point is VoIP PBX phone system software designed for small businesses with simplicity in mind. It offers a breakthrough voice-activated user interface and simplified setup and system management.

MASTAR is a Multi-lingual Advanced Speech and Text reseARch Project aimed at speech to speech translation on mobile devices.

voice compass - the book referring to the voice market. "The voice compass is the compendium for Information and Communications Technology (ICT) and looks at the domain of voice, i.e. speech applications on the telephone in full detail. Nowadays, every businessman, entrepreneur, manager or decision maker needs to be informed about the possibilities regarding voice applications. The voice compass gives a compact account thereof." The author of the book is Detlev Artelt.

Automatic speech recognition on GPU (Graphics Processing Unit); see examples through Google.

Cloud-based speech recognition; see examples through Google.

Here is a list of related journals and conferences.


Zheng-Hua Tan

Department of Electronic Systems, Aalborg University,
Niels Jernes Vej 12, 9220 Aalborg, Denmark