September 22-25, 1997
This paper addresses the problem of speech recognition over telephone networks. When the communication channel is unknown, the substantial mismatch between the training data and the signal encountered during recognition drastically degrades the performance of recognition systems. In this context, we compare a classical approach, noise compensation, with novel robust modelling architectures that aim to incorporate and manage more variability in the training data. We introduce multi-HMM and multi-transition systems, trained with data recorded through both the analog switched network and the cellular phone network. These architectures give the best results and improve recognizer robustness: they achieve up to a 77% reduction of the error rate for a system trained on the switched telephone network and used with a cellular phone. Nevertheless, this modelling requires training data recorded in both environments; when such data are not available, noise cancellation or channel compensation is the only viable solution.
Bibliographic reference. Puel, Jean-Baptiste / André-Obrecht, Régine (1997): "Cellular phone speech recognition: noise compensation vs. robust architectures", In EUROSPEECH-1997, 1151-1154.