In Automatic Speech Recognition (ASR) research, it is conventional to pursue the approaches that reduce the word error rate. However, it is the author's belief that this seemingly sensible strategy often leads to the suppression of innovation. This happens when the leading approaches have been tuned for years, effectively optimizing for a local minimum in the space of all possible techniques. In that situation, almost any sufficiently new approach will necessarily hurt the accuracy of existing systems and thus increase the error rate. However, if progress is to be made against the remaining difficult problems, new approaches will most likely be necessary. In this paper, I discuss a few research issues for ASR which, when investigated, will most probably lead at first to a (significant) increase in error rate, but which hold some promise for ultimately improving performance in end applications. Some of the examples illustrate cases where ASR was successfully improved by first increasing error rates (in one case, to 130%!); others describe ongoing work (which is still increasing error rates); finally, some new directions are discussed. Problems addressed in this paper include: merging the language and acoustic models, the role of prior information in speech recognition, Markov models, discrimination, signal analysis and temporal information, and decoding procedures reflecting human perceptual properties.
Bibliographic reference. Bourlard, Hervé (1995): "Towards increasing speech recognition error rates", In EUROSPEECH-1995, 883-894.