Fourth European Conference on Speech Communication and Technology

Madrid, Spain
September 18-21, 1995

Stress Independent Robust HMM Speech Recognition using Neural Network Stress Classification

Brian D. Womack, John H. L. Hansen

Robust Speech Processing Laboratory, Duke University Department of Electrical Engineering, Durham, North Carolina, USA

It is well known that variability in speech production due to task-induced stress contributes significantly to loss in speech recognition performance [6, 8]. If an algorithm could be formulated to estimate the speech stress condition, that knowledge could be integrated to improve the robustness of speech recognizers in adverse conditions. This paper addresses the problem of automatic recognition of stressed speech. The primary goal is to formulate a tandem HMM and neural network based algorithm for stress-independent recognition. To motivate an effective stress classifier, an analysis is performed of speech produced across eleven stress conditions (e.g. Angry, Clear, Fast, Lombard, Loud, Slow, Soft). Features that differentiate stress are selected using a previously established stressed speech database (SUSAS). A neural network algorithm is formulated to estimate a speech stress condition probability vector (with classification rates on the order of 59-100%). This probability vector is used to weight the outputs of a codebook of stress-dependent HMM recognizers to generate an improved overall recognition score (a 6-11% improvement over neutral or multi-style trained recognition systems). It is suggested that this approach will accommodate the intra-speaker variability due to task-induced stress in adverse conditions.
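The weighting step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact combination rule, so the code below assumes a simple linear weighted sum of per-word scores (for example, log-likelihoods) from each stress-dependent recognizer. The function names `combine_scores` and `recognize`, the stress labels, the vocabulary words, and all numbers are hypothetical and are not taken from the paper.

```python
from typing import Dict


def combine_scores(stress_probs: Dict[str, float],
                   hmm_scores: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Weight each stress-dependent recognizer's word scores by the
    estimated probability of that stress condition.

    ASSUMPTION: a linear weighted sum; the paper's actual rule may differ.
    """
    total = sum(stress_probs.values())
    if total <= 0:
        raise ValueError("stress probabilities must sum to a positive value")

    combined: Dict[str, float] = {}
    for stress, prob in stress_probs.items():
        weight = prob / total  # normalize so the weights sum to 1
        for word, score in hmm_scores[stress].items():
            combined[word] = combined.get(word, 0.0) + weight * score
    return combined


def recognize(stress_probs: Dict[str, float],
              hmm_scores: Dict[str, Dict[str, float]]) -> str:
    """Return the word with the highest combined score."""
    combined = combine_scores(stress_probs, hmm_scores)
    return max(combined, key=combined.get)


# Hypothetical stress classifier output for one utterance.
stress_probs = {"neutral": 0.2, "lombard": 0.8}

# Hypothetical per-word scores from two stress-dependent HMM recognizers.
hmm_scores = {
    "neutral": {"help": -10.0, "halt": -8.0},
    "lombard": {"help": -6.0, "halt": -12.0},
}

best_word = recognize(stress_probs, hmm_scores)
```

In this made-up example the neutral-trained recognizer alone would prefer "halt", but because the classifier assigns most of the probability to the Lombard condition, the combined score favors "help".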


Bibliographic reference. Womack, Brian D. / Hansen, John H. L. (1995): "Stress independent robust HMM speech recognition using neural network stress classification", in EUROSPEECH-1995, 1999-2002.