Ninth International Conference on Spoken Language Processing

Pittsburgh, PA, USA
September 17-21, 2006

Nonlinear Dynamical Invariants for Speech Recognition

S. Prasad, S. Srinivasan, M. Pannuri, G. Lazarou, Joseph Picone

Mississippi State University, USA

There is growing interest in modeling nonlinear behavior in the speech signal, particularly for applications such as speech recognition. Conventional tools for analyzing speech data use information from the power spectral density of the time series, and hence are restricted to the first two moments of the data. These moments do not provide a sufficient representation of a signal with strong nonlinear properties. In this paper, we investigate the use of features, known as invariants, that measure the nonlinearity in a signal. We analyze three popular measures: Lyapunov exponents, Kolmogorov entropy, and correlation dimension. These measures quantify the presence and extent of chaos in the underlying system that generated the observable. Using the Kullback-Leibler divergence measure, we show that these invariants can discriminate between broad phonetic classes on a simple database of sustained vowels. These features show promise for improving the robustness of speech recognition systems in noisy environments.
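
As a rough illustration of the kind of computation involved, the sketch below shows a time-delay embedding, the correlation sum from which a correlation dimension is commonly estimated (as the slope of log C(r) against log r, following Grassberger and Procaccia), and a discrete Kullback-Leibler divergence. It is not the authors' implementation: the function names, the embedding parameters, and the synthetic sine-wave input are all assumptions chosen for illustration, and the abstract does not specify how the invariants or the divergence were actually computed.

```python
import numpy as np


def delay_embed(x, dim, tau):
    """Time-delay embedding: row i is [x[i], x[i+tau], ..., x[i+(dim-1)*tau]]."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (dim - 1) * tau
    if n <= 0:
        raise ValueError("series too short for this dim and tau")
    return np.stack([x[i * tau : i * tau + n] for i in range(dim)], axis=1)


def correlation_sum(points, r):
    """Fraction of distinct point pairs closer than r (Euclidean distance)."""
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    upper = np.triu_indices(len(points), k=1)  # each pair counted once
    return float(np.mean(dists[upper] < r))


def kl_divergence(p, q, eps=1e-12):
    """Discrete KL divergence D(p || q) in nats; eps guards against log(0)."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))


# Hypothetical usage on a synthetic signal (assumed parameters, not from the paper).
signal = np.sin(0.1 * np.arange(500))
embedded = delay_embed(signal, dim=3, tau=5)
c_small = correlation_sum(embedded, r=0.1)
c_large = correlation_sum(embedded, r=0.5)
```

In practice the correlation sum would be evaluated over a range of radii and embedding dimensions, and the divergence would be computed between feature distributions estimated per phonetic class; both choices are left open here.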

Bibliographic reference. Prasad, S. / Srinivasan, S. / Pannuri, M. / Lazarou, G. / Picone, Joseph (2006): "Nonlinear dynamical invariants for speech recognition", in INTERSPEECH-2006, paper 1799-Thu2BuP.11.