INTERSPEECH 2006 - ICSLP
There is growing interest in modeling nonlinear behavior in the speech signal, particularly for applications such as speech recognition. Conventional tools for analyzing speech data use information from the power spectral density of the time series, and hence are restricted to the first two moments of the data. These moments do not provide a sufficient representation of a signal with strong nonlinear properties. In this paper, we investigate the use of features, known as invariants, that measure the nonlinearity in a signal. We analyze three popular measures: Lyapunov exponents, Kolmogorov entropy, and correlation dimension. These measures quantify the presence (and extent) of chaos in the underlying system that generated the observable. Using the Kullback-Leibler divergence measure, we show that these invariants can discriminate between broad phonetic classes on a simple database of sustained vowels. These features show promise in improving the robustness of speech recognition systems in noisy environments.
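As an illustration of how a divergence measure can express class separability, the following is a minimal sketch of a discrete Kullback-Leibler divergence between two histograms of a single feature. It is not the procedure used in the paper, which the abstract does not describe; the function names, the `eps` floor, and the histogram counts are all hypothetical and chosen only for illustration.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Discrete KL divergence D(p || q), in nats, between two histograms.

    p and q are sequences of non-negative bin counts over the same bins.
    eps is an assumed floor that avoids log(0) when a bin of q is empty.
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    p_sum = float(sum(p))
    q_sum = float(sum(q))
    total = 0.0
    for p_count, q_count in zip(p, q):
        p_prob = p_count / p_sum
        q_prob = max(q_count / q_sum, eps)
        # Bins where p is zero contribute nothing to the sum.
        if p_prob > 0.0:
            total += p_prob * math.log(p_prob / q_prob)
    return total


def symmetric_kl(p, q):
    """Symmetrized divergence: D(p || q) + D(q || p)."""
    return kl_divergence(p, q) + kl_divergence(q, p)


# Hypothetical bin counts of one invariant (for example, a correlation
# dimension estimate) for two phonetic classes. These are not data
# from the paper.
class_a = [1, 4, 10, 4, 1]
class_b = [1, 1, 3, 10, 5]

separation = symmetric_kl(class_a, class_b)
```

A larger `separation` value indicates that the two class-conditional distributions of the feature overlap less; identical histograms give a value of zero.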
Bibliographic reference. Prasad, S. / Srinivasan, S. / Pannuri, M. / Lazarou, G. / Picone, Joseph (2006): "Nonlinear dynamical invariants for speech recognition", In INTERSPEECH-2006, paper 1799-Thu2BuP.11.