Linear prediction (LP) analysis has proven very effective in speech analysis and speech synthesis applications, possibly because it implicitly captures the time-varying vocal tract area function. However, LP captures only the second-order statistical relationships, that is, only the linear dependencies among the samples of the speech signal and not the higher-order relations; as a result, the LP residual remains intelligible. This paper studies the effectiveness of nonlinear prediction (NLP) of the speech signal using the state-of-the-art Volterra-Wiener series, and uses a novel chaotic titration method to analyze the chaotic characteristics of the residuals obtained by the LP and NLP methods. The experimental results demonstrate that the proposed NLP approach yields a lower prediction error, a relatively flat residual spectrum, a lower PESQ score (PESQ being, to a certain extent, an objective evaluation of MOS), and less chaoticity than its LP counterpart. Finally, the L1 and L2 norms of the NLP residual were found to be relatively lower than those of the LP residual for five instances of voiced and unvoiced regions extracted from speakers in the TIMIT database.
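The comparison of LP and NLP residual norms described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a least-squares fit, a Volterra expansion truncated at second order, and a logistic-map sequence as a stand-in for a speech frame. The function names (`lagged_matrix`, `volterra_features`, `compare`, `demo_signal`) and the predictor order `p` are hypothetical choices made for illustration.

```python
import numpy as np

def lagged_matrix(x, p):
    """Row r holds [x[n-1], ..., x[n-p]] for n = p + r."""
    n_samples = len(x)
    return np.column_stack([x[p - k : n_samples - k] for k in range(1, p + 1)])

def volterra_features(lags):
    """Linear terms plus all second-order products x[n-i] * x[n-j], i <= j."""
    p = lags.shape[1]
    i, j = np.triu_indices(p)
    return np.hstack([lags, lags[:, i] * lags[:, j]])

def residual(features, target):
    """Least-squares prediction error for the given feature set."""
    coeffs, *_ = np.linalg.lstsq(features, target, rcond=None)
    return target - features @ coeffs

def compare(x, p=2):
    """Return L1 and L2 residual norms for the LP and second-order NLP fits."""
    lags = lagged_matrix(x, p)
    target = x[p:]
    e_lp = residual(lags, target)
    e_nlp = residual(volterra_features(lags), target)
    return {
        "lp_l1": np.sum(np.abs(e_lp)), "lp_l2": np.linalg.norm(e_lp),
        "nlp_l1": np.sum(np.abs(e_nlp)), "nlp_l2": np.linalg.norm(e_nlp),
    }

def demo_signal(n=400, r=3.9, x0=0.3):
    """Assumed test signal (not speech): logistic map x[k+1] = r*x[k]*(1-x[k])."""
    x = np.empty(n)
    x[0] = x0
    for k in range(n - 1):
        x[k + 1] = r * x[k] * (1.0 - x[k])
    return x

if __name__ == "__main__":
    for name, value in compare(demo_signal()).items():
        print(f"{name}: {value:.6g}")
```

Because the second-order feature set contains the linear lags as a subset, the in-sample L2 norm of the NLP residual in this sketch cannot exceed that of the LP residual. The sketch therefore only illustrates the mechanics of the comparison and does not reproduce the reported results on TIMIT speech.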
Bibliographic reference. Patil, Hemant A. / Patel, Tanvina B. (2013): "Nonlinear prediction of speech signal using Volterra-Wiener series", in INTERSPEECH-2013, 1687-1691.