4th International Conference on Spoken Language Processing
Philadelphia, PA, USA
It is well known that LSP (line spectrum pair) coefficients, a linear prediction representation of the speech spectral envelope, interpolate well along the time axis, but it is also known that the interpolation interval is limited to about 20 to 30 ms. This limitation makes it difficult to reduce the bit rate in very low bit rate speech coding. To resolve this problem, recurrent neural networks (RNNs) were applied to the interpolation of LSP coefficients, which made it possible to extend the interpolation interval to about 100 ms without much degradation of the synthesized speech quality.
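For orientation, the conventional kind of interpolation that the abstract contrasts with can be sketched as a straight-line blend between two transmitted LSP vectors. This is only a minimal illustration of plain linear interpolation, not the RNN interpolator of the paper. The function name `interpolate_lsp`, the 20 ms frame step, the 4th-order vectors, and the sample values are all assumptions chosen for the example.

```python
def interpolate_lsp(lsp_start, lsp_end, num_inner):
    """Linearly interpolate LSP vectors between two transmitted anchor frames.

    Returns `num_inner` intermediate vectors, excluding the anchors themselves.
    Illustrative baseline only; the paper replaces this step with an RNN.
    """
    if len(lsp_start) != len(lsp_end):
        raise ValueError("anchor LSP vectors must have the same order")
    frames = []
    for k in range(1, num_inner + 1):
        t = k / (num_inner + 1)  # fractional position between the two anchors
        frames.append([(1 - t) * a + t * b for a, b in zip(lsp_start, lsp_end)])
    return frames


# Hypothetical anchors 100 ms apart with an assumed 20 ms frame step,
# which leaves 4 frames to reconstruct between them.
anchor_a = [0.3, 0.6, 1.2, 2.0]  # made-up LSP frequencies in radians, ascending
anchor_b = [0.5, 0.9, 1.5, 2.4]
inner_frames = interpolate_lsp(anchor_a, anchor_b, num_inner=4)
```

Because each interpolated vector is a convex combination of two ascending vectors, it stays ascending, so the ordering property that LSP coefficients need for a stable synthesis filter is preserved. What a straight line cannot follow is spectral movement between widely spaced anchors, which is the limitation the abstract describes.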
Bibliographic reference. Kohata, Minoru (1996): "An application of recurrent neural networks to low bit rate speech coding", In ICSLP-1996, 314-317.