Ninth International Conference on Spoken Language Processing

Pittsburgh, PA, USA
September 17-21, 2006

Speech Analyzer Using a Joint Estimation Model of Spectral Envelope and Fine Structure

Hirokazu Kameoka, Jonathan Le Roux, Nobutaka Ono, Shigeki Sagayama

University of Tokyo, Japan

We have been developing a new speech analyzer, aimed at practical human-machine interfaces, based on a parametric representation of speech governed by the F0 parameter. Precisely estimating the frequency response of the vocal tract from a real speech signal requires an accurate estimate of the power of each harmonic component, which in turn requires a high-precision estimate of F0. Conversely, under the empirical constraint that speech spectral envelopes are usually smooth in the power domain, half-pitch errors can largely be avoided. F0 and the envelope should therefore be estimated jointly, through an optimal estimation of both the spectral envelope and the spectral fine structure, rather than separately. In this article, we introduce a new speech analysis method that uses a spectral model formed as a composite function of envelope and fine structure models.
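
To make the idea of a composite spectral model concrete, the following is a minimal sketch of one possible reading, not the formulation used in the paper. It assumes that the envelope is a weighted sum of wide Gaussian bumps, that the fine structure is a set of narrow Gaussian peaks at integer multiples of F0, and that the power of each harmonic is the envelope evaluated at that harmonic's frequency. All function names, parameter names, and default values here are hypothetical.

```python
import math


def envelope(freq_hz, centers_hz, weights, width_hz=300.0):
    """Smooth envelope: weighted sum of wide Gaussian bumps (assumed form)."""
    return sum(
        w * math.exp(-0.5 * ((freq_hz - c) / width_hz) ** 2)
        for c, w in zip(centers_hz, weights)
    )


def model_power_spectrum(freq_hz, f0_hz, centers_hz, weights,
                         n_harmonics=20, peak_width_hz=10.0):
    """Composite model: narrow peaks at k * F0, each scaled by envelope(k * F0)."""
    total = 0.0
    for k in range(1, n_harmonics + 1):
        harmonic_hz = k * f0_hz
        # Harmonic power is the envelope sampled at the harmonic frequency,
        # so F0 and the envelope parameters are coupled in a single model.
        power = envelope(harmonic_hz, centers_hz, weights)
        total += power * math.exp(
            -0.5 * ((freq_hz - harmonic_hz) / peak_width_hz) ** 2
        )
    return total


# Illustrative values only: two envelope bumps and F0 = 200 Hz.
centers = [500.0, 1500.0]
weights = [1.0, 0.5]
at_harmonic = model_power_spectrum(400.0, 200.0, centers, weights)
between_harmonics = model_power_spectrum(300.0, 200.0, centers, weights)
```

In a model of this kind, a candidate F0 of half the true value places peaks at every observed harmonic and also midway between them. Matching the observed spectrum would then require the envelope to alternate between high and low values at adjacent harmonics, which a smooth envelope cannot do. This is one way to see why a smoothness constraint on the envelope discourages half-pitch errors.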


Bibliographic reference. Kameoka, Hirokazu / Le Roux, Jonathan / Ono, Nobutaka / Sagayama, Shigeki (2006): "Speech analyzer using a joint estimation model of spectral envelope and fine structure", in INTERSPEECH-2006, paper 1641-Thu2BuP.7.