First International Conference on Spoken Language Processing (ICSLP 90)

Kobe, Japan
November 18-22, 1990

Proposal and Evaluation of a New Scheme for Reliable Pitch Extraction of Speech

Hiroya Fujisaki, Keikichi Hirose, Shigenobu Seto

Faculty of Engineering, University of Tokyo, Tokyo, Japan

Analysis with a short frame length is necessary for correctly tracking the time-varying features of quasi-periodic signals such as speech. However, when the frame length is reduced in order to analyze rapidly changing signal characteristics, the results are strongly affected by the position of the frame, which can sometimes lead to gross errors. In pitch extraction schemes that use the conventional definition of the short-time autocorrelation function, the value of the peak indicating the fundamental period varies with the frame position. To reduce these variations, we present a new definition of the normalized short-time autocorrelation function that does not require the selection of a frame length. A new scheme for pitch extraction of speech is proposed that assures highly accurate results without adjusting the frame length for each speaker. The validity of the proposed scheme is confirmed by experiments using speech materials recorded by both male and female radio announcers.
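
The abstract does not state the new definition itself, so the following is only a minimal sketch of a generic normalized short-time autocorrelation (a normalized cross-correlation between a frame and its lag-shifted copy), not the authors' definition. Unlike the scheme described above, this sketch still takes an explicit frame length. The function names, parameters, and the synthetic test signal are all assumptions introduced for illustration.

```python
import math


def normalized_autocorrelation(x, start, frame_len, lag):
    """Generic normalized correlation between a frame and its lag-shifted copy.

    Illustrative only: this is a common textbook normalization, not the
    definition proposed in the paper.
    """
    a = x[start:start + frame_len]
    b = x[start + lag:start + lag + frame_len]
    if len(a) < frame_len or len(b) < frame_len:
        raise ValueError("signal too short for this frame position and lag")
    num = sum(p * q for p, q in zip(a, b))
    # Normalizing by the energy of both segments bounds the value to [-1, 1]
    # and makes the peak height less sensitive to where the frame starts.
    den = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
    return num / den if den > 0.0 else 0.0


def estimate_period(x, start, frame_len, min_lag, max_lag):
    """Return (lag, value) of the highest normalized correlation in the range."""
    best_lag, best_val = min_lag, -2.0
    for lag in range(min_lag, max_lag + 1):
        val = normalized_autocorrelation(x, start, frame_len, lag)
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag, best_val


if __name__ == "__main__":
    # Hypothetical test signal: fundamental period of 80 samples plus a
    # second harmonic (e.g. 100 Hz at an assumed 8 kHz sampling rate).
    period = 80
    signal = [
        math.sin(2 * math.pi * n / period)
        + 0.5 * math.sin(4 * math.pi * n / period)
        for n in range(400)
    ]
    for start in (0, 13, 37):  # different frame positions
        lag, val = estimate_period(signal, start, 120, 20, 120)
        print(start, lag, round(val, 6))
```

For this strictly periodic synthetic signal, the sketch should report a period of 80 samples at every frame position tried. Real speech is only quasi-periodic, so a practical extractor would also need a voicing decision and safeguards against half-period and double-period errors, which are not shown here.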

