Ninth International Conference on Spoken Language Processing

Pittsburgh, PA, USA
September 17-21, 2006

A Spectral-Temporal Method for Pitch Tracking

Stephen A. Zahorian, Princy Dikshit, Hongbing Hu

Old Dominion University, USA

This paper describes a new spectral/temporal method for robust pitch tracking of both high-quality and telephone speech. A previous version of the algorithm was presented as YAAPT (Kasi and Zahorian, 2002) [10]. The current paper presents a novel method for spectral pitch tracking that uses nonlinear processing to partially restore a potentially missing fundamental frequency. A frequency-domain modified autocorrelation is used to determine the spacing between harmonic peaks in the spectrum. The resulting spectral pitch track is then used to refine time-domain pitch candidates obtained with the Normalized Cross Correlation Function (NCCF) reported by Talkin [1]. Dynamic programming is used to find the "best" pitch track among all the candidates, using both local and transition costs. The algorithm was evaluated using the Keele pitch extraction reference database.
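
To make the time-domain candidate step concrete, the following is a minimal sketch of a normalized cross-correlation computed for a single frame. It is not the authors' implementation. The function names (nccf, best_lag), the frame length, the lag range, and the synthetic test signal are all assumptions chosen for illustration, and the spectral refinement and dynamic programming stages are omitted.

```python
import math


def nccf(signal, window, max_lag):
    """Normalized cross-correlation of signal[0:window] with lagged copies.

    Returns a list r where r[k] is the correlation at lag k, for
    k = 0 .. max_lag. Each value lies in [-1, 1].
    """
    if len(signal) < window + max_lag:
        raise ValueError("signal must hold at least window + max_lag samples")
    ref = signal[:window]
    ref_energy = sum(x * x for x in ref)
    values = []
    for lag in range(max_lag + 1):
        seg = signal[lag:lag + window]
        seg_energy = sum(x * x for x in seg)
        cross = sum(a * b for a, b in zip(ref, seg))
        denom = math.sqrt(ref_energy * seg_energy)
        # Guard against silent frames, where both energies are zero.
        values.append(cross / denom if denom > 0.0 else 0.0)
    return values


def best_lag(values, min_lag):
    """Return the lag >= min_lag with the largest correlation value."""
    return max(range(min_lag, len(values)), key=lambda k: values[k])


# Hypothetical test signal: a 100 Hz sinusoid sampled at 8 kHz,
# so one pitch period spans 80 samples.
fs = 8000
f0 = 100.0
x = [math.sin(2.0 * math.pi * f0 * n / fs) for n in range(400)]

# Search lags 20..120 samples, i.e. roughly 67 Hz to 400 Hz at 8 kHz.
r = nccf(x, window=200, max_lag=120)
lag = best_lag(r, min_lag=20)
estimated_f0 = fs / lag
```

For this synthetic sinusoid the correlation peak falls at a lag of 80 samples, which corresponds to 100 Hz. On real speech a single frame typically yields several competing peaks, which is why the method described above keeps multiple candidates per frame and resolves them later with the spectral track and dynamic programming.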

