13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Noise Robust Pitch Tracking by Subband Autocorrelation Classification

Byung Suk Lee (1), Daniel P. W. Ellis (1,2)

(1) LabROSA, Columbia University, New York, NY, USA
(2) International Computer Science Institute, Berkeley, CA, USA

Pitch tracking algorithms have a long history in applications such as speech coding and information extraction, as well as in other domains such as bioacoustics and music signal processing. While autocorrelation is a useful technique for detecting periodicity, autocorrelation peaks are ambiguous, leading to the classic "octave error" in pitch tracking. Moreover, additive noise can affect autocorrelation in ways that are difficult to model. Instead of explicitly using the most obvious features of autocorrelation, we present a trained classifier-based approach which we call Subband Autocorrelation Classification (SAcC). A multi-layer perceptron classifier is trained on the principal components of the autocorrelations of subbands from an auditory filterbank. Training on bandlimited and noisy speech (processed to simulate a low-quality radio channel) leads to a large improvement in performance over state-of-the-art algorithms, according to both the traditional Gross Pitch Error (GPE) measure and a proposed novel Pitch Tracking Error, which more fully reflects the accuracy of both pitch extraction and voicing detection in a single measure.
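The peak ambiguity behind the "octave error" can be illustrated with a small sketch. This is not the SAcC method itself, only a textbook normalized autocorrelation applied to a synthetic periodic signal. The sample rate, the period, the three-harmonic test signal, and the helper name `normalized_autocorrelation` are all assumptions chosen for illustration.

```python
import math


def normalized_autocorrelation(x, lag):
    """Normalized autocorrelation of x at one lag (textbook definition, not SAcC)."""
    n = len(x) - lag
    num = sum(x[i] * x[i + lag] for i in range(n))
    energy_a = sum(x[i] ** 2 for i in range(n))
    energy_b = sum(x[i + lag] ** 2 for i in range(n))
    return num / math.sqrt(energy_a * energy_b)


# Assumed values for illustration only.
SAMPLE_RATE = 8000   # Hz
PERIOD = 50          # samples, i.e. a 160 Hz fundamental

# Synthetic periodic signal: three harmonics with 1/h amplitudes (hypothetical).
signal = [
    sum(math.sin(2 * math.pi * h * n / PERIOD) / h for h in (1, 2, 3))
    for n in range(800)
]

r_true = normalized_autocorrelation(signal, PERIOD)        # true period
r_double = normalized_autocorrelation(signal, 2 * PERIOD)  # twice the period
r_half = normalized_autocorrelation(signal, PERIOD // 2)   # half the period

print("lag = period:      ", round(r_true, 4), "->", SAMPLE_RATE / PERIOD, "Hz")
print("lag = 2 * period:  ", round(r_double, 4), "->", SAMPLE_RATE / (2 * PERIOD), "Hz")
print("lag = period / 2:  ", round(r_half, 4))
```

For a clean periodic signal, the lags at the period and at twice the period both score essentially 1, so peak height alone cannot separate the true pitch from the candidate one octave below. Noise and channel distortion perturb these peaks further, which is the motivation the abstract gives for a trained classifier over hand-picked peak features.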

Index Terms: speech, pitch tracking, machine learning, subband, autocorrelation, principal components


Bibliographic reference. Lee, Byung Suk / Ellis, Daniel P. W. (2012): "Noise robust pitch tracking by subband autocorrelation classification", In INTERSPEECH-2012, 707-710.