Robust Method for Estimating F0 of Complex Tone Based on Pitch Perception of Amplitude Modulated Signal

Kenichiro Miwa, Masashi Unoki


Estimating the fundamental frequency (F0) of a target sound in noisy reverberant environments is a challenging problem in sound analysis/synthesis as well as in sound enhancement. This paper proposes a method for robustly and accurately estimating the F0 of a time-variant complex tone on the basis of an amplitude modulation/demodulation technique. The method is based on the mechanism of pitch perception of amplitude-modulated signals and on the framework of power envelope restoration based on the concept of the modulation transfer function. Computer simulations were carried out to evaluate the accuracy and robustness of the proposed method for estimating the F0 in heavily noisy and reverberant environments. The comparative results revealed that the percentage correct rates of the F0s estimated by five recent methods (TEMPO2, YIN, PHIA, CmpCep, and SWIPE’) dropped drastically as the SNR decreased and the reverberation time increased. The results also demonstrated that the proposed method robustly and accurately estimated the F0 in both heavily noisy and reverberant environments.
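
The abstract does not give the algorithm itself, so the following is only an illustrative sketch of the underlying idea, not the authors' method. It assumes a clean, time-invariant complex tone made of a few high harmonics and shows that the power envelope of such a tone fluctuates at the F0, so that the F0 can be read off from the envelope's periodicity. The function names (`complex_tone`, `power_envelope`, `estimate_f0_from_envelope`), the moving-average smoother, the autocorrelation peak picking, and all parameter values are assumptions made for illustration. Noise, reverberation, and the power envelope restoration step based on the modulation transfer function are not modeled.

```python
import math

def complex_tone(f0, harmonics, fs, duration):
    """Sum of equal-amplitude sinusoids at the given harmonic numbers of f0."""
    n = int(fs * duration)
    return [sum(math.sin(2 * math.pi * h * f0 * t / fs) for h in harmonics)
            for t in range(n)]

def power_envelope(x, smooth_len):
    """Square the signal, then smooth it with a moving average (crude low-pass)."""
    power = [v * v for v in x]
    acc = sum(power[:smooth_len])
    out = [acc / smooth_len]
    for i in range(smooth_len, len(power)):
        acc += power[i] - power[i - smooth_len]
        out.append(acc / smooth_len)
    return out

def estimate_f0_from_envelope(x, fs, fmin=80.0, fmax=400.0, smooth_len=8):
    """Pick the lag that maximizes the autocorrelation of the power envelope."""
    env = power_envelope(x, smooth_len)
    mean = sum(env) / len(env)
    env = [v - mean for v in env]  # remove DC so the peak reflects periodicity
    best_lag, best_val = None, float("-inf")
    # Search only the lags that correspond to the assumed F0 range [fmin, fmax].
    for lag in range(int(fs / fmax), int(fs / fmin) + 1):
        val = sum(env[i] * env[i + lag] for i in range(len(env) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return fs / best_lag

if __name__ == "__main__":
    fs = 16000
    # Harmonics 10 to 12 of 200 Hz: the tone has no energy at the F0 itself.
    tone = complex_tone(200.0, [10, 11, 12], fs, 0.1)
    print(estimate_f0_from_envelope(tone, fs))
```

In this toy case the tone contains only components at 2000, 2200, and 2400 Hz, yet the envelope-based estimate should land close to 200 Hz, because squaring the signal produces difference-frequency components at the harmonic spacing. The unnormalized autocorrelation sum favors shorter lags slightly, which is a simple design choice here to avoid picking a multiple of the true period.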


 DOI: 10.21437/Interspeech.2017-1061

Cite as: Miwa, K., Unoki, M. (2017) Robust Method for Estimating F0 of Complex Tone Based on Pitch Perception of Amplitude Modulated Signal. Proc. Interspeech 2017, 2311-2315, DOI: 10.21437/Interspeech.2017-1061.


@inproceedings{Miwa2017,
  author={Kenichiro Miwa and Masashi Unoki},
  title={Robust Method for Estimating F0 of Complex Tone Based on Pitch Perception of Amplitude Modulated Signal},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={2311--2315},
  doi={10.21437/Interspeech.2017-1061},
  url={http://dx.doi.org/10.21437/Interspeech.2017-1061}
}