This mixed speech fundamental frequency (f0) estimation algorithm extends the classical single-voice AMDF (Average Magnitude Difference Function) algorithm. An exhaustive search of the parameter space of two cascaded time-domain comb filters yields estimates of the periods of the component voices. The algorithm, which is computationally expensive but easily parallelizable, was tested on a database of continuous male and female speech. Segments of voiced speech were selected according to a "good periodicity" criterion, which ensured that the reference single-voice f0 algorithm would not fail and which rejected 25% of voiced speech frames. The selected segments were paired and summed to simulate mixed speech. The search was limited to a three-octave range and was performed frame by frame, without continuity constraints. The resulting estimates were compared to those of the reference algorithm and found to be within 3% of target values for 90% of all frames.
Keywords: speech, fundamental frequency, pitch extraction, mixed speech separation, noise reduction, cocktail-party effect.
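A minimal Python sketch of the exhaustive search described in the abstract may help. It assumes each comb filter has the form y(t) = x(t) - x(t - tau), that the cascade output is scored by its mean absolute value over a fixed window, and that lags are whole numbers of samples. None of these details come from the abstract, and the names `two_voice_amdf`, `min_lag`, `max_lag`, and `window` are hypothetical. The test signal is synthetic, not speech.

```python
import numpy as np


def two_voice_amdf(x, min_lag, max_lag, window):
    """Search all lag pairs of two cascaded comb filters (illustrative sketch).

    Returns the pair (tau_a, tau_b), with tau_a <= tau_b, that minimises the
    mean absolute output of the cascade over `window` samples.
    """
    x = np.asarray(x, dtype=float)
    start = 2 * max_lag  # earliest sample with full history for both filters
    if len(x) < start + window:
        raise ValueError("signal too short for the requested lags and window")
    t = np.arange(start, start + window)

    best_score, best_pair = np.inf, None
    for tau_a in range(min_lag, max_lag + 1):
        # First comb filter: cancels any component with period tau_a.
        d = np.zeros_like(x)
        d[tau_a:] = x[tau_a:] - x[:-tau_a]
        # The cascade is symmetric in the two lags, so search tau_b >= tau_a.
        for tau_b in range(tau_a, max_lag + 1):
            # Second comb filter applied to the output of the first.
            residual = d[t] - d[t - tau_b]
            score = np.mean(np.abs(residual))
            if score < best_score:
                best_score, best_pair = score, (tau_a, tau_b)
    return best_pair


# Synthetic "mixed" signal: two exactly periodic waveforms, summed.
# Periods of 50 and 73 samples are arbitrary choices for this demo.
rng = np.random.default_rng(0)
n_samples = 400
voice_a = np.resize(rng.standard_normal(50), n_samples)  # period 50
voice_b = np.resize(rng.standard_normal(73), n_samples)  # period 73
mixture = voice_a + voice_b

periods = two_voice_amdf(mixture, min_lag=40, max_lag=95, window=200)
print(periods)
```

For this exactly periodic input the search recovers the two periods. The lag range is deliberately narrow so that no multiple of either period falls inside it; a wider range would need some rule for choosing among lag pairs that cancel the mixture equally well. Each lag pair is scored independently of the others, so the double loop could be distributed across workers.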
Bibliographic reference. Cheveigne, Alain de (1991): "A mixed speech F0 estimation algorithm", in EUROSPEECH-1991, 445-448.