A bank of filters has been implemented digitally to obtain, with running speech as input, energy values within well-defined, Gaussian-shaped frequency-time windows. The analysis concentrates on the correlation between the dB outputs of pairs of different windows, with the frequency spacing and/or the time spacing between two such windows as parameters. The resulting correlation patterns reflect, in a global way, the statistics of the dynamic characteristics of running speech in both the frequency and the time domain. Various aspects of such correlation patterns are considered briefly, illustrating interesting relations with some basic features of hearing and speech intelligibility. The main issue concerns the possible usefulness of this global measure for speech quality assessment. It is found that the correlation patterns derived from natural speech have a typical structure, providing a basis for judging the degree of "naturalness" of a token of synthetic speech.
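The measure described above can be illustrated with a minimal sketch, assuming NumPy. It is not the authors' implementation: the function names, the FFT-based Gaussian weighting, and the parameters `sigma_f` (window width in Hz) and `sigma_t` (window width in seconds) are hypothetical choices, since the abstract does not specify the filter bank. The sketch computes the dB energy in one Gaussian frequency-time window as a function of time, then the correlation between two such outputs at a given time spacing.

```python
import numpy as np

def gaussian_band_level_db(x, fs, fc, sigma_f, sigma_t):
    """dB energy of x over time in a Gaussian frequency-time window at fc (Hz).

    sigma_f (Hz) and sigma_t (s) are assumed window widths, not values
    taken from the paper.
    """
    n = len(x)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    # Gaussian weighting along the frequency axis
    weight = np.exp(-0.5 * ((freqs - fc) / sigma_f) ** 2)
    band = np.fft.irfft(np.fft.rfft(x) * weight, n=n)
    # Gaussian smoothing of the instantaneous power along the time axis
    half = int(4 * sigma_t * fs)
    t = np.arange(-half, half + 1) / fs
    kernel = np.exp(-0.5 * (t / sigma_t) ** 2)
    kernel /= kernel.sum()
    energy = np.convolve(band ** 2, kernel, mode="same")
    # Small offset avoids log(0) in silent stretches
    return 10.0 * np.log10(energy + 1e-12)

def lagged_correlation(a, b, lag):
    """Pearson correlation between a[t] and b[t + lag]; lag in samples, >= 0."""
    if lag > 0:
        a, b = a[:-lag], b[lag:]
    return float(np.corrcoef(a, b)[0, 1])

# Usage with white noise as a stand-in for running speech
rng = np.random.default_rng(0)
fs = 8000
x = rng.standard_normal(fs)  # one second of noise
level_a = gaussian_band_level_db(x, fs, fc=500.0, sigma_f=50.0, sigma_t=0.01)
level_b = gaussian_band_level_db(x, fs, fc=1500.0, sigma_f=50.0, sigma_t=0.01)
r = lagged_correlation(level_a, level_b, lag=40)  # 5 ms time spacing
```

Repeating the last step over a grid of frequency spacings and time spacings would yield a correlation pattern of the kind the abstract refers to.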
Bibliographic reference. Houtgast, Tammo / Verhave, Jan A. (1991): "A physical approach to speech quality assessment: correlation patterns in the speech spectrogram", In EUROSPEECH-1991, 285-288.