4th International Conference on Spoken Language Processing

Philadelphia, PA, USA
October 3-6, 1996

Extracting Speech Features from Human Speech-like Noise

Daisuke Kobayashi, Shoji Kajita, Kazuya Takeda, Fumitada Itakura

Graduate School of Engineering, Nagoya University, Chikusa-ku, Nagoya, Japan

Human speech-like noise (HSLN) is a kind of babble noise generated by superimposing independent speech signals, typically more than one thousand of them. As the number of superpositions grows, the basic character of HSLN changes from that of overlapped speech to that of stationary noise, while the shape of the long-term spectrum stays the same. We therefore use HSLN with various numbers of superpositions to investigate the perceptual discrimination of speech from stationary noise and its acoustic correlates. First, we confirm through subjective tests that the perceptual score, i.e. how much the HSLN sounds like stationary noise, is proportional to the number of superpositions. Then, we show that the amplitude distribution of the difference signal of HSLN approaches the Gaussian distribution from the Gamma distribution as the number of superpositions increases. A second subjective test, in which listeners judged three kinds of HSLN with different dynamic characteristics, clarifies that the temporal change of the spectral envelope plays an important role in discriminating speech from noise.
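The superposition step and the drift toward a Gaussian amplitude distribution described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: real speech recordings are replaced by a hypothetical surrogate (independent two-sided Gamma samples), the "difference signal" is assumed to be the first-order sample difference, and excess kurtosis is used as an assumed stand-in measure of closeness to a Gaussian (it is zero for a Gaussian). The names `surrogate_speech`, `make_hsln`, and `excess_kurtosis` are introduced here for illustration only.

```python
import numpy as np


def surrogate_speech(n_samples, rng):
    """Hypothetical stand-in for one speech signal (assumption, not real speech):
    i.i.d. samples with Gamma-distributed magnitude and a random sign."""
    magnitude = rng.gamma(shape=0.5, scale=1.0, size=n_samples)
    sign = rng.choice([-1.0, 1.0], size=n_samples)
    return magnitude * sign


def make_hsln(n_superpositions, n_samples, rng):
    """Superimpose independent surrogate signals and rescale to constant power."""
    total = np.zeros(n_samples)
    for _ in range(n_superpositions):
        total += surrogate_speech(n_samples, rng)
    return total / np.sqrt(n_superpositions)


def excess_kurtosis(x):
    """Fourth standardized moment minus 3; zero for a Gaussian distribution."""
    centered = x - x.mean()
    return np.mean(centered ** 4) / np.mean(centered ** 2) ** 2 - 3.0


rng = np.random.default_rng(0)
kurtosis_by_count = {}
for count in (1, 10, 100, 1000):
    hsln = make_hsln(count, 20000, rng)
    difference_signal = np.diff(hsln)  # assumed: first-order sample difference
    kurtosis_by_count[count] = excess_kurtosis(difference_signal)
    print(count, round(kurtosis_by_count[count], 3))
```

In this sketch the excess kurtosis of the difference signal shrinks toward zero as the number of superpositions grows, which is the central-limit behaviour the abstract reports. Because the surrogate has no temporal structure, it says nothing about the spectral-envelope dynamics examined in the second subjective test.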

Bibliographic reference.  Kobayashi, Daisuke / Kajita, Shoji / Takeda, Kazuya / Itakura, Fumitada (1996): "Extracting speech features from human speech-like noise", In ICSLP-1996, 418-421.