Ninth International Conference on Spoken Language Processing

Pittsburgh, PA, USA
September 17-21, 2006

Unsupervised Detection of Whispered Speech in the Presence of Normal Phonation

Michael A. Carlin (1), Brett Y. Smolenski (2), Stanley J. Wenndt (3)

(1) Temple University, USA; (2) Research Associates for Defense Conversion, USA; (3) Air Force Research Laboratory, USA

The results of an investigation into unsupervised detection of whispered speech segments in the presence of normally phonated speech are presented. The Whispered Speech Detection system described here extracts features that exploit both waveform energy and periodicity. Unsupervised classification of these features is performed to identify and label long segments (approximately 2 to 2.5 seconds) of whispered speech. Whispering of this duration is typically an indication of criminal activity over telephone networks, for instance in a correctional facility environment. Experiments indicate that long segments of whispering can be detected automatically in the presence of normally phonated speech, and testing of the algorithm presented in this paper yields promising results in correctly identifying whispered speech segments.
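
The abstract does not specify the features or the classifier, so the following is only a minimal illustrative sketch of the general idea (energy plus periodicity features, unsupervised labeling, then a minimum-duration check), not the authors' implementation. Everything concrete in it is an assumption made for illustration: the function names, the 30 ms frames with a 10 ms hop, the 60-400 Hz pitch search range, the use of a normalized autocorrelation peak as the periodicity measure, and the use of two-cluster k-means as the unsupervised classifier.

```python
import numpy as np


def frame_features(signal, sample_rate, frame_ms=30, hop_ms=10,
                   f0_min=60.0, f0_max=400.0):
    """Return an (n_frames, 2) array of [log energy, periodicity] per frame.

    Periodicity is the peak of the normalized autocorrelation within the
    assumed pitch-lag range (an illustrative choice, not from the paper).
    """
    signal = np.asarray(signal, dtype=float)
    frame_len = int(sample_rate * frame_ms / 1000)
    hop = int(sample_rate * hop_ms / 1000)
    lag_min = int(sample_rate / f0_max)
    lag_max = int(sample_rate / f0_min)
    feats = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        frame = frame - frame.mean()
        energy = float(np.dot(frame, frame))
        # Autocorrelation at non-negative lags; ac[0] equals the frame energy.
        ac = np.correlate(frame, frame, mode="full")[frame_len - 1:]
        periodicity = float(ac[lag_min:lag_max + 1].max() / ac[0]) if energy > 0 else 0.0
        feats.append((np.log10(energy + 1e-12), periodicity))
    return np.array(feats)


def whisper_mask(features, n_iter=20):
    """Two-cluster k-means; True marks frames in the low-periodicity cluster."""
    std = features.std(axis=0)
    std[std == 0] = 1.0
    z = (features - features.mean(axis=0)) / std
    # Deterministic start: the frames with the lowest and highest periodicity.
    centers = np.stack([z[np.argmin(features[:, 1])],
                        z[np.argmax(features[:, 1])]])
    assign = np.zeros(len(z), dtype=int)
    for _ in range(n_iter):
        dists = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dists.argmin(axis=1)
        for k in range(2):
            if np.any(assign == k):
                centers[k] = z[assign == k].mean(axis=0)
    mean_p = [features[assign == k, 1].mean() if np.any(assign == k) else np.inf
              for k in range(2)]
    return assign == int(np.argmin(mean_p))


def long_runs(mask, hop_s, min_duration_s=2.0):
    """Return (start_s, end_s) for runs of True lasting at least min_duration_s."""
    runs, start = [], None
    for i, flag in enumerate(list(mask) + [False]):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if (i - start) * hop_s >= min_duration_s:
                runs.append((start * hop_s, i * hop_s))
            start = None
    return runs
```

A caller would compute `frame_features` on a recording, pass the result to `whisper_mask`, and hand the mask to `long_runs` with the hop duration in seconds (0.01 for the defaults above). The 2-second minimum mirrors the segment length mentioned in the abstract. A real system would also need to handle silence and channel noise, which this sketch ignores.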

