INTERSPEECH 2006 - ICSLP
This paper presents the results of an investigation into the unsupervised detection of whispered speech segments in the presence of normally phonated speech. The Whispered Speech Detection system described here extracts features that exploit both waveform energy and periodicity. These features are classified without supervision to identify and label long segments (approximately 2 to 2.5 seconds) of whispered speech. Whispering of this kind is typically an indication of criminal activity over telephone networks, for instance in a correctional facility environment. Experiments indicate that long segments of whispering can be detected automatically in the presence of normally phonated speech, and tests of the algorithm yield promising results in correctly identifying whispered speech segments.
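The abstract does not specify the exact features or the classifier, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes per-frame log energy and a normalized autocorrelation peak as the energy and periodicity features, and swaps in a plain two-cluster k-means as the unsupervised classifier. All function names, the 60 to 400 Hz pitch range, and the frame sizes are illustrative assumptions.

```python
import numpy as np


def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]


def frame_features(frames, sr, fmin=60.0, fmax=400.0):
    """Return an (n_frames, 2) array of [log energy, periodicity].

    Periodicity is the largest normalized autocorrelation value inside an
    assumed pitch-lag range; it is high for voiced (phonated) frames and
    low for noise-like frames such as whisper.
    """
    lag_min = int(sr / fmax)
    lag_max = int(sr / fmin)  # must be smaller than the frame length
    feats = []
    for f in frames:
        f = f - f.mean()
        energy = np.sum(f ** 2) + 1e-12
        # Autocorrelation at non-negative lags (index 0 is lag 0).
        ac = np.correlate(f, f, mode="full")[len(f) - 1:]
        periodicity = np.max(ac[lag_min:lag_max + 1]) / energy
        feats.append([np.log10(energy), periodicity])
    return np.array(feats)


def two_means(feats, n_iter=20):
    """Cluster frames into two groups; return True for the whisper-like group.

    Assumption: the cluster whose centroid has the lower periodicity is
    treated as whisper-like. Features are z-scored so that energy and
    periodicity contribute on a comparable scale.
    """
    z = (feats - feats.mean(axis=0)) / (feats.std(axis=0) + 1e-12)
    # Deterministic start: the least and the most periodic frames.
    centroids = z[[np.argmin(z[:, 1]), np.argmax(z[:, 1])]].copy()
    for _ in range(n_iter):
        dist = np.linalg.norm(z[:, None, :] - centroids[None, :, :], axis=2)
        labels = np.argmin(dist, axis=1)
        for k in range(2):
            if np.any(labels == k):
                centroids[k] = z[labels == k].mean(axis=0)
    whisper_cluster = np.argmin(centroids[:, 1])
    return labels == whisper_cluster


def segment_votes(is_whisper, frames_per_segment):
    """Majority vote over fixed-length runs of frames (long segments)."""
    n_seg = len(is_whisper) // frames_per_segment
    trimmed = is_whisper[:n_seg * frames_per_segment]
    return trimmed.reshape(n_seg, frames_per_segment).mean(axis=1) > 0.5


def detect_whisper_frames(x, sr, frame_len=256, hop=128):
    """Frame the signal, extract features, and cluster without labels."""
    frames = frame_signal(np.asarray(x, dtype=float), frame_len, hop)
    return two_means(frame_features(frames, sr))
```

With 8 kHz telephone-band audio, `detect_whisper_frames(x, 8000)` gives one boolean per 32 ms frame, and `segment_votes(flags, 124)` turns those into decisions over roughly 2-second segments. Two limitations of this sketch: a two-cluster split always labels some frames as whisper-like even when none are present, and silent frames would also fall into the low-periodicity cluster, so a practical system would need additional checks.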
Bibliographic reference. Carlin, Michael A. / Smolenski, Brett Y. / Wenndt, Stanley J. (2006): "Unsupervised detection of whispered speech in the presence of normal phonation", In INTERSPEECH-2006, paper 1990-Mon3CaP.13.