5th European Conference on Speech Communication and Technology

Rhodes, Greece
September 22-25, 1997

An Acoustic Subword Unit Approach to Non-Linguistic Speech Feature Identification

Mohamed Afify (1), Yifan Gong (1,2), Jean-Paul Haton (1)

(1) CRIN/CNRS-INRIA-Lorraine, Vandoeuvre, Nancy, France (2) Media Technologies Laboratory, Texas Instruments, Dallas, TX, USA

Automatic identification of non-linguistic speech features (e.g. the speaker or the language of an utterance) is currently of practical interest. In this paper, we first set out requirements that we think a statistical model used for non-linguistic feature identification should satisfy: it should capture both short- and long-term correlations while maintaining a certain acoustic resolution. We then propose a model that satisfies these requirements and, at the same time, has the attractive feature of requiring no transcribed speech material during training. An experimental evaluation of the approach on speaker recognition using the TIMIT database is presented, in which recognition rates of up to 99.2% are achieved.

Bibliographic reference.  Afify, Mohamed / Gong, Yifan / Haton, Jean-Paul (1997): "An acoustic subword unit approach to non-linguistic speech feature identification", In EUROSPEECH-1997, 2291-2294.