INTERSPEECH 2011
12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Enhancements to the Training Process of Classifier-Based Speech Translator via Topic Modeling

Emil Ettelaie, Panayiotis G. Georgiou, Shrikanth Narayanan

University of Southern California, USA

Classification of sentences based on their meaning (or concept) has been used as a component in speech translation and spoken language understanding systems. Preparing training data for this type of classifier is often a tedious task. In our previous work, we presented a method for clustering sentences as a step toward automated annotation of concepts. To measure the distance between two sentences, that method relied on the local lexical dependencies in their translations. In this work, we apply topic modeling to enhance the previously proposed distance metric so that it also includes information from semantic associations among words. Our experiments on the DARPA USC Transonics and BBN Transtac data sets show the advantage of incorporating this information, in the form of performance improvements on a set of clustering tasks.
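
The abstract does not give the form of the enhanced distance metric, so the following is only an illustrative sketch of how a topic-based term could be combined with a lexical distance. It assumes each sentence already has a topic distribution (for example, from a topic model), uses Jensen-Shannon divergence as the topic distance, and interpolates linearly with a precomputed lexical distance. The function names, the choice of divergence, and the `weight` parameter are all assumptions made for illustration, not the authors' method.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) using the natural logarithm.

    Terms with p_i == 0 contribute nothing and are skipped.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions.

    Symmetric and bounded by ln(2); the mixture m is positive wherever
    p or q is positive, so the KL terms never divide by zero.
    """
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def combined_distance(lexical_distance, topics_a, topics_b, weight=0.5):
    """Hypothetical combination: linear interpolation of a lexical distance
    and a topic-distribution distance. The weighting scheme is an assumption.
    """
    topic_distance = js_divergence(topics_a, topics_b)
    return (1 - weight) * lexical_distance + weight * topic_distance


# Two sentences with similar topic mixtures and a made-up lexical distance.
d = combined_distance(0.4, [0.7, 0.2, 0.1], [0.6, 0.3, 0.1], weight=0.5)
```

In a sketch like this, the combined distance could then be passed to any standard clustering routine in place of the purely lexical distance.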

Bibliographic reference. Ettelaie, Emil / Georgiou, Panayiotis G. / Narayanan, Shrikanth (2011): "Enhancements to the training process of classifier-based speech translator via topic modeling", in INTERSPEECH-2011, 2109-2112.