Symposium on Machine Learning in Speech and Language Processing (MLSLP)

Portland, Oregon, USA
September 14, 2012

Discriminative Spoken Term Detection with Limited Data

Rohit Prabhavalkar (1), Joseph Keshet (2), Karen Livescu (2), Eric Fosler-Lussier (1)

(1) Department of Computer Science and Engineering, The Ohio State University, Columbus, OH, USA
(2) TTI-Chicago, USA

We study spoken term detection, the task of determining whether and where a given word or phrase appears in a given segment of speech, in the setting of limited training data. This setting is becoming increasingly important as interest grows in porting spoken term detection to multiple low-resource languages and acoustic environments. We propose a discriminative algorithm that aims to maximize the area under the receiver operating characteristic curve (AUC), a measure often used to evaluate spoken term detection systems. We implement the approach using a set of feature functions based on multilayer perceptron classifiers of phones and articulatory features, and experiment on data drawn from the Switchboard database of conversational telephone speech. Our approach outperforms a baseline HMM-based system by a large margin across a number of training set sizes.
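
To make the AUC objective concrete, the sketch below computes the AUC as the fraction of (positive, negative) utterance pairs that a detector ranks correctly, together with a pairwise hinge loss of the kind often used as a convex surrogate for it. This is a generic illustration and not the algorithm from the paper: the function names, the unit margin, and the toy scores are all assumptions made for the example.

```python
def auc(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    pos_scores: detector scores for utterances that contain the term.
    neg_scores: detector scores for utterances that do not.
    Ties count as half a correct pair.
    """
    correct = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos_scores) * len(neg_scores))


def pairwise_hinge(pos_scores, neg_scores, margin=1.0):
    """Mean hinge loss over all (positive, negative) pairs.

    A pair contributes zero loss only when the positive score exceeds
    the negative score by at least `margin` (an assumed hyperparameter).
    """
    losses = [max(0.0, margin - (p - n))
              for p in pos_scores
              for n in neg_scores]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    positives = [0.9, 0.8, 0.4]
    negatives = [0.5, 0.3]
    print("AUC:", auc(positives, negatives))
    print("pairwise hinge:", pairwise_hinge(positives, negatives))
```

Lowering the pairwise hinge loss pushes positive scores above negative ones, which in turn tends to raise the pairwise AUC. That link is why ranking losses of this form are a common choice when AUC is the evaluation measure.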

Index Terms: spoken term detection, discriminative training, AUC, structural SVM


Bibliographic reference. Prabhavalkar, Rohit / Keshet, Joseph / Livescu, Karen / Fosler-Lussier, Eric (2012): "Discriminative spoken term detection with limited data", in MLSLP-2012, 22-25.