13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

The BLZ Submission to the NIST 2011 LRE: Data Collection, System Development and Performance

Luis Javier Rodríguez-Fuentes (1), Mikel Penagarikano (1), Amparo Varona (1), Mireia Diez (1), Germán Bordel (1), Alberto Abad (2), David Martínez (3), Jesus Villalba (3), Alfonso Ortega (3), Eduardo Lleida (3)

(1) GTTS, Department of Electricity and Electronics, University of the Basque Country UPV/EHU, Spain
(2) L2F - Spoken Language Systems Lab, INESC-ID Lisboa, Portugal
(3) Aragon Institute for Engineering Research (I3A), University of Zaragoza, Spain

This paper describes the most relevant features of a collaborative multi-site submission to the NIST 2011 Language Recognition Evaluation (LRE), consisting of one primary and three contrastive systems, each fusing a different combination of 13 state-of-the-art (acoustic and phonotactic) language recognition subsystems. The collaboration focused on collecting and sharing training data for those target languages for which NIST provided little development data, and on defining a common development dataset to train backend and fusion parameters and to select the best fusions. Official and post-key results are presented and compared, revealing that the greedy approach applied to select the best fusions provided suboptimal but very competitive performance. Several factors contributed to the high performance attained by the BLZ systems, including the availability of training data for low-resource target languages, the reliability of the development dataset (consisting only of data audited by NIST), the diversity of modeling approaches, features and datasets in the systems considered for fusion, and the effectiveness of the search for optimal fusions.
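As an illustration of what a greedy search over fusions can look like, the following is a minimal sketch of forward greedy selection. It is not the procedure used by the authors: the function name greedy_fusion_search, the cost callback, the subsystem labels and the toy scores are all assumptions made for this example. In practice the cost would be a performance measure computed on the development dataset after training the fusion.

```python
from typing import Callable, FrozenSet, List, Sequence, Tuple


def greedy_fusion_search(
    subsystems: Sequence[str],
    cost: Callable[[FrozenSet[str]], float],
) -> List[Tuple[FrozenSet[str], float]]:
    """Forward greedy selection: grow the fusion one subsystem at a time.

    Returns the subset chosen at each size together with its cost
    (lower is better). The caller picks the best entry of the path.
    """
    selected: FrozenSet[str] = frozenset()
    remaining = list(subsystems)
    path: List[Tuple[FrozenSet[str], float]] = []
    while remaining:
        # Try adding each remaining subsystem; keep the one giving the lowest cost.
        best = min(remaining, key=lambda s: cost(selected | {s}))
        selected = selected | {best}
        remaining.remove(best)
        path.append((selected, cost(selected)))
    return path


# Toy, made-up scores purely for illustration (not results from the paper).
toy_quality = {"acoustic_1": 3.0, "phonotactic_1": 2.0, "acoustic_2": 1.0}


def toy_cost(subset: FrozenSet[str]) -> float:
    # Hypothetical stand-in for a development-set cost of the fused subset.
    return 1.0 / sum(toy_quality[s] for s in subset)


path = greedy_fusion_search(list(toy_quality), toy_cost)
best_subset, best_cost = min(path, key=lambda item: item[1])
```

Because each step only keeps the single best extension, the search evaluates on the order of N squared candidate fusions instead of all 2^N subsets, which is why it can miss the globally best combination while still landing close to it.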

Index Terms: Spoken Language Recognition, NIST 2011 LRE, Multiclass Discriminative Fusion, Greedy Search

Bibliographic reference. Rodríguez-Fuentes, Luis Javier / Penagarikano, Mikel / Varona, Amparo / Diez, Mireia / Bordel, Germán / Abad, Alberto / Martínez, David / Villalba, Jesus / Ortega, Alfonso / Lleida, Eduardo (2012): "The BLZ submission to the NIST 2011 LRE: data collection, system development and performance", in INTERSPEECH-2012, 38-41.