Speech Prosody 2010

Chicago, IL, USA
May 10-14, 2010

Combining Greedy Algorithms with Expert Guided Manipulation for the Definition of a Balanced Prosodic Spanish-Catalan Radio News Corpus

David Escudero-Mancebo (1), C. González-Ferreras (1), Juan María Garrido Almiñana (2), E. Rodero (3), Lourdes Aguilar (4), Antonio Bonafonte (5)

(1) Department of Computer Science, Universidad de Valladolid
(2) Department of Translation and Language Sciences, Universidad Pompeu Fabra
(3) Department of Communication, Universidad Pompeu Fabra
(4) Department of Spanish Philology, Universidad Autónoma de Barcelona
(5) Department of Signal Theory, Universidad Politécnica de Catalunya, Spain

This article reports the process of building a bilingual (Spanish-Catalan) text corpus that is balanced in parallel with respect to prosodic features in both languages. We propose an expert guideline for text manipulation that, in combination with greedy algorithms, significantly improves the quality of the selected corpus. Applying this methodology to a radio news corpus empirically supports the proposed strategy.
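
The abstract does not spell out the greedy algorithm itself, so the following is only a minimal sketch of one common greedy selection heuristic for corpus balancing, not the authors' procedure. It assumes each candidate sentence has already been annotated with a set of prosodic feature labels; the function name `greedy_select`, the labels, and the rarity-weighted gain are all hypothetical choices made for illustration.

```python
from collections import Counter


def greedy_select(sentences, budget):
    """Greedily pick up to `budget` sentence ids.

    `sentences` maps a sentence id to an iterable of feature labels
    (hypothetical prosodic units). At each step the sentence whose
    features are least covered so far is chosen.
    """
    remaining = {sid: set(feats) for sid, feats in sentences.items()}
    counts = Counter()  # how often each feature is already covered
    selected = []

    while remaining and len(selected) < budget:
        # Gain of a sentence: features seen less often contribute more.
        def gain(sid):
            return sum(1.0 / (1 + counts[f]) for f in remaining[sid])

        # sorted() makes tie-breaking deterministic (lowest id wins).
        best = max(sorted(remaining), key=gain)
        selected.append(best)
        counts.update(remaining.pop(best))

    return selected


if __name__ == "__main__":
    toy = {"s1": ["A", "B"], "s2": ["A"], "s3": ["C"]}
    print(greedy_select(toy, 2))
```

In this toy run, `s1` is picked first because it covers two unseen labels, and `s3` is preferred over `s2` next because its label is still uncovered. Under this reading, expert-guided manipulation would correspond to editing candidate texts so that underrepresented features appear before selection is run.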

Bibliographic reference. Escudero-Mancebo, David / González-Ferreras, C. / Garrido Almiñana, Juan María / Rodero, E. / Aguilar, Lourdes / Bonafonte, Antonio (2010): "Combining greedy algorithms with expert guided manipulation for the definition of a balanced prosodic Spanish-Catalan radio news corpus", in SP-2010, paper 061.