Speech Prosody 2010

Chicago, IL, USA
May 10-14, 2010

C-PROM: An Annotated Corpus for French Prominence Study

Mathieu Avanzi (1,2), Anne-Cathérine Simon (3), Jean-Philippe Goldman (3,4), Antoine Auchlin (4)

(1) Chaire de linguistique française, Neuchâtel University, Switzerland
(2) MoDyCo, Université Paris Ouest Nanterre, France
(3) Institut Langage & Communication / Valibel Discours & Variation, Université catholique de Louvain, Belgium
(4) Department of Linguistics, Geneva University, Switzerland

This paper presents C-PROM, an annotated corpus for the study of prominence in French. The corpus includes different regional varieties of French (Belgian, Swiss and metropolitan French) and various discourse genres (from oral reading to spontaneous conversation), for a total duration of 70 minutes. It was annotated by two phonetics experts, who followed a strict protocol that takes into account both the pitfalls encountered in prior research on prominence detection in French and elements of the methodology used by scholars working on other languages. We conclude by discussing the agreement between the two annotators. The results are encouraging: the F-measure between the two annotators reaches 82.8% and the kappa score is 0.77.
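As an illustration of the two agreement figures mentioned in the abstract, the sketch below computes an F-measure and a kappa score for two annotators. It is a minimal example under stated assumptions, not the authors' procedure: it assumes each syllable receives a binary label (1 = prominent, 0 = non-prominent), it assumes the kappa in question is Cohen's kappa, and the label sequences are invented toy data, not C-PROM annotations. The function names are hypothetical.

```python
from typing import Sequence


def f_measure(ann_a: Sequence[int], ann_b: Sequence[int]) -> float:
    """F-measure of the 'prominent' class, treating ann_a as the reference.

    Swapping the two annotators swaps precision and recall,
    so the resulting F-measure is the same either way.
    """
    tp = sum(1 for a, b in zip(ann_a, ann_b) if a == 1 and b == 1)
    fp = sum(1 for a, b in zip(ann_a, ann_b) if a == 0 and b == 1)
    fn = sum(1 for a, b in zip(ann_a, ann_b) if a == 1 and b == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def cohen_kappa(ann_a: Sequence[int], ann_b: Sequence[int]) -> float:
    """Cohen's kappa for two annotators assigning binary labels."""
    n = len(ann_a)
    # Observed agreement: proportion of items with identical labels.
    observed = sum(1 for a, b in zip(ann_a, ann_b) if a == b) / n
    # Chance agreement, from each annotator's rate of 'prominent' labels.
    p_a = sum(ann_a) / n
    p_b = sum(ann_b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label
    return (observed - expected) / (1 - expected)


# Toy data (hypothetical): one label per syllable for each annotator.
annotator_1 = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
annotator_2 = [1, 0, 0, 1, 0, 0, 0, 1, 1, 0]

f_score = f_measure(annotator_1, annotator_2)
kappa = cohen_kappa(annotator_1, annotator_2)
print(f"F-measure: {f_score:.3f}, kappa: {kappa:.3f}")
```

Kappa is typically lower than the F-measure on the same data because it discounts the agreement that two annotators would reach by chance, which is substantial when most syllables are non-prominent.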

Bibliographic reference. Avanzi, Mathieu / Simon, Anne-Cathérine / Goldman, Jean-Philippe / Auchlin, Antoine (2010): "C-PROM: an annotated corpus for French prominence study", in SP-2010, paper 2005.