13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Development and Evaluation of Automatic Punctuation for French and English Speech-to-Text

Jáchym Kolář, Lori Lamel

Spoken Language Processing Group, LIMSI-CNRS, Orsay, France

Automatic punctuation of speech is important for making speech-to-text output more readable and easier to handle in downstream language processing. We describe the development of an automatic punctuation system for French and English. The punctuation model, which is based on adaptive boosting, uses both textual and acoustic (prosodic) information. The system is evaluated on a difficult speech database under real-application conditions, using output from a state-of-the-art speech-to-text system together with automatic audio segmentation and speaker diarization. Unlike previous work, we score automatic punctuation against two independent manual references. We also compare results across the two languages and compare the performance of the automatic system with inter-annotator agreement.

Index Terms: automatic punctuation, rich transcription, prosody


Bibliographic reference. Kolář, Jáchym / Lamel, Lori (2012): "Development and evaluation of automatic punctuation for French and English speech-to-text", in INTERSPEECH-2012, 1376-1379.