13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Expand CRF to Model Long Distance Dependencies in Prosodic Break Prediction

Jian Luan (1), Bolei He (1), Hairong Xia (1), Linfang Wang (1), Daniela Braga (2), Sheng Zhao (1)

(1) Speech Team OSD, Microsoft (China) Corp., Beijing, China
(2) Information Platform & Experiences Group, Microsoft Corp., Redmond, WA, USA

The distribution of intonation phrase lengths is important information for prosodic break prediction. However, existing CRF frameworks cannot make full use of it. An expanded CRF is proposed in this paper to tackle this problem. Its lattice carries the location of the previous intonation phrase (L3) break, which makes it possible to support various dynamic features, such as the number of syllables since the previous L3 break and the part of speech (POS) of the word after the previous L3 break. The expanded CRF yields remarkable improvements on the L3 break prediction task. It is also promising for other tasks that involve long-distance dependencies.
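The idea of a lattice that carries the location of the previous L3 break can be illustrated with a small decoding sketch. This is not the authors' implementation: it is a minimal Viterbi-style search in which each lattice state records the index of the word that starts the current phrase, so a dynamic feature (syllables since the previous break) can be computed at every step. The function names `decode_breaks` and `length_score`, the hand-written score that stands in for learned CRF feature weights, and the target length of 7 syllables are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple


def length_score(syllables_since_break: int, is_break: bool) -> float:
    """Hypothetical stand-in for learned CRF weights: placing a break is
    penalized by how far the phrase length is from an assumed target of 7."""
    if not is_break:
        return 0.0
    return -float(abs(syllables_since_break - 7))


def decode_breaks(
    syllables: List[int],
    score: Callable[[int, bool], float],
) -> List[bool]:
    """Return, for each word, whether an L3 break follows it.

    Each lattice state is the index of the word that starts the current
    phrase, i.e. the word right after the previous L3 break.
    """
    n = len(syllables)
    # prefix[i] = total syllables in words 0..i-1
    prefix = [0]
    for count in syllables:
        prefix.append(prefix[-1] + count)

    # best[start] = (path score, break decisions so far)
    best: Dict[int, Tuple[float, List[bool]]] = {0: (0.0, [])}
    for i in range(n):
        nxt: Dict[int, Tuple[float, List[bool]]] = {}
        for start, (total, path) in best.items():
            # Dynamic feature: syllables since the previous break, word i included.
            dist = prefix[i + 1] - prefix[start]
            for is_break in (False, True):
                if i == n - 1 and not is_break:
                    continue  # assumption: the utterance always ends with a break
                cand = (total + score(dist, is_break), path + [is_break])
                new_start = i + 1 if is_break else start
                if new_start not in nxt or cand[0] > nxt[new_start][0]:
                    nxt[new_start] = cand
        best = nxt
    return max(best.values(), key=lambda item: item[0])[1]


# Six words with 2, 3, 2, 3, 2, 2 syllables: the search splits them 7 + 7.
breaks = decode_breaks([2, 3, 2, 3, 2, 2], length_score)
```

Because the state holds the previous break position, the number of states grows with sentence length; that is the cost of making features such as phrase length available inside the search. A real system would combine such dynamic features with the usual local CRF features and learn their weights from data.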

Index Terms: CRF, intonation phrase, prosodic break prediction, speech prosody

Bibliographic reference. Luan, Jian / He, Bolei / Xia, Hairong / Wang, Linfang / Braga, Daniela / Zhao, Sheng (2012): "Expand CRF to model long distance dependencies in prosodic break prediction", in INTERSPEECH-2012, 2530-2533.