Manual prosodic labelling is tedious and time-consuming. To overcome this problem, we propose a semi-automatic method: to support the human transcriber, we generate a prototypical prosodic description by statistical methods. The prosodic description includes the word boundaries in the speech signal, the accents, and the phrase boundaries. We detect the phone boundaries (including the word boundaries) in the speech signal with an HMM speech recognizer, and we predict accents and phrase boundaries from the text with a categorical language model trained on prosodically labelled text data. We then combine the results of the segmentation and the prediction to locate the predicted accents and boundaries in the speech signal. This prototypical prosodic information can be integrated into the human labelling strategy.
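The combination step can be illustrated with a minimal sketch. It assumes that the segmentation yields one time interval per word and that the text-based prediction yields one accent flag and one phrase-boundary flag per word, in the same word order. The names `WordSegment`, `WordPrediction`, `ProsodicEvent`, and `locate_events` are hypothetical, and placing an accent at the midpoint of its word is an assumption made only for illustration, not a detail taken from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WordSegment:
    """One word interval from the segmentation (times in seconds)."""
    word: str
    start: float
    end: float


@dataclass
class WordPrediction:
    """Text-based prediction for one word (hypothetical representation)."""
    word: str
    accented: bool
    boundary_after: bool


@dataclass
class ProsodicEvent:
    """A predicted prosodic event located in the speech signal."""
    kind: str   # "accent" or "boundary"
    time: float
    word: str


def locate_events(segments: List[WordSegment],
                  predictions: List[WordPrediction]) -> List[ProsodicEvent]:
    """Attach each predicted accent or phrase boundary to a time in the signal."""
    if len(segments) != len(predictions):
        raise ValueError("segmentation and prediction differ in word count")

    events: List[ProsodicEvent] = []
    for seg, pred in zip(segments, predictions):
        if seg.word != pred.word:
            raise ValueError(f"word mismatch: {seg.word!r} vs {pred.word!r}")
        if pred.accented:
            # Assumption: place the accent at the midpoint of the word.
            events.append(ProsodicEvent("accent", (seg.start + seg.end) / 2, seg.word))
        if pred.boundary_after:
            # A phrase boundary after a word is placed at the word's end time.
            events.append(ProsodicEvent("boundary", seg.end, seg.word))
    return events


if __name__ == "__main__":
    # Invented example values, not data from the paper.
    segments = [
        WordSegment("das", 0.0, 0.5),
        WordSegment("ist", 0.5, 1.0),
        WordSegment("gut", 1.0, 2.0),
    ]
    predictions = [
        WordPrediction("das", False, False),
        WordPrediction("ist", False, False),
        WordPrediction("gut", True, True),
    ]
    for event in locate_events(segments, predictions):
        print(event)
```

The resulting event list is the kind of prototypical description a human transcriber could then correct, rather than labelling from scratch.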
Bibliographic reference. Lehning, Michael (1995): "Statistical methods for the automatic labelling of German prosody", In EUROSPEECH-1995, 2089-2092.