13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Automatic Speech Segmentation Using Probabilistic Latent Component Modeling

Sayan Ghosh, Thippur V. Sreenivas

Department of Electrical Communication Engineering, Indian Institute of Science, Bangalore, India

Latent variable methods, such as PLCA (Probabilistic Latent Component Analysis), have been successfully used for analysis of non-negative signal representations. In this paper, we formulate PLCS (Probabilistic Latent Component Segmentation), which models each time frame of a spectrogram as a spectral distribution. Given the signal spectrogram, the segmentation boundaries are estimated using a maximum-likelihood (ML) approach. For an efficient solution, the algorithm imposes a hard constraint that each segment is modelled by a single latent component. This hard constraint allows the ML boundary estimation problem to be solved using dynamic programming. Unlike earlier ML segmentation techniques, the PLCS framework does not impose a parametric assumption. PLCS can be naturally extended to model coarticulation between successive phones. Experiments on the TIMIT corpus show that the proposed technique is promising compared to most state-of-the-art speech segmentation algorithms.
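The idea of ML boundary estimation under a one-component-per-segment constraint can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that the number of segments is known in advance, that each frame holds non-negative counts over frequency bins, and that a segment is scored by the multinomial log-likelihood of its frames under the segment's own normalised average spectrum. The names `segment_loglik` and `ml_segment` are hypothetical.

```python
import math

def segment_loglik(prefix, i, j):
    """Log-likelihood of frames i..j-1 under a single spectral distribution
    (assumed here: the ML estimate, i.e. the segment's normalised summed spectrum)."""
    counts = [b - a for a, b in zip(prefix[i], prefix[j])]
    total = sum(counts)
    return sum(c * math.log(c / total) for c in counts if c > 0)

def ml_segment(spectrogram, num_segments):
    """Return (interior boundaries, total log-likelihood) for the best split
    of the frames into `num_segments` contiguous segments."""
    T, F = len(spectrogram), len(spectrogram[0])
    # prefix[t][f] = sum of bin f over frames 0..t-1, so segment sums are cheap
    prefix = [[0.0] * F]
    for frame in spectrogram:
        prefix.append([p + x for p, x in zip(prefix[-1], frame)])

    neg_inf = float("-inf")
    # best[k][j] = best log-likelihood of the first j frames using k segments
    best = [[neg_inf] * (T + 1) for _ in range(num_segments + 1)]
    back = [[0] * (T + 1) for _ in range(num_segments + 1)]
    best[0][0] = 0.0
    for k in range(1, num_segments + 1):
        for j in range(k, T + 1):
            for i in range(k - 1, j):          # i = start of the k-th segment
                if best[k - 1][i] == neg_inf:
                    continue
                score = best[k - 1][i] + segment_loglik(prefix, i, j)
                if score > best[k][j]:
                    best[k][j], back[k][j] = score, i

    # Backtrack the segment start frames, then drop the leading 0
    starts, j = [], T
    for k in range(num_segments, 0, -1):
        j = back[k][j]
        starts.append(j)
    starts.reverse()
    return starts[1:], best[num_segments][T]

# Toy input: three spectrally homogeneous blocks of 3, 3 and 2 frames
spec = [[9, 1, 0]] * 3 + [[0, 1, 9]] * 3 + [[1, 9, 1]] * 2
boundaries, loglik = ml_segment(spec, 3)   # boundaries == [3, 6]
```

Because each segment is tied to exactly one distribution, the total likelihood decomposes into a sum of per-segment scores, which is what makes the dynamic-programming recursion applicable. This sketch runs in time proportional to the number of segments times the square of the number of frames.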

Index Terms: Speech segmentation, PLCA, Spectrograms, Coarticulation, Dynamic Programming


Bibliographic reference.  Ghosh, Sayan / Sreenivas, Thippur V. (2012): "Automatic speech segmentation using probabilistic latent component modeling", In INTERSPEECH-2012, 2262-2265.