13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Combining Temporal and Cepstral Features for the Automatic Perceptual Categorization of Disordered Connected Speech

Ali Alpan (1), Jean Schoentgen (1,2), Francis Grenez (1)

(1) Laboratory of Images, Signals & Telecommunication Devices, Université Libre de Bruxelles, Brussels, Belgium
(2) National Fund for Scientific Research, Belgium

This presentation reports experiments on the automatic classification of disordered connected speech into three categories (modal, moderately hoarse, severely hoarse). The support vector machines used for the classification were fed with temporal signal-to-dysperiodicity ratios, the first rahmonic amplitude, and mel-frequency cepstral coefficients. The signal-to-dysperiodicity ratio complements the first rahmonic amplitude when voice samples are categorized according to the degree of hoarseness, yielding 77% correct classification.
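To give a rough idea of what a temporal signal-to-dysperiodicity ratio measures, the following Python sketch compares each analysis frame with a lagged copy of the signal, keeps the lag that minimizes the squared difference, and reports the ratio of signal energy to that residual energy in decibels. It is a simplified illustration only and does not reproduce the variogram analysis used in the paper. The function name, the frame length, the lag range and the synthetic test tone are all assumptions made for this example.

```python
import math
import random


def signal_to_dysperiodicity_ratio(x, frame_len, min_lag, max_lag):
    """Simplified signal-to-dysperiodicity ratio in dB (illustrative sketch).

    For each non-overlapping frame, the lag in [min_lag, max_lag] that
    minimizes the squared difference between the frame and its lagged copy
    is selected; that minimum is taken as the frame's dysperiodicity energy.
    """
    signal_energy = 0.0
    dys_energy = 0.0
    # Only frames whose lagged copy stays inside the signal are analysed.
    for start in range(0, len(x) - frame_len - max_lag + 1, frame_len):
        frame = x[start:start + frame_len]
        best = None
        for lag in range(min_lag, max_lag + 1):
            shifted = x[start + lag:start + lag + frame_len]
            err = sum((a - b) ** 2 for a, b in zip(frame, shifted))
            if best is None or err < best:
                best = err
        signal_energy += sum(a * a for a in frame)
        dys_energy += best
    if dys_energy == 0.0:
        # Perfectly periodic input (or no analysable frame): ratio unbounded.
        return math.inf
    return 10.0 * math.log10(signal_energy / dys_energy)


def noisy_tone(noise_level, n=2000, period=50, seed=0):
    """Hypothetical test signal: a sine wave plus Gaussian noise."""
    rng = random.Random(seed)
    return [math.sin(2 * math.pi * i / period) + noise_level * rng.gauss(0, 1)
            for i in range(n)]


# Lag range 40..60 samples brackets the true period of 50 samples.
mild = signal_to_dysperiodicity_ratio(noisy_tone(0.01), 200, 40, 60)
severe = signal_to_dysperiodicity_ratio(noisy_tone(0.3), 200, 40, 60)
print(f"mild noise: {mild:.1f} dB, strong noise: {severe:.1f} dB")
```

In this toy setting the ratio drops as the additive noise grows, which is the qualitative behaviour one would expect of a cue for the degree of hoarseness. In a setup like the one described above, such a ratio would be one input among others to the classifier.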

Index Terms: automatic perceptual categorization of disordered connected speech, variogram analysis, signal-to-dysperiodicity ratio, first rahmonic amplitude, mel-frequency cepstral coefficients, support vector machine

Bibliographic reference. Alpan, Ali / Schoentgen, Jean / Grenez, Francis (2012): "Combining temporal and cepstral features for the automatic perceptual categorization of disordered connected speech", In INTERSPEECH-2012, 1624-1627.