Third International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA 2003)
This paper describes some methodological issues to be considered in the objective assessment of voice quality in patients with laryngeal cancer. Earlier research showed that the automatic assessment of voice quality could be addressed by means of short-term and long-term time-domain and frequency-domain parameters extracted from electroglottographic (EGG) signals, classified with Artificial Neural Networks (ANN) such as the Multi-layer Perceptron (MLP). However, despite the good results, further research has shown that the choice of cross-validation technique used for the pattern recognition can greatly influence the ability of the system to learn and to generalise. In particular, this paper is concerned with the effects of intra- and inter-speaker variability during cross-validation, and hence on the reliability of pathological voice quality assessment. For this study, a database of male subjects steadily phonating the vowel /i/ was used, and the quality of their voices was independently assessed by a speech and language therapist (SALT) according to their 7-point ranking of subjective voice quality. Carefully selecting the datasets used to train and validate the ANN so as to minimise intra-speaker variability is found to reduce the classification accuracy; however, most of the time the ANN misclassifies by only one point.
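As an illustration only, the two ideas in the abstract (keeping each speaker's recordings on one side of a train/validation split, and counting how often a prediction is at most one point off) can be sketched in Python as follows. This is not the authors' implementation: the function names `speaker_disjoint_folds` and `accuracy_within`, the round-robin fold assignment, and the sample data are all assumptions made for the example.

```python
from collections import defaultdict


def speaker_disjoint_folds(speaker_ids, n_folds):
    """Group recording indices into folds so that no speaker spans two folds.

    Hypothetical helper: speakers are dealt round-robin into folds in
    first-seen order, so all recordings of one speaker share a fold.
    """
    speakers = list(dict.fromkeys(speaker_ids))  # unique, order preserved
    fold_of_speaker = {spk: i % n_folds for i, spk in enumerate(speakers)}
    folds = defaultdict(list)
    for idx, spk in enumerate(speaker_ids):
        folds[fold_of_speaker[spk]].append(idx)
    return [folds[k] for k in range(n_folds)]


def accuracy_within(true_grades, predicted_grades, tolerance=0):
    """Fraction of predictions within `tolerance` points of the true grade."""
    hits = sum(
        abs(t - p) <= tolerance for t, p in zip(true_grades, predicted_grades)
    )
    return hits / len(true_grades)


# Made-up example data: six recordings from four speakers.
speaker_ids = ["s1", "s1", "s2", "s3", "s3", "s4"]
folds = speaker_disjoint_folds(speaker_ids, n_folds=2)

# Made-up grades on a 7-point scale (not results from the paper).
true_grades = [1, 4, 7, 3]
predicted_grades = [1, 5, 5, 3]
exact = accuracy_within(true_grades, predicted_grades)
within_one = accuracy_within(true_grades, predicted_grades, tolerance=1)
```

Each fold can serve in turn as the validation set while the remaining folds are used for training. Because the validation speakers are then never seen during training, the score reflects inter-speaker generalisation rather than recognition of a familiar voice.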
Full Paper (reprinted with permission from Firenze University Press)
Bibliographic reference. Godino-Llorente, Juan Ignacio / Ritchings, Tim / Berry, Carl (2003): "The effects of inter and intra speaker variability on pathological voice quality assessment", In MAVEBA-2003, 157-160.