The objective is to analyze vocal dysperiodicities in perceptually assessed synthetic speech sounds. The analysis relies on a variogram-based method that tracks instantaneous vocal dysperiodicities. The dysperiodicity trace is summarized by the signal-to-dysperiodicity ratio, which previous studies have shown to correlate strongly with the perceived degree of hoarseness of the speaker. The stimuli have been generated by a synthesizer of disordered voices that has been shown to produce natural-sounding speech fragments comprising diverse vocal perturbations. Nine listeners have perceptually assessed the speech stimuli according to grade, breathiness and roughness. The first objective here is to extend the correlation analysis to roughness and breathiness. A second objective is to analyze the dependence of the signal-to-dysperiodicity ratio on the signal properties fixed by the synthesizer parameters. Results show a good correlation between signal-to-dysperiodicity ratios and perceptual scores. At most two frequency bands are necessary to predict the perceptual scores. Additive noise contributes most, followed by jitter. Interactions between noise parameters, vocal frequency and vowel category contribute moderately or weakly.
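To make the idea of a signal-to-dysperiodicity ratio concrete, the following is a minimal sketch and not the authors' implementation. It assumes a simplified variogram: for each short frame, the lag within a plausible cycle-length range that minimizes the squared difference between the frame and its shifted copy is selected, and that difference is taken as the dysperiodicity trace. The function names, the frame length, the vocal frequency range and the absence of a gain term are all illustrative assumptions.

```python
import numpy as np

def dysperiodicity_trace(x, fs, f0_min=60.0, f0_max=400.0, frame_ms=5.0):
    """Simplified variogram-style dysperiodicity estimate (illustrative only).

    All parameter defaults are assumptions, not values from the paper.
    Returns the dysperiodicity trace e and the matching part of the signal.
    """
    x = np.asarray(x, dtype=float)
    lag_min = int(fs / f0_max)            # shortest candidate cycle length
    lag_max = int(fs / f0_min)            # longest candidate cycle length
    frame_len = max(1, int(fs * frame_ms / 1000.0))
    usable = len(x) - lag_max             # samples that have a shifted partner
    if usable <= 0:
        raise ValueError("signal too short for the chosen lag range")

    lags = np.arange(lag_min, lag_max + 1)
    e = np.zeros(usable)
    for start in range(0, usable, frame_len):
        stop = min(start + frame_len, usable)
        frame = x[start:stop]
        # Squared difference between the frame and its copy shifted by each lag.
        costs = [np.sum((frame - x[start + T:stop + T]) ** 2) for T in lags]
        best = lags[int(np.argmin(costs))]
        # The residual at the best lag is the local dysperiodicity.
        e[start:stop] = frame - x[start + best:stop + best]
    return e, x[:usable]

def sdr_db(x, e):
    """Signal-to-dysperiodicity ratio in dB: signal energy over trace energy."""
    noise_energy = max(np.sum(e ** 2), np.finfo(float).tiny)  # avoid log of inf
    return 10.0 * np.log10(np.sum(x ** 2) / noise_energy)
```

Under these assumptions, a perfectly periodic signal yields a very large ratio, and adding noise lowers it, which matches the qualitative role the abstract gives to additive noise.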
Bibliographic reference. Alpan, Ali / Grenez, Francis / Schoentgen, Jean (2011): "Dysperiodicity analysis of perceptually assessed synthetic speech stimuli", In INTERSPEECH-2011, 521-524.