Quality Estimation Based on Regular Perception

Christoph Norrenbrock, Florian Hinterleitner, Ulrich Heute, Sebastian Möller

We present a novel approach to speech-quality prediction based on the observation that perceptually relevant properties can vary over a wide range of values without significantly changing perceived quality. To account for this nonlinear phenomenon, we propose a semi-supervised discretization concept that is applied to both temporal and aggregated properties as a preprocessing step prior to conventional modelling. This idea of perceptual regularization makes it possible to integrate an assumed perceptual reference into the modelling process without neglecting the empirical cognition effects encoded in the subjective test data. Using synthetic-speech quality as an example, we demonstrate how the presented approach addresses the two main goals of quality estimation: interpretability and robustness.

