A new model of speech quality under delay is presented that takes conversational interactivity into account. It is based on two previously reported narrowband telephony conversation tests involving different delays, in which pairs of subjects judged the overall quality after each conversation. The tests used different conversation scenarios targeting different levels of interactivity, and the instructions given before the tests varied in how strongly they emphasized speed of task completion. Based on the test results, the paper proposes an extension of a widely used conversational speech quality model, the so-called E-model (ITU-T Rec. G.107), to cover the joint effect of interactivity and delay. To this end, two new parameters are introduced: one represents the minimum perceivable delay, and the other expresses the extent to which users attribute the effect of delay to the conversational quality of the line. An analysis of the surface structure of the recorded test conversations (turns, speaker activity, etc.) revealed prominent differences and delay dependencies in a number of conversation parameters that characterize the impact of delay on the conversational flow and on perceived quality.
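As an illustration of how two such parameters can enter the E-model's delay impairment, the following minimal Python sketch uses the form of the delay impairment factor that, to the best of our knowledge, appears in later revisions of ITU-T Rec. G.107, where a minimum perceivable delay (here `mt_ms`) and a delay sensitivity (here `st`) are defined. The abstract does not give the paper's exact formulation, so the formula, the function name, the parameter names, and the defaults (100 ms and 1.0) should be read as assumptions rather than as the authors' definitive model.

```python
import math


def delay_impairment(ta_ms, mt_ms=100.0, st=1.0):
    """Sketch of a delay impairment factor with two extra parameters.

    ta_ms: absolute one-way delay in milliseconds.
    mt_ms: assumed minimum perceivable delay in milliseconds.
    st:    assumed delay sensitivity (dimensionless, > 0).

    The formula is an assumption based on the delay impairment term
    of later G.107 revisions, not taken from the abstract itself.
    """
    # Delays at or below the minimum perceivable delay cause no impairment.
    if ta_ms <= mt_ms:
        return 0.0
    # Number of doublings of the delay above the perceivability threshold.
    x = math.log(ta_ms / mt_ms) / math.log(2)
    # The sensitivity scales the exponent that shapes the curve.
    p = 6.0 * st
    return 25.0 * (
        (1.0 + x ** p) ** (1.0 / p)
        - 3.0 * (1.0 + (x / 3.0) ** p) ** (1.0 / p)
        + 2.0
    )


if __name__ == "__main__":
    for ta in (50, 100, 200, 400, 800):
        print(ta, round(delay_impairment(ta), 2))
```

In this sketch, raising `mt_ms` shifts the onset of the impairment to longer delays, which is one way to represent a less interactive conversation in which delay is noticed later.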
Bibliographic reference. Raake, Alexander / Schoenenberg, Katrin / Skowronek, Janto / Egger, Sebastian (2013): "Predicting speech quality based on interactivity and delay", In INTERSPEECH-2013, 1384-1388.