Universal Adversarial Attacks on Spoken Language Assessment Systems

Vyas Raina, Mark J.F. Gales, Kate M. Knill

There is increasing demand for automated spoken language assessment (SLA) systems, partly driven by the performance improvements that have come from deep learning based approaches. One aspect of deep learning systems is that they do not require expert-derived features, operating directly on the original signal, such as an automatic speech recognition (ASR) transcript. This, however, increases their potential susceptibility to adversarial attacks as a form of candidate malpractice. In this paper the sensitivity of SLA systems to a universal black-box attack on the ASR text output is explored. The aim is to obtain a single, universal phrase that maximally increases any candidate’s score. Four approaches to detect such adversarial attacks are also described. All the systems, and associated detection approaches, are evaluated on a free (spontaneous) speaking section from a Business English test. It is shown that on deep learning based SLA systems the average candidate score can be increased by almost one grade level using a single six-word phrase appended to the end of the response hypothesis. Although these large gains can be obtained, the attack can be easily detected from the shift it induces relative to the scores of a “traditional” Gaussian Process based grader.
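To illustrate the idea of a universal black-box attack as described in the abstract, the following is a minimal, hypothetical sketch: a phrase is built greedily, one word at a time, so that appending it to every candidate's transcript maximises the mean score returned by a black-box grader. The `toy_grader`, vocabulary, and all function names here are illustrative stand-ins, not the paper's actual system or search procedure.

```python
# Hypothetical sketch of a greedy universal black-box attack on a text grader.
# The grader is queried only through its scores (black-box); the toy scorer
# below is a stand-in for a real deep learning SLA grader.

VOCAB = ["excellent", "business", "therefore", "moreover", "strategy", "the"]

def toy_grader(text: str) -> float:
    """Stand-in scorer (NOT the paper's model): rewards length and
    'sophisticated' vocabulary, capped at a top grade of 6.0."""
    words = text.split()
    bonus = sum(w in {"excellent", "therefore", "moreover"} for w in words)
    return min(6.0, 2.0 + 0.1 * len(words) + 0.5 * bonus)

def find_universal_phrase(responses, grader, max_len=6):
    """Greedily append the word that most increases the mean score
    across all responses, up to max_len words."""
    phrase = []
    for _ in range(max_len):
        best_word, best_score = None, float("-inf")
        for w in VOCAB:
            candidate = " ".join(phrase + [w])
            mean = sum(grader(r + " " + candidate) for r in responses) / len(responses)
            if mean > best_score:
                best_word, best_score = w, mean
        phrase.append(best_word)
    return " ".join(phrase)

responses = ["i think the plan is good", "we should sell more product"]
attack = find_universal_phrase(responses, toy_grader)
```

Under this toy setup, appending the learned six-word phrase raises every response's score, mirroring the grade inflation reported in the paper; a detection approach in the same spirit as the paper's would flag responses whose deep-grader score diverges sharply from a reference grader's score.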

 DOI: 10.21437/Interspeech.2020-1890

Cite as: Raina, V., Gales, M.J., Knill, K.M. (2020) Universal Adversarial Attacks on Spoken Language Assessment Systems. Proc. Interspeech 2020, 3855-3859, DOI: 10.21437/Interspeech.2020-1890.

@inproceedings{raina20_interspeech,
  author={Vyas Raina and Mark J.F. Gales and Kate M. Knill},
  title={{Universal Adversarial Attacks on Spoken Language Assessment Systems}},
  year={2020},
  booktitle={Proc. Interspeech 2020},
  pages={3855--3859},
  doi={10.21437/Interspeech.2020-1890}
}