Confidence Measure for Speech-to-Concept End-to-End Spoken Language Understanding

Antoine Caubrière, Yannick Estève, Antoine Laurent, Emmanuel Morin


Recent studies have led to the introduction of Speech-to-Concept End-to-End (E2E) neural architectures for Spoken Language Understanding (SLU) that reach state-of-the-art performance. In this work, we propose a way to compute confidence measures on semantic concepts recognized by a Speech-to-Concept E2E SLU system. We investigate the use of the hidden representations of our CTC-based SLU system to train a simple external classifier. We experiment with two kinds of simple external classifiers to analyze the subsequences of hidden representations involved in recognized semantic concepts. The first external classifier is based on an MLP, while the second one is based on a BLSTM neural network. We compare them to a baseline confidence measure computed directly from the softmax outputs of the E2E system. On the challenging French MEDIA corpus, when the confidence measure is used to reject recognized concepts, experiments show that using an external BLSTM significantly outperforms the other approaches in terms of precision/recall. To evaluate the additional information provided by this confidence measure, we compute the Normalised Cross-Entropy (NCE). With an NCE of 0.288, we show that our best proposed confidence measure brings relevant information about the reliability of a recognized concept.
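
For readers unfamiliar with NCE, the following is a minimal sketch of how the metric is commonly computed. It assumes the standard NIST-style definition, which the abstract does not spell out. The function name `normalised_cross_entropy`, its arguments, and the clipping constant `eps` are illustrative choices and are not taken from the paper.

```python
import math


def normalised_cross_entropy(confidences, is_correct, eps=1e-12):
    """Sketch of NCE under the usual NIST-style definition (an assumption).

    confidences: confidence scores in [0, 1], one per recognized concept.
    is_correct:  booleans, True when the matching concept is correct.
    """
    n_total = len(confidences)
    n_correct = sum(1 for ok in is_correct if ok)
    if n_correct == 0 or n_correct == n_total:
        raise ValueError("NCE is undefined when all items share one label")

    # Entropy of a baseline that always outputs the average correctness rate.
    p_c = n_correct / n_total
    h_max = -(n_correct * math.log2(p_c)
              + (n_total - n_correct) * math.log2(1.0 - p_c))

    # Log-likelihood of the observed labels under the confidence scores.
    log_lik = 0.0
    for conf, ok in zip(confidences, is_correct):
        conf = min(max(conf, eps), 1.0 - eps)  # guard against log2(0)
        log_lik += math.log2(conf) if ok else math.log2(1.0 - conf)

    # 0 means no information beyond the prior; values near 1 are ideal.
    return (h_max + log_lik) / h_max
```

Under this definition, a measure that assigns every concept the average correctness rate scores 0, a perfectly informative measure approaches 1, and a misleading one can be negative.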


DOI: 10.21437/Interspeech.2020-2298

Cite as: Caubrière, A., Estève, Y., Laurent, A., Morin, E. (2020) Confidence Measure for Speech-to-Concept End-to-End Spoken Language Understanding. Proc. Interspeech 2020, 1590-1594, DOI: 10.21437/Interspeech.2020-2298.


@inproceedings{Caubrière2020,
  author={Antoine Caubrière and Yannick Estève and Antoine Laurent and Emmanuel Morin},
  title={{Confidence Measure for Speech-to-Concept End-to-End Spoken Language Understanding}},
  year=2020,
  booktitle={Proc. Interspeech 2020},
  pages={1590--1594},
  doi={10.21437/Interspeech.2020-2298},
  url={http://dx.doi.org/10.21437/Interspeech.2020-2298}
}