Deep Attentive End-to-End Continuous Breath Sensing from Speech

Alexis Deighton MacIntyre, Georgios Rizos, Anton Batliner, Alice Baird, Shahin Amiriparian, Antonia Hamilton, Björn W. Schuller

Modelling the breath signal is of high interest to both healthcare professionals and computer scientists, as a source of diagnosis-related information or a means of curating higher-quality datasets in speech analysis research. Forming a breath signal gold standard is, however, not straightforward: it requires specialised equipment and a human annotation budget, and even then it reflects lab recording settings that are not reproducible in the wild. Herein, we explore deep learning-based methodologies as an automatic way to predict a continuous-time breath signal solely by analysing spontaneous speech. We address two task formulations, continuous-valued signal prediction and inhalation event prediction, both of great use in various healthcare and Automatic Speech Recognition applications, and showcase results that outperform current baselines. Most importantly, we also perform an initial exploration into explaining which parts of the input audio signal are important with respect to the prediction.

DOI: 10.21437/Interspeech.2020-2832

Cite as: MacIntyre, A.D., Rizos, G., Batliner, A., Baird, A., Amiriparian, S., Hamilton, A., Schuller, B.W. (2020) Deep Attentive End-to-End Continuous Breath Sensing from Speech. Proc. Interspeech 2020, 2082-2086, DOI: 10.21437/Interspeech.2020-2832.

  author={Alexis Deighton MacIntyre and Georgios Rizos and Anton Batliner and Alice Baird and Shahin Amiriparian and Antonia Hamilton and Björn W. Schuller},
  title={{Deep Attentive End-to-End Continuous Breath Sensing from Speech}},
  booktitle={Proc. Interspeech 2020},