Incorporating Acoustic Features for Spontaneous Speech Driven Content Retrieval

Hiroto Tasaki, Tomoyosi Akiba


Speech-driven information retrieval systems are expected to make gathering information easier. With a conventional system, however, users must decide on the content of their utterance before speaking, which takes considerable time when the request is complicated. To overcome this problem, the retrieval system needs to handle spontaneously spoken queries directly. In this work, we propose an extension of spoken content retrieval (SCR) that makes effective use of spontaneously spoken queries. We assume that terms that are meaningful for retrieval may be acoustically prominent compared with other terms, and that they are also linguistically specific. Based on this assumption, we predict the contribution of each term in a spontaneously spoken query from acoustic and linguistic features, and incorporate it into the query likelihood model (QLM), a probabilistic retrieval model. We verified the effectiveness of the proposed method through experiments, in which it improved retrieval performance under various conditions.
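The idea of folding a per-term contribution into the QLM can be illustrated with a minimal sketch. The abstract does not give the formula, so everything specific here is an assumption made for illustration: the contribution is treated as a multiplicative weight on each term's log-probability, the document model uses Dirichlet smoothing, and the names `weighted_query_log_likelihood`, `weights`, and `mu` are hypothetical. The weights are supplied by hand; in the paper they are predicted from acoustic and linguistic features.

```python
import math
from collections import Counter


def weighted_query_log_likelihood(query_terms, weights, doc_tokens,
                                  collection_counts, collection_len, mu=1000.0):
    """Score a document against a query with per-term weights.

    Assumed form (not taken from the paper):
        score = sum_i  w_i * log P(q_i | D)
    where P(q_i | D) is a Dirichlet-smoothed unigram document model.
    """
    doc_counts = Counter(doc_tokens)
    doc_len = len(doc_tokens)
    score = 0.0
    for term, w in zip(query_terms, weights):
        # Background probability of the term in the whole collection.
        p_coll = collection_counts.get(term, 0) / collection_len
        # Dirichlet-smoothed probability of the term under this document.
        p = (doc_counts[term] + mu * p_coll) / (doc_len + mu)
        if p <= 0.0:
            # Term never occurs in the collection: skip it in this sketch.
            continue
        score += w * math.log(p)
    return score


# Toy collection: two short "documents".
d1 = "kyoto temple history temple".split()
d2 = "well you know weather today".split()
collection_counts = Counter(d1 + d2)
collection_len = len(d1) + len(d2)

# A spontaneous query with fillers; hand-set weights stand in for the
# predicted term contributions (low for fillers, high for content terms).
query = ["well", "um", "temple", "history"]
weights = [0.1, 0.0, 1.0, 1.0]

score_d1 = weighted_query_log_likelihood(
    query, weights, d1, collection_counts, collection_len, mu=10.0)
score_d2 = weighted_query_log_likelihood(
    query, weights, d2, collection_counts, collection_len, mu=10.0)
```

Setting every weight to 1 recovers the ordinary smoothed query likelihood, so under this assumed form the weighting acts as a soft filter that reduces the influence of fillers and other low-contribution terms on the ranking.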


DOI: 10.21437/Interspeech.2017-893

Cite as: Tasaki, H., Akiba, T. (2017) Incorporating Acoustic Features for Spontaneous Speech Driven Content Retrieval. Proc. Interspeech 2017, 2894-2898, DOI: 10.21437/Interspeech.2017-893.


@inproceedings{Tasaki2017,
  author={Hiroto Tasaki and Tomoyosi Akiba},
  title={Incorporating Acoustic Features for Spontaneous Speech Driven Content Retrieval},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={2894--2898},
  doi={10.21437/Interspeech.2017-893},
  url={http://dx.doi.org/10.21437/Interspeech.2017-893}
}