Independent Modelling of High and Low Energy Speech Frames for Spoofing Detection

Gajan Suthokumar, Kaavya Sriskandaraja, Vidhyasaharan Sethu, Chamith Wijenayake, Eliathamby Ambikairajah


Spoofing detection systems for automatic speaker verification have moved from modelling only voiced frames to modelling all speech frames. Unvoiced speech has been shown to carry information about spoofing attacks, and anti-spoofing systems may benefit further from treating voiced and unvoiced speech differently. In this paper, we separate speech into low and high energy frames and model the distribution of each independently, forming two spoofing detection systems that are then fused at the score level. Experiments conducted on the ASVspoof 2015, BTAS 2016 and Spoofing and Anti-Spoofing (SAS) corpora demonstrate that fusing the two independent high and low energy spoofing detection systems consistently outperforms the standard approach, which does not distinguish between high and low energy frames.
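The pipeline described above (split frames by energy, model each subset separately, fuse the two scores) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the energy threshold is chosen, which back-end models are used, or how the scores are fused. The median threshold, the frame sizes, the weighted-sum fusion and all function names below are assumptions made only for illustration.

```python
import numpy as np


def frame_log_energy(signal, frame_len=400, hop=160):
    """Short-time log energy of each frame.

    frame_len and hop are illustrative values (25 ms / 10 ms at 16 kHz),
    not taken from the paper. The signal must be at least frame_len long.
    """
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] for i in range(n_frames)]
    )
    # Small constant avoids log(0) on silent frames.
    return np.log(np.sum(frames ** 2, axis=1) + 1e-10)


def split_by_energy(features, log_energy, threshold=None):
    """Partition per-frame features into (high_energy, low_energy) subsets.

    Assumption: frames above the utterance median count as high energy.
    """
    if threshold is None:
        threshold = np.median(log_energy)
    high = log_energy > threshold
    return features[high], features[~high]


def fuse_scores(score_high, score_low, weight=0.5):
    """Score-level fusion as a weighted sum (one common choice, assumed here)."""
    return weight * score_high + (1.0 - weight) * score_low
```

In use, each subset would feed its own detector (for example, one pair of genuine and spoofed models per energy class), and `fuse_scores` would combine the two per-utterance scores; the fusion weight would normally be tuned on development data.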


DOI: 10.21437/Interspeech.2017-836

Cite as: Suthokumar, G., Sriskandaraja, K., Sethu, V., Wijenayake, C., Ambikairajah, E. (2017) Independent Modelling of High and Low Energy Speech Frames for Spoofing Detection. Proc. Interspeech 2017, 2606-2610, DOI: 10.21437/Interspeech.2017-836.


@inproceedings{Suthokumar2017,
  author={Gajan Suthokumar and Kaavya Sriskandaraja and Vidhyasaharan Sethu and Chamith Wijenayake and Eliathamby Ambikairajah},
  title={Independent Modelling of High and Low Energy Speech Frames for Spoofing Detection},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={2606--2610},
  doi={10.21437/Interspeech.2017-836},
  url={http://dx.doi.org/10.21437/Interspeech.2017-836}
}