Data Augmentation, Missing Feature Mask and Kernel Classification for Through-the-Wall Acoustic Surveillance

Huy Dat Tran, Wen Zheng Terence Ng, Yi Ren Leng


This paper deals with sound event classification from poor-quality signals in the context of “through-the-wall” (TTW) surveillance. The task is extremely challenging because of the high level of distortion and attenuation caused by complex sound propagation and by modulation effects introduced during signal acquisition. Another problem in TTW surveillance is the lack of comprehensive training data, since recording is much more complicated than in conventional setups that use audio microphones. To address this, we employ a recurrent neural network, specifically a Long Short-Term Memory (LSTM) encoder, to transform conventional clean and noisy audio signals into TTW signals and thereby generate additional training data. Furthermore, a novel missing feature mask kernel classification method is developed to improve the accuracy of TTW sound event classification. Specifically, the Wasserstein distance is calculated over the reliable intersection regions between pair-wise sound image representations and embedded into a probabilistic distance Support Vector Machine (SVM) kernel to optimize the separation of TTW data. The proposed missing feature mask kernel allows effective training with inhomogeneously distorted data, and experiments on TTW audio recordings show promising results, outperforming several state-of-the-art methods.
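
The idea of a mask-restricted distance kernel can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact formulation, so the sketch assumes flattened sound images, binary reliability masks, a one-dimensional Wasserstein-1 distance between the values found in the region both masks mark as reliable, and an exponential kernel exp(-gamma * d). The function names `masked_wasserstein` and `mask_kernel` and the parameter `gamma` are hypothetical.

```python
import math


def masked_wasserstein(x, y, mask_x, mask_y):
    """1-D Wasserstein-1 distance restricted to cells reliable in BOTH inputs.

    x, y           : flattened sound images (sequences of floats, same length)
    mask_x, mask_y : reliability masks (truthy = reliable), same length
    Returns None when the two masks have no reliable cell in common.
    """
    # Keep only the intersection of the two reliable regions (assumption).
    shared = [(a, b) for a, b, ma, mb in zip(x, y, mask_x, mask_y) if ma and mb]
    if not shared:
        return None
    xs = sorted(a for a, _ in shared)
    ys = sorted(b for _, b in shared)
    # For two equal-size empirical samples, W1 is the mean absolute
    # difference between the sorted values.
    return sum(abs(a - b) for a, b in zip(xs, ys)) / len(xs)


def mask_kernel(x, y, mask_x, mask_y, gamma=1.0):
    """Exponential distance kernel; 0.0 when no reliable region is shared."""
    d = masked_wasserstein(x, y, mask_x, mask_y)
    if d is None:
        return 0.0
    return math.exp(-gamma * d)


# Toy example: the last cell of x is marked unreliable and is ignored.
x = [0.0, 1.0, 2.0, 9.0]
y = [1.0, 2.0, 3.0, 0.0]
mask_x = [1, 1, 1, 0]
mask_y = [1, 1, 1, 1]
k_xy = mask_kernel(x, y, mask_x, mask_y)
```

A Gram matrix built from such pairwise values could be passed to an SVM that accepts precomputed kernels. Because the reliable region changes from pair to pair, the resulting matrix is not guaranteed to be positive semi-definite, which a practical implementation would need to handle.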


DOI: 10.21437/Interspeech.2017-685

Cite as: Tran, H.D., Ng, W.Z.T., Leng, Y.R. (2017) Data Augmentation, Missing Feature Mask and Kernel Classification for Through-the-Wall Acoustic Surveillance. Proc. Interspeech 2017, 3807-3811, DOI: 10.21437/Interspeech.2017-685.


@inproceedings{Tran2017,
  author={Huy Dat Tran and Wen Zheng Terence Ng and Yi Ren Leng},
  title={Data Augmentation, Missing Feature Mask and Kernel Classification for Through-the-Wall Acoustic Surveillance},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={3807--3811},
  doi={10.21437/Interspeech.2017-685},
  url={http://dx.doi.org/10.21437/Interspeech.2017-685}
}