Harmonic Lowering for Accelerating Harmonic Convolution for Audio Signals

Hirotoshi Takeuchi, Kunio Kashino, Yasunori Ohishi, Hiroshi Saruwatari


Convolutional neural networks have been successfully applied to a variety of audio signal processing tasks, including sound source separation, speech recognition, and acoustic scene understanding. Since many pitched sounds have a harmonic structure, an operation called harmonic convolution has been proposed to take advantage of the structure that appears in audio signals. However, its computational cost is higher than that of normal convolution. This paper proposes Harmonic Lowering, a faster method for computing harmonic convolution. The method unrolls the input data into a redundant layout so that the normal convolution operation can be applied. Analysis of runtimes and of the number of multiplication operations shows that the proposed method computes harmonic convolution 2 to 7 times faster than the conventional method under realistic parameter settings, with no approximation introduced.
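
To make the unrolling idea concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It assumes a simplified, single-channel harmonic convolution with integer harmonics only, y[f][t] = sum over k and l of w[k][l] * x[k*f][t+l], where frequency bins are numbered from 1, out-of-range harmonics count as zero, and no interpolation or anchor parameter is used. The function names harmonic_conv_direct, harmonic_lower, and conv_on_lowered are hypothetical.

```python
def harmonic_conv_direct(x, w):
    """Direct form: gather the harmonics k*f of each frequency f on the fly.

    x: F x T spectrogram (row i holds frequency bin f = i + 1).
    w: K x L kernel (K harmonics, L time taps).
    """
    F, T = len(x), len(x[0])
    K, L = len(w), len(w[0])
    y = [[0.0] * (T - L + 1) for _ in range(F)]
    for i in range(F):
        f = i + 1
        for t in range(T - L + 1):
            acc = 0.0
            for k in range(1, K + 1):
                row = k * f - 1          # row holding the k-th harmonic of f
                if row >= F:             # harmonic beyond the spectrum: zero
                    break
                for l in range(L):
                    acc += w[k - 1][l] * x[row][t + l]
            y[i][t] = acc
    return y


def harmonic_lower(x, K):
    """Unroll x into a redundant K x F x T layout.

    lowered[k-1][i] is the row of x that holds harmonic k of frequency i + 1,
    or a zero row when that harmonic lies outside the spectrum.
    """
    F, T = len(x), len(x[0])
    zero = [0.0] * T
    return [
        [x[k * (i + 1) - 1] if k * (i + 1) <= F else zero for i in range(F)]
        for k in range(1, K + 1)
    ]


def conv_on_lowered(xl, w):
    """Ordinary correlation along time, with the harmonic axis as channels."""
    K, F, T = len(xl), len(xl[0]), len(xl[0][0])
    L = len(w[0])
    return [
        [
            sum(w[k][l] * xl[k][i][t + l] for k in range(K) for l in range(L))
            for t in range(T - L + 1)
        ]
        for i in range(F)
    ]


# Toy input: 4 frequency bins, 3 time frames, kernel over 2 harmonics and 2 taps.
x = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
w = [[1, 0], [0, 1]]
same = harmonic_conv_direct(x, w) == conv_on_lowered(harmonic_lower(x, 2), w)
print(same)
```

In this sketch the lowered layout stores each harmonic row once per frequency that needs it, which is the redundancy mentioned above. In exchange, the second step is a regular convolution that a standard optimized routine could execute.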


DOI: 10.21437/Interspeech.2020-3185

Cite as: Takeuchi, H., Kashino, K., Ohishi, Y., Saruwatari, H. (2020) Harmonic Lowering for Accelerating Harmonic Convolution for Audio Signals. Proc. Interspeech 2020, 185-189, DOI: 10.21437/Interspeech.2020-3185.


@inproceedings{Takeuchi2020,
  author={Hirotoshi Takeuchi and Kunio Kashino and Yasunori Ohishi and Hiroshi Saruwatari},
  title={{Harmonic Lowering for Accelerating Harmonic Convolution for Audio Signals}},
  year=2020,
  booktitle={Proc. Interspeech 2020},
  pages={185--189},
  doi={10.21437/Interspeech.2020-3185},
  url={http://dx.doi.org/10.21437/Interspeech.2020-3185}
}