Lightweight LPCNet-Based Neural Vocoder with Tensor Decomposition

Hiroki Kanagawa, Yusuke Ijima


This paper proposes a lightweight neural vocoder based on LPCNet. The recently proposed LPCNet exploits linear predictive coding to represent vocal tract characteristics, and can rapidly synthesize high-quality waveforms with fewer parameters than WaveRNN. To achieve even greater speed, the two time-consuming GRUs and the DualFC layer must be shrunk. Although the original work pruned only the first GRU's weights, there is room for improvement in the other GRU and the DualFC. Accordingly, we use tensor decomposition to reduce these remaining parameters by more than 80%. We demonstrate that the proposed method 1) runs 1.26 times faster on a CPU, and 2) matches the naturalness of the original LPCNet both for acoustic features extracted from natural speech and for those predicted by TTS.
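As a rough illustration of how a decomposition can cut parameter counts by the margin the abstract reports, the sketch below factorizes a GRU-sized weight matrix with a truncated SVD (a simple rank-based decomposition; the abstract does not specify which tensor decomposition the paper actually uses, and the matrix sizes here are assumptions loosely modeled on LPCNet's first GRU):

```python
import numpy as np

# Hypothetical sizes, loosely modeled on LPCNet's first GRU (384 units);
# the recurrent weight stacks the update/reset/candidate gates -> (384, 3*384).
rng = np.random.default_rng(0)
W = rng.standard_normal((384, 3 * 384))

# Truncated SVD: W ~= A @ B with A (384 x r) and B (r x 1152).
# Rank 56 is chosen here only so the stored parameters drop by >80%.
rank = 56
U, s, Vt = np.linalg.svd(W, full_matrices=False)
A = U[:, :rank] * s[:rank]   # left factor, singular values folded in
B = Vt[:rank, :]             # right factor

original = W.size            # 384 * 1152 = 442368 parameters
compressed = A.size + B.size # 56 * (384 + 1152) = 86016 parameters
reduction = 1 - compressed / original
print(f"parameter reduction: {reduction:.1%}")  # ~80.6%
```

At inference time the matrix-vector product `W @ h` is replaced by `A @ (B @ h)`, so the parameter savings translate directly into fewer multiply-accumulates per sample.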


DOI: 10.21437/Interspeech.2020-1642

Cite as: Kanagawa, H., Ijima, Y. (2020) Lightweight LPCNet-Based Neural Vocoder with Tensor Decomposition. Proc. Interspeech 2020, 205-209, DOI: 10.21437/Interspeech.2020-1642.


@inproceedings{Kanagawa2020,
  author={Hiroki Kanagawa and Yusuke Ijima},
  title={{Lightweight LPCNet-Based Neural Vocoder with Tensor Decomposition}},
  year=2020,
  booktitle={Proc. Interspeech 2020},
  pages={205--209},
  doi={10.21437/Interspeech.2020-1642},
  url={http://dx.doi.org/10.21437/Interspeech.2020-1642}
}