The objective of this work is to demonstrate the significance of instants of significant excitation for source modeling. Instants of significant excitation correspond to glottal closure, glottal opening, the onset of a burst, frication, and a small number of excitation instants around them. The speech signal is processed independently by zero frequency filtering (ZFF) to obtain epochs. The epochs are used as anchor points for extracting the instants of significant excitation from different representations of speech. These representations include a sequence of strength-weighted epochs and a small range of samples around the epochs from the linear prediction (LP) residual, the Hilbert envelope (HE) of the LP residual and the cosine of the phase sequence. The strength-weighted epoch sequence generates speech that is intelligible but synthetic in nature. When a small region of instants of significant excitation around the epochs is considered, the naturalness of the synthesized speech increases significantly.
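As a rough illustration of the two kinds of excitation described above, the following sketch builds a strength-weighted epoch sequence and keeps only a small region of a source representation around each epoch. It assumes the epoch locations and strengths are already available (for example from ZFF) and that the representation (LP residual, its Hilbert envelope, or the cosine of phase) has been computed elsewhere. The function names, the `half_width` parameter and the toy data are hypothetical and are not taken from the paper, which does not state the region size in this abstract.

```python
import numpy as np


def strength_weighted_epoch_sequence(epochs, strengths, length):
    """Impulse train that is zero except at the epochs,
    where each impulse takes the corresponding epoch strength."""
    excitation = np.zeros(length)
    excitation[np.asarray(epochs, dtype=int)] = strengths
    return excitation


def keep_region_around_epochs(representation, epochs, half_width):
    """Keep samples within half_width of any epoch; set the rest to zero.

    half_width is a hypothetical parameter (in samples), not a value
    from the paper.
    """
    representation = np.asarray(representation, dtype=float)
    n = representation.shape[0]
    mask = np.zeros(n, dtype=bool)
    for e in epochs:
        lo = max(0, e - half_width)          # clip at the start of the signal
        hi = min(n, e + half_width + 1)      # clip at the end of the signal
        mask[lo:hi] = True
    return np.where(mask, representation, 0.0)


# Toy data: a 10-sample stand-in for an LP residual, with two assumed epochs.
residual = np.arange(1.0, 11.0)
epochs = [2, 7]
strengths = [0.5, 1.0]

impulses = strength_weighted_epoch_sequence(epochs, strengths, len(residual))
sparse_residual = keep_region_around_epochs(residual, epochs, half_width=1)
```

In this toy case `impulses` is non-zero only at samples 2 and 7, while `sparse_residual` retains the residual at samples 1 to 3 and 6 to 8. Either signal could then serve as the excitation to a synthesis filter, which is outside the scope of this sketch.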
Bibliographic reference. Adiga, Nagaraj / Prasanna, S. R. M. (2013): "Significance of instants of significant excitation for source modeling", In INTERSPEECH-2013, 1677-1681.