Speech Prosody 2010
Chicago, IL, USA
Acoustic-articulatory inversion mapping is the process of converting an acoustic speech signal into articulatory features. Most research has focused on finding the best model for this mapping, while less attention has been paid to finding appropriate representations of the articulatory and acoustic signals. This paper proposes two feature extraction methods, the Logarithm of square Hanning Critical Bank filterbank (LHCB) and the Discrete Wavelet Transform, which perform better than conventional feature extraction based on Mel-Frequency Cepstral Coefficients (MFCC). A standard feed-forward neural network is used for the inversion mapping, and a Time Delay Neural Network is applied for phone recognition. The results demonstrate the effectiveness of the two new feature extraction methods.
Index Terms: Discrete Wavelet Transform, Time Delay Neural Networks (TDNNs), MOCHA-TIMIT database, Acoustic-Articulatory Inversion Mapping, Logarithm of square Hanning Critical Bank filterbank (LHCB), Mel-Frequency Cepstral Coefficients (MFCC)
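The abstract names a Discrete Wavelet Transform front end without describing its configuration. As a rough illustration only, the sketch below computes log sub-band energies from a multi-level Haar DWT of a single frame. The choice of the Haar wavelet, the number of levels, the log-energy summary, and the function names `haar_dwt_step` and `dwt_log_energy_features` are all assumptions made for this example, not details taken from the paper.

```python
import math

def haar_dwt_step(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    signal = list(signal)
    if len(signal) % 2:
        signal.append(signal[-1])  # pad odd-length input by repeating the last sample
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail

def dwt_log_energy_features(frame, levels=3, floor=1e-10):
    """Log energy of each detail sub-band, followed by the final approximation band.

    Assumption: log sub-band energy is used as the per-frame feature;
    the paper may summarise the coefficients differently.
    """
    features = []
    approx = list(frame)
    for _ in range(levels):
        approx, detail = haar_dwt_step(approx)
        features.append(math.log(sum(d * d for d in detail) + floor))
    features.append(math.log(sum(a * a for a in approx) + floor))
    return features

# Hypothetical 64-sample frame containing a single sinusoid.
frame = [math.sin(2 * math.pi * 5 * n / 64) for n in range(64)]
feats = dwt_log_energy_features(frame, levels=3)  # 3 detail bands + 1 approximation band
```

With `levels=3` the function returns four values per frame. In an inversion-mapping setup, such per-frame vectors would take the place of MFCC vectors as the network input.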
Bibliographic reference. Behbood, Hossein / SeyyedSalehi, Seyyed Ali / Tohidypour, Hamid Reza (2010): "A novel feature extraction for neuralbased modes in acoustic-articulatory inversion mapping", In SP-2010, paper 582.