While DNN-HMM acoustic models have replaced GMM-HMMs in the standard ASR pipeline due to performance improvements, one unrealistic assumption that remains in these models is the conditional independence assumption of the Hidden Markov Model (HMM). In this work, we explore the extent to which the depth of neural networks helps compensate for these poor conditional independence assumptions. Using a bootstrap resampling framework that allows us to control the amount of data dependence in the test set while still using real observations from the data, we can determine how robust neural networks, and particularly deeper models, are to data dependence. Our conclusion is that if the data were to match the conditional independence assumptions of the HMM, there would be little benefit from using deeper models. It is only when the data become more dependent that depth improves ASR performance. The fact that performance substantially degrades as the data become more realistic, however, suggests that better temporal modeling is still needed for ASR.
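The abstract does not spell out the resampling procedure, but the core idea of this line of work can be sketched as follows: given a forced alignment of a test utterance to HMM states, replace each frame with a real frame drawn independently from a pool of frames observed under the same state. Because frames are drawn independently given the state sequence, the resulting pseudo-utterance satisfies the HMM's conditional independence assumption exactly while containing only real observations. This is a minimal illustrative sketch, not the authors' implementation; the function name, data layout, and per-frame sampling unit are assumptions.

```python
import numpy as np

def resample_frames(state_alignment, frame_pool, rng=None):
    """Hypothetical frame-level bootstrap resampling.

    state_alignment: sequence of HMM state ids, one per frame
        (e.g. from a forced alignment of the test utterance).
    frame_pool: dict mapping each state id to an array of real
        acoustic frames observed under that state elsewhere in
        the data, shape (num_frames_for_state, feature_dim).

    Each output frame is drawn independently given its state, so
    the pseudo-utterance matches the HMM's conditional
    independence assumption while using only real observations.
    """
    rng = np.random.default_rng() if rng is None else rng
    resampled = []
    for state in state_alignment:
        pool = frame_pool[state]       # real frames seen under this state
        idx = rng.integers(len(pool))  # sample one frame uniformly
        resampled.append(pool[idx])
    return np.stack(resampled)
```

Presumably, sampling larger contiguous units (whole states, phones, or words) instead of single frames would reintroduce progressively more of the true temporal dependence, which is how such a framework can interpolate between fully independent and fully realistic test data.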

DOI: `10.21437/Interspeech.2016-283`

Cite as

Ravuri, S., Wegmann, S. (2016) How Neural Network Depth Compensates for HMM Conditional Independence Assumptions in DNN-HMM Acoustic Models. Proc. Interspeech 2016, 2736-2740.

Bibtex

```bibtex
@inproceedings{Ravuri+2016,
  author    = {Suman Ravuri and Steven Wegmann},
  title     = {How Neural Network Depth Compensates for HMM Conditional Independence Assumptions in DNN-HMM Acoustic Models},
  year      = {2016},
  booktitle = {Interspeech 2016},
  doi       = {10.21437/Interspeech.2016-283},
  url       = {http://dx.doi.org/10.21437/Interspeech.2016-283},
  pages     = {2736--2740}
}
```