ISCA Archive Interspeech 2013

Convolutional deep rectifier neural nets for phone recognition

László Tóth

Rectifier neurons differ from standard ones only in that the sigmoid activation function is replaced by the rectifier function, max(0,x). Several recent studies suggest that rectifier units may be more suitable building units for deep nets. For example, we found that deep rectifier networks can attain speech recognition performance similar to that of sigmoid nets, but without the need for the time-consuming pre-training procedure. Here, we extend the previous results by modifying the rectifier network so that it has a convolutional structure. As convolutional networks are inherently deep, rectifier neurons seem to be an ideal choice as their building units. Indeed, on the TIMIT phone recognition task we report a 6% relative error reduction compared to our earlier results, giving an 18.6% error rate on the core test set. Then, by applying the recently proposed "dropout" training method, we reduce the error rate further to 17.8%, which, to our knowledge, is the best result to date on this database.
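As a rough illustration of the two activation functions contrasted above, the following minimal sketch defines the sigmoid and the rectifier, max(0,x), as plain Python functions. It also checks the arithmetic behind the dropout result: going from 18.6% to 17.8% is roughly a 4.3% relative error reduction. The function names are hypothetical and chosen only for this example; nothing here is taken from the paper's implementation.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic activation, squashing x into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def rectifier(x: float) -> float:
    """Rectifier activation as given in the abstract: max(0, x)."""
    return max(0.0, x)


def relative_reduction(baseline: float, improved: float) -> float:
    """Relative error reduction of `improved` with respect to `baseline`."""
    return (baseline - improved) / baseline


if __name__ == "__main__":
    # Compare the two activations on a few sample inputs.
    for x in (-2.0, 0.0, 3.5):
        print(f"x={x:5.1f}  sigmoid={sigmoid(x):.4f}  rectifier={rectifier(x):.1f}")

    # Error rates quoted in the abstract: 18.6% without dropout, 17.8% with it.
    print(f"relative reduction from dropout: {relative_reduction(18.6, 17.8):.3f}")
```

Unlike the sigmoid, the rectifier does not saturate for large positive inputs and outputs exactly zero for negative ones.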

doi: 10.21437/Interspeech.2013-429

Cite as: Tóth, L. (2013) Convolutional deep rectifier neural nets for phone recognition. Proc. Interspeech 2013, 1722-1726, doi: 10.21437/Interspeech.2013-429

  author={László Tóth},
  title={{Convolutional deep rectifier neural nets for phone recognition}},
  booktitle={Proc. Interspeech 2013},