INTERSPEECH 2013
14th Annual Conference of the International Speech Communication Association

Lyon, France
August 25-29, 2013

Convolutional Deep Rectifier Neural Nets for Phone Recognition

László Tóth

Hungarian Academy of Sciences, Hungary

Rectifier neurons differ from standard ones only in that the sigmoid activation function is replaced by the rectifier function, max(0,x). Several recent studies suggest that rectifier units may be more suitable building blocks for deep nets. For example, we found that deep rectifier networks can attain speech recognition performance similar to that of sigmoid nets, but without the need for the time-consuming pre-training procedure. Here, we extend the previous results by modifying the rectifier network so that it has a convolutional structure. As convolutional networks are inherently deep, rectifier neurons seem to be an ideal choice as their building units. Indeed, on the TIMIT phone recognition task we report a 6% relative error reduction compared to our earlier results, giving an 18.6% error rate on the core test set. Then, by applying the recently proposed "dropout" training method, we reduce the error rate further to 17.8%, which, to our knowledge, is the best result to date on this database.
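As an illustration of the two ingredients named in the abstract, the following minimal Python sketch contrasts the rectifier function max(0,x) with the sigmoid it replaces, and applies a simple form of dropout to a layer of rectifier outputs. It is not the paper's implementation: the function names, the example inputs, the drop probability of 0.5, and the rescaling of surviving units ("inverted dropout") are all assumptions made for this sketch.

```python
import math
import random


def rectifier(x):
    """Rectifier activation: max(0, x)."""
    return max(0.0, x)


def sigmoid(x):
    """Logistic sigmoid, the standard activation that the rectifier replaces."""
    return 1.0 / (1.0 + math.exp(-x))


def dropout(activations, drop_prob, rng):
    """Zero each activation independently with probability drop_prob.

    Surviving activations are rescaled by 1 / (1 - drop_prob). This
    "inverted dropout" scaling is an assumption of the sketch, not a
    detail taken from the paper.
    """
    keep_prob = 1.0 - drop_prob
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


# Hypothetical pre-activation values for one hidden layer.
pre_activations = [-2.0, -0.5, 0.0, 0.5, 2.0]

# Rectifier units pass positive inputs unchanged and output zero otherwise.
hidden = [rectifier(x) for x in pre_activations]

# Sigmoid units squash every input into the open interval (0, 1).
squashed = [sigmoid(x) for x in pre_activations]

# At training time, dropout randomly silences a subset of the hidden units.
rng = random.Random(0)
dropped = dropout(hidden, drop_prob=0.5, rng=rng)
```

In this sketch each entry of `dropped` is either zero or the corresponding rectifier output scaled by 2, so the expected activation stays the same as without dropout.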

Bibliographic reference.  Tóth, László (2013): "Convolutional deep rectifier neural nets for phone recognition", In INTERSPEECH-2013, 1722-1726.