ISCA - International Speech Communication Association

SCOOT: ASR Toolkits


The Hidden Markov Model Toolkit (HTK) is a portable toolkit for building and manipulating hidden Markov models, developed at Cambridge, UK and first released in 1989. The researchers most closely associated with HTK are Steve Young, Phil Woodland and Mark Gales. HTK is primarily used for speech recognition research, although it has also been applied to speech synthesis, character recognition and DNA sequencing. HTK is in use at hundreds of sites worldwide.

·        The HTK book is a comprehensive guide, including tutorials and recipes.

·        A shorter tutorial by Giampiero Salvi is available from KTH.

·        Recipes for the Wall Street Journal task are here


The Sphinx toolkit, originally developed from Kai-Fu Lee’s thesis, embodies more than 20 years’ work at Carnegie Mellon University. It includes tools designed for low-resource platforms, and it is free.

·        There is a tutorial here 

·        A recipe for the Wall Street Journal task is here


Kaldi is a toolkit for speech recognition written in C++ and licensed under the Apache License v2.0. Kaldi development began in 2009, led by Dan Povey. Kaldi is intended for use by speech recognition researchers and is free.

·        There is an introduction to Kaldi here

·        Start with the ‘Kaldi for dummies’ tutorial

·        A detailed tutorial by Sanjeev Khudanpur, Dan Povey and Jan Trmal is here

·        There are Speech Kitchen Kaldi demos here and here

·        Two Kaldi repositories with tutorials and scripts are available from Ankit and Yoav Ramon


© Copyright 2024 - ISCA International Speech Communication Association - All rights reserved.
