ISCA - International Speech
Communication Association

SCOOT: ASR Toolkits

HTK

The Hidden Markov Model Toolkit (HTK) is a portable toolkit for building and manipulating hidden Markov models, developed at Cambridge, UK, and first released in 1989. The researchers most closely associated with HTK are Steve Young, Phil Woodland and Mark Gales. HTK is primarily used for speech recognition research, although it has been used for numerous other applications, including research into speech synthesis, character recognition and DNA sequencing. HTK is in use at hundreds of sites worldwide.

· The HTK book is a comprehensive guide, including tutorials and recipes.

· A shorter tutorial by Giampiero Salvi is available from KTH.

· Recipes for the Wall Street Journal task are here.

CMUSphinx

The Sphinx toolkit, originally developed from Kai-Fu Lee’s thesis, embodies more than 20 years’ work at Carnegie Mellon University. It includes tools designed for low-resource platforms, and it is free.

· There is a tutorial here.

· A recipe for the Wall Street Journal task is here.

Kaldi

Kaldi is a toolkit for speech recognition written in C++ and licensed under the Apache License v2.0. Kaldi development began in 2009, led by Dan Povey. Kaldi is intended for use by speech recognition researchers and is free.

· There is an introduction to Kaldi here.

· Start with the ‘Kaldi for dummies’ tutorial.

· A detailed tutorial by Sanjeev Khudanpur, Dan Povey and Jan Trmal is here.

· There are Speech Kitchen Kaldi demos here and here.

· Two Kaldi repositories with tutorials and scripts are available, from Ankit and Yoav Ramon.
