4th International Conference on Spoken Language Processing
Philadelphia, PA, USA
In earlier work, we developed a statistical model of speech recognition in which emphasis is placed on the perceptually relevant and information-rich portion of the speech signal. In that model, speech is viewed as a sequence of elementary decisions, or Auditory Events (avents), that are made in response to loci of significant spectral change. These decision points are interleaved with periods during which insufficient information has been accumulated to make the next decision. We have called this a Stochastic Perceptual Avent Model, or SPAM. In the work reported here, we have extended our initial experimental implementation to include other probabilistic dependencies specified in the original theory, particularly the dependence on the time from the current frame back to the previous hypothesized avent.
Bibliographic reference. Bilmes, Jeff / Morgan, Nelson / Wu, Su-Lin / Bourlard, Hervé (1996): "Stochastic perceptual speech models with durational dependence", In ICSLP-1996, 1301-1304.