This paper presents a feature-based representation (FBR) of speech that is motivated by phonetic-feature theory. At present, only the manner features are considered: sonorant, syllabic, nonsyllabic, noncontinuant and fricated. The objectives of such a representation are to directly target the linguistic information in the signal and to minimize extra-linguistic information that may cause large speech variability. To this end, FBR is defined in a relational manner across time and frequency. Preliminary results using FBR with the hidden Markov model (HMM) approach in a broad-class recognition task suggest that FBR captures much of the phonetically relevant information. When one-mixture probability density functions were used, the results were comparable to those of an HMM using a cepstral-based representation (CBR). As the number of mixtures was increased, CBR outperformed FBR, primarily because the multi-mixture CBR-HMMs captured fine phonetic details that go beyond the manner features.
Bibliographic reference. Bitar, Nabil N. / Espy-Wilson, Carol Y. (1995): "Speech parameterization based on phonetic features: application to speech recognition", in EUROSPEECH-1995, 1411-1414.