First International Conference on Spoken Language Processing (ICSLP 90)

Kobe, Japan
November 18-22, 1990

Neural Network Based Segmentation of Continuous Speech

Pinaki Poddar, P. V. S. Rao

Computer Systems & Communications Group, Tata Institute of Fundamental Research, Bombay, India

This paper describes a technique to mark phonetic boundaries by locating points of maximal change according to an acoustic criterion and refining them by means of phonetic knowledge encoded in a neural network architecture. A threshold was set so that no boundaries are missed, at the cost of a few spurious segments. A three-layer network was trained using the average spectrum of each segment and was used to label segments in the testing phase. Segmentation accuracy was tested by comparison with manual segmentation by an expert phonetician. The consistency of the segmentation and labelling was tested by a recognition experiment on 40 Hindi words. Recognition scores were 84% (first choice only) and 98% (first three choices).
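
The acoustic first stage described above can be illustrated with a minimal sketch. The abstract does not specify the acoustic criterion, the threshold value, or the frame representation, so everything here is an assumption made for illustration: Euclidean distance between consecutive spectral frames stands in for the change measure, and the function names (`spectral_change`, `candidate_boundaries`, `segment_averages`) are hypothetical. The neural network labelling stage is not shown; the sketch only produces the per-segment average spectra that such a network would take as input.

```python
import math

def spectral_change(frames):
    """Euclidean distance between each pair of consecutive frames (assumed criterion)."""
    return [math.dist(a, b) for a, b in zip(frames, frames[1:])]

def candidate_boundaries(frames, threshold):
    """Return frame indices where the change curve peaks at or above the threshold.

    An index i marks a boundary between frame i-1 and frame i. A low
    threshold favours over-segmentation: few boundaries are missed, at the
    cost of some spurious ones.
    """
    change = spectral_change(frames)
    boundaries = []
    for k, value in enumerate(change):
        left = change[k - 1] if k > 0 else float("-inf")
        right = change[k + 1] if k + 1 < len(change) else float("-inf")
        # Local maximum of the change curve that also clears the threshold.
        if value >= threshold and value >= left and value > right:
            boundaries.append(k + 1)
    return boundaries

def segment_averages(frames, boundaries):
    """Average spectrum of each segment delimited by the boundaries."""
    edges = [0] + list(boundaries) + [len(frames)]
    averages = []
    for start, end in zip(edges, edges[1:]):
        segment = frames[start:end]
        averages.append([sum(col) / len(segment) for col in zip(*segment)])
    return averages

# Toy two-band "spectra": two steady frames, then an abrupt change.
frames = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0], [0.0, 1.0]]
bounds = candidate_boundaries(frames, threshold=0.5)
avgs = segment_averages(frames, bounds)
print(bounds)
print(avgs)
```

In this toy input the only large frame-to-frame change lies between the second and third frames, so a single boundary is placed there and two average spectra result.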


Bibliographic reference.  Poddar, Pinaki / Rao, P. V. S. (1990): "Neural network based segmentation of continuous speech", In ICSLP-1990, 1365-1368.