First International Conference on Spoken Language Processing (ICSLP 90)

Kobe, Japan
November 18-22, 1990

Integration of Speech Recognition, Text-To-Speech Synthesis, and Talker Verification into a Hands-Free Audio/Image Teleconferencing System (HuMaNet)

D. A. Berkley, James L. Flanagan

Information Principles Research Laboratory, AT&T Bell Laboratories, Murray Hill, NJ, USA

This report describes the design and implementation of a digital teleconferencing system that integrates several speech technologies with image and data facilities. The aim is to provide a variety of sophisticated communication features that are easy to learn and use. The system, called HuMaNet (Human/Machine Network), is controlled entirely hands-free through interactive natural speech. It combines speech recognition, text-to-speech synthesis, and talker verification with autodirective microphone arrays, image compression, and data and hypertext management to provide high-quality audio and image conferencing over basic-rate ISDN (Integrated Services Digital Network). The present public-switched transport capacity provides "2B+D": two 64 kbit/s circuit-switched channels (2B) and one 16 kbit/s packet-switched channel (D).
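
The "2B+D" figures quoted in the abstract can be added up to see the total capacity a basic-rate ISDN line offers. The following is a minimal sketch of that arithmetic only. The function and constant names are illustrative assumptions, not taken from the paper, and the sketch says nothing about how HuMaNet actually divides audio, image, and control traffic among the channels.

```python
# Basic-rate ISDN "2B+D" channel structure, using the figures given in the abstract.
# All names below are hypothetical and chosen for illustration only.
NUM_B_CHANNELS = 2     # circuit-switched bearer (B) channels
B_CHANNEL_KBPS = 64    # rate of each B channel, in kbit/s
D_CHANNEL_KBPS = 16    # rate of the packet-switched D channel, in kbit/s


def basic_rate_capacity_kbps(num_b=NUM_B_CHANNELS,
                             b_kbps=B_CHANNEL_KBPS,
                             d_kbps=D_CHANNEL_KBPS):
    """Return (circuit-switched, packet-switched, total) capacity in kbit/s."""
    circuit = num_b * b_kbps
    return circuit, d_kbps, circuit + d_kbps


circuit, packet, total = basic_rate_capacity_kbps()
print(f"circuit-switched: {circuit} kbit/s")  # 2 x 64 = 128
print(f"packet-switched:  {packet} kbit/s")   # 16
print(f"total (2B+D):     {total} kbit/s")    # 128 + 16 = 144
```

Under these figures, the two B channels together carry 128 kbit/s and the full 2B+D interface carries 144 kbit/s of user-accessible capacity.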


Bibliographic reference.  Berkley, D. A. / Flanagan, James L. (1990): "Integration of speech recognition, text-to-speech synthesis, and talker verification into a hands-free audio/image teleconferencing system (humanet)", In ICSLP-1990, 861-864.