First International Conference on Spoken Language Processing (ICSLP 90)

Kobe, Japan
November 18-22, 1990

A National Database of Spoken Language: Concept, Design, and Implementation

J. Bruce Millar (1), P. Dermody (2), M. Harrington (3), Julie Vonwiller (4)

(1) Computer Sciences Laboratory, Australian National University
(2) National Acoustic Laboratories, Chatswood, Sydney
(3) Speech, Hearing and Language Research Centre, Macquarie University
(4) School of Electrical Engineering, University of Sydney, Australia

A model is proposed for building a national resource of spoken language data in the form of a cluster of compatible databases. Each component of the cluster will have its own linguistic characteristics, dependent on the primary purpose behind its collection. However, each component corpus will have the same structure and the same standards of data description. The emphasis is on adequate description of the data rather than on conformity to a standard of recording conditions, data storage, or linguistic content. This paper outlines the rationale for such a database and proposes principles for structuring data storage and for describing the important dimensions of such spoken language data. Some attention is also given to the management of such a database within the speech and language technology community.

Bibliographic reference. Millar, J. Bruce / Dermody, P. / Harrington, M. / Vonwiller, Julie (1990): "A national database of spoken language: concept, design, and implementation", in ICSLP-1990, 1281-1284.