First International Conference on Spoken Language Processing (ICSLP 90)

Kobe, Japan
November 18-22, 1990

Multilingual Speech Data Base for Evaluating Quality of Digitized Speech

Hiroshi Irii, Kenzo Ito, Nobuhiko Kitawaki

NTT Telecommunication Networks Labs. and Human Interface Labs., Tokyo, Japan

This paper proposes a multilingual set of speech samples, collected to standardize an artificial voice, as a data base for evaluating the language and talker dependency of digital coding algorithms. The speech recordings are made under uniform conditions by telecommunication laboratories of the telecommunication administrations or operating companies participating in the International Telegraph and Telephone Consultative Committee (CCITT). The data base covers 20 languages. Each language is composed of at least 16 short sentences, and the duration of each sentence is about 8 seconds. To investigate the impartiality of the data base, the fundamental statistical speech characteristics of the speech samples are analyzed. It is confirmed that the results agree with those of previous research when the dispersions are taken into account. The dependency of speech quality on talker and language is investigated by coding this set of speech samples with three typical digital coding algorithms. This set of speech samples reduces the evaluation bias caused by limited speech samples. The speech samples are stored on CD-ROM and are publicly available.

