Third International Conference on Spoken Language Processing (ICSLP 94)
We describe a system designed to recognize the language of an utterance spoken by any native speaker over the telephone. Our previous work based on language-specific phonemes is extended to include sequences of all lengths of language-independent speech units. These units are derived by clustering phonemes across all languages in the system (Hindi, Spanish, English, German, Japanese, and Mandarin). Our language-identification results based on broad-phoneme occurrence statistics indicate 90% accurate distinction between English and Japanese, which is comparable to results obtained when using language-specific phonemes. Relaxing the precision of language-dependent phonemes into language-independent broad phonemes thus retains language-discriminative power. The degree to which this precision can be relaxed, while still retaining sequences of broad phonemes that discriminate between languages, indicates how accurately the phoneme segmenter and recognizer must recognize the incoming speech.
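The idea of broad-phoneme occurrence statistics can be illustrated with a minimal sketch. It is not the authors' implementation: the `BROAD_CLASS` table below is a hand-written, hypothetical mapping (the paper derives its units by clustering phonemes across languages), and the function names `to_broad`, `ngram_counts`, and `relative_frequencies` are assumptions made for illustration only. The sketch maps a phoneme sequence to broad classes and counts sequences of every length up to a chosen maximum.

```python
from collections import Counter

# Hypothetical hand-written mapping from phoneme labels to broad classes.
# The paper obtains its units by clustering; this table is only a stand-in.
BROAD_CLASS = {
    "a": "V", "i": "V", "u": "V", "e": "V", "o": "V",            # vowels
    "p": "S", "t": "S", "k": "S", "b": "S", "d": "S", "g": "S",  # stops
    "s": "F", "z": "F", "f": "F", "h": "F",                      # fricatives
    "m": "N", "n": "N",                                          # nasals
    "r": "L", "l": "L", "w": "L", "y": "L",                      # liquids/glides
}


def to_broad(phonemes):
    """Replace each phoneme label with its broad-class label."""
    return [BROAD_CLASS[p] for p in phonemes]


def ngram_counts(labels, max_n):
    """Count every contiguous label sequence of length 1..max_n."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(labels) - n + 1):
            counts[tuple(labels[i:i + n])] += 1
    return counts


def relative_frequencies(counts):
    """Normalize counts separately for each sequence length."""
    totals = Counter()
    for gram, c in counts.items():
        totals[len(gram)] += c
    return {gram: c / totals[len(gram)] for gram, c in counts.items()}


if __name__ == "__main__":
    broad = to_broad(["k", "a", "t", "a"])
    freqs = relative_frequencies(ngram_counts(broad, max_n=2))
    for gram, f in sorted(freqs.items()):
        print(gram, round(f, 3))
```

Frequency vectors of this kind, computed per utterance, could then serve as input features to any classifier that separates two languages; the choice of classifier is left open here.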
Bibliographic reference. Berkling, Kay M. / Barnard, Etienne (1994): "Language identification of six languages based on a common set of broad phonemes", In ICSLP-1994, 1891-1894.