First International Conference on Spoken Language Processing (ICSLP 90)
A prototype speech-to-text transcription system is described. The system recognizes continuous phrasal speech and transcribes it into Japanese text. This paper outlines the methods used for acoustic and linguistic processing, and describes the system configuration and the results of performance evaluation tests. As a text is spoken phrase by phrase, it is recognized by a word-spotting method based on a continuous dynamic programming technique. High-frequency words and consonant-vowel (CV) units in continuous phrasal speech are detected using established CV and word templates. The CV and word candidates are converted to phrase candidates using a word dictionary, an inflection table, a postpositional word dictionary, a compound word table, and a phrase syntactic pattern table. Frequent phrase co-occurrence patterns are then used to select feasible phrase candidates. A performance evaluation test was carried out on Japanese X-ray CT scanning reports. Conversion accuracies of 80% and 65% were obtained for normal and abnormal medical findings, at input speeds of 100 Chinese characters/minute and 50 Chinese characters/minute, respectively. These input speeds equal those of a professional transcriber and of a novice transcriber after 20 days of training.
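The word-spotting step can be illustrated with a minimal sketch of frame-synchronous template matching with a free start point, in the spirit of continuous dynamic programming. The abstract does not give the system's local path constraints, weights, features, or thresholds, so everything here is an assumption for illustration: the function names `continuous_dp_scores` and `spot_template`, the symmetric three-way recurrence, the Euclidean frame distance, and the toy one-dimensional features.

```python
import math

def continuous_dp_scores(template, frames):
    """Return one matching score per input frame.

    scores[j] is the length-normalized cost of the best alignment of the
    whole template to some stretch of input that ends at frame j.
    The start point is free, so no prior segmentation is needed.
    """
    m = len(template)
    prev = [math.inf] * m          # DP column for the previous input frame
    scores = []
    for x in frames:
        cur = [0.0] * m
        # Free start: the first template frame may align to any input frame.
        cur[0] = math.dist(template[0], x)
        for i in range(1, m):
            best = min(prev[i - 1], prev[i], cur[i - 1])
            cur[i] = math.dist(template[i], x) + best
        scores.append(cur[m - 1] / m)
        prev = cur
    return scores

def spot_template(template, frames, threshold):
    """Return (end_frame, score) pairs at local minima not above threshold."""
    scores = continuous_dp_scores(template, frames)
    hits = []
    for j, s in enumerate(scores):
        left = scores[j - 1] if j > 0 else math.inf
        right = scores[j + 1] if j + 1 < len(scores) else math.inf
        if s <= threshold and s < left and s <= right:
            hits.append((j, s))
    return hits

# Toy data (hypothetical): a rising pattern embedded in background frames.
template = [(0.0,), (1.0,), (2.0,)]
frames = [(5.0,), (5.0,), (0.0,), (1.0,), (2.0,), (5.0,), (5.0,)]
hits = spot_template(template, frames, threshold=0.5)
print(hits)  # [(4, 0.0)]
```

In this toy run the template is detected as ending at frame 4. In a system like the one described, each CV and word template would produce such candidates, which the linguistic stage would then combine into phrase candidates.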
Bibliographic reference. Tsuboi, Toshiaki / Sugamura, Noboru (1990): "A prototype for a speech-to-text transcription system", In ICSLP-1990, 889-892.