INTERSPEECH 2006 - ICSLP
Aiming at practical speech recognition systems, we are developing speech databases representing the situation in which the application is used, and evaluating various techniques using the database. Such methodology is expected to contribute to bridge the expectations of the developers and the reactions of the users. We start with the applications in automotive environments, or car navigation systems more precisely. During the data collection, special attention was paid to maintain the spontaneousness of the speaker, to cover failed utterances, and to use the hardware setup suitable for microphone array techniques. After the database is prepared, various techniques are evaluated. In some cases, oracle information is used to find the upper limit of the improvement of a specific module. In other cases, typical improving algorithms are tested. Recognition experiments using two separate decoders indicate that endpoint detection, feature normalization, speaker adaptation, and parallel decoding are promising fields. We also present some modifications of parallel decoding to reduce the computational cost and to realize practical applications.
Bibliographic reference. Obuchi, Yasunari / Hataoka, Nobuo (2006): "Development and evaluation of speech database in automotive environments for practical speech recognition systems", In INTERSPEECH-2006, paper 1168-Thu1CaP.4.