For all the time invested in meetings, very little of the information exchanged in them is explicitly preserved. In this paper, we propose a novel platform for meeting transcription that uses cellular phones for recognition. Because most meeting participants carry cellular phones, this platform allows meetings to be transcribed wherever they take place, without requiring any additional infrastructure. We introduce the platform and compare three approaches for combining audio from multiple devices: microphone selection at the signal level, microphone selection at the feature level, and combination of decoder outputs via confusion network combination. We evaluated the platform on speech collected in a meeting environment and found that early microphone selection at the signal level obtained a 16% improvement in speech recognition accuracy compared to using a single recording device. Moreover, this approach performed comparably to multi-system confusion network combination while requiring significantly lower computational cost.
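As a rough illustration of what signal-level microphone selection could look like, the sketch below picks, for each fixed-length segment, the device whose recording has the highest energy and stitches the chosen segments into a single stream for the recognizer. The abstract does not state the selection criterion, so the energy measure, the segment length, the assumption of time-aligned channels, and all function names (`segment_energy`, `select_microphone`, `mix_selected`) are hypothetical choices made for this example only.

```python
from typing import List, Sequence


def segment_energy(samples: Sequence[float]) -> float:
    """Mean squared amplitude of one audio segment (0.0 if the segment is empty)."""
    if not samples:
        return 0.0
    return sum(s * s for s in samples) / len(samples)


def select_microphone(channels: List[Sequence[float]], segment_len: int) -> List[int]:
    """Return, for each segment, the index of the channel with the highest energy.

    Assumption: channels are already time-aligned. The energy criterion is
    illustrative and is not taken from the paper.
    """
    if segment_len <= 0:
        raise ValueError("segment_len must be positive")
    if not channels:
        return []
    n = min(len(c) for c in channels)  # only compare the span every device covers
    choices = []
    for start in range(0, n, segment_len):
        end = min(start + segment_len, n)
        energies = [segment_energy(c[start:end]) for c in channels]
        # Ties resolve to the lowest channel index.
        choices.append(max(range(len(channels)), key=lambda i: energies[i]))
    return choices


def mix_selected(channels: List[Sequence[float]], segment_len: int) -> List[float]:
    """Build a single stream from the selected channel of each segment."""
    choices = select_microphone(channels, segment_len)
    if not choices:
        return []
    n = min(len(c) for c in channels)
    out: List[float] = []
    for k, ch in enumerate(choices):
        start = k * segment_len
        end = min(start + segment_len, n)
        out.extend(channels[ch][start:end])
    return out


if __name__ == "__main__":
    # Two toy "phones": the second is louder first, the first is louder later.
    phone_a = [0.1, 0.1, 0.9, 0.9]
    phone_b = [0.5, 0.5, 0.2, 0.2]
    print(select_microphone([phone_a, phone_b], segment_len=2))
    print(mix_selected([phone_a, phone_b], segment_len=2))
```

Selecting a single channel before decoding means only one recognition pass is needed, which is consistent with the lower computational cost reported relative to combining the outputs of several decoders.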
Bibliographic reference. Cossalter, Michele / Sundararajan, Priya / Lane, Ian (2011): "Ad-hoc meeting transcription on clusters of mobile devices", In INTERSPEECH-2011, 2881-2884.