First Workshop on Speech, Language and Audio in Multimedia (SLAM 2013)

Marseille, France
August 22-23, 2013

Automatic Speech Recognition in the BBC

Sam Davies

BBC R&D, London, UK

In this talk we will present an overview of our work on the BBC’s World Service Archive. This project uses automatic speech recognition and a novel technique for topic identification and disambiguation from noisy transcripts to enable automatic semantic tagging of programmes. Such processing is hard to scale across large archives, so we will also present some of our work towards tackling this issue. We will then present our research towards classifying the BBC’s entire archive of radio and television programmes broadcast since 1922. This project aims to classify programmes by creating new metadata from an analysis of programme content. This part of the talk will focus primarily on our work on identifying semantic content from audio (including music), along with a brief overview of our work on affective indexing. Here we will introduce our multimodal work on identifying the mood or emotional component of a programme, focussing on mood identification from music, speech and non-speech sounds.

Biography. Sam Davies joined BBC R&D in 2007, working on a variety of projects including high frame rate television, object tracking in sporting events and image recognition. Since 2009 he has been working on the Multimedia Classification project, which has been identifying new techniques for metadata generation from audio and video content in the BBC archive. This work has resulted in prototypes which offer unique ways to analyse the semantic and affective, or emotional, content of audio, visual and text documents.

Bibliographic reference.  Davies, Sam (2013): "Automatic speech recognition in the BBC", In SLAM-2013 (abstract).