12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

A Robust Approach to Mining Repeated Sequence in Audio Stream

Jiansong Chen, Lei Zhu, Bailan Feng, Peng Ding, Bo Xu

Chinese Academy of Sciences, China

In multimedia streams, repeated sequences such as commercials and jingles usually carry potentially significant information. Mining repeated sequences is therefore an important approach to analyzing multimedia content. This paper reports on a robust unsupervised technique for discovering repeated sequences in audio streams. Unlike earlier work, our approach transforms the repeated sequence detection task into a Hidden Markov Model (HMM) decoding problem in a similarity trellis. To tolerate the false and missing matches that occur in real applications, we present a soft definition of repeated sequence, termed the maximal loosely repeated sequence (MLRS), as the detection objective, and use a Viterbi-like algorithm to mine all the MLRSs in the stream. In addition, we propose a novel metric for evaluating repeated sequence detection algorithms. Experiments on both simulated data and real broadcast data demonstrate the effectiveness of our method.

Bibliographic reference.  Chen, Jiansong / Zhu, Lei / Feng, Bailan / Ding, Peng / Xu, Bo (2011): "A robust approach to mining repeated sequence in audio stream", In INTERSPEECH-2011, 2277-2280.