Third International Conference on Spoken Language Processing (ICSLP 94)
This paper describes a technique for automatically locating emphasized segments of a speech recording based on pitch. These salient segments can be used in a variety of applications; the technique was originally designed for an interactive system that enables high-speed skimming and browsing of speech recordings. Previous techniques for detecting emphasis have used Hidden Markov Models, and emphasized regions in close temporal proximity were found to create useful summaries of the recordings. The research described here presents a simpler technique that detects salient segments and summarizes a recording without statistical models that require large amounts of training data. The algorithm adapts to the pitch range of a speaker, then automatically selects the regions of highest pitch activity as a measure of emphasis.
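The last sentence of the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the fixed-length windows, the `top_fraction` and `n_best` parameters, and the convention that unvoiced frames carry a pitch of 0 are all assumptions made for illustration. The threshold is taken from the speaker's own pitch distribution, and windows are ranked by how many frames exceed it.

```python
from typing import List, Tuple


def adaptive_threshold(f0: List[float], top_fraction: float = 0.1) -> float:
    """Pitch value at or above which roughly `top_fraction` of voiced frames lie.

    Assumption: unvoiced frames are encoded as 0 and are ignored.
    """
    voiced = sorted(v for v in f0 if v > 0)
    if not voiced:
        raise ValueError("no voiced frames in input")
    cut = min(int(len(voiced) * (1.0 - top_fraction)), len(voiced) - 1)
    return voiced[cut]


def emphasized_windows(
    f0: List[float], window: int, top_fraction: float = 0.1, n_best: int = 3
) -> List[Tuple[int, int]]:
    """Return up to `n_best` (start_frame, count) pairs, in time order.

    `count` is the number of frames in the window whose pitch reaches the
    speaker-adapted threshold; windows with a count of 0 are dropped.
    """
    threshold = adaptive_threshold(f0, top_fraction)
    scored = []
    for start in range(0, len(f0), window):
        count = sum(1 for v in f0[start:start + window] if v >= threshold)
        scored.append((count, start))
    # Highest pitch activity first; earlier windows win ties.
    scored.sort(key=lambda s: (-s[0], s[1]))
    best = [(start, count) for count, start in scored[:n_best] if count > 0]
    return sorted(best)


# Hypothetical per-frame pitch track in Hz (0 = unvoiced).
pitch_track = [0, 100, 105, 110, 0, 100, 180, 190, 185, 0,
               100, 102, 0, 0, 0, 120, 125, 175, 0, 0]
salient = emphasized_windows(pitch_track, window=5, top_fraction=0.25, n_best=2)
```

Because the threshold is derived from each recording's own voiced frames, scaling every pitch value by a constant (a speaker with a higher or lower range) selects the same windows in this sketch.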
Bibliographic reference. Arons, Barry (1994): "Pitch-based emphasis detection for segmenting speech recordings", In ICSLP-1994, 1931-1934.