INTERSPEECH 2011
12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Single-Channel Head Orientation Estimation Based on Discrimination of Acoustic Transfer Function

Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki

Kobe University, Japan

This paper presents a method for estimating a talker's head orientation using only a single microphone. Phoneme HMMs (Hidden Markov Models) of clean speech are used to separate, from the observed speech, the acoustic transfer function that depends on the user's position and head orientation. The frame sequence of the acoustic transfer function is estimated by maximizing the likelihood of training data uttered from a given position with a given head orientation. Using the separated frame sequences, a Support Vector Machine (SVM) is trained in advance to discriminate the user's position and head orientation. Then, for each test utterance, the frame sequence of the acoustic transfer function is separated by maximum likelihood estimation using the label sequence obtained from phoneme recognition, and the user's position and head orientation are estimated by discriminating the separated acoustic transfer function with the SVM. The effectiveness of this method has been confirmed by talker localization and head orientation estimation experiments performed in a real environment.
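
The two-stage flow described in the abstract (separate the transfer function, then discriminate it) can be illustrated with a toy sketch. This is not the authors' implementation, and several things here are assumptions made only for illustration: the observation is modeled as O_t = S_t + H_t in a cepstrum-like domain, each phoneme label has a single fixed-variance Gaussian (so the maximum likelihood estimate reduces to subtracting the clean-speech mean), each utterance is summarized by the mean of its frames, and a nearest-centroid rule stands in for the SVM. All function names and numeric values are hypothetical.

```python
# Toy sketch: all names, values, and the additive model are illustrative assumptions.

def separate_transfer_function(observed, labels, clean_means):
    """Estimate H_t = O_t - mu_{q_t} for each frame.

    Assumes O_t = S_t + H_t in a cepstrum-like domain and one fixed-variance
    Gaussian per phoneme label, so the ML estimate reduces to a subtraction.
    """
    return [
        [o - m for o, m in zip(frame, clean_means[label])]
        for frame, label in zip(observed, labels)
    ]


def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(col) for col in zip(*vectors)]


def train_centroids(training_set, clean_means):
    """Build one centroid per class from (observed, labels, class) utterances."""
    by_class = {}
    for observed, labels, cls in training_set:
        h = separate_transfer_function(observed, labels, clean_means)
        by_class.setdefault(cls, []).append(mean_vector(h))
    return {cls: mean_vector(feats) for cls, feats in by_class.items()}


def classify(observed, labels, clean_means, centroids):
    """Nearest-centroid decision (a stand-in for the SVM in the abstract)."""
    feature = mean_vector(
        separate_transfer_function(observed, labels, clean_means)
    )

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(centroids, key=lambda cls: sq_dist(feature, centroids[cls]))


# Hypothetical clean-speech means for two phoneme labels (2-dim features).
clean_means = {"a": [1.0, 0.0], "i": [0.0, 1.0]}

# Hypothetical training utterances, one per head orientation.
training_set = [
    ([[1.5, 0.5], [0.5, 1.5]], ["a", "i"], "front"),     # H = [0.5, 0.5]
    ([[0.5, 0.25], [-0.5, 1.25]], ["a", "i"], "back"),   # H = [-0.5, 0.25]
]
centroids = train_centroids(training_set, clean_means)

# Test utterance; in the described method the labels come from phoneme recognition.
predicted = classify([[0.5, 1.5], [1.5, 0.5]], ["i", "a"], clean_means, centroids)
```

In this sketch the class labels are head orientations only; the abstract discriminates position and head orientation together, which would correspond to using one class per position-orientation pair.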

Bibliographic reference. Takashima, Ryoichi / Takiguchi, Tetsuya / Ariki, Yasuo (2011): "Single-channel head orientation estimation based on discrimination of acoustic transfer function", in INTERSPEECH-2011, 2721-2724.