Ninth International Conference on Spoken Language Processing

Pittsburgh, PA, USA
September 17-21, 2006

High-Rate Data Embedding in Unvoiced Speech

Konrad Hofbauer (1), Gernot Kubin (2)

(1) EEC, France; (2) Graz University of Technology, Austria

We propose a blind speech watermarking algorithm which allows high-rate embedding of digital side information into speech signals. We exploit the fact that the well-known LPC vocoder works very well for unvoiced speech. Using an autocorrelation-based pitch tracking algorithm, a voiced/unvoiced segmentation is carried out. In the unvoiced segments, the linear prediction residual is replaced by a data sequence. This substitution does not cause perceptual degradation as long as the residual's power is matched. The signal is resynthesised using the unmodified LPC filter coefficients. The watermark is decoded by a linear prediction analysis of the received signal and the information is extracted from the sign of the residual. The watermark is nearly imperceptible and provides a channel capacity of up to 2000 bit/s in an 8 kHz-sampled speech signal.
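The embedding and decoding steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a single segment that has already been classified as unvoiced (the pitch-tracking segmentation is omitted), one bit per sample, an LPC order of 10, and a toy autoregressive noise signal standing in for real speech. All function names (autocorr, levinson, residual, synthesise, embed, extract) are hypothetical.

```python
import math
import random

def autocorr(x, max_lag):
    # Biased autocorrelation estimate for lags 0..max_lag.
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def levinson(r, order):
    # Levinson-Durbin recursion. Returns predictor coefficients a[0..order-1]
    # such that s[n] is predicted as sum(a[k] * s[n-1-k]).
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:]

def residual(s, a):
    # Inverse (analysis) filter: prediction error of s under coefficients a.
    return [s[n] - sum(a[k] * s[n - 1 - k]
                       for k in range(len(a)) if n - 1 - k >= 0)
            for n in range(len(s))]

def synthesise(e, a):
    # All-pole (synthesis) filter driven by the excitation e.
    s = []
    for n in range(len(e)):
        pred = sum(a[k] * s[n - 1 - k]
                   for k in range(len(a)) if n - 1 - k >= 0)
        s.append(e[n] + pred)
    return s

def embed(s, bits, order=10):
    # Replace the LPC residual with a +/-gain data sequence whose power
    # matches the original residual, then resynthesise with the same filter.
    a = levinson(autocorr(s, order), order)
    e = residual(s, a)
    gain = math.sqrt(sum(v * v for v in e) / len(e))
    data = [gain if b else -gain for b in bits]
    return synthesise(data, a), a

def extract(s_wm, order=10, a=None):
    # Blind decoding: re-estimate the LPC filter from the received signal
    # (unless coefficients are supplied) and read the sign of the residual.
    if a is None:
        a = levinson(autocorr(s_wm, order), order)
    return [1 if v > 0 else 0 for v in residual(s_wm, a)]

# Toy stand-in for an unvoiced segment (assumption): AR(2)-shaped noise.
random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(800)]
speech = synthesise(noise, [0.5, -0.3])
bits = [random.getrandbits(1) for _ in range(len(speech))]

marked, coeffs = embed(speech, bits)
blind_bits = extract(marked)            # decoder re-estimates the filter
ber = sum(b != r for b, r in zip(bits, blind_bits)) / len(bits)
print("blind bit error rate:", ber)
```

In this sketch the decoder only sees the watermarked signal, so its re-estimated coefficients can differ slightly from the embedder's and occasional bit errors are possible. Passing the embedder's coefficients via the hypothetical a argument removes that mismatch and is useful only as a sanity check.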


Bibliographic reference.  Hofbauer, Konrad / Kubin, Gernot (2006): "High-rate data embedding in unvoiced speech", In INTERSPEECH-2006, paper 1906-Mon1FoP.10.