EUROSPEECH 2001 Scandinavia
7th European Conference on Speech Communication and Technology

Aalborg, Denmark
September 3-7, 2001


Combined Speech and Audio Coding with Bit Rate and Bandwidth Scalability

Maria Farrugia, Ahmet M. Kondoz

University of Surrey, UK

The growing demand for streaming multimedia services over the Internet, and more recently over mobile networks, has generated great interest in coding algorithms that can adapt to different transmission environments and operate under multiple constraints: bit rate, complexity, delay, robustness to bit errors, and diversity of input signals. In light of these developments, we present a novel low-delay scalable representation for speech and audio signals. The algorithm operates in four modes, each based on backward-adaptive linear predictive coding (BA LPC). The first mode is the baseline narrowband (0–4 kHz) coder. The second mode efficiently represents wideband speech and audio signals (0–8 kHz) by using a quadrature mirror filter (QMF) to split the spectrum into two equal bands. The remaining two modes use a two-stage QMF structure to decompose the bandwidth of signals sampled at 32 kHz into four bands. Scalability is achieved through discrete quantisation layers that represent successive levels of enhancement for each band, and through flexibility in complexity and bit allocation, which can be matched to the particular application and the available network resources. The resulting bit rates range from 12 to 64 kb/s. The performance of the coder is evaluated by comparing it with MPEG and ITU standards.
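
The band-splitting idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which QMF filters are used, so the sketch assumes the shortest possible QMF pair (the 2-tap Haar filters) and assumes that the two-stage structure splits both half-bands again. The function names `qmf_split`, `qmf_merge`, `split_four` and `merge_four` are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

def qmf_split(x):
    """One QMF stage: split x into low and high bands at half the sample rate.

    Assumption: 2-tap Haar filters, chosen only because they are the
    simplest pair satisfying the QMF mirror relation h1[n] = (-1)^n h0[n].
    """
    if len(x) % 2:
        raise ValueError("signal length must be even")
    low = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return low, high

def qmf_merge(low, high):
    """Inverse of qmf_split: interleave the two bands back into one signal."""
    x = []
    for l, h in zip(low, high):
        x.append((l + h) / SQRT2)
        x.append((l - h) / SQRT2)
    return x

def split_four(x):
    """Two-stage split into four bands (assumed: both half-bands split again)."""
    low, high = qmf_split(x)
    return qmf_split(low) + qmf_split(high)  # tuple of four band signals

def merge_four(bands):
    """Inverse of split_four."""
    ll, lh, hl, hh = bands
    return qmf_merge(qmf_merge(ll, lh), qmf_merge(hl, hh))

# Usage: 16 samples in, four bands of 4 samples each, exact reconstruction.
signal = [math.sin(0.1 * n) for n in range(16)]
bands = split_four(signal)
rebuilt = merge_four(bands)
```

With this filter pair the split is orthogonal, so the four bands together carry the same energy as the input and the merge step recovers it up to rounding error. A practical coder would use longer filters for sharper band separation, and would quantise each band separately before merging.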

Bibliographic reference. Farrugia, Maria / Kondoz, Ahmet M. (2001): "Combined speech and audio coding with bit rate and bandwidth scalability", in EUROSPEECH-2001, 2307-2310.