ISCA Archive Eurospeech 1997 Sessions Booklet

5th European Conference on Speech Communication and Technology

Rhodes, Greece
22-25 September 1997

General Chair: George Kokkinakis

Acoustic Modelling


Using multiple time scales in a multi-stream speech recognition system
Stéphane Dupont, Hervé Bourlard

Speech recognition using HMM-state confusion characteristics
Yumi Wakita, Harald Singer, Yoshinori Sagisaka

Bottom-up and top-down state clustering for robust acoustic modeling
Cristina Chesta, Pietro Laface, Franco Ravera

Comparison of optimization methods for discriminative training criteria
Ralf Schlüter, W. Macherey, S. Kanthak, Hermann Ney, Lutz Welling

Clustering beyond phoneme contexts for speech recognition
Clark Z. Lee, Douglas O'Shaughnessy

Influence of outliers in training the parametric trajectory models for speech recognition
Rathinavelu Chengalvarayan

Incorporating linguistic knowledge and automatic baseform generation in acoustic subword unit based speech recognition
Trym Holter, Torbjorn Svendsen

Modelling and decoding of crossword context dependent phones in the Philips large vocabulary continuous speech recognition system
Peter Beyerlein, Meinhard Ullrich, Patricia Wilcox

Modelling inter-frame dependence with preceding and succeeding frames
Philip Hanna, Ji Ming, Peter O'Boyle, F. Jack Smith

Continuous speech recognition using syllables
Rhys James Jones, Simon Downey, John S. Mason

A new approach to generalized mixture tying for continuous HMM-based speech recognition
Daniel Willett, Gerhard Rigoll

State tying for context dependent phoneme models
Klaus Beulen, Elmar Bransch, Hermann Ney

A novel node splitting criterion in decision tree construction for semi-continuous HMMs
Jacques Duchateau, Kris Demuynck, Dirk Van Compernolle

Creating unseen triphones by phone concatenation in the spectral, cepstral and formant domains
Mats Blomberg

Creating large subword units for speech recognition
Thilo Pfau, Manfred Beham, W. Reichl, Günther Ruske

Segmental modeling using a continuous mixture of non-parametric models
Jacob Goldberger, David Burshtein, Horacio Franco

Segmentation and modeling in segment-based recognition
Jane W. Chang, James R. Glass

Using syllables in a hybrid HMM-ANN recognition system
Alfred Hauenstein

Noise robust segment-based word recognition using vector quantisation
Ramalingam Hariharan, Juha Hakkinen, Kari Laurila, Janne Suontausta

Viterbi based splitting of phoneme HMMs
Luis Javier Rodriguez, Ines M. Torres

The demiphone: an efficient subword unit for continuous speech recognition
José B. Marino, Albin Nogueiras, Antonio Bonafonte

Organizing phone models based on piecewise linear segment lattices of speech samples
Hiroaki Kojima, Kazuyo Tanaka

Automatic architecture design by likelihood-based context clustering with crossvalidation
Ivica Rogina

Towards articulatory speech recognition: learning smooth maps to recover articulator information
Sam Roweis, Abeer Alwan

Selection of the most effective set of subword units for an HMM-based speech recognition system
Anastasios Tsopanoglou, Nikos Fakotakis

Multi-band continuous speech recognition
Christophe Cerisara, Jean-Paul Haton, Jean-Francois Mari, Dominique Fohr

The design of acoustic parameters for speaker-independent speech recognition
Nabil N. Bitar, Carol Y. Espy-Wilson





Training Techniques; Efficient Decoding in ASR


Acoustic modeling based on the MDL principle for speech recognition
Koichi Shinoda, Takao Watanabe

Discriminative utterance verification using multiple confidence measures
Piyush Modi, Mazin Rahim

Subspace distribution clustering for continuous observation density hidden Markov models
Enrico Bocchieri, Brian Mak

A comparative study of methods for phonetic decision-tree state clustering
H. J. Nock, M. J. F. Gales, Steve J. Young

Comparing Gaussian and polynomial classification in SCHMM-based recognition systems
Alfred Kaltenmeier, Jürgen Franke

Maximum likelihood successive state splitting algorithm for tied-mixture HMNET
Alexandre Girardi, Harald Singer, Kiyohiro Shikano, Satoshi Nakamura

String-level MCE for continuous phoneme recognition
Erik McDermott, Shigeru Katagiri

HMM state clustering across allophone class boundaries
Ze'ev Rivlin, Ananth Sankar, Harry Bratt

Weighted determinization and minimization for large vocabulary speech recognition
Mehryar Mohri, Michael Riley

Parallel speech recognition
Steven Phillips, Anne Rogers

Fast likelihood computation methods for continuous mixture densities in large vocabulary speech recognition
Stefan Ortmanns, Thorsten Firzlaff, Hermann Ney

A static lexicon network representation for cross-word context dependent phones
Kris Demuynck, Jacques Duchateau, Dirk van Compernolle

Decision-tree based quantization of the feature space of a speech recognizer
Mukund Padmanabhan, L. R. Bahl, D. Nahamoo, Pieter de Souza

Sub-vector clustering to improve memory and speed performance of acoustic likelihood computation
Mosur Ravishankar, R. Bisiani, E. Thayer

The incorporation of path merging in a dynamic network recogniser
Simon Hovell

Improvement on connected digits recognition using duration constraints in the asynchronous decoding scheme
Miroslav Novak

Explicit word error minimization in n-best list rescoring
Andreas Stolcke, Yochai Konig, Mitchel Weintraub

Efficient 2-pass n-best decoder
Long Nguyen, Richard Schwartz

A memory management method for a large word network
Tomohiro Iwasaki, Yoshiharu Abe


Prosody


Persistence of prosodic features between dialectal and standard Italian utterances in six sub-varieties of a region of southern Italy (Salento): first assessments of the results of a recognition test and an instrumental analysis
Antonio Romano

Improving the phonetic annotation by means of prosodic phrasing
Halewijn Vereecken, Annemie Vorstermans, Jean-Pierre Martens, Bert van Coile

A descriptive study of prosodic phenomena in Mpur (West Papuan Phylum)
Cecilia Ode

Automated quantitative analysis of F0 contours of utterances from a German ToBI-labeled speech database
Hansjörg Mixdorff, Hiroya Fujisaki

Identification and automatic generation of prosodic contours for a text-to-speech synthesis system in French
Stéphanie de Tournemire

Quantitative analysis and formulation of tone concatenation in Chinese F0 contours
Jin-Fu Ni, Ren-Hua Wang, Keikichi Hirose

An environment for the labelling and testing of melodic aspects of speech
Christel Brindöpke, Arno Pahde, Franz Kummert, Gerhard Sagerer

PROPAUSE: a syntactico-prosodic system designed to assign pauses
David Casacuberta, Lourdes Aguilar, Rafael Marin

Integrated dialog act segmentation and classification using prosodic features and language models
Volker Warnke, Ralf Kompe, Heinrich Niemann, Elmar Nöth

Evaluation of prosodic characteristics in retold stories in Dutch by means of semantic scales
Monique E. van Donzel, Florien J. Koopmans-van Beinum

Text-to-intonation in spontaneous Swedish
Gosta Bruce, Marcus Filipsson, Johan Frid, Björn Granström, Kjell Gustafson, Merle Horne, David House

Synthesising attitudes with global rhythmic and intonation contours
Yann Morlec, Gérard Bailly, Véronique Auberge

Prosody-particle pairs as discourse control signs
Dafydd Gibbon, Claudia Sassen

Focus detection with additional information of phrase boundaries and sentence mode
Anja Elsner

The role of prosody in infants' native-language discrimination abilities: the case of two phonologically close languages
Laura Bosch, Nuria Sebastian-Galles

Prosodic cycles and interpersonal synchrony in American English and Swedish
Eugene H. Buder, Anders Eriksson

Relating prosody to syntax: boundary signalling in Swedish
Eva Strangert

On representation of fundamental frequency of speech for prosody analysis using reliability function
Mitsuru Nakai, Hiroshi Shimodaira

Efficient method of establishing words tone dictionary for Korean TTS system
Seong-Hwan Kim, Jin-Young Kim

Perception of questions and statements in Neapolitan Italian
Mariapaola D'Imperio, David House



Robustness in Recognition and Signal Processing


Cyclic autocorrelation-based linear prediction analysis of speech
Kuldip K. Paliwal, Yoshinori Sagisaka

Novel filler acoustic models for connected digit recognition
Ilija Zeljkovic, Shrikanth Narayanan

A non-iterative model-adaptive e-CMN/PMC approach for speech recognition in car environments
Makoto Shozakai, Satoshi Nakamura, Kiyohiro Shikano

Discriminative feature extraction for speech recognition in noise
Angel de la Torre, Antonio M. Peinado, Antonio J. Rubio, Pedro Garcia

Noise robust recognition using feature selective modeling
Michael K. Brendborg, Borge Lindberg

Mixture input transformations for adaptation of hybrid connectionist speech recognizers
Victor Abrash

Adaptation of time differentiated cepstrum for noisy speech recognition
Tai-Hwei Hwang, Lee-Min Lee, Hsiao-Chuan Wang

On the importance of various modulation frequencies for speech recognition
Noboru Kanedera, Takayuki Arai, Hynek Hermansky, Misha Pavel

A robust RNN-based pre-classification for noisy Mandarin speech recognition
Wei-Tyng Hong, Sin-Horng Chen

A parallel environment model (PEM) for speech recognition and adaptation
Mazin Rahim

Adaptive model combination for robust speech recognition in car environments
Volker Schless, Fritz Class

A comparative study of speech detection methods
Stefaan Van Gerven, Fei Xie

Voice activity detection using source separation techniques
Nikos Doukas, Patrick Naylor, Tania Stathaki

Voice activity detection using source separation techniques
Tomohiko Taniguchi, Shoji Kajita, Kazuya Takeda, Fumitada Itakura

Multiresolution channel normalization for ASR in reverberant environments
Carlos Avendano, Sangita Tibrewala, Hynek Hermansky

A speech pre-processing technique for end-point detection in highly non-stationary environments
Rafael Martinez, Agustin Alvarez, Pedro Gomez Vilda, Mercedes Perez, Victor Nieto, Victoria Rodellar

Application of several channel and noise compensation techniques for robust speaker recognition
Laura Docio-Fernandez, Carmen Garcia-Mateo

Knowing the wheat from the weeds in noisy speech
Hany Agaiby, Thomas J. Moir

Model-based approach for robust speech recognition in noisy environments with multiple noise sources
Do Yeong Kim, Nam Soo Kim, Chong Kwan Un

Normalization of speaker variability by spectrum warping for robust speech recognition
Y.C. Chu, Charlie Jie, Vincent Tung, Ben Lin, Richard Lee

LPC poles tracker for music/speech/noise segmentation and music cancellation
Stephane H. Maes

Comparative evaluations of several front-ends for robust speech recognition
Doh-Suk Kim, Jae-Hoon Jeong, Soo-Young Lee, Rhee M. Kil

Speaker normalization through formant-based warping of the frequency scale
Evandro B. Gouvea, Richard M. Stern

The use of cepstral means in conversational speech recognition
Martin Westphal

Compensation for environmental and speaker variability by normalization of pole locations
Juan M. Huerta, Richard M. Stern

Cellular phone speech recognition: noise compensation vs. robust architectures
Jean-Baptiste Puel, Régine André-Obrecht

Speech recognition in noise using on-line HMM adaptation
Tung-Hui Chiang






Feature Estimation, Pitch, and Prosody


Acoustic parameters optimised for recognition of phonetic features
Anya Varnich Hansen

Heterogeneous acoustic measurements for phonetic classification
Andrew K. Halberstadt, James R. Glass

Cepstral-time matrices and LDA for improved connected digit and sub-word recognition accuracy
Ben Milner

Data-driven design of RASTA-like filters
Sarel van Vuuren, Hynek Hermansky

Evaluating feature set performance using the f-ratio and j-measures
Simon Nicholson, Ben Milner, Stephen Cox

Robust speech parameters located in the frequency domain
Javier Hernando, Climent Nadeu

A modified zero-crossing method for pitch detection in presence of interfering sources
Francois Gaillard, Frederic Berthommier, Gang Feng, Jean-Luc Schwartz

Using simulated annealing expectation maximization algorithm for hidden Markov model parameters estimation
Jacques Simonin, Chafic Mokbel

Covariation of subglottal pressure, F0 and glottal parameters
Gunnar Fant, Stellan Hertegard, Anita Kruckenberg, Johan Liljencrants

The fractal behaviour of unvoiced plosives: a means for classification
Anastasios Delopoulos, Maria Rangoussi

A method for analysis of the local speech rate using an inventory of reference units
Sumio Ohno, Hiroya Fujisaki, Hideyuki Taguchi

Analysis and modeling of fundamental frequency contours of Greek utterances
Hiroya Fujisaki, Sumio Ohno, Takashi Yagi

Characteristics of slow, average and fast speech and their effects in large vocabulary continuous speech recognition
Fernando Martinez, Daniel Tapias, Jorge Alvarez, Paloma Leon

Analysis of children's speech: duration, pitch and formants
Sungbok Lee, Alexandros Potamianos, Shrikanth Narayanan

A method of measuring formant frequencies at high fundamental frequencies
Hartmut Traunmüller, Anders Eriksson

Analysis of speaking rate variations in stress-timed languages
Tom Brondsted, Jens Printz Madsen

Automatic identification of phoneme boundaries using a mixed parameter model
Paul Micallef, Ted Chilton

Pitch detection reliability assessment for forensic applications
Serguei Koval, Veronika Bekasova, Michael Khitrov, Andrey Raev

Efficient estimation of perceptual features for speech recognition
Zhihong Hu, Etienne Barnard

Towards decomposing the sources of variability in speech
Narendranath Malayath, Hynek Hermansky, Alexander Kain

Use of vector-valued dynamic weighting coefficients for speech recognition: maximum likelihood approach
Rathinavelu Chengalvarayan

Automatic segmentation: data-driven units of speech
S. W. Beet, L. Baghai-Ravary

On robust time-varying AR speech analysis based on t-distribution
Dejan Bajic

A simple phoneme energy model for the Greek language and its application to speech recognition
Dimitris Tambakas, Iliana Tzima, Nikos Fakotakis, George Kokkinakis

A macroscopic analysis of an emotional speech corpus
James E. H. Noad, Sandra P. Whiteside, Phil Green

Restoration of pitch pattern of speech based on a pitch generation model
Hiroshi Shimodaira, Mitsuru Nakai, Akihiro Kumata

The research of correlation between pitch and skin galvanic reaction at change of human emotional state
A. V. Agranovski, O. Y. Berg, D. A. Lednov

K-NN versus Gaussian in HMM-based recognition system
Claude Montacié, Marie-José Caraty, Fabrice Lefèvre

Spectral methods for voice source parameters estimation
Boris Doval, Christophe d'Alessandro, Benoit Diard


Speech Coding


A simple and efficient algorithm for the compression of MBROLA segment databases
Olivier van der Vrecken, Nicolas Pierret, Thierry Dutoit, Vincent Pagel, Fabrice Malfrere

A segmental formant vocoder based on linearly varying mixture of Gaussians
Parham Zolfaghari, Tony Robinson

Voice mimic system using an articulatory codebook for estimation of vocal tract shape
Samir Chennoukh, Daniel Sinder, Gael Richard, James L. Flanagan

Adaptive transform coding for linear predictive residual
Damith J. Mudugamuwa, Alan B. Bradley

Performance evaluation of objective quality measures for coded speech
Akira Takahashi, Nobuhiko Kitawaki, Paolino Usai, David Atkinson

Between recognition and synthesis - 300 bits/second speech coding
Mohamed Ismail, Keith Ponting

High quality split-band LPC vocoder and its fixed point real time implementation
Stephane Villette, Milos Stefanovic, Ian Atkinson, Ahmet Kondoz

Missing packet recovery techniques for DM coded speech
Wen-Whei Chang, Hwai-Tsu Chang, Wan-Yu Meng

Spectral sensitivity of LSP parameters and their transformed coefficients
Hai Le Vu, Laszlo Lois

Reducing the complexity of the LPC vector quantizer using the k-d tree search algorithm
V. Ramasubramanian, Kuldip K. Paliwal

Quantization using wavelet based temporal decomposition of the LSF
Aweke N. Lemma, W. Bastiaan Kleijn, Ed F. Deprettere

A novel 1.7/2.4 kb/s DCT based prototype interpolation speech coding system
Costas S. Xydeas, Gokhan H. Ilk

Improved regular pulse VSELP coding of speech at low bit-rates
Yong-Soo Choi, Hong-Goo Kang, Sang-Wook Park, Jae-Ha Yoo, Dae-Hee Youn

Joint estimation of pitch, band magnitudes, and V/UV decisions for MBE vocoder
Yong Duk Cho, Hong Kook Kim, Moo Young Kim, Sang Ryong Kim

A new distance measure in LPC coding: application for real time situations
Balazs Kovesi, Samir Saoudi, Jean Marc Boucher, Gábor Horvath

Consideration of processing strategies for very-low-rate compression of wideband speech signals with known text transcription
Peter Veprek, Alan B. Bradley

Zero-redundancy error protection for CELP speech codecs
Norbert Görtz

Low bit rate speech coding using an improved HSX model
Ridha Matmti, Milan Jelinek, Jean-Pierre Adoul

Phonetic vocoding with speaker adaptation
Carlos M. Ribeiro, Isabel Trancoso

Quantization of spectral sequences using variable length spectral segments for speech coding at very low bit rate
Geneviève Baudoin, Jan Cernocky, Gérard Chollet

On modeling event functions in temporal decomposition based speech coding
Shahrokh Ghaemmaghami, Mohamed Deriche, Boualem Boashash

Phase quantization by pitch-cycle waveform coding in low bit rate sinusoidal coders
Soledad Torres, F. Javier Casajús-Quirós

A perceptual study of the Greek vowel space using synthetic stimuli
Antonis Botinis, Marios Fourakis, John W. Hawks

Mixed multi-band excitation coder using frequency domain mixture function (FDMF) for a low-bit rate speech coding
Woo-Jin Han, Sung-Joo Kim, Yung-Hwan Oh

Robust GSM speech decoding using the channel decoder's soft output
Tim Fingscheidt, Olaf Scheufen

A low-bit-rate speech coder using adaptive line spectral frequency prediction
Carl W. Seymour, Tony A. Robinson


Speech Synthesis Techniques


Optimising unit selection with voice source and formants in the CHATR speech synthesis system
Wen Ding, Nick Campbell

A new framework to provide high-controllability speech signal and the development of a workbench for it
Masanobu Abe, Hideyuki Mizuno, Satoshi Takahashi, Shin'ya Nakajima

Shape-invariant prosodic modification algorithm for concatenative text-to-speech synthesis
Eduardo R. Banga, Carmen Garcia-Mateo, Xavier Fernandez-Salgado

An RNN-based spectral information generation for Mandarin text-to-speech
Shaw-Hwa Hwang, Sin-Horng Chen, Saga Chang

Methods for optimal text selection
Jan P. H. van Santen, Adam L. Buchsbaum

High resolution prosody modification for speech synthesis
Francisco M. Gimenez de los Galanes, David Talkin

Text-to-speech conversion with neural networks: a recurrent TDNN approach
Orhan Karaali, Gerald Corrigan, Ira Gerson, Noel Massey

Data driven formant synthesis
Jesper Högberg

Speech synthesis using non-uniform units in the Verbmobil project
Simon King, Thomas Portele, Florian Höfer

On the pronunciation mode of acronyms in several European languages
Isabel Trancoso, M. Ceu Vianna

Evaluation of speech synthesis systems for Dutch in tele-communication applications in GSM and PSTN networks
Toni Rietveld, Joop Kerkhoff, M. J. W. M. Emons, E.J. Meijer, Angelien A. Sanderman, Agaath M. C. Sluijter

Automatic diphone extraction for an Italian text-to-speech synthesis system
Bianca Angelini, Claudia Barolo, Daniele Falavigna, Maurizio Omologo, Stefano Sandri

Simplification of TTS architecture vs. operational quality
Eric Keller

Felix - a TTS system with improved pre-processing and source signal generation
Georg Fries, Antje Wirth

Investigating the limitations of concatenative synthesis
Mike Edgington

Speech coding and synthesis using parametric curves
Luis Miguel Teixeira de Jesus, Gavin C. Cawley

Automatically clustering similar units for unit selection in speech synthesis
Alan W. Black, Paul Taylor

Improvements on a trainable letter-to-sound converter
Li Jiang, Hsiao-Wuen Hon, Xuedong Huang

On a cepstral pitch alteration technique for prosody control in the speech synthesis system with high quality
Myungjin Bae, Kyuhong Kim, Woncheol Lee

Diphone concatenation using a harmonic plus noise model of speech
Yannis Stylianou, Thierry Dutoit, Juergen Schroeter


Technology for S&L Acquisition, Speech Processing Tools


The "sketchboard": a dynamic interpretative memory and its use for spoken language understanding
Gérard Sabah

Speech technology integration and research platform: a system study
Qiru Zhou, Chin-Hui Lee, Wu Chou, Andrew Pargellis

Speech recognition on SPHERIC - an IC for command and control applications
Dieter Geller, Markus Lieb, Wolfgang Budde, Oliver Muelhens, Manfred Zinke

MUSE: a scripting language for the development of interactive speech analysis and recognition tools
Michael K. McCandless, James R. Glass

Language learning based on non-native speech recognition
Silke Witt, Steve J. Young

Task modelling by sentence templates
Ute Kilian, Klaus Bader

Extraction and representation of rhythmic components of spontaneous speech
Shigeyoshi Kitazawa, Hideya Ichikawa, Satoshi Kobayashi, Yukihiro Nishinuma

Automatic pronunciation scoring of specific phone segments for language instruction
Yoon Kim, Horacio Franco, Leonardo Neumeyer

Automatic detection of mispronunciation for language instruction
Orith Ronen, Leonardo Neumeyer, Horacio Franco

Continuous formant-tracking applied to visual representations of the speech and speech recognition
Agustin Alvarez, Rafael Martinez, Victor Nieto, Victoria Rodellar, Pedro Gomez

A CALL system using speech recognition to train the pronunciation of Japanese long vowels, the mora nasal and mora obstruents
Goh Kawai, Keikichi Hirose

An educational and experimental workbench for visual processing of speech data
Jan Nouza, Miroslav Holada, Daniel Hajek

A 3 channel digital CVSD bit-rate conversion system using a general purpose DSP
Yong-Soo Choi, Hong-Goo Kang, Sung-Youn Kim, Young-Cheol Park, Dae-Hee Youn

SLIM prosodic module for learning activities in a foreign language
Rodolfo Delmonte, Mirela Petrea, Ciprian Bacalu

Barge-in revised
Bernhard Kaspar, Karlheinz Schuhmacher, Stefan Feldes

Waveedit, an interactive speech processing environment for Microsoft Windows platform
Mohammad Akbar

Subarashii: Japanese interactive spoken language education
Farzad Ehsani, Jared Bernstein, Amir Najmi, Ognjen Todic

Deploying speech applications over the web
David Goddeau, William Goldenthal, Chris Weikart

CSLUsh: an extendible research environment
Johan Schalkwyk, Jacques de Villiers, Sarel van Vuuren, Pieter Vermeulen

A flexible client-server model for multilingual CTS/TTS development
Tibor Ferenczi, Geza Nemeth, Gabor Olaszy, Zoltan Gaspar

Critically sampled PR filterbanks of nonuniform resolution based on block recursive FAMlet transform
Unto K. Laine

Automatic detection of accent in English words spoken by Japanese students
Nobuaki Minematsu, Nariaki Ohashi, Seiichi Nakagawa

An English conversation and pronunciation CAI system using speech recognition technology
Yasuhiro Taniguchi, Allan A. Reyes, Hideyuki Suzuki, Seiichi Nakagawa

Bringing spoken language systems to the classroom
Stephen Sutton, Ed Kaiser, A. Cronk, Ron Cole

Automatic assessment of foreign speakers' pronunciation of Dutch
Catia Cucchiarini, Lou Boves

Use of low power EM radar sensors for speech articulator measurements
John F. Holzrichter, Greg C. Burnett

Real time measurements of the vocal tract resonances during speech
Julien Epps, Annette Dowd, John Smith, Joe Wolfe


Phonetics and Phonology


Linguistic criteria for building and recording units for concatenative speech synthesis in Brazilian Portuguese
Eleonora Cavalcante Albano, Patricia Aparecida Aquino

Four-and-twenty, twenty-four. What's in a number?
Knut Kvale, Arne Kjell Foldvik

Vowel nasalization in Brazilian Portuguese: an articulatory investigation
Joao Antonio de Moraes

Rhythmic organization peculiarities of the spoken text
Elena Steriopolo

Obtaining confidence measures from sentence probabilities
Bernhard Rueber

Sentence design for speech synthesis and speech recognition database by phonetic rules
Yiqing Zu

Identification of regional variants of High German from digit sequences in German telephone speech
Christoph Draxler, Susanne Burger

Aerodynamic constraints on the production of palatalized trills: the case of the Slavic trilled [r]
Darya Kavitskaya

An experimental phonetic study of the interrelationship between prosodic phrase and syntactic structure
Cheol-jae Seong, Sanghun Kim

Individual differences between vowel systems of German speakers
Sebastian J. G. G. Heid

Tempo and its change in spontaneous speech
Anton Batliner, Andreas Kießling, Ralf Kompe, Heinrich Niemann, Elmar Nöth

A corpus-based approach to diphthong analysis of standard Slovenian
Bojan Petek, Rastislav Sustarsic

Catalan vowel duration
Lourdes Aguilar, Julia A. Gimenez, Maria Machuca, Rafael Marin, Montse Riera

The intonation of vocatives in spoken Neapolitan Italian
Maria Rosaria Caputo

A comparative acoustic study of spontaneous and read Italian speech
Emanuela Magno Caldognetto, Claudio Zmarich, Franco Ferrero

A contribution to the estimation of naturalness in the intonation of Italian spontaneous speech
Mario Refice, Michelina Savino, Martine Grice

Diphthongs and the process of monophthongization in Austrian German: a first approach
Sylvia Moosmüller

The prosody of broad and narrow focus in English: two experiments
Steve Hoskins

The domain of accentual lengthening in Scottish English
Alice Turk, Laurence White

Spontaneous dialogue: some results about the F0 predictions of a pragmatic model of information processing
Mariette Bessac, Geneviève Caelen-Haumont

Phonetic characteristics of double articulations in some Mangbutu-Efe languages
Didier Demolin, Bernard Teston

Intonation modeling for the southern dialects of the Basque language
Inmaculada Hernaez, Inaki Gaminde, Borja Etxebarria, Pilartxo Etxebarria

From phone identification to phone clustering using mutual information
Peter O'Boyle, Ji Ming, Marie Owens, F. Jack Smith

Phonetic code emergence in a society of speech robots: explaining vowel systems and the MUAF principle
Ahmed-Reda Berrah, Rafael Laboissiere

Effects of voicing on /t,d/ tongue/palate contact in English and Norwegian
Inger Moen, Hanne Gram Simonsen

Fieldwork techniques for relating formant frequency, amplitude and bandwidth
Peter Ladefoged, Gunnar Fant

Word juncture modelling based on the TIMIT database
Xue Wang, Louis C.W. Pols

The phonology and phonetics of second language intonation: the case of "Japanese English"
Motoko Ueyama





Applications of Speech Technology


Webgalaxy - integrating spoken language and hypertext navigation
Raymond Lau, Giovanni Flammia, Christine Pao, Victor W. Zue

Pitch estimation of singing for re-synthesis and musical transcription
Michael J. Carey, Eluned S. Parris, Graham D. Tattersall

Automated lip synchronisation for human-computer interaction and special effect animation
Christian Martyn Jones, Satnam Singh Dlay

Developing web-based speech applications
Charles T. Hemphill, Yeshwant K. Muthusamy

Automatic post-synchronization of speech utterances
Werner Verhelst

Automatic generation of hyperlinks between audio and transcript
Jordi Robert-Ribes, Rami G. Mukhtar

Analysis of infant cries for the early detection of hearing impairment
Sebastian Möller, Rainer Schonweiler

Optical logo-therapy (OLT): a computer-based real time visual feedback application for speech training
A. Hatzis, P.D. Green, S.J. Howard

Intelligent retrieval of very large Chinese dictionaries with speech queries
Sung-Chien Lin, Lee-Feng Chien, Ming-Chiuan Chen, Lin-Shan Lee, Ker-Jiann Chen

Preliminary results of a multilingual interactive voice activated telephone service for people-on-the-move
Fulvio Leonardi, Giorgio Micca, Sheyla Militello, Mario Nigra

Assessment of an operational dialogue system used by a blind telephone switchboard operator
Jean-Christophe Dubois, Yolande Anglade, Dominique Fohr

STACC: an automatic service for information access using continuous speech recognition through telephone line
Antonio J. Rubio, Pedro Garcia, Angel de la Torre, Jose C. Segura, Jesus Diaz-Verdejo, Maria C. Benitez, Victoria Sanchez, Antonio M. Peinado, Juan M. Lopez-Soler, Jose L. Perez-Cordoba

A voice activated dialogue system for fast-food restaurant applications
Ramon Lopez-Cozar, Pedro Garcia, Jesus Diaz-Verdejo, Antonio J. Rubio

Multi-microphone sub-band adaptive signal processing for improvement of hearing aid performance
Paul W. Shields, Douglas R. Campbell

Tactile transmission of intonation and stress
Hans Georg Piroth, Thomas Arnhold

Hearing impairment simulation: an interactive multimedia programme on the internet for students of speech therapy
Kerttu Huttunen, Pentti Korkko, Martti Sorri

Analysis of dysarthric speech by means of formant-to-area mapping
Sorin Ciocea, Jean Schoentgen, Lisa Crevier-Buchman

An intelligent telephone answering system using speech recognition
Boris M. Lobanov, Simon V. Brickle, Andrey V. Kubashin, Tatiana V. Levkovskaja

Speedata: a prototype for multilingual spoken data-entry
Ulla Ackermann, Bianca Angelini, Fabio Brugnara, Marcello Federico, Diego Giuliani, Roberto Gretter, Heinrich Niemann

Applications for the hearing-impaired: evaluation of Finnish phoneme recognition methods
Matti Karjalainen, Peter Boda, Panu Somervuo, Toomas Altosaar

Applications for the hearing-impaired: comprehension of Finnish text with phoneme errors
Nina Alarotu, Mietta Lennes, Toomas Altosaar, Anja Malm, Matti Karjalainen

Access - automated call center through speech understanding system
Ute Ehrlich, Gerhard Hanrieder, Ludwig Hitzenberger, Paul Heisterkamp, Klaus Mecklenburg, Peter Regel-Brietzmann

Integrating a radio model with a spoken language interface for military simulations
E. Richard Anthony, Charles Bowen, Margot T. Peet, Susan Tammaro

On field experiments of continuous digit recognition over the telephone network
Daniele Falavigna, Roberto Gretter

An HMM-based phoneme recognizer applied to assessment of dysarthric speech
Xavier Menendez-Pidal, James B. Polikoff, H.Timothy Bunnell

Multiapplication platform based on technology for mobile telephone network services
Celinda de la Torre, Gonzalo Alonso

Field test of a calling card service based on speaker verification and automatic speech recognition
Els den Os, Lou Boves, David James, Richard Winski, Kurt Fridh

Speech: a privileged modality
Luc E. Julia, Adam J. Cheyer






Speech Analysis and Modelling


Acoustic and perceptual properties of phonemes in continuous speech as a function of speaking rate
Hisao Kuwabara

New results in vowel production: MRI, EPG, and acoustic data
Shrikanth Narayanan, Abeer Alwan, Yong Song

The temporal properties of spoken Japanese are similar to those of English
Takayuki Arai, Steven Greenberg

The amplitudes of the peaks in the spectrum: data from /a/ context
Anna Esposito

Acoustical characteristics of speech and voice in speech pathology
Natalija Bolfan-Stosic, Mladen Hedjever

Pronunciation modeling applied to automatic segmentation of spontaneous speech
Andreas Kipp, Maria-Barbara Wesenick, Florian Schiel

Dynamic and static improvements to lexical baseforms
Simon Downey, Richard Wiseman

Signal driven generation of word baseforms from few examples
Andreas Hauenstein

Modeling the acoustic differences between L1 and L2 speech: the short vowels of Afrikaans and South African English
Elizabeth C. Botha, Louis C. W. Pols

Laryngeal movements and speech rate: an x-ray investigation
Béatrice Vaxelaire, Rudolph Sock

How flexible is the human voice? - a case study of mimicry
Anders Eriksson, Pär Wretling

The effect of low-pass filtering on estimated voice source parameters
Helmer Strik

Vowel development of /i/ and /u/ in 15-36 month old children at risk and not at risk to stutter
Susan M. Fosnot

Optopalatograph: development of a device for measuring tongue movement in 3D
Alan Wrench, Alan McIntosh, William Hardcastle

Speech synthesis and prosody modification using segmentation and modelling of the excitation signal
Juana M. Gutierrez-Arriola, Francisco M. Gimenez de los Galanes, Mohammed H. Savoji, José M. Pardo

How can the control of the vocal tract limit the speaker's capability to produce the ultimate perceptive objectives of speech?
Christophe Savariaux, Louis-Jean Boë, Pascal Perrier

A step toward general model for symbolic description of the speech signal
Goran S. Jovanovic

Referring in long term speech by using orientation patterns obtained from vector field of spectrum pattern
Kiyoshi Furukawa, Masayuki Nakazawa, Takashi Endo, Ryuichi Oka




Speech Enhancement and Noise Mitigation


Residual noise suppression using psychoacoustic criteria
Tim Haulick, Klaus Linhard, Peter Schrogmeier

Processing linear prediction residual for speech enhancement
B. Yegnanarayana, Carlos Avendano, Hynek Hermansky, P. Satyanarayana Murthy

Combined acoustic echo control and noise reduction for mobile communications
Stefan Gustafsson, Rainer Martin

A nonstationary autoregressive HMM and its application to speech enhancement
Ki Yong Lee, Jae Yeol Rheem

Spectral subtraction and mean normalization in the context of weighted matching algorithms
Nestor Becerra Yoma, Fergus R. McInnes, Mervyn A. Jack

Improving the intelligibility of noisy speech using an audible noise suppression technique
D. E. Tsoukalas, J. Mourjopoulos, George Kokkinakis

Noisy speech enhancement by fusion of auditory and visual information: a study of vowel transitions
Laurent Girin, Gang Feng, Jean-Luc Schwartz

Spectral subtraction using a non-critically decimated discrete wavelet transform
Andreas Engelsberg, Thomas Gulzow

Bayesian affine transformation of HMM parameters for instantaneous and supervised adaptation in telephone speech recognition
Jen-Tzung Chien, Hsiao-Chuan Wang, Chin-Hui Lee

Integrated bias removal techniques for robust speech recognition
Craig Lawrence, Mazin Rahim

Acoustic front ends for speaker-independent digit recognition in car environments
Detlev Langmann, Alexander Fischer, Friedhelm Wuppermann, Reinhold Haeb-Umbach, Thomas Eisele

Signal bias removal using the multi-path stochastic equalization technique
Lionel Delphin-Poulat, Chafic Mokbel

Subband echo cancellation in automatic speech dialog systems
Andrej Miksic, Bogomir Horvat

Speech enhancement via energy separation
Hesham Tolba, Douglas O'Shaughnessy

A method of signal extraction from noisy signal
Masashi Unoki, Masato Akagi

Multi-channel noise reduction using wavelet filter bank
Jiri Sika, Vratislav Davidek

Speech signal detection in noisy environment using a local entropic criterion
Imad Abdallah, Silvio Montresor, Marc Baudry

A new algorithm for robust speech recognition: the delta vector Taylor series approach
Pedro J. Moreno, Brian Eberman

Robust enhancement of reverberant speech using iterative noise removal
David Cole, Miles Moody, Sridha Sridharan

A network speech echo canceller with comfort noise
D. J. Jones, Scott D. Watson, K. G. Evans, B. M. G. Cheetham, R. A. Reeve

A new metric for selecting sub-band processing in adaptive speech enhancement systems
Amir Hussain, Douglas R. Campbell, Thomas J. Moir

Estimation of LPC cepstrum vector of speech contaminated by additive noise and its application to speech enhancement
Hidefumi Kobatake, Hideta Suzuki

Multi-band and adaptation approaches to robust speech recognition
Sangita Tibrewala, Hynek Hermansky

Non-quadratic criterion algorithms for speech enhancement
Enrique Masgrau, Eduardo Lleida, Luis Vicente






Speech Recognition in Adverse Environments CSR and Error Analysis


A comparative analysis of blind channel equalization methods for telephone speech recognition
Wei-Wen Hung, Hsiao-Chuan Wang

HMM retraining based on state duration alignment for noisy speech recognition
Wei-Wen Hung, Hsiao-Chuan Wang

Fast parallel model combination noise adaptation processing
Yasuhiro Komori, Tetsuo Kosaka, Hiroki Yamamoto, Masayuki Yamada

Speech recognition module for CSCW using a microphone array
Takashi Endo, Shigeki Nagaya, Masayuki Nakazawa, Kiyoshi Furukawa, Ryuichi Oka

Relative mel-frequency cepstral coefficients compensation for robust telephone speech recognition
Jiqing Han, Munsung Han, Gyu-Bong Park, Jeongue Park, Wen Gao

Robust speech detection method for speech recognition system for telecommunication networks and its field trial
Seiichi Yamamoto, Masaki Naito, Shingo Kuroiwa

The tuning of speech detection in the context of a global evaluation of a voice response system
Laurent Mauuary, Lamia Karray

New methods in continuous Mandarin speech recognition
C. Julian Chen, Ramesh A. Gopinath, Michael D. Monkowski, Michael A. Picheny, Katherine Shen

Automatic transcription of general audio data: effect of environment segmentation on phonetic recognition
Michelle S. Spina, Victor W. Zue

Automatic recognition of continuous Cantonese speech with very large vocabulary
Alfred Ying Pang Ng, L. W. Chan, P. C. Ching

Source normalization training for HMM applied to noisy telephone speech recognition
Yifan Gong

The development of a speaker independent continuous speech recognizer for Portuguese
Joao P. Neto, Ciro A. Martins, Luis B. Almeida

Blame assignment for errors made by large vocabulary speech recognizers
Lin Chase

Predicting speech recognition performance
Atsushi Nakamura

A voice activity detector for the ITU-T 8 kbit/s speech coding standard G.729
Scott D. Watson, Barry M.G. Cheetham, P.A. Barrett, W.T.K. Wong, A.V. Lewi

Vocabulary-independent recognition of American Spanish phrases and digit strings
Yeshwant K. Muthusamy, John J. Godfrey

Recognition of spoken and spelled proper names
Michael Meyer, Hermann Hild

HMM compensation for noisy speech recognition based on cepstral parameter generation
Takao Kobayashi, Takashi Masuko, Keiichi Tokuda

On the robustness of the critical-band adaptive filtering method for multi-source noisy speech recognition
George Nokas, Evangelos Dermatas, George Kokkinakis

A space transformation approach for robust speech recognition in noisy environments
Cun-tai Guan, Shu-hung Leung, Wing-hong Lau

Robust isolated word recognition using WSP-PMC combination
Tzur Vaich, Arnon Cohen


Multimodal Speech Processing, Emerging Techniques and Applications


Fuzzy logic for rule-based formant speech synthesis
Spyros Raptis, George V. Carayannis

Integrating acoustic and labial information for speaker identification and verification
Pierre Jourlin, Juergen Luettin, Dominique Genoud, Hubert Wassner

Subword unit representations for spoken document retrieval
Kenney Ng, Victor W. Zue

Non-linear representations, sensor reliability estimation and context-dependent fusion in the audiovisual recognition of speech in noise
Pascal Teissier, Jean-Luc Schwartz, Anne Guerin-Dugue

Securized flexible vocabulary voice messaging system on Unix workstation with ISDN connection
Philippe Renevey, Andrzej Drygajlo

Automatic derivation of multiple variants of phonetic transcriptions from acoustic signals
Houda Mokbel, Denis Jouvet

Improved bimodal speech recognition using tied-mixture HMMs and 5000 word audio-visual synchronous database
Satoshi Nakamura, Ron Nagai, Kiyohiro Shikano

On the use of phone duration and segmental processing to label speech signal
Philippe Depambour, Regine Andre-Obrecht, Bernard Delyon

Automatic detection of disturbing robot voice- and ping pong-effects in GSM transmitted speech
Martin Paping, Thomas Fahnle

Speech synthesis using phase vocoder techniques
Joseph Di Martino

Integration of eye fixation information with speech recognition systems
Ramesh R. Sarukkai, Craig Hunter

Generation of broadband speech from narrowband speech using piecewise linear mapping
Yoshihisa Nakatoh, M. Tsushima, T. Norimatsu

An assessment of the benefits active noise reduction systems provide to speech intelligibility in aircraft noise environments
Ian E.C. Rogers

OLGA - a dialogue system with an animated talking agent
Jonas Beskow, Kjell Elenius, Scott McGlashan

Towards usable multimodal command languages: definition and ergonomic assessment of constraints on users' spontaneous speech and gestures
Sandrine Robbe, Noelle Carbonell, Claude Valot

Exploiting repair context in interactive error recovery
Bernhard Suhm, Alex Waibel

An hybrid image processing approach to liptracking independent of head orientation
Lionel Reveret, Frederique Garcia, Christian Benoit, Eric Vatikiotis-Bateson

Automatic modeling of coarticulation in text-to-visual speech synthesis
Bertrand Le Goff

A multimedia platform for audio-visual speech processing
Ali Adjoudani, Thierry Guiard-Marigny, Bertrand Le Goff, Lionel Reveret, Christian Benoit

An intelligent system for information retrieval over the internet through spoken dialogue
Hiroya Fujisaki, Hiroyuki Kameda, Sumio Ohno, Takuya Ito, Ken Tajima, Kenji Abe

Data hiding in speech using phase coding
Yasemin Yardimci, A. Enis Cetin, Rashid Ansari

CAVE: an on-line procedure for creating and running auditory-visual speech perception experiments - hardware, software, and advantages
Denis Burnham, John Fowler, Michelle Nicol


Databases, Tools and Evaluations


The Bavarian archive for speech signals: resources for the speech community
Florian Schiel, Christoph Draxler, Hans G. Tillmann

WWWTranscribe - a modular transcription system based on the world wide web
Christoph Draxler

Design, recording and verification of a Danish emotional speech database
Inger S. Engberg, Anya Varnich Hansen, Ove Andersen, Paul Dalsgaard

Issues in database creation: recording new populations, faster and better labelling
Maxine Eskenazi, C. Hogan, J. Allen, R. Frederking

Design and analysis of a German telephone speech database for phoneme based training
Stefan Feldes, Bernhard Kaspar, Denis Jouvet

The design of a large vocabulary speech corpus for Portuguese
Joao P. Neto, Ciro A. Martins, Hugo Meinedo, Luis B. Almeida

Continued investigations of laryngectomee speech in noise - measurements and intelligibility tests
Lennart Nord, Britta Hammarberg, Elisabet Lundstrom

An appreciation study of an ASR inquiry system
L.J.M. Rothkrantz, W.A.Th. Manintveld, M.M.M. Rats, R.J. van Vark, J.P.M. de Vreught, H. Koppelaar

Object-oriented modeling of articulatory data for speech research information systems
Kamel Bensaber, Paul Munteanu, Jean-Francois Serignat, Pascal Perrier

A Korean speech corpus for train ticket reservation aid system based on speech recognition
Woosung Kim, Myoung-Wan Koo

Recall memory for earcons
Dawn Dutton, Candace Kamm, Susan Boyce

Semi-automatic phonetic labelling of large corpora
O. Mella, D. Fohr

CORPORA - speech database for Polish diphones
Stefan Grocholewski

Multilingual speech interfaces (MSI) and dialogue design environments for computer telephony services
Christel Müller, Thomas Ziem

Getting started with SUSAS: a speech under simulated and actual stress database
John H. L. Hansen, Sahar E. Bou-Ghazale

A markup language for text-to-speech synthesis
Richard Sproat, Paul Taylor, Michael Tanenblatt, Amy Isard

Several measures for selecting suitable speech corpora
Shuichi Itahashi, Naoko Ueda, Mikio Yamamoto

Greek speech database for creation of voice driven teleservices
Irene Chatzi, Nikos Fakotakis, George Kokkinakis









Front-Ends and Adaptation to Acoustics Speaker Adaptation


Speaker adaptation for context-dependent HMM using spatial relation of both phoneme context hierarchy and speakers
Yasuhiro Komori, Tetsuo Kosaka, Masayuki Yamada, Hiroki Yamamoto

Fast algorithm for speech recognition using speaker cluster HMM
Masayuki Yamada, Yasuhiro Komori, Tetsuo Kosaka, Hiroki Yamamoto

A comparison of novel techniques for instantaneous speaker adaptation
Timothy J. Hazen, James R. Glass

Fast adaptation of acoustic models to environmental noise using Jacobian adaptation algorithm
Yoshikazu Yamaguchi, Satoshi Takahashi, Shigeki Sagayama

Unsupervised HMM adaptation based on speech-silence discrimination
Ilija Zeljkovic, Shrikanth Narayanan, Alexandros Potamianos

Correlation based predictive adaptation of hidden Markov models
Mohamed Afify, Yifan Gong, Jean-Paul Haton

Adaptation of hidden Markov models using multiple stochastic transformations
Vassilios Diakoloukas, Vassilios Digalakis

Transformation smoothing for speaker and environmental adaptation
M. J. F. Gales

Nonlinear discriminant analysis for improved speech recognition
Vincent Fontaine, Christophe Ris, Jean-Marc Boite

On the interplay between auditory-based features and locally recurrent neural networks for robust speech recognition in noise
Jurgen Tchorz, Klaus Kasper, Herbert Reininger, Birger Kollmeier

Speech recognition using on-line estimation of speaking rate
Nelson Morgan, Eric Fosler, Nikki Mirghafori

Using formant frequencies in speech recognition
John N. Holmes, Wendy J. Holmes, Philip N. Garner

Speaker normalization and speaker adaptation - a combination for conversational speech recognition
Puming Zhan, Martin Westphal, Michael Finke, Alex Waibel

Speaker adaptation based on pre-clustering training speakers
Yuqing Gao, Mukund Padmanabhan, Michael Picheny

A fast method of speaker normalisation using formant estimation
Mike Lincoln, Stephen Cox, Simon Ringland

Acoustic front-end optimization for large vocabulary speech recognition
Lutz Welling, N. Haberland, Hermann Ney

Improving autoregressive hidden Markov model recognition accuracy using a non-linear frequency scale with application to speech enhancement
B. T. Logan, A. J. Robinson

Designing a reduced feature-vector set for speech recognition by using KL/GPD competitive training
Tsuneo Nitta, Akinori Kawamura

Speaker adaptation by correlation (ABC)
Scott Shaobing Chen, Peter DeSouza


Speech Perception


Preliminary experiments on the perception of double semivowels
William A. Ainsworth, Georg F. Meyer

Does syllable frequency affect production time in a delayed naming task?
Niels O. Schiller

Human and machine identification of consonantal place of articulation from vocalic transition segments
Andrew C. Morris, Gerrit Bloothooft, William J. Barry, Bistra Andreeva, Jacques Koreman

Modelling the recognition of spectrally reduced speech
Jon Barker, Martin Cooke

Prosodic structure and phonetic processing: a cross-linguistic study
Christophe Pallier, Anne Cutler, Nuria Sebastian-Galles

The correlation between consonant identification and the amount of acoustic consonant reduction
Rob J. J. H. van Son, Louis C. W. Pols

Relevant spectral information for the identification of vowel features from bursts
Anne Bonneau

Perceptual study of intersyllabic formant transitions in synthesized V1-V2 in standard Chinese
Aijun Li

Role of perception of rhythmically organized speech in consolidation process of long-term memory traces (LTM-traces) and in speech production controlling
Oleg P. Skljarov

Sequential probabilities as a cue for segmentation
Arie H. van der Lugt

Perception and acoustics of emotions in singing
Susan Jansens, Gerrit Bloothooft, Guus de Krom

Phonemes and syllables in speech perception: size of attentional focus in French
Christophe Pallier

Quality of a vowel with formant undershoot: a preliminary perceptual study
Shinichi Tokuma

Segmental and suprasegmental contributions to spoken-word recognition in Dutch
Mariette Koster, Anne Cutler

Perception of vowel duration and spectral characteristics in Swedish
Dawn M. Behne, Peter E. Czigler, Kirk P. H. Sullivan

Relative contributions of noise burst and vocalic transitions to the perceptual identification of stop consonants
Adrien Neagu, Gerard Bailly

Effect of speaker familiarity and background noise on acoustic features used in speaker identification
Satoshi Kitagawa, Makoto Hashimoto, Norio Higuchi

Dynamic versus static specification for the perceptual identity of a coarticulated vowel
Michel Pitermann

Asymmetries in consonant confusion
Madelaine Plauche, Cristina Delogu, John J. Ohala

Rime and syllabic effects in phonological priming between French spoken words
Nicolas Dumay, Monique Radeau

Roles of static and dynamic features of formant trajectories in the perception of talker individuality
Weizhong Zhu, Hideki Kasuya


Dialogue Systems: Linguistic Structures, Modelling and Evaluation


Database management and analysis for spoken dialog systems: methodology and tools
Chih-mei Lin, Shrikanth Narayanan, Russell Ritenour

Evaluating spoken dialog systems for telecommunication services
Candace Kamm, Shrikanth Narayanan, Dawn Dutton, Russell Ritenour

Robust spoken dialogue management for driver information systems
Xavier Pouteau, Emiel Krahmer, Jan Landsbergen

Using acoustic and prosodic cues to correct Chinese speech repairs
Yue-Shi Lee, Hsin-Hsi Chen

Integrating domain specific focusing in dialogue models
Nils Dahlbäck, Arne Jönsson

Evaluating competing agent strategies for a voice email agent
Marilyn Walker, Donald Hindle, Jeanne Fromer, Giuseppe Di Fabbrizio, Craig Mestel

Discourse marker use in task-oriented spoken dialog
Donna K. Byron, Peter A. Heeman

From interface to content: translingual access and delivery of on-line information
Victor W. Zue, Stephanie Seneff, James Glass, Lee Hetherington, Edward Hurley, Helen Meng, Christine Pao, Joseph Polifroni, Rafael Schloming, Philipp Schmid

Learning dialogue structures from a corpus
Jan Alexandersson, Norbert Reithinger

Dialogue act classification using language models
Norbert Reithinger, Martin Klesen

User's multiple goals in spoken dialogue
Didier Pernel

Chatting with interactive agent
Noriko Suzuki, Seiji Inokuchi, K. Ishii, Michio Okada

Generic template for the evaluation of dialogue management systems
Gavin E. Churcher, Eric S. Atwell, Clive Souter

Analysis of interactive strategy to recover from misrecognition of utterances including multiple information items
Yasuhisa Niimi, Takuya Nishimoto, Yutaka Kobayashi

A referential approach to reduce perplexity in the vocal command system comppa
Francois-Arnould Mathieu, Bertrand Gaiffe, Jean-Marie Pierrel

Linguistic processor for a spoken dialogue system based on island parsing techniques
Aristomenis Thanopoulos, Nikos Fakotakis, George Kokkinakis

Modelling of speech-based user interfaces
Brian Mellor, Chris Baber

Can you predict responses to yes/no questions? yes, no, and stuff
Beth Ann Hockey, Deborah Rossen-Knill, Beverly Spejewski, Matthew Stone, Stephen Isard

DIA-MOLE: an unsupervised learning approach to adaptive dialogue models for spoken dialogue systems
Jens-Uwe Möller

How do system questions influence lexical choices in user answers?
Joakim Gustafson, Anette Larsson, Rolf Carlson, K. Hellman


Speaker Recognition and Language Identification


Gaussian mixture models with common principal axes and their application in text-independent speaker identification
Kuo-Hwei Yuo, Hsiao-Chuan Wang

Speaker models designed from complete data sets: a new approach to text-independent speaker verification
Dominik R. Dersch, Robin W. King

A double Gaussian mixture modeling approach to speaker recognition
Rivarol Vergin, Douglas O'Shaughnessy

An acoustic subword unit approach to non-linguistic speech feature identification
Mohamed Afify, Yifan Gong, Jean-Paul Haton

N-best GMM's for speaker identification
Chakib Tadj, Pierre Dumouchel, Yu Fang

Model dependent spectral representations for speaker recognition
Guillaume Gravier, Chafic Mokbel, Gerard Chollet

Equalizing sub-band error rates in speaker recognition
Roland Auckenthaler, John S. Mason

Automatic gender identification under adverse conditions
Stefan Slomka, Sridha Sridharan

Acoustic features and perceptive processes in the identification of familiar voices
Yizhar Lavner, Isak Gath, Judith Rosenhouse

On the use of acoustic segmentation in speaker identification
Leandro Rodriguez-Linares, Carmen Garcia-Mateo

Speaker recognition by humans and machines
Herman J. M. Steeneken, David A. van Leeuwen

Foreign speaker accent classification using phoneme-dependent accent discrimination models and comparisons with human perception benchmarks
Karsten Kumpf, Robin W. King

A comparison of human and machine in speaker recognition
Li Liu, Jialong He, Günther Palm

Evaluation of second language learners' pronunciation using hidden Markov models
Simo M. A. Goddijn, Guus de Krom

Delta vector Taylor series environment compensation for speaker recognition
Brian Eberman, Pedro J. Moreno

Wavelet-like regression features in the cepstral domain for speaker recognition
Jonathan Hume

Minimum classification error linear regression (MCELR) for speaker adaptation using HMM with trend functions
Rathinavelu Chengalvarayan

A continuous HMM text-independent speaker recognition system based on vowel spotting
Nikos Fakotakis, Kallirroi Georgila, Anastasios Tsopanoglou

On the independence of digits in connected digit strings
Johan W. Koolwaaij, Lou Boves

A new procedure for classifying speakers in speaker verification systems
Johan W. Koolwaaij, Lou Boves

Sound channel video indexing
Claude Montacié, Marie-José Caraty

CDHMM speaker recognition by means of frequency filtering of filter-bank energies
Javier Hernando, Climent Nadeu









F0 and Duration Modelling, Spoken language processing


Modeling segmental duration with multivariate adaptive regression splines
Marcel Riedi

High-quality speech synthesis for phonetic speech segmentation
Fabrice Malfrere, Thierry Dutoit

Factors affecting perceived quality and intelligibility in the CHATR concatenative speech synthesiser
Nick Campbell, Yoshiharu Itoh, Wen Ding, Norio Higuchi

Reduced lexicon trees for decoding in a MMI-connectionist/HMM speech recognition system
Christoph Neukirchen, Daniel Willett, Gerhard Rigoll

A stochastic model of intonation for French text-to-speech synthesis
Jean Veronis, Philippe Di Cristo, Fabienne Courtois, Benoit Lagrue

Phonetic rules for a phonetic-to-speech system
Angelien A. Sanderman, René Collier

Multi-lingual duration modeling
Jan van Santen, Chilin Shih, Bernd Möbius, Evelyne Tzoukermann, Michael Tanenblatt

A model of segment (and pause) duration generation for Brazilian Portuguese text-to-speech synthesis
Plinio A. Barbosa

Parsing strategy for spoken language interfaces with a lexicalized tree grammar
Ariane Halber, David Roussel

What's in a word graph - evaluation and enhancement of word lattices
Jan W. Amtrup, Henrik Heine, Uwe Jost

Accelerated DP based search for statistical translation
C. Tillmann, S. Vogel, Hermann Ney, A. Zubiaga, H. Sawaf

Use of pitch pattern improvement in the CHATR speech synthesis system
Ken Fujisawa, Toshio Hirai, Norio Higuchi

Generating segment durations in a text-to-speech system: a hybrid rule-based/neural network approach
G. Corrigan, N. Massey, O. Karaali

On the global F0 shape model using a transition network for Japanese text-to-speech systems
Yasushi Ishikawa, Takashi Ebihara

An alternative and flexible approach in robust information retrieval systems
José Colás, Juan M. Montero, Javier Ferreiros, José M. Pardo

A probabilistic approach to analogical speech translation
Keiko Horiguchi, Alexander Franz

Dynamic lexicon for a very large vocabulary vocal dictation
Marie-José Caraty, Claude Montacié, Fabrice Lefèvre


Language Modelling


Construction of language models using the morphic generator grammatical inference (MGGI) methodology
E. Segarra, L. Hurtado

An integrated language modeling with n-gram model and WA model for speech recognition
Shuwu Zhang, Taiyi Huang

Statistical analysis of dialogue structure
Ye-Yi Wang, Alex Waibel

Statistical language modeling using the CMU-Cambridge toolkit
Philip Clarkson, Ronald Rosenfeld

Text normalization and speech recognition in French
Gilles Adda, Martine Adda-Decker, Jean-Luc Gauvain, Lori Lamel

A novel tree-based clustering algorithm for statistical language modeling
G. Damnati, J. Simonin

Variable-length language modeling integrating global constraints
Shoichi Matsunaga, Shigeki Sagayama

An hybrid language model for a continuous dictation prototype
K. Smaili, I. Zitouni, F. Charpillet, Jean-Paul Haton

Dealing with pronunciation variants at the language model level for the continuous automatic speech recognition of French
Guy Pérennou, L. Pousse

Rational interpolation of maximum likelihood predictors in stochastic language modeling
Ernst Günter Schukat-Talamazzini, Florian Gallwitz, Stefan Harbeck, Volker Warnke

N-gram language model adaptation using small corpus for spoken dialog recognition
Akinori Ito, Hideyuki Saitoh, Masaharu Katoh, Masaki Kohda

Variable n-gram language modeling and extensions for conversational speech
Manhung Siu, Mari Ostendorf

Fuzzy class rescoring: a part-of-speech language model
Petra Geutner

Speech understanding based on integrating concepts by conceptual dependency
Akito Nagai, Yasushi Ishikawa

Dynamic language models for interactive speech applications
Fabio Brugnara, Marcello Federico

Large-scale lexical semantics for speech recognition support
George Demetriou, Eric Atwell, Clive Souter

Integration of grammar and statistical language constraints for partial word-sequence recognition
Hajime Tsukada, Hirofumi Yamamoto, Yoshinori Sagisaka

Using intonation to constrain language models in speech recognition
Paul Taylor, Simon King, Stephen Isard, Helen Wright, Jacqueline Kowtko

Incorporating POS tagging into language modeling
Peter A. Heeman, James F. Allen

Confidence metrics based on n-gram language model backoff behaviors
C. Uhrik, W. Ward

Structure and performance of a dependency language model
Ciprian Chelba, David Engle, Frederick Jelinek, Victor Jimenez, Sanjeev Khudanpur, Lidia Mangu, Harry Printz, Eric Ristad, Ronald Rosenfeld, Andreas Stolcke, Dekai Wu

Modeling linguistic segment and turn boundaries for n-best rescoring of spontaneous speech
Andreas Stolcke

Hybrid language models: is simpler better?
P. E. Kenne, Mary O'Kane

Internal and external tagsets in part-of-speech tagging
Thorsten Brants


Auditory Modelling and Psychoacoustics, Neural Networks for Speech Processing and Recognition


A probabilistic model of double-vowel segregation
Laurent Varin, Frédéric Berthommier

Stimulus signal estimation from auditory-neural transduction inverse processing
Habibzadeh V. Houshang, Kitazawa Shigeyoshi

FDVQ based keyword spotter which incorporates a semi-supervised learning for primary processing
Chakib Tadj, Pierre Dumouchel, Franck Poirier

The initial time span of auditory processing used for speaker attribution of the speech signal
V. V. Lublinskaja, Christian Sappok

Sparse connection and pruning in large dynamic artificial neural networks
Nikko Ström

A modular initialization scheme for better speech recognition performance using hybrid systems of MLPs/HMMs
Roxana Teodorescu, Dirk Van Compernolle, Ioannis Dologlou

Lateralization for auditory perception of foreign words
Tatiana V. Chernigovskaya

The structural weighted sets method for continuous speech and text recognition
Yuri Kosarev, Pavel Jarov, Alexander Osipov

Lateral inhibitory networks for auditory processing
C. J. Sumner, D. F. Gillies

Missing fundamentals: a problem of auditory or mental processing?
Henning Reetz

Predictive neural networks applied to phoneme recognition
F. Freitag, E. Monte, J. Salavedra

Empirical comparison of two multilayer perceptron-based keyword speech recognition algorithms
Suhardi, Klaus Fellbaum

Segment boundary estimation using recurrent neural networks
Toshiaki Fukada, Sophie Aveline, Mike Schuster, Yoshinori Sagisaka

Incorporation of HMM output constraints in hybrid NN/HMM systems during training
Mike Schuster

Principles of the hearing periphery functioning in new methods of pitch detection and speech enhancement
Ludmila Babkina, Sergey Koval, Alexander Molchanov

The locus of the syllable effect: prelexical or lexical?
Christine Meunier, Alain Content, Uli H. Frauenfelder, Ruth Kearns

On not remembering disfluencies
Robin J. Lickley, Ellen G. Bard

Using an auditory model and leaky autocorrelators to tune in to speech
T. Andringa

