ISCA Archive ICSLP 1996 Sessions Booklet

4th International Conference on Spoken Language Processing

Philadelphia, PA, USA
3-6 October 1996

General Chair: H. Timothy Bunnell




Phonetics, Transcription, and Analysis


Whole-word phonetic distances and the PGPfone alphabet
Patrick Juola, Philip Zimmermann

Automatic vowel quality description using a variable mapping to an eight cardinal vowel reference set
Shuping Ran, J. Bruce Millar, Phil Rose

Automatic detection and segmentation of pronunciation variants in German speech corpora
Andreas Kipp, Maria-Barbara Wesenick, Florian Schiel

ANGIE: a new framework for speech analysis based on morpho-phonological modelling
Stephanie Seneff, Raymond Lau, Helen Meng

Perceptual contrast in the Korean and English vowel system normalized
Byunggon Yang

On phonetic characteristics of pause in the Korean read speech
Yong-Ju Lee, Sook-Hyang Lee

Cross-language effects of lexical stress in word recognition: the case of Arabic English bilinguals
Sami Boudelaa, Mehdi Meftah

Automatic generation of German pronunciation variants
Maria-Barbara Wesenick

Estimating the quality of phonetic transcriptions and segmentations of speech signals
Maria-Barbara Wesenick, Andreas Kipp

An acoustic analysis of contemporary vowels of the standard Slovenian language
Bojan Petek, Rastislav Sustarsic, Smiljana Komar

Using decision trees to construct optimal acoustic cues
Sandrine Robbe, Anne Bonneau, Sylvie Coste, Yves Laprie

Maximum jaw displacement in contrastive emphasis
Donna Erickson, Osamu Fujimura

Subglottal pressure and final lowering in English
Rebecca Herman, Mary Beckman, Kiyoshi Honda

Phonological variation: epenthesis and deletion of schwa in Dutch
Cecile Kuijpers, Wilma van Donselaar, Anne Cutler



Dialogue Special Sessions


Modeling of spoken dialogue with and without visual information
Katsuhiko Shirai

Multimodal discourse modelling in a multi-user multi-domain environment
Stephanie Seneff, David Goddeau, Christine Pao, Joseph Polifroni

Automatic acquisition of probabilistic dialogue models
Kenji Kita, Yoshikazu Fukui, Masaaki Nagata, Tsuyoshi Morimoto

Units of dialogue management: an example
Paul Heisterkamp, Scott McGlashan

Error resolution during multimodal human-computer interaction
Sharon Oviatt, Robert VanGent

Improved spontaneous dialogue recognition using dialogue and utterance triggers by adaptive probability boosting
Ramesh R. Sarukkai, Dana H. Ballard

Speech recognition for spontaneously spoken German dialogues
Kai Hübener, Uwe Jost, Henrik Heine

Using prosodic information to constrain language models for spoken dialogue
Paul Taylor, Hiroshi Shimodaira, Stephen Isard, Simon King, Jacqueline Kowtko

Combining the detection and correction of speech repairs
Peter A. Heeman, Kyung-ho Loken-Kim, James F. Allen

Generating spontaneous elliptical utterance
Yuji Sagawa, Wataru Sugimoto, Noboru Ohnishi

Developing the modelling of Swedish prosody in spontaneous dialogue
Gösta Bruce, Marcus Filipsson, Johan Frid, Björn Granström, Kjell Gustafson, Merle Horne, David House, Birgitta Lastow, Paul Touati

Spoken language generation in a multimedia system
Shimei Pan, Kathleen R. McKeown

Synthesizing dialogue speech of Japanese based on the quantitative analysis of prosodic features
Keikichi Hirose, Mayumi Sakata, Hiromichi Kawanami

Spoken dialogue interface in a dual task situation
Shuichi Tanaka, Shu Nakazato, Keiichiro Hoashi, Katsuhiko Shirai

A dialogue control strategy based on the reliability of speech recognition
Yasuhisa Niimi, Yutaka Kobayashi

Speechwear: a mobile speech system
Alexander I. Rudnicky, Stephen Reed, Eric H. Thayer

WHEELS: a conversational system in the automobile classifieds domain
Helen Meng, Senis Busayapongchai, James Glass, David Goddeau, Lee Hetherington, Edward Hurley, Christine Pao, Joseph Polifroni, Stephanie Seneff, Victor Zue

Effective human-computer cooperative spoken dialogue: the AGS demonstrator
M. D. Sadek, A. Ferrieux, A. Cozannet, P. Bretier, F. Panaget, J. Simonin

Dialog in the RAILTEL telephone-based system
S. K. Bennacef, L. Devillers, S. Rosset, Lori Lamel

Dialogue processing in a conversational speech translation system
Alon Lavie, Lori Levin, Yan Qu, Alex Waibel, Donna Gates, Marsal Gavaldà, Laura Mayfield, Maite Taboada





Speech Coding / HMMs and NNs in ASR


On the effects of accent and language on low rate speech coders
I. S. Burnett, J. J. Parry

VQ codevector index assignment using genetic algorithms for noisy channels
J. S. Pan, Fergus R. McInnes, Mervyn A. Jack

An improved vector quantization algorithm for speech transmission over noisy channels
Gavin C. Cawley

Very low delay and high quality coding of 20 Hz-15 kHz speech signals at 64 kbit/s
C. Murgia, G. Feng, A. Le Guyader, C. Quinquis

Application of speaker modification techniques to phonetic vocoding
Carlos M. Ribeiro, Isabel M. Trancoso

Entropy coded vector quantization with hidden Markov models
Tadashi Yonezaki, Kiyohiro Shikano

An application of recurrent neural networks to low bit rate speech coding
Minoru Kohata

CELP coding system based on mel-generalized cepstral analysis
Kazuhito Koishida, Keiichi Tokuda, Takao Kobayashi, Satoshi Imai

Wideband re-synthesis of narrowband CELP-coded speech using multiband excitation model
Cheung-Fat Chan, Wai-Kwong Hui

Recurrent neural networks for phoneme recognition
Takuya Koizumi, Mikio Mori, Shuji Taniguchi, Mitsutoshi Maruya

A model for the acoustic phonetic structure of Arabic language using a single ergodic hidden Markov model
M. A. Mokhtar, A. Zein-el-Abddin

Modelling long term variability information in mixture stochastic trajectory framework
Yifan Gong, Irina Illina, Jean-Paul Haton

Segmental phonetic features recognition by means of neural-fuzzy networks and integration in an n-best solutions post-processing
T. Moudenc, R. Sokol, Guy Mercier

Stochastic trajectory model with state-mixture for continuous speech recognition
Irina Illina, Yifan Gong

Recognition of spelled names over the telephone
Hermann Hild, Alex Waibel

Optimal tying of HMM mixture densities using decision trees
Gilles Boulianne, Patrick Kenny

Speech recognition using an enhanced FVQ based on a codeword dependent distribution normalization and codeword weighting by fuzzy objective function
Hwan Jin Choi, Yung Hwan Oh

Using the self-organizing map to speed up the probability density estimation for speech recognition with mixture density HMMs
Mikko Kurimo, Panu Somervuo



NNs and Stochastic Modeling


Integrating connectionist, statistical and symbolic approaches for continuous spoken Korean processing
Geunbae Lee, Jong-Hyeok Lee, Kyubong Park, Byung-Chang Kim

Towards ASR on partially corrupted speech
Hynek Hermansky, Sangita Timberwala, Misha Pavel

Parametric trajectory models for speech recognition
Herbert Gish, Kenney Ng

Use of Gaussian selection in large vocabulary continuous speech recognition using HMMs
K. M. Knill, M. J. F. Gales, S. J. Young

Cross phone state clustering using lexical stress and context
J. Hogberg, Kare Sjölander

Likelihood ratio decoding and confidence measures for continuous speech recognition
Eduardo Lleida-Solano, Richard C. Rose

A study on continuous Chinese speech recognition based on stochastic trajectory models
Xiaohui Ma, Yifan Gong, Yuqing Fu, Jiren Lu, Jean-Paul Haton

A proposal for a new algorithm of reference interval-free continuous DP for real-time speech or text retrieval
Yoshiaki Itoh, Jiro Kiyama, Hiroshi Kojima, Susumu Seki, Ryuichi Oka

Language modeling by string pattern n-gram for Japanese speech recognition
Akinori Ito, Masaki Kohda

Statistical language modeling using a variable context length
Reinhard Kneser

A comparison of hybrid HMM architectures using global discriminative training
Finn Tore Johansen

Improved probability estimation with neural network models
Wei Wei, Etienne Barnard, Mark Fanty

A neural network using acoustic sub-word units for continuous speech recognition
Ha-Jin Yu, Yung-Hwan Oh

On the error criteria in neural networks as a tool for human classification modelling
Louis F. M. ten Bosch, Roel Smits

A non-linear filtering approach to stochastic training of the articulatory-acoustic mapping using the EM algorithm
Gordon Ramsay

A tool for automated design of language models
Y. P. Yang, J. R. Deller Jr.

Acoustic-phonetic decoding based on Elman predictive neural networks
F. Freitag, E. Monte

On improving discrimination capability of an RNN based recognizer
Tan Lee, P. C. Ching

An evaluation of statistical language modeling for speech recognition using a mixed category of both words and parts-of-speech
Yumi Wakita, Jun Kawai, Hitoshi Iida






Spoken Language Dialogue and Conversation


Predicting dialogue acts for a speech-to-speech translation system
Norbert Reithinger, Ralf Engel, Michael Kipp, Martin Klesen

Automatic speech translation based on the semantic structure
Johannes Müller, Holger Stahl, Manfred Lang

A methodology for application development for spoken language systems
Lewis M. Norton, Carl E. Weir, K. W. Scholz, Deborah A. Dahl, Ahmed Bouzid

A new restaurant guide conversational system: issues in rapid prototyping for specialized domains
Stephanie Seneff, Joseph Polifroni

Semantic interpretation of a Japanese complex sentence in an advisory dialogue - focused on the postpositional word "KEDO," which works as a conjunction between clauses
Tadahiko Kumamoto, Akira Ito

A Korean morphological analyzer for speech translation system
Youngkuk Hong, Myoung-Wan Koo, Gijoo Yang

Generic and domain-specific aspects of the Waxholm NLP and dialog modules
Rolf Carlson, Sheri Hunnicutt

A real-time system for summarizing human-human spontaneous spoken dialogues
Megumi Kameyama, Goh Kawai, Isao Arima

Evaluation of spoken language understanding and dialogue systems
Bernd Hildebrandt, Heike Rautenstrauch, Gerhard Sagerer

Inter-speaker interaction of F0 in dialogs
Kuniko Kakita

A robust dialogue system for making an appointment
Hans Brandt-Pook, Gernot A. Fink, Bernd Hildebrandt, Franz Kummert, Gerhard Sagerer

Segmentation of spoken dialogue by interjections, disfluent utterances and pauses
Kazuyuki Takagi, Shuichi Itahashi

A form-based dialogue manager for spoken language applications
David Goddeau, Helen Meng, Joseph Polifroni, Stephanie Seneff, Senis Busayapongchai

The design of complex telephony applications using large vocabulary speech technology
S. J. Whittaker, D. J. Attwater

Building 10,000 spoken dialogue systems
Stephen Sutton, David G. Novick, Ronald A. Cole, Pieter Vermeulen, Jacques de Villiers, Johan Schalkwyk, Mark Fanty

Speaker intention modeling for large vocabulary Mandarin spoken dialogues
Yen-Ju Yang, Lee-Feng Chien, Lin-Shan Lee

Hybrid language models and spontaneous legal discourse
P. E. Kenne, Mary O'Kane

Topic change and local perplexity in spoken legal dialogue
P. E. Kenne, Mary O'Kane

Intonational cues to discourse structure in Japanese
Jennifer J. Venditti, Marc Swerts

Principles for the design of cooperative spoken human-machine dialogue
Niels Ole Bernsen, Hans Dybkjær, Laila Dybkjær

Development and comparison of three syllable stress classifiers
Karen L. Jenkin, Michael S. Scordilis


Speech Disorders


Interaction of speech disorders with speech coders: effects on speech intelligibility
D. G. Jamieson, Li Deng, M. Price, Vijay Parsa, J. Till

Detecting arytenoid cartilage misplacement through acoustic and electroglottographic jitter analysis
Maurílio N. Vieira, Arnold G. D. Maran, Fergus R. McInnes, Mervyn A. Jack

Robust F0 and jitter estimation in pathological voices
Maurílio N. Vieira, Fergus R. McInnes, Mervyn A. Jack

Speech monitoring of infective laryngitis
F. Plante, H. Kessler, B. M. G. Cheetham, J. Earis

Searching for nonlinear relations in whitened jitter time series
Jean Schoentgen, Raoul de Guchteneere

Vocal fold pathology assessment using AM autocorrelation analysis of the Teager energy operator
Liliana Gavidia-Ceballos, John H. L. Hansen, James F. Kaiser

Continuous positive airway pressure (CPAP) in the treatment of hypernasality
David P. Kuehn

Enhancement of alaryngeal speech by adaptive filtering
Carol Y. Espy-Wilson, Venkatesh R. Chari, Caroline B. Huang

Simulation of disordered speech using a frequency-domain vocal tract model
Li Deng, Xuemin Shen, D. G. Jamieson, J. Till

A stochastic model of fundamental period perturbation and its application to perception of pathological voice quality
Yasuo Endo, Hideki Kasuya

A screening test for speech pathology assessment using objective quality measures
Eric J. Wallen, John H. L. Hansen

Recent advances in hypernasal speech detection using the nonlinear Teager energy operator
Douglas A. Cairns, John H. L. Hansen, James F. Kaiser






Speech Enhancement and Robust Processing


H-infinity filtering for speech enhancement
Xuemin Shen, Li Deng, Anisa Yasmin

A comparative analysis of channel-robust features and channel equalization methods for speech recognition
Saeed V. Vaseghi, Ben Milner

Robust speech recognition features based on temporal trajectory filtering of frequency band spectrum
Jia-lin Shen, Wen-liang Hwang, Lin-shan Lee

Durational modelling for improved connected digit recognition
Kevin Power

Study on the dereverberation of speech based on temporal envelope filtering
Carlos Avendano, Hynek Hermansky

Estimating Markov model structures
Thorsten Brants

A fertility channel model for post-correction of continuous speech recognition
Eric K. Ringger, James F. Allen

Restoration of wide band signal from telephone speech using linear prediction error processing
Hiroshi Yasukawa

Smoothed spectral subtraction for a frequency-weighted HMM in noisy speech recognition
Hiroshi Matsumoto, Noboru Naitoh

A simple architecture for using multiple cues in sound separation
William S. Woods, Martin Hansen, Thomas Wittkop, Birger Kollmeier

On the robust automatic segmentation of spontaneous speech
Bojan Petek, Ove Andersen, Paul Dalsgaard

Bayesian adaptation of speech recognizers to field speech data
C. G. Miglietta, C. Mokbel, D. Jouvet, J. Monné

Sub-band adaptive filtering applied to speech enhancement
A. J. Darlington, D. J. Campbell

Noise robust estimate of speech dynamics for speaker recognition
J. P. Openshaw, John S. Mason

Overview of speech enhancement techniques for automatic speaker recognition
Javier Ortega-García, Joaquín González-Rodríguez

Dynamic features for segmental speech recognition
Naomi Harte, Saeed V. Vaseghi, Ben Milner

Speech recognition based on a model of human auditory system
Takuya Koizumi, Mikio Mori, Shuji Taniguchi

APVQ encoder applied to wideband speech coding
J. M. Salavedra, E. Masgrau

Simple fast vector quantization of the line spectral frequencies
Jin Zhou, Yair Shoham, Ali Akansu


Speaker Adaptation and Normalization I


N-best-based instantaneous speaker adaptation method for speech recognition
Tomoko Matsui, Sadaoki Furui

Mixture splitting technic and temporal control in a HMM-based recognition system
C. Montacié, M.-J. Caraty, C. Barras

A unified spectral transformation adaptation approach for robust speech recognition
Lei Yao, Dong Yu, Taiyi Huang

On-line adaptive learning of the correlated continuous density hidden Markov models for speech recognition
Qiang Huo, Chin-Hui Lee

Speaker adaptation by modeling the speaker variation in a continuous speech recognition system
Nikko Ström

An enquiring system of unknown words in TV news by spontaneous repetition (application of speaker normalization by speaker subspace projection)
Yasuo Ariki, Shigeaki Tagashira

Adaptive recognition method based on posterior use of distribution pattern of output probabilities
Jin-Song Zhang, Beiqian Dai, Changfu Wang, Hingkeung Kwan, Keikichi Hirose

Iterative unsupervised adaptation using maximum likelihood linear regression
P. C. Woodland, D. Pye, M. J. F. Gales

A compact model for speaker-adaptive training
Tasos Anastasakos, John McDonough, Richard Schwartz, John Makhoul

Iterative unsupervised speaker adaptation for batch dictation
Shigeru Homma, Jun-ichi Takahashi, Shigeki Sagayama

Rapid unsupervised adaptation to children's speech on a connected-digit task
Daniel C. Burnett, Mark Fanty

Speaker adaptation using tree structured shared-state HMMs
Jun Ishii, Masahiro Tonomura, Shoichi Matsunaga




Acoustic Modeling


Learning pronunciation dictionary from speech data
Christian-Michael Westendorf, Jens Jelitto

The trended HMM with discriminative training for phonetic classification
C. Rathinavelu, Li Deng

Improving decision trees for acoustic modeling
Ariane Lazaridès, Yves Normandin, Roland Kuhn

An improved training algorithm in HMM-based speech recognition
Gongjun Li, Taiyi Huang

Speech recognition using a strong correlation assumption for the instantaneous spectra
J. Ming, P. O'Boyle, J. McMahon, F. J. Smith

On parameter filtering in continuous subword-unit-based speech recognition
Pau Pachès-Leal, Climent Nadeu

Estimation of statistical phoneme center considering phonemic environments
Shigeki Okawa, Katsuhiko Shirai

Integration of context-dependent durational knowledge into HMM-based speech recognition
Xue Wang, Louis F. M. ten Bosch, Louis C. W. Pols

Speech recognition based on acoustically derived segment units
T. Fukada, M. Bacchiani, Kuldip K. Paliwal, Yoshinori Sagisaka

Robust gender-dependent acoustic-phonetic modelling in continuous speech recognition based on a new automatic male/female classification
Rivarol Vergin, Azarshid Farhat, Douglas O'Shaughnessy

A codebook adaptation algorithm for SCHMM using formant distribution
Tae Young Yang, Won Ho Shin, Weon Goo Kim, Dae Hee Youn

Parameter tying for flexible speech recognition
J. Simonin, S. Bodin, D. Jouvet, K. Bartkova

Word-spotting based on inter-word and intra-word diphone models
Tsuneo Nitta, Shin'ichi Tanaka, Yasuyuki Masai, Hiroshi Matsu'ura

Duration modeling with expanded HMM applied to speech recognition
Antonio Bonafonte, Josep Vidal, Albino Nogueiras

Different strategies for distribution clustering using discrete, semicontinuous and continuous HMMs in CSR
Ricardo de Córdoba, José M. Pardo

Improved HMM phone and triphone models for realtime ASR telephony applications
Ilija Zeljkovic, Shrikanth Narayanan

Improved extended HMM composition by incorporating power variance
Yasuhiro Minami, Sadaoki Furui

Optimal filtering and smoothing for speech recognition using a stochastic target model
Gordon Ramsay, Li Deng

Speech recognition using syllable-like units
Zhihong Hu, Johan Schalkwyk, Etienne Barnard, Ronald A. Cole

Context modeling and clustering in continuous speech recognition
Jean-Claude Junqua, Lorenzo Vassallo

Hierarchical partition of the articulatory state space for overlapping-feature based speech recognition
Li Deng, Jim Jian-Xiong Wu

A fuzzy acoustic-phonetic decoder for speech recognition
Olivier Oppizzi, David Fournier, Philippe Gilles, Henri Méloni

Syllable-level desynchronisation of phonetic features for speech recognition
Katrin Kirchhoff

A probabilistic framework for feature-based speech recognition
James Glass, Jane Chang, Michael McCandless

Modeling context-dependent phonetic units in a continuous speech recognition system for Mandarin Chinese
Jim Jian-Xiong Wu, Li Deng, Jacky Chan




Acoustic Analysis


A probabilistic approach to AMDF pitch detection
Goangshiuan S. Ying, Leah H. Jamieson, Carl D. Mitchell

From sagittal cut to area function: an RMI investigation
Alain Soquet, Véronique Lecuit, Thierry Metens, Didier Demolin

Pitch detection and voiced/unvoiced decision algorithm based on wavelet transforms
Léonard Janer, Juan José Bonet, Eduardo Lleida-Solano

Decomposition of speech signals into a deterministic and a stochastic part
Yannis Stylianou

Improved glottal closure instant detector based on linear prediction and standard pitch concept
Cheol-Woo Jo, Ho-Gyun Bang, William A. Ainsworth

Analysis of speech segments using variable spectral/temporal resolution
Xihong Wang, Stephen A. Zahorian, Stefan Auberg

Time-based clustering for phonetic segmentation
Brian Eberman, William Goldenthal

Formant analysis using mixtures of Gaussians
Parham Zolfaghari, Tony Robinson

Deriving articulatory representations from speech with various excitation modes
Hywel B. Richards, John S. Mason, Melvyn J. Hunt, John S. Bridle

Blind speech segmentation: automatic segmentation of speech without linguistic knowledge
Manish Sharma, Richard J. Mammone

Speech synthesis using a nonlinear energy damping model for the vocal folds vibration effect
Hiroshi Ohmura, Kazuyo Tanaka

Neural networks learning with L1 criteria and its efficiency in linear prediction of speech signals
Munehiro Namba, Hiroyuki Kamata, Yoshihisa Ishida

Preprocessing and neural classification of English stop consonants [b,d,g,p,t,k]
Anna Esposito, C. E. Ezin, M. Ceccarelli

A comparison of modified k-means (MKM) and NN based real time adaptive clustering algorithms for articulatory space codebook formation
K. S. Ananthakrishnan

A novel approach to the estimation of voice source and vocal tract parameters from speech signals
Wen Ding, Hideki Kasuya

Syllable detection in read and spontaneous speech
Hartmut R. Pfitzinger, Susanne Burger, Sebastian Heid

Maximum likelihood learning of auditory feature maps for stationary vowels
Kuansan Wang, Chin-Hui Lee, Biing-Hwang Juang

Explicit segmentation of speech using Gaussian models
Antonio Bonafonte, Albino Nogueiras, Antonio Rodriguez-Garrido

A comparison of several recent methods of fundamental frequency and voicing decision estimation
E. Mousset, William A. Ainsworth, José A. R. Fonollosa

Robust pitch estimation with harmonics enhancement in noisy environments based on instantaneous frequency
Toshihiko Abe, Takao Kobayashi, Satoshi Imai

Integrated polispectrum on speech recognition
Asunción Moreno, Miquel Rutllán





Speech Synthesis


Multilingual text analysis for text-to-speech synthesis
Richard Sproat

Spoken-style explanation generator for Japanese kanji using a text-to-speech system
Yoshifumi Ooyama, Hisako Asano, Koji Matsuoka

A method for estimating prosodic symbol from text for Japanese text-to-speech synthesis
Ken-ichi Magata, Tomoki Hamagami, Mitsuo Komura

Statistical methods in data-driven modeling of Spanish prosody for text to speech
E. López-Gonzalo, J. M. Rodríguez-García

Intonation processing for TTS using stylization and neural network learning method
Jung-Chul Lee, Youngjik Lee, Sang-Hun Kim, Minsoo Hahn

Generating F0 contours from ToBI labels using linear regression
Alan W. Black, Andrew J. Hunt

The broad study of homograph disambiguity for Mandarin speech synthesis
Wern-Jun Wang, Shaw-Hwa Hwang, Sin-Horng Chen

The MBROLA project: towards a set of high quality speech synthesizers free of use for non commercial purposes
Thierry Dutoit, Vincent Pagel, N. Pierret, F. Bataille, O. Van der Vrecken

Training data selection for voice conversion using speaker selection and vector field smoothing
Makoto Hashimoto, Norio Higuchi

A new voice transformation method based on both linear and nonlinear prediction analysis
Ki Seung Lee, Dae Hee Youn, Il Whan Cha

On the transformation of the speech spectrum for voice conversion
G. Baudoin, Yannis Stylianou

Spectral analysis of synthetic speech and natural speech with noise over the telephone line
Cristina Delogu, Andrea Paoloni, Susanna Ragazzini, Paola Ridolfi

A new speech synthesis system based on the ARX speech production model
Weizhong Zhu, Hideki Kasuya

Speech synthesis using the CELP algorithm
Geraldo Lino de Campos, Evandro Bacci Gouvêa

A Mandarin text-to-speech system
Shaw-Hwa Hwang, Sin-Horng Chen, Yih-Ru Wang

Residual-based speech modification algorithms for text-to-speech synthesis
Mike D. Edgington, A. Lowry

A generalized LR parser for text-to-speech synthesis
Per Olav Heggtveit

Enhanced shape-invariant pitch and time-scale modification for concatenative speech synthesis
M. P. Pollard, B. M. G. Cheetham, C. C. Goodyear, Mike D. Edgington, A. Lowry

An excitation synchronous pitch waveform extraction method and its application to the VCV-concatenation synthesis of Japanese spoken words
Yasuhiko Arai, Ryo Mochizuki, Hirofumi Nishimura, Takashi Honda

A new Chinese text-to-speech system with high naturalness
Ren-Hua Wang, Qinfeng Liu, Difei Tang

Voice conversion based on topological feature maps and time-variant filtering
Ansgar Rinscheid







Production and Prosody Posters


A frequency domain method for parametrization of the voice source
Paavo Alku, Erkki Vilkman

Glottal correlates of the word stress and the tense/lax opposition in German
Krzysztof Marasek

Coarticulatory stability in American English /r/
Suzanne Boyce, Carol Y. Espy-Wilson

An MRI-based analysis of the English /r/ and /l/ articulations
Shinobu Masaki, Reiko Akahane-Yamada, Mark K. Tiede, Yasuhiro Shimada, Ichiro Fujimoto

Does lexical stress or metrical stress better predict word boundaries in Dutch?
David van Kuijk

Optopalatograph (OPG): a new apparatus for speech production analysis
Alan A. Wrench, A. D. McIntosh, William J. Hardcastle

Prediction of vowel systems using a deductive approach
René Carré

Distinctions between [t] and [tch] using electropalatography data
Sheila J. Mair, Celia Scully, Christine H. Shadle

Relating formants and articulation in intelligibility test words
Michiko Hashi, Raymond D. Kent, John R. Westbury, Mary J. Lindstrom

The role of coarticulation in the perception of vowel quality in Modern Standard Arabic
Imad Znagui, Mohamed Yeou

Updating the Reading EPG
Simon Arnfield, Wilf Jones

Lexical stress detection on stress-minimal word pairs
Goangshiuan S. Ying, Leah H. Jamieson, Ruxin Chen, Carl D. Mitchell

An acoustic study of the interaction between stressed and unstressed syllables in spoken Mandarin
Jing Wang

Automatic detection of accent nuclei at the head of words for speech recognition
Nobuaki Minematsu, Seiichi Nakagawa

Automatic generation of prosodic structure for high quality Mandarin speech synthesis
Fu-chiang Chou, Chiu-yu Tseng, Lin-shan Lee

A study on Japanese prosodic pattern and its modeling in restricted speech
Tomoki Hamagami, Ken-ichi Magata, Mitsuo Komura

A phonetic study of focus in intransitive verb sentences
Steve Hoskins

Goethe for prosody
Stefan Rapp

Prosodic cues in syntactically ambiguous strings; an interactive speech planning mechanism
K. A. Straub

A functional model for generation of the local components of F0 contours in Chinese
Jinfu Ni, Ren-Hua Wang, Deyu Xia

The acquisition of voiceless stops in the interlanguage of second language learners of English and Spanish
Marie Fellbaum





Speaker/Language Identification and Verification


Automatic accent classification of foreign accented Australian English speech
Karsten Kumpf, Robin W. King

Discriminative adaptation for speaker verification
F. Korkmazskiy, Biing-Hwang Juang

Perceptual features of unknown foreign languages as revealed by multi-dimensional scaling
V. Stockmal, D. Muljani, Z. S. Bond

On-line incremental adaptation for speaker verification using maximum likelihood estimates of CDHMM parameters
Kin Yu, John S. Mason

Combining methods to improve speaker verification decision
Dominique Genoud, Frédéric Bimbot, Guillaume Gravier, Gérard Chollet

Incremental speaker adaptation with minimum error discriminative training for speaker identification
Cesar Martín del Alamo, J. Alvarez, C. de la Torre, F. J. Poyatos, Lúis Hernández

Frame level likelihood normalization for text-independent speaker identification using Gaussian mixture models
Konstantin P. Markov, Seiichi Nakagawa

On using prosodic cues in automatic language identification
Ann E. Thymé-Gobbel, Sandra E. Hutchins

Speaker recognition model using two-dimensional mel-cepstrum and predictive neural network
Tadashi Kitamura, Shinsai Takei

Unknown language rejection in language identification system
Hingkeung Kwan, Keikichi Hirose

Spoken language identification using large vocabulary speech recognition
James L. Hieronymus, Shubha Kadambe

Accent identification
Carlos Teixeira, Isabel M. Trancoso, António Serralheiro

Comparison of text-independent speaker recognition methods on telephone speech with acoustic mismatch
Sarel van Vuuren

On the sources of inter- and intra-speaker variability in the acoustic dynamics of speech
Xue Yang, J. Bruce Millar, Iain Macleod

Language identification with inaccurate string matching
Kay M. Berkling, Etienne Barnard

Robust prosodic features for speaker identification
M. J. Carey, E. S. Parris, H. Lloyd-Thomas, S. J. Bennett

Text independent speaker identification on noisy environments by means of self organizing maps
E. Monte, J. Hernando, X. Miró, A. Adolf

Language identification using language-dependent phonemes and language-independent speech units
Paul Dalsgaard, Ove Andersen, Hanne Hesselager, Bojan Petek






Databases and Tools


BABEL: an Eastern European multi-language database
Peter Roach, Simon Arnfield, William J. Barry, J. Baltova, Marian Boldea, Adrian Fourcin, W. Gonet, Ryszard Gubrynowicz, E. Hallum, Lori Lamel, Krzysztof Marasek, Alain Marchal, E. Meister, Klára Vicsi

USTC95 - a Putonghua corpus
Ren-Hua Wang, Deyu Xia, Jinfu Ni, Bicheng Liu

Telephone data collection using the world wide web
Edward Hurley, Joseph Polifroni, James Glass

The "SIVA" speech database for speaker verification: description and evaluation
M. Falcone, A. Gallo

A multi-level description of date expressions in German telephone speech
Christoph Draxler

Viterbi search visualization using vista: a generic performance visualization tool
Robert H. Jr. Halstead, Ben Serridge, Jean-Manuel Van Thong, William Goldenthal

A multilingual phonetic representation and analysis system for different speech databases
Toomas Altosaar, Matti Karjalainen, Martti Vainio

FRESCO: the French telephone speech data collection - part of the European SpeechDat(M) project
D. Langmann, Reinhold Haeb-Umbach, Louis Boves, E. den Os

Predicting the out-of-vocabulary rate and the required vocabulary size for speech processing applications
Johannes Müller, Holger Stahl, Manfred Lang

AMULET: automatic MUltisensor speech labelling and event tracking: study of the spatio-temporal correlations in voiceless plosive production
Nathalie Parlangeau, Alain Marchal

Constructing multi-level speech database for spontaneous speech processing
Minsoo Hahn, Sanghun Kim, Jung-Chul Lee, Yong-Ju Lee

Preliminaries to a Romanian speech database
Marian Boldea, Alin Doroga, Tiberiu Dumitrescu, Maria Pescaru

Labelled data bank of spoken standard German: the Kiel corpus of read/spontaneous speech
Klaus J. Kohler

SAPPHIRE: an extensible speech analysis and recognition tool based on Tcl/Tk
Lee Hetherington, Michael McCandless

Automatic detection of topic boundaries and keywords in arbitrary speech using incremental reference interval-free continuous DP
Jiro Kiyama, Yoshiaki Itoh, Ryuichi Oka

Very-large-vocabulary Mandarin voice message file retrieval using speech queries
Bo-Ren Bai, Lee-Feng Chien, Lin-Shan Lee

Gandalf - a Swedish telephone speaker verification database
H. Melin

The DCIEM map task corpus: spontaneous dialogue under sleep deprivation and drug treatment
Ellen Gurman Bard, C. Sotillo, A. H. Anderson, M. M. Taylor

The Nemours database of dysarthric speech
Xavier Menéndez-Pidal, James B. Polikoff, Shirley M. Peters, Jennie E. Leonzio, H. T. Bunnell

POST: parallel object-oriented speech toolkit
Jean Hennebert, Dijana Petrovska Delacrétaz





Topics in ASR and Search


Clustered language models with context-equivalent states
J. P. Ueberla, I. R. Gransden

Modeling of contextual effects and its application to word spotting
Yuji Yonezawa, Masato Akagi

A new keyword spotting algorithm with pre-calculated optimal thresholds
J. Junkawitsch, L. Neubauer, Harald Höge, Günther Ruske

Detection of ambiguous portions of signal corresponding to OOV words or misrecognized portions of input
Roxane Lacouture, Yves Normandin

Techniques for approximating a trigram language model
Fabio Brugnara, Marcello Federico

Unsupervised and incremental speaker adaptation under adverse environmental conditions
Keizaburo Takagi, Koichi Shinoda, Hiroaki Hattori, Takao Watanabe

An adaptive-beam pruning technique for continuous speech recognition
Hugo Van hamme, Filip Van Aelten

Data based filter design for RASTA-like channel normalization in ASR
Carlos Avendano, Sarel van Vuuren, Hynek Hermansky

A comparison of time conditioned and word conditioned search techniques for large vocabulary speech recognition
S. Ortmanns, Hermann Ney, Frank Seide, I. Lindam

Language-model look-ahead for large vocabulary speech recognition
S. Ortmanns, Hermann Ney, A. Eiden

A new search algorithm in segmentation lattices of speech signals
Jean-Luc Husson, Yves Laprie

LR-parser-driven Viterbi search with hypotheses merging mechanism using context-dependent phone models
Tomokazu Yamada, Shigeki Sagayama

Discrete-utterance recognition with a fast match based on total data reduction
Jan Nouza

On-line garbage modeling with discriminant analysis for utterance verification
J. Caminero-Gil, C. de la Torre, L. Villarrubia, Cesar Martín del Alamo, Luis Hernández

Cheating with imperfect transcripts
Paul Placeway, John Lafferty

Novel training method for classifiers used in speaker adaptation
Naoto Iwahashi

Large vocabulary word recognition based on a graph-structured dictionary
Katsuki Minamino

A word graph based n-best search in continuous speech recognition
Bach-Hiep Tran, Frank Seide, Volker Steinbiss

Viterbi beam search with layered bigrams
David M. Goblirsch

A wave decoder for continuous speech recognition
Eric Buhrke, Wu Chou, Qiru Zhou

Long term on-line speaker adaptation for large vocabulary dictation
Eric Thelen

Incremental generation of word graphs
Gerhard Sagerer, Heike Rautenstrauch, Gernot A. Fink, Bernd Hildebrandt, A. Jusek, Franz Kummert

Improvement in n-best search for continuous speech recognition
Irina Illina, Yifan Gong

Sethos: the UPC speech understanding system
Antonio Bonafonte, José B. Mariño, Albino Nogueiras

Segmental search for continuous speech recognition
Pietro Laface, Luciano Fissore, A. Maro, Franco Ravera






General ASR Posters


JANUS-II: towards spontaneous Spanish speech recognition
Puming Zhan, Klaus Ries, Marsal Gavaldà, Donna Gates, Alon Lavie, Alex Waibel

Reduced semi-continuous models for large vocabulary continuous speech recognition in Dutch
Kris Demuynck, Jacques Duchateau, Dirk van Compernolle

Validating different flexible vocabulary approaches on the Swiss French Polyphone and Polyvar databases
Andrei Constantinescu, Olivier Bornet, Gilles Caloz, Gérard Chollet

Use of a reliability coefficient in noise cancelling by neural net and weighted matching algorithms
Néstor Becerra Yoma, Fergus R. McInnes, Mervyn A. Jack

Likelihood normalization using an ergodic HMM for continuous speech recognition
Kazuhiko Ozeki

Dynamic control of a production model
Laurence Candille, Henri Méloni

Speech recognition using sub-word units dependent on phonetic contexts of both training and recognition vocabularies
Hiroaki Hattori, Eiko Yamada

Hidden Markov models merging acoustic and articulatory information to automatic speech recognition
Bruno Jacob, Christine Senac

Creation of unseen triphones from diphones and monophones using a speech production approach
Mats Blomberg, Kjell Elenius

Speaker-independent dictation of Chinese speech with 32k vocabulary
Bo Xu, Bing Ma, Shuwu Zhang, Fei Qu, Taiyi Huang

Using accent-specific pronunciation modelling for robust speech recognition
J. J. Humphries, P. C. Woodland, D. Pearce

Dictionary learning for spontaneous speech recognition
Tilo Sloboda, Alex Waibel

Comparison of channel normalisation techniques for automatic speech recognition over the phone
Johan de Veth, Louis Boves

Anchor point detection for continuous speech recognition in Spanish: the spotting of phonetic events
Manuel A. Leandro, José M. Pardo

Cepstral compensation by polynomial approximation for environment-independent speech recognition
Bhiksha Raj, Evandro Bacci Gouvêa, Pedro J. Moreno, Richard M. Stern

Effect of speech coders on speech recognition performance
B. T. Lilly, Kuldip K. Paliwal

Wavelet transforms for non-uniform speech recognition systems
Léonard Janer, Josep Martí, Climent Nadeu, Eduardo Lleida-Solano

A binaural model as a front-end for isolated word recognition
Tsuyoshi Usagawa, Markus Bodden, Klaus Rateitschek

A new speech enhancement: speech stream segregation
Hiroshi G. Okuno, Tomohiro Nakatani, Takeshi Kawabata





Perception of Vowels and Consonants


On the syllable structures of Chinese relating to speech recognition
Jialu Zhang

Can a moraic nasal occur word-initially in Japanese?
Takashi Otake, Kiyoko Yoneyama

Perceptual assimilation of American English vowels by Japanese listeners
Winifred Strange, Reiko Akahane-Yamada, B. H. Fitzgerald, R. Kubo

Context and speaker effects in the perceptual assimilation of German vowels by American listeners
Winifred Strange, Ocke-Schwen Bohn, S. A. Trent, M. C. McNair, K. C. Bielec

Examination of a perceptual non-native speech contrast: pharyngealized/non-pharyngealized discrimination by French-speaking adults
Mohamed Zahid

Context-dependent relevance of burst and transitions for perceived place in stops: it's in production, not perception
Roel Smits

The perception of morae in long vowels: comparison among Japanese, Korean and English speakers
Ryoji Baba, Kaori Omuro, Hiromitsu Miyazono, Tsuyoshi Usagawa, Masahiko Higuchi

Juncture cues to disfluency
Robin J. Lickley

Effects of duration and formant movement on vowel perception
James R. Sawusch

Benchmarking human performance for continuous speech recognition
N. Deshmukh, R. J. Duncan, A. Ganapathiraju, J. Picone

Intelligibility of speech with filtered time trajectories of spectral envelopes
Takayuki Arai, Misha Pavel, Hynek Hermansky, Carlos Avendano

Perceptual use of vowel and speaker information in breath sounds
Douglas H. Whalen, Sonya M. Sheffert

The role of neighborhood relative frequency in spoken word recognition
Philippe Mousty, Monique Radeau, Ronald Peereman, Paul Bertelson

Transitional probability and phoneme monitoring
James M. McQueen, Mark A. Pitt

Identification of vowel features from French stop bursts
Anne Bonneau

Listening in a second language
Z. S. Bond, Thomas J. Moore, Beverley Gable

Perception of lexical tone across languages: evidence for a linguistic mode of processing
Denis Burnham, Elizabeth Francis, Di Webster, Sudaporn Luksaneeyanawin, Chayada Attapaiboon, Francisco Lacerda, Peter Keller

Acoustic correlates to the effects of talker variability on the perception of English /r/ and /l/ by Japanese listeners
James S. Magnuson, Reiko Akahane-Yamada

