ISCA Archive ICSLP 2000 Sessions Booklet

6th International Conference on Spoken Language Processing

Beijing, China
16-20 October 2000

General Chair: Dinghua Guan

Linguistics, Phonology, Phonetics, and Psycholinguistics 1, 2


Coarticulation patterns in identical twins: an acoustic case study
S. P. Whiteside, E. Rixon

Improved lexicon formation through removal of co-articulation and acoustic recognition errors
Philip Hanna, Darryl Stewart, Ji Ming, F. Jack Smith

A two-level approach to the handling of foreign items in Swedish speech technology applications
Anders Lindström, Anna Kasaty

Word repetitions in Japanese spontaneous speech
Yasuharu Den, Herbert H. Clark

The role of language experience in speaker and rate normalization processes
Allard Jongman, Corinne B. Moore

Data-driven importance analysis of linguistic and phonetic information
Achim F. Müller, Jianhua Tao, Rüdiger Hoffmann

Overview of an intelligent system for information retrieval based on human-machine dialogue through spoken language
Hiroya Fujisaki, Katsuhiko Shirai, Shuji Doshita, Seiichi Nakagawa, Keikichi Hirose, Shuichi Itahashi, Tatsuya Kawahara, Sumio Ohno, Hideaki Kikuchi, Kenji Abe, Shinya Kiriyama

The expression and recognition of emotions through prosody
Li-chiung Yang

Prosodic marking of information status in Tokyo Japanese
Marc Swerts, Miki Taniguchi, Yasuhiro Katagiri

Influence of duration on static and dynamic properties of German vowels in spontaneous speech
Britta Wrede, Gernot A. Fink, Gerhard Sagerer

The regular accent in Chinese sentences
Bo Zheng, Bei Wang, Yufang Yang, Shinan Lu, Jianfen Cao

A tool for the synchronization of speech and mouth shapes: LIPS
Odile Mella, Dominique Fohr, Laurent Martin, Andreas Carlen

Semantic tree unification grammar: a new formalism for spoken language processing
Mohamed-Zakaria Kurdi


Discourse and Dialogue 1, 2


Identification of utterance intention in Japanese spontaneous spoken dialogue by use of prosody and keyword information
Akira Kurematsu, Yousuke Shionoya

Improved speech understanding using dialogue expectation in sentence parsing
Sherif Abdou, Michael Scordilis

The use of belief networks for mixed-initiative dialog modeling
Helen M. Meng, Carmen Wai, Roberto Pieraccini

Integrating flexibility into a structured dialogue model: some design considerations
Michael F. McTear, Susan Allen, Laura Clatworthy, Noelle Ellison, Colin Lavelle, Helen McCaffery

A task-independent dialogue controller based on the extended frame-driven method
Yasuhisa Niimi, Tomoki Oku, Takuya Nishimoto, Masahiro Araki

Language modeling for dialog system
Wei Xu, Alex Rudnicky

Building stochastic language model networks based on simultaneous word/phrase clustering
Kallirroi Georgila, Nikos Fakotakis, George Kokkinakis

Prosody and topic structuring in spoken dialogue
Li-chiung Yang, Richard Esposito

Elements of conversational computing - a paradigm shift
Stéphane H. Maes

Rejection and key-phrase spotting techniques using a mumble model in a Czech telephone dialog system
Ludek Müller, Filip Jurcicek, Lubos Smidl

Continuous listening for unconstrained spoken dialog
Tim Paek, Eric Horvitz, Eric Ringger

Audio signals in speech interfaces
Stefanie Shriver, Alan W. Black, Ronald Rosenfeld

Visualisation of spoken dialogues
Péter Pál Boda

The construction of speech output to support elderly visually impaired users starting to use the internet
Mary Zajicek


Recognition and Understanding of Spoken Language 1, 2


Effects of word string language models on noisy broadcast news speech recognition
Kazuyuki Takagi, Rei Oguro, Kazuhiko Ozeki

Semantic tokenization of verbalized numbers in language modeling
Xiaoqiang Luo, Martin Franz

Automatic transcription of lecture speech using topic-independent language modeling
Kazuomi Kato, Hiroaki Nanjo, Tatsuya Kawahara

Extending grammars based on similar-word recognition
Rocio Guillén, Randal Erman

Particle-based language modelling
E. W. D. Whittaker, P. C. Woodland

Lexical tree decoding with a class-based language model for Chinese speech recognition
W. N. Choi, Y. W. Wong, Tan Lee, P. C. Ching

Impact of bucketing on performance of linearly interpolated language models
K. Visweswariah, H. Printz, M. Picheny

An embedded knowledge integration for hybrid language modelling
Shuwu Zhang, Hirofumi Yamamoto, Yoshinori Sagisaka

Hierarchical statistical language models: experiments on in-domain adaptation
Lucian Galescu, James Allen

A language model for conversational speech recognition using information designed for speech translation
Hirofumi Yamamoto, Kouichi Tanigaki, Yoshinori Sagisaka

Optimizing BNF grammars through source transformations
Bob Carpenter, Sol Lerner, Roberto Pieraccini

On enhancing Katz-smoothing based back-off language model
Jian Wu, Fang Zheng

Can artificial neural networks learn language models?
Wei Xu, Alex Rudnicky

Improving language model perplexity and recognition accuracy for medical dictations via within-domain interpolation with literal and semi-literal corpora
Guergana Savova, Michael Schonwetter, Sergey Pakhomov

Placing structuring elements in a word sequence for generating new statistical language models
Karl Weilhammer, Günther Ruske

Dynamic selection of language models in a dialogue system
Yannick Estève, Frédéric Béchet, Renato de Mori

Stochastic modeling of semantic content for use in a spoken dialogue system
Magne H. Johnsen, Trym Holter, Torbjørn Svendsen, Erik Harborg

Spoken word recognition using the artificial evolution of a set of vocabulary
Tomio Takara, Eiji Nagaki

DeepListener: harnessing expected utility to guide clarification dialog in spoken language systems
Eric Horvitz, Tim Paek

Chinese spoken language understanding across domain
Yunbin Deng, Bo Xu, Taiyi Huang

Interpolation of stochastic grammar and word bigram models in natural language understanding
Sven C. Martin, Andreas Kellner, Thomas Portele

A portable development tool for spoken dialogue systems
Satoru Kogure, Seiichi Nakagawa

Error-tolerant language understanding for spoken dialogue systems
Yi-Chung Lin, Huei-Ming Wang

Language modeling by stochastic dependency grammar for Japanese speech recognition
Akinori Ito, Chiori Hori, Masaharu Katoh, Masaki Kohda

A tagger-aided language model with a stack decoder
Ruiqiang Zhang, Ezra Black, Andrew Finch, Yoshinori Sagisaka

Generalizing prosodic prediction of speech recognition errors
Julia Hirschberg, Diane Litman, Marc Swerts

Toward unconstrained command and control: data-driven semantic inference
Jerome R. Bellegarda, Kim E. A. Silverman

Continuous speech recognition with parse filtering
Ken Hanazawa, Shinsuke Sakai

Investigating text normalization and pronunciation variants for German broadcast transcription
Martine Adda-Decker, Gilles Adda, Lori Lamel

A comparison of data-derived and knowledge-based modeling of pronunciation variation
Mirjam Wester, Eric Fosler-Lussier

A bottom-up method for obtaining information about pronunciation variation
Judith M. Kessens, Helmer Strik, Catia Cucchiarini

Semi-continuous segmental probability modeling for continuous speech recognition
Jiyong Zhang, Fang Zheng, Mingxing Xu, Ditang Fang

Acoustic modelling using modular/ensemble combinations of heterogeneous neural networks
Christos A. Antoniou, T. Jeff Reynolds

Unifying HMM and phone-pair segment models
Hsiao-Wuen Hon, Shankar Kumar, Kuansan Wang

Multi-group mixture weight HMM
Ming Li, Tiecheng Yu

Application of pattern recognition neural network model to hearing system for continuous speech
Tetsuro Kitazoe, Tomoyuki Ichiki, Makoto Funamori

Data-dependent kernels in SVM classification of speech patterns
Nathan Smith, Mahesan Niranjan

Exploiting frequency-scaling invariance properties of the scale transform for automatic speech recognition
S. Umesh, Richard C. Rose, S. Parthasarathy

Large vocabulary continuous speech recognition under real environments using adaptive sub-band spectral subtraction
Masahiro Fujimoto, Jun Ogata, Yasuo Ariki

Perceptual harmonic cepstral coefficients as the front-end for speech recognition
Liang Gu, Kenneth Rose

Optimization of sub-band weights using simulated noisy speech in multi-band speech recognition
Yik-Cheung Tam, Brian Mak

On the use of speaking rate as a generalized feature to improve decision trees
Robert Faltlhauser, Thilo Pfau, Günther Ruske

Syllable recognition using glides based on a non-linear transformation
Jun Toyama, Masaru Shimbo

Consonant discrimination in elicited and spontaneous speech: a case for signal-adaptive front ends in ASR
Kemal Sönmez, Madelaine Plauché, Elizabeth Shriberg, Horacio Franco

A new approach for multi-band speech recognition based on probabilistic graphical models
Khalid Daoudi, Dominique Fohr, Christophe Antoine

Test of several external posterior weighting functions for multiband full combination ASR
Hervé Glotin, Frédéric Berthommier

Using the modulation wavelet transform for feature extraction in automatic speech recognition
Kanji Okada, Takayuki Arai, Noburu Kanederu, Yasunori Momomura, Yuji Murahara

AM-demodulation of speech spectra and its application to noise robust speech recognition
Qifeng Zhu, Abeer Alwan

Comparison of HMM experts with MLP experts in the full combination multi-band approach to robust ASR
Astrid Hagen, Andrew Morris

Using multiple time scales in the framework of multi-stream speech recognition
Astrid Hagen, Hervé Bourlard

Streamlining the front end of a speech recognizer
Hua Yu, Alex Waibel

Reconstruction of damaged spectrographic features for robust speech recognition
Bhiksha Raj, Michael L. Seltzer, Richard M. Stern

Impact of speaking style and speaking task on acoustic models
Janienke Sturm, Hans Kamperman, Lou Boves, Els den Os

Encoded speech recognition accuracy improvement in adverse environments by enhancing formant spectral bands
Shubha Kadambe, Ron Burns

Soft decisions in missing data techniques for robust automatic speech recognition
Jon Barker, Ljubomir Josifovski, Martin Cooke, Phil Green

New tone recognition methods for Chinese continuous speech
Jian Liu, Tiecheng Yu

Reliable bands guided similarity measure for noise-robust speech recognition
Bo Zhang, Gang Peng, William S.-Y. Wang

A novel feature extraction using multiple acoustic feature planes for HMM-based speech recognition
Tsuneo Nitta, Masashi Takigawa, Takashi Fukuda

Integrating the energy information into MFCC
Fang Zheng, Guoliang Zhang

Speaker independent phoneme recognition by MLP using wavelet features
Omar Farooq, Sekharjit Datta

A corpus-based approach for robust ASR in reverberant environments
Laurent Couvreur, Christophe Couvreur, Christophe Ris

Modeling out-of-vocabulary words for robust speech recognition
Issam Bazzi, James R. Glass

Hidden Markov model environmental compensation for automatic speech recognition on hand-held mobile devices
Bojana Gajic, Richard C. Rose

A neural network for classification with incomplete data: application to robust ASR
Andrew C. Morris, Ljubomir Josifovski, Hervé Bourlard, Martin Cooke, Phil Green

Feature-dependent allophone clustering
Shigeki Matsuda, Mitsuru Nakai, Hiroshi Shimodaira, Shigeki Sagayama

Data-driven lexical modeling of pronunciation variations for ASR
Qian Yang, Jean-Pierre Martens

Fuzzy entropy hidden Markov models for speech recognition
Dat Tran, Michael Wagner

Adjacent node continuous-state HMM’s
Carl Quillen

Modelling phonetic context using head-body-tail models for connected digit recognition
Janienke Sturm, Eric Sanders

Using support vector machines for spoken digit recognition
Issam Bazzi, Dina Katabi

Data-driven model construction for continuous speech recognition using overlapping articulatory features
Jiping Sun, Xing Jing, Li Deng

Speech recognition using HMMs with quantized parameters
Marcel Vasilache

A perception and PDE based nonlinear transformation for processing spoken words
Yingyong Qi, Jack Xin

Training of isolated word recognizers with continuous speech
Reinhard Blasig, Georg Rose, Carsten Meyer





Miscellaneous 1 [A,B,C,G,H,L,O,Q,X]


A rule-based approach to Farsi language text-to-phoneme conversion
Mohammad Reza Sadigh, Hamid Sheikhzadeh, M. R. Jahangir, Arash Farzan

Acoustic and perceptual properties of English fricatives
Allard Jongman, Yue Wang, Joan Sereno

The special phonological characteristics of monosyllabic function words in English
Stefanie Shattuck-Hufnagel, Nanette Veilleux

Selection of sublexical units for continuous speech recognition of Basque
Miren Karmele López de Ipiña, Inés Torres, Lourdes Oñederra, Amparo Varona, Luis Javier Rodríguez

Machine learning techniques for the identification of cues for stop place
Madelaine C. Plauché, Kemal Sönmez

Strategies of vowel reduction - a speaker-dependent phenomenon
Christina Widera

Syllable-final /s/ lenition in the LDC's CALLHOME Spanish corpus
Michelle A. Fox

Meaning extraction based on frame representation for Japanese spoken dialogue
Akira Kurematsu, Takeaki Nakazaki

Pitch accents, boundary tones and turn-taking in Dutch Map Task dialogues
Johanneke Caspers

An annotation scheme of spoken dialogues with topic break indexes
Yoichi Yamashita, Michiyo Murai

Application of the centering framework in spontaneous dialogues
Nanette Veilleux

Automatic lexicon generation and dialogue modeling for spontaneous speech
Hiroki Mori, Hideki Kasuya

Evaluating radio news intonation - autosegmental versus superpositional modelling
Maria Wolters, Hansjörg Mixdorff

A mixed language model for a dialogue system over the telephone
Daniele Falavigna, Roberto Gretter, Marco Orlandi

Positive and negative user feedback in a spoken dialogue corpus
Linda Bell, Joakim Gustafson

Stress and lexical activation in Dutch
Anne Cutler, Mariëtte Koster

Automatic modeling and implementation of intonation for the Arabic language in TTS systems
Safa Nasser Eldin, Hanna Abdel Nour, Rajouani Abdenbi

Modeling word durations
Venkata Ramana Rao Gadde

Japanese intonation synthesis using superposition and linear alignment models
Jennifer J. Venditti, Jan P. H. van Santen

Improving the naturalness of synthetic speech by utilizing the prosody of natural speech
Toshimitsu Minowa, Ryo Mochizuki, Hirofumi Nishimura

A hybrid statistical/RNN approach to prosody synthesis for Taiwanese TTS
Sin-Horng Chen, Chen-Chung Ho

Performance comparison among HMM, DTW, and human abilities in terms of identifying stress patterns of word utterances
Nobuaki Minematsu, Yukiko Fujisawa, Seiichi Nakagawa

Restricted-domain female-voice synthesis in Spanish: from database design to ANN prosodic modeling
Juan Manuel Montero, Ricardo Córdoba, José A. Vallejo, Juana Gutiérrez-Arriola, Emilia Enríquez, Juan Manuel Pardo

A hierarchical intonation model for synthesising F0 contours in Galician language
Xavier Fernández-Salgado, Eduardo R. Banga

Features for F0 contour prediction
Ted H. Applebaum, Nick Kibre, Steve Pearson

Prosodic variation of focused syllables of disyllabic word in Mandarin Chinese
Zhenglai Gu, Hiroki Mori, Hideki Kasuya

Automatic head gesture learning and synthesis from prosodic cues
Stephen M. Chu, Thomas S. Huang

Measuring the importance of morphological information for Finnish speech synthesis
Martti Vainio, Toomas Altosaar, Stefan Werner

Learning the parameters of quantitative prosody models
Oliver Jokisch, Hansjörg Mixdorff, Hans Kruschke, Ulrich Kordon

A method for automatic extraction of parameters of the fundamental frequency contour
Shuichi Narusawa, Hiroya Fujisaki, Sumio Ohno

Recognition of emotional states using voice, face image and thermal image of face
Tetsuro Kitazoe, Sung-Ill Kim, Yasunari Yoshitomi, Tatsuhiko Ikeda

Turn taking and multimodal information in two-people dialog
Keiko Watanuki, Susumu Seki, Hideo Miyoshi

Implementation of a text-to-speech system for Farsi language
Hamid Reza Abutalebi, Mahmood Bijankhan

Recognition of emotion in a realistic dialogue scenario
Richard Huber, Anton Batliner, Jan Buckow, Elmar Nöth, Volker Warnke, Heinrich Niemann

Differentiation in tone production in Cantonese-speaking hearing-impaired children
Johanna Barry, Peter Blamey, Kathy Lee, Dilys Cheung

Learning effects for phonetic properties of synthetic speech
Martine van Zundert, Jacques Terken

An empirical study of the effectiveness of speech-recognition-based pronunciation training
Laura Mayfield Tomokiyo, Le Wang, Maxine Eskenazi

Automatic detection of mispronounced phonemes for language learning tools
Olivier Deroo, Christophe Ris, Sofie Gielen, Johan Vanparys

Estimation of duration models for phonemes in Mexican speech synthesis
Horacio Meza Escalona, Ingrid Kirschning, Ofelia Cervantes Villagómez

Special text processing based external descriptor rule
Xiaoru Wu, Renhua Wang, Guoping Hu

Articulatory synthesis using a vocal-tract model of variable length
Zhenli Yu, Shangcui Zeng

Linguistic-prosodic processing for text-to-speech synthesis in Italian
Philippe Boula de Mareüil

A unified approach for speech synthesis and speech recognition using stochastic Markov graphs
Matthias Eichner, Matthias Wolff, Rüdiger Hoffmann

Using F0 within a phonologically motivated method of unit selection
Andrew Breen, James Salter

Analysis of the degradation of French vowels induced by the TD-PSOLA algorithm, in text-to-speech context
Christophe J. Blouin, Paul C. Bagshaw

Automatic construction of acoustic inventory for the concatenative speech synthesis for Polish
Artur Janicki

Universal and multilingual unit selection for DRESS
Diane Hirschfeld, Matthias Wolff

Improving speech synthesis for high intelligibility under adverse conditions
Davis Pan, Brian Heng, Shiufun Cheung, Ed Chang

Development of a formant-based analysis-synthesis system and generation of high quality liquid sounds of Japanese
Nobuyuki Nishizawa, Nobuaki Minematsu, Keikichi Hirose

Synthesizing and evaluating an artificial language: Klingon
Oliver Jokisch, Matthias Eichner

Non-standard word and homograph resolution for Asian language text analysis
Craig Olinsky, Alan W. Black

Re-estimation of LPC coefficients in the sense of L∞ criterion
Zhang Sen, Katsuhiko Shirai

An efficient codebook search algorithm for EVRC
Sung-Kyo Jung, Yong-Soo Choi, Young-Cheol Park, Dae-Hee Youn

The reduction of the search time by the pre-determination of the grid bit in the G.723.1 MP-MLQ
Jong-Kuk Kim, Jeong-Jin Kim, Myung-Jin Bae

Real-time telephone transmission simulation for speech recognizer and dialogue system evaluation and improvement
Sebastian Möller, Hervé Bourlard

HMM-based echo and announcement modeling approaches for noise suppression avoiding the problem of false triggers
Rathinavelu Chengalvarayan, David L. Thomson

Speaker information enhancement
Fangxin Chen

Exhaustive search for lower-bound error-rates in vocal tract length normalization
Hans Dolfing

Use of voicing information to improve the robustness of the spectral parameter set
Dusan Macho, Climent Nadeu

Residual noise compensation by a sequential EM algorithm for robust speech recognition in nonstationary noise
Kaisheng Yao, Bertram E. Shi, Satoshi Nakamura, Zhigang Cao

Principal mixture speaker adaptation for improved continuous speech recognition
Hui Ye, Pascale Fung, Taiyi Huang

Reduced impedance mismatch in speech database access
Toomas Altosaar, Martti Vainio

Internet training system for listening and pronunciation of Chinese stop consonants
Jiapeng Tian, Jouji Miwa

Identification of Japanese double-mora phonemes considering speaking rate for the use in CALL systems
Carlos Toshinori Ishi, Keikichi Hirose, Nobuaki Minematsu


Speech Perception, Comprehension, and Production (Special Session)


Phonological processing in the auditory system: a new class of stimuli and advances in fMRI techniques
Roy D. Patterson, Stefan Uppenkamp, Dennis Norris, William Marslen-Wilson, Ingrid Johnsrude, Emma Williams

Brain regions responsible for word retrieval, speech production and deficient word fluency in elderly people: a PET activation study
Itaru F. Tatsumi, Michio Senda, Kenji Ishii, Masahiro Mishina, Masashi Oyama, Hinako Toyama, Keiichi Oda, Masayuki Tanaka, Yasuyuki Gondo

MEG-measurements of brain activity reveal the link between human speech production and perception
Paavo Alku, Hannu Tiitinen, Kalle J. Palomäki, Päivi Sivonen

Normal and impaired processing in quasi-regular domains of language: the case of English past-tense verbs
Karalyn Patterson, Matthew A. Lambon Ralph, Helen Bird, John R. Hodges, James L. McClelland

Neuropsychological and computational evidence for a model of lexical processing, verbal short-term memory and learning
Nadine Martin, Eleanor M. Saffran, Gary S. Dell, Myrna F. Schwartz, Prahlad Gupta

Normal and impaired reading of Japanese kanji and kana
Takao Fushimi, Mutsuo Ijuin, Naoko Sakuma, Masayuki Tanaka, Tadahisa Kondo, Shigeaki Amano, Karalyn Patterson, Itaru F. Tatsumi

A connectionist approach to naming disorders of Japanese in dyslexic patients
Mutsuo Ijuin, Takao Fushimi, Karalyn Patterson, Naoko Sakuma, Masayuki Tanaka, Itaru Tatsumi, Tadahisa Kondo, Shigeaki Amano

Impaired pronunciations of kanji words by Japanese CVA patients
Taeko N. Wydell, Takako Shinkai

Disability of phonological versus visual information processes in Japanese dyslexic children
Akira Uno, M. Kaneko, N. Haruhara, M. Kaga

Lexical tone in the spoken word recognition of Chinese
Xiaolin Zhou, Yanxuan Qu


Prosody 1, 2


Lexical tone in the speech production of Chinese words
Xiaolin Zhou, Jie Zhuang

Prosody generation in Chinese synthesis using the template of quantified prosodic unit and base intonation contour
Yu Hu, Qin-Feng Liu, Ren-Hua Wang

Multi-strategy data mining on Mandarin prosodic patterns
Yiqiang Chen, Wen Gao, Tingshao Zhu, Jiyong Ma

A unified view on synchronized overlap-add methods for prosodic modifications of speech
Werner Verhelst, Dirk van Compernolle, Patrick Wambacq

Chinese tone modeling with Stem-ML
Chilin Shih, Greg P. Kochanski

Perceptually based automatic prosody labeling and prosodically enriched unit selection improve concatenative text-to-speech synthesis
Colin W. Wightman, Ann K. Syrdal, Georg Stemmer, Alistair Conkie, Mark Beutnagel

Data-driven importance analysis of linguistic and phonetic information
Achim F. Müller, Jianhua Tao, Rüdiger Hoffmann

Tonal structure of yes-no question intonation in Chaha
Zhiqiang Li, Degif Petros Banksira

Improved tone recognition by normalizing for coarticulation and intonation effects
Chao Wang, Stephanie Seneff

Discriminating Chinese lexical tones by anchoring F0 features
Jin-Song Zhang, Satoshi Nakamura, Keikichi Hirose

Universal and language-specific effects in the perception of question intonation
Carlos Gussenhoven, Aoju Chen

The interplay and interaction between prosody and syntax: evidence from Mandarin Chinese
Chiu-Yu Tseng, Da-De Chen

A quantitative description of German prosody offering symbolic labels as a by-product
Hansjörg Mixdorff, Hiroya Fujisaki




Production of Spoken Language (Poster)


Toward an acoustic-articulatory model of inter-speaker variability
Parham Mokhtari, Frantz Clermont, Kazuyo Tanaka

Degrees of freedom of tongue movements in speech may be constrained by biomechanics
Pascal Perrier, Joseph Perkell, Yohan Payan, Majid Zandipour, Frank Guenther, Ali Khalighi

Gestural overlap, place of articulation and speech rate - an x-ray investigation
Béatrice Vaxelaire, Rudolph Sock, Pascal Perrier

Articulatory compensation and adaptation for unexpected palate shape perturbation
Masaaki Honda, Akinori Fujino

Modeling of a speech production system based on MRI measurement of three-dimensional vocal tract shapes during fricative consonant phonation
Takuya Niikawa, Masafumi Matsumura, Takashi Tachimura, Takeshi Wada

Improving acoustic-to-articulatory inversion by using hypercube codebooks
Slim Ouni, Yves Laprie

Concatenative Arabic speech synthesis using large speech database
Wael M. Hamza, Mohsen A. Rashwan

A new speech classifier based on Yinyang compensatory soft computing theory
Dong Chen, Jingming Kuang, Yan Zhang

New models predicting conversational effects of telephone transmission on speech communication quality
Sebastian Möller, Ute Jekosch, Alexander Raake

A novel search algorithm for LSF VQ
Jinyu Li, Xin Luo, Ren-Hua Wang

Conversational networking: conversational protocols for transport, coding, and control
Stéphane H. Maes, Dan Chazan, Gilad Cohen, Ron Hoory

A low bit rate speech coding method using a formant-articulatory parameter nomogram
Hiroshi Ohmura, Akira Sasou, Kazuyo Tanaka

Variable bit-rate sinusoidal transform coding using variable order spectral estimation
Ning Li, Derek J. Molyneux, Meau Shin Ho, B. M. G. Cheetham

Efficient harmonic-CELP based hybrid coding of speech at low bit rates
Yong-Soo Choi, Seung-Kyun Ryu, Young-Cheol Park, Dae-Hee Youn

Speech enhancement based on a constrained sinusoidal model
Jesper Jensen, John H. L. Hansen

A Bark coherence function for perceived speech quality estimation
Sang-Wook Park, Seung-Kyun Ryu, Young-Cheol Park, Dae-Hee Youn

A high-efficiency scheme for secure speech transmission using spatiotemporal chaos synchronization
Jinyu Kiang, Kun Deng, Ronghuai Huang


Speaker, Dialect, and Language Recognition (Poster)


Application of speaker authentication technology to a telephone dialogue system
Leandro Rodríguez Liñares, Carmen García Mateo

Language recognition using time-frequency principal component analysis and acoustic modeling
Michel Dutat, Ivan Magrin-Chagnolleau, Frédéric Bimbot

Comparative study of GMM, DTW, and ANN on Thai speaker identification system
Chularat Tanprasert, Varin Achariyakulporn

Efficient mixed-order hidden Markov model inference
Ludwig Schwardt, Johan du Preez

Speaker identification and verification using eigenvoices
Olivier Thyes, Roland Kuhn, Patrick Nguyen, Jean-Claude Junqua

A priori threshold selection for fixed vocabulary speaker verification systems
Arun C. Surendran, Chin-Hui Lee

Application of LDA to speaker recognition
Qin Jin, Alex Waibel

Automatic language identification using mixed-order HMMs and untranscribed corpora
Ludwig Schwardt, Johan du Preez

On the potential threat of using large speech corpora for impostor selection in speaker verification
Johan Lindberg, Mats Blomberg

Phonetic consistency in Spanish for PIN-based speaker verification system
J. Ortega-Garcia, J. G. Rodriguez, D. T. Merino

An auditory feature extraction method based on forward-masking and its application in robust speaker identification and speech recognition
Zhimin Liu, Xihong Wu, Bin Zhen, Huisheng Chi

Transition-oriented hidden Markov models for speaker verification
S. Douglas Peters, Matthieu Hébert, Daniel Boies

An LLR-based technique for frame selection for GMM-based text-independent speaker identification
Pang Kuen Tsoi, Pascale Fung

Robust speaker recognition based on high order cumulant
Jiyong Ma, Wen Gao

Two-stage speaker identification system based on VQ and NBDGMM
Luo Si, Qi Xiu Hu

A MAP approach, with synchronous decoding and unit-based normalization for text-dependent speaker verification
Johnny Mariethoz, Johan Lindberg, Frédéric Bimbot

A fast search method of speaker identification for large population using pre-selection and hierarchical matching
Zhibin Pan, Koji Kotani, Tadahiro Ohmi

Optimal fusion of diverse feature sets for speaker identification: an alternative method
Lan Wang, Ke Chen, Huisheng Chi

Transformation enhanced multi-grained modeling for text-independent speaker recognition
Upendra V. Chaudhari, Jiri Navrátil, Stéphane H. Maes, Ramesh Gopinath

Imposture using synthetic speech against speaker verification based on spectrum and pitch
Takashi Masuko, Keiichi Tokuda, Takao Kobayashi

Speaker recognition with recurrent neural networks
Shahla Parveen, Abdul Qadeer, Phil Green

Speaker feature extraction from pitch information based on spectral subtraction for speaker identification
Yoshiroh Itoh, Jun Toyama, Masaru Shimbo

Text-independent speaker identification using Gaussian mixture bigram models
Wei-Ho Tsai, Chiwei Che, Wen-Whei Chang

Comparison of MFCC and pitch synchronous AM, FM parameters for speaker identification
Hassan Ezzaidi, Jean Rouat

Speaker verification in mismatch training and testing conditions
Marcos Faúndez-Zanuy, Adam Slupinski

Determination of threshold for speaker verification using speaker adaptation gain in likelihood during training
Toshiaki Uchibe, Shingo Kuroiwa, Norio Higuchi

Accent-specific Mandarin adaptation based on pronunciation modeling technology
Mingkuan Liu, Bo Xu



Generation and Synthesis of Spoken Language 1, 2


Concatenative text-to-speech synthesis based on prototype waveform interpolation (a time frequency approach)
Edmilson S. Morais, Paul Taylor, Fábio Violaro

A corpus-based Chinese speech synthesis with contextual dependent unit selection
Ren-Hua Wang, Zhongke Ma, Wei Li, Donglai Zhu

Segment selection in the L&H RealSpeak laboratory TTS system
Geert Coorman, Justin Fackrell, Peter Rutten, Bert Van Coile

A Taiwanese (Min-nan) text-to-speech (TTS) system based on automatically generated synthetic units
Ren-yuan Lyu, Zhen-hong Fu, Yuang-chin Chiang, Hui-mei Liu

Puretalk: a high quality Japanese text-to-speech system
Masayuki Yamada, Yasuo Okutani, Toshiaki Fukada, Takashi Aso, Yasuhiro Komori

Using cross-syllable units for Cantonese speech synthesis
Ka Man Law, Tan Lee

Limited domain synthesis
Alan W. Black, Kevin A. Lenzo

Coupling dialogue and prosody computation in spoken dialogue generation
Christine H. Nakatani, Jennifer Chu-Carroll

A study on the pitch pattern of a singing voice synthesis system based on the cepstral method
Tomio Takara, Kazuto Izumi, Keiichi Funaki

Automatic methods for lexical stress assignment and syllabification
Steve Pearson, Roland Kuhn, Steven Fincke, Nick Kibre

Using Bayesian belief networks for model duration in text-to-speech systems
Olga Goubanova, Paul Taylor

Comparing static and dynamic features for segmental cost function calculation in concatenative speech synthesis
Diane Hirschfeld

Temporal patterns of critical-band spectrum for text-to-speech
Pratibha Jain, Hynek Hermansky


Speaker, Dialect, and Language Recognition 1, 2


Successive cohort selection (SCS) for text-independent speaker verification
Eric H. C. Choi, Jianming Song

Fuzzy normalisation methods for speaker verification
Dat Tran, Michael Wagner

Speaker verification in operational environments - monitoring for improved service operation
Yong Gu, Hans Jongebloed, Dorota Iskra, Els den Os, Lou Boves

On-line unsupervised adaptation in speaker verification
Larry P. Heck, Nikki Mirghafori

Multiple sub-band systems for speaker verification
P. Sivakumaran, A. M. Ariyaeeinia, Jill A. Hewitt

An orthogonal GMM based speaker verification system
Xiaoxing Liu, Baosheng Yuan, Yonghong Yan

A naive de-lambing method for speaker identification
Qin Jin, Alex Waibel

The Lincoln speaker recognition system: NIST eval2000
Douglas A. Reynolds, R. Bob Dunn, Jack L. McLaughlin

Foldering voicemail messages by caller using text independent speaker recognition
Aaron E. Rosenberg, S. Parthasarathy, Julia Hirschberg, Stephen Whittaker

Structural framework for combining speaker recognition methods
Claude Montacié, Marie-José Caraty

Bootstrapping for speaker recognition
Walter D. Andrews, Joseph P. Campbell, Douglas A. Reynolds

On the importance of components of the MFCC in speech and speaker recognition
Bin Zhen, Xihong Wu, Zhimin Liu, Huisheng Chi

On the influence of rate, pitch, and spectrum on automatic speaker recognition performance
Thomas F. Quatieri, R. Bob Dunn, Douglas A. Reynolds

A model-based transformational approach to robust speaker recognition
Remco Teunen, Ben Shahshahani, Larry Heck


Linguistics, Phonology, Phonetics, and Psycholinguistics (Poster)


Contrastive lateral clicks and variation in click types
Amanda Miller-Ockhuizen, Bonny E. Sands

Analysis of acoustic models trained on a large-scale Japanese speech database
Tomoko Matsui, Masaki Naito, Yoshinori Sagisaka, Kozo Okuda, Satoshi Nakamura

Farsi vowel compensatory lengthening: an experimental approach
Mahmood Bijankhan

Cortical reorganization associated with the acquisition of Mandarin tones by American learners: an fMRI study
Yue Wang, Joan A. Sereno, Allard Jongman, Joy Hirsch

The production of real and non-words in adult stutterers and non-stutterers: an acoustic study
S. P. Whiteside, R. A. Varley, T. Phillips, H. Garety

A new proposal of laryngeal features for the tonal system of Vietnamese
Masaaki Shimizu, Masatake Dantsuji

How to choose training set for language modeling
Hong Zhang, Bo Xu, Taiyi Huang

High performance "general purpose" phonetic recognition for Italian
Piero Cosi, John-Paul Hosom

First approach to the selection of lexical units for continuous speech recognition of Basque
Miren Karmele López de Ipiña, Inés Torres, Lourdes Oñederra, Amparo Varona, N. Ezeiza, M. Peñagarikano, M. Hernandez, Luis Javier Rodriguez

Assimilation, ambiguity, and the feature parsing problem
David W. Gow Jr.

Optimization of units for continuous-digit recognition task
Sachin S. Kajarakar, Hynek Hermansky

Perceptual features for the identification of Romance languages
Ioana Vasilescu, Francois Pellegrino, Jean-Marie Hombert

Perception of Swedish vowel quantity: tracing late stages of development
Dawn M. Behne, Peter E. Czigler, Kirk P. H. Sullivan

Statistically trained orthographic to sound models for Thai
Ananlada Chotimongkol, Alan W. Black

Speech timing patterning as an indicator of discourse and syntactic boundaries
Janice Fon, Keith Johnson

On the phonetics of geminates: evidence from Cypriot Greek
Amalia Arvaniti, Georgios Tserdanelis

A simple procedure to clarify the relation between text and prosody
Hanny den Ouden, Carel van Wijk, Marc Swerts

Effects of consonantal voicing on English diphthongs: a comparison of L1 and L2 production
Kimiko Tsukada

The challenge of non-lexical speech sounds
Nigel Ward

A method to synthesize Arabic from short phonetic
Yousif A. El-Imam

A Brazilian Portuguese language corpus development
Mauricio C. Schramm, Luis Felipe R. Freitas, Adriano Zanuz, Dante Barone

Visual lipreading of voicing for French stop consonants
C. Colin, Monique Radeau, Didier Demolin, A. Soquet

Acoustic features of vowel production in Mandarin speakers of English
Yang Chen, Michael Robb

Spoken language navigation systems for drivers
Robert Belvin, Ron Burns, Cheryl Hein

An approach to intelligent Chinese dialogue system
Fang Chen, Baozong Yuan

Goal-oriented table-driven design for dialogue manager
Huei-Ming Wang, Yi-Chung Lin

Dialogue management in the Bell Labs communicator system
Alexandros Potamianos, Egbert Ammicht, Hong-Kwang J. Kuo

Dialogue management based on a hierarchical task structure
Jiang Han, Yong Wang

Melodic characteristics of backchannels in Dutch map task dialogues
Johanneke Caspers

Corrections in spoken dialogue systems
Marc Swerts, Diane Litman, Julia Hirschberg

F0 correlates of topic and subject in spontaneous Japanese speech
John Fry

Specification of communicative acts of utterances based on dialogue corpus analysis
Mutsuko Tomokiyo, Solange Hollard

An experimental verification of the prosodic/lexical effects on the occurrence of backchannels
Hiroaki Noguchi, Yasuhiro Katagiri, Yasuharu Den

The acoustic characteristics of Japanese identical vowel sequences in connected speech
Tsutomu Sato, John A. Maidment


Spoken and Multi-Modal Dialogue Systems


Effects of dialog initiative and multi-modal presentation strategies on large directory information access
Shrikanth Narayanan, Giuseppe Di Fabbrizio, C. Kamm, James Hubbell, B. Buntschuh, P. Ruscitti, Jerry H. Wright

A declarative framework for building compositional dialog modules
William Thompson, Harry Bliss

A plan-based dialog system with probabilistic inferences
Kuansan Wang

Generating effective confirmation and guidance using two-level confidence measures for dialogue systems
Kazunori Komatani, Tatsuya Kawahara

Intelligent barge-in in conversational systems
Nikko Ström, Stephanie Seneff

A system for the research into multi-modal man-machine communication within a virtual environment
Andrew Breen, Barry Eggleton, Gavin Churcher, Paul Deans, Simon Downey

Advances in automatic transcription of Italian broadcast news
Fabio Brugnara, Mauro Cettolo, Marcello Federico, Diego Giuliani

Live thesaurus construction for interactive voice-based web search
Shui-Lung Chuang, Hsiao-Tieh Pu, Wen-Hsiang Lu, Lee-Feng Chien

Selecting TV news stories and newswire articles related to a target article of newswire using SVM
Yoshimi Suzuki, Fumiyo Fukumoto, Yoshihiro Sekiguchi

Towards an integrated approach for spoken document retrieval
Kenney Ng

An experimental study of an audio indexing system for the web
Beth Logan, Pedro Moreno, Jean-Manuel van Thong, Ed Whittaker

Title generation for spoken broadcast news using a training corpus
Rong Jin, Alex G. Hauptmann

Evaluating different information retrieval algorithms on real-world data
Manfred Weber, Thomas Kemp

Transcription and summarization of voicemail speech
Konstantinos Koumpis, Steve Renals

Robust rejection for embedded systems
W. C. Tsai, Y. C. Chu

Multimodal signal processing in naturalistic noisy environments
Sharon Oviatt

A multi-modal dialog system for business transactions
Joyce Chai, Sylvie Levesque, Margorzata Budzikowska, Veronika Horvath, Nanda Kambhatla, Nicolas Nicolov, Wlodek Zadrozny

Office message center - a spoken dialogue system
Jiang Han, Yonghong Yan, Zhiwei Lin, Yong Wang, Jian Liu, Danjun Liu, Zhihui Wang

A new method for understanding sequences of utterances by multiple speakers
Noboru Miyazaki, Jun-ichi Hirasawa, Mikio Nakano, Kiyoaki Aikawa

Improvement of dialogue efficiency by dialogue control model according to performance of processes
Hideaki Kikuchi, Katsuhiko Shirai

MUXING: a telephone-access Mandarin conversational system
C. Wang, D. Scott Cyphers, Xiaolong Mou, Joseph Polifroni, Stephanie Seneff, J. Yi, Victor Zue

Jaspis - a framework for multilingual adaptive speech applications
Markku Turunen, Jaakko Hakulinen

The CU communicator: an architecture for dialogue systems
Bryan Pellom, Wayne Ward, Sameer Pradhan

Preferred modalities in dialogue systems
Vildan Bilici, Emiel Krahmer, Saskia te Riele, Raymond Veldhuis

Introduction to the IST-HLT project speech-driven multimodal automatic directory assistance (SMADA)
Frédéric Béchet, Elisabeth den Os, Lou Boves, Jürgen Sienel

Using HPSG to represent multi-modal grammar in multi-modal dialogue
Crusoe Mao, Tony Tuo, Danjun Liu

An efficient dialogue control method under system's limited knowledge
Kohji Dohsaka, Norihito Yasuda, Noboru Miyazaki, Mikio Nakano, Kiyoaki Aikawa

A distributed spoken user interface based on open agent architecture (OAA)
Ying Cheng, Anurag Gupta, Raymond Lee





Miscellaneous Topics 2 [M,J]


Automatic speech recognition in Mandarin for embedded platforms
Fengguang Zhao, Prabhu Raghavan, Sunil K. Gupta, Ziyi Lu, Wentao Gu

Confidence measure based unsupervised speaker adaptation
Husheng Li, Jia Liu, Runsheng Liu

Improved variable preselection list length estimation using NNs in a large vocabulary telephone speech recognition system
Javier Macías-Guarasa, Javier Ferreiros, José Colás, A. Gallardo-Antolín, Juan Manuel Pardo

Incorporating multiple-HMM acoustic modeling in a modular large vocabulary speech recognition system in telephone environment
Ascensión Gallardo-Antolín, Javier Ferreiros, Javier Macías-Guarasa, R. de Córdoba, Juan Manuel Pardo

Decision tree based text-to-phoneme mapping for speech recognition
Janne Suontausta, Juha Häkkinen

Reduced traceback matrix storage for small footprint model alignment
Jeff Meunier

Dynamic adaptation of vocabulary independent HMMs to an application environment
Claudio Vair, Luciano Fissore, Pietro Laface

Synergy of spectral and perceptual features in multi-source connectionist speech recognition
Roberto Gemello, Loreta Moisa, Pietro Laface

High performance connected digit recognition through gender-dependent acoustic modelling and vocal tract length normalisation
Ramalingam Hariharan, Olli Viikki

Transcription of broadcast news with a time constraint: IBM’s 10xRT HUB4 system
Ellen Eide, Benoît Maison, D. Kanevsky, P. Olsen, S. Chen, L. Mangu, M. Gales, Miroslav Novak, Ramesh Gopinath

Exact alpha-beta computation in logarithmic space with application to MAP word graph construction
Geoffrey Zweig, Mukund Padmanabhan

Relationship among speaking style, inter-phoneme's distance and speech recognition performance
Kazumasa Yamamoto, Seiichi Nakagawa

Spanish recogniser of continuously spelled names over the telephone
Ruben San-Segundo, José Colás, Javier Ferreiros, Javier Macías-Guarasa, Juan Manuel Pardo

Two-stream modeling of Mandarin tones
Frank Seide, Nick J.C. Wang

A neural network speech recognizer based on the both acoustic steady portions and transitions
Seyyed Ali Seyyed Salehi

Belief networks for a syntactic and semantic analysis of spoken utterances for speech understanding
Marc Hofmann, Manfred Lang

A robust speech understanding system using conceptual relational grammar
Jiping Sun, Roberto Togneri, Li Deng

Incorporating tone information into Cantonese large-vocabulary continuous speech recognition
Wai Lau, Tan Lee, Yiu Wing Wong, P. C. Ching

A novel loss function for the overall risk criterion based discriminative training of HMM models
Janez Kaiser, Bogomir Horvat, Zdravko Kacic

Looking for topic similarities of highly inflected languages for language model adaptation
Mirjam Sepesy Maucec, Zdravko Kacic, Bogomir Horvat

Integrating MAP and linear transformation for language model adaptation
David Janiszek, Frédéric Béchet, Renato De Mori

Utterance verification based speech recognition system
Beng Tiong Tan, Yong Gu, Trevor Thomas

Use of linear extrapolation based linear predictive cepstral features (LE-LPCC) for Tamil speech recognition
Rathinavelu Chengalvarayan

Robust fundamental frequency estimation using instantaneous frequencies of harmonic components
Yoshinori Atake, Toshio Irino, Hideki Kawahara, Jinlin Lu, Satoshi Nakamura, Kiyohiro Shikano

Integrating different acoustic and syntactic language models in a continuous speech recognition system
Amparo Varona, Inés Torres, Miren Karmele López de Ipiña, Luis Javier Rodriguez

Combining multiple speech recognizers using voting and language model information
Holger Schwenk, Jean-Luc Gauvain

Dialogue management based on inferred behavioral goal - improving the accuracy of understanding by dialogue context -
Keisuke Watanabe, Yasushi Ishikawa

Speech recognition using context conditional word posterior probabilities
Ralf Schlüter, Frank Wessel, Hermann Ney

The use of syllable segmentation information in continuous speech recognition hybrid systems applied to the Portuguese language
Hugo Meinedo, Joao P. Neto

Combination of acoustic models in continuous speech recognition hybrid systems
Hugo Meinedo, Joao P. Neto

Automatic speech recognition of non-native speakers using consonant-vowel-consonant (CVC) words
David A. van Leeuwen, Sander J. van Wijngaarden

Understanding Chinese in spoken dialogue systems
Gang Zhao, Hong Xu

A front-end using the harmonicity cue for speech enhancement in loud noise
Frédéric Berthommier, Hervé Glotin, Emmanuel Tessier

Lucent automatic speech recognition: a speech recognition engine for internet and telephony service applications
Qiru Zhou, Sergey Kosenko

Automatic speech recognition using dynamic Bayesian networks with both acoustic and articulatory variables
Todd A. Stephenson, Hervé Bourlard, Samy Bengio, Andrew C. Morris

Towards robust telephony speech recognition in office and automobile environments
Subrata Das, David Lubensky

Extracting phonological chunks based on piecewise linear segment lattices
Hiroaki Kojima, Kazuyo Tanaka

Evaluating hierarchical hybrid statistical language models
Lucian Galescu, James Allen

An efficient lexical tree search for large vocabulary continuous speech recognition
Jun Ogata, Yasuo Ariki

Reliability evaluation of speech recognition in acoustic modeling
Bin Jia, Xiaoyan Zhu, Yupin Luo, Dongcheng Hu

Using GMM for voiced/voiceless segmentation and tone decision in Mandarin continuous speech recognition
Ching X. Xu

Auditory spectrum based features (ASBF) for robust speech recognition
Chi H. Yim, Oscar C. Au, Wanggen Wan, Cyan L. Keung, Carrson C. Fung

Large vocabulary Mandarin speech recognition with different approaches in modeling tones
Eric Chang, Jianlai Zhou, Shuo Di, Chao Huang, Kai-Fu Lee

Fast very large vocabulary recognition based on compact DAWG-structured language models
Kalirroi Georgila, Kyriakos Sgarbas, Nikos Fakotakis, George Kokkinakis

Crosslinguistic disfluency modeling: a comparative analysis of Swedish and Tok Pisin human-human ATIS dialogues
Robert Eklund

Vector space representation of language probabilities through SVD of n-gram matrix
Shiro Terashima, Kazuya Takeda, Fumitada Itakura

Spoken language parsing based on incremental disambiguation
Yoshihide Kato, Shigeki Matsubara, Katsuhiko Toyama, Yasuyoshi Inagaki

Jacobian adaptation of HMM with initial model selection for noisy speech recognition
Hiroshi Shimodaira, Yutaka Kato, Toshihiko Akae, Mitsuru Nakai, Shigeki Sagayama

The BBN Byblos 2000 conversational Mandarin LVCSR system
Han Shu, Chuck Wooters, Owen Kimball, Thomas Colthurst, Fred Richardson, Spyros Matsoukas, Herbert Gish

The 2000 BBN Byblos LVCSR system
Thomas Colthurst, Owen Kimball, Fred Richardson, Han Shu, Chuck Wooters, Rukmini Iyer, Herbert Gish

Broadcast news transcription in Mandarin
Langzhou Chen, Lori Lamel, Gilles Adda, Jean-Luc Gauvain

Word concept model: a knowledge representation for dialogue agents
Yang Li, Tong Zhang, Stephen E. Levinson

Audio-visual speech recognition using MCE-based HMMs and model-dependent stream weights
Chiyomi Miyajima, Keiichi Tokuda, Tadashi Kitamura

Automatic diagnosis of recognition errors in large vocabulary continuous speech recognition systems
Hiroaki Nanjo, Akinobu Lee, Tatsuya Kawahara

Taiwanese corpus collection via continuous speech recognition tool
Yuang-Chin Chiang, Zhi-Siang Yang, Ren-Yuan Lyu

Optimal maximum likelihood on phonetic decision tree acoustic model for LVCSR
Baosheng Yuan, Qingwei Zhao, Qing Guo, Xiangdong Zhang, Zhiwei Lin

Frame level likelihood transformations for ASR and utterance verification
Konstantin P. Markov, Satoshi Nakamura

Integrating recognition confidence scoring with language understanding and dialogue modeling
Timothy J. Hazen, Theresa Burianek, Joseph Polifroni, Stephanie Seneff

Speech recognition based on estimation of mutual information
Yibiao Yu, Heming Zhao

Keyword spotting in auto-attendant system
Qing Guo, Yonghong Yan, Zhiwei Lin, Baosheng Yuan, Qingwei Zhao, Jian Liu

A new approach for modeling OOV words
Weimin Ren, Chengfa Wang, Wen Gao, Jinpei Xu

Speech recognition using error spotting
Rachida El Méliani, Douglas O'Shaughnessy

Robust endpoint detection for in-car speech recognition
Chung-Ho Yang, Ming-Shiun Hsieh

Internet speech analysis system using e-mail and web technology
Jouji Miwa, Masaru Kumagai

Multi-class linear dimension reduction by generalized Fisher criteria
Marco Loog, Reinhold Haeb-Umbach

Improving the representation of time structure in front-ends for automatic speech recognition
Wendy J. Holmes

Speech analysis by rule extraction from trained artificial neural networks
Katrin Kirchhoff

Minimum mean square error spectral peak envelope estimation for automatic vowel classification
Jaishree Venugopal, Stephen A. Zahorian, Montri Karnjanadecha

Probabilistic compensation of unreliable feature components for robust speech recognition
Cyan L. Keung, Oscar C. Au, Chi H. Yim, Carrson C. Fung

A new tone conversion method for Mandarin by an adaptive linear prediction analysis
Congxiu Wang, Qihu Li, Guoying Zhao, Li Yin, Shuai Hao, Da Meng


Trans-Modal and Multi-Modal Human-Computer Interaction (Special Session)


Multimodal interface research: a science without borders
Sharon Oviatt

Studies of audiovisual speech perception using production-based animation
K. G. Munhall, C. Kroos, T. Kuratate, J. Lucero, M. Pitermann, Eric Vatikiotis-Bateson, H. Yehia

Perceptual interfaces for information interaction: joint processing of audio and visual information for human-computer interaction
Chalapathi Neti, Giridharan Iyengar, Gerasimos Potamianos, A. Senior, Benoit Maison

Towards robust lipreading
Wen Gao, Jiyong Ma, Rui Wang, Hongxun Yao

Stream weight optimization of speech and lip image sequence for audio-visual speech recognition
Satoshi Nakamura, Hidetoshi Ito, Kiyohiro Shikano

HMM-based text-to-audio-visual speech synthesis
Shinji Sako, Keiichi Tokuda, Takashi Masuko, Takao Kobayashi, Tadashi Kitamura

Real-time speech-generated subtitles: problems and solutions
Jill Hewitt, Andi Bateman, Andrew Lambourne, A. Ariyaeeinia, P. Sivakumaran

MiPad: a next generation PDA prototype
Xuedong Huang, Alex Acero, C. Chelba, Li Deng, D. Duchene, Joshua Goodman, H. Hon, D. Jacoby, L. Jiang, R. Loynd, M. Mahajan, P. Mau, S. Meredith, S. Mughal, S. Neto, Mike Plumpe, K. Wang, Y. Wang

Dialogue management for multimodal user registration
Fei Huang, Jie Yang, Alex Waibel

Segmental optical phonetics for human and machine speech processing
Lynne E. Bernstein

Classification of Thai consonant naming using Thai tone
Umavasee Thathong, Somchai Jitapunkul, Visarut Ahkuputra, Ekkarit Maneenoi, Boonchai Thampanitchawong


Signal Analysis, Processing, and Feature Extraction 1, 2


A high-performance auditory feature for robust speech recognition
Qi Li, Frank K. Soong, Olivier Siohan

A new strategy of formant tracking based on dynamic programming
Kun Xia, Carol Espy-Wilson

Dominant subspace analysis for auditory spectrum
Xugang Lu, Gang Li, Lipo Wang

Spectral and cepstral projection bases constructed by independent component analysis
Ilyas Potamitis, Nikos Fakotakis, George Kokkinakis

Relating LPC modeling to a factor-based articulatory model
Sacha Krstulovic

On data-derived temporal processing in speech feature extraction
Michael L. Shire, Barry Y. Chen

Minimum Bayes error feature selection
George Saon, Mukund Padmanabhan

Using mutual information to design feature combinations
Daniel P. W. Ellis, Jeff A. Bilmes

Multichannel signal separation for cocktail party speech recognition: a dynamic recurrent network
Seungjin Choi, Heonseok Hong, Hervé Glotin, Frédéric Berthommier

An automatic algorithm for segmenting and labelling a connected digit sequence
V. Kamakshi Prasad, Hema A. Murthy

The signal reconstruction of speech by KPCA
Hui Yan, Xuegong Zhang, Yanda Li, Liqin Shen, Weibin Zhu

Blind source separation based on subband ICA and beamforming
Hiroshi Saruwatari, Satoshi Kurita, Kazuya Takeda, Fumitada Itakura, Kiyohiro Shikano

A synchrony front-end using phase-locked-loop techniques
Claudio Estienne, Patricia Pelle

On the use of filter-bank energies driven from the autocorrelation sequence for noisy speech recognition
Javier Hernando




Prosody (Poster)


Duration modeling for Chinese synthesis from C-ToBI labeled corpus
Weibin Zhu, Liqin Shen, Xiaochuan Miu

The pitch movement of word stress in Chinese
Bei Wang, Bo Zheng, Shinan Lu, Jianfen Cao, Yufang Yang

The distribution of fillers in lectures in the Japanese language
Michiko Watanabe, Carlos Toshinori Ishi

Research on stress in bisyllabic words of Mongolian
Huhe Harnud, Yuling Zheng, Jiayou Chen

Modelling of the perception of English sentence stress for computer-assisted language learning
Kazunori Imoto, Masatake Dantsuji, Tatsuya Kawahara

Data driven intonation modelling of 6 languages
Jeska Buhmann, Halewijn Vereecken, Justin Fackrell, Jean-Pierre Martens, Bert van Coile

Prosody prediction using a tree-structure similarity metric
Laurent Blin, Mike Edgington

Prosodic features for automatic text-independent evaluation of degree of nativeness for language learners
Carlos Teixeira, Horacio Franco, Elizabeth Shriberg, Kristin Precoda, Kemal Sönmez

Instantaneous estimation of prosodic pronunciation habits for Japanese students to learn English pronunciation
Nobuaki Minematsu, Seiichi Nakagawa

Synthesis of fundamental frequency contours of standard Chinese sentences from tone sandhi and focus conditions
Jinfu Ni, Keikichi Hirose

Syllable duration and its functions in standard Chinese discourse
Yiqing Zu, Xiaoxia Chan, Aijun Li, Wu Hua, Guohua Sun

Generating prosody by superposing multi-parametric overlapping contours
Bleicke Holm, Gérard Bailly

Consistent pitch marking
Raymond Veldhuis

Labeler agreement in transcribing Korean intonation with K-ToBI
Sun-Ah Jun, Sook-Hyang Lee, Keeho Kim, Yong-Ju Lee

Effectiveness of prosodic features in syntactic analysis of read Japanese sentences
Yukiyoshi Hirose, Kazuhiko Ozeki, Kazuyuki Takagi

A study of F0 declination in Japanese: towards a discourse model of prosodic structure
Mieko Banno

Data-driven intonation modeling using a neural network and a command response model
Atsuhiro Sakurai, Nobuaki Minematsu, Keikichi Hirose

Natural F0 contours with a new neural-network-hybrid approach
Caglayan Erdem, Martin Holzapfel, Rüdiger Hoffmann

Prosodic variation with text type
Justin Fackrell, Halewijn Vereecken, Jeska Buhmann, Jean-Pierre Martens, Bert Van Coile

Inter-transcriber reliability of ToBI prosodic labeling
Ann K. Syrdal, Julia McGory

Stem-ML: language-independent prosody description
Greg P. Kochanski, Chilin Shih

Using prosody database in Chinese speech synthesis
Minghui Dong, Kim Teng Lua

Some articulatory and acoustic changes associated with emphasis in spoken English
Donna Erickson, Kikuo Maekawa, Michiko Hashi, Jianwu Dang

Fast speech timing in Dutch: durational correlates of lexical stress and pitch accent
Esther Janse, Anke Sennema, Anneke Slis

On perception of word-based local speech rate in Japanese without focusing attention
Makoto Hiroshige, Kantaro Suzuki, Kenji Araki, Koji Tochinai

Modeling and generation of accentual phrase F0 contours based on discrete HMMs synchronized at mora-unit transitions
Atsuhiro Sakurai, Koji Iwano, Keikichi Hirose

Synthesizing prosody for commands in a Xhosa TTS system
Philippa H. Louw, Justus C. Roux, Elizabeth C. Botha


Generation and Synthesis of Spoken Language (Poster)


Design and implementation of a Greek text-to-speech system based on concatenative synthesis
Costas Christogiannis, Yiannis Stavroulas, Yiannis Vamvakoulas, Theodora Varvarigou, Agatha Zappa, Chilin Shih, Amalia Arvaniti

GENESIS-II: a versatile system for language generation in conversational system applications
Lauren Baptist, Stephanie Seneff

New analysis method for harmonic plus noise model based on time-domain periodicity score
Eun-Kyoung Kim, Yung-Hwan Oh

STRAIGHT-based voice conversion algorithm based on Gaussian mixture model
Tomoki Toda, Jinlin Lu, Hiroshi Saruwatari, Kiyohiro Shikano

Syllable-based text-to-phoneme conversion for German
Marion Libossek, Florian Schiel

A hybrid approach for grapheme-to-phoneme conversion based on a combination of partial string matching and a neural network
Horst-Udo Hain

Parametric high definition (PHD) speech synthesis-by-analysis: the development of a fundamentally new system creating connected speech by modifying lexically-represented language units
Hans G. Tillmann, Hartmut R. Pfitzinger

A new synthesis algorithm using phase information for TTS systems
Chul H. Kwon, Minkyu Lee, Joseph P. Olive

Unit fusion for concatenative speech synthesis
Johan Wouters, Michael W. Macon

Diphone collection and synthesis
Kevin A. Lenzo, Alan W. Black

Natural language generation for spoken dialogue
Thomas Portele

Preselection of candidate units in a unit selection-based text-to-speech synthesis system
Alistair Conkie, Mark C. Beutnagel, Ann K. Syrdal, Philip E. Brown

Self-organizing letter code-book for text-to-phoneme neural network model
Kare Jean Jensen, Søren Riis

A flexible, scalable finite-state transducer architecture for corpus-based concatenative speech synthesis
Jon R. W. Yi, James R. Glass, I. Lee Hetherington

Analysis of fundamental frequency contours of standard Chinese in terms of the command-response model and its application to synthesis by rule of intonation
Changfu Wang, Hiroya Fujisaki, Ryou Tomana, Sumio Ohno

Manipulating speech pitch periods according to optimal insertion/deletion position in residual signal for intonation control in speech synthesis
Toshio Hirai, Seiichi Tenpaku, Kiyohiro Shikano

Improving naturalness of Thai text-to-speech synthesis by prosodic rule
Pradit Mittrapiyanuruk, Chatchawarn Hansakunbuntheung, Virongrong Tesprasit, Virach Sornlertlamvanich

Word-level F0 range in Mandarin Chinese and its application to inserting words into a sentence
Dawei Xu, Hiroki Mori, Hideki Kasuya

A new Japanese TTS system based on speech-prosody database and speech modification
Mitsuaki Isogai, Kimihito Tanaka, Satoshi Takano, Hideyuki Mizuno, Masanobu Abe, Sin’ya Nakajima

Stress assignment in Spanish proper names
Ruben San-Segundo, Juan Manuel Montero, Ricardo de Córdoba, Juana Gutiérrez-Arriola

Segmentation of prosodic phrases for improving the naturalness of synthesized Mandarin Chinese speech
Zhengyu Niu, Peiqi Chai

Practical language modeling: an interpolating method
Xiaohu Liu, Douglas O'Shaughnessy

Combination of different n-grams based on their different assumptions
Gongjun Li, Na Dong, Toshiro Ishikawa

Construction of speech corpus in moving car environment
Nobuo Kawaguchi, Shigeki Matsubara, Hiroyuki Iwa, Shoji Kajita, Kazuya Takeda, Fumitada Itakura, Yasuyoshi Inagaki

Parsing spoken dialogues
Yue-Shi Lee, Hsin-Hsi Chen

A noise robust multilingual reference recogniser based on SPEECHDAT(II)
Børge Lindberg, Finn Tore Johansen, Narada Warakagoda, Gunnar Lehtinen, Zdravko Kacic, Andrej Zgank, Kjell Elenius, Giampiero Salvi

The design and application of a speech database for Chinese TTS system
Muhua Lv, Lianhong Cai

Use of multiple classifiers for speech recognition in wireless CDMA network environments
Rathinavelu Chengalvarayan

An imperative programming language for spoken language translation
Alexander Franz, Keiko Horiguchi, Lei Duan

Fine keyword clustering using a thesaurus and example sentences for speech translation
Yumi Wakita, Kenji Matsui, Yoshinori Sagisaka

Data collection and processing in a Chinese spontaneous speech corpus IIS_CSS
JunLan Feng, XianFang Wang, LiMin Du

Spoken language corpus for machine interpretation research
Yasuyuki Aizawa, Shigeki Matsubara, Nobuo Kawaguchi, Katsuhiko Toyama, Yasuyoshi Inagaki



Perception and Comprehension of Spoken Language 1, 2


Cross-linguistic aspects of intonation perception
Veronika Makarova

Visual information and the perception of prosody
Haruo Kubozono, Shosuke Haraguchi

Perception of synthesized singing voices with fine fluctuations in their fundamental frequency contours
Masato Akagi, Hironori Kitakaze

Neuromagnetic study on localization of speech sounds
Kalle J. Palomäki, Paavo Alku, Ville Mäkinen, Patrick May, Hannu Tiitinen

Perception of identical vowel sequences in Japanese conversational speech
Yukiyoshi Hirose, Kazuhiko Kakehi

Acoustic cues to perception of vowel quality
Santiago Fernández, Sergio Feijóo

A solution to the reduction of concatenation artefacts in speech synthesis
Esther Klabbers, Raymond Veldhuis, Kim Koppen

Domain-unconstrained language understanding based on CKIP-auto tag, how-net, and ART
Jhing-Fa Wang, Hsien-Chang Wang, Kin-Nan Lee, Chieh-Yi Huang

The generation of representations of word meanings from dictionaries
Chris Powell, Mary Zajicek, David Duce

Grammar partitioning and parser composition for natural language understanding
Po Chui Luk, Helen Meng, Filung Wang

Comprehension of synthesized speech while driving and in the lab
Jennifer Lai, Omer Tsimhoni, Paul Green

Orthographic influences on initial phoneme addition and deletion tasks: the effect of lexical status
Michael D. Tyler, Denis K. Burnham

Investigation of analysis and synthesis parameters of STRAIGHT by subjective evaluation
Parham Zolfaghari, Yoshinori Atake, Kiyohiro Shikano, Hideki Kawahara




Prosody, Acquisition, and Learning


Factors affecting native Japanese speakers' production of intrusive (epenthetic) vowels in English words
Keiichi Tajima, Donna Erickson, Kyoko Nagao

Beyond the conventional statistical language models: the variable-length sequences approach
Imed Zitouni, Kamel Smaïli, Jean-Paul Haton

Computer-assisted English vowel learning system for Japanese speakers using cross language formant structures
Yasushi Tsubota, Masatake Dantsuji, Tatsuya Kawahara

ASR-based subtitling of live TV-programs for the hearing impaired
Trym Holter, Erik Harborg, Magne Hallstein Johnsen, Torbjörn Svendsen

Natural language processing for Taiwanese sign language to speech conversion
Chung-Hsien Wu, Yu-Hsien Chiu, Chi-Shiang Guo

Japanese spoken language learning system using Java information technology
Jouji Miwa, Hiroshi Sasaki, Kazunori Tanno

L2 pronunciation quality in read and spontaneous speech
Helmer Strik, Catia Cucchiarini, Diana Binnenpoorte

Designing modulation filters for improving speech intelligibility in reverberant environments
Tomoko Kitamura, Keisuke Kinoshita, Takayuki Arai, Akiko Kusumoto, Yuji Murahara

An environment model-based robust speech recognition
Lei Zhang, Jiqing Han, Chengguo Lv, Chengfa Wang

Particle filtering for non-stationary speech modelling and enhancement
Jaco Vermaak, Christophe Andrieu, Arnaud Doucet

Maximum likelihood noise HMM estimation in model-based robust speech recognition
Martin Graciarena

Microphone array within a handset or face mask for speech enhancement
Qingsheng Zeng, Douglas O'Shaughnessy

Embedding visually recognizable watermarks into digital audio signals
Chengfa Wang, Qiusheng Wang

Auditory perception of amplitude modulated sinusoid using a pure tone and band-limited noises as modulation signals
Mamoru Iwaki

Spectral voice conversion based on unsupervised clustering of acoustic space
Masoud Geravanchizadeh

Removing hum from spoken language resources
Hartmut R. Pfitzinger

Joint pronunciation modelling of non-native speakers using data-driven methods
Ingunn Amdal, Filipp Korkmazskiy, Arun C. Surendran

A comparison of disfluency distribution in a unimodal and a multimodal speech interface
Linda Bell, Robert Eklund, Joakim Gustafson

Modelling pronunciation variations in spontaneous Mandarin speech
Yi Liu, Pascale Fung

A method of generating English pronunciation dictionary for Japanese English recognition systems
Tadashi Suzuki, Jun Ishii, Kunio Nakajima

A framework for evaluating contextual understanding
Hélène Bonneau-Maynard, L. Devillers

Towards high performance continuous Mandarin digit string recognition
Yonggang Deng, Taiyi Huang, Bo Xu

Stochastic suprasegmentals: relationships between redundancy, prosodic structure and care of articulation in spontaneous speech
Matthew Aylett

An automatic pitch-marking method using wavelet transform
Masaharu Sakamoto, Takashi Saitoh

A proposal of a model to extract Japanese voluntary speech rate control
Keiichi Takamaru, Makoto Hiroshige, Kenji Araki, Koji Tochinai

Acoustic characteristics of surprise in Russian questions
Veronika Makarova

Neural network based integration of multiple confidence measures for OOV detection
Yonggang Deng, Yang Cao, Bo Xu

How fast can we really change pitch? maximum speed of pitch change revisited
Yi Xu, Xuejing Sun

Predicting segmental durations for Dutch using the sums-of-products approach
Esther Klabbers, Jan van Santen

A stochastic polynomial tone model for continuous Mandarin speech
Yang Cao, Taiyi Huang, Bo Xu, Chengrong Li

Detection of filled pauses in spontaneous conversational speech
Marcel Gabrea, Douglas O’Shaughnessy

Some observations on different strategies for the timing of fundamental frequency events
Bertil Lyberg, Sonia Sangarig

Research on dynamic characters of Chinese pitch contours
Zhiyong Wu, Lianhong Cai, Tongchun Zhou


Adaptation and Acquisition in Spoken Language Processing (Poster)


Incorporating HMM-state sequence confusion for rapid MLLR adaptation to new speakers
Bing Zhao, Bo Xu

An online incremental speaker adaptation method using speaker-clustered initial models
Zhipeng Zhang, Sadaoki Furui

Prior parameter transformation for unsupervised speaker adaptation
Guoqiang Li, Limin Du, Ziqiang Hou

Improved Jacobian adaptation for fast acoustic model adaptation in noisy speech recognition
Ruhi Sarikaya, John H. L. Hansen

A study of vocal tract length normalization with generation-dependent acoustic models
Keiko Fujita, Yoshio Ono, Yoshihisa Nakatoh

Optimal on-line Bayesian model selection for speaker adaptation
Shaojun Wang, Yunxin Zhao

Unsupervised audio stream segmentation and clustering via the Bayesian information criterion
Bowen Zhou, John H. L. Hansen

Frame-period adaptation for speaking rate robust speech recognition
Satoru Tsuge, Toshiaki Fukada, Kenji Kita

Cross-language use of acoustic information for automatic speech recognition
C. Nieuwoudt, Elizabeth C. Botha

Selective training of HMMs by using two-stage clustering
Shoei Sato, Toru Imai, Hideki Tanaka, Akio Ando

Compensation of noise effects for robust speech recognition in car environments
Angel de la Torre, Dominique Fohr, Jean-Paul Haton

Bayesian speaker adaptation based on probabilistic principal component analysis
Dong Kook Kim, Nam Soo Kim

MLLR-based accent model adaptation without accented data
Wai Kat Liu, Pascale Fung

Fast speaker adaptation using eigenspace-based maximum likelihood linear regression
Kuan-Ting Chen, Wen-Wei Liau, Hsin-Min Wang, Lin-Shan Lee

Stream confidence estimation for audio-visual speech recognition
Gerasimos Potamianos, Chalapathy Neti

The effect of reduced spectral information on Japanese consonant perception: comparison between L1 and L2 listeners
Masahiko Komatsu, Won Tokuma, Shinichi Tokuma, Takayuki Arai

Can Cantonese children with cochlear implants perceive lexical tones?
Valter Ciocca, Rani Aisha, Alex Francis, Lena Wong

Recognition of spoken words in the continuous speech: effects of transitional probability
Michael C. W. Yip

Detection of speech landmarks using temporal cues
Ariel Salomon, Carol Espy-Wilson

A set of Japanese word cohorts rated for relative familiarity
Takashi Otake, Anne Cutler

The phonetic value of the devocalized vowel in Japanese - in case of velar plosive
Kimiko Yamakawa, Hiromitsu Miyazono, Ryoji Baba

Positive and negative influences of the lexicon on phonemic decision-making
James M. McQueen, Anne Cutler, Dennis Norris

Phonotactic and acoustic cues for word segmentation in English
Andrea Weber

Intelligibility of time-compressed speech: three ways of time-compression
Esther Janse

Evidence for demodulation in speech perception
Hartmut Traunmüller





Miscellaneous 3 [D,E,F,I,P,N,R,S,U,W,Y,Z]


Talking to thimble jellies: children's conversational speech with animated characters
Sharon Oviatt

A high-resolution glottal pulse tracker
Robert Rodman, David McAllister, Donald Bitzer, D. Chappell

Analysis of voice production in breathy, normal and pressed phonation by comparing inverse filtering and videokymography
Paavo Alku, Jan G. Svec, Erkki Vilkman, Frantisek Sram

Model of the mechanical linkage of the upper lip-jaw for the articulatory coordination
Takayuki Ito, Hiroaki Gomi, Masaaki Honda

Measurement of palatolingual contact pressure and tongue force using a force-sensor-mounted palatal plate
Masafumi Matsumura, Takuya Niikawa, Taku Torii, Hitoshi Yamasaki, Hisanaga Hara, Takashi Tachimura, Takeshi Wada

A 3D tongue model based on MRI data
Olov Engwall

Speech quality improvement in TTS system using ABS/OLA sinusoidal model
Jae-Hyun Bae, Heo-Jin Byeon, Yung-Hwan Oh

A study of palatal segments' production by Danish speakers
Marielle Bruyninckx, Bernard Harmegnies

Dynamic selection of feature spaces for robust speech recognition
Bhuvana Ramabhadran, Yuqing Gao, Michael Picheny

A probabilistic model of integration of acoustic cues in FV syllables
Santiago Fernández, Sergio Feijóo

Directed graphical models of classifier combination: application to phone recognition
Jeff A. Bilmes, Katrin Kirchhoff

Real-time multilingual HMM training robust to channel variations
E. E. Jan, Jaime Botella Ordinas, George Saon, Salim Roukos

The intelligibility of German and English speech to Dutch listeners
Sander J. van Wijngaarden, Herman J.M. Steeneken

On the use of bandpass liftering in speaker recognition
Bin Zhen, Xihong Wu, Zhimin Liu, Huisheng Chi

On auditory-phonetic short-term transformation
René Carré, Liliane Sprenger-Charolles, Souhila Messaoud-Galusi, Willy Serniclaes

Predicting the perceptual confusion of synthetic plosive consonants in noise
James J. Hant, Abeer Alwan

Compound splitting and lexical unit recombination for improved performance of a speech recognition system for German parliamentary speeches
Martha Larson, Daniel Willett, Joachim Köhler, Gerhard Rigoll

Learning and transfer of learning for synthetic speech
Martine van Zundert, Jacques Terken

Neural plasticity revealed in perceptual training of a Japanese adult listener to learn American /l-r/ contrast: a whole-head magnetoencephalography study
Yang Zhang, Patricia K. Kuhl, Toshiaki Imada, Paul Iverson, John Pruitt, Makoto Kotani, Erica Stevens

The effect of consonantal context and acoustic characteristics on the discrimination between the English vowel /i/ and /e/ by Japanese learners
Akiyo Joto

A study on emotional feature recognition in speech
Li Zhao, Wei Lu, Ye Jiang, Zhenyang Wu

LPC, LPCC and MFCC parameterisation applied to the detection of voice impairments
Juan I. Godino-Llorente, Santiago Aguilera-Navarro, Pedro Gómez-Vilda

A complementary approach to computer-aided transcription: synergy of statistical-based and knowledge discovery paradigms
Benjamin K. T'sou, Tom B. Y. Lai

Teraspeech’2000 : a 10,000 speakers database
Marie-José Caraty, Claude Montacié

The MATE workbench - a tool in support of spoken dialogue annotation and information extraction
Laila Dybkjær, Niels Ole Bernsen

Discarding impossible events from statistical language models
Armelle Brun, David Langlois, Kamel Smaili, Jean-Paul Haton

A tool to build a treebank for conversational Chinese
Yves Lepage, Nicolas Auclerc, Satoshi Shirai

Parameter reduction in a text-independent speaker verification system
Roland Auckenthaler, Michael Carey, John Mason

Advances on HMM-based text-dependent speaker verification
Yong Gu, Trevor Thomas

Optimisation of GMM in speaker recognition
Robert Stapert, John S. Mason, Roland Auckenthaler

Distance-based Gaussian mixture model for speaker recognition over the telephone
Ran D. Zilca, Yuval Bistritz

Pruning abnormal data for better making a decision in speaker verification
Jun-Hui Liu, Ke Chen

ASR, dialects, and acoustic/phonological distances
Louis ten Bosch

Speaker verification by integrating dynamic and static features using subspace method
Masafumi Nishida, Yasuo Ariki

Improvement of speaker recognition system by individual information weighting
Se-Hyun Kim, Gil-Jin Jang, Yung-Hwan Oh

Speaker verification in noise using temporal constraints
Néstor Becerra Yoma, Tarciano Facco Pegoraro

Speaker identification using discriminative features selection
Bogdan Sabac, Inge Gavat, Zica Valsan

A further investigation on speech features for speaker characterization
Ivan Magrin-Chagnolleau, Guillaume Gravier, Mouhamadou Seck, Olivier Boeffard, R. Blouet, Frédéric Bimbot

Language identification from short segments of speech
Jyotsana Balleda, Hema A Murthy, T. Nagarajan

Generation of utterances based on visual context information
Susanne Kronenberg, Franz Kummert

A spoken dialogue system for conference/workshop services
Mazin Rahim, Roberto Pieraccini, Wieland Eckert, Esther Levin, Giuseppe Di Fabbrizio, Giuseppe Riccardi, Candy Kamm, Shrikanth Narayanan

Developing robust, user-centred multimodal spoken language systems: the MUeSLI project
Gavin Churcher, Peter Wyard

TABOR - a Norwegian spoken dialogue system for bus travel information
Magne H. Johnsen, Torbjørn Svendsen, Tore Amble, Trym Holter, Erik Harborg

Language understanding component for Chinese dialogue system
Yinfei Huang, Fang Zheng, Mingxing Xu, Pengju Yan, Wenhu Wu

Designing a domain independent platform of spoken dialogue system
Kazumi Aoyama, Izumi Hirano, Hideaki Kikuchi, Katsuhiko Shirai

An enhanced BLSTIP dialogue research platform
Qiru Zhou, Antoine Saad, Sherif Abdou

Using machine learning method and subword unit representations for spoken document categorization
Weidong Qu, Katsuhiko Shirai

ASR satisficing: the effects of ASR accuracy on speech retrieval
Litza Stark, Steve Whittaker, Julia Hirschberg

A system for retrieving broadcast news speech documents using voice input keywords and similarity between words
Hiromitsu Nishizaki, Seiichi Nakagawa

Intention extraction and semantic matching for internet FAQ retrieval using spoken language query
Yu-Sheng Lai, Kuen-Lin Lee, Chung-Hsien Wu

A domain-independent model to improve spelling in a web environment
Robert J. van Vark, Jelle K. de Haan, Leon J. M. Rothkrantz

Expanded vector space model based on word space in cross media retrieval of news speech data
Seiichi Takao, Jun Ogata, Yasuo Ariki

Audio stream phrase recognition for a national gallery of the spoken word: "one small step"
John H. L. Hansen, Bowen Zhou, Murat Akbacak, Ruhi Sarikaya, Bryan Pellom

Pronunciation variants description using recognition error modeling with phonetic derivation hypotheses
Hideharu Nakajima, Yoshinori Sagisaka, Hirofumi Yamamoto

Evaluating responsiveness in spoken dialog systems
Wataru Tsukahara, Nigel Ward

Characteristics of spoken language required for objective quality evaluation of echo cancellers
Nobuhiko Kitawaki, Futoshi Asano, Takeshi Yamada

Evaluation of the ATR-matrix speech translation system with a pair comparison method between the system and humans
Fumiaki Sugaya, Toshiyuki Takezawa, Akio Yokoo, Yoshinori Sagisaka, Seiichi Yamamoto

An automatic timing detection method for superimposing closed captions of TV programs
Ichiro Maruyama, Yoshiharu Abe, Terumasa Ehara, Katsuhiko Shirai

Normalized time-frequency speech representation in articulation training systems
Marcel Ogner, Zdravko Kacic

Semantic transcoding: making the handicapped and the aged free from their barriers in obtaining information on the web
Shinichi Torihara, Katashi Nagao

The use of nonlinear energy transformation for Tamil connected-digit speech recognition
Rathinavelu Chengalvarayan

State based sub-band Wiener filters for speech enhancement in car environments
Aimin Chen, Saeed Vaseghi

Total least squares based subband modelling for scalable speech representations with damped sinusoids
Kris Hermus, Werner Verhelst, Patrick Wambacq, Philippe Lemmerling

Speech enhancement: new approaches to soft decision
Joon-Hyuk Chang, Nam Soo Kim





Recognition and Understanding of Spoken Language 3, 4


Subword-dependent speaker clustering for improved speech recognition
Li Jiang, Xuedong Huang

An equivalent-class based MMI learning method for MGCPM
Chunhua Luo, Fang Zheng, Mingxing Xu

Continuous speech recognition using articulatory data
Alan A. Wrench, Korin Richmond

Asynchrony with trained transition probabilities improves performance in multi-band speech recognition
Brian Mak, Yik-Cheung Tam

Discriminative MLPs in HMM-based recognition of speech in cellular telephony
Sunil Sivadas, Pratibha Jain, Hynek Hermansky

Acoustic modeling for spontaneous speech recognition using syllable dependent models
Toshiyuki Hanazawa, Jun Ishii, Yohei Okato, Kunio Nakajima

A robust training strategy against extraneous acoustic variations for spontaneous speech recognition
Hui Jiang, Li Deng

Improved performance and generalization of minimum classification error training for continuous speech recognition
Darryl W. Purnell, Elizabeth C. Botha

Dynamic threshold setting via Bayesian information criterion (BIC) in HMM training
Ying Jia, Yonghong Yan, Baosheng Yuan

Modelling sub-phone insertions and deletions in continuous speech recognition
Thomas Hain, Philip C. Woodland

Improved acoustics modeling for speech recognition using transformation techniques
Carrson C. Fung, Oscar C. Au, Wanggen Wan, Chi H. Yim, Cyan L. Keung

Discriminative training of tied-mixture HMM by deterministic annealing
Liang Gu, Jayanth Nayak, Kenneth Rose

Discriminative training in natural language call routing
Hong-Kwang Jeff Kuo, Chin-Hui Lee

A speech recognition method with a language-independent intermediate phonetic code
Kazuyo Tanaka, Hiroaki Kojima

Confidence measures based on the k-nn probability estimator
Fabrice Lefèvre

On deriving a phoneme model for a new language
Niloy Mukherjee, Nitendra Rajput, L. Venkata Subramaniam, Ashish Verma

Estimation of semantic case of Japanese dialogue by use of distance derived from statistics of dependency
Tomonobu Saito, Kiyoshi Hashimoto

A semantically-based confidence measure for speech recognition
Stephen Cox, Srinandan Dasmahapatra

Support vector machines for automatic data cleanup
Aravind Ganapathiraju, Joseph Picone

Competition-based score analysis for utterance verification in name recognition
Yong Gu, Trevor Thomas

Utterance verification/rejection for speaker-dependent and speaker-independent speech recognition
Yaxin Zhang

Emotion recognition in speech signal: experimental study, development, and application
Valery A. Petrushin

A bi-lingual Mandarin/Taiwanese (Min-nan), large vocabulary, continuous speech recognition system based on the Tong-yong phonetic alphabet (TYPA)
Ren-yuan Lyu, Chi-yu Chen, Yuang-chin Chiang, Min-shung Liang

A data-driven methodology for the production of multilingual conversational systems
Ossama Emam, Jorge Gonzalez, Carsten Günther, Eric Janke, Siegfried Kunzmann, Giulio Maltese, Claire Waast-Richard

Multi-path, context dependent SC-HMM architectures for improved connected word recognition
Tzur Vaich, Arnon Cohen

Robust recognition using multiple utterances
Yoram Meron, Keikichi Hirose

High performance Italian continuous "digit" recognition
Piero Cosi, John-Paul Hosom, Fabio Tesser

The automatic speech recognition engine ESPERE: experiments on telephone speech
Dominique Fohr, Odile Mella, Christophe Antoine

A comparison of distributed and network speech recognition for mobile communication systems
Imre Kiss

An automatic speech recognition system using neural networks and linear dynamic models to recover and model articulatory traces
Joe Frankel, Korin Richmond, Simon King, Paul Taylor

The OGI kids' speech corpus and recognizers
Khaldoun Shobaki, John-Paul Hosom, Ronald A. Cole

Reducing time-synchronous beam search effort using stage based look-ahead and language model rank based pruning
Jian Wu, Fang Zheng

A three-stage solution for flexible vocabulary speech understanding
Grace Chung

Decoding speech in the presence of other sound sources
Jon Barker, Martin Cooke, Daniel P. W. Ellis

Efficient search strategy in large vocabulary continuous speech recognition using prosodic boundary information
Shi-Wook Lee, Keikichi Hirose, Nobuaki Minematsu

Large vocabulary Korean continuous speech recognition using a one-pass algorithm
Ha-Jin Yu, Hoon Kim, Joon-Mo Hong, Min-Seong Kim, Jong-Seok Lee

A tree-trellis n-best decoder for stochastic context-free grammars
Alexander Seward

EWAVES: an efficient decoding algorithm for lexical tree based speech recognition
Patrick Nguyen, Luca Rigazio, Jean-Claude Junqua

Novel two-pass search strategy using time-asynchronous shortest-first second-pass beam search
Atsunori Ogawa, Yoshiaki Noda, Shoichi Matsunaga

Pruning of state-tying tree using Bayesian information criterion with multiple mixtures
Yu-Chung Chan, Manhung Siu, Brian Mak

Improvements of the Philips 2000 Taiwan Mandarin benchmark system
Yuan-Fu Liao, Nick Wang, Max Huang, Hank Huang, Frank Seide

Extending the generation of word graphs for a cross-word m-gram decoder
Christoph Neukirchen, Xavier Aubert, Hans Dolfing

Improvements in search algorithm for large vocabulary continuous speech recognition
Qingwei Zhao, Zhiwei Lin, Baosheng Yuan, Yonghong Yan

New developments in automatic meeting transcription
Hua Yu, Takashi Tomokiyo, Zhirong Wang, Alex Waibel

Effective vector quantization for a highly compact acoustic model for LVCSR
Jielin Pan, Baosheng Yuan, Yonghong Yan

Effective lexical tree search for large vocabulary continuous speech recognition
Hiroki Yamamoto, Toshiaki Fukada, Yasuhiro Komori

Improvements in automatic speech summarization and evaluation methods
Chiori Hori, Sadaoki Furui

Automatic phonetic transcription of spontaneous speech (American English)
Shuangyu Chang, Lokendra Shastri, Steven Greenberg

Speed improvement of the tree-based time asynchronous search
Miroslav Novak, Michael Picheny

Recent improvements in speech recognition performance on large vocabulary conversational speech (voicemail and switchboard)
Jing Huang, B. Kingsbury, L. Mangu, Mukund Padmanabhan, George Saon, Geoffrey Zweig

Speaker normalization training and adaptation for speech recognition
Lei He, Ditang Fang, Wenhu Wu

Lexical and acoustic modeling of non-native speech in LVCSR
Laura Mayfield Tomokiyo

Modeling phone correlation for speaker adaptive speech recognition
Baojie Li, Keikichi Hirose, Nobuaki Minematsu

Very fast adaptation for large vocabulary continuous speech recognition using eigenvoices
Henrik Botterweck

Efficiently using speaker adaptation data
Chengyi Zheng, Yonghong Yan

A combination of speaker normalization and speech rate normalization for automatic speech recognition
Thilo Pfau, Robert Faltlhauser, Günther Ruske

Speech model compensation with direct adaptation of cepstral variance to noisy environment
Tai-Hwei Hwang, Kuo-Hwei Yuo, Hsiao-Chuan Wang

Gaussian similarity analysis and its application in speaker adaptation
Ji Wu, Zuoying Wang

A method for style adaptation to spontaneous speech by using a semi-linear interpolation technique
Nobuyasu Itoh, Masafumi Nishimura, Shinsuke Mori

VODIS - voice-operated driver information systems: a usability study on advanced speech technologies for car environments
Petra Geutner, Luis Arevalo, Joerg Breuninger

Natural language call steering for service applications
Wu Chou, Qiru Zhou, Hong-Kwang Jeff Kuo, Antoine Saad, David Attwater, Peter Durston, Mark Farrell, Frank Scahill

A single-stage top-down probabilistic approach towards understanding spoken and handwritten mathematical formulas
Jörg Hunsinger, Manfred Lang

Low complexity connected digit recognition for mobile applications
Prabhu Raghavan, Sunil K. Gupta

Telephone speech recognition from large lists of Czech words
Jan Nouza

Speech and word detection algorithms for hands-free applications
Duanpei Wu, X. Menendez-Pidal, L. Olorenshaw, R. Chen, M. Tanaka, M. Amador

Large vocabulary continuous speech recognition of read speech over cellular and landline networks
Ashwin Rao, Bob Roth, Venkatesh Nagesha, Don McAllaster, Natalie Liberman, Larry Gillick





Adaptation and Acquisition in Spoken Language Processing 1, 2


Rapid adaptation of n-gram language models using inter-word correlation for speech recognition
Koki Sasaki, Hui Jiang, Keikichi Hirose

Class-based language model adaptation using mixtures of word-class weights
Gareth Moore, Steve Young

A language model adaptation approach based on text classification
Jiasong Sun, Xiaodong Cui, Zuoying Wang, Yang Liu

Automatically incorporating unknown words in JUPITER
Grace Chung

Look-ahead sequential feature vector normalization for noisy speech recognition
Rathinavelu Chengalvarayan

Speaker adaptation in noisy environments based on parameter estimation using uncertain data
Naoto Iwahashi, Akihiko Kawasaki

Speech/noise separation using two microphones and a VQ model of speech signals
Alex Acero, Steven Altschuler, Lani Wu

Using maximum likelihood linear regression for segment clustering and speaker identification
Michiel Bacchiani

Structural maximum a-posteriori linear regression for unsupervised speaker adaptation
Tor André Myrvoll, Olivier Siohan, Chin-Hui Lee, Wu Chou

Transformation-based Bayesian predictive classification for online environmental learning and robust speech recognition
Jen-Tzung Chien, Guo-Hong Liao

Improved MLLR speaker adaptation using confidence measures for conversational speech recognition
Michael Pitz, Frank Wessel, Hermann Ney

Unified acoustic modeling for continuous speech recognition
Rathinavelu Chengalvarayan

A nonlinear unsupervised adaptation technique for speech recognition
Satya Dharanipragada, Mukund Padmanabhan

Using class weighting in inter-class MLLR
Sam-Joo Doh, Richard M. Stern


Acoustics of Spoken Language (Poster)


Burst detection based on measurements of intensity discrimination
John-Paul Hosom, Ronald A. Cole

Using acoustic condition clustering to improve acoustic change detection on broadcast news
Javier Ferreiros López, Daniel P. W. Ellis

Phone transition acoustic modeling: application to speaker independent and spontaneous speech systems
Jon P. Nedel, Rita Singh, Richard M. Stern

The measurement of acoustic similarity and its applications
Liqin Shen, Guokang Fu, Haixin Chai, Yong Qin

Glottal parameters contributing to the perception of loud voices
Sopae Yi, Hyung Soon Kim, One Good Lee

Grapheme based speech recognition for large vocabularies
Christoph Schillo, Gernot A. Fink, Franz Kummert

Automatic subword unit refinement for spontaneous speech recognition via phone splitting
Jon P. Nedel, Rita Singh, Richard M. Stern

Rhythm timing in Japanese English
Takeshi Tarui

A vocal tract area ratio estimation from spectral parameter extracted by STRAIGHT
Mamoru Iwaki

Decision tree based rate of speech modeling for speech recognition
Bhuvana Ramabhadran, Yuqing Gao

Spectral peak tracking and its use in speech recognition
Mukund Padmanabhan

Weighted pairwise scatter to improve linear discriminant analysis
Yongxin Li, Yuqing Gao, Hakan Erdogan

ARTIC: a new Czech text-to-speech system using statistical approach to speech segment database construction
Jindrich Matousek, Josef Psutka

Extended maximum a posterior linear regression (EMAPLR) model adaptation for speech recognition
Wu Chou, Olivier Siohan, Tor André Myrvoll, Chin-Hui Lee

Thai monophthong recognition using continuous density hidden Markov model and LPC cepstral coefficients
Ekkarit Maneenoi, Somchai Jitapunkul, Visarut Ahkuputra, Umavasee Thathong, Boonchai Thampanitchawong, Sudaporn Luksaneeyanawin

Error recovery and sentence verification using statistical partial pattern tree for conversational speech
Chung-Hsien Wu, Yeou-Jiunn Chen, Cher-Yao Yang

Vowel landmark detection
Andrew Wilson Howitt

Rival training: efficient use of data in discriminative training
Carsten Meyer, Georg Rose

Nasal detection module for a knowledge-based speech recognition system
Marilyn Y. Chen

Semi-continuous segmental probability model for speech signals
Jun Liu, Xiaoyan Zhu, Bin Jia

Cross-domain robust acoustic training
Ea-Ee Jan, Jaime Botella Ordinas

A C/V segmentation method for Mandarin speech based on multiscale fractal dimension
Fan Wang, Fang Zheng, Wenhu Wu

An application of SAMPA-C for standard Chinese
Xiaoxia Chen, Aijun Li, Guohua Sun, Wu Hua, Zhigang Yin


Signal Analysis, Processing, and Feature Extraction


Joint speech signal enhancement based on spectral subtraction and SVD filter
Wenkai Lu, Xuegong Zhang, Yanda Li, Shen Liqin, Zhu Weibin

Inverse lattice filtering of speech with adapted non-uniform delays
Sacha Krstulovic, Frédéric Bimbot

Accurate vocal event detection method based on a fixed-point analysis of mapping from time to weighted average group delay
Hideki Kawahara, Yoshinori Atake, Parham Zolfaghari

Filterbank-based feature extraction for speech recognition and its application to voice mail transcription
Jun Huang, Mukund Padmanabhan

A cepstrum-based harmonics-to-noise ratio in voice signals
Peter J. Murphy

A pitch determination algorithm based on subharmonic-to-harmonic ratio
Xuejing Sun

Source separation techniques applied to speech linear prediction
Jordi Solé i Casals, Enric Monte i Moreno, Christian Jutten, Anisse Taleb

Model based voice decomposition method
Masahide Sugiyama

A time-varying complex speech analysis based on IV method
Keiichi Funaki

A sinusoidal model based on frequency-to-instantaneous frequency mapping
Parham Zolfaghari, Hideki Kawahara

Dynamic feature extraction by wavelet analysis
Omar Farooq, Sekharjit Datta

An investigation of variable block length methods for calculation of spectral/temporal features for automatic speech recognition
Montri Karnjanadecha, Stephen A. Zahorian

Glottal excitation modeling using HMM with application to robust analysis of speech signal
Akira Sasou, Kazuyo Tanaka

Automatic segmentation of speech based on hidden Markov models and acoustic features
Laura Docío-Fernández, Carmen García-Mateo

VERBMOBIL dialogues: multifaced analysis
Akira Kurematsu, Youichi Akegami, Susanne Burge, Susanne Jekat, Brigitte Lause, Victoria L. Maclaren, Daniela Oppermann, Tanja Schultz

A computation-efficient parameter adaptation algorithm for the generalized spectral subtraction method
Jin-Jie Zhang, Zhi-Gang Cao, Zheng-Xin Ma

A semantic tagging tool for spoken dialogue corpus
Masahiro Araki, Kiyoshi Ueda, Takuya Nishimoto, Yasuhisa Niimi

The phonetic labeling on read and spontaneous discourse corpora
Aijun Li, Xiaoxia Chen, Guohua Sun, Wu Hua, Zhigang Yin, Yiqing Zu, Fang Zheng, Zhanjiang Song

The quality of multilingual automatic segmentation using German MAUS
Nicole Beringer, Florian Schiel

UWB_S01 corpus - a Czech read-speech corpus
Vlasta Radová, Josef Psutka

Web-based monitoring, logging and reporting tools for multi-service multi-modal systems
Giuseppe Di Fabbrizio, Shrikanth Narayanan

Comparing the recognition performance of CSRs: in search of an adequate metric and statistical significance test
Helmer Strik, Catia Cucchiarini, Judith M. Kessens

Perceptual dimensions of speech sound quality in modern transmission systems
Alexander Raake

