IberSPEECH 2021

24-25 March 2021, Valladolid, Spain

Chairs: Valentín Cardeñoso-Payo, David Escudero-Mancebo and César González-Ferreras

DOI: 10.21437/IberSPEECH.2021

Applications of Speech Technologies for Learning and Education


Prosodic feature selection for automatic quality assessment of oral productions in people with Down syndrome
David Escudero, Valentín Cardeñoso-Payo, Mario Corrales Astorgano, César González-Ferreras

Performance Comparison of Specific and General-Purpose ASR Systems for Pronunciation Assessment of Japanese Learners of Spanish
Cristian Tejedor-García, Valentín Cardeñoso-Payo, David Escudero-Mancebo

An ASR-based Reading Tutor for Practicing Reading Skills in the First Grade: Improving Performance through Threshold Adjustment
Yu Bai, Ferdy Hubers, Catia Cucchiarini, Helmer Strik

Impact of vowel reduction in L2 Chinese learners of Portuguese within and across word boundaries
Catarina Realinho, Rita Gonçalves, Helena Moniz, Isabel Trancoso

Nativeness Assessment for Crowdsourced Speech Collections
Diogo Botelheiro, Alberto Abad, João Freitas, Rui Correia



Speech Processing and Acoustic Event Detection


Convolutional Recurrent Neural Networks for Speech Activity Detection in Naturalistic Audio from Apollo Missions
Pablo Gimeno, Dayana Ribas, Alfonso Ortega, Antonio Miguel, Eduardo Lleida

Dual-channel eKF-RTF framework for speech enhancement with DNN-based speech presence estimation
Juan Manuel Martín-Doñas, Antonio M. Peinado, Iván López-Espejo, Angel Gomez

An analysis of Sound Event Detection under acoustic degradation using multi-resolution systems
Diego de Benito-Gorrón, Daniel Ramos, Doroteo T. Toledano

Speech Enhancement for Wake-Up-Word detection in Voice Assistants
David Bonet, Guillermo Cámbara, Fernando López, Pablo Gómez, Carlos Segura, Jordi Luque, Mireia Farrús

An approach to intent detection and classification based on attentive recurrent neural networks
Fernando Fernández-Martínez, David Griol, Zoraida Callejas, Cristina Luna-Jiménez

Contrasting the Emotions identified in Spanish TV debates and in Human-Machine Interactions
Mikel de Velasco, Raquel Justo, Leila Ben Letaifa, M. Inés Torres

A proposal for emotion recognition using speech features, transfer learning and convolutional neural networks
Roberto Móstoles, David Griol, Zoraida Callejas, Fernando Fernández-Martínez

Using Audio Events to Extend a Multi-modal Public Speaking Database with Reinterpreted Emotional Annotations
Esther Rituerto-González, Clara Luis-Mingueza, Carmen Peláez-Moreno


Albayzín Evaluation Challenges


Query-by-Example Spoken Term Detection using Attentive Pooling Networks at ALBAYZIN 2020 Evaluation: The AUDIAS-UAM System
Juan Ignacio Álvarez-Trejos, Doroteo T. Toledano

GTH-UPM System for Albayzin Multimodal Diarization Challenge 2020
Cristina Luna-Jiménez, Ricardo Kleinlein, Fernando Fernández-Martínez, José Manuel Pardo-Muñoz, José Manuel Moya-Fernández

ViVoLAB Multimodal Diarization System for RTVE 2020 Challenge
Victoria Mingote, Ignacio Viñals, Pablo Gimeno, Antonio Miguel, Alfonso Ortega, Eduardo Lleida

The GTM-UVIGO System for Audiovisual Diarization 2020
Manuel Porta-Lorenzo, José Luis Alba-Castro, Laura Docío-Fernández

The Biometric Vox System for the Albayzin-RTVE 2020 Speaker Diarization and Identity Assignment Challenge
Roberto Font, Teresa Grau

The CLIR-CLSP System for the IberSPEECH-RTVE 2020 Speaker Diarization and Identity Assignment Challenge
Carlos Rodrigo Castillo-Sanchez, Leibny Paola Garcia-Perera

Diarization and Identity Attribution Compatibility in the Albayzin 2020 Challenge
Ignacio Viñals, Pablo Gimeno, Alfonso Ortega, Antonio Miguel, Eduardo Lleida

The Biometric Vox System for the Albayzin-RTVE 2020 Speech-to-Text Challenge
Roberto Font, Teresa Grau

The Vicomtech Speech Transcription Systems for the Albayzín-RTVE 2020 Speech to Text Transcription Challenge
Aitor Álvarez, Haritz Arzelus, Iván G. Torre, Ander González-Docasal

Sigma-UPM ASR Systems for the IberSpeech-RTVE 2020 Speech-to-Text Transcription Challenge
Juan M. Perero-Codosero, Fernando M. Espinoza-Cuadros, Luis A. Hernández-Gómez

BCN2BRNO: ASR System Fusion for Albayzin 2020 Speech to Text Challenge
Martin Kocour, Guillermo Cámbara, Jordi Luque, David Bonet, Mireia Farrús, Martin Karafiát, Karel Veselý, Jan Černocký

MLLP-VRAIN Spanish ASR Systems for the Albayzin-RTVE 2020 Speech-To-Text Challenge
Javier Jorge, Adrià Giménez, Pau Baquero-Arnal, Javier Iranzo-Sánchez, Alejandro Pérez, Gonçal V. Garcés Díaz-Munío, Joan Albert Silvestre-Cerdà, Jorge Civera, Albert Sanchis, Alfons Juan


Research and Development Projects


Incorporation of an automatic module for the prediction of the quality of oral communication of people with Down syndrome in an educational video game
David Escudero, Valentín Cardeñoso-Payo, Mario Corrales Astorgano, César González-Ferreras, Valle Flores Lucas, Lourdes Aguilar, Yolanda Martín-de-San-Pablo, Alfonso Rodríguez-de-Rojas

CIRUSS Platform: Surgery Patient Empowerment by Stress and Anxiety Monitoring
Sergio Figueras, Alejandro García-Caballero, Carmen Garcia Mateo, Laura Docio-Fernandez, Edward L. Campbell, Baltasar G. Perez-Schofield, Leandro Rodríguez-Liñares, Arturo J. Méndez

Voice Restoration with Silent Speech Interfaces (ReSSInt)
Inma Hernaez, Jose Andrés González-López, Eva Navas, Jose Luis Pérez Córdoba, Ibon Saratxaga, Gonzalo Olivares, Jon Sánchez de la Fuente, Alberto Galdón, Víctor García Romillo, Míriam González-Atienza, Tanja Schultz, Phil Green, Michael Wand, Ricard Marxer, Lorenz Diener

The Vox Senes project: a study of segmental changes and rhythm variations on European Portuguese aging voice
Catarina Oliveira, Ana Rita Valente, Luciana Albuquerque, Fábio Barros, Paula Martins, Samuel Silva, António Teixeira

Hispabot-Covid19: the official Spanish conversational system about Covid-19
David Griol, David Pérez Fernández, Zoraida Callejas

Project MEMNON: Extending Speech Production Studies to Silent Speech, Dynamic Sounds and Audiovisual Speech Synthesis
Samuel Silva, António Teixeira, Nuno Almeida, Diogo Cunha, David Ferreira, Conceição Cunha

Towards conversational technology to promote, monitor and protect mental health
Zoraida Callejas, David Griol, Kawtar Benghazi, Manuel Noguera, María Inés Torres, Raquel Justo, Anna Esposito, Gennaro Cordasco, Raymond Bond, Maurice Mulvenna, Edel Ennis, Siobhan O'Neill, Huiru Zheng, Matthias Kraus, Nicolas Wagner, Wolfgang Minker, Gavin McConvey, Matthias Hemmje, Michael Fuchs, Neil Glackin, Gérard Chollet

GENIOVOX Project: Computational generation of expressive voice
Oriol Guasch, Francesc Alías, Marc Arnela, Joan Claudi Socoró, Marc Freixes, Arnau Pont



ASR and NLP Techniques


A study of data augmentation for increased ASR robustness against packet losses
María Pilar Fernández-Gallego, Doroteo T. Toledano

TRIBUS: An end-to-end automatic speech recognition system for European Portuguese
Carlos Carvalho, Alberto Abad

mintzai-ST: Corpus and Baselines for Basque-Spanish Speech Translation
Thierry Etchegoyhen, Haritz Arzelus, Harritxu Gete Ugarte, Aitor Alvarez, Ander González-Docasal, Edson Benites Fernandez

Confidence Measures for Interactive Neural Machine Translation
Angel Navarro, Francisco Casacuberta

Sentence Embeddings and Sentence Similarity for Portuguese FAQs
Nuno Carriço, Paulo Quaresma

Domain Adaptation in Dialogue Systems using Transfer and Meta-Learning
Rui Ribeiro, Alberto Abad, José Lopes



Speech Synthesis and Multimodal Processing


Automatic Speaker Adaptation Assessment Based on Objective Measures for Voice Banking Donors
Agustin Alonso, Victor García, Inma Hernaez, Eva Navas, Jon Sanchez

Data-driven analysis of nasal vowels dynamics and coordination: Results for bilabial contexts
Conceição Cunha, Nuno Almeida, Jens Frahm, Samuel Silva, António Teixeira

Analysis of Visual Features for Continuous Lipreading in Spanish
David Gimeno-Gómez, Carlos-D. Martínez-Hinarejos

Implementation of neural network based synthesizers for Spanish and Basque
Victor Garcia, Inma Hernaez, Eva Navas

Multi-view Temporal Alignment for Non-parallel Articulatory-to-Acoustic Speech Synthesis
Jose Andres Gonzalez Lopez, Miriam González Atienza, Alejandro Gómez Alanis, José Luis Pérez Córdoba, Phil D. Green

Generation of Synthetic Sign Language Sentences
Aitana Villaplana, Carlos David Martinez Hinarejos

Contribution of vocal tract and glottal source spectral cues in the generation of happy and aggressive [a] vowels
Marc Freixes, Francesc Alías, Joan Claudi Socoró

The age effects on EP vowel production: an ultrasound pilot study
Luciana Albuquerque, Ana Rita Valente, Fábio Barros, António Teixeira, Samuel Silva, Paula Martins, Catarina Oliveira


Speaker Characterization and Diarization


Exploring Transformer-based Language Recognition using Phonotactic Information
David Romero, Luis Fernando D'Haro, Christian Salamea

Adversarial Transformation of Spoofing Attacks for Voice Biometrics
Alejandro Gomez-Alanis, Jose A. Gonzalez, Antonio M. Peinado

Active correction for speaker diarization with human in the loop
Yevhenii Prokopalo, Meysam Shamsi, Loic Barrault, Sylvain Meignier, Anthony Larcher

An Automatic System for Dementia Detection using Acoustic and Linguistic Features
Miriam Gonzalez-Atienza, Antonio M. Peinado, Jose A. Gonzalez-Lopez

Alzheimer's Dementia Detection from Audio and Language Modalities in Spontaneous Speech
Edward L. Campbell, Laura Docio-Fernandez, Javier Jiménez-Raboso, Carmen Garcia-Mateo

