Technical Program
Wednesday, March 24 |
Oral 1: Applications of Speech Technologies for Learning and Education Wednesday, 24 March 2021 Chair: Mireia Farrús |
O1.1 10:00 – 10:10 |
Prosodic feature selection for automatic quality assessment of oral productions in people with Down syndrome (abs) (pdf) |
David Escudero, Valentín Cardeñoso-Payo, Mario Corrales Astorgano and César González-Ferreras | |
O1.2 10:10 – 10:20 |
Performance Comparison of Specific and General-Purpose ASR Systems for Pronunciation Assessment of Japanese Learners of Spanish (abs) (pdf) |
Cristian Tejedor-García, Valentín Cardeñoso-Payo and David Escudero-Mancebo | |
O1.3 10:20 – 10:30 |
An ASR-based Reading Tutor for Practicing Reading Skills in the First Grade: Improving Performance through Threshold Adjustment (abs) (pdf) |
Yu Bai, Ferdy Hubers, Catia Cucchiarini and Helmer Strik | |
O1.4 10:30 – 10:40 |
Impact of vowel reduction in L2 Chinese learners of Portuguese within and across word boundaries (abs) (pdf) |
Catarina Realinho, Rita Gonçalves, Helena Moniz and Isabel Trancoso | |
O1.5 10:40 – 10:50 |
Nativeness Assessment for Crowdsourced Speech Collections (abs) (pdf) |
Diogo Botelheiro, Alberto Abad, João Freitas and Rui Correia | |
Keynote 1 Wednesday, 24 March 2021 | |
KN1 11:00 – 12:00 |
Characterizing and Assessing the Oral Reading Fluency of Young Readers (abs) |
Gérard Bailly and Erika Godde | |
Oral 2: Speech Processing and Acoustic Event Detection Wednesday, 24 March 2021 Chair: Luis Javier Rodríguez Fuentes |
O2.1 12:30 – 12:40 |
Convolutional Recurrent Neural Networks for Speech Activity Detection in Naturalistic Audio from Apollo Missions (abs) (pdf) |
Pablo Gimeno, Dayana Ribas, Alfonso Ortega, Antonio Miguel and Eduardo Lleida | |
O2.2 12:40 – 12:50 |
Dual-channel eKF-RTF framework for speech enhancement with DNN-based speech presence estimation (abs) (pdf) |
Juan Manuel Martín-Doñas, Antonio M. Peinado, Iván López-Espejo and Angel Gomez | |
O2.3 12:50 – 13:00 |
An analysis of Sound Event Detection under acoustic degradation using multi-resolution systems (abs) (pdf) |
Diego de Benito-Gorrón, Daniel Ramos and Doroteo T. Toledano | |
O2.4 13:00 – 13:10 |
Speech Enhancement for Wake-Up-Word detection in Voice Assistants (abs) (pdf) |
David Bonet, Guillermo Cámbara, Fernando López, Pablo Gómez, Carlos Segura, Jordi Luque and Mireia Farrús | |
O2.5 13:10 – 13:20 |
An approach to intent detection and classification based on attentive recurrent neural networks (abs) (pdf) |
Fernando Fernández-Martínez, David Griol, Zoraida Callejas and Cristina Luna-Jiménez | |
O2.6 13:20 – 13:30 |
Contrasting the Emotions identified in Spanish TV debates and in Human-Machine Interactions (abs) (pdf) |
Mikel de Velasco, Raquel Justo, Leila Ben Letaifa and M. Inés Torres | |
O2.7 13:30 – 13:40 |
A proposal for emotion recognition using speech features, transfer learning and convolutional neural networks (abs) (pdf) |
Roberto Móstoles, David Griol, Zoraida Callejas and Fernando Fernández-Martínez | |
O2.8 13:40 – 13:50 |
Using Audio Events to Extend a Multi-modal Public Speaking Database with Reinterpreted Emotional Annotations (abs) (pdf) |
Esther Rituerto-González, Clara Luis-Mingueza and Carmen Peláez-Moreno | |
Albayzín Evaluation Challenges Wednesday, 24 March 2021 Chair: Eduardo Lleida and Alfonso Ortega |
Ch.0 15:00 – 15:20 |
Presentation of the Albayzín Evaluation Challenges |
Eduardo Lleida, Alfonso Ortega and Javier Tejedor | |
Ch.1 15:20 – 15:30 |
Cenatav Voice Group System for Albayzin 2020 Search on Speech Evaluation |
Jose M. Ramirez, Ana R. Montalvo and Jose R. Calvo | |
Ch.2 15:30 – 15:40 |
Query-by-Example Spoken Term Detection using Attentive Pooling Networks at ALBAYZIN 2020 Evaluation: The AUDIAS-UAM System (abs) (pdf) |
Juan Ignacio Álvarez-Trejos and Doroteo T. Toledano | |
Ch.3 15:40 – 15:50 |
GTH-UPM System for Albayzin Multimodal Diarization Challenge 2020 (abs) (pdf) |
Cristina Luna-Jiménez, Ricardo Kleinlein, Fernando Fernández-Martínez, José Manuel Pardo-Muñoz and José Manuel Moya-Fernández | |
Ch.4 15:50 – 16:00 |
ViVoLAB Multimodal Diarization System for RTVE 2020 Challenge (abs) (pdf) |
Victoria Mingote, Ignacio Viñals, Pablo Gimeno, Antonio Miguel, Alfonso Ortega and Eduardo Lleida | |
Ch.5 16:00 – 16:10 |
The GTM-UVIGO System for Audiovisual Diarization 2020 (abs) (pdf) |
Manuel Porta-Lorenzo, José Luis Alba-Castro and Laura Docío-Fernández | |
Ch.6 16:10 – 16:20 |
The Biometric Vox System for the Albayzin-RTVE 2020 Speaker Diarization and Identity Assignment Challenge (abs) (pdf) |
Roberto Font and Teresa Grau | |
Ch.7 16:20 – 16:30 |
The CLIR-CLSP System for the IberSPEECH-RTVE 2020 Speaker Diarization and Identity Assignment Challenge (abs) (pdf) |
Carlos Rodrigo Castillo-Sanchez and Leibny Paola Garcia-Perera | |
Ch.8 16:30 – 16:40 |
Diarization and Identity Attribution Compatibility in the Albayzin 2020 Challenge (abs) (pdf) |
Ignacio Viñals, Pablo Gimeno, Alfonso Ortega, Antonio Miguel and Eduardo Lleida | |
16:40 – 17:00 | Break |
Ch.9 17:00 – 17:10 |
The Biometric Vox System for the Albayzin-RTVE 2020 Speech-to-Text Challenge (abs) (pdf) |
Roberto Font and Teresa Grau | |
Ch.10 17:10 – 17:20 |
The Vicomtech Speech Transcription Systems for the Albayzín-RTVE 2020 Speech to Text Transcription Challenge (abs) (pdf) |
Aitor Álvarez, Haritz Arzelus, Iván G. Torre and Ander González-Docasal | |
Ch.11 17:20 – 17:30 |
End to End AUDIAS-UAM system for Albayzin 2020 Speech to Text Challenge |
Beltrán Labrador, Diego de Benito-Gorrón and Doroteo T. Toledano | |
Ch.12 17:30 – 17:40 |
Sigma-UPM ASR Systems for the IberSpeech-RTVE 2020 Speech-to-Text Transcription Challenge (abs) (pdf) |
Juan M. Perero-Codosero, Fernando M. Espinoza-Cuadros and Luis A. Hernández-Gómez | |
Ch.13 17:40 – 17:50 |
BCN2BRNO: ASR System Fusion for Albayzin 2020 Speech to Text Challenge (abs) (pdf) |
Martin Kocour, Guillermo Cámbara, Jordi Luque, David Bonet, Mireia Farrús, Martin Karafiát, Karel Veselý and Jan Černocký | |
Ch.14 17:50 – 18:00 |
MLLP-VRAIN Spanish ASR Systems for the Albayzin-RTVE 2020 Speech-To-Text Challenge (abs) (pdf) |
Javier Jorge, Adrià Giménez, Pau Baquero-Arnal, Javier Iranzo-Sánchez, Alejandro Pérez, Gonçal V. Garcés Díaz-Munío, Joan Albert Silvestre-Cerdà, Jorge Civera, Albert Sanchis and Alfons Juan | |
Research and Development Projects Wednesday, 24 March 2021 Chair: Francesc Alías |
Pr.1 18:00 – 18:10 |
Incorporation of an automatic module for the prediction of the quality of oral communication of people with Down syndrome in an educational video game (abs) (pdf) |
David Escudero, Valentín Cardeñoso-Payo, Mario Corrales Astorgano, César González-Ferreras, Valle Flores Lucas, Lourdes Aguilar, Yolanda Martín-de-San-Pablo and Alfonso Rodríguez-de-Rojas | |
Pr.2 18:10 – 18:20 |
CIRUSS Platform: Surgery Patient Empowerment by Stress and Anxiety Monitoring (abs) (pdf) |
Sergio Figueras, Alejandro García-Caballero, Carmen Garcia Mateo, Laura Docio-Fernandez, Edward L. Campbell, Baltasar G. Perez-Schofield, Leandro Rodríguez-Liñares and Arturo J. Méndez | |
Pr.3 18:20 – 18:30 |
Voice Restoration with Silent Speech Interfaces (ReSSInt) (abs) (pdf) |
Inma Hernaez, Jose Andrés González-López, Eva Navas, Jose Luis Pérez Córdoba, Ibon Saratxaga, Gonzalo Olivares, Jon Sánchez de la Fuente, Alberto Galdón, Víctor García Romillo, Míriam González-Atienza, Tanja Schultz, Phil Green, Michael Wand, Ricard Marxer and Lorenz Diener | |
Pr.4 18:30 – 18:40 |
The Vox Senes project: a study of segmental changes and rhythm variations on European Portuguese aging voice (abs) (pdf) |
Catarina Oliveira, Ana Rita Valente, Luciana Albuquerque, Fábio Barros, Paula Martins, Samuel Silva and António Teixeira | |
Pr.5 18:40 – 18:50 |
Hispabot-Covid19: the official Spanish conversational system about Covid-19 (abs) (pdf) |
David Griol, David Pérez Fernández and Zoraida Callejas | |
Pr.6 18:50 – 19:00 |
Project MEMNON: Extending Speech Production Studies to Silent Speech, Dynamic Sounds and Audiovisual Speech Synthesis (abs) (pdf) |
Samuel Silva, António Teixeira, Nuno Almeida, Diogo Silva, David Ferreira and Conceição Cunha | |
Pr.7 19:00 – 19:10 |
Towards conversational technology to promote, monitor and protect mental health (abs) (pdf) |
Zoraida Callejas, David Griol, Kawtar Benghazi, Manuel Noguera, María Inés Torres, Raquel Justo, Anna Esposito, Gennaro Cordasco, Raymond Bond, Maurice Mulvenna, Edel Ennis, Siobhan O’Neill, Huiru Zheng, Matthias Kraus, Nicolas Wagner, Wolfgang Minker, Gavin McConvey, Matthias Hemmje, Michael Fuchs, Neil Glackin and Gérard Chollet | |
Pr.8 19:10 – 19:20 |
GENIOVOX Project: Computational generation of expressive voice (abs) (pdf) |
Oriol Guasch, Francesc Alías, Marc Arnela, Joan Claudi Socoró, Marc Freixes and Arnau Pont | |
Ph.D. Thesis Wednesday, 24 March 2021 Chair: Carlos D. Martínez Hinarejos |
PhD.1 19:30 – 19:40 |
Adverse Drug Reaction extraction on Electronic Health Records written in Spanish: A PhD Thesis overview (abs) (pdf) |
Sara Santiso | |
PhD.2 19:40 – 19:50 |
Design and Evaluation of Mobile Computer-Assisted Pronunciation Training Tools for Second Language Learning: a Ph.D. Thesis Overview (abs) (pdf) |
Cristian Tejedor-García, Valentín Cardeñoso-Payo and David Escudero-Mancebo | |
PhD.3 19:50 – 20:00 |
New tools for the differential evaluation of Parkinson’s disease using voice and speech processing (abs) (pdf) |
Laureano Moro-Velazquez, Jorge Gomez-Garcia, Najim Dehak and Juan Ignacio Godino-Llorente | |
PhD.4 20:00 – 20:10 |
Prosody training of people with Down syndrome using an educational video game (abs) (pdf) |
Mario Corrales-Astorgano | |
PhD.5 20:10 – 20:20 |
Self-supervised Deep Learning Approaches to Speaker Recognition: A Ph.D. Thesis Overview (abs) (pdf) |
Umair Khan and Javier Hernando | |
Thursday, March 25 |
Oral 3: ASR and NLP Techniques Thursday, 25 March 2021 Chair: David Griol |
O3.1 10:00 – 10:10 |
A study of data augmentation for increased ASR robustness against packet losses (abs) (pdf) |
María Pilar Fernández-Gallego and Doroteo T. Toledano | |
O3.2 10:10 – 10:20 |
TRIBUS: An end-to-end automatic speech recognition system for European Portuguese (abs) (pdf) |
Carlos Carvalho and Alberto Abad | |
O3.3 10:20 – 10:30 |
mintzai-ST: Corpus and Baselines for Basque-Spanish Speech Translation (abs) (pdf) |
Thierry Etchegoyhen, Haritz Arzelus, Harritxu Gete Ugarte, Aitor Alvarez, Ander González-Docasal and Edson Benites Fernandez | |
O3.4 10:30 – 10:40 |
Confidence Measures for Interactive Neural Machine Translation (abs) (pdf) |
Angel Navarro and Francisco Casacuberta | |
O3.5 10:40 – 10:50 |
Sentence Embeddings and Sentence Similarity for Portuguese FAQs (abs) (pdf) |
Nuno Carriço and Paulo Quaresma | |
O3.6 10:50 – 11:00 |
Domain Adaptation in Dialogue Systems using Transfer and Meta-Learning (abs) (pdf) |
Rui Ribeiro, Alberto Abad and José Lopes | |
Keynote 2 Thursday, 25 March 2021 | |
KN2 11:00 – 12:00 |
Diverse Conversational Spoken Language Generation (abs) |
Antonio Bonafonte | |
Oral 4: Speech Synthesis and Multimodal Processing Thursday, 25 March 2021 Chair: Helena Moniz |
O4.1 12:30 – 12:40 |
Automatic Speaker Adaptation Assessment Based on Objective Measures for Voice Banking Donors (abs) (pdf) |
Agustin Alonso, Victor García, Inma Hernaez, Eva Navas and Jon Sanchez | |
O4.2 12:40 – 12:50 |
Data-driven analysis of nasal vowels dynamics and coordination: Results for bilabial contexts (abs) (pdf) |
Conceição Cunha, Nuno Almeida, Jens Frahm, Samuel Silva and António Teixeira | |
O4.3 12:50 – 13:00 |
Analysis of Visual Features for Continuous Lipreading in Spanish (abs) (pdf) |
David Gimeno-Gómez and Carlos-D. Martínez-Hinarejos | |
O4.4 13:00 – 13:10 |
Implementation of neural network based synthesizers for Spanish and Basque (abs) (pdf) |
Victor Garcia, Inma Hernaez and Eva Navas | |
O4.5 13:10 – 13:20 |
Multi-view Temporal Alignment for Non-parallel Articulatory-to-Acoustic Speech Synthesis (abs) (pdf) |
Jose Andres Gonzalez Lopez, Miriam González Atienza, Alejandro Gómez Alanis, José Luis Pérez Córdoba and Phil D. Green | |
O4.6 13:20 – 13:30 |
Generation of Synthetic Sign Language Sentences (abs) (pdf) |
Aitana Villaplana and Carlos David Martinez Hinarejos | |
O4.7 13:30 – 13:40 |
Contribution of vocal tract and glottal source spectral cues in the generation of happy and aggressive [a] vowels (abs) (pdf) |
Marc Freixes, Francesc Alías and Joan Claudi Socoró | |
O4.8 13:40 – 13:50 |
The age effects on EP vowel production: an ultrasound pilot study (abs) (pdf) |
Luciana Albuquerque, Ana Rita Valente, Fábio Barros, António Teixeira, Samuel Silva, Paula Martins and Catarina Oliveira | |
Oral 5: Speaker Characterization and Diarization Thursday, 25 March 2021 Chair: Alberto Abad |
O5.1 15:00 – 15:10 |
Exploring Transformer-based Language Recognition using Phonotactic Information (abs) (pdf) |
David Romero, Luis Fernando D’Haro and Christian Salamea | |
O5.2 15:10 – 15:20 |
Adversarial Transformation of Spoofing Attacks for Voice Biometrics (abs) (pdf) |
Alejandro Gomez-Alanis, Jose A. Gonzalez and Antonio M. Peinado | |
O5.3 15:20 – 15:30 |
Active correction for speaker diarization with human in the loop (abs) (pdf) |
Yevhenii Prokopalo, Meysam Shamsi, Loic Barrault, Sylvain Meignier and Anthony Larcher | |
O5.4 15:30 – 15:40 |
An Automatic System for Dementia Detection using Acoustic and Linguistic Features (abs) (pdf) |
Miriam Gonzalez-Atienza, Antonio M. Peinado and Jose A. Gonzalez-Lopez | |
O5.5 15:40 – 15:50 |
Alzheimer’s Dementia Detection from Audio and Language Modalities in Spontaneous Speech (abs) (pdf) |
Edward L. Campbell, Laura Docio-Fernandez, Javier Jiménez-Raboso and Carmen Garcia-Mateo