Odyssey 2020: Tokyo, Japan
- Kong-Aik Lee, Takafumi Koshinaka, Koichi Shinoda:
Odyssey 2020: The Speaker and Language Recognition Workshop, 1-5 November 2020, Tokyo, Japan. ISCA 2020
Keynote: Sadaoki Furui
- Sadaoki Furui:
Modeling of Perceptual Speaker Embedding and Its Application to Speech and Speaker Recognition.
Speaker Recognition 1
- Daniel Garcia-Romero, Gregory Sell, Alan McCree:
MagNetO: X-vector Magnitude Estimation Network plus Offset for Improved Speaker Recognition. 1-8
- Shaoshi Ling, Julian Salazar, Yuzong Liu, Katrin Kirchhoff:
BERTphone: Phonetically-aware Encoder Representations for Utterance-level Speaker and Language Recognition. 9-16
- Yingke Zhu, Brian Mak:
Orthogonality Regularizations for End-to-End Speaker Verification. 17-23
- Anna Silnova, Niko Brummer, Johan Rohdin, Themos Stafylakis, Lukás Burget:
Probabilistic Embeddings for Speaker Diarization. 24-31
Speaker and Language Recognition
- Rashmi Kethireddy, Sudarsana Reddy Kadiri, Santosh Kesiraju, Suryakanth V. Gangashetty:
Zero-Time Windowing Cepstral Coefficients for Dialect Classification. 32-38
- Raphaël Duroselle, Denis Jouvet, Irina Illina:
Unsupervised Regularization of the Embedding Extractor for Robust Language Identification. 39-46
- Peng Shen, Xugang Lu, Komei Sugiura, Sheng Li, Hisashi Kawai:
Compensation on x-vector for Short Utterance Spoken Language Identification. 47-52
- Po-Chin Wang, Chia-Ping Chen, Chung-Li Lu, Bo-Cheng Chan, Shan-Wen Hsiao:
Improving Embedding-based Neural-Network Speaker Recognition. 53-59
- Min Hyun Han, Woo Hyun Kang, Sung Hwan Mun, Nam Soo Kim:
Information Preservation Pooling for Speaker Embedding. 60-66
- Ville Vestman, Kong Aik Lee, Tomi Kinnunen:
Neural i-vectors. 67-74
- Mohammad MohammadAmini, Driss Matrouf, Paul-Gauthier Noé:
Denoising x-vectors for Robust Speaker Recognition. 75-80
- Pierre-Michel Bousquet, Mickael Rouvier:
Adaptation Strategy and Clustering from Scratch for New Domains of Speaker Recognition. 81-87
- Mitchell McLaren, Md. Hafizur Rahman, Diego Castán, Mahesh Kumar Nandwana, Aaron Lawson:
Adaptive Mean Normalization for Unsupervised Adaptation of Speaker Embeddings. 88-94
Diarization
- Andreas Stolcke:
Improving Diarization Robustness using Diversification, Randomization and the DOVER Algorithm. 95-101
- Qingjian Lin, Weicheng Cai, Lin Yang, Junjie Wang, Jun Zhang, Ming Li:
DIHARD II is Still Hard: Experimental Results and Discussions from the DKU-LENOVO Team. 102-109
- Liping Chen, Kong-Aik Lee, Lei He, Frank K. Soong:
On Early-stop Clustering for Speaker Diarization. 110-116
- Nikolaos Flemotomos, Panayiotis G. Georgiou, Shrikanth Narayanan:
Linguistically Aided Speaker Diarization Using Speaker Role Information. 117-124
- Qingjian Lin, Tingle Li, Lin Yang, Junjie Wang, Ming Li:
Optimal Mapping Loss: A Faster Loss for End-to-End Speaker Diarization. 125-131
Spoofing and Countermeasure 1
- Tianxiang Chen, Avrosh Kumar, Parav Nagarsheth, Ganesh Sivaraman, Elie Khoury:
Generalization of Audio Deepfake Detection. 132-137
- Qiongqiong Wang, Kong Aik Lee, Takafumi Koshinaka:
Using Multi-Resolution Feature Maps with Convolutional Neural Networks for Anti-Spoofing in ASV. 138-142
- Madhu R. Kamble, Hemant A. Patil:
Novel Variable Length Teager Energy Profiles for Replay Spoof Detection. 143-150
- Anssi Kanervisto, Ville Hautamäki, Tomi Kinnunen, Junichi Yamagishi:
An Initial Investigation on Optimizing Tandem Speaker Verification and Countermeasure Systems Using Reinforcement Learning. 151-158
- Xiaohai Tian, Rohan Kumar Das, Haizhou Li:
Black-box Attacks on Automatic Speaker Verification using Feedback-controlled Voice Conversion. 159-164
Keynote: Mirco Ravanelli
- Mirco Ravanelli:
Towards Unsupervised Learning of Speech Representations.
Special Session: VOiCES 2020
- Mahesh Kumar Nandwana, Michael Lomnitz, Colleen Richey, Mitchell McLaren, Diego Castán, Luciana Ferrer, Aaron Lawson:
The VOiCES from a Distance Challenge 2019: Analysis of Speaker Verification Results and Remaining Challenges. 165-170
- Jee-Weon Jung, Ju-ho Kim, Hye-Jin Shim, Seung-bin Kim, Ha-Jin Yu:
Selective Deep Speaker Embedding Enhancement for Speaker Verification. 171-178
- Aleksei Gusev, Vladimir Volokhov, Tseren Andzhukaev, Sergey Novoselov, Galina Lavrentyeva, Marina Volkova, Alice Gazizullina, Andrey Shulipa, Artem Gorlanov, Anastasia Avdeeva, Artem Ivanov, Alexander Kozlov, Timur Pekhovsky, Yuri Matveev:
Deep Speaker Embeddings for Far-Field Speaker Recognition on Short Utterances. 179-186
- Ladislav Mosner, Oldrich Plchot, Johan Rohdin, Jan Cernocký:
Utilizing VOiCES Dataset for Multichannel Speaker Verification with Beamforming. 187-193
- Raghuveer Peri, Haoqi Li, Krishna Somandepalli, Arindam Jati, Shrikanth Narayanan:
An Empirical Analysis of Information Encoded in Disentangled Neural Speaker Representations. 194-201
- Shreyas Ramoji, Prashant Krishnan V, Sriram Ganapathy:
NPLDA: A Deep Neural PLDA Model for Speaker Verification. 202-209
- Weiwei Lin, Man-Wai Mak, Lu Yi:
Learning Mixture Representation for Deep Speaker Embedding Using Attention. 210-214
Voice Conversion and Synthesis
- Dongsuk Yook, Seong-Gyun Leem, Keonnyeong Lee, In-Chul Yoo:
Many-to-Many Voice Conversion Using Cycle-Consistent Variational Autoencoder with Multiple Decoders. 215-221
- Jennifer Williams, Joanna Rownicka, Pilar Oplustil, Simon King:
Comparison of Speech Representations for Automatic Quality Estimation in Multi-Speaker Text-to-Speech Synthesis. 222-229
- Kun Zhou, Berrak Sisman, Haizhou Li:
Transforming Spectrum and Prosody for Emotional Voice Conversion with Non-Parallel Training Data. 230-237
- Berrak Sisman, Haizhou Li:
Generative Adversarial Networks for Singing Voice Conversion with and without Parallel Data. 238-244
- Rui Liu, Berrak Sisman, Feilong Bao, Guanglai Gao, Haizhou Li:
WaveTTS: Tacotron-based TTS with Joint Time-Frequency Domain Loss. 245-251
- Xiaoxue Gao, Xiaohai Tian, Yi Zhou, Rohan Kumar Das, Haizhou Li:
Personalized Singing Voice Generation Using WaveRNN. 252-258
Evaluation and Benchmarking
- Seyed Omid Sadjadi, Craig S. Greenberg, Elliot Singer, Douglas A. Reynolds, Lisa P. Mason, Jaime Hernandez-Cordero:
The 2019 NIST Audio-Visual Speaker Recognition Evaluation. 259-265
- Seyed Omid Sadjadi, Craig S. Greenberg, Elliot Singer, Douglas A. Reynolds, Lisa P. Mason, Jaime Hernandez-Cordero:
The 2019 NIST Speaker Recognition Evaluation CTS Challenge. 266-272
- Jesús Antonio Villalba López, Daniel Garcia-Romero, Nanxin Chen, Gregory Sell, Jonas Borgstrom, Alan McCree, Leibny Paola García-Perera, Saurabh Kataria, Phani Sankar Nidadavolu, Pedro Torres-Carrasquillo, Najim Dehak:
Advances in Speaker Recognition for Telephone and Audio-Visual Data: the JHU-MIT Submission for NIST SRE19. 273-280
- Shreyas Ramoji, Prashant Krishnan V, Bhargavram Mysore, Prachi Singh, Sriram Ganapathy:
LEAP System for SRE 2019 CTS Challenge - Improvements and Error Analysis. 281-288
- Jahangir Alam, Gilles Boulianne, Lukás Burget, Mohamed Dahmane, Mireia Díez Sánchez, Alicia Lozano-Diez, Ondrej Glembek, Pierre-Luc St-Charles, Marc Lalonde, Pavel Matejka, Petr Mizera, João Monteiro, Ladislav Mosner, Cedric Noiseux, Ondrej Novotný, Oldrich Plchot, Johan Rohdin, Anna Silnova, Josef Slavícek, Themos Stafylakis, Shuai Wang, Hossein Zeinali:
Analysis of ABC Submission to NIST SRE 2019 CMN and VAST Challenge. 289-295
Keynote: Luciana Ferrer
- Luciana Ferrer:
The importance of Calibration in Speaker Verification.
Spoofing and Countermeasure 2
- João Monteiro, Jahangir Alam, Tiago H. Falk:
A Multi-condition Training Strategy for Countermeasures Against Spoofing Attacks to Speaker Recognizers. 296-303
- Madhu R. Kamble, Aditya Krishna Sai Pulikonda, Maddala Venkata Siva Krishna, Hemant A. Patil:
Analysis of Teager Energy Profiles for Spoof Speech Detection. 304-311
- Itshak Lapidot, Jean-François Bonastre:
Effects of Waveform PMF on Anti-spoofing Detection for Replay Data - ASVspoof 2019. 312-318
- Sung-Hyun Yoon, Min-Sung Koh, Ha-Jin Yu:
Phase Spectrum of Time-flipped Speech Signals for Robust Spoofing Detection. 319-325
- Bence Mark Halpern, Finnian Kelly, Rob van Son, Anil Alexander:
Residual Networks for Resisting Noise: Analysis of an Embeddings-based Spoofing Countermeasure. 326-332
- Hemlata Tak, Jose Patino, Andreas Nautsch, Nicholas W. D. Evans, Massimiliano Todisco:
An Explainability Study of the Constant Q Cepstral Coefficient Spoofing Countermeasure for Automatic Speaker Verification. 333-340
- Bhusan Chettri, Tomi Kinnunen, Emmanouil Benetos:
Subband Modeling for Spoofing Detection in Automatic Speaker Verification. 341-348
Speaker Recognition 2
- Joon Son Chung, Jaesung Huh, Seongkyu Mun:
Delving into VoxCeleb: Environment Invariant Speaker Recognition. 349-356
- Chau Luu, Peter Bell, Steve Renals:
Dropping Classes for Deep Speaker Representation Learning. 357-364
- Xu Li, Jinghua Zhong, Jianwei Yu, Shoukang Hu, Xixin Wu, Xunying Liu, Helen Meng:
Bayesian x-vector: Bayesian Neural Network based x-vector System for Speaker Verification. 365-371
- Luciana Ferrer, Mitchell McLaren:
A Speaker Verification Backend for Improved Calibration Performance across Varying Conditions. 372-379
- Zhongxin Bai, Xiao-Lei Zhang, Jingdong Chen:
Partial AUC Metric Learning Based Speaker Verification Back-End. 380-384
Speech Application
- Sheng Li, Xugang Lu, Raj Dabre, Peng Shen, Hisashi Kawai:
Joint Training End-to-End Speech Recognition Systems with Speaker Attributes. 385-390
- Jilong Wu, Yiteng Huang, Hyun-Jin Park, Niranjan Subrahmanya, Patrick Violette:
Small Footprint Multi-channel Keyword Spotting. 391-395
- Rasa Lileikyte, Dwight Irvin, John H. L. Hansen:
Assessing Child Communication Engagement via Speech Recognition in Naturalistic Active Learning Spaces. 396-401
- David van der Vloed, Finnian Kelly, Anil Alexander:
Exploring the Effects of Device Variability on Forensic Speaker Comparison Using VOCALISE and NFI-FRIDA, A Forensically Realistic Database. 402-407
- Kevin Wilkinghoff:
On Open-Set Speaker Identification with I-Vectors. 408-414
- Leibny Paola García-Perera, Jesús Villalba, Hervé Bredin, Jun Du, Diego Castán, Alejandrina Cristià, Latané Bullock, Ling Guo, Koji Okabe, Phani Sankar Nidadavolu, Saurabh Kataria, Sizhu Chen, Léo Galmant, Marvin Lavechin, Lei Sun, Marie-Philippe Gill, Bar Ben-Yair, Sajjad Abdoli, Xin Wang, Wassim Bouaziz, Hadrien Titeux, Emmanuel Dupoux, Kong Aik Lee, Najim Dehak:
Speaker Detection in the Wild: Lessons Learned from JSALT 2019. 415-422
- Chien-Lin Huang:
Speaker Characterization Using TDNN, TDNN-LSTM, TDNN-LSTM-Attention based Speaker Embeddings for NIST SRE 2019. 423-427
- Tianyu Liang, Yi Liu, Can Xu, Xianwei Zhang, Liang He:
Combined Vector Based on Factorized Time-delay Neural Network for Text-Independent Speaker Recognition. 428-432
Speaker Recognition 3
- Shaojin Ding, Quan Wang, Shuo-Yiin Chang, Li Wan, Ignacio López-Moreno:
Personal VAD: Speaker-Conditioned Voice Activity Detection. 433-439
- Ganesh Sivaraman, Amruta Vidwans, Elie Khoury:
Speech Bandwidth Expansion For Speaker Recognition On Telephony Audio. 440-445
- Haruna Miyamoto, Sayaka Shiota, Hitoshi Kiya:
Application of Bandwidth Extension with No Learning to Data Augmentation for Speaker Verification. 446-450
- Yanpei Shi, Qiang Huang, Thomas Hain:
Robust Speaker Recognition Using Speech Enhancement And Attention Model. 451-458
- Saurabh Kataria, Phani Sankar Nidadavolu, Jesús Villalba, Najim Dehak:
Analysis of Deep Feature Loss Based Enhancement for Speaker Verification. 459-466