Yanmin Qian
2020 – today
- 2024
- [j38]Shuai Wang, Zhengyang Chen, Bing Han, Hongji Wang, Chengdong Liang, Binbin Zhang, Xu Xiang, Wen Ding, Johan Rohdin, Anna Silnova, Yanmin Qian, Haizhou Li:
Advancing speaker embedding learning: Wespeaker toolkit for research and production. Speech Commun. 162: 103104 (2024) - [j37]Bing Han, Zhengyang Chen, Yanmin Qian:
Self-Supervised Learning With Cluster-Aware-DINO for High-Performance Robust Speaker Verification. IEEE ACM Trans. Audio Speech Lang. Process. 32: 529-541 (2024) - [j36]Wei Wang, Yanmin Qian:
Universal Cross-Lingual Data Generation for Low Resource ASR. IEEE ACM Trans. Audio Speech Lang. Process. 32: 973-983 (2024) - [j35]Zhengyang Chen, Bing Han, Shuai Wang, Yanmin Qian:
Attention-Based Encoder-Decoder End-to-End Neural Diarization With Embedding Enhancer. IEEE ACM Trans. Audio Speech Lang. Process. 32: 1636-1649 (2024) - [j34]Xun Gong, Yu Wu, Jinyu Li, Shujie Liu, Rui Zhao, Xie Chen, Yanmin Qian:
Advanced Long-Content Speech Recognition With Factorized Neural Transducer. IEEE ACM Trans. Audio Speech Lang. Process. 32: 1803-1815 (2024) - [j33]Jiahong Li, Chenda Li, Yifei Wu, Yanmin Qian:
Unified Cross-Modal Attention: Robust Audio-Visual Speech Recognition and Beyond. IEEE ACM Trans. Audio Speech Lang. Process. 32: 1941-1953 (2024) - [j32]Bei Liu, Haoyu Wang, Yanmin Qian:
Towards Lightweight Speaker Verification via Adaptive Neural Network Quantization. IEEE ACM Trans. Audio Speech Lang. Process. 32: 3771-3784 (2024) - [c184]Bing Han, Zhiqiang Lv, Anbai Jiang, Wen Huang, Zhengyang Chen, Yufeng Deng, Jiawei Ding, Cheng Lu, Wei-Qiang Zhang, Pingyi Fan, Jia Liu, Yanmin Qian:
Exploring Large Scale Pre-Trained Models for Robust Machine Anomalous Sound Detection. ICASSP 2024: 1326-1330 - [c183]Wangyou Zhang, Jee-weon Jung, Yanmin Qian:
Improving Design of Input Condition Invariant Speech Enhancement. ICASSP 2024: 10696-10700 - [c182]Shuai Wang, Qibing Bai, Qi Liu, Jianwei Yu, Zhengyang Chen, Bing Han, Yanmin Qian, Haizhou Li:
Leveraging in-the-wild Data for Effective Self-supervised Pretraining in Speaker Recognition. ICASSP 2024: 10901-10905 - [c181]Yidi Jiang, Zhengyang Chen, Ruijie Tao, Liqun Deng, Yanmin Qian, Haizhou Li:
Prompt-Driven Target Speech Diarization. ICASSP 2024: 11086-11090 - [c180]Hang Shao, Bei Liu, Yanmin Qian:
One-Shot Sensitivity-Aware Mixed Sparsity Pruning for Large Language Models. ICASSP 2024: 11296-11300 - [c179]Wen Huang, Bing Han, Shuai Wang, Zhengyang Chen, Yanmin Qian:
Robust Cross-Domain Speaker Verification with Multi-Level Domain Adapters. ICASSP 2024: 11781-11785 - [c178]Linfeng Yu, Wangyou Zhang, Chenpeng Du, Leying Zhang, Zheng Liang, Yanmin Qian:
Generation-Based Target Speech Extraction with Speech Discretization and Vocoder. ICASSP 2024: 12612-12616 - [c177]Wen Huang, Anbai Jiang, Bing Han, Xinhu Zheng, Yihong Qiu, Wenxi Chen, Yuzhe Liang, Pingyi Fan, Wei-Qiang Zhang, Cheng Lu, Xie Chen, Jia Liu, Yanmin Qian:
Semi-Supervised Acoustic Scene Classification with Test-Time Adaptation. ICME Workshops 2024: 1-5 - [c176]Yuzhe Liang, Wenxi Chen, Anbai Jiang, Yihong Qiu, Xinhu Zheng, Wen Huang, Bing Han, Yanmin Qian, Pingyi Fan, Wei-Qiang Zhang, L. Cheng, Jia Liu, Xie Chen:
Improving Acoustic Scene Classification via Self-Supervised and Semi-Supervised Learning with Efficient Audio Transformer. ICME Workshops 2024: 1-6 - [c175]Bing Han, Junyu Dai, Weituo Hao, Xinyan He, Dong Guo, Jitong Chen, Yuxuan Wang, Yanmin Qian, Xuchen Song:
InstructME: An Instruction Guided Music Edit Framework with Latent Diffusion Models. IJCAI 2024: 5835-5843 - [i79]Wangyou Zhang, Jee-weon Jung, Shinji Watanabe, Yanmin Qian:
Improving Design of Input Condition Invariant Speech Enhancement. CoRR abs/2401.14271 (2024) - [i78]Xun Gong, Yu Wu, Jinyu Li, Shujie Liu, Rui Zhao, Xie Chen, Yanmin Qian:
Advanced Long-Content Speech Recognition With Factorized Neural Transducer. CoRR abs/2403.13423 (2024) - [i77]Leying Zhang, Yao Qian, Long Zhou, Shujie Liu, Dongmei Wang, Xiaofei Wang, Midia Yousefi, Yanmin Qian, Jinyu Li, Lei He, Sheng Zhao, Michael Zeng:
CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations. CoRR abs/2404.06690 (2024) - [i76]Bo Chen, Shoukang Hu, Qi Chen, Chenpeng Du, Ran Yi, Yanmin Qian, Xie Chen:
GSTalker: Real-time Audio-Driven Talking Face Generation via Deformable Gaussian Splatting. CoRR abs/2404.19040 (2024) - [i75]Haoyu Wang, Bei Liu, Hang Shao, Bo Xiao, Ke Zeng, Guanglu Wan, Yanmin Qian:
CLAQ: Pushing the Limits of Low-Bit Post-Training Quantization for LLMs. CoRR abs/2405.17233 (2024) - [i74]Chenyang Le, Yao Qian, Dongmei Wang, Long Zhou, Shujie Liu, Xiaofei Wang, Midia Yousefi, Yanmin Qian, Jinyu Li, Sheng Zhao, Michael Zeng:
TransVIP: Speech to Speech Translation System with Voice and Isochrony Preservation. CoRR abs/2405.17809 (2024) - [i73]Wangyou Zhang, Kohei Saijo, Jee-weon Jung, Chenda Li, Shinji Watanabe, Yanmin Qian:
Beyond Performance Plateaus: A Comprehensive Study on Scalability in Speech Enhancement. CoRR abs/2406.04269 (2024) - [i72]Wangyou Zhang, Robin Scheibler, Kohei Saijo, Samuele Cornell, Chenda Li, Zhaoheng Ni, Anurag Kumar, Jan Pirklbauer, Marvin Sach, Shinji Watanabe, Tim Fingscheidt, Yanmin Qian:
URGENT Challenge: Universality, Robustness, and Generalizability For Speech Enhancement. CoRR abs/2406.04660 (2024) - [i71]Bei Liu, Haoyu Wang, Yanmin Qian:
Towards Lightweight Speaker Verification via Adaptive Neural Network Quantization. CoRR abs/2406.05359 (2024) - [i70]Yidi Jiang, Ruijie Tao, Zhengyang Chen, Yanmin Qian, Haizhou Li:
Target Speech Diarization with Multimodal Prompts. CoRR abs/2406.07198 (2024) - [i69]Zhengyang Chen, Xuechen Liu, Erica Cooper, Junichi Yamagishi, Yanmin Qian:
Generating Speakers by Prompting Listener Impressions for Pre-trained Multi-Speaker Text-to-Speech Systems. CoRR abs/2406.08812 (2024) - [i68]Anbai Jiang, Bing Han, Zhiqiang Lv, Yufeng Deng, Wei-Qiang Zhang, Xie Chen, Yanmin Qian, Jia Liu, Pingyi Fan:
AnoPatch: Towards Better Consistency in Machine Anomalous Sound Detection. CoRR abs/2406.11364 (2024) - [i67]Chenda Li, Samuele Cornell, Shinji Watanabe, Yanmin Qian:
Diffusion-based Generative Modeling with Discriminative Guidance for Streamable Speech Enhancement. CoRR abs/2406.13471 (2024) - [i66]Shuai Wang, Zhengyang Chen, Kong Aik Lee, Yanmin Qian, Haizhou Li:
Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning. CoRR abs/2407.15188 (2024) - [i65]Zhengyang Chen, Bing Han, Shuai Wang, Yidi Jiang, Yanmin Qian:
Flow-TSVAD: Target-Speaker Voice Activity Detection via Latent Flow Matching. CoRR abs/2409.04859 (2024) - [i64]Zhengyang Chen, Shuai Wang, Mingyang Zhang, Xuechen Liu, Junichi Yamagishi, Yanmin Qian:
Disentangling the Prosody and Semantic Information with Pre-trained Model for In-Context Learning based Zero-Shot Voice Conversion. CoRR abs/2409.05004 (2024) - [i63]Xinhu Zheng, Anbai Jiang, Bing Han, Yanmin Qian, Pingyi Fan, Jia Liu, Wei-Qiang Zhang:
Improving Anomalous Sound Detection via Low-Rank Adaptation Fine-Tuning of Pre-Trained Audio Models. CoRR abs/2409.07016 (2024) - [i62]Shuai Wang, Ke Zhang, Shaoxiong Lin, Junjie Li, Xuefei Wang, Meng Ge, Jianwei Yu, Yanmin Qian, Haizhou Li:
WeSep: A Scalable and Flexible Toolkit Towards Generalizable Target Speaker Extraction. CoRR abs/2409.15799 (2024)
- 2023
- [j31]Yen-Ju Lu, Xuankai Chang, Chenda Li, Wangyou Zhang, Samuele Cornell, Zhaoheng Ni, Yoshiki Masuyama, Brian Yan, Robin Scheibler, Zhong-Qiu Wang, Yu Tsao, Yanmin Qian, Shinji Watanabe:
Software Design and User Interface of ESPnet-SE++: Speech Enhancement for Robust Speech Processing. J. Open Source Softw. 8(91): 5403 (2023) - [j30]Bei Liu, Zhengyang Chen, Yanmin Qian:
Depth-First Neural Architecture With Attentive Feature Fusion for Efficient Speaker Verification. IEEE ACM Trans. Audio Speech Lang. Process. 31: 1825-1838 (2023) - [c174]Chang Chen, Xun Gong, Yanmin Qian:
Efficient Text-Only Domain Adaptation For CTC-Based ASR. ASRU 2023: 1-7 - [c173]Yuhao Liang, Mohan Shi, Fan Yu, Yangze Li, Shiliang Zhang, Zhihao Du, Qian Chen, Lei Xie, Yanmin Qian, Jian Wu, Zhuo Chen, Kong Aik Lee, Zhijie Yan, Hui Bu:
The Second Multi-Channel Multi-Party Meeting Transcription Challenge (M2MeT 2.0): A Benchmark for Speaker-Attributed ASR. ASRU 2023: 1-8 - [c172]Shaoxiong Lin, Chao Zhang, Yanmin Qian:
Improving Speech Enhancement Using Audio Tagging Knowledge From Pre-Trained Representations and Multi-Task Learning. ASRU 2023: 1-7 - [c171]Dongning Yang, Wei Wang, Yanmin Qian:
FAT-HuBERT: Front-End Adaptive Training of Hidden-Unit BERT For Distortion-Invariant Robust Speech Recognition. ASRU 2023: 1-8 - [c170]Wangyou Zhang, Kohei Saijo, Zhong-Qiu Wang, Shinji Watanabe, Yanmin Qian:
Toward Universal Speech Enhancement For Diverse Input Conditions. ASRU 2023: 1-6 - [c169]Wangyou Zhang, Lei Yang, Yanmin Qian:
Exploring Time-Frequency Domain Target Speaker Extraction For Causal and Non-Causal Processing. ASRU 2023: 1-6 - [c168]Xun Gong, Yu Wu, Jinyu Li, Shujie Liu, Rui Zhao, Xie Chen, Yanmin Qian:
LongFNT: Long-Form Speech Recognition with Factorized Neural Transducer. ICASSP 2023: 1-5 - [c167]Xun Gong, Wei Wang, Hang Shao, Xie Chen, Yanmin Qian:
Factorized AED: Factorized Attention-Based Encoder-Decoder for Text-Only Domain Adaptive ASR. ICASSP 2023: 1-5 - [c166]Bing Han, Zhengyang Chen, Yanmin Qian:
Exploring Binary Classification Loss for Speaker Verification. ICASSP 2023: 1-5 - [c165]Bing Han, Wen Huang, Zhengyang Chen, Yanmin Qian:
Improving Dino-Based Self-Supervised Speaker Verification with Progressive Cluster-Aware Training. ICASSP Workshops 2023: 1-5 - [c164]Jiahong Li, Chenda Li, Yifei Wu, Yanmin Qian:
Robust Audio-Visual ASR with Unified Cross-Modal Attention. ICASSP 2023: 1-5 - [c163]Chenda Li, Yao Qian, Zhuo Chen, Dongmei Wang, Takuya Yoshioka, Shujie Liu, Yanmin Qian, Michael Zeng:
Target Sound Extraction with Variable Cross-Modality Clues. ICASSP 2023: 1-5 - [c162]Chenda Li, Yifei Wu, Yanmin Qian:
Predictive Skim: Contrastive Predictive Coding for Low-Latency Online Speech Separation. ICASSP 2023: 1-5 - [c161]Tao Liu, Zhengyang Chen, Yanmin Qian, Kai Yu:
Multi-Speaker End-to-End Multi-Modal Speaker Diarization System for the MISP 2022 Challenge. ICASSP 2023: 1-2 - [c160]Hang Shao, Tian Tan, Wei Wang, Xun Gong, Yanmin Qian:
Joint Discriminator and Transfer Based Fast Domain Adaptation For End-To-End Speech Recognition. ICASSP 2023: 1-5 - [c159]Haoyu Wang, Bei Liu, Yifei Wu, Zhengyang Chen, Yanmin Qian:
Lowbit Neural Network Quantization for Speaker Verification. ICASSP Workshops 2023: 1-5 - [c158]Hongji Wang, Chengdong Liang, Shuai Wang, Zhengyang Chen, Binbin Zhang, Xu Xiang, Yanlei Deng, Yanmin Qian:
Wespeaker: A Research and Production Oriented Speaker Embedding Learning Toolkit. ICASSP 2023: 1-5 - [c157]Wei Wang, Yanmin Qian:
HuBERT-AGG: Aggregated Representation Distillation of Hidden-Unit Bert for Robust Speech Recognition. ICASSP 2023: 1-5 - [c156]Yifei Wu, Chenda Li, Yanmin Qian:
Light-Weight Visualvoice: Neural Network Quantization On Audio Visual Speech Separation. ICASSP Workshops 2023: 1-5 - [c155]Haibin Yu, Yuxuan Hu, Yao Qian, Ma Jin, Linquan Liu, Shujie Liu, Yu Shi, Yanmin Qian, Edward Lin, Michael Zeng:
Code-Switching Text Generation and Injection in Mandarin-English ASR. ICASSP 2023: 1-5 - [c154]Leying Zhang, Zhengyang Chen, Yanmin Qian:
Adaptive Large Margin Fine-Tuning For Robust Speaker Verification. ICASSP 2023: 1-5 - [c153]Chenda Li, Yao Qian, Zhuo Chen, Naoyuki Kanda, Dongmei Wang, Takuya Yoshioka, Yanmin Qian, Michael Zeng:
Adapting Multi-Lingual ASR Models for Handling Multiple Talkers. INTERSPEECH 2023: 1314-1318 - [c152]Bei Liu, Haoyu Wang, Yanmin Qian:
Extremely Low Bit Quantization for Mobile Speaker Verification Systems Under 1MB Memory. INTERSPEECH 2023: 1973-1977 - [c151]Zhilong Zhang, Wei Wang, Yanmin Qian:
Fast and Efficient Multilingual Self-Supervised Pre-training for Low-Resource Speech Recognition. INTERSPEECH 2023: 2248-2252 - [c150]Wei Wang, Yanmin Qian:
UniSplice: Universal Cross-Lingual Data Splicing for Low-Resource ASR. INTERSPEECH 2023: 2253-2257 - [c149]Bei Liu, Yanmin Qian:
Reversible Neural Networks for Memory-Efficient Speaker Verification. INTERSPEECH 2023: 3127-3131 - [c148]Bei Liu, Yanmin Qian:
ECAPA++: Fine-grained Deep Embedding Learning for TDNN Based Speaker Verification. INTERSPEECH 2023: 3132-3136 - [c147]Zhengyang Chen, Bing Han, Xu Xiang, Houjun Huang, Bei Liu, Yanmin Qian:
Build a SRE Challenge System: Lessons from VoxSRC 2022 and CNSRC 2022. INTERSPEECH 2023: 3202-3206 - [c146]Wei Wang, Xun Gong, Hang Shao, Dongning Yang, Yanmin Qian:
Text Only Domain Adaptation with Phoneme Guided Data Splicing for End-to-End Speech Recognition. INTERSPEECH 2023: 3347-3351 - [c145]Linfeng Yu, Wangyou Zhang, Chenda Li, Yanmin Qian:
Overlap Aware Continuous Speech Separation without Permutation Invariant Training. INTERSPEECH 2023: 3512-3516 - [c144]Wangyou Zhang, Yanmin Qian:
Weakly-Supervised Speech Pre-training: A Case Study on Target Speech Recognition. INTERSPEECH 2023: 3517-3521 - [c143]Zhengyang Chen, Bing Han, Shuai Wang, Yanmin Qian:
Attention-based Encoder-Decoder Network for End-to-End Neural Speaker Diarization with Target Speaker Attractor. INTERSPEECH 2023: 3552-3556 - [c142]Haoyu Wang, Bei Liu, Yifei Wu, Yanmin Qian:
Adaptive Neural Network Quantization For Lightweight Speaker Verification. INTERSPEECH 2023: 5331-5335 - [c141]Chenyang Le, Yao Qian, Long Zhou, Shujie Liu, Yanmin Qian, Michael Zeng, Xuedong Huang:
ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation. NeurIPS 2023 - [c140]Yoshiki Masuyama, Xuankai Chang, Wangyou Zhang, Samuele Cornell, Zhong-Qiu Wang, Nobutaka Ono, Yanmin Qian, Shinji Watanabe:
Exploring the Integration of Speech Separation and Recognition with Self-Supervised Learning Representation. WASPAA 2023: 1-5 - [d1]Yen-Ju Lu, Xuankai Chang, Chenda Li, Wangyou Zhang, Samuele Cornell, Zhaoheng Ni, Yoshiki Masuyama, Brian Yan, Robin Scheibler, Zhong-Qiu Wang, Yu Tsao, Yanmin Qian, Shinji Watanabe:
Software Design and User Interface of ESPnet-SE++: Speech Enhancement for Robust Speech Processing (espnet-v.202310). Zenodo, 2023 - [i61]Chenda Li, Yao Qian, Zhuo Chen, Dongmei Wang, Takuya Yoshioka, Shujie Liu, Yanmin Qian, Michael Zeng:
Target Sound Extraction with Variable Cross-modality Clues. CoRR abs/2303.08372 (2023) - [i60]Haibin Yu, Yuxuan Hu, Yao Qian, Ma Jin, Linquan Liu, Shujie Liu, Yu Shi, Yanmin Qian, Edward Lin, Michael Zeng:
Code-Switching Text Generation and Injection in Mandarin-English ASR. CoRR abs/2303.10949 (2023) - [i59]Bing Han, Zhengyang Chen, Yanmin Qian:
Self-Supervised Learning with Cluster-Aware-DINO for High-Performance Robust Speaker Verification. CoRR abs/2304.05754 (2023) - [i58]Zhengyang Chen, Bing Han, Shuai Wang, Yanmin Qian:
Attention-based Encoder-Decoder Network for End-to-End Neural Speaker Diarization with Target Speaker Attractor. CoRR abs/2305.10704 (2023) - [i57]Hang Shao, Wei Wang, Bei Liu, Xun Gong, Haoyu Wang, Yanmin Qian:
Whisper-KDQ: A Lightweight Whisper via Guided Knowledge Distillation and Quantization for Efficient ASR. CoRR abs/2305.10788 (2023) - [i56]Wangyou Zhang, Yanmin Qian:
Weakly-Supervised Speech Pre-training: A Case Study on Target Speech Recognition. CoRR abs/2305.16286 (2023) - [i55]Chenda Li, Yao Qian, Zhuo Chen, Naoyuki Kanda, Dongmei Wang, Takuya Yoshioka, Yanmin Qian, Michael Zeng:
Adapting Multi-Lingual ASR Models for Handling Multiple Talkers. CoRR abs/2305.18747 (2023) - [i54]Bing Han, Zhengyang Chen, Yanmin Qian:
Exploring Binary Classification Loss For Speaker Verification. CoRR abs/2307.08205 (2023) - [i53]Yoshiki Masuyama, Xuankai Chang, Wangyou Zhang, Samuele Cornell, Zhong-Qiu Wang, Nobutaka Ono, Yanmin Qian, Shinji Watanabe:
Exploring the Integration of Speech Separation and Recognition with Self-Supervised Learning Representation. CoRR abs/2307.12231 (2023) - [i52]Bing Han, Junyu Dai, Xuchen Song, Weituo Hao, Xinyan He, Dong Guo, Jitong Chen, Yuxuan Wang, Yanmin Qian:
InstructME: An Instruction Guided Music Edit And Remix Framework with Latent Diffusion Models. CoRR abs/2308.14360 (2023) - [i51]Zhengyang Chen, Bing Han, Shuai Wang, Yanmin Qian:
Attention-based Encoder-Decoder End-to-End Neural Diarization with Embedding Enhancer. CoRR abs/2309.06672 (2023) - [i50]Junyi Ao, Mehmet Sinan Yildirim, Meng Ge, Shuai Wang, Ruijie Tao, Yanmin Qian, Liqun Deng, Longshuai Xiao, Haizhou Li:
USED: Universal Speaker Extraction and Diarization. CoRR abs/2309.10674 (2023) - [i49]Shuai Wang, Qibing Bai, Qi Liu, Jianwei Yu, Zhengyang Chen, Bing Han, Yanmin Qian, Haizhou Li:
Leveraging In-the-Wild Data for Effective Self-Supervised Pretraining in Speaker Recognition. CoRR abs/2309.11730 (2023) - [i48]Yuhao Liang, Mohan Shi, Fan Yu, Yangze Li, Shiliang Zhang, Zhihao Du, Qian Chen, Lei Xie, Yanmin Qian, Jian Wu, Zhuo Chen, Kong Aik Lee, Zhijie Yan, Hui Bu:
The Second Multi-Channel Multi-Party Meeting Transcription Challenge (M2MeT 2.0): A Benchmark for Speaker-Attributed ASR. CoRR abs/2309.13573 (2023) - [i47]Leying Zhang, Yao Qian, Linfeng Yu, Heming Wang, Xinkai Wang, Hemin Yang, Long Zhou, Shujie Liu, Yanmin Qian, Michael Zeng:
Diffusion Conditional Expectation Model for Efficient and Robust Target Speech Extraction. CoRR abs/2309.13874 (2023) - [i46]Wangyou Zhang, Kohei Saijo, Zhong-Qiu Wang, Shinji Watanabe, Yanmin Qian:
Toward Universal Speech Enhancement for Diverse Input Conditions. CoRR abs/2309.17384 (2023) - [i45]Hang Shao, Bei Liu, Yanmin Qian:
One-Shot Sensitivity-Aware Mixed Sparsity Pruning for Large Language Models. CoRR abs/2310.09499 (2023) - [i44]Dongning Yang, Wei Wang, Yanmin Qian:
FAT-HuBERT: Front-end Adaptive Training of Hidden-unit BERT for Distortion-Invariant Robust Speech Recognition. CoRR abs/2311.17790 (2023)
- 2022
- [j29]Sanyuan Chen, Chengyi Wang, Zhengyang Chen, Yu Wu, Shujie Liu, Zhuo Chen, Jinyu Li, Naoyuki Kanda, Takuya Yoshioka, Xiong Xiao, Jian Wu, Long Zhou, Shuo Ren, Yanmin Qian, Yao Qian, Jian Wu, Michael Zeng, Xiangzhan Yu, Furu Wei:
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing. IEEE J. Sel. Top. Signal Process. 16(6): 1505-1518 (2022) - [j28]Yanmin Qian, Zhikai Zhou:
Optimizing Data Usage for Low-Resource Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 30: 394-403 (2022) - [j27]Chenda Li, Zhuo Chen, Yanmin Qian:
Dual-Path Modeling With Memory Embedding Model for Continuous Speech Separation. IEEE ACM Trans. Audio Speech Lang. Process. 30: 1508-1520 (2022) - [j26]Yanmin Qian, Xun Gong, Houjun Huang:
Layer-Wise Fast Adaptation for End-to-End Multi-Accent Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 30: 2842-2853 (2022) - [j25]Wangyou Zhang, Xuankai Chang, Christoph Böddeker, Tomohiro Nakatani, Shinji Watanabe, Yanmin Qian:
End-to-End Dereverberation, Beamforming, and Speech Recognition in a Cocktail Party. IEEE ACM Trans. Audio Speech Lang. Process. 30: 3173-3188 (2022) - [c139]Yifei Wu, Chenda Li, Jinfeng Bai, Zhongqin Wu, Yanmin Qian:
Time-Domain Audio-Visual Speech Separation on Low Quality Videos. ICASSP 2022: 256-260 - [c138]Chenda Li, Lei Yang, Weiqin Wang, Yanmin Qian:
Skim: Skipping Memory Lstm for Low-Latency Real-Time Continuous Speech Separation. ICASSP 2022: 681-685 - [c137]Zhengyang Chen, Sanyuan Chen, Yu Wu, Yao Qian, Chengyi Wang, Shujie Liu, Yanmin Qian, Michael Zeng:
Large-Scale Self-Supervised Speech Representation Learning for Automatic Speaker Verification. ICASSP 2022: 6147-6151 - [c136]Bing Han, Zhengyang Chen, Yanmin Qian:
Local Information Modeling with Self-Attention for Speaker Verification. ICASSP 2022: 6727-6731 - [c135]Zhikai Zhou, Tian Tan, Yanmin Qian:
Punctuation Prediction for Streaming On-Device Speech Recognition. ICASSP 2022: 7277-7281 - [c134]Bing Han, Zhengyang Chen, Bei Liu, Yanmin Qian:
MLP-SVNET: A Multi-Layer Perceptrons Based Network for Speaker Verification. ICASSP 2022: 7522-7526 - [c133]Bei Liu, Haoyu Wang, Zhengyang Chen, Shuai Wang, Yanmin Qian:
Self-Knowledge Distillation via Feature Enhancement for Speaker Verification. ICASSP 2022: 7542-7546 - [c132]Wei Wang, Shuo Ren, Yao Qian, Shujie Liu, Yu Shi, Yanmin Qian, Michael Zeng:
Optimizing Alignment of Speech and Language Latent Spaces for End-To-End Speech Recognition and Understanding. ICASSP 2022: 7802-7806 - [c131]Zhikai Zhou, Wei Wang, Wangyou Zhang, Yanmin Qian:
Exploring Effective Data Utilization for Low-Resource Speech Recognition. ICASSP 2022: 8192-8196 - [c130]Fan Yu, Shiliang Zhang, Pengcheng Guo, Yihui Fu, Zhihao Du, Siqi Zheng, Weilong Huang, Lei Xie, Zheng-Hua Tan, DeLiang Wang, Yanmin Qian, Kong Aik Lee, Zhijie Yan, Bin Ma, Xin Xu, Hui Bu:
Summary on the ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Grand Challenge. ICASSP 2022: 9156-9160 - [c129]Wei Wang, Xun Gong, Yifei Wu, Zhikai Zhou, Chenda Li, Wangyou Zhang, Bing Han, Yanmin Qian:
The Sjtu System For Multimodal Information Based Speech Processing Challenge 2021. ICASSP 2022: 9261-9265 - [c128]Bei Liu, Zhengyang Chen, Yanmin Qian:
Attentive Feature Fusion for Robust Speaker Verification. INTERSPEECH 2022: 286-290 - [c127]Bei Liu, Zhengyang Chen, Yanmin Qian:
Dual Path Embedding Learning for Speaker Verification with Triplet Attention. INTERSPEECH 2022: 291-295 - [c126]Bei Liu, Zhengyang Chen, Shuai Wang, Haoyu Wang, Bing Han, Yanmin Qian:
DF-ResNet: Boosting Speaker Verification Performance with Depth-First Design. INTERSPEECH 2022: 296-300 - [c125]Leying Zhang, Zhengyang Chen, Yanmin Qian:
Enroll-Aware Attentive Statistics Pooling for Target Speaker Verification. INTERSPEECH 2022: 311-315 - [c124]Tao Liu, Shuai Fan, Xu Xiang, Hongbo Song, Shaoxiong Lin, Jiaqi Sun, Tianyuan Han, Siyuan Chen, Binwei Yao, Sen Liu, Yifei Wu, Yanmin Qian, Kai Yu:
MSDWild: Multi-modal Speaker Diarization Dataset in the Wild. INTERSPEECH 2022: 1476-1480 - [c123]Xun Gong, Zhikai Zhou, Yanmin Qian:
Knowledge Transfer and Distillation from Autoregressive to Non-Autoregressive Speech Recognition. INTERSPEECH 2022: 2618-2622 - [c122]Bing Han, Zhengyang Chen, Yanmin Qian:
Self-Supervised Speaker Verification Using Dynamic Loss-Gate and Label Correction. INTERSPEECH 2022: 4780-4784 - [c121]Wangyou Zhang, Zhuo Chen, Naoyuki Kanda, Shujie Liu, Jinyu Li, Sefik Emre Eskimez, Takuya Yoshioka, Xiong Xiao, Zhong Meng, Yanmin Qian, Furu Wei:
Separating Long-Form Speech with Group-wise Permutation Invariant Training. INTERSPEECH 2022: 5383-5387 - [c120]Yen-Ju Lu, Xuankai Chang, Chenda Li, Wangyou Zhang, Samuele Cornell, Zhaoheng Ni, Yoshiki Masuyama, Brian Yan, Robin Scheibler, Zhong-Qiu Wang, Yu Tsao, Yanmin Qian, Shinji Watanabe:
ESPnet-SE++: Speech Enhancement for Robust Speech Recognition, Translation, and Understanding. INTERSPEECH 2022: 5458-5462 - [c119]Bowen Qu, Chenda Li, Jinfeng Bai, Yanmin Qian:
Improving Speech Separation with Knowledge Distilled from Self-supervised Pre-trained Models. ISCSLP 2022: 329-333 - [c118]Wei Wang, Wangyou Zhang, Shaoxiong Lin, Yanmin Qian:
Text-Informed Knowledge Distillation for Robust Speech Enhancement and Recognition. ISCSLP 2022: 334-338 - [c117]Zhikai Zhou, Shuang Cao, Zhengyang Chen, Bei Liu, Ming Xia, Hong Jiang, Yanmin Qian:
Medical Difficult Airway Detection using Speech Technology. ISCSLP 2022: 349-353 - [c116]Houjun Huang, Yanmin Qian:
Speaking style compensation on synthetic audio for robust keyword spotting. ISCSLP 2022: 448-452 - [c115]Gaofeng Cheng, Yifan Chen, Runyan Yang, Qingxuan Li, Zehui Yang, Lingxuan Ye, Pengyuan Zhang, Qingqing Zhang, Lei Xie, Yanmin Qian, Kong Aik Lee, Yonghong Yan:
The Conversational Short-phrase Speaker Diarization (CSSD) Task: Dataset, Evaluation Metric and Baselines. ISCSLP 2022: 488-492 - [c114]Tao Liu, Xu Xiang, Zhengyang Chen, Bing Han, Kai Yu, Yanmin Qian:
The X-Lance Speaker Diarization System for the Conversational Short-phrase Speaker Diarization Challenge 2022. ISCSLP 2022: 498-501 - [c113]Robin Scheibler, Wangyou Zhang, Xuankai Chang, Shinji Watanabe, Yanmin Qian:
End-to-End Multi-Speaker ASR with Independent Vector Analysis. SLT 2022: 496-501 - [c112]Zhengyang Chen, Yao Qian, Bing Han, Yanmin Qian, Michael Zeng:
A Comprehensive Study on Self-Supervised Distillation for Speaker Representation Learning. SLT 2022: 599-604 - [i43]Chenda Li, Lei Yang, Weiqin Wang, Yanmin Qian:
SkiM: Skipping Memory LSTM for Low-Latency Real-Time Continuous Speech Separation. CoRR abs/2201.10800 (2022) - [i42]Fan Yu, Shiliang Zhang, Pengcheng Guo, Yihui Fu, Zhihao Du, Siqi Zheng, Weilong Huang, Lei Xie, Zheng-Hua Tan, DeLiang Wang, Yanmin Qian, Kong Aik Lee, Zhijie Yan, Bin Ma, Xin Xu, Hui Bu:
Summary On The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Grand Challenge. CoRR abs/2202.03647 (2022) - [i41]Robin Scheibler, Wangyou Zhang, Xuankai Chang, Shinji Watanabe, Yanmin Qian:
End-to-End Multi-speaker ASR with Independent Vector Analysis. CoRR abs/2204.00218 (2022) - [i40]Xun Gong, Yizhou Lu, Zhikai Zhou, Yanmin Qian:
Layer-wise Fast Adaptation for End-to-End Multi-Accent Speech Recognition. CoRR abs/2204.09883 (2022) - [i39]Zhengyang Chen, Bei Liu, Bing Han, Leying Zhang, Yanmin Qian:
The SJTU X-LANCE Lab System for CNSRC 2022. CoRR abs/2206.11699 (2022) - [i38]Yen-Ju Lu, Xuankai Chang, Chenda Li, Wangyou Zhang, Samuele Cornell, Zhaoheng Ni, Yoshiki Masuyama, Brian Yan, Robin Scheibler, Zhong-Qiu Wang, Yu Tsao, Yanmin Qian, Shinji Watanabe:
ESPnet-SE++: Speech Enhancement for Robust Speech Recognition, Translation, and Understanding. CoRR abs/2207.09514 (2022) - [i37]Xun Gong, Zhikai Zhou, Yanmin Qian:
Knowledge Transfer and Distillation from Autoregressive to Non-Autoregressive Speech Recognition. CoRR abs/2207.10600 (2022) - [i36]Bing Han, Zhengyang Chen, Yanmin Qian:
Self-Supervised Speaker Verification Using Dynamic Loss-Gate and Label Correction. CoRR abs/2208.01928 (2022) - [i35]Bing Han, Zhengyang Chen, Zhikai Zhou, Yanmin Qian:
The SJTU System for Short-duration Speaker Verification Challenge 2021. CoRR abs/2208.01933 (2022) - [i34]Gaofeng Cheng, Yifan Chen, Runyan Yang, Qingxuan Li, Zehui Yang, Lingxuan Ye, Pengyuan Zhang, Qingqing Zhang, Lei Xie, Yanmin Qian, Kong Aik Lee, Yonghong Yan:
The Conversational Short-phrase Speaker Diarization (CSSD) Task: Dataset, Evaluation Metric and Baselines. CoRR abs/2208.08042 (2022) - [i33]Zhengyang Chen, Bing Han, Xu Xiang, Houjun Huang, Bei Liu, Yanmin Qian:
SJTU-AISPEECH System for VoxCeleb Speaker Recognition Challenge 2022. CoRR abs/2209.09076 (2022) - [i32]Zhengyang Chen, Yao Qian, Bing Han, Yanmin Qian, Michael Zeng:
A comprehensive study on self-supervised distillation for speaker representation learning. CoRR abs/2210.15936 (2022) - [i31]Hongji Wang, Chengdong Liang, Shuai Wang, Zhengyang Chen, Binbin Zhang, Xu Xiang, Yanlei Deng, Yanmin Qian:
Wespeaker: A Research and Production oriented Speaker Embedding Learning Toolkit. CoRR abs/2210.17016 (2022) - [i30]Zhengyang Chen, Bing Han, Xu Xiang, Houjun Huang, Bei Liu, Yanmin Qian:
Build a SRE Challenge System: Lessons from VoxSRC 2022 and CNSRC 2022. CoRR abs/2211.00815 (2022) - [i29]Xun Gong, Yu Wu, Jinyu Li, Shujie Liu, Rui Zhao, Xie Chen, Yanmin Qian:
LongFNT: Long-form Speech Recognition with Factorized Neural Transducer. CoRR abs/2211.09412 (2022)
- 2021
- [j24]Jichen Yang, Hongji Wang, Rohan Kumar Das, Yanmin Qian:
Modified Magnitude-Phase Spectrum Information for Spoofing Detection. IEEE ACM Trans. Audio Speech Lang. Process. 29: 1065-1078 (2021) - [j23]Yanmin Qian, Zhengyang Chen, Shuai Wang:
Audio-Visual Deep Neural Network for Robust Person Verification. IEEE ACM Trans. Audio Speech Lang. Process. 29: 1079-1092 (2021) - [c111]Chenda Li, Zhuo Chen, Yi Luo, Cong Han, Tianyan Zhou, Keisuke Kinoshita, Marc Delcroix, Shinji Watanabe, Yanmin Qian:
Dual-Path Modeling for Long Recording Speech Separation in Meetings. ICASSP 2021: 5739-5743 - [c110]Zhengyang Chen, Shuai Wang, Yanmin Qian:
Self-Supervised Learning Based Domain Adaptation for Robust Speaker Verification. ICASSP 2021: 5834-5838 - [c109]Chenpeng Du, Bing Han, Shuai Wang, Yanmin Qian, Kai Yu:
SynAug: Synthesis-Based Data Augmentation for Text-Dependent Speaker Verification. ICASSP 2021: 5844-5848 - [c108]Houjun Huang, Xu Xiang, Fei Zhao, Shuai Wang, Yanmin Qian:
Unit Selection Synthesis Based Data Augmentation for Fixed Phrase Speaker Verification. ICASSP 2021: 5849-5853 - [c107]Houjun Huang, Xu Xiang, Yexin Yang, Rao Ma, Yanmin Qian:
AISpeech-SJTU Accent Identification System for the Accented English Speech Recognition Challenge. ICASSP 2021: 6254-6258 - [c106]Tian Tan, Yizhou Lu, Rao Ma, Sen Zhu, Jiaqi Guo, Yanmin Qian:
AISpeech-SJTU ASR System for the Accented English Speech Recognition Challenge. ICASSP 2021: 6413-6417 - [c105]Wei Wang, Zhikai Zhou, Yizhou Lu, Hongji Wang, Chenpeng Du, Yanmin Qian:
Towards Data Selection on TTS Data for Children's Speech Recognition. ICASSP 2021: 6888-6892 - [c104]Wangyou Zhang, Christoph Böddeker, Shinji Watanabe, Tomohiro Nakatani, Marc Delcroix, Keisuke Kinoshita, Tsubasa Ochiai, Naoyuki Kamo, Reinhold Haeb-Umbach, Yanmin Qian:
End-to-End Dereverberation, Beamforming, and Speech Recognition with Improved Numerical Stability and Advanced Frontend. ICASSP 2021: 6898-6902 - [c103]Xian Shi, Fan Yu, Yizhou Lu, Yuhao Liang, Qiangze Feng, Daliang Wang, Yanmin Qian, Lei Xie:
The Accented English Speech Recognition Challenge 2020: Open Datasets, Tracks, Baselines, Results and Methods. ICASSP 2021: 6918-6922 - [c102]Christoph Böddeker, Wangyou Zhang, Tomohiro Nakatani, Keisuke Kinoshita, Tsubasa Ochiai, Marc Delcroix, Naoyuki Kamo, Yanmin Qian, Reinhold Haeb-Umbach:
Convolutive Transfer Function Invariant SDR Training Criteria for Multi-Channel Reverberant Speech Separation. ICASSP 2021: 8428-8432 - [c101]Xun Gong, Yizhou Lu, Zhikai Zhou, Yanmin Qian:
Layer-Wise Fast Adaptation for End-to-End Multi-Accent Speech Recognition. Interspeech 2021: 1274-1278 - [c100]Leying Zhang, Zhengyang Chen, Yanmin Qian:
Knowledge Distillation from Multi-Modality to Single-Modality for Person Verification. Interspeech 2021: 1897-1901 - [c99]Zhengxi Liu, Yanmin Qian:
Basis-MelGAN: Efficient Neural Vocoder Based on Audio Decomposition. Interspeech 2021: 2222-2226 - [c98]Bing Han, Zhengyang Chen, Zhikai Zhou, Yanmin Qian:
The SJTU System for Short-Duration Speaker Verification Challenge 2021. Interspeech 2021: 2332-2336 - [c97]Yifei Wu, Chenda Li, Song Yang, Zhongqin Wu, Yanmin Qian:
Audio-Visual Multi-Talker Speech Recognition in a Cocktail Party. Interspeech 2021: 3021-3025 - [c96]Xun Gong, Zhengyang Chen, Yexin Yang, Shuai Wang, Lan Wang, Yanmin Qian:
Speaker Embedding Augmentation with Noise Distribution Matching. ISCSLP 2021: 1-5 - [c95]Shuai Wang, Yexin Yang, Yanmin Qian, Kai Yu:
Revisiting the Statistics Pooling Layer in Deep Speaker Embedding Learning. ISCSLP 2021: 1-5 - [c94]Chenpeng Du, Hao Li, Yizhou Lu, Lan Wang, Yanmin Qian:
Data Augmentation for end-to-end Code-Switching Speech Recognition. SLT 2021: 194-200 - [c93]Chenda Li, Yi Luo, Cong Han, Jinyu Li, Takuya Yoshioka, Tianyan Zhou, Marc Delcroix, Keisuke Kinoshita, Christoph Böddeker, Yanmin Qian, Shinji Watanabe, Zhuo Chen:
Dual-Path RNN for Long Recording Speech Separation. SLT 2021: 865-872 - [c92]Wangyou Zhang, Jing Shi, Chenda Li, Shinji Watanabe, Yanmin Qian:
Closing the Gap Between Time-Domain Multi-Channel Speech Enhancement on Real and Simulation Conditions. WASPAA 2021: 146-150 - [i28]Houjun Huang, Xu Xiang, Fei Zhao, Shuai Wang, Yanmin Qian:
Unit selection synthesis based data augmentation for fixed phrase speaker verification. CoRR abs/2102.09817 (2021) - [i27]Houjun Huang, Xu Xiang, Yexin Yang, Rao Ma, Yanmin Qian:
AISPEECH-SJTU accent identification system for the Accented English Speech Recognition Challenge. CoRR abs/2102.09828 (2021) - [i26]Xian Shi, Fan Yu, Yizhou Lu, Yuhao Liang, Qiangze Feng, Daliang Wang, Yanmin Qian, Lei Xie:
The Accented English Speech Recognition Challenge 2020: Open Datasets, Tracks, Baselines, Results and Methods. CoRR abs/2102.10233 (2021) - [i25]Wangyou Zhang, Christoph Böddeker, Shinji Watanabe, Tomohiro Nakatani, Marc Delcroix, Keisuke Kinoshita, Tsubasa Ochiai, Naoyuki Kamo, Reinhold Haeb-Umbach, Yanmin Qian:
End-to-End Dereverberation, Beamforming, and Speech Recognition with Improved Numerical Stability and Advanced Frontend. CoRR abs/2102.11525 (2021) - [i24]Chenda Li, Zhuo Chen, Yi Luo, Cong Han, Tianyan Zhou, Keisuke Kinoshita, Marc Delcroix, Shinji Watanabe, Yanmin Qian:
Dual-Path Modeling for Long Recording Speech Separation in Meetings. CoRR abs/2102.11634 (2021) - [i23]Zhengxi Liu, Yanmin Qian:
Basis-MelGAN: Efficient Neural Vocoder Based on Audio Decomposition. CoRR abs/2106.13419 (2021) - [i22]Zhengyang Chen, Shuai Wang, Yanmin Qian:
Self-Supervised Learning Based Domain Adaptation for Robust Speaker Verification. CoRR abs/2108.13843 (2021) - [i21]Zhengyang Chen, Sanyuan Chen, Yu Wu, Yao Qian, Chengyi Wang, Shujie Liu, Yanmin Qian, Michael Zeng:
Large-scale Self-Supervised Speech Representation Learning for Automatic Speaker Verification. CoRR abs/2110.05777 (2021) - [i20]Wei Wang, Shuo Ren, Yao Qian, Shujie Liu, Yu Shi, Yanmin Qian, Michael Zeng:
Optimizing Alignment of Speech and Language Latent Spaces for End-to-End Speech Recognition and Understanding. CoRR abs/2110.12138 (2021) - [i19]Sanyuan Chen, Chengyi Wang, Zhengyang Chen, Yu Wu, Shujie Liu, Zhuo Chen, Jinyu Li, Naoyuki Kanda, Takuya Yoshioka, Xiong Xiao, Jian Wu, Long Zhou, Shuo Ren, Yanmin Qian, Yao Qian, Jian Wu, Michael Zeng, Furu Wei:
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing. CoRR abs/2110.13900 (2021) - [i18]Wangyou Zhang, Jing Shi, Chenda Li, Shinji Watanabe, Yanmin Qian:
Closing the Gap Between Time-Domain Multi-Channel Speech Enhancement on Real and Simulation Conditions. CoRR abs/2110.14139 (2021) - [i17]Wangyou Zhang, Zhuo Chen, Naoyuki Kanda, Shujie Liu, Jinyu Li, Sefik Emre Eskimez, Takuya Yoshioka, Xiong Xiao, Zhong Meng, Yanmin Qian, Furu Wei:
Separating Long-Form Speech with Group-Wise Permutation Invariant Training. CoRR abs/2110.14142 (2021) - 2020
- [j22]Wangyou Zhang, Xuankai Chang, Yanmin Qian, Shinji Watanabe:
Improving End-to-End Single-Channel Multi-Talker Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 28: 1385-1394 (2020) - [j21]Shuai Wang, Yexin Yang, Zhanghao Wu, Yanmin Qian, Kai Yu:
Data Augmentation Using Deep Generative Models for Embedding Based Speaker Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 28: 2598-2609 (2020) - [c91]Xuankai Chang, Wangyou Zhang, Yanmin Qian, Jonathan Le Roux, Shinji Watanabe:
End-To-End Multi-Speaker Speech Recognition With Transformer. ICASSP 2020: 6134-6138 - [c90]Yexin Yang, Shuai Wang, Xun Gong, Yanmin Qian, Kai Yu:
Text Adaptation for Speaker Verification with Speaker-Text Factorized Embeddings. ICASSP 2020: 6454-6458 - [c89]Zhengyang Chen, Shuai Wang, Yanmin Qian, Kai Yu:
Channel Invariant Speaker Embedding Learning with Joint Multi-Task and Adversarial Training. ICASSP 2020: 6574-6578 - [c88]Chenda Li, Yanmin Qian:
Deep Audio-Visual Speech Separation with Attention Mechanism. ICASSP 2020: 7314-7318 - [c87]Wangyou Zhang, Yanmin Qian:
Learning Contextual Language Embeddings for Monaural Multi-Talker Speech Recognition. INTERSPEECH 2020: 304-308 - [c86]Wangyou Zhang, Aswin Shanmugam Subramanian, Xuankai Chang, Shinji Watanabe, Yanmin Qian:
End-to-End Far-Field Speech Recognition with Unified Dereverberation and Beamforming. INTERSPEECH 2020: 324-328 - [c85]Hongji Wang, Heinrich Dinkel, Shuai Wang, Yanmin Qian, Kai Yu:
Dual-Adversarial Domain Adaptation for Generalized Replay Attack Detection. INTERSPEECH 2020: 1086-1090 - [c84]Chenda Li, Yanmin Qian:
Listen, Watch and Understand at the Cocktail Party: Audio-Visual-Contextual Speech Separation. INTERSPEECH 2020: 1426-1430 - [c83]Zhengyang Chen, Shuai Wang, Yanmin Qian:
Multi-Modality Matters: A Performance Leap on VoxCeleb. INTERSPEECH 2020: 2252-2256 - [c82]Zhengyang Chen, Shuai Wang, Yanmin Qian:
Adversarial Domain Adaptation for Speaker Verification Using Partially Shared Network. INTERSPEECH 2020: 3017-3021 - [c81]Yizhou Lu, Mingkun Huang, Hao Li, Jiaqi Guo, Yanmin Qian:
Bi-Encoder Transformer Network for Mandarin-English Code-Switching Speech Recognition Using Mixture of Experts. INTERSPEECH 2020: 4766-4770 - [i16]Xuankai Chang, Wangyou Zhang, Yanmin Qian, Jonathan Le Roux, Shinji Watanabe:
End-to-End Multi-speaker Speech Recognition with Transformer. CoRR abs/2002.03921 (2020) - [i15]Wangyou Zhang, Aswin Shanmugam Subramanian, Xuankai Chang, Shinji Watanabe, Yanmin Qian:
End-to-End Far-Field Speech Recognition with Unified Dereverberation and Beamforming. CoRR abs/2005.10479 (2020) - [i14]Heinrich Dinkel, Nanxin Chen, Yanmin Qian, Kai Yu:
End-to-end spoofing detection with raw waveform CLDNNs. CoRR abs/2007.13060 (2020) - [i13]Qi Liu, Yanmin Qian, Kai Yu:
Future Vector Enhanced LSTM Language Model for LVCSR. CoRR abs/2008.01832 (2020) - [i12]Yefei Chen, Shuai Wang, Yanmin Qian, Kai Yu:
End-to-End Speaker-Dependent Voice Activity Detection. CoRR abs/2009.09906 (2020) - [i11]Chenpeng Du, Hao Li, Yizhou Lu, Lan Wang, Yanmin Qian:
Data Augmentation for End-to-end Code-switching Speech Recognition. CoRR abs/2011.02160 (2020) - [i10]Christoph Böddeker, Wangyou Zhang, Tomohiro Nakatani, Keisuke Kinoshita, Tsubasa Ochiai, Marc Delcroix, Naoyuki Kamo, Yanmin Qian, Shinji Watanabe, Reinhold Haeb-Umbach:
Convolutive Transfer Function Invariant SDR training criteria for Multi-Channel Reverberant Speech Separation. CoRR abs/2011.15003 (2020)
2010 – 2019
- 2019
- [j20]Yanmin Qian, Chao Weng, Xuankai Chang, Shuai Wang, Dong Yu:
Erratum to: Past review, current progress, and challenges ahead on the cocktail party problem. Frontiers Inf. Technol. Electron. Eng. 20(3): 438 (2019) - [j19]Yanmin Qian, Xu Xiang:
Binary neural networks for speech recognition. Frontiers Inf. Technol. Electron. Eng. 20(5): 701-715 (2019) - [j18]Yanmin Qian, Hu Hu, Tian Tan:
Data augmentation using generative adversarial networks for robust speech recognition. Speech Commun. 114: 1-9 (2019) - [j17]Shuai Wang, Zili Huang, Yanmin Qian, Kai Yu:
Discriminative Neural Embedding Learning for Short-Duration Text-Independent Speaker Verification. IEEE ACM Trans. Audio Speech Lang. Process. 27(11): 1686-1696 (2019) - [c80]Xu Xiang, Shuai Wang, Houjun Huang, Yanmin Qian, Kai Yu:
Margin Matters: Towards More Discriminative Deep Neural Network Embeddings for Speaker Recognition. APSIPA 2019: 1652-1656 - [c79]Peiyao Sheng, Zhuolin Yang, Yanmin Qian:
GANs for Children: A Generative Data Augmentation Strategy for Children Speech Recognition. ASRU 2019: 129-135 - [c78]Xuankai Chang, Wangyou Zhang, Yanmin Qian, Jonathan Le Roux, Shinji Watanabe:
MIMO-Speech: End-to-End Multi-Channel Multi-Speaker Speech Recognition. ASRU 2019: 237-244 - [c77]Mingkun Huang, Yizhou Lu, Lan Wang, Yanmin Qian, Kai Yu:
Exploring Model Units and Training Strategies for End-to-End Speech Recognition. ASRU 2019: 524-531 - [c76]Wangyou Zhang, Man Sun, Lan Wang, Yanmin Qian:
End-to-End Overlapped Speech Detection and Speaker Counting with Raw Waveform. ASRU 2019: 660-666 - [c75]Shuai Wang, Yexin Yang, Tianzhe Wang, Yanmin Qian, Kai Yu:
Knowledge Distillation for Small Foot-print Deep Speaker Embedding. ICASSP 2019: 6021-6025 - [c74]Xuankai Chang, Yanmin Qian, Kai Yu, Shinji Watanabe:
End-to-end Monaural Multi-speaker ASR System without Pretraining. ICASSP 2019: 6256-6260 - [c73]Yexin Yang, Hongji Wang, Heinrich Dinkel, Zhengyang Chen, Shuai Wang, Yanmin Qian, Kai Yu:
The SJTU Robust Anti-Spoofing System for the ASVspoof 2019 Challenge. INTERSPEECH 2019: 1038-1042 - [c72]Shuai Wang, Johan Rohdin, Lukás Burget, Oldrich Plchot, Yanmin Qian, Kai Yu, Jan Cernocký:
On the Usage of Phonetic Information for Text-Independent Speaker Embedding Extraction. INTERSPEECH 2019: 1148-1152 - [c71]Zhanghao Wu, Shuai Wang, Yanmin Qian, Kai Yu:
Data Augmentation Using Variational Autoencoder for Embedding Based Speaker Verification. INTERSPEECH 2019: 1163-1167 - [c70]Jiaqi Guo, Yongbin You, Yanmin Qian, Kai Yu:
Joint Decoding of CTC Based Systems for Speech Recognition. INTERSPEECH 2019: 2205-2209 - [c69]Wangyou Zhang, Xuankai Chang, Yanmin Qian:
Knowledge Distillation for End-to-End Monaural Multi-Talker ASR System. INTERSPEECH 2019: 2633-2637 - [c68]Wangyou Zhang, Ying Zhou, Yanmin Qian:
Robust DOA Estimation Based on Convolutional Neural Network and Time-Frequency Masking. INTERSPEECH 2019: 2703-2707 - [c67]Hongji Wang, Heinrich Dinkel, Shuai Wang, Yanmin Qian, Kai Yu:
Cross-Domain Replay Spoofing Attack Detection Using Domain Adversarial Training. INTERSPEECH 2019: 2938-2942 - [c66]Chenda Li, Yanmin Qian:
Prosody Usage Optimization for Children Speech Recognition with Zero Resource Children Speech. INTERSPEECH 2019: 3446-3450 - [i9]Xu Xiang, Shuai Wang, Houjun Huang, Yanmin Qian, Kai Yu:
Margin Matters: Towards More Discriminative Deep Neural Network Embeddings for Speaker Recognition. CoRR abs/1906.07317 (2019) - [i8]Xuankai Chang, Wangyou Zhang, Yanmin Qian, Jonathan Le Roux, Shinji Watanabe:
MIMO-SPEECH: End-to-End Multi-Channel Multi-Speaker Speech Recognition. CoRR abs/1910.06522 (2019) - 2018
- [j16]Yanmin Qian, Chao Weng, Xuankai Chang, Shuai Wang, Dong Yu:
Past review, current progress, and challenges ahead on the cocktail party problem. Frontiers Inf. Technol. Electron. Eng. 19(1): 40-63 (2018) - [j15]Yanmin Qian, Chao Weng, Xuankai Chang, Shuai Wang, Dong Yu:
Erratum to: Past review, current progress, and challenges ahead on the cocktail party problem. Frontiers Inf. Technol. Electron. Eng. 19(4): 582 (2018) - [j14]Zhehuai Chen, Yanmin Qian, Kai Yu:
Sequence discriminative training for deep learning based acoustic keyword spotting. Speech Commun. 102: 100-111 (2018) - [j13]Yanmin Qian, Xuankai Chang, Dong Yu:
Single-channel multi-talker speech recognition with permutation invariant training. Speech Commun. 104: 1-11 (2018) - [j12]Tian Tan, Yanmin Qian, Hu Hu, Ying Zhou, Wen Ding, Kai Yu:
Adaptive Very Deep Convolutional Residual Network for Noise Robust Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 26(8): 1393-1405 (2018) - [j11]Heinrich Dinkel, Yanmin Qian, Kai Yu:
Investigating Raw Wave Deep Neural Networks for End-to-End Speaker Spoofing Detection. IEEE ACM Trans. Audio Speech Lang. Process. 26(11): 2002-2014 (2018) - [c65]Ying Zhou, Yanmin Qian:
Robust Mask Estimation By Integrating Neural Network-Based and Clustering-Based Approaches for Adaptive Acoustic Beamforming. ICASSP 2018: 536-540 - [c64]Tian Tan, Yanmin Qian, Dong Yu:
Knowledge Transfer in Permutation Invariant Training for Single-Channel Multi-Talker Speech Recognition. ICASSP 2018: 571-5718 - [c63]Zili Huang, Shuai Wang, Yanmin Qian:
Joint I-Vector with End-to-End System for Short Duration Text-Independent Speaker Verification. ICASSP 2018: 4869-4873 - [c62]Hu Hu, Tian Tan, Yanmin Qian:
Generative Adversarial Networks Based Data Augmentation for Noise Robust Speech Recognition. ICASSP 2018: 5044-5048 - [c61]Shuai Wang, Yanmin Qian, Kai Yu:
Focal Kl-Divergence Based Dilated Convolutional Neural Networks for Co-Channel Speaker Identification. ICASSP 2018: 5339-5343 - [c60]Yanmin Qian, Tian Tan, Hu Hu, Qi Liu:
Noise Robust Speech Recognition on Aurora4 by Humans and Machines. ICASSP 2018: 5604-5608 - [c59]Wen Ding, Tian Tan, Yanmin Qian:
Fast Adaptation on Deepmixture Generative Network Based Acoustic Modeling. ICASSP 2018: 5944-5948 - [c58]Xuankai Chang, Yanmin Qian, Dong Yu:
Adaptive Permutation Invariant Training with Auxiliary Information for Monaural Multi-Talker Speech Recognition. ICASSP 2018: 5974-5978 - [c57]Lianwu Chen, Meng Yu, Yanmin Qian, Dan Su, Dong Yu:
Permutation Invariant Training of Generative Adversarial Network for Monaural Speech Separation. INTERSPEECH 2018: 302-306 - [c56]Jun Wang, Jie Chen, Dan Su, Lianwu Chen, Meng Yu, Yanmin Qian, Dong Yu:
Deep Extractor Network for Target Speaker Recovery from Single Channel Speech Mixtures. INTERSPEECH 2018: 307-311 - [c55]Xuankai Chang, Yanmin Qian, Dong Yu:
Monaural Multi-Talker Speech Recognition with Attention Mechanism and Gated Convolutional Networks. INTERSPEECH 2018: 1586-1590 - [c54]Mingkun Huang, Yongbin You, Zhehuai Chen, Yanmin Qian, Kai Yu:
Knowledge Distillation for Sequence Model. INTERSPEECH 2018: 3703-3707 - [c53]Shuai Wang, Heinrich Dinkel, Yanmin Qian, Kai Yu:
Covariance Based Deep Feature for Text-Dependent Speaker Verification. IScIDE 2018: 231-242 - [c52]Peiyao Sheng, Zhuolin Yang, Hu Hu, Tian Tan, Yanmin Qian:
Data Augmentation using Conditional Generative Adversarial Networks for Robust Speech Recognition. ISCSLP 2018: 121-125 - [c51]Shuai Wang, Zili Huang, Yanmin Qian, Kai Yu:
Deep Discriminant Analysis for i-vector Based Robust Speaker Recognition. ISCSLP 2018: 195-199 - [c50]Yexin Yang, Shuai Wang, Man Sun, Yanmin Qian, Kai Yu:
Generative Adversarial Networks based X-vector Augmentation for Robust Probabilistic Linear Discriminant Analysis in Speaker Verification. ISCSLP 2018: 205-209 - [i7]Shuai Wang, Zili Huang, Yanmin Qian, Kai Yu:
Deep Discriminant Analysis for i-vector Based Robust Speaker Recognition. CoRR abs/1805.01344 (2018) - [i6]Jun Wang, Jie Chen, Dan Su, Lianwu Chen, Meng Yu, Yanmin Qian, Dong Yu:
Deep Extractor Network for Target Speaker Recovery From Single Channel Speech Mixtures. CoRR abs/1807.08974 (2018) - [i5]Zhehuai Chen, Yanmin Qian, Kai Yu:
Sequence Discriminative Training for Deep Learning based Acoustic Keyword Spotting. CoRR abs/1808.00639 (2018) - [i4]Xuankai Chang, Yanmin Qian, Kai Yu, Shinji Watanabe:
End-to-End Monaural Multi-speaker ASR System without Pretraining. CoRR abs/1811.02062 (2018) - 2017
- [j10]Zhehuai Chen, Yimeng Zhuang, Yanmin Qian, Kai Yu:
Phone Synchronous Speech Recognition With CTC Lattices. IEEE ACM Trans. Audio Speech Lang. Process. 25(1): 86-97 (2017) - [j9]Yanmin Qian, Nanxin Chen, Heinrich Dinkel, Zhizheng Wu:
Deep Feature Engineering for Noise Robust Spoofing Detection. IEEE ACM Trans. Audio Speech Lang. Process. 25(10): 1942-1955 (2017) - [c49]Xiaowei Jiang, Shuai Wang, Xu Xiang, Yanmin Qian:
Integrating online i-vector into GMM-UBM for text-dependent speaker verification. APSIPA 2017: 1628-1632 - [c48]Qi Liu, Yanmin Qian, Kai Yu:
Future vector enhanced LSTM language model for LVCSR. ASRU 2017: 104-110 - [c47]Yue Wu, Tianxing He, Zhehuai Chen, Yanmin Qian, Kai Yu:
Multi-view LSTM Language Model with Word-Synchronized Auxiliary Feature for LVCSR. CCL 2017: 398-410 - [c46]Heinrich Dinkel, Nanxin Chen, Yanmin Qian, Kai Yu:
End-to-end spoofing detection with raw waveform CLDNNS. ICASSP 2017: 4860-4864 - [c45]Heinrich Dinkel, Yanmin Qian, Kai Yu:
Small-footprint convolutional neural network for spoofing detection. IJCNN 2017: 3086-3091 - [c44]Xu Xiang, Yanmin Qian, Kai Yu:
Binary Deep Neural Networks for Speech Recognition. INTERSPEECH 2017: 533-537 - [c43]Shuai Wang, Yanmin Qian, Kai Yu:
What Does the Speaker Embedding Encode? INTERSPEECH 2017: 1497-1501 - [c42]Dong Yu, Xuankai Chang, Yanmin Qian:
Recognizing Multi-Talker Speech with Permutation Invariant Training. INTERSPEECH 2017: 2456-2460 - [c41]Zhehuai Chen, Yanmin Qian, Kai Yu:
A Unified Confidence Measure Framework Using Auxiliary Normalization Graph. IScIDE 2017: 123-133 - [p1]Khe Chai Sim, Yanmin Qian, Gautam Mantena, Lahiru Samarakoon, Souvik Kundu, Tian Tan:
Adaptation of Deep Neural Network Acoustic Models for Robust Automatic Speech Recognition. New Era for Robust Speech Recognition, Exploiting Deep Learning 2017: 219-243 - [i3]Dong Yu, Xuankai Chang, Yanmin Qian:
Recognizing Multi-talker Speech with Permutation Invariant Training. CoRR abs/1704.01985 (2017) - [i2]Yanmin Qian, Xuankai Chang, Dong Yu:
Single-Channel Multi-talker Speech Recognition with Permutation Invariant Training. CoRR abs/1707.06527 (2017) - 2016
- [j8]Yanmin Qian, Nanxin Chen, Kai Yu:
Deep features for automatic spoofing detection. Speech Commun. 85: 43-52 (2016) - [j7]Tian Tan, Yanmin Qian, Kai Yu:
Cluster Adaptive Training for Deep Neural Network Based Acoustic Model. IEEE ACM Trans. Audio Speech Lang. Process. 24(3): 459-468 (2016) - [j6]Yanmin Qian, Tian Tan, Dong Yu:
Neural Network Based Multi-Factor Aware Joint Training for Robust Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 24(12): 2231-2240 (2016) - [j5]Yanmin Qian, Mengxiao Bi, Tian Tan, Kai Yu:
Very Deep Convolutional Neural Networks for Noise Robust Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 24(12): 2263-2276 (2016) - [c40]Pavel Korshunov, Sébastien Marcel, Hannah Muckenhirn, André R. Gonçalves, A. G. Souza Mello, Ricardo Paranhos Velloso Violato, Flávio Olmos Simões, Mário Uliani Neto, Marcus de Assis Angeloni, José Augusto Stuchi, Heinrich Dinkel, Nanxin Chen, Yanmin Qian, Dipjyoti Paul, Goutam Saha, Md. Sahidullah:
Overview of BTAS 2016 speaker anti-spoofing competition. BTAS 2016: 1-6 - [c39]Souvik Kundu, Gautam Mantena, Yanmin Qian, Tian Tan, Marc Delcroix, Khe Chai Sim:
Joint acoustic factor learning for robust deep neural network based automatic speech recognition. ICASSP 2016: 5025-5029 - [c38]Tian Tan, Yanmin Qian, Dong Yu, Souvik Kundu, Liang Lu, Khe Chai Sim, Xiong Xiao, Yu Zhang:
Speaker-aware training of LSTM-RNNS for acoustic modelling. ICASSP 2016: 5280-5284 - [c37]Linlin Wang, Chao Zhang, Philip C. Woodland, Mark J. F. Gales, Panagiota Karanasou, Pierre Lanchantin, Xunying Liu, Yanmin Qian:
Improved DNN-based segmentation for multi-genre broadcast audio. ICASSP 2016: 5700-5704 - [c36]Yanmin Qian, Tian Tan, Dong Yu:
An investigation into using parallel data for far-field speech recognition. ICASSP 2016: 5725-5729 - [c35]Yanmin Qian, Tian Tan, Dong Yu, Yu Zhang:
Integrated adaptation with multi-factor joint-learning for far-field speech recognition. ICASSP 2016: 5770-5774 - [c34]Yimeng Zhuang, Xuankai Chang, Yanmin Qian, Kai Yu:
Unrestricted Vocabulary Keyword Spotting Using LSTM-CTC. INTERSPEECH 2016: 938-942 - [c33]Yimeng Zhuang, Sibo Tong, Maofan Yin, Yanmin Qian, Kai Yu:
Multi-task joint-learning for robust voice activity detection. ISCSLP 2016: 1-5 - [c32]Yanmin Qian, Philip C. Woodland:
Very deep convolutional neural networks for robust speech recognition. SLT 2016: 481-488 - [i1]Yanmin Qian, Philip C. Woodland:
Very Deep Convolutional Neural Networks for Robust Speech Recognition. CoRR abs/1610.00277 (2016) - 2015
- [j4]Yuan Liu, Yanmin Qian, Nanxin Chen, Tianfan Fu, Ya Zhang, Kai Yu:
Deep feature for text-dependent speaker verification. Speech Commun. 73: 1-13 (2015) - [c31]Yanmin Qian, Maofan Yin, Yongbin You, Kai Yu:
Multi-task joint-learning of deep neural networks for robust speech recognition. ASRU 2015: 310-316 - [c30]Philip C. Woodland, Xunying Liu, Yanmin Qian, Chao Zhang, Mark J. F. Gales, Penny Karanasou, Pierre Lanchantin, Linlin Wang:
Cambridge university transcription systems for the multi-genre broadcast challenge. ASRU 2015: 639-646 - [c29]Pierre Lanchantin, Mark J. F. Gales, Penny Karanasou, Xunying Liu, Yanmin Qian, Linlin Wang, Philip C. Woodland, Chao Zhang:
The development of the cambridge university alignment systems for the multi-genre broadcast challenge. ASRU 2015: 647-653 - [c28]Penny Karanasou, Mark J. F. Gales, Pierre Lanchantin, Xunying Liu, Yanmin Qian, Linlin Wang, Philip C. Woodland, Chao Zhang:
Speaker diarisation and longitudinal linking in multi-genre broadcast data. ASRU 2015: 660-666 - [c27]Yongbin You, Yanmin Qian, Kai Yu:
Local trajectory based speech enhancement for robust speech recognition with deep neural network. ChinaSIP 2015: 5-9 - [c26]Yongbin You, Yanmin Qian, Tianxing He, Kai Yu:
An investigation on DNN-derived bottleneck features for GMM-HMM based robust speech recognition. ChinaSIP 2015: 30-34 - [c25]Tian Tan, Yanmin Qian, Maofan Yin, Yimeng Zhuang, Kai Yu:
Cluster adaptive training for deep neural network. ICASSP 2015: 4325-4329 - [c24]Suliang Bu, Yunxin Zhao, Yanmin Qian, Kai Yu:
A novel static parameter calculation method for model compensation. ICASSP 2015: 4510-4514 - [c23]Tianxing He, Xu Xiang, Yanmin Qian, Kai Yu:
Recurrent neural network language model with structured word embeddings for speech recognition. ICASSP 2015: 5396-5400 - [c22]Yanmin Qian, Tianxing He, Wei Deng, Kai Yu:
Automatic model redundancy reduction for fast back-propagation for deep neural networks in speech recognition. IJCNN 2015: 1-6 - [c21]Nanxin Chen, Yanmin Qian, Kai Yu:
Multi-task learning for text-dependent speaker verification. INTERSPEECH 2015: 185-189 - [c20]Nanxin Chen, Yanmin Qian, Heinrich Dinkel, Bo Chen, Kai Yu:
Robust deep feature for spoofing detection - the SJTU system for ASVspoof 2015 challenge. INTERSPEECH 2015: 2097-2101 - [c19]Mengxiao Bi, Yanmin Qian, Kai Yu:
Very deep convolutional neural networks for LVCSR. INTERSPEECH 2015: 3259-3263 - [c18]Wengong Jin, Tianxing He, Yanmin Qian, Kai Yu:
Paragraph vector based topic model for language model adaptation. INTERSPEECH 2015: 3516-3520 - 2014
- [c17]Wei Deng, Yanmin Qian, Yuchen Fan, Tianfan Fu, Kai Yu:
Stochastic data sweeping for fast DNN training. ICASSP 2014: 240-244 - [c16]Tianxing He, Yuchen Fan, Yanmin Qian, Tian Tan, Kai Yu:
Reshaping deep neural network for fast decoding by node-pruning. ICASSP 2014: 245-249 - [c15]Suliang Bu, Yanmin Qian, Khe Chai Sim, Yongbin You, Kai Yu:
Second order vector taylor series based robust speech recognition. ICASSP 2014: 1769-1773 - [c14]Yuan Liu, Tianfan Fu, Yuchen Fan, Yanmin Qian, Kai Yu:
Speaker verification with deep features. IJCNN 2014: 747-753 - [c13]Tianfan Fu, Yanmin Qian, Yuan Liu, Kai Yu:
Tandem deep features for text-dependent speaker verification. INTERSPEECH 2014: 1327-1331 - [c12]Suliang Bu, Yanmin Qian, Kai Yu:
A novel dynamic parameters calculation approach for model compensation. INTERSPEECH 2014: 2744-2748 - [c11]Jianwei Niu, Yanmin Qian, Kai Yu:
Acoustic emotion recognition using deep neural network. ISCSLP 2014: 128-132 - 2013
- [c10]Yanmin Qian, Kai Yu, Jia Liu:
Combination of data borrowing strategies for low-resource LVCSR. ASRU 2013: 404-409 - [c9]Yanmin Qian, Jia Liu:
MLP-HMM two-stage unsupervised training for low-resource languages on conversational telephone speech recognition. INTERSPEECH 2013: 1816-1820 - 2012
- [c8]Daniel Povey, Mirko Hannemann, Gilles Boulianne, Lukás Burget, Arnab Ghoshal, Milos Janda, Martin Karafiát, Stefan Kombrink, Petr Motlícek, Yanmin Qian, Korbinian Riedhammer, Karel Veselý, Ngoc Thang Vu:
Generating exact lattices in the WFST framework. ICASSP 2012: 4213-4216 - [c7]Yanmin Qian, Jia Liu:
Cross-Lingual and Ensemble MLPs Strategies for Low-Resource Speech Recognition. INTERSPEECH 2012: 2582-2585 - [c6]Yanmin Qian, Jia Liu:
Articulatory Feature based Multilingual MLPs for Low-Resource Speech Recognition. INTERSPEECH 2012: 2602-2605 - 2011
- [j3]Yan Deng, Weiqiang Zhang, Yanmin Qian, Jia Liu:
Language Recognition Based on Acoustic Diversified Phone Recognizers and Phonotactic Feature Fusion. IEICE Trans. Inf. Syst. 94-D(3): 679-689 (2011) - [j2]Yan Deng, Weiqiang Zhang, Yanmin Qian, Jia Liu:
Time-Frequency Cepstral Features and Combining Discriminative Training for Phonotactic Language Recognition. J. Comput. 6(2): 178-183 (2011) - [c5]Yanmin Qian, Ji Xu, Daniel Povey, Jia Liu:
Strategies for using MLP based features with limited target-language training data. ASRU 2011: 354-358 - [c4]Yanmin Qian, Daniel Povey, Jia Liu:
State-Level Data Borrowing for Low-Resource Speech Recognition Based on Subspace GMMs. INTERSPEECH 2011: 553-560 - 2010
- [c3]Yanmin Qian, Jia Liu:
Phone modeling and combining discriminative training for mandarin-english bilingual speech recognition. ICASSP 2010: 4918-4921 - [c2]Yan Deng, Weiqiang Zhang, Yanmin Qian, Jia Liu:
Integration of Complementary Phone Recognizers for Phonotactic Language Recognition. ICICA (LNCS) 2010: 237-244 - [c1]Yanmin Qian, Jia Liu:
Mandarin-English bilingual phone modeling and combining MPE based Discriminative training for cross-language speech recognition. ISCSLP 2010: 103-108
2000 – 2009
- 2009
- [j1]Yanmin Qian, Jia Liu, Michael T. Johnson:
Efficient embedded speech recognition for very large vocabulary Mandarin car-navigation systems. IEEE Trans. Consumer Electron. 55(3): 1496-1500 (2009)
last updated on 2024-10-23 21:23 CEST by the dblp team
all metadata released as open data under CC0 1.0 license