Yuxuan Wang 0002

Person information

affiliation: ByteDance AI Lab, Mountain View, CA, USA
affiliation: Google, Mountain View, CA, USA
affiliation (former, PhD): Ohio State University, Columbus, OH, USA

2020 – today

2024
[j15]
  - https://dblp.org/rec/journals/taslp/LiLCZMWMTWW24
Jingbei Li, Sipan Li, Ping Chen, Luwen Zhang, Yi Meng, Zhiyong Wu, Helen Meng, Qiao Tian, Yuping Wang, Yuxuan Wang:
Joint Multiscale Cross-Lingual Speaking Style Transfer With Bidirectional Attention Mechanism for Automatic Dubbing. IEEE ACM Trans. Audio Speech Lang. Process. 32: 517-528 (2024)
[j14]
  - https://dblp.org/rec/journals/taslp/LiuYLMKTWWWP24
Haohe Liu, Yi Yuan, Xubo Liu, Xinhao Mei, Qiuqiang Kong, Qiao Tian, Yuping Wang, Wenwu Wang, Yuxuan Wang, Mark D. Plumbley:
AudioLDM 2: Learning Holistic Audio Generation With Self-Supervised Pretraining. IEEE ACM Trans. Audio Speech Lang. Process. 32: 2871-2883 (2024)
[c51]
  - https://dblp.org/rec/conf/icassp/LiuLZWXT024
Yuzhuo Liu, Xubo Liu, Yan Zhao, Yuanyuan Wang, Rui Xia, Pingchuan Tain, Yuxuan Wang:
Audio Prompt Tuning for Universal Sound Separation. ICASSP 2024: 1446-1450
[c50]
  - https://dblp.org/rec/conf/icassp/Ying0DKTH024
Zelin Ying, Chen Li, Yu Dong, Qiuqiang Kong, Qiao Tian, Yuanyuan Huo, Yuxuan Wang:
A Unified Front-End Framework for English Text-to-Speech Synthesis. ICASSP 2024: 10181-10185
[c49]
  - https://dblp.org/rec/conf/iclr/DongH00KZF0WCYB24
Qianqian Dong, Zhiying Huang, Qiao Tian, Chen Xu, Tom Ko, Yunlong Zhao, Siyuan Feng, Tang Li, Kexin Wang, Xuxin Cheng, Fengpeng Yue, Ye Bai, Xi Chen, Lu Lu, Zejun Ma, Yuping Wang, Mingxuan Wang, Yuxuan Wang:
PolyVoice: Language Models for Speech to Speech Translation. ICLR 2024
[c48]
  - https://dblp.org/rec/conf/icml/SunYTC000M0024
Guangzhi Sun, Wenyi Yu, Changli Tang, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Yuxuan Wang, Chao Zhang:
video-SALMONN: Speech-Enhanced Audio-Visual Large Language Models. ICML 2024
[c47]
  - https://dblp.org/rec/conf/ijcai/HanDHHGC0QS24
Bing Han, Junyu Dai, Weituo Hao, Xinyan He, Dong Guo, Jitong Chen, Yuxuan Wang, Yanmin Qian, Xuchen Song:
InstructME: An Instruction Guided Music Edit Framework with Latent Diffusion Models. IJCAI 2024: 5835-5843
[i55]
  - https://dblp.org/rec/journals/corr/abs-2404-06674
Philip Anastassiou, Zhenyu Tang, Kainan Peng, Dongya Jia, Jiaxin Li, Ming Tu, Yuping Wang, Yuxuan Wang, Mingbo Ma:
VoiceShop: A Unified Speech-to-Speech Framework for Identity-Preserving Zero-Shot Voice Editing. CoRR abs/2404.06674 (2024)
[i54]
  - https://dblp.org/rec/journals/corr/abs-2406-07914
Changli Tang, Wenyi Yu, Guangzhi Sun, Xianzhao Chen, Tian Tan, Wei Li, Jun Zhang, Lu Lu, Zejun Ma, Yuxuan Wang, Chao Zhang:
Can Large Language Models Understand Spatial Audio? CoRR abs/2406.07914 (2024)
[i53]
  - https://dblp.org/rec/journals/corr/abs-2406-13340
Junyi Ao, Yuancheng Wang, Xiaohai Tian, Dekun Chen, Jun Zhang, Lu Lu, Yuxuan Wang, Haizhou Li, Zhizheng Wu:
SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words. CoRR abs/2406.13340 (2024)
[i52]
  - https://dblp.org/rec/journals/corr/abs-2406-15704
Guangzhi Sun, Wenyi Yu, Changli Tang, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Yuxuan Wang, Chao Zhang:
video-SALMONN: Speech-Enhanced Audio-Visual Large Language Models. CoRR abs/2406.15704 (2024)
[i51]
  - https://dblp.org/rec/journals/corr/abs-2406-17272
Van Tung Pham, Yist Y. Lin, Tao Han, Wei Li, Jun Zhang, Lu Lu, Yuxuan Wang:
A Comprehensive Solution to Connect Speech Encoder and Large Language Model for ASR. CoRR abs/2406.17272 (2024)
[i50]
  - https://dblp.org/rec/journals/corr/abs-2407-04416
Yi Yuan, Dongya Jia, Xiaobin Zhuang, Yuanzhe Chen, Zhengxi Liu, Zhuo Chen, Yuping Wang, Yuxuan Wang, Xubo Liu, Mark D. Plumbley, Wenwu Wang:
Improving Audio Generation with Visual Enhanced Caption. CoRR abs/2407.04416 (2024)
[i49]
  - https://dblp.org/rec/journals/corr/abs-2407-04675
Ye Bai, Jingping Chen, Jitong Chen, Wei Chen, Zhuo Chen, Chuang Ding, Linhao Dong, Qianqian Dong, Yujiao Du, Kepan Gao, Lu Gao, Yi Guo, Minglun Han, Ting Han, Wenchao Hu, Xinying Hu, Yuxiang Hu, Deyu Hua, Lu Huang, Mingkun Huang, Youjia Huang, Jishuo Jin, Fanliu Kong, Zongwei Lan, Tianyu Li, Xiaoyang Li, Zeyang Li, Zehua Lin, Rui Liu, Shouda Liu, Lu Lu, Yizhou Lu, Jingting Ma, Shengtao Ma, Yulin Pei, Chen Shen, Tian Tan, Xiaogang Tian, Ming Tu, Bo Wang, Hao Wang, Yuping Wang, Yuxuan Wang, Hanzhang Xia, Rui Xia, Shuangyi Xie, Hongmin Xu, Meng Yang, Bihong Zhang, Jun Zhang, Wanyi Zhang, Yang Zhang, Yawei Zhang, Yijie Zheng, Ming Zou:
Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition. CoRR abs/2407.04675 (2024)
[i48]
  - https://dblp.org/rec/journals/corr/abs-2409-08680
Minglun Han, Ye Bai, Chen Shen, Youjia Huang, Mingkun Huang, Zehua Lin, Linhao Dong, Lu Lu, Yuxuan Wang:
NEST-RQ: Next Token Prediction for Speech Self-Supervised Pre-Training. CoRR abs/2409.08680 (2024)
2023
[c46]
  - https://dblp.org/rec/conf/icassp/ChenTLLKLWTWW23
Yuanzhe Chen, Ming Tu, Tang Li, Xin Li, Qiuqiang Kong, Jiaxin Li, Zhichao Wang, Qiao Tian, Yuping Wang, Yuxuan Wang:
Streaming Voice Conversion via Intermediate Bottleneck Features and Non-Streaming Teacher Guidance. ICASSP 2023: 1-5
[c45]
  - https://dblp.org/rec/conf/interspeech/FengTXH023
Yukun Feng, Ming Tu, Rui Xia, Chuanzeng Huang, Yuxuan Wang:
Memory Augmented Lookup Dictionary Based Language Modeling for Automatic Speech Recognition. INTERSPEECH 2023: 481-485
[c44]
  - https://dblp.org/rec/conf/interspeech/FengTXH023a
Siyuan Feng, Ming Tu, Rui Xia, Chuanzeng Huang, Yuxuan Wang:
Language-universal Phonetic Encoder for Low-resource Speech Recognition. INTERSPEECH 2023: 1429-1433
[c43]
  - https://dblp.org/rec/conf/nips/LamT0YFTJXMSCW023
Max W. Y. Lam, Qiao Tian, Tang Li, Zongyu Yin, Siyuan Feng, Ming Tu, Yuliang Ji, Rui Xia, Mingbo Ma, Xuchen Song, Jitong Chen, Yuping Wang, Yuxuan Wang:
Efficient Neural Music Generation. NeurIPS 2023
[i47]
  - https://dblp.org/rec/journals/corr/abs-2301-00066
Yukun Feng, Ming Tu, Rui Xia, Chuanzeng Huang, Yuxuan Wang:
Memory Augmented Lookup Dictionary based Language Modeling for Automatic Speech Recognition. CoRR abs/2301.00066 (2023)
[i46]
  - https://dblp.org/rec/journals/corr/abs-2305-05203
Jingbei Li, Sipan Li, Ping Chen, Luwen Zhang, Yi Meng, Zhiyong Wu, Helen Meng, Qiao Tian, Yuping Wang, Yuxuan Wang:
Joint Multi-scale Cross-lingual Speaking Style Transfer with Bidirectional Attention Mechanism for Automatic Dubbing. CoRR abs/2305.05203 (2023)
[i45]
  - https://dblp.org/rec/journals/corr/abs-2305-10666
Zelin Ying, Chen Li, Yu Dong, Qiuqiang Kong, Yuanyuan Huo, Yuping Wang, Yuxuan Wang:
A unified front-end framework for English text-to-speech synthesis. CoRR abs/2305.10666 (2023)
[i44]
  - https://dblp.org/rec/journals/corr/abs-2305-11569
Siyuan Feng, Ming Tu, Rui Xia, Chuanzeng Huang, Yuxuan Wang:
Language-Universal Phonetic Representation in Multilingual Speech Pretraining for Low-Resource Speech Recognition. CoRR abs/2305.11569 (2023)
[i43]
  - https://dblp.org/rec/journals/corr/abs-2305-11576
Siyuan Feng, Ming Tu, Rui Xia, Chuanzeng Huang, Yuxuan Wang:
Language-universal phonetic encoder for low-resource speech recognition. CoRR abs/2305.11576 (2023)
[i42]
  - https://dblp.org/rec/journals/corr/abs-2305-15719
Max W. Y. Lam, Qiao Tian, Tang Li, Zongyu Yin, Siyuan Feng, Ming Tu, Yuliang Ji, Rui Xia, Mingbo Ma, Xuchen Song, Jitong Chen, Yuping Wang, Yuxuan Wang:
Efficient Neural Music Generation. CoRR abs/2305.15719 (2023)
[i41]
  - https://dblp.org/rec/journals/corr/abs-2306-02982
Qianqian Dong, Zhiying Huang, Qiao Tian, Chen Xu, Tom Ko, Yunlong Zhao, Siyuan Feng, Tang Li, Kexin Wang, Xuxin Cheng, Fengpeng Yue, Ye Bai, Xi Chen, Lu Lu, Zejun Ma, Yuping Wang, Mingxuan Wang, Yuxuan Wang:
PolyVoice: Language Models for Speech to Speech Translation. CoRR abs/2306.02982 (2023)
[i40]
  - https://dblp.org/rec/journals/corr/abs-2308-05037
Xubo Liu, Qiuqiang Kong, Yan Zhao, Haohe Liu, Yi Yuan, Yuzhuo Liu, Rui Xia, Yuxuan Wang, Mark D. Plumbley, Wenwu Wang:
Separate Anything You Describe. CoRR abs/2308.05037 (2023)
[i39]
  - https://dblp.org/rec/journals/corr/abs-2308-05734
Haohe Liu, Qiao Tian, Yi Yuan, Xubo Liu, Xinhao Mei, Qiuqiang Kong, Yuping Wang, Wenwu Wang, Yuxuan Wang, Mark D. Plumbley:
AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining. CoRR abs/2308.05734 (2023)
[i38]
  - https://dblp.org/rec/journals/corr/abs-2308-14360
Bing Han, Junyu Dai, Xuchen Song, Weituo Hao, Xinyan He, Dong Guo, Jitong Chen, Yuxuan Wang, Yanmin Qian:
InstructME: An Instruction Guided Music Edit And Remix Framework with Latent Diffusion Models. CoRR abs/2308.14360 (2023)
[i37]
  - https://dblp.org/rec/journals/corr/abs-2311-18399
Yuzhuo Liu, Xubo Liu, Yan Zhao, Yuanyuan Wang, Rui Xia, Pingchuan Tain, Yuxuan Wang:
Audio Prompt Tuning for Universal Sound Separation. CoRR abs/2311.18399 (2023)
2022
[j13]
  - https://dblp.org/rec/journals/tismir/KongLCW22
Qiuqiang Kong, Bochen Li, Jitong Chen, Yuxuan Wang:
GiantMIDI-Piano: A Large-Scale MIDI Dataset for Classical Piano Music. Trans. Int. Soc. Music. Inf. Retr. 5(1): 87-98 (2022)
[c42]
  - https://dblp.org/rec/conf/icassp/LiMWMTWW22
Jingbei Li, Yi Meng, Zhiyong Wu, Helen Meng, Qiao Tian, Yuping Wang, Yuxuan Wang:
Neufa: Neural Network Based End-to-End Forced Alignment with Bidirectional Attention Mechanism. ICASSP 2022: 8007-8011
[c41]
  - https://dblp.org/rec/conf/icassp/DaiCCTLXTWW22
Dongyang Dai, Yuanzhe Chen, Li Chen, Ming Tu, Lu Liu, Rui Xia, Qiao Tian, Yuping Wang, Yuxuan Wang:
Cloning One's Voice Using Very Limited Data in the Wild. ICASSP 2022: 8322-8326
[c40]
  - https://dblp.org/rec/conf/interspeech/ShuCSZZZHW22
Xiaofeng Shu, Yanjie Chen, Chuxiang Shang, Yan Zhao, Chengshuai Zhao, Yehang Zhu, Chuanzeng Huang, Yuxuan Wang:
Non-intrusive Speech Quality Assessment with a Multi-Task Learning based Subband Adaptive Attention Temporal Convolutional Neural Network. INTERSPEECH 2022: 3298-3302
[c39]
  - https://dblp.org/rec/conf/interspeech/LiuLKTZWHW22
Haohe Liu, Xubo Liu, Qiuqiang Kong, Qiao Tian, Yan Zhao, DeLiang Wang, Chuanzeng Huang, Yuxuan Wang:
VoiceFixer: A Unified Framework for High-Fidelity Speech Restoration. INTERSPEECH 2022: 4232-4236
[c38]
  - https://dblp.org/rec/conf/mm/LiMW0JMTWW22
Jingbei Li, Yi Meng, Xixin Wu, Zhiyong Wu, Jia Jia, Helen Meng, Qiao Tian, Yuping Wang, Yuxuan Wang:
Inferring Speaking Styles from Multi-modal Conversational Context by Multi-scale Relational Graph Convolutional Networks. ACM Multimedia 2022: 5811-5820
[i36]
  - https://dblp.org/rec/journals/corr/abs-2203-16838
Jingbei Li, Yi Meng, Zhiyong Wu, Helen Meng, Qiao Tian, Yuping Wang, Yuxuan Wang:
NeuFA: Neural Network Based End-to-End Forced Alignment with Bidirectional Attention Mechanism. CoRR abs/2203.16838 (2022)
[i35]
  - https://dblp.org/rec/journals/corr/abs-2204-05841
Haohe Liu, Xubo Liu, Qiuqiang Kong, Qiao Tian, Yan Zhao, DeLiang Wang, Chuanzeng Huang, Yuxuan Wang:
VoiceFixer: A Unified Framework for High-Fidelity Speech Restoration. CoRR abs/2204.05841 (2022)
[i34]
  - https://dblp.org/rec/journals/corr/abs-2207-06088
Zhengxi Liu, Qiao Tian, Chenxu Hu, Xudong Liu, Menglin Wu, Yuping Wang, Hang Zhao, Yuxuan Wang:
Controllable and Lossless Non-Autoregressive End-to-End Text-to-Speech. CoRR abs/2207.06088 (2022)
[i33]
  - https://dblp.org/rec/journals/corr/abs-2210-12345
Qiuqiang Kong, Shilei Liu, Junjie Shi, Xuzhou Ye, Yin Cao, Qiaoxi Zhu, Yong Xu, Yuxuan Wang:
Neural Sound Field Decomposition with Super-resolution of Sound Direction. CoRR abs/2210.12345 (2022)
[i32]
  - https://dblp.org/rec/journals/corr/abs-2210-15158
Yuanzhe Chen, Ming Tu, Tang Li, Xin Li, Qiuqiang Kong, Jiaxin Li, Zhichao Wang, Qiao Tian, Yuping Wang, Yuxuan Wang:
Streaming Voice Conversion Via Intermediate Bottleneck Features And Non-streaming Teacher Guidance. CoRR abs/2210.15158 (2022)
2021
[j12]
  - https://dblp.org/rec/journals/taslp/KongLSWW21
Qiuqiang Kong, Bochen Li, Xuchen Song, Yuan Wan, Yuxuan Wang:
High-Resolution Piano Transcription With Pedals by Regressing Onset and Offset Times. IEEE ACM Trans. Audio Speech Lang. Process. 29: 3707-3717 (2021)
[c37]
  - https://dblp.org/rec/conf/aaai/HuangWSSW21
Jiawen Huang, Ju-Chiang Wang, Jordan B. L. Smith, Xuchen Song, Yuxuan Wang:
Modeling the Compatibility of Stem Tracks to Generate Music Mashups. AAAI 2021: 187-195
[c36]
  - https://dblp.org/rec/conf/icassp/WangSCSW21
Ju-Chiang Wang, Jordan B. L. Smith, Jitong Chen, Xuchen Song, Yuxuan Wang:
Supervised Chorus Detection for Popular Music Using Convolutional Neural Network and Multi-Task Learning. ICASSP 2021: 566-570
[c35]
  - https://dblp.org/rec/conf/interspeech/KongLDCXW21
Qiuqiang Kong, Haohe Liu, Xingjian Du, Li Chen, Rui Xia, Yuxuan Wang:
Speech Enhancement with Weakly Labelled Data from AudioSet. Interspeech 2021: 191-195
[c34]
  - https://dblp.org/rec/conf/iscslp/GuYRWTZCWM21
Yu Gu, Xiang Yin, Yonghui Rao, Yuan Wan, Benlai Tang, Yang Zhang, Jitong Chen, Yuxuan Wang, Zejun Ma:
ByteSing: A Chinese Singing Voice Synthesis System Using Duration Allocated Encoder-Decoder Acoustic Models and WaveRNN Vocoders. ISCSLP 2021: 1-5
[c33]
  - https://dblp.org/rec/conf/ismir/Choi021
Keunwoo Choi, Yuxuan Wang:
Listen, Read, and Identify: Multimodal Singing Language Identification of Music. ISMIR 2021: 121-127
[c32]
  - https://dblp.org/rec/conf/ismir/KongCLC021
Qiuqiang Kong, Yin Cao, Haohe Liu, Keunwoo Choi, Yuxuan Wang:
Decoupling Magnitude and Phase Estimation with Deep ResUNet for Music Source Separation. ISMIR 2021: 342-349
[c31]
  - https://dblp.org/rec/conf/nips/HuTLWWZ21
Chenxu Hu, Qiao Tian, Tingle Li, Yuping Wang, Yuxuan Wang, Hang Zhao:
Neural Dubber: Dubbing for Videos According to Scripts. NeurIPS 2021: 16582-16595
[i31]
  - https://dblp.org/rec/journals/corr/abs-2102-09966
Xuchen Song, Qiuqiang Kong, Xingjian Du, Yuxuan Wang:
CatNet: music source separation system with mix-audio augmentation. CoRR abs/2102.09966 (2021)
[i30]
  - https://dblp.org/rec/journals/corr/abs-2102-09971
Qiuqiang Kong, Haohe Liu, Xingjian Du, Li Chen, Rui Xia, Yuxuan Wang:
Speech enhancement with weakly labelled data from AudioSet. CoRR abs/2102.09971 (2021)
[i29]
  - https://dblp.org/rec/journals/corr/abs-2103-01893
Keunwoo Choi, Yuxuan Wang:
Listen, Read, and Identify: Multimodal Singing Language Identification of Music. CoRR abs/2103.01893 (2021)
[i28]
  - https://dblp.org/rec/journals/corr/abs-2103-14208
Jiawen Huang, Ju-Chiang Wang, Jordan B. L. Smith, Xuchen Song, Yuxuan Wang:
Modeling the Compatibility of Stem Tracks to Generate Music Mashups. CoRR abs/2103.14208 (2021)
[i27]
  - https://dblp.org/rec/journals/corr/abs-2103-14253
Ju-Chiang Wang, Jordan B. L. Smith, Jitong Chen, Xuchen Song, Yuxuan Wang:
Supervised Chorus Detection for Popular Music Using Convolutional Neural Network and Multi-task Learning. CoRR abs/2103.14253 (2021)
[i26]
  - https://dblp.org/rec/journals/corr/abs-2107-09298
Xiaofeng Shu, Yehang Zhu, Yanjie Chen, Li Chen, Haohe Liu, Chuanzeng Huang, Yuxuan Wang:
Joint Echo Cancellation and Noise Suppression based on Cascaded Magnitude and Complex Mask Estimation. CoRR abs/2107.09298 (2021)
[i25]
  - https://dblp.org/rec/journals/corr/abs-2109-05418
Qiuqiang Kong, Yin Cao, Haohe Liu, Keunwoo Choi, Yuxuan Wang:
Decoupling Magnitude and Phase Estimation with Deep ResUNet for Music Source Separation. CoRR abs/2109.05418 (2021)
[i24]
  - https://dblp.org/rec/journals/corr/abs-2109-13731
Haohe Liu, Qiuqiang Kong, Qiao Tian, Yan Zhao, DeLiang Wang, Chuanzeng Huang, Yuxuan Wang:
VoiceFixer: Toward General Speech Restoration With Neural Vocoder. CoRR abs/2109.13731 (2021)
[i23]
  - https://dblp.org/rec/journals/corr/abs-2110-03347
Dongyang Dai, Yuanzhe Chen, Li Chen, Ming Tu, Lu Liu, Rui Xia, Qiao Tian, Yuping Wang, Yuxuan Wang:
Cloning one's voice using very limited data in the wild. CoRR abs/2110.03347 (2021)
[i22]
  - https://dblp.org/rec/journals/corr/abs-2110-08243
Chenxu Hu, Qiao Tian, Tingle Li, Yuping Wang, Yuxuan Wang, Hang Zhao:
Neural Dubber: Dubbing for Silent Videos According to Scripts. CoRR abs/2110.08243 (2021)
2020
[j11]
  - https://dblp.org/rec/journals/spl/YangWX20
Shan Yang, Yuxuan Wang, Lei Xie:
Adversarial Feature Learning and Unsupervised Clustering Based Speech Synthesis for Found Data With Acoustic and Textual Noise. IEEE Signal Process. Lett. 27: 1730-1734 (2020)
[j10]
  - https://dblp.org/rec/journals/taslp/KongCIWWP20
Qiuqiang Kong, Yin Cao, Turab Iqbal, Yuxuan Wang, Wenwu Wang, Mark D. Plumbley:
PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 28: 2880-2894 (2020)
[c30]
  - https://dblp.org/rec/conf/acl/XuCWCZZWCYZJWL20
Runxin Xu, Jun Cao, Mingxuan Wang, Jiaze Chen, Hao Zhou, Ying Zeng, Yuping Wang, Li Chen, Xiang Yin, Xijin Zhang, Songcheng Jiang, Yuxuan Wang, Lei Li:
Xiaomingbot: A Multilingual Robot News Reporter. ACL (demo) 2020: 1-8
[c29]
  - https://dblp.org/rec/conf/bigdataconf/FengTXWK20
Zishun Feng, Ming Tu, Rui Xia, Yuxuan Wang, Ashok K. Krishnamurthy:
Self-Supervised Audio-Visual Representation Learning for in-the-wild Videos. IEEE BigData 2020: 5671-5672
[c28]
  - https://dblp.org/rec/conf/icassp/KongWSCWP20
Qiuqiang Kong, Yuxuan Wang, Xuchen Song, Yin Cao, Wenwu Wang, Mark D. Plumbley:
Source Separation with Weakly Labelled Data: an Approach to Computational Auditory Scene Analysis. ICASSP 2020: 101-105
[c27]
  - https://dblp.org/rec/conf/icassp/PanYZLZMW20
Junjie Pan, Xiang Yin, Zhiling Zhang, Shichao Liu, Yang Zhang, Zejun Ma, Yuxuan Wang:
A Unified Sequence-to-Sequence Front-End Model for Mandarin Text-to-Speech Synthesis. ICASSP 2020: 6689-6693
[c26]
  - https://dblp.org/rec/conf/icassp/ZhangPY0LZWM20
Junhui Zhang, Junjie Pan, Xiang Yin, Chen Li, Shichao Liu, Yang Zhang, Yuxuan Wang, Zejun Ma:
A Hybrid Text Normalization System Using Multi-Head Self-Attention For Mandarin. ICASSP 2020: 6694-6698
[i21]
  - https://dblp.org/rec/journals/corr/abs-2002-02065
Qiuqiang Kong, Yuxuan Wang, Xuchen Song, Yin Cao, Wenwu Wang, Mark D. Plumbley:
Source separation with weakly labelled data: An approach to computational auditory scene analysis. CoRR abs/2002.02065 (2020)
[i20]
  - https://dblp.org/rec/journals/corr/abs-2004-11012
Yu Gu, Xiang Yin, Yonghui Rao, Yuan Wan, Benlai Tang, Yang Zhang, Jitong Chen, Yuxuan Wang, Zejun Ma:
ByteSing: A Chinese Singing Voice Synthesis System Using Duration Allocated Encoder-Decoder Acoustic Models and WaveRNN Vocoders. CoRR abs/2004.11012 (2020)
[i19]
  - https://dblp.org/rec/journals/corr/abs-2004-13595
Shan Yang, Yuxuan Wang, Lei Xie:
Adversarial Feature Learning and Unsupervised Clustering based Speech Synthesis for Found Data with Acoustic and Textual Noise. CoRR abs/2004.13595 (2020)
[i18]
  - https://dblp.org/rec/journals/corr/abs-2005-09271
Wenjie Li, Benlai Tang, Xiang Yin, Yushi Zhao, Wei Li, Kang Wang, Hao Huang, Yuxuan Wang, Zejun Ma:
Improving Accent Conversion with Reference Encoder and End-To-End Text-To-Speech. CoRR abs/2005.09271 (2020)
[i17]
  - https://dblp.org/rec/journals/corr/abs-2005-12531
Dongyang Dai, Li Chen, Yuping Wang, Mu Wang, Rui Xia, Xuchen Song, Zhiyong Wu, Yuxuan Wang:
Noise Robust TTS for Low Resource Speakers using Pre-trained Model and Speech Enhancement. CoRR abs/2005.12531 (2020)
[i16]
dblp key: journals/corr/abs-2007-08005
persistent URL: https://dblp.org/rec/journals/corr/abs-2007-08005
Runxin Xu, Jun Cao, Mingxuan Wang, Jiaze Chen, Hao Zhou, Ying Zeng, Yuping Wang, Li Chen, Xiang Yin, Xijin Zhang, Songcheng Jiang, Yuxuan Wang, Lei Li:
Xiaomingbot: A Multilingual Robot News Reporter. CoRR abs/2007.08005 (2020)
[i15]
dblp key: journals/corr/abs-2010-01815
persistent URL: https://dblp.org/rec/journals/corr/abs-2010-01815
Qiuqiang Kong, Bochen Li, Xuchen Song, Yuan Wan, Yuxuan Wang:
High-resolution Piano Transcription with Pedals by Regressing Onsets and Offsets Times. CoRR abs/2010.01815 (2020)
[i14]
dblp key: journals/corr/abs-2010-07061
persistent URL: https://dblp.org/rec/journals/corr/abs-2010-07061
Qiuqiang Kong, Bochen Li, Jitong Chen, Yuxuan Wang:
GiantMIDI-Piano: A large-scale MIDI dataset for classical piano music. CoRR abs/2010.07061 (2020)
[i13]
dblp key: journals/corr/abs-2010-14805
persistent URL: https://dblp.org/rec/journals/corr/abs-2010-14805
Qiuqiang Kong, Keunwoo Choi, Yuxuan Wang:
Large-Scale MIDI-based Composer Classification. CoRR abs/2010.14805 (2020)

2010 – 2019

2019
[c25]
dblp key: conf/asru/AnWYMX19
persistent URL: https://dblp.org/rec/conf/asru/AnWYMX19
Xiaochun An, Yuxuan Wang, Shan Yang, Zejun Ma, Lei Xie:
Learning Hierarchical Representations for Expressive Speaking Style in End-to-End Speech Synthesis. ASRU 2019: 184-191
[c24]
dblp key: conf/icassp/HsuZWCWWG19
persistent URL: https://dblp.org/rec/conf/icassp/HsuZWCWWG19
Wei-Ning Hsu, Yu Zhang, Ron J. Weiss, Yu-An Chung, Yuxuan Wang, Yonghui Wu, James R. Glass:
Disentangling Correlated Speaker and Noise for Speech Synthesis via Data Augmentation and Adversarial Factorization. ICASSP 2019: 5901-5905
[c23]
dblp key: conf/icassp/ChungWHZS19
persistent URL: https://dblp.org/rec/conf/icassp/ChungWHZS19
Yu-An Chung, Yuxuan Wang, Wei-Ning Hsu, Yu Zhang, R. J. Skerry-Ryan:
Semi-supervised Training for Improving Data Efficiency in End-to-end Speech Synthesis. ICASSP 2019: 6940-6944
[c22]
dblp key: conf/iclr/HsuZWZWWCJCSNP19
persistent URL: https://dblp.org/rec/conf/iclr/HsuZWZWWCJCSNP19
Wei-Ning Hsu, Yu Zhang, Ron J. Weiss, Heiga Zen, Yonghui Wu, Yuxuan Wang, Yuan Cao, Ye Jia, Zhifeng Chen, Jonathan Shen, Patrick Nguyen, Ruoming Pang:
Hierarchical Generative Modeling for Controllable Speech Synthesis. ICLR (Poster) 2019
[i12]
dblp key: journals/corr/abs-1911-04111
persistent URL: https://dblp.org/rec/journals/corr/abs-1911-04111
Junjie Pan, Xiang Yin, Zhiling Zhang, Shichao Liu, Yang Zhang, Zejun Ma, Yuxuan Wang:
A unified sequence-to-sequence front-end model for Mandarin text-to-speech synthesis. CoRR abs/1911.04111 (2019)
[i11]
dblp key: journals/corr/abs-1911-04128
persistent URL: https://dblp.org/rec/journals/corr/abs-1911-04128
Junhui Zhang, Junjie Pan, Xiang Yin, Chen Li, Shichao Liu, Yang Zhang, Yuxuan Wang, Zejun Ma:
A hybrid text normalization system using multi-head self-attention for mandarin. CoRR abs/1911.04128 (2019)
[i10]
dblp key: journals/corr/abs-1912-10211
persistent URL: https://dblp.org/rec/journals/corr/abs-1912-10211
Qiuqiang Kong, Yin Cao, Turab Iqbal, Yuxuan Wang, Wenwu Wang, Mark D. Plumbley:
PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition. CoRR abs/1912.10211 (2019)
2018
[c21]
dblp key: conf/icassp/ShenPWSJYCZWRSA18
persistent URL: https://dblp.org/rec/conf/icassp/ShenPWSJYCZWRSA18
Jonathan Shen, Ruoming Pang, Ron J. Weiss, Mike Schuster, Navdeep Jaitly, Zongheng Yang, Zhifeng Chen, Yu Zhang, Yuxuan Wang, R. J. Skerry-Ryan, Rif A. Saurous, Yannis Agiomyrgiannakis, Yonghui Wu:
Natural TTS Synthesis by Conditioning Wavenet on MEL Spectrogram Predictions. ICASSP 2018: 4779-4783
[c20]
dblp key: conf/icml/Skerry-RyanBXWS18
persistent URL: https://dblp.org/rec/conf/icml/Skerry-RyanBXWS18
R. J. Skerry-Ryan, Eric Battenberg, Ying Xiao, Yuxuan Wang, Daisy Stanton, Joel Shor, Ron J. Weiss, Rob Clark, Rif A. Saurous:
Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron. ICML 2018: 4700-4709
[c19]
dblp key: conf/icml/WangSZRBSXJRS18
persistent URL: https://dblp.org/rec/conf/icml/WangSZRBSXJRS18
Yuxuan Wang, Daisy Stanton, Yu Zhang, R. J. Skerry-Ryan, Eric Battenberg, Joel Shor, Ying Xiao, Ye Jia, Fei Ren, Rif A. Saurous:
Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis. ICML 2018: 5167-5176
[c18]
dblp key: conf/slt/StantonWS18
persistent URL: https://dblp.org/rec/conf/slt/StantonWS18
Daisy Stanton, Yuxuan Wang, R. J. Skerry-Ryan:
Predicting Expressive Speaking Style from Text in End-To-End Speech Synthesis. SLT 2018: 595-602
[i9]
dblp key: journals/corr/abs-1803-09017
persistent URL: https://dblp.org/rec/journals/corr/abs-1803-09017
Yuxuan Wang, Daisy Stanton, Yu Zhang, R. J. Skerry-Ryan, Eric Battenberg, Joel Shor, Ying Xiao, Fei Ren, Ye Jia, Rif A. Saurous:
Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis. CoRR abs/1803.09017 (2018)
[i8]
dblp key: journals/corr/abs-1803-09047
persistent URL: https://dblp.org/rec/journals/corr/abs-1803-09047
R. J. Skerry-Ryan, Eric Battenberg, Ying Xiao, Yuxuan Wang, Daisy Stanton, Joel Shor, Ron J. Weiss, Rob Clark, Rif A. Saurous:
Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron. CoRR abs/1803.09047 (2018)
[i7]
dblp key: journals/corr/abs-1808-01410
persistent URL: https://dblp.org/rec/journals/corr/abs-1808-01410
Daisy Stanton, Yuxuan Wang, R. J. Skerry-Ryan:
Predicting Expressive Speaking Style From Text In End-To-End Speech Synthesis. CoRR abs/1808.01410 (2018)
[i6]
dblp key: journals/corr/abs-1808-10128
persistent URL: https://dblp.org/rec/journals/corr/abs-1808-10128
Yu-An Chung, Yuxuan Wang, Wei-Ning Hsu, Yu Zhang, R. J. Skerry-Ryan:
Semi-Supervised Training for Improving Data Efficiency in End-to-End Speech Synthesis. CoRR abs/1808.10128 (2018)
[i5]
dblp key: journals/corr/abs-1810-07217
persistent URL: https://dblp.org/rec/journals/corr/abs-1810-07217
Wei-Ning Hsu, Yu Zhang, Ron J. Weiss, Heiga Zen, Yonghui Wu, Yuxuan Wang, Yuan Cao, Ye Jia, Zhifeng Chen, Jonathan Shen, Patrick Nguyen, Ruoming Pang:
Hierarchical Generative Modeling for Controllable Speech Synthesis. CoRR abs/1810.07217 (2018)
2017
[c17]
dblp key: conf/icassp/WangGHLS17
persistent URL: https://dblp.org/rec/conf/icassp/WangGHLS17
Yuxuan Wang, Pascal Getreuer, Thad Hughes, Richard F. Lyon, Rif A. Saurous:
Trainable frontend for robust and far-field keyword spotting. ICASSP 2017: 5670-5674
[c16]
dblp key: conf/interspeech/WangSSWWJYXCBLA17
persistent URL: https://dblp.org/rec/conf/interspeech/WangSSWWJYXCBLA17
Yuxuan Wang, R. J. Skerry-Ryan, Daisy Stanton, Yonghui Wu, Ron J. Weiss, Navdeep Jaitly, Zongheng Yang, Ying Xiao, Zhifeng Chen, Samy Bengio, Quoc V. Le, Yannis Agiomyrgiannakis, Rob Clark, Rif A. Saurous:
Tacotron: Towards End-to-End Speech Synthesis. INTERSPEECH 2017: 4006-4010
[i4]
dblp key: journals/corr/WangSSWWJYXCBLA17
persistent URL: https://dblp.org/rec/journals/corr/WangSSWWJYXCBLA17
Yuxuan Wang, R. J. Skerry-Ryan, Daisy Stanton, Yonghui Wu, Ron J. Weiss, Navdeep Jaitly, Zongheng Yang, Ying Xiao, Zhifeng Chen, Samy Bengio, Quoc V. Le, Yannis Agiomyrgiannakis, Rob Clark, Rif A. Saurous:
Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model. CoRR abs/1703.10135 (2017)
[i3]
dblp key: journals/corr/abs-1711-00520
persistent URL: https://dblp.org/rec/journals/corr/abs-1711-00520
Yuxuan Wang, R. J. Skerry-Ryan, Ying Xiao, Daisy Stanton, Joel Shor, Eric Battenberg, Rob Clark, Rif A. Saurous:
Uncovering Latent Style Factors for Expressive Speech Synthesis. CoRR abs/1711.00520 (2017)
[i2]
dblp key: journals/corr/abs-1712-05884
persistent URL: https://dblp.org/rec/journals/corr/abs-1712-05884
Jonathan Shen, Ruoming Pang, Ron J. Weiss, Mike Schuster, Navdeep Jaitly, Zongheng Yang, Zhifeng Chen, Yu Zhang, Yuxuan Wang, R. J. Skerry-Ryan, Rif A. Saurous, Yannis Agiomyrgiannakis, Yonghui Wu:
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. CoRR abs/1712.05884 (2017)
2016
[j9]
dblp key: journals/speech/ChenWW16
persistent URL: https://dblp.org/rec/journals/speech/ChenWW16
Jitong Chen, Yuxuan Wang, DeLiang Wang:
Noise perturbation for supervised speech separation. Speech Commun. 78: 1-10 (2016)
[j8]
dblp key: journals/taslp/WilliamsonWW16
persistent URL: https://dblp.org/rec/journals/taslp/WilliamsonWW16
Donald S. Williamson, Yuxuan Wang, DeLiang Wang:
Complex Ratio Masking for Monaural Speech Separation. IEEE ACM Trans. Audio Speech Lang. Process. 24(3): 483-492 (2016)
[c15]
dblp key: conf/icassp/WilliamsonWW16
persistent URL: https://dblp.org/rec/conf/icassp/WilliamsonWW16
Donald S. Williamson, Yuxuan Wang, DeLiang Wang:
Complex ratio masking for joint enhancement of magnitude and phase. ICASSP 2016: 5220-5224
[i1]
dblp key: journals/corr/WangGHLS16
persistent URL: https://dblp.org/rec/journals/corr/WangGHLS16
Yuxuan Wang, Pascal Getreuer, Thad Hughes, Richard F. Lyon, Rif A. Saurous:
Trainable Frontend For Robust and Far-Field Keyword Spotting. CoRR abs/1607.05666 (2016)
2015
[j7]
dblp key: journals/taslp/HanWWWMZ15
persistent URL: https://dblp.org/rec/journals/taslp/HanWWWMZ15
Kun Han, Yuxuan Wang, DeLiang Wang, William S. Woods, Ivo Merks, Tao Zhang:
Learning Spectral Mapping for Speech Dereverberation and Denoising. IEEE ACM Trans. Audio Speech Lang. Process. 23(6): 982-992 (2015)
[j6]
dblp key: journals/taslp/ZhaoWW15
persistent URL: https://dblp.org/rec/journals/taslp/ZhaoWW15
Xiaojia Zhao, Yuxuan Wang, DeLiang Wang:
Cochannel Speaker Identification in Anechoic and Reverberant Conditions. IEEE ACM Trans. Audio Speech Lang. Process. 23(11): 1727-1736 (2015)
[c14]
dblp key: conf/ica/ChenWW15
persistent URL: https://dblp.org/rec/conf/ica/ChenWW15
Jitong Chen, Yuxuan Wang, DeLiang Wang:
Noise Perturbation Improves Supervised Speech Separation. LVA/ICA 2015: 83-90
[c13]
dblp key: conf/icassp/WangW15
persistent URL: https://dblp.org/rec/conf/icassp/WangW15
Yuxuan Wang, DeLiang Wang:
A deep neural network for time-domain signal reconstruction. ICASSP 2015: 4390-4394
[c12]
dblp key: conf/icassp/ZhaoWW15
persistent URL: https://dblp.org/rec/conf/icassp/ZhaoWW15
Xiaojia Zhao, Yuxuan Wang, DeLiang Wang:
Deep neural networks for cochannel speaker identification. ICASSP 2015: 4824-4828
[c11]
dblp key: conf/icassp/WilliamsonWW15
persistent URL: https://dblp.org/rec/conf/icassp/WilliamsonWW15
Donald S. Williamson, Yuxuan Wang, DeLiang Wang:
Deep neural networks for estimating speech model activations. ICASSP 2015: 5113-5117
2014
[j5]
dblp key: journals/taslp/ZhaoWW14
persistent URL: https://dblp.org/rec/journals/taslp/ZhaoWW14
Xiaojia Zhao, Yuxuan Wang, DeLiang Wang:
Robust Speaker Identification in Noisy and Reverberant Conditions. IEEE ACM Trans. Audio Speech Lang. Process. 22(4): 836-845 (2014)
[j4]
dblp key: journals/taslp/WangNW14
persistent URL: https://dblp.org/rec/journals/taslp/WangNW14
Yuxuan Wang, Arun Narayanan, DeLiang Wang:
On training targets for supervised speech separation. IEEE ACM Trans. Audio Speech Lang. Process. 22(12): 1849-1858 (2014)
[j3]
dblp key: journals/taslp/ChenWW14
persistent URL: https://dblp.org/rec/journals/taslp/ChenWW14
Jitong Chen, Yuxuan Wang, DeLiang Wang:
A feature study for classification-based speech separation at low signal-to-noise ratios. IEEE ACM Trans. Audio Speech Lang. Process. 22(12): 1993-2002 (2014)
[c10]
dblp key: conf/icassp/ZhaoWW14
persistent URL: https://dblp.org/rec/conf/icassp/ZhaoWW14
Xiaojia Zhao, Yuxuan Wang, DeLiang Wang:
Robust speaker identification in noisy and reverberant conditions. ICASSP 2014: 3997-4001
[c9]
dblp key: conf/icassp/HanWW14
persistent URL: https://dblp.org/rec/conf/icassp/HanWW14
Kun Han, Yuxuan Wang, DeLiang Wang:
Learning spectral mapping for speech dereverberation. ICASSP 2014: 4628-4632
[c8]
dblp key: conf/icassp/WangW14
persistent URL: https://dblp.org/rec/conf/icassp/WangW14
Yuxuan Wang, DeLiang Wang:
A structure-preserving training target for supervised speech separation. ICASSP 2014: 6107-6111
[c7]
dblp key: conf/icassp/WilliamsonWW14
persistent URL: https://dblp.org/rec/conf/icassp/WilliamsonWW14
Donald S. Williamson, Yuxuan Wang, DeLiang Wang:
A two-stage approach for improving the perceptual quality of separated speech. ICASSP 2014: 7034-7038
[c6]
dblp key: conf/icassp/ChenWW14
persistent URL: https://dblp.org/rec/conf/icassp/ChenWW14
Jitong Chen, Yuxuan Wang, DeLiang Wang:
A feature study for classification-based speech separation at very low signal-to-noise ratio. ICASSP 2014: 7039-7043
2013
[j2]
dblp key: journals/taslp/WangHW13
persistent URL: https://dblp.org/rec/journals/taslp/WangHW13
Yuxuan Wang, Kun Han, DeLiang Wang:
Exploring Monaural Features for Classification-Based Speech Segregation. IEEE Trans. Speech Audio Process. 21(2): 270-279 (2013)
[j1]
dblp key: journals/taslp/WangW13
persistent URL: https://dblp.org/rec/journals/taslp/WangW13
Yuxuan Wang, DeLiang Wang:
Towards Scaling Up Classification-Based Speech Separation. IEEE Trans. Speech Audio Process. 21(7): 1381-1390 (2013)
[c5]
dblp key: conf/icassp/WilliamsonWW13
persistent URL: https://dblp.org/rec/conf/icassp/WilliamsonWW13
Donald S. Williamson, Yuxuan Wang, DeLiang Wang:
A sparse representation approach for perceptual quality improvement of separated speech. ICASSP 2013: 7015-7019
[c4]
dblp key: conf/icassp/WangW13
persistent URL: https://dblp.org/rec/conf/icassp/WangW13
Yuxuan Wang, DeLiang Wang:
Feature denoising for speech separation in unknown noisy environments. ICASSP 2013: 7472-7476
2012
[c3]
dblp key: conf/interspeech/WangW12
persistent URL: https://dblp.org/rec/conf/interspeech/WangW12
Yuxuan Wang, DeLiang Wang:
Boosting Classification Based Speech Separation Using Temporal Dynamics. INTERSPEECH 2012: 1528-1531
[c2]
dblp key: conf/interspeech/WangHW12
persistent URL: https://dblp.org/rec/conf/interspeech/WangHW12
Yuxuan Wang, Kun Han, DeLiang Wang:
Acoustic Features for Classification Based Speech Separation. INTERSPEECH 2012: 1532-1535
[c1]
dblp key: conf/nips/WangW12
persistent URL: https://dblp.org/rec/conf/nips/WangW12
Yuxuan Wang, DeLiang Wang:
Cocktail Party Processing via Structured Prediction. NIPS 2012: 224-232
