Sheng Zhao

Journal Articles

2024
[j23]
Sheng Zhao, Shengwen Huang, Huiying Wen, Weiming Liu:
Analysis of Highway Vehicle Lane Change Duration Based on Survival Model. Big Data Cogn. Comput. 8(9): 114 (2024)
[j22]
Xu Tan, Jiawei Chen, Haohe Liu, Jian Cong, Chen Zhang, Yanqing Liu, Xi Wang, Yichong Leng, Yuanhao Yi, Lei He, Sheng Zhao, Tao Qin, Frank K. Soong, Tie-Yan Liu:
NaturalSpeech: End-to-End Text-to-Speech Synthesis With Human-Level Quality. IEEE Trans. Pattern Anal. Mach. Intell. 46(6): 4234-4245 (2024)
[j21]
Liang Yan, Xiaodong Wu, Chongfeng Wei, Sheng Zhao:
Human-Vehicle Shared Steering Control for Obstacle Avoidance: A Reference-Free Approach With Reinforcement Learning. IEEE Trans. Intell. Transp. Syst. 25(11): 17888-17901 (2024)
2023
[j20]
Xiumei Xing, Cheng Ai, Tianjiao Wang, Yang Li, Huitao Liu, Pengfei Hu, Guiwu Wang, Huamiao Liu, Hongliang Wang, Ranran Zhang, Junjun Zheng, Xiaobo Wang, Lei Wang, Yuxiao Chang, Qian Qian, Jinghua Yu, Lixin Tang, Shigang Wu, Xiujuan Shao, Alun Li, Peng Cui, Wei Zhan, Sheng Zhao, Zhichao Wu, Xiqun Shao, Yimeng Dong, Min Rong, Yihong Tan, Xuezhe Cui, Shuzhuo Chang, Xingchao Song, Tongao Yang, Limin Sun, Yan Ju, Pei Zhao, Huanhuan Fan, Ying Liu, Xinhui Wang, Wanyun Yang, Min Yang, Tao Wei, Shanshan Song, Jiaping Xu, Zhigang Yue, Qiqi Liang, Chunyi Li, Jue Ruan, Fuhe Yang:
The First High-quality Reference Genome of Sika Deer Provides Insights into High-tannin Adaptation. Genom. Proteom. Bioinform. 21(1): 203-215 (2023)
[j19]
Zhenhui Zhang, Zhengjiang Zhang, Sheng Zhao, Quanfang Li, Zhihui Hong, Fuhua Li, Shipei Huang:
Robust adaptive Unscented Kalman Filter with gross error detection and identification for power system forecasting-aided state estimation. J. Frankl. Inst. 360(13): 10297-10336 (2023)
[j18]
Jun Ling, Xu Tan, Liyang Chen, Runnan Li, Yuchao Zhang, Sheng Zhao, Li Song:
StableFace: Analyzing and Improving Motion Stability for Talking Face Generation. IEEE J. Sel. Top. Signal Process. 17(6): 1232-1247 (2023)
[j17]
Hongyu Song, Wei ShangGuan, Weizhi Qiu, Sheng Zhao, Steven Harrod:
Two-Stage Optimal Trajectory Planning Based on Resilience Adjustment Model for Virtually Coupled Trains. IEEE Trans. Intell. Transp. Syst. 24(12): 15219-15235 (2023)
2022
[j16]
Xiaoqiang Wang, Yanqing Liu, Jinyu Li, Veljko Miljanic, Sheng Zhao, Hosam Khalil:
Towards Contextual Spelling Correction for Customization of End-to-End Speech Recognition Systems. IEEE ACM Trans. Audio Speech Lang. Process. 30: 3089-3097 (2022)
2021
[j15]
Liang Shu, Ziran Wu, Yingmin You, Marcelo J. Dapino, Sheng Zhao:
Design and Adaptive Control of Matrix Transformer Based Indirect Converter for Large-Capacity Circuit Breaker Testing Application. IEEE Trans. Ind. Electron. 68(6): 5314-5324 (2021)
2020
[j14]
Zi-Kai Yang, Sheng Zhao, Xiangdong Huang, Wei Lu:
Accurate Doppler radar-based heart rate measurement using matched filter. IEICE Electron. Express 17(8): 20200062 (2020)
[j13]
Zi-Kai Yang, Heping Shi, Sheng Zhao, Xiangdong Huang:
Vital Sign Detection during Large-Scale and Fast Body Movements Based on an Adaptive Noise Cancellation Algorithm Using a Single Doppler Radar Sensor. Sensors 20(15): 4183 (2020)
2019
[j12]
Huiying Wen, Jiabin Wu, Yuchen Duan, Weiwei Qi, Sheng Zhao:
A Methodology of Timing Co-Evolutionary Path Optimization for Accident Emergency Rescue Considering Future Environmental Uncertainty. IEEE Access 7: 131459-131472 (2019)
[j11]
Congru Yuan, Feng Jin, Xiuling Guo, Sheng Zhao, Wei Li, Haidong Guo:
Correlation Analysis of Breast Cancer DWI Combined with DCE-MRI Imaging Features with Molecular Subtypes and Prognostic Factors. J. Medical Syst. 43(4): 83:1-83:10 (2019)
[j10]
Jing Wu, Xi Yang, Jianmei Gao, Sheng Zhao, Liang Wang, Tianyou Luo:
Application of MRI and CT Energy Spectrum Imaging in Hand and Foot Tendon Lesions. J. Medical Syst. 43(5): 116:1-116:9 (2019)
2016
[j9]
Shuang Qiu, Nai J. Ge, Dong K. Sun, Sheng Zhao, Jian F. Sun, Zhao B. Guo, Ke Hu, Ning Gu:
Synthesis and Characterization of Magnetic Polyvinyl Alcohol (PVA) Hydrogel Microspheres for the Embolization of Blood Vessel. IEEE Trans. Biomed. Eng. 63(4): 730-736 (2016)
[j8]
Yiming Chen, Sheng Zhao, Jay A. Farrell:
Computationally Efficient Carrier Integer Ambiguity Resolution in Multiepoch GPS/INS: A Common-Position-Shift Approach. IEEE Trans. Control. Syst. Technol. 24(5): 1541-1556 (2016)
[j7]
Sheng Zhao, Yiming Chen, Jay A. Farrell:
High-Precision Vehicle Navigation in Urban Environments Using an MEM's IMU and Single-Frequency GPS Receiver. IEEE Trans. Intell. Transp. Syst. 17(10): 2854-2867 (2016)
2015
[j6]
Haibin Wang, Xia Liu, Sheng Zhao, Lina Huo:
Multi-authority E-voting System Based on Group Blind Signature. Int. J. Online Eng. 11(9): 89-93 (2015)
2012
[j5]
Sheng Zhao, Manish Kumar:
Self-Localization and Tracking of Multiple robots in Experimental setups. Int. J. Robotics Autom. 27(3) (2012)
2008
[j4]
Sheng Zhao, Qin Zhang, Xiaolin Liu, Xuemin Wang, Huilin Zhang, Yan Wu, Fei Jiang:
Analysis of synonymous codon usage in 11 Human Bocavirus isolates. Biosyst. 92(3): 207-214 (2008)
2005
[j3]
Sheng Zhao, Russell D. Fernald:
Comprehensive Algorithm for Quantitative Real-Time Polymerase Chain Reaction. J. Comput. Biol. 12(8): 1047-1064 (2005)
1996
[j2]
Cathy H. Wu, Sheng Zhao, Hsi-Lien Chen, Chin-Ju Lo, Jerry McLarty:
Motif identification neural design for rapid and sensitive protein family search. Comput. Appl. Biosci. 12(2): 109-118 (1996)
[j1]
Cathy H. Wu, Sheng Zhao, Hsi-Lien Chen:
A Protein Class Database Organized with ProSite Protein Groups and PIR Superfamilies. J. Comput. Biol. 3(4): 547-561 (1996)

Conference and Workshop Papers

2024
[c68]
Tianyu He, Junliang Guo, Runyi Yu, Yuchi Wang, Jialiang Zhu, Kaikai An, Leyi Li, Xu Tan, Chunyu Wang, Han Hu, HsiangTao Wu, Sheng Zhao, Jiang Bian:
GAIA: Zero-shot Talking Avatar Generation. ICLR 2024
[c67]
Yichong Leng, Zhifang Guo, Kai Shen, Zeqian Ju, Xu Tan, Eric Liu, Yufei Liu, Dongchao Yang, Leying Zhang, Kaitao Song, Lei He, Xiangyang Li, Sheng Zhao, Tao Qin, Jiang Bian:
PromptTTS 2: Describing and Generating Voices with Text Prompt. ICLR 2024
[c66]
Kai Shen, Zeqian Ju, Xu Tan, Eric Liu, Yichong Leng, Lei He, Tao Qin, Sheng Zhao, Jiang Bian:
NaturalSpeech 2: Latent Diffusion Models are Natural and Zero-Shot Speech and Singing Synthesizers. ICLR 2024
[c65]
Zeqian Ju, Yuancheng Wang, Kai Shen, Xu Tan, Detai Xin, Dongchao Yang, Eric Liu, Yichong Leng, Kaitao Song, Siliang Tang, Zhizheng Wu, Tao Qin, Xiangyang Li, Wei Ye, Shikun Zhang, Jiang Bian, Lei He, Jinyu Li, Sheng Zhao:
NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models. ICML 2024
[c64]
Dongchao Yang, Jinchuan Tian, Xu Tan, Rongjie Huang, Songxiang Liu, Haohan Guo, Xuankai Chang, Jiatong Shi, Sheng Zhao, Jiang Bian, Zhou Zhao, Xixin Wu, Helen M. Meng:
UniAudio: Towards Universal Audio Generation with Large Language Models. ICML 2024
[c63]
Yujia Xiao, Xi Wang, Xu Tan, Lei He, Xinfa Zhu, Sheng Zhao, Tan Lee:
Contrastive Context-Speech Pretraining for Expressive Text-to-Speech Synthesis. ACM Multimedia 2024: 2099-2107
[c62]
Xinfa Zhu, Wenjie Tian, Xinsheng Wang, Lei He, Yujia Xiao, Xi Wang, Xu Tan, Sheng Zhao, Lei Xie:
UniStyle: Unified Style Modeling for Speaking Style Captioning and Stylistic Speech Synthesis. ACM Multimedia 2024: 7513-7522
2023
[c61]
Yihan Wu, Junliang Guo, Xu Tan, Chen Zhang, Bohan Li, Ruihua Song, Lei He, Sheng Zhao, Arul Menezes, Jiang Bian:
VideoDubber: Machine Translation with Speech-Aware Length Control for Video Dubbing. AAAI 2023: 13772-13779
[c60]
Zhihang Xu, Shaofei Zhang, Xi Wang, Jiajun Zhang, Wenning Wei, Lei He, Sheng Zhao:
MuLanTTS: The Microsoft Speech Synthesis System for Blizzard Challenge 2023. Blizzard Challenge 2023
[c59]
Zhifang Guo, Yichong Leng, Yihan Wu, Sheng Zhao, Xu Tan:
Prompttts: Controllable Text-To-Speech With Text Descriptions. ICASSP 2023: 1-5
[c58]
Xiaoqiang Wang, Yanqing Liu, Jinyu Li, Sheng Zhao:
Improving Contextual Spelling Correction by External Acoustics Attention and Semantic Aware Data Augmentation. ICASSP 2023: 1-5
[c57]
Chen Zhang, Shubham Bansal, Aakash Lakhera, Jinzhu Li, Gang Wang, Sandeepkumar Satpal, Sheng Zhao, Lei He:
LeanSpeech: The Microsoft Lightweight Speech Synthesis System for Limmits Challenge 2023. ICASSP 2023: 1-2
[c56]
Zenghao Chai, Tianke Zhang, Tianyu He, Xu Tan, Tadas Baltrusaitis, HsiangTao Wu, Runnan Li, Sheng Zhao, Chun Yuan, Jiang Bian:
HiFace: High-Fidelity 3D Face Reconstruction by Learning Static and Dynamic Details. ICCV 2023: 9053-9064
[c55]
Liyang Chen, Zhiyong Wu, Runnan Li, Weihong Bao, Jun Ling, Xu Tan, Sheng Zhao:
VAST: Vivify Your Talking Avatar via Zero-Shot Expressive Facial Style Transfer. ICCV (Workshops) 2023: 2969-2979
[c54]
Brendan Walsh, Mark Hamilton, Greg Newby, Xi Wang, Serena Ruan, Sheng Zhao, Lei He, Shaofei Zhang, Eric Dettinger, William T. Freeman, Markus Weimer:
Large-Scale Automatic Audiobook Creation. INTERSPEECH 2023: 3675-3676
[c53]
Yujia Xiao, Shaofei Zhang, Xi Wang, Xu Tan, Lei He, Sheng Zhao, Frank K. Soong, Tan Lee:
ContextSpeech: Expressive and Efficient Text-to-Speech for Paragraph Reading. INTERSPEECH 2023: 4883-4887
[c52]
Chenpeng Du, Qi Chen, Tianyu He, Xu Tan, Xie Chen, Kai Yu, Sheng Zhao, Jiang Bian:
DAE-Talker: High Fidelity Speech-Driven Talking Face Generation with Diffusion Autoencoder. ACM Multimedia 2023: 4281-4289
[c51]
Yuancheng Wang, Zeqian Ju, Xu Tan, Lei He, Zhizheng Wu, Jiang Bian, Sheng Zhao:
AUDIT: Audio Editing by Following Instructions with Latent Diffusion Models. NeurIPS 2023
2022
[c50]
Guangyan Zhang, Yichong Leng, Daxin Tan, Ying Qin, Kaitao Song, Xu Tan, Sheng Zhao, Tan Lee:
A Study on the Efficacy of Model Pre-Training In Developing Neural Text-to-Speech System. ICASSP 2022: 6087-6091
[c49]
Liyang Chen, Zhiyong Wu, Jun Ling, Runnan Li, Xu Tan, Sheng Zhao:
Transformer-S2A: Robust and Efficient Speech-to-Animation. ICASSP 2022: 7247-7251
[c48]
Zehua Chen, Xu Tan, Ke Wang, Shifeng Pan, Danilo P. Mandic, Lei He, Sheng Zhao:
Infergrad: Improving Diffusion Models for Vocoder by Considering Inference in Training. ICASSP 2022: 8432-8436
[c47]
Guangyan Zhang, Kaitao Song, Xu Tan, Daxin Tan, Yuzi Yan, Yanqing Liu, Gang Wang, Wei Zhou, Tao Qin, Tan Lee, Sheng Zhao:
Mixed-Phoneme BERT: Improving BERT with Mixed Phoneme and Sup-Phoneme Representations for Text to Speech. INTERSPEECH 2022: 456-460
[c46]
Dacheng Yin, Chuanxin Tang, Yanqing Liu, Xiaoqiang Wang, Zhiyuan Zhao, Yucheng Zhao, Zhiwei Xiong, Sheng Zhao, Chong Luo:
RetrieverTTS: Modeling Decomposed Factors for Text-Based Speech Insertion. INTERSPEECH 2022: 1571-1575
[c45]
Yanqing Liu, Ruiqing Xue, Lei He, Xu Tan, Sheng Zhao:
DelightfulTTS 2: End-to-End Speech Synthesis with Adversarial Vector-Quantized Auto-Encoders. INTERSPEECH 2022: 1581-1585
[c44]
Yihan Wu, Xu Tan, Bohan Li, Lei He, Sheng Zhao, Ruihua Song, Tao Qin, Tie-Yan Liu:
AdaSpeech 4: Adaptive Text to Speech in Zero-Shot Scenarios. INTERSPEECH 2022: 2568-2572
[c43]
Peiling Lu, Xu Tan, Botao Yu, Tao Qin, Sheng Zhao, Tie-Yan Liu:
MeloForm: Generating Melody with Musical Form based on Expert Systems and Neural Networks. ISMIR 2022: 567-574
[c42]
Yichong Leng, Zehua Chen, Junliang Guo, Haohe Liu, Jiawei Chen, Xu Tan, Danilo P. Mandic, Lei He, Xiangyang Li, Tao Qin, Sheng Zhao, Tie-Yan Liu:
BinauralGrad: A Two-Stage Conditional Diffusion Probabilistic Model for Binaural Audio Synthesis. NeurIPS 2022
2021
[c41]
Yanqing Liu, Zhihang Xu, Gang Wang, Kuan Chen, Bohan Li, Xu Tan, Jinzhu Li, Lei He, Sheng Zhao:
DelightfulTTS: The Microsoft Speech Synthesis System for Blizzard Challenge 2021. Blizzard Challenge 2021
[c40]
Yichong Leng, Xu Tan, Sheng Zhao, Frank K. Soong, Xiang-Yang Li, Tao Qin:
MBNET: MOS Prediction for Synthesized Speech with Mean-Bias Network. ICASSP 2021: 391-395
[c39]
Renqian Luo, Xu Tan, Rui Wang, Tao Qin, Jinzhu Li, Sheng Zhao, Enhong Chen, Tie-Yan Liu:
Lightspeech: Lightweight and Fast Text to Speech with Neural Architecture Search. ICASSP 2021: 5699-5703
[c38]
Yuzi Yan, Xu Tan, Bohan Li, Tao Qin, Sheng Zhao, Yuan Shen, Tie-Yan Liu:
Adaspeech 2: Adaptive Text to Speech with Untranscribed Data. ICASSP 2021: 6613-6617
[c37]
Chen Zhang, Yi Ren, Xu Tan, Jinglin Liu, Kejun Zhang, Tao Qin, Sheng Zhao, Tie-Yan Liu:
Denoispeech: Denoising Text to Speech with Frame-Level Noise Modeling. ICASSP 2021: 7063-7067
[c36]
Yi Ren, Chenxu Hu, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu:
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech. ICLR 2021
[c35]
Mingjian Chen, Xu Tan, Bohan Li, Yanqing Liu, Tao Qin, Sheng Zhao, Tie-Yan Liu:
AdaSpeech: Adaptive Text to Speech for Custom Voice. ICLR 2021
[c34]
Xiaoqiang Wang, Yanqing Liu, Sheng Zhao, Jinyu Li:
A Light-Weight Contextual Spelling Correction Model for Customizing Transducer-Based Speech Recognition Systems. Interspeech 2021: 1982-1986
[c33]
Yuzi Yan, Xu Tan, Bohan Li, Guangyan Zhang, Tao Qin, Sheng Zhao, Yuan Shen, Wei-Qiang Zhang, Tie-Yan Liu:
Adaptive Text to Speech for Spontaneous Style. Interspeech 2021: 4668-4672
2020
[c32]
Naihan Li, Yanqing Liu, Yu Wu, Shujie Liu, Sheng Zhao, Ming Liu:
RobuTrans: A Robust Transformer-Based Text-to-Speech Model. AAAI 2020: 8228-8235
[c31]
Yi Ren, Jinglin Liu, Xu Tan, Zhou Zhao, Sheng Zhao, Tie-Yan Liu:
A Study of Non-autoregressive Model for Sequence Generation. ACL 2020: 149-159
[c30]
Chengyi Wang, Yu Wu, Yujiao Du, Jinyu Li, Shujie Liu, Liang Lu, Shuo Ren, Guoli Ye, Sheng Zhao, Ming Zhou:
Semantic Mask for Transformer Based End-to-End Speech Recognition. INTERSPEECH 2020: 971-975
[c29]
Xiangyu Liang, Zhiyong Wu, Runnan Li, Yanqing Liu, Sheng Zhao, Helen Meng:
Enhancing Monotonicity for Robust Autoregressive Transformer TTS. INTERSPEECH 2020: 3181-3185
[c28]
Jinyu Li, Rui Zhao, Zhong Meng, Yanqing Liu, Wenning Wei, Sarangarajan Parthasarathy, Vadim Mazalov, Zhenghao Wang, Lei He, Sheng Zhao, Yifan Gong:
Developing RNN-T Models Surpassing High-Performance Hybrid Models with Customization Capability. INTERSPEECH 2020: 3590-3594
[c27]
Naihan Li, Shujie Liu, Yanqing Liu, Sheng Zhao, Ming Liu, Ming Zhou:
MoBoAligner: A Neural Alignment Model for Non-Autoregressive TTS with Monotonic Boundary Search. INTERSPEECH 2020: 3999-4003
[c26]
Mingjian Chen, Xu Tan, Yi Ren, Jin Xu, Hao Sun, Sheng Zhao, Tao Qin:
MultiSpeech: Multi-Speaker Text to Speech with Transformer. INTERSPEECH 2020: 4024-4028
[c25]
Jin Xu, Xu Tan, Yi Ren, Tao Qin, Jian Li, Sheng Zhao, Tie-Yan Liu:
LRSpeech: Extremely Low-Resource Speech Synthesis and Recognition. KDD 2020: 2802-2812
2019
[c24]
Naihan Li, Shujie Liu, Yanqing Liu, Sheng Zhao, Ming Liu:
Neural Speech Synthesis with Transformer Network. AAAI 2019: 6706-6713
[c23]
Hao Sun, Xu Tan, Jun-Wei Gan, Sheng Zhao, Dongxu Han, Hongzhi Liu, Tao Qin, Tie-Yan Liu:
Knowledge Distillation from Bert in Pre-Training and Fine-Tuning for Polyphone Disambiguation. ASRU 2019: 168-175
[c22]
Runnan Li, Zhiyong Wu, Jia Jia, Sheng Zhao, Helen Meng:
Dilated Residual Network with Multi-head Self-attention for Speech Emotion Recognition. ICASSP 2019: 6675-6679
[c21]
Yi Ren, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu:
Almost Unsupervised Text to Speech and Automatic Speech Recognition. ICML 2019: 5410-5419
[c20]
Runnan Li, Zhiyong Wu, Jia Jia, Yaohua Bu, Sheng Zhao, Helen Meng:
Towards Discriminative Representation Learning for Speech Emotion Recognition. IJCAI 2019: 5060-5066
[c19]
Hao Sun, Xu Tan, Jun-Wei Gan, Hongzhi Liu, Sheng Zhao, Tao Qin, Tie-Yan Liu:
Token-Level Ensemble Distillation for Grapheme-to-Phoneme Conversion. INTERSPEECH 2019: 2115-2119
[c18]
Hongyu Song, Wei ShangGuan, Bai-gen Cai, Sheng Zhao:
A Resilience Adjustment Method for Real-time Cooperative Optimization of High-speed Trains. ITSC 2019: 3194-3199
[c17]
Yi Ren, Yangjun Ruan, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu:
FastSpeech: Fast, Robust and Controllable Text to Speech. NeurIPS 2019: 3165-3174
2018
[c16]
  - https://dblp.org/rec/conf/icce-berlin/KarthikRDPMZVM18
Shravan Karthik, Karthik Ramanan, Nikhil Devshatwar, Subhajit Paul, Vishal Mahaveer, Sheng Zhao, Manoj Vishwanathan, Chetan Matad:
Hypervisor based approach for integrated cockpit solutions. ICCE-Berlin 2018: 1-6
[c15]
  - https://dblp.org/rec/conf/itsc/ZhaoCW18
Sheng Zhao, Bai-gen Cai, Wei ShangGuan:
A Two-stage Method to Optimise Driving Strategy and Timetable for High-speed Trains. ITSC 2018: 2283-2288
2014
[c14]
  - https://dblp.org/rec/conf/cdc/ChenZZF14
Yiming Chen, Sheng Zhao, Dongfang Zheng, Jay A. Farrell:
High reliability integer ambiguity resolution of 6DOF RTK GPS/INS. CDC 2014: 6609-6614
2013
[c13]
  - https://dblp.org/rec/conf/IEEEcca/ZhaoF13
Sheng Zhao, Jay A. Farrell:
2D LIDAR Aided INS for vehicle positioning in urban environments. CCA 2013: 376-381
[c12]
  - https://dblp.org/rec/conf/amcc/ZhaoDF13
Sheng Zhao, Wenjie Dong, Jay A. Farrell:
Quaternion-based trajectory tracking control of VTOL-UAVs using command filtered backstepping. ACC 2013: 1018-1023
2012
[c11]
  - https://dblp.org/rec/conf/cgc/ZhaoXCLM12
Sheng Zhao, Feng Xia, Zhen Chen, Zhen Li, Jianhua Ma:
MobiMsg: A Resource-Efficient Location-Based Mobile Instant Messaging System. CGC 2012: 466-471
[c10]
  - https://dblp.org/rec/conf/interspeech/HeQSZ12
Ji He, Yao Qian, Frank K. Soong, Sheng Zhao:
Turning a Monolingual Speaker into Multilingual for a Mixed-language TTS. INTERSPEECH 2012: 963-966
[c9]
  - https://dblp.org/rec/conf/trustcom/LuJLZM12
Kun Lu, Hua Jiang, Mingchu Li, Sheng Zhao, Jianhua Ma:
Resources Collaborative Scheduling Model Based on Trust Mechanism in Cloud. TrustCom 2012: 863-868
2011
[c8]
  - https://dblp.org/rec/conf/amcc/ZhaoRK11
Sheng Zhao, Subramanian Ramakrishnan, Manish Kumar:
Density-based control of multiple robots. ACC 2011: 481-486
[c7]
  - https://dblp.org/rec/conf/cdc/ZhaoF11
Sheng Zhao, Jay A. Farrell:
Optimization-based road curve fitting. CDC/ECC 2011: 5293-5298
2010
[c6]
  - https://dblp.org/rec/conf/amcc/ZhaoK10
Sheng Zhao, Manish Kumar:
A novel way to implement self-localization in a multi-robot experimental platform. ACC 2010: 6834-6839
2003
[c5]
  - https://dblp.org/rec/conf/icassp/ZhaoTJ03
Sheng Zhao, Jianhua Tao, DanLing Jiang:
Chinese prosodic phrasing with extended features. ICASSP (1) 2003: 492-495
2002
[c4]
  - https://dblp.org/rec/conf/acl-sighan/ZhaoTC02
Sheng Zhao, Jianhua Tao, Lianhong Cai:
Learning Rules for Chinese Prosodic Phrase Prediction. SIGHAN@COLING 2002
[c3]
  - https://dblp.org/rec/conf/interspeech/ZhaoTC02
Sheng Zhao, Jianhua Tao, Lianhong Cai:
Prosodic phrasing with inductive learning. INTERSPEECH 2002: 2417-2420
[c2]
  - https://dblp.org/rec/conf/iscslp/0001ZC02
Jianhua Tao, Sheng Zhao, Lian-Hong Cai:
Automatic stress prediction of Chinese speech synthesis. ISCSLP 2002
1997
[c1]
  - https://dblp.org/rec/conf/icnn/WuZSS97
Cathy H. Wu, Sheng Zhao, Kevin Simmons, Sailaja Shivakumar:
Motif neural network design for large-scale protein family identification. ICNN 1997: 86-89

Informal and Other Publications

2024
[i64]
  - https://dblp.org/rec/journals/corr/abs-2402-07383
Naoyuki Kanda, Xiaofei Wang, Sefik Emre Eskimez, Manthan Thakker, Hemin Yang, Zirun Zhu, Min Tang, Canrun Li, Chung-Hsien Tsai, Zhen Xiao, Yufei Xia, Jinzhu Li, Yanqing Liu, Sheng Zhao, Michael Zeng:
Making Flow-Matching-Based Zero-Shot Text-to-Speech Laugh as You Like. CoRR abs/2402.07383 (2024)
[i63]
  - https://dblp.org/rec/journals/corr/abs-2403-03100
Zeqian Ju, Yuancheng Wang, Kai Shen, Xu Tan, Detai Xin, Dongchao Yang, Yanqing Liu, Yichong Leng, Kaitao Song, Siliang Tang, Zhizheng Wu, Tao Qin, Xiang-Yang Li, Wei Ye, Shikun Zhang, Jiang Bian, Lei He, Jinyu Li, Sheng Zhao:
NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models. CoRR abs/2403.03100 (2024)
[i62]
  - https://dblp.org/rec/journals/corr/abs-2404-03204
Detai Xin, Xu Tan, Kai Shen, Zeqian Ju, Dongchao Yang, Yuancheng Wang, Shinnosuke Takamichi, Hiroshi Saruwatari, Shujie Liu, Jinyu Li, Sheng Zhao:
RALL-E: Robust Codec Language Modeling with Chain-of-Thought Prompting for Text-to-Speech Synthesis. CoRR abs/2404.03204 (2024)
[i61]
  - https://dblp.org/rec/journals/corr/abs-2404-06690
Leying Zhang, Yao Qian, Long Zhou, Shujie Liu, Dongmei Wang, Xiaofei Wang, Midia Yousefi, Yanmin Qian, Jinyu Li, Lei He, Sheng Zhao, Michael Zeng:
CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations. CoRR abs/2404.06690 (2024)
[i60]
  - https://dblp.org/rec/journals/corr/abs-2405-17809
Chenyang Le, Yao Qian, Dongmei Wang, Long Zhou, Shujie Liu, Xiaofei Wang, Midia Yousefi, Yanmin Qian, Jinyu Li, Sheng Zhao, Michael Zeng:
TransVIP: Speech to Speech Translation System with Voice and Isochrony Preservation. CoRR abs/2405.17809 (2024)
[i59]
  - https://dblp.org/rec/journals/corr/abs-2406-05370
Sanyuan Chen, Shujie Liu, Long Zhou, Yanqing Liu, Xu Tan, Jinyu Li, Sheng Zhao, Yao Qian, Furu Wei:
VALL-E 2: Neural Codec Language Models are Human Parity Zero-Shot Text to Speech Synthesizers. CoRR abs/2406.05370 (2024)
[i58]
  - https://dblp.org/rec/journals/corr/abs-2406-05699
Xiaofei Wang, Sefik Emre Eskimez, Manthan Thakker, Hemin Yang, Zirun Zhu, Min Tang, Yufei Xia, Jinzhu Li, Sheng Zhao, Jinyu Li, Naoyuki Kanda:
An Investigation of Noise Robustness for Flow-Matching-Based Zero-Shot TTS. CoRR abs/2406.05699 (2024)
[i57]
  - https://dblp.org/rec/journals/corr/abs-2406-07855
Bing Han, Long Zhou, Shujie Liu, Sanyuan Chen, Lingwei Meng, Yanming Qian, Yanqing Liu, Sheng Zhao, Jinyu Li, Furu Wei:
VALL-E R: Robust and Efficient Zero-Shot Text-to-Speech Synthesis via Monotonic Alignment. CoRR abs/2406.07855 (2024)
[i56]
  - https://dblp.org/rec/journals/corr/abs-2406-18009
Sefik Emre Eskimez, Xiaofei Wang, Manthan Thakker, Canrun Li, Chung-Hsien Tsai, Zhen Xiao, Hemin Yang, Zirun Zhu, Min Tang, Xu Tan, Yanqing Liu, Sheng Zhao, Naoyuki Kanda:
E2 TTS: Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS. CoRR abs/2406.18009 (2024)
[i55]
  - https://dblp.org/rec/journals/corr/abs-2407-08551
Lingwei Meng, Long Zhou, Shujie Liu, Sanyuan Chen, Bing Han, Shujie Hu, Yanqing Liu, Jinyu Li, Sheng Zhao, Xixin Wu, Helen Meng, Furu Wei:
Autoregressive Speech Synthesis without Vector Quantization. CoRR abs/2407.08551 (2024)
[i54]
  - https://dblp.org/rec/journals/corr/abs-2407-12229
Haibin Wu, Xiaofei Wang, Sefik Emre Eskimez, Manthan Thakker, Daniel Tompkins, Chung-Hsien Tsai, Canrun Li, Zhen Xiao, Sheng Zhao, Jinyu Li, Naoyuki Kanda:
Laugh Now Cry Later: Controlling Time-Varying Emotional States of Flow-Matching-Based Zero-Shot Text-to-Speech. CoRR abs/2407.12229 (2024)
[i53]
  - https://dblp.org/rec/journals/corr/abs-2409-04016
Jiaqi Li, Dongmei Wang, Xiaofei Wang, Yao Qian, Long Zhou, Shujie Liu, Midia Yousefi, Canrun Li, Chung-Hsien Tsai, Zhen Xiao, Yanqing Liu, Junkun Chen, Sheng Zhao, Jinyu Li, Zhizheng Wu, Michael Zeng:
Investigating Neural Audio Codecs for Speech Language Model-Based Speech Generation. CoRR abs/2409.04016 (2024)
2023
[i52]
  - https://dblp.org/rec/journals/corr/abs-2301-02111
Chengyi Wang, Sanyuan Chen, Yu Wu, Ziqiang Zhang, Long Zhou, Shujie Liu, Zhuo Chen, Yanqing Liu, Huaming Wang, Jinyu Li, Lei He, Sheng Zhao, Furu Wei:
Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers. CoRR abs/2301.02111 (2023)
[i51]
  - https://dblp.org/rec/journals/corr/abs-2302-11192
Xiaoqiang Wang, Yanqing Liu, Jinyu Li, Sheng Zhao:
Improving Contextual Spelling Correction by External Acoustics Attention and Semantic Aware Data Augmentation. CoRR abs/2302.11192 (2023)
[i50]
  - https://dblp.org/rec/journals/corr/abs-2303-02939
Ruiqing Xue, Yanqing Liu, Lei He, Xu Tan, Linquan Liu, Edward Lin, Sheng Zhao:
FoundationTTS: Text-to-Speech for ASR Customization with Generative Language Model. CoRR abs/2303.02939 (2023)
[i49]
  - https://dblp.org/rec/journals/corr/abs-2303-03926
Ziqiang Zhang, Long Zhou, Chengyi Wang, Sanyuan Chen, Yu Wu, Shujie Liu, Zhuo Chen, Yanqing Liu, Huaming Wang, Jinyu Li, Lei He, Sheng Zhao, Furu Wei:
Speak Foreign Languages with Your Own Voice: Cross-Lingual Neural Codec Language Modeling. CoRR abs/2303.03926 (2023)
[i48]
  - https://dblp.org/rec/journals/corr/abs-2303-11225
Zenghao Chai, Tianke Zhang, Tianyu He, Xu Tan, Tadas Baltrusaitis, HsiangTao Wu, Runnan Li, Sheng Zhao, Chun Yuan, Jiang Bian:
HiFace: High-Fidelity 3D Face Reconstruction by Learning Static and Dynamic Details. CoRR abs/2303.11225 (2023)
[i47]
  - https://dblp.org/rec/journals/corr/abs-2303-17550
Chenpeng Du, Qi Chen, Tianyu He, Xu Tan, Xie Chen, Kai Yu, Sheng Zhao, Jiang Bian:
DAE-Talker: High Fidelity Speech-Driven Talking Face Generation with Diffusion Autoencoder. CoRR abs/2303.17550 (2023)
[i46]
  - https://dblp.org/rec/journals/corr/abs-2304-00830
Yuancheng Wang, Zeqian Ju, Xu Tan, Lei He, Zhizheng Wu, Jiang Bian, Sheng Zhao:
AUDIT: Audio Editing by Following Instructions with Latent Diffusion Models. CoRR abs/2304.00830 (2023)
[i45]
  - https://dblp.org/rec/journals/corr/abs-2304-09116
Kai Shen, Zeqian Ju, Xu Tan, Yanqing Liu, Yichong Leng, Lei He, Tao Qin, Sheng Zhao, Jiang Bian:
NaturalSpeech 2: Latent Diffusion Models are Natural and Zero-Shot Speech and Singing Synthesizers. CoRR abs/2304.09116 (2023)
[i44]
  - https://dblp.org/rec/journals/corr/abs-2307-00729
Sheng Zhao, Qilong Yuan, Yibo Duan, Zhuoyue Chen:
An End-to-End Multi-Module Audio Deepfake Generation System for ADD Challenge 2023. CoRR abs/2307.00729 (2023)
[i43]
  - https://dblp.org/rec/journals/corr/abs-2307-00782
Yujia Xiao, Shaofei Zhang, Xi Wang, Xu Tan, Lei He, Sheng Zhao, Frank K. Soong, Tan Lee:
ContextSpeech: Expressive and Efficient Text-to-Speech for Paragraph Reading. CoRR abs/2307.00782 (2023)
[i42]
  - https://dblp.org/rec/journals/corr/abs-2307-14591
Junchao Huang, Xiaoqi He, Sheng Zhao:
The detection and rectification for identity-switch based on unfalsified control. CoRR abs/2307.14591 (2023)
[i41]
  - https://dblp.org/rec/journals/corr/abs-2308-04830
Liyang Chen, Zhiyong Wu, Runnan Li, Weihong Bao, Jun Ling, Xu Tan, Sheng Zhao:
VAST: Vivify Your Talking Avatar via Zero-Shot Expressive Facial Style Transfer. CoRR abs/2308.04830 (2023)
[i40]
  - https://dblp.org/rec/journals/corr/abs-2309-02285
Yichong Leng, Zhifang Guo, Kai Shen, Xu Tan, Zeqian Ju, Yanqing Liu, Yufei Liu, Dongchao Yang, Leying Zhang, Kaitao Song, Lei He, Xiang-Yang Li, Sheng Zhao, Tao Qin, Jiang Bian:
PromptTTS 2: Describing and Generating Voices with Text Prompt. CoRR abs/2309.02285 (2023)
[i39]
  - https://dblp.org/rec/journals/corr/abs-2309-02743
Zhihang Xu, Shaofei Zhang, Xi Wang, Jiajun Zhang, Wenning Wei, Lei He, Sheng Zhao:
MuLanTTS: The Microsoft Speech Synthesis System for Blizzard Challenge 2023. CoRR abs/2309.02743 (2023)
[i38]
  - https://dblp.org/rec/journals/corr/abs-2309-03926
Brendan Walsh, Mark Hamilton, Greg Newby, Xi Wang, Serena Ruan, Sheng Zhao, Lei He, Shaofei Zhang, Eric Dettinger, William T. Freeman, Markus Weimer:
Large-Scale Automatic Audiobook Creation. CoRR abs/2309.03926 (2023)
[i37]
  - https://dblp.org/rec/journals/corr/abs-2310-00704
Dongchao Yang, Jinchuan Tian, Xu Tan, Rongjie Huang, Songxiang Liu, Xuankai Chang, Jiatong Shi, Sheng Zhao, Jiang Bian, Xixin Wu, Zhou Zhao, Shinji Watanabe, Helen Meng:
UniAudio: An Audio Foundation Model Toward Universal Audio Generation. CoRR abs/2310.00704 (2023)
[i36]
  - https://dblp.org/rec/journals/corr/abs-2311-15230
Tianyu He, Junliang Guo, Runyi Yu, Yuchi Wang, Jialiang Zhu, Kaikai An, Leyi Li, Xu Tan, Chunyu Wang, Han Hu, HsiangTao Wu, Sheng Zhao, Jiang Bian:
GAIA: Zero-shot Talking Avatar Generation. CoRR abs/2311.15230 (2023)
2022
[i35]
  - https://dblp.org/rec/journals/corr/abs-2202-03751
Zehua Chen, Xu Tan, Ke Wang, Shifeng Pan, Danilo P. Mandic, Lei He, Sheng Zhao:
InferGrad: Improving Diffusion Models for Vocoder by Considering Inference in Training. CoRR abs/2202.03751 (2022)
[i34]
  - https://dblp.org/rec/journals/corr/abs-2203-00888
Xiaoqiang Wang, Yanqing Liu, Jinyu Li, Veljko Miljanic, Sheng Zhao, Hosam Khalil:
Towards Contextual Spelling Correction for Customization of End-to-end Speech Recognition Systems. CoRR abs/2203.00888 (2022)
[i33]
  - https://dblp.org/rec/journals/corr/abs-2203-17190
Guangyan Zhang, Kaitao Song, Xu Tan, Daxin Tan, Yuzi Yan, Yanqing Liu, Gang Wang, Wei Zhou, Tao Qin, Tan Lee, Sheng Zhao:
Mixed-Phoneme BERT: Improving BERT with Mixed Phoneme and Sup-Phoneme Representations for Text to Speech. CoRR abs/2203.17190 (2022)
[i32]
  - https://dblp.org/rec/journals/corr/abs-2204-00436
Yihan Wu, Xu Tan, Bohan Li, Lei He, Sheng Zhao, Ruihua Song, Tao Qin, Tie-Yan Liu:
AdaSpeech 4: Adaptive Text to Speech in Zero-Shot Scenarios. CoRR abs/2204.00436 (2022)
[i31]
  - https://dblp.org/rec/journals/corr/abs-2205-04421
Xu Tan, Jiawei Chen, Haohe Liu, Jian Cong, Chen Zhang, Yanqing Liu, Xi Wang, Yichong Leng, Yuanhao Yi, Lei He, Frank K. Soong, Tao Qin, Sheng Zhao, Tie-Yan Liu:
NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality. CoRR abs/2205.04421 (2022)
[i30]
  - https://dblp.org/rec/journals/corr/abs-2205-14807
Yichong Leng, Zehua Chen, Junliang Guo, Haohe Liu, Jiawei Chen, Xu Tan, Danilo P. Mandic, Lei He, Xiang-Yang Li, Tao Qin, Sheng Zhao, Tie-Yan Liu:
BinauralGrad: A Two-Stage Conditional Diffusion Probabilistic Model for Binaural Audio Synthesis. CoRR abs/2205.14807 (2022)
[i29]
  - https://dblp.org/rec/journals/corr/abs-2206-13865
Dacheng Yin, Chuanxin Tang, Yanqing Liu, Xiaoqiang Wang, Zhiyuan Zhao, Yucheng Zhao, Zhiwei Xiong, Sheng Zhao, Chong Luo:
RetrieverTTS: Modeling Decomposed Factors for Text-Based Speech Insertion. CoRR abs/2206.13865 (2022)
[i28]
  - https://dblp.org/rec/journals/corr/abs-2207-04646
Yanqing Liu, Ruiqing Xue, Lei He, Xu Tan, Sheng Zhao:
DelightfulTTS 2: End-to-End Speech Synthesis with Adversarial Vector-Quantized Auto-Encoders. CoRR abs/2207.04646 (2022)
[i27]
  - https://dblp.org/rec/journals/corr/abs-2208-13717
Jun Ling, Xu Tan, Liyang Chen, Runnan Li, Yuchao Zhang, Sheng Zhao, Li Song:
StableFace: Analyzing and Improving Motion Stability for Talking Face Generation. CoRR abs/2208.13717 (2022)
[i26]
  - https://dblp.org/rec/journals/corr/abs-2208-14345
Peiling Lu, Xu Tan, Botao Yu, Tao Qin, Sheng Zhao, Tie-Yan Liu:
MeloForm: Generating Melody with Musical Form based on Expert Systems and Neural Networks. CoRR abs/2208.14345 (2022)
[i25]
  - https://dblp.org/rec/journals/corr/abs-2211-12171
Zhifang Guo, Yichong Leng, Yihan Wu, Sheng Zhao, Xu Tan:
PromptTTS: Controllable Text-to-Speech with Text Descriptions. CoRR abs/2211.12171 (2022)
[i24]
  - https://dblp.org/rec/journals/corr/abs-2211-16934
Yihan Wu, Junliang Guo, Xu Tan, Chen Zhang, Bohan Li, Ruihua Song, Lei He, Sheng Zhao, Arul Menezes, Jiang Bian:
VideoDubber: Machine Translation with Speech-Aware Length Control for Video Dubbing. CoRR abs/2211.16934 (2022)
[i23]
  - https://dblp.org/rec/journals/corr/abs-2212-05005
Anni Tang, Tianyu He, Xu Tan, Jun Ling, Runnan Li, Sheng Zhao, Li Song, Jiang Bian:
Memories are One-to-Many Mapping Alleviators in Talking Face Generation. CoRR abs/2212.05005 (2022)
[i22]
  - https://dblp.org/rec/journals/corr/abs-2212-14518
Zehua Chen, Yihan Wu, Yichong Leng, Jiawei Chen, Haohe Liu, Xu Tan, Yang Cui, Ke Wang, Lei He, Sheng Zhao, Jiang Bian, Danilo P. Mandic:
ResGrad: Residual Denoising Diffusion Probabilistic Models for Text to Speech. CoRR abs/2212.14518 (2022)
2021
[i21]
  - https://dblp.org/rec/journals/corr/abs-2102-04040
Renqian Luo, Xu Tan, Rui Wang, Tao Qin, Jinzhu Li, Sheng Zhao, Enhong Chen, Tie-Yan Liu:
LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search. CoRR abs/2102.04040 (2021)
[i20]
  - https://dblp.org/rec/journals/corr/abs-2103-00110
Yichong Leng, Xu Tan, Sheng Zhao, Frank K. Soong, Xiangyang Li, Tao Qin:
MBNet: MOS Prediction for Synthesized Speech with Mean-Bias Network. CoRR abs/2103.00110 (2021)
[i19]
  - https://dblp.org/rec/journals/corr/abs-2103-00993
Mingjian Chen, Xu Tan, Bohan Li, Yanqing Liu, Tao Qin, Sheng Zhao, Tie-Yan Liu:
AdaSpeech: Adaptive Text to Speech for Custom Voice. CoRR abs/2103.00993 (2021)
[i18]
  - https://dblp.org/rec/journals/corr/abs-2104-09715
Yuzi Yan, Xu Tan, Bohan Li, Tao Qin, Sheng Zhao, Yuan Shen, Tie-Yan Liu:
AdaSpeech 2: Adaptive Text to Speech with Untranscribed Data. CoRR abs/2104.09715 (2021)
[i17]
  - https://dblp.org/rec/journals/corr/abs-2107-02530
Yuzi Yan, Xu Tan, Bohan Li, Guangyan Zhang, Tao Qin, Sheng Zhao, Yuan Shen, Wei-Qiang Zhang, Tie-Yan Liu:
AdaSpeech 3: Adaptive Text to Speech for Spontaneous Style. CoRR abs/2107.02530 (2021)
[i16]
  - https://dblp.org/rec/journals/corr/abs-2108-07493
Xiaoqiang Wang, Yanqing Liu, Sheng Zhao, Jinyu Li:
A Light-weight contextual spelling correction model for customizing transducer-based speech recognition systems. CoRR abs/2108.07493 (2021)
[i15]
  - https://dblp.org/rec/journals/corr/abs-2110-03857
Guangyan Zhang, Yichong Leng, Daxin Tan, Ying Qin, Kaitao Song, Xu Tan, Sheng Zhao, Tan Lee:
A study on the efficacy of model pre-training in developing neural text-to-speech system. CoRR abs/2110.03857 (2021)
[i14]
  - https://dblp.org/rec/journals/corr/abs-2110-12612
Yanqing Liu, Zhihang Xu, Gang Wang, Kuan Chen, Bohan Li, Xu Tan, Jinzhu Li, Lei He, Sheng Zhao:
DelightfulTTS: The Microsoft Speech Synthesis System for Blizzard Challenge 2021. CoRR abs/2110.12612 (2021)
[i13]
  - https://dblp.org/rec/journals/corr/abs-2111-09771
Liyang Chen, Zhiyong Wu, Jun Ling, Runnan Li, Xu Tan, Sheng Zhao:
Transformer-S2A: Robust and Efficient Speech-to-Animation. CoRR abs/2111.09771 (2021)
2020
[i12]
  - https://dblp.org/rec/journals/corr/abs-2004-10454
Yi Ren, Jinglin Liu, Xu Tan, Sheng Zhao, Zhou Zhao, Tie-Yan Liu:
A Study of Non-autoregressive Model for Sequence Generation. CoRR abs/2004.10454 (2020)
[i11]
  - https://dblp.org/rec/journals/corr/abs-2005-08528
Naihan Li, Shujie Liu, Yanqing Liu, Sheng Zhao, Ming Liu, Ming Zhou:
MoBoAligner: a Neural Alignment Model for Non-autoregressive TTS with Monotonic Boundary Search. CoRR abs/2005.08528 (2020)
[i10]
  - https://dblp.org/rec/journals/corr/abs-2006-04558
Yi Ren, Chenxu Hu, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu:
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech. CoRR abs/2006.04558 (2020)
[i9]
  - https://dblp.org/rec/journals/corr/abs-2006-04664
Mingjian Chen, Xu Tan, Yi Ren, Jin Xu, Hao Sun, Sheng Zhao, Tao Qin:
MultiSpeech: Multi-Speaker Text to Speech with Transformer. CoRR abs/2006.04664 (2020)
[i8]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2007-15188
Jinyu Li, Rui Zhao, Zhong Meng, Yanqing Liu, Wenning Wei, Sarangarajan Parthasarathy, Vadim Mazalov, Zhenghao Wang, Lei He, Sheng Zhao, Yifan Gong:
Developing RNN-T Models Surpassing High-Performance Hybrid Models with Customization Capability. CoRR abs/2007.15188 (2020)
[i7]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2008-03687
Jin Xu, Xu Tan, Yi Ren, Tao Qin, Jian Li, Sheng Zhao, Tie-Yan Liu:
LRSpeech: Extremely Low-Resource Speech Synthesis and Recognition. CoRR abs/2008.03687 (2020)
[i6]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2012-09547
Chen Zhang, Yi Ren, Xu Tan, Jinglin Liu, Kejun Zhang, Tao Qin, Sheng Zhao, Tie-Yan Liu:
DenoiSpeech: Denoising Text to Speech with Frame-Level Noise Modeling. CoRR abs/2012.09547 (2020)
2019
[i5]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1904-03446
Hao Sun, Xu Tan, Jun-Wei Gan, Hongzhi Liu, Sheng Zhao, Tao Qin, Tie-Yan Liu:
Token-Level Ensemble Distillation for Grapheme-to-Phoneme Conversion. CoRR abs/1904.03446 (2019)
[i4]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1905-06791
Yi Ren, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu:
Almost Unsupervised Text to Speech and Automatic Speech Recognition. CoRR abs/1905.06791 (2019)
[i3]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1905-09263
Yi Ren, Yangjun Ruan, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu:
FastSpeech: Fast, Robust and Controllable Text to Speech. CoRR abs/1905.09263 (2019)
[i2]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1912-03010
Chengyi Wang, Yu Wu, Yujiao Du, Jinyu Li, Shujie Liu, Liang Lu, Shuo Ren, Guoli Ye, Sheng Zhao, Ming Zhou:
Semantic Mask for Transformer based End-to-End Speech Recognition. CoRR abs/1912.03010 (2019)
2018
[i1]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1809-08895
Naihan Li, Shujie Liu, Yanqing Liu, Sheng Zhao, Ming Liu, Ming Zhou:
Close to Human Quality TTS with Transformer. CoRR abs/1809.08895 (2018)
