Sheng Zhao
2020 – today
- 2024
- [j23]Sheng Zhao, Shengwen Huang, Huiying Wen, Weiming Liu:
Analysis of Highway Vehicle Lane Change Duration Based on Survival Model. Big Data Cogn. Comput. 8(9): 114 (2024)
- [j22]Xu Tan, Jiawei Chen, Haohe Liu, Jian Cong, Chen Zhang, Yanqing Liu, Xi Wang, Yichong Leng, Yuanhao Yi, Lei He, Sheng Zhao, Tao Qin, Frank K. Soong, Tie-Yan Liu:
NaturalSpeech: End-to-End Text-to-Speech Synthesis With Human-Level Quality. IEEE Trans. Pattern Anal. Mach. Intell. 46(6): 4234-4245 (2024)
- [j21]Liang Yan, Xiaodong Wu, Chongfeng Wei, Sheng Zhao:
Human-Vehicle Shared Steering Control for Obstacle Avoidance: A Reference-Free Approach With Reinforcement Learning. IEEE Trans. Intell. Transp. Syst. 25(11): 17888-17901 (2024)
- [c68]Tianyu He, Junliang Guo, Runyi Yu, Yuchi Wang, Jialiang Zhu, Kaikai An, Leyi Li, Xu Tan, Chunyu Wang, Han Hu, HsiangTao Wu, Sheng Zhao, Jiang Bian:
GAIA: Zero-shot Talking Avatar Generation. ICLR 2024
- [c67]Yichong Leng, Zhifang Guo, Kai Shen, Zeqian Ju, Xu Tan, Eric Liu, Yufei Liu, Dongchao Yang, Leying Zhang, Kaitao Song, Lei He, Xiangyang Li, Sheng Zhao, Tao Qin, Jiang Bian:
PromptTTS 2: Describing and Generating Voices with Text Prompt. ICLR 2024
- [c66]Kai Shen, Zeqian Ju, Xu Tan, Eric Liu, Yichong Leng, Lei He, Tao Qin, Sheng Zhao, Jiang Bian:
NaturalSpeech 2: Latent Diffusion Models are Natural and Zero-Shot Speech and Singing Synthesizers. ICLR 2024
- [c65]Zeqian Ju, Yuancheng Wang, Kai Shen, Xu Tan, Detai Xin, Dongchao Yang, Eric Liu, Yichong Leng, Kaitao Song, Siliang Tang, Zhizheng Wu, Tao Qin, Xiangyang Li, Wei Ye, Shikun Zhang, Jiang Bian, Lei He, Jinyu Li, Sheng Zhao:
NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models. ICML 2024
- [c64]Dongchao Yang, Jinchuan Tian, Xu Tan, Rongjie Huang, Songxiang Liu, Haohan Guo, Xuankai Chang, Jiatong Shi, Sheng Zhao, Jiang Bian, Zhou Zhao, Xixin Wu, Helen M. Meng:
UniAudio: Towards Universal Audio Generation with Large Language Models. ICML 2024
- [c63]Yujia Xiao, Xi Wang, Xu Tan, Lei He, Xinfa Zhu, Sheng Zhao, Tan Lee:
Contrastive Context-Speech Pretraining for Expressive Text-to-Speech Synthesis. ACM Multimedia 2024: 2099-2107
- [c62]Xinfa Zhu, Wenjie Tian, Xinsheng Wang, Lei He, Yujia Xiao, Xi Wang, Xu Tan, Sheng Zhao, Lei Xie:
UniStyle: Unified Style Modeling for Speaking Style Captioning and Stylistic Speech Synthesis. ACM Multimedia 2024: 7513-7522
- [i64]Naoyuki Kanda, Xiaofei Wang, Sefik Emre Eskimez, Manthan Thakker, Hemin Yang, Zirun Zhu, Min Tang, Canrun Li, Chung-Hsien Tsai, Zhen Xiao, Yufei Xia, Jinzhu Li, Yanqing Liu, Sheng Zhao, Michael Zeng:
Making Flow-Matching-Based Zero-Shot Text-to-Speech Laugh as You Like. CoRR abs/2402.07383 (2024)
- [i63]Zeqian Ju, Yuancheng Wang, Kai Shen, Xu Tan, Detai Xin, Dongchao Yang, Yanqing Liu, Yichong Leng, Kaitao Song, Siliang Tang, Zhizheng Wu, Tao Qin, Xiang-Yang Li, Wei Ye, Shikun Zhang, Jiang Bian, Lei He, Jinyu Li, Sheng Zhao:
NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models. CoRR abs/2403.03100 (2024)
- [i62]Detai Xin, Xu Tan, Kai Shen, Zeqian Ju, Dongchao Yang, Yuancheng Wang, Shinnosuke Takamichi, Hiroshi Saruwatari, Shujie Liu, Jinyu Li, Sheng Zhao:
RALL-E: Robust Codec Language Modeling with Chain-of-Thought Prompting for Text-to-Speech Synthesis. CoRR abs/2404.03204 (2024)
- [i61]Leying Zhang, Yao Qian, Long Zhou, Shujie Liu, Dongmei Wang, Xiaofei Wang, Midia Yousefi, Yanmin Qian, Jinyu Li, Lei He, Sheng Zhao, Michael Zeng:
CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations. CoRR abs/2404.06690 (2024)
- [i60]Chenyang Le, Yao Qian, Dongmei Wang, Long Zhou, Shujie Liu, Xiaofei Wang, Midia Yousefi, Yanmin Qian, Jinyu Li, Sheng Zhao, Michael Zeng:
TransVIP: Speech to Speech Translation System with Voice and Isochrony Preservation. CoRR abs/2405.17809 (2024)
- [i59]Sanyuan Chen, Shujie Liu, Long Zhou, Yanqing Liu, Xu Tan, Jinyu Li, Sheng Zhao, Yao Qian, Furu Wei:
VALL-E 2: Neural Codec Language Models are Human Parity Zero-Shot Text to Speech Synthesizers. CoRR abs/2406.05370 (2024)
- [i58]Xiaofei Wang, Sefik Emre Eskimez, Manthan Thakker, Hemin Yang, Zirun Zhu, Min Tang, Yufei Xia, Jinzhu Li, Sheng Zhao, Jinyu Li, Naoyuki Kanda:
An Investigation of Noise Robustness for Flow-Matching-Based Zero-Shot TTS. CoRR abs/2406.05699 (2024)
- [i57]Bing Han, Long Zhou, Shujie Liu, Sanyuan Chen, Lingwei Meng, Yanming Qian, Yanqing Liu, Sheng Zhao, Jinyu Li, Furu Wei:
VALL-E R: Robust and Efficient Zero-Shot Text-to-Speech Synthesis via Monotonic Alignment. CoRR abs/2406.07855 (2024)
- [i56]Sefik Emre Eskimez, Xiaofei Wang, Manthan Thakker, Canrun Li, Chung-Hsien Tsai, Zhen Xiao, Hemin Yang, Zirun Zhu, Min Tang, Xu Tan, Yanqing Liu, Sheng Zhao, Naoyuki Kanda:
E2 TTS: Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS. CoRR abs/2406.18009 (2024)
- [i55]Lingwei Meng, Long Zhou, Shujie Liu, Sanyuan Chen, Bing Han, Shujie Hu, Yanqing Liu, Jinyu Li, Sheng Zhao, Xixin Wu, Helen Meng, Furu Wei:
Autoregressive Speech Synthesis without Vector Quantization. CoRR abs/2407.08551 (2024)
- [i54]Haibin Wu, Xiaofei Wang, Sefik Emre Eskimez, Manthan Thakker, Daniel Tompkins, Chung-Hsien Tsai, Canrun Li, Zhen Xiao, Sheng Zhao, Jinyu Li, Naoyuki Kanda:
Laugh Now Cry Later: Controlling Time-Varying Emotional States of Flow-Matching-Based Zero-Shot Text-to-Speech. CoRR abs/2407.12229 (2024)
- [i53]Jiaqi Li, Dongmei Wang, Xiaofei Wang, Yao Qian, Long Zhou, Shujie Liu, Midia Yousefi, Canrun Li, Chung-Hsien Tsai, Zhen Xiao, Yanqing Liu, Junkun Chen, Sheng Zhao, Jinyu Li, Zhizheng Wu, Michael Zeng:
Investigating Neural Audio Codecs for Speech Language Model-Based Speech Generation. CoRR abs/2409.04016 (2024)
- 2023
- [j20]Xiumei Xing, Cheng Ai, Tianjiao Wang, Yang Li, Huitao Liu, Pengfei Hu, Guiwu Wang, Huamiao Liu, Hongliang Wang, Ranran Zhang, Junjun Zheng, Xiaobo Wang, Lei Wang, Yuxiao Chang, Qian Qian, Jinghua Yu, Lixin Tang, Shigang Wu, Xiujuan Shao, Alun Li, Peng Cui, Wei Zhan, Sheng Zhao, Zhichao Wu, Xiqun Shao, Yimeng Dong, Min Rong, Yihong Tan, Xuezhe Cui, Shuzhuo Chang, Xingchao Song, Tongao Yang, Limin Sun, Yan Ju, Pei Zhao, Huanhuan Fan, Ying Liu, Xinhui Wang, Wanyun Yang, Min Yang, Tao Wei, Shanshan Song, Jiaping Xu, Zhigang Yue, Qiqi Liang, Chunyi Li, Jue Ruan, Fuhe Yang:
The First High-quality Reference Genome of Sika Deer Provides Insights into High-tannin Adaptation. Genom. Proteom. Bioinform. 21(1): 203-215 (2023)
- [j19]Zhenhui Zhang, Zhengjiang Zhang, Sheng Zhao, Quanfang Li, Zhihui Hong, Fuhua Li, Shipei Huang:
Robust adaptive Unscented Kalman Filter with gross error detection and identification for power system forecasting-aided state estimation. J. Frankl. Inst. 360(13): 10297-10336 (2023)
- [j18]Jun Ling, Xu Tan, Liyang Chen, Runnan Li, Yuchao Zhang, Sheng Zhao, Li Song:
StableFace: Analyzing and Improving Motion Stability for Talking Face Generation. IEEE J. Sel. Top. Signal Process. 17(6): 1232-1247 (2023)
- [j17]Hongyu Song, Wei ShangGuan, Weizhi Qiu, Sheng Zhao, Steven Harrod:
Two-Stage Optimal Trajectory Planning Based on Resilience Adjustment Model for Virtually Coupled Trains. IEEE Trans. Intell. Transp. Syst. 24(12): 15219-15235 (2023)
- [c61]Yihan Wu, Junliang Guo, Xu Tan, Chen Zhang, Bohan Li, Ruihua Song, Lei He, Sheng Zhao, Arul Menezes, Jiang Bian:
VideoDubber: Machine Translation with Speech-Aware Length Control for Video Dubbing. AAAI 2023: 13772-13779
- [c60]Zhihang Xu, Shaofei Zhang, Xi Wang, Jiajun Zhang, Wenning Wei, Lei He, Sheng Zhao:
MuLanTTS: The Microsoft Speech Synthesis System for Blizzard Challenge 2023. Blizzard Challenge 2023
- [c59]Zhifang Guo, Yichong Leng, Yihan Wu, Sheng Zhao, Xu Tan:
Prompttts: Controllable Text-To-Speech With Text Descriptions. ICASSP 2023: 1-5
- [c58]Xiaoqiang Wang, Yanqing Liu, Jinyu Li, Sheng Zhao:
Improving Contextual Spelling Correction by External Acoustics Attention and Semantic Aware Data Augmentation. ICASSP 2023: 1-5
- [c57]Chen Zhang, Shubham Bansal, Aakash Lakhera, Jinzhu Li, Gang Wang, Sandeepkumar Satpal, Sheng Zhao, Lei He:
LeanSpeech: The Microsoft Lightweight Speech Synthesis System for Limmits Challenge 2023. ICASSP 2023: 1-2
- [c56]Zenghao Chai, Tianke Zhang, Tianyu He, Xu Tan, Tadas Baltrusaitis, HsiangTao Wu, Runnan Li, Sheng Zhao, Chun Yuan, Jiang Bian:
HiFace: High-Fidelity 3D Face Reconstruction by Learning Static and Dynamic Details. ICCV 2023: 9053-9064
- [c55]Liyang Chen, Zhiyong Wu, Runnan Li, Weihong Bao, Jun Ling, Xu Tan, Sheng Zhao:
VAST: Vivify Your Talking Avatar via Zero-Shot Expressive Facial Style Transfer. ICCV (Workshops) 2023: 2969-2979
- [c54]Brendan Walsh, Mark Hamilton, Greg Newby, Xi Wang, Serena Ruan, Sheng Zhao, Lei He, Shaofei Zhang, Eric Dettinger, William T. Freeman, Markus Weimer:
Large-Scale Automatic Audiobook Creation. INTERSPEECH 2023: 3675-3676
- [c53]Yujia Xiao, Shaofei Zhang, Xi Wang, Xu Tan, Lei He, Sheng Zhao, Frank K. Soong, Tan Lee:
ContextSpeech: Expressive and Efficient Text-to-Speech for Paragraph Reading. INTERSPEECH 2023: 4883-4887
- [c52]Chenpeng Du, Qi Chen, Tianyu He, Xu Tan, Xie Chen, Kai Yu, Sheng Zhao, Jiang Bian:
DAE-Talker: High Fidelity Speech-Driven Talking Face Generation with Diffusion Autoencoder. ACM Multimedia 2023: 4281-4289
- [c51]Yuancheng Wang, Zeqian Ju, Xu Tan, Lei He, Zhizheng Wu, Jiang Bian, Sheng Zhao:
AUDIT: Audio Editing by Following Instructions with Latent Diffusion Models. NeurIPS 2023
- [i52]Chengyi Wang, Sanyuan Chen, Yu Wu, Ziqiang Zhang, Long Zhou, Shujie Liu, Zhuo Chen, Yanqing Liu, Huaming Wang, Jinyu Li, Lei He, Sheng Zhao, Furu Wei:
Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers. CoRR abs/2301.02111 (2023)
- [i51]Xiaoqiang Wang, Yanqing Liu, Jinyu Li, Sheng Zhao:
Improving Contextual Spelling Correction by External Acoustics Attention and Semantic Aware Data Augmentation. CoRR abs/2302.11192 (2023)
- [i50]Ruiqing Xue, Yanqing Liu, Lei He, Xu Tan, Linquan Liu, Edward Lin, Sheng Zhao:
FoundationTTS: Text-to-Speech for ASR Customization with Generative Language Model. CoRR abs/2303.02939 (2023)
- [i49]Ziqiang Zhang, Long Zhou, Chengyi Wang, Sanyuan Chen, Yu Wu, Shujie Liu, Zhuo Chen, Yanqing Liu, Huaming Wang, Jinyu Li, Lei He, Sheng Zhao, Furu Wei:
Speak Foreign Languages with Your Own Voice: Cross-Lingual Neural Codec Language Modeling. CoRR abs/2303.03926 (2023)
- [i48]Zenghao Chai, Tianke Zhang, Tianyu He, Xu Tan, Tadas Baltrusaitis, HsiangTao Wu, Runnan Li, Sheng Zhao, Chun Yuan, Jiang Bian:
HiFace: High-Fidelity 3D Face Reconstruction by Learning Static and Dynamic Details. CoRR abs/2303.11225 (2023)
- [i47]Chenpeng Du, Qi Chen, Tianyu He, Xu Tan, Xie Chen, Kai Yu, Sheng Zhao, Jiang Bian:
DAE-Talker: High Fidelity Speech-Driven Talking Face Generation with Diffusion Autoencoder. CoRR abs/2303.17550 (2023)
- [i46]Yuancheng Wang, Zeqian Ju, Xu Tan, Lei He, Zhizheng Wu, Jiang Bian, Sheng Zhao:
AUDIT: Audio Editing by Following Instructions with Latent Diffusion Models. CoRR abs/2304.00830 (2023)
- [i45]Kai Shen, Zeqian Ju, Xu Tan, Yanqing Liu, Yichong Leng, Lei He, Tao Qin, Sheng Zhao, Jiang Bian:
NaturalSpeech 2: Latent Diffusion Models are Natural and Zero-Shot Speech and Singing Synthesizers. CoRR abs/2304.09116 (2023)
- [i44]Sheng Zhao, Qilong Yuan, Yibo Duan, Zhuoyue Chen:
An End-to-End Multi-Module Audio Deepfake Generation System for ADD Challenge 2023. CoRR abs/2307.00729 (2023)
- [i43]Yujia Xiao, Shaofei Zhang, Xi Wang, Xu Tan, Lei He, Sheng Zhao, Frank K. Soong, Tan Lee:
ContextSpeech: Expressive and Efficient Text-to-Speech for Paragraph Reading. CoRR abs/2307.00782 (2023)
- [i42]Junchao Huang, Xiaoqi He, Sheng Zhao:
The detection and rectification for identity-switch based on unfalsified control. CoRR abs/2307.14591 (2023)
- [i41]Liyang Chen, Zhiyong Wu, Runnan Li, Weihong Bao, Jun Ling, Xu Tan, Sheng Zhao:
VAST: Vivify Your Talking Avatar via Zero-Shot Expressive Facial Style Transfer. CoRR abs/2308.04830 (2023)
- [i40]Yichong Leng, Zhifang Guo, Kai Shen, Xu Tan, Zeqian Ju, Yanqing Liu, Yufei Liu, Dongchao Yang, Leying Zhang, Kaitao Song, Lei He, Xiang-Yang Li, Sheng Zhao, Tao Qin, Jiang Bian:
PromptTTS 2: Describing and Generating Voices with Text Prompt. CoRR abs/2309.02285 (2023)
- [i39]Zhihang Xu, Shaofei Zhang, Xi Wang, Jiajun Zhang, Wenning Wei, Lei He, Sheng Zhao:
MuLanTTS: The Microsoft Speech Synthesis System for Blizzard Challenge 2023. CoRR abs/2309.02743 (2023)
- [i38]Brendan Walsh, Mark Hamilton, Greg Newby, Xi Wang, Serena Ruan, Sheng Zhao, Lei He, Shaofei Zhang, Eric Dettinger, William T. Freeman, Markus Weimer:
Large-Scale Automatic Audiobook Creation. CoRR abs/2309.03926 (2023)
- [i37]Dongchao Yang, Jinchuan Tian, Xu Tan, Rongjie Huang, Songxiang Liu, Xuankai Chang, Jiatong Shi, Sheng Zhao, Jiang Bian, Xixin Wu, Zhou Zhao, Shinji Watanabe, Helen Meng:
UniAudio: An Audio Foundation Model Toward Universal Audio Generation. CoRR abs/2310.00704 (2023)
- [i36]Tianyu He, Junliang Guo, Runyi Yu, Yuchi Wang, Jialiang Zhu, Kaikai An, Leyi Li, Xu Tan, Chunyu Wang, Han Hu, HsiangTao Wu, Sheng Zhao, Jiang Bian:
GAIA: Zero-shot Talking Avatar Generation. CoRR abs/2311.15230 (2023)
- 2022
- [j16]Xiaoqiang Wang, Yanqing Liu, Jinyu Li, Veljko Miljanic, Sheng Zhao, Hosam Khalil:
Towards Contextual Spelling Correction for Customization of End-to-End Speech Recognition Systems. IEEE ACM Trans. Audio Speech Lang. Process. 30: 3089-3097 (2022)
- [c50]Guangyan Zhang, Yichong Leng, Daxin Tan, Ying Qin, Kaitao Song, Xu Tan, Sheng Zhao, Tan Lee:
A Study on the Efficacy of Model Pre-Training In Developing Neural Text-to-Speech System. ICASSP 2022: 6087-6091
- [c49]Liyang Chen, Zhiyong Wu, Jun Ling, Runnan Li, Xu Tan, Sheng Zhao:
Transformer-S2A: Robust and Efficient Speech-to-Animation. ICASSP 2022: 7247-7251
- [c48]Zehua Chen, Xu Tan, Ke Wang, Shifeng Pan, Danilo P. Mandic, Lei He, Sheng Zhao:
Infergrad: Improving Diffusion Models for Vocoder by Considering Inference in Training. ICASSP 2022: 8432-8436
- [c47]Guangyan Zhang, Kaitao Song, Xu Tan, Daxin Tan, Yuzi Yan, Yanqing Liu, Gang Wang, Wei Zhou, Tao Qin, Tan Lee, Sheng Zhao:
Mixed-Phoneme BERT: Improving BERT with Mixed Phoneme and Sup-Phoneme Representations for Text to Speech. INTERSPEECH 2022: 456-460
- [c46]Dacheng Yin, Chuanxin Tang, Yanqing Liu, Xiaoqiang Wang, Zhiyuan Zhao, Yucheng Zhao, Zhiwei Xiong, Sheng Zhao, Chong Luo:
RetrieverTTS: Modeling Decomposed Factors for Text-Based Speech Insertion. INTERSPEECH 2022: 1571-1575
- [c45]Yanqing Liu, Ruiqing Xue, Lei He, Xu Tan, Sheng Zhao:
DelightfulTTS 2: End-to-End Speech Synthesis with Adversarial Vector-Quantized Auto-Encoders. INTERSPEECH 2022: 1581-1585
- [c44]Yihan Wu, Xu Tan, Bohan Li, Lei He, Sheng Zhao, Ruihua Song, Tao Qin, Tie-Yan Liu:
AdaSpeech 4: Adaptive Text to Speech in Zero-Shot Scenarios. INTERSPEECH 2022: 2568-2572
- [c43]Peiling Lu, Xu Tan, Botao Yu, Tao Qin, Sheng Zhao, Tie-Yan Liu:
MeloForm: Generating Melody with Musical Form based on Expert Systems and Neural Networks. ISMIR 2022: 567-574
- [c42]Yichong Leng, Zehua Chen, Junliang Guo, Haohe Liu, Jiawei Chen, Xu Tan, Danilo P. Mandic, Lei He, Xiangyang Li, Tao Qin, Sheng Zhao, Tie-Yan Liu:
BinauralGrad: A Two-Stage Conditional Diffusion Probabilistic Model for Binaural Audio Synthesis. NeurIPS 2022
- [i35]Zehua Chen, Xu Tan, Ke Wang, Shifeng Pan, Danilo P. Mandic, Lei He, Sheng Zhao:
InferGrad: Improving Diffusion Models for Vocoder by Considering Inference in Training. CoRR abs/2202.03751 (2022)
- [i34]Xiaoqiang Wang, Yanqing Liu, Jinyu Li, Veljko Miljanic, Sheng Zhao, Hosam Khalil:
Towards Contextual Spelling Correction for Customization of End-to-end Speech Recognition Systems. CoRR abs/2203.00888 (2022)
- [i33]Guangyan Zhang, Kaitao Song, Xu Tan, Daxin Tan, Yuzi Yan, Yanqing Liu, Gang Wang, Wei Zhou, Tao Qin, Tan Lee, Sheng Zhao:
Mixed-Phoneme BERT: Improving BERT with Mixed Phoneme and Sup-Phoneme Representations for Text to Speech. CoRR abs/2203.17190 (2022)
- [i32]Yihan Wu, Xu Tan, Bohan Li, Lei He, Sheng Zhao, Ruihua Song, Tao Qin, Tie-Yan Liu:
AdaSpeech 4: Adaptive Text to Speech in Zero-Shot Scenarios. CoRR abs/2204.00436 (2022)
- [i31]Xu Tan, Jiawei Chen, Haohe Liu, Jian Cong, Chen Zhang, Yanqing Liu, Xi Wang, Yichong Leng, Yuanhao Yi, Lei He, Frank K. Soong, Tao Qin, Sheng Zhao, Tie-Yan Liu:
NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality. CoRR abs/2205.04421 (2022)
- [i30]Yichong Leng, Zehua Chen, Junliang Guo, Haohe Liu, Jiawei Chen, Xu Tan, Danilo P. Mandic, Lei He, Xiang-Yang Li, Tao Qin, Sheng Zhao, Tie-Yan Liu:
BinauralGrad: A Two-Stage Conditional Diffusion Probabilistic Model for Binaural Audio Synthesis. CoRR abs/2205.14807 (2022)
- [i29]Dacheng Yin, Chuanxin Tang, Yanqing Liu, Xiaoqiang Wang, Zhiyuan Zhao, Yucheng Zhao, Zhiwei Xiong, Sheng Zhao, Chong Luo:
RetrieverTTS: Modeling Decomposed Factors for Text-Based Speech Insertion. CoRR abs/2206.13865 (2022)
- [i28]Yanqing Liu, Ruiqing Xue, Lei He, Xu Tan, Sheng Zhao:
DelightfulTTS 2: End-to-End Speech Synthesis with Adversarial Vector-Quantized Auto-Encoders. CoRR abs/2207.04646 (2022)
- [i27]Jun Ling, Xu Tan, Liyang Chen, Runnan Li, Yuchao Zhang, Sheng Zhao, Li Song:
StableFace: Analyzing and Improving Motion Stability for Talking Face Generation. CoRR abs/2208.13717 (2022)
- [i26]Peiling Lu, Xu Tan, Botao Yu, Tao Qin, Sheng Zhao, Tie-Yan Liu:
MeloForm: Generating Melody with Musical Form based on Expert Systems and Neural Networks. CoRR abs/2208.14345 (2022)
- [i25]Zhifang Guo, Yichong Leng, Yihan Wu, Sheng Zhao, Xu Tan:
PromptTTS: Controllable Text-to-Speech with Text Descriptions. CoRR abs/2211.12171 (2022)
- [i24]Yihan Wu, Junliang Guo, Xu Tan, Chen Zhang, Bohan Li, Ruihua Song, Lei He, Sheng Zhao, Arul Menezes, Jiang Bian:
VideoDubber: Machine Translation with Speech-Aware Length Control for Video Dubbing. CoRR abs/2211.16934 (2022)
- [i23]Anni Tang, Tianyu He, Xu Tan, Jun Ling, Runnan Li, Sheng Zhao, Li Song, Jiang Bian:
Memories are One-to-Many Mapping Alleviators in Talking Face Generation. CoRR abs/2212.05005 (2022)
- [i22]Zehua Chen, Yihan Wu, Yichong Leng, Jiawei Chen, Haohe Liu, Xu Tan, Yang Cui, Ke Wang, Lei He, Sheng Zhao, Jiang Bian, Danilo P. Mandic:
ResGrad: Residual Denoising Diffusion Probabilistic Models for Text to Speech. CoRR abs/2212.14518 (2022)
- 2021
- [j15]Liang Shu, Ziran Wu, Yingmin You, Marcelo J. Dapino, Sheng Zhao:
Design and Adaptive Control of Matrix Transformer Based Indirect Converter for Large-Capacity Circuit Breaker Testing Application. IEEE Trans. Ind. Electron. 68(6): 5314-5324 (2021)
- [c41]Yanqing Liu, Zhihang Xu, Gang Wang, Kuan Chen, Bohan Li, Xu Tan, Jinzhu Li, Lei He, Sheng Zhao:
DelightfulTTS: The Microsoft Speech Synthesis System for Blizzard Challenge 2021. Blizzard Challenge 2021
- [c40]Yichong Leng, Xu Tan, Sheng Zhao, Frank K. Soong, Xiang-Yang Li, Tao Qin:
MBNET: MOS Prediction for Synthesized Speech with Mean-Bias Network. ICASSP 2021: 391-395
- [c39]Renqian Luo, Xu Tan, Rui Wang, Tao Qin, Jinzhu Li, Sheng Zhao, Enhong Chen, Tie-Yan Liu:
Lightspeech: Lightweight and Fast Text to Speech with Neural Architecture Search. ICASSP 2021: 5699-5703
- [c38]Yuzi Yan, Xu Tan, Bohan Li, Tao Qin, Sheng Zhao, Yuan Shen, Tie-Yan Liu:
Adaspeech 2: Adaptive Text to Speech with Untranscribed Data. ICASSP 2021: 6613-6617
- [c37]Chen Zhang, Yi Ren, Xu Tan, Jinglin Liu, Kejun Zhang, Tao Qin, Sheng Zhao, Tie-Yan Liu:
Denoispeech: Denoising Text to Speech with Frame-Level Noise Modeling. ICASSP 2021: 7063-7067
- [c36]Yi Ren, Chenxu Hu, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu:
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech. ICLR 2021
- [c35]Mingjian Chen, Xu Tan, Bohan Li, Yanqing Liu, Tao Qin, Sheng Zhao, Tie-Yan Liu:
AdaSpeech: Adaptive Text to Speech for Custom Voice. ICLR 2021
- [c34]Xiaoqiang Wang, Yanqing Liu, Sheng Zhao, Jinyu Li:
A Light-Weight Contextual Spelling Correction Model for Customizing Transducer-Based Speech Recognition Systems. Interspeech 2021: 1982-1986
- [c33]Yuzi Yan, Xu Tan, Bohan Li, Guangyan Zhang, Tao Qin, Sheng Zhao, Yuan Shen, Wei-Qiang Zhang, Tie-Yan Liu:
Adaptive Text to Speech for Spontaneous Style. Interspeech 2021: 4668-4672
- [i21]Renqian Luo, Xu Tan, Rui Wang, Tao Qin, Jinzhu Li, Sheng Zhao, Enhong Chen, Tie-Yan Liu:
LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search. CoRR abs/2102.04040 (2021)
- [i20]Yichong Leng, Xu Tan, Sheng Zhao, Frank K. Soong, Xiangyang Li, Tao Qin:
MBNet: MOS Prediction for Synthesized Speech with Mean-Bias Network. CoRR abs/2103.00110 (2021)
- [i19]Mingjian Chen, Xu Tan, Bohan Li, Yanqing Liu, Tao Qin, Sheng Zhao, Tie-Yan Liu:
AdaSpeech: Adaptive Text to Speech for Custom Voice. CoRR abs/2103.00993 (2021)
- [i18]Yuzi Yan, Xu Tan, Bohan Li, Tao Qin, Sheng Zhao, Yuan Shen, Tie-Yan Liu:
AdaSpeech 2: Adaptive Text to Speech with Untranscribed Data. CoRR abs/2104.09715 (2021)
- [i17]Yuzi Yan, Xu Tan, Bohan Li, Guangyan Zhang, Tao Qin, Sheng Zhao, Yuan Shen, Wei-Qiang Zhang, Tie-Yan Liu:
AdaSpeech 3: Adaptive Text to Speech for Spontaneous Style. CoRR abs/2107.02530 (2021)
- [i16]Xiaoqiang Wang, Yanqing Liu, Sheng Zhao, Jinyu Li:
A Light-weight contextual spelling correction model for customizing transducer-based speech recognition systems. CoRR abs/2108.07493 (2021)
- [i15]Guangyan Zhang, Yichong Leng, Daxin Tan, Ying Qin, Kaitao Song, Xu Tan, Sheng Zhao, Tan Lee:
A study on the efficacy of model pre-training in developing neural text-to-speech system. CoRR abs/2110.03857 (2021)
- [i14]Yanqing Liu, Zhihang Xu, Gang Wang, Kuan Chen, Bohan Li, Xu Tan, Jinzhu Li, Lei He, Sheng Zhao:
DelightfulTTS: The Microsoft Speech Synthesis System for Blizzard Challenge 2021. CoRR abs/2110.12612 (2021)
- [i13]Liyang Chen, Zhiyong Wu, Jun Ling, Runnan Li, Xu Tan, Sheng Zhao:
Transformer-S2A: Robust and Efficient Speech-to-Animation. CoRR abs/2111.09771 (2021)
- 2020
- [j14]Zi-Kai Yang, Sheng Zhao, Xiangdong Huang, Wei Lu:
Accurate Doppler radar-based heart rate measurement using matched filter. IEICE Electron. Express 17(8): 20200062 (2020)
- [j13]Zi-Kai Yang, Heping Shi, Sheng Zhao, Xiangdong Huang:
Vital Sign Detection during Large-Scale and Fast Body Movements Based on an Adaptive Noise Cancellation Algorithm Using a Single Doppler Radar Sensor. Sensors 20(15): 4183 (2020)
- [c32]Naihan Li, Yanqing Liu, Yu Wu, Shujie Liu, Sheng Zhao, Ming Liu:
RobuTrans: A Robust Transformer-Based Text-to-Speech Model. AAAI 2020: 8228-8235
- [c31]Yi Ren, Jinglin Liu, Xu Tan, Zhou Zhao, Sheng Zhao, Tie-Yan Liu:
A Study of Non-autoregressive Model for Sequence Generation. ACL 2020: 149-159
- [c30]Chengyi Wang, Yu Wu, Yujiao Du, Jinyu Li, Shujie Liu, Liang Lu, Shuo Ren, Guoli Ye, Sheng Zhao, Ming Zhou:
Semantic Mask for Transformer Based End-to-End Speech Recognition. INTERSPEECH 2020: 971-975
- [c29]Xiangyu Liang, Zhiyong Wu, Runnan Li, Yanqing Liu, Sheng Zhao, Helen Meng:
Enhancing Monotonicity for Robust Autoregressive Transformer TTS. INTERSPEECH 2020: 3181-3185
- [c28]Jinyu Li, Rui Zhao, Zhong Meng, Yanqing Liu, Wenning Wei, Sarangarajan Parthasarathy, Vadim Mazalov, Zhenghao Wang, Lei He, Sheng Zhao, Yifan Gong:
Developing RNN-T Models Surpassing High-Performance Hybrid Models with Customization Capability. INTERSPEECH 2020: 3590-3594
- [c27]Naihan Li, Shujie Liu, Yanqing Liu, Sheng Zhao, Ming Liu, Ming Zhou:
MoBoAligner: A Neural Alignment Model for Non-Autoregressive TTS with Monotonic Boundary Search. INTERSPEECH 2020: 3999-4003
- [c26]Mingjian Chen, Xu Tan, Yi Ren, Jin Xu, Hao Sun, Sheng Zhao, Tao Qin:
MultiSpeech: Multi-Speaker Text to Speech with Transformer. INTERSPEECH 2020: 4024-4028
- [c25]Jin Xu, Xu Tan, Yi Ren, Tao Qin, Jian Li, Sheng Zhao, Tie-Yan Liu:
LRSpeech: Extremely Low-Resource Speech Synthesis and Recognition. KDD 2020: 2802-2812
- [i12]Yi Ren, Jinglin Liu, Xu Tan, Sheng Zhao, Zhou Zhao, Tie-Yan Liu:
A Study of Non-autoregressive Model for Sequence Generation. CoRR abs/2004.10454 (2020)
- [i11]Naihan Li, Shujie Liu, Yanqing Liu, Sheng Zhao, Ming Liu, Ming Zhou:
MoBoAligner: a Neural Alignment Model for Non-autoregressive TTS with Monotonic Boundary Search. CoRR abs/2005.08528 (2020)
- [i10]Yi Ren, Chenxu Hu, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu:
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech. CoRR abs/2006.04558 (2020)
- [i9]Mingjian Chen, Xu Tan, Yi Ren, Jin Xu, Hao Sun, Sheng Zhao, Tao Qin:
MultiSpeech: Multi-Speaker Text to Speech with Transformer. CoRR abs/2006.04664 (2020)
- [i8]Jinyu Li, Rui Zhao, Zhong Meng, Yanqing Liu, Wenning Wei, Sarangarajan Parthasarathy, Vadim Mazalov, Zhenghao Wang, Lei He, Sheng Zhao, Yifan Gong:
Developing RNN-T Models Surpassing High-Performance Hybrid Models with Customization Capability. CoRR abs/2007.15188 (2020)
- [i7]Jin Xu, Xu Tan, Yi Ren, Tao Qin, Jian Li, Sheng Zhao, Tie-Yan Liu:
LRSpeech: Extremely Low-Resource Speech Synthesis and Recognition. CoRR abs/2008.03687 (2020)
- [i6]Chen Zhang, Yi Ren, Xu Tan, Jinglin Liu, Kejun Zhang, Tao Qin, Sheng Zhao, Tie-Yan Liu:
DenoiSpeech: Denoising Text to Speech with Frame-Level Noise Modeling. CoRR abs/2012.09547 (2020)
2010 – 2019
- 2019
- [j12]Huiying Wen, Jiabin Wu, Yuchen Duan, Weiwei Qi, Sheng Zhao:
A Methodology of Timing Co-Evolutionary Path Optimization for Accident Emergency Rescue Considering Future Environmental Uncertainty. IEEE Access 7: 131459-131472 (2019)
- [j11]Congru Yuan, Feng Jin, Xiuling Guo, Sheng Zhao, Wei Li, Haidong Guo:
Correlation Analysis of Breast Cancer DWI Combined with DCE-MRI Imaging Features with Molecular Subtypes and Prognostic Factors. J. Medical Syst. 43(4): 83:1-83:10 (2019)
- [j10]Jing Wu, Xi Yang, Jianmei Gao, Sheng Zhao, Liang Wang, Tianyou Luo:
Application of MRI and CT Energy Spectrum Imaging in Hand and Foot Tendon Lesions. J. Medical Syst. 43(5): 116:1-116:9 (2019)
- [c24]Naihan Li, Shujie Liu, Yanqing Liu, Sheng Zhao, Ming Liu:
Neural Speech Synthesis with Transformer Network. AAAI 2019: 6706-6713
- [c23]Hao Sun, Xu Tan, Jun-Wei Gan, Sheng Zhao, Dongxu Han, Hongzhi Liu, Tao Qin, Tie-Yan Liu:
Knowledge Distillation from Bert in Pre-Training and Fine-Tuning for Polyphone Disambiguation. ASRU 2019: 168-175
- [c22]Runnan Li, Zhiyong Wu, Jia Jia, Sheng Zhao, Helen Meng:
Dilated Residual Network with Multi-head Self-attention for Speech Emotion Recognition. ICASSP 2019: 6675-6679
- [c21]Yi Ren, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu:
Almost Unsupervised Text to Speech and Automatic Speech Recognition. ICML 2019: 5410-5419
- [c20]Runnan Li, Zhiyong Wu, Jia Jia, Yaohua Bu, Sheng Zhao, Helen Meng:
Towards Discriminative Representation Learning for Speech Emotion Recognition. IJCAI 2019: 5060-5066
- [c19]Hao Sun, Xu Tan, Jun-Wei Gan, Hongzhi Liu, Sheng Zhao, Tao Qin, Tie-Yan Liu:
Token-Level Ensemble Distillation for Grapheme-to-Phoneme Conversion. INTERSPEECH 2019: 2115-2119
- [c18]Hongyu Song, Wei ShangGuan, Bai-gen Cai, Sheng Zhao:
A Resilience Adjustment Method for Real-time Cooperative Optimization of High-speed Trains. ITSC 2019: 3194-3199
- [c17]Yi Ren, Yangjun Ruan, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu:
FastSpeech: Fast, Robust and Controllable Text to Speech. NeurIPS 2019: 3165-3174
- [i5]Hao Sun, Xu Tan, Jun-Wei Gan, Hongzhi Liu, Sheng Zhao, Tao Qin, Tie-Yan Liu:
Token-Level Ensemble Distillation for Grapheme-to-Phoneme Conversion. CoRR abs/1904.03446 (2019)
- [i4]Yi Ren, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu:
Almost Unsupervised Text to Speech and Automatic Speech Recognition. CoRR abs/1905.06791 (2019)
- [i3]Yi Ren, Yangjun Ruan, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu:
FastSpeech: Fast, Robust and Controllable Text to Speech. CoRR abs/1905.09263 (2019)
- [i2]Chengyi Wang, Yu Wu, Yujiao Du, Jinyu Li, Shujie Liu, Liang Lu, Shuo Ren, Guoli Ye, Sheng Zhao, Ming Zhou:
Semantic Mask for Transformer based End-to-End Speech Recognition. CoRR abs/1912.03010 (2019)
- 2018
- [c16]Shravan Karthik, Karthik Ramanan, Nikhil Devshatwar, Subhajit Paul, Vishal Mahaveer, Sheng Zhao, Manoj Vishwanathan, Chetan Matad:
Hypervisor based approach for integrated cockpit solutions. ICCE-Berlin 2018: 1-6
- [c15]Sheng Zhao, Bai-gen Cai, Wei ShangGuan:
A Two-stage Method to Optimise Driving Strategy and Timetable for High-speed Trains. ITSC 2018: 2283-2288
- [i1]Naihan Li, Shujie Liu, Yanqing Liu, Sheng Zhao, Ming Liu, Ming Zhou:
Close to Human Quality TTS with Transformer. CoRR abs/1809.08895 (2018)
- 2016
- [j9]Shuang Qiu, Nai J. Ge, Dong K. Sun, Sheng Zhao, Jian F. Sun, Zhao B. Guo, Ke Hu, Ning Gu:
Synthesis and Characterization of Magnetic Polyvinyl Alcohol (PVA) Hydrogel Microspheres for the Embolization of Blood Vessel. IEEE Trans. Biomed. Eng. 63(4): 730-736 (2016)
- [j8]Yiming Chen, Sheng Zhao, Jay A. Farrell:
Computationally Efficient Carrier Integer Ambiguity Resolution in Multiepoch GPS/INS: A Common-Position-Shift Approach. IEEE Trans. Control. Syst. Technol. 24(5): 1541-1556 (2016)
- [j7]Sheng Zhao, Yiming Chen, Jay A. Farrell:
High-Precision Vehicle Navigation in Urban Environments Using an MEM's IMU and Single-Frequency GPS Receiver. IEEE Trans. Intell. Transp. Syst. 17(10): 2854-2867 (2016)
- 2015
- [j6]Haibin Wang, Xia Liu, Sheng Zhao, Lina Huo:
Multi-authority E-voting System Based on Group Blind Signature. Int. J. Online Eng. 11(9): 89-93 (2015)
- 2014
- [c14]Yiming Chen, Sheng Zhao, Dongfang Zheng, Jay A. Farrell:
High reliability integer ambiguity resolution of 6DOF RTK GPS/INS. CDC 2014: 6609-6614
- 2013
- [c13]Sheng Zhao, Jay A. Farrell:
2D LIDAR Aided INS for vehicle positioning in urban environments. CCA 2013: 376-381
- [c12]Sheng Zhao, Wenjie Dong, Jay A. Farrell:
Quaternion-based trajectory tracking control of VTOL-UAVs using command filtered backstepping. ACC 2013: 1018-1023
- 2012
- [j5]Sheng Zhao, Manish Kumar:
Self-Localization and Tracking of Multiple robots in Experimental setups. Int. J. Robotics Autom. 27(3) (2012)
- [c11]Sheng Zhao, Feng Xia, Zhen Chen, Zhen Li, Jianhua Ma:
MobiMsg: A Resource-Efficient Location-Based Mobile Instant Messaging System. CGC 2012: 466-471
- [c10]Ji He, Yao Qian, Frank K. Soong, Sheng Zhao:
Turning a Monolingual Speaker into Multilingual for a Mixed-language TTS. INTERSPEECH 2012: 963-966
- [c9]Kun Lu, Hua Jiang, Mingchu Li, Sheng Zhao, Jianhua Ma:
Resources Collaborative Scheduling Model Based on Trust Mechanism in Cloud. TrustCom 2012: 863-868
- 2011
- [c8]Sheng Zhao, Subramanian Ramakrishnan, Manish Kumar:
Density-based control of multiple robots. ACC 2011: 481-486
- [c7]Sheng Zhao, Jay A. Farrell:
Optimization-based road curve fitting. CDC/ECC 2011: 5293-5298
- 2010
- [c6]Sheng Zhao, Manish Kumar:
A novel way to implement self-localization in a multi-robot experimental platform. ACC 2010: 6834-6839
2000 – 2009
- 2008
- [j4]Sheng Zhao, Qin Zhang, Xiaolin Liu, Xuemin Wang, Huilin Zhang, Yan Wu, Fei Jiang:
Analysis of synonymous codon usage in 11 Human Bocavirus isolates. Biosyst. 92(3): 207-214 (2008)
- 2005
- [j3]Sheng Zhao, Russell D. Fernald:
Comprehensive Algorithm for Quantitative Real-Time Polymerase Chain Reaction. J. Comput. Biol. 12(8): 1047-1064 (2005)
- 2003
- [c5]Sheng Zhao, Jianhua Tao, DanLing Jiang:
Chinese prosodic phrasing with extended features. ICASSP (1) 2003: 492-495
- 2002
- [c4]Sheng Zhao, Jianhua Tao, Lianhong Cai:
Learning Rules for Chinese Prosodic Phrase Prediction. SIGHAN@COLING 2002
- [c3]Sheng Zhao, Jianhua Tao, Lianhong Cai:
Prosodic phrasing with inductive learning. INTERSPEECH 2002: 2417-2420
- [c2]Jianhua Tao, Sheng Zhao, Lian-Hong Cai:
Automatic stress prediction of Chinese speech synthesis. ISCSLP 2002
1990 – 1999
- 1997
- [c1]Cathy H. Wu, Sheng Zhao, Kevin Simmons, Sailaja Shivakumar:
Motif neural network design for large-scale protein family identification. ICNN 1997: 86-89
- 1996
- [j2]Cathy H. Wu, Sheng Zhao, Hsi-Lien Chen, Chin-Ju Lo, Jerry McLarty:
Motif identification neural design for rapid and sensitive protein family search. Comput. Appl. Biosci. 12(2): 109-118 (1996)
- [j1]Cathy H. Wu, Sheng Zhao, Hsi-Lien Chen:
A Protein Class Database Organized with ProSite Protein Groups and PIR Superfamilies. J. Comput. Biol. 3(4): 547-561 (1996)
last updated on 2024-11-19 21:42 CET by the dblp team
all metadata released as open data under CC0 1.0 license