Zhiyong Wu 0001
Person information
- unicode name: 吴志勇
- affiliation (PhD): Tsinghua University, Joint Research Center for Media Sciences, Beijing, China
- affiliation: Chinese University of Hong Kong, Hong Kong
Other persons with the same name
- Zhiyong Wu — disambiguation page
- Zhiyong Wu 0002 — Hohai University, College of Hydrology and Water Resources, Nanjing, China
- Zhiyong Wu 0003 — Shanghai AI Laboratory (and 1 more)
- Zhiyong Wu 0004 — University of Science and Technology of China, School of Computer Science and Technology, Hefei, China
- Zhiyong Wu 0005 — Anhui Polytechnic University, School of Mathematics and Physics, Wuhu, China
- Zhiyong Wu 0006 — Nanjing University of Posts and Telecommunications, College of Automation, China
- Zhiyong Wu 0007 — Army Engineering University, Institute of Command and Control Engineering, Nanjing, China
- Zhiyong Wu 0008 — Chinese Academy of Sciences, Changchun Institute of Optics, Fine Mechanics and Physics, China
- Zhiyong Wu 0009 — Shantou Central Hospital, Departments of Oncology Surgery, Shantou, China
- Zhiyong Wu 0010 — Tsinghua University, KLISS, BNRist, School of Software, Beijing, China
- Zhiyong Wu 0011 — Shanghai Artificial Intelligence Laboratory, China
2020 – today
- 2024
- [j13] Jingbei Li, Sipan Li, Ping Chen, Luwen Zhang, Yi Meng, Zhiyong Wu, Helen Meng, Qiao Tian, Yuping Wang, Yuxuan Wang: Joint Multiscale Cross-Lingual Speaking Style Transfer With Bidirectional Attention Mechanism for Automatic Dubbing. IEEE ACM Trans. Audio Speech Lang. Process. 32: 517-528 (2024)
- [c199] Zilin Wang, Haolin Zhuang, Lu Li, Yinmin Zhang, Junjie Zhong, Jun Chen, Yu Yang, Boshi Tang, Zhiyong Wu: Explore 3D Dance Generation via Reward Model from Automatically-Ranked Demonstrations. AAAI 2024: 301-309
- [c198] Boshi Tang, Zhiyong Wu, Xixin Wu, Qiaochu Huang, Jun Chen, Shun Lei, Helen Meng: SimCalib: Graph Neural Network Calibration Based on Similarity between Nodes. AAAI 2024: 15267-15275
- [c197] Yaoxun Xu, Hangting Chen, Jianwei Yu, Qiaochu Huang, Zhiyong Wu, Shi-Xiong Zhang, Guangzhi Li, Yi Luo, Rongzhi Gu: SECap: Speech Emotion Captioning with Large Language Model. AAAI 2024: 19323-19331
- [c196] Xu He, Qiaochu Huang, Zhensong Zhang, Zhiwei Lin, Zhiyong Wu, Sicheng Yang, Minglei Li, Zhiyi Chen, Songcen Xu, Xiaofei Wu: Co-Speech Gesture Video Generation via Motion-Decoupled Diffusion Model. CVPR 2024: 2263-2273
- [c195] Yaxin Liu, Xiaomei Nie, Zhiyong Wu: Collaboration of Digital Human Gestures and Teaching Materials for Enhanced Integration in MOOC Teaching Scenarios. HCI (59) 2024: 169-175
- [c194] Yixuan Zhou, Shuoyi Zhou, Shun Lei, Zhiyong Wu, Menglin Wu: The THU-HCSI Multi-Speaker Multi-Lingual Few-Shot Voice Cloning System for LIMMITS'24 Challenge. ICASSP Workshops 2024: 71-72
- [c193] Zhiwei Lin, Jun Chen, Boshi Tang, Binzhu Sha, Jing Yang, Yaolong Ju, Fan Fan, Shiyin Kang, Zhiyong Wu, Helen Meng: Multi-View Midivae: Fusing Track- and Bar-View Representations for Long Multi-Track Symbolic Music Generation. ICASSP 2024: 941-945
- [c192] Yuanyuan Wang, Hangting Chen, Dongchao Yang, Jianwei Yu, Chao Weng, Zhiyong Wu, Helen Meng: Consistent and Relevant: Rethink the Query Embedding in General Sound Separation. ICASSP 2024: 961-965
- [c191] Weinan Tong, Jiaxu Zhu, Jun Chen, Shiyin Kang, Tao Jiang, Yang Li, Zhiyong Wu, Helen Meng: SCNet: Sparse Compression Network for Music Source Separation. ICASSP 2024: 1276-1280
- [c190] Xingda Li, Fan Zhuo, Dan Luo, Jun Chen, Shiyin Kang, Zhiyong Wu, Tao Jiang, Yang Li, Han Fang, Yahui Zhou: Generating Stereophonic Music with Single-Stage Language Models. ICASSP 2024: 1471-1475
- [c189] Sicheng Yang, Zunnan Xu, Haiwei Xue, Yongkang Cheng, Shaoli Huang, Mingming Gong, Zhiyong Wu: FreeTalker: Controllable Speech and Text-Driven Gesture Generation Based on Diffusion Models for Enhanced Speaker Naturalness. ICASSP 2024: 7945-7949
- [c188] Qiaochu Huang, Xu He, Boshi Tang, Haolin Zhuang, Liyang Chen, Shuochen Gao, Zhiyong Wu, Haozhi Huang, Helen Meng: Enhancing Expressiveness in Dance Generation Via Integrating Frequency and Music Style Information. ICASSP 2024: 8185-8189
- [c187] Haiwei Xue, Sicheng Yang, Zhensong Zhang, Zhiyong Wu, Minglei Li, Zonghong Dai, Helen Meng: Conversational Co-Speech Gesture Generation via Modeling Dialog Intention, Emotion, and Context with Diffusion Models. ICASSP 2024: 8296-8300
- [c186] Hui Lu, Xixin Wu, Haohan Guo, Songxiang Liu, Zhiyong Wu, Helen Meng: Unifying One-Shot Voice Conversion and Cloning with Disentangled Speech Representations. ICASSP 2024: 11141-11145
- [c185] Xueyuan Chen, Xi Wang, Shaofei Zhang, Lei He, Zhiyong Wu, Xixin Wu, Helen Meng: Stylespeech: Self-Supervised Style Enhancing with VQ-VAE-Based Pre-Training for Expressive Audiobook Speech Synthesis. ICASSP 2024: 12316-12320
- [c184] Xueyuan Chen, Yuejiao Wang, Xixin Wu, Disong Wang, Zhiyong Wu, Xunying Liu, Helen Meng: Exploiting Audio-Visual Features with Pretrained AV-HuBERT for Multi-Modal Dysarthric Speech Reconstruction. ICASSP 2024: 12341-12345
- [c183] Binzhu Sha, Xu Li, Zhiyong Wu, Ying Shan, Helen Meng: Neural Concatenative Singing Voice Conversion: Rethinking Concatenation-Based Approach for One-Shot Singing Voice Conversion. ICASSP 2024: 12577-12581
- [c182] Shun Lei, Yixuan Zhou, Liyang Chen, Dan Luo, Zhiyong Wu, Xixin Wu, Shiyin Kang, Tao Jiang, Yahui Zhou, Yuxing Han, Helen Meng: Improving Language Model-Based Zero-Shot Text-to-Speech Synthesis with Multi-Scale Acoustic Prompts. ICASSP 2024: 12662-12666
- [c181] Yaoxun Xu, Xingchen Song, Zhiyong Wu, Di Wu, Zhendong Peng, Binbin Zhang: Hydraformer: One Encoder for All Subsampling Rates. ICME 2024: 1-6
- [c180] Ming Cheng, Shun Lei, Dongyang Dai, Zhiyong Wu, Dading Chong: NRAdapt: Noise-Robust Adaptive Text to Speech Using Untranscribed Data. IJCNN 2024: 1-8
- [c179] Rui Niu, Zhiyong Wu, Changhe Song: Representation Space Maintenance: Against Forgetting in Continual Learning. IJCNN 2024: 1-7
- [c178] Yixuan Zhou, Xiaoyu Qin, Zeyu Jin, Shuoyi Zhou, Shun Lei, Songtao Zhou, Zhiyong Wu, Jia Jia: VoxInstruct: Expressive Human Instruction-to-Speech Generation with Unified Multilingual Codec Language Modelling. ACM Multimedia 2024: 554-563
- [c177] Zeyu Jin, Jia Jia, Qixin Wang, Kehan Li, Shuoyi Zhou, Songtao Zhou, Xiaoyu Qin, Zhiyong Wu: SpeechCraft: A Fine-Grained Expressive Speech Dataset with Natural Language Description. ACM Multimedia 2024: 1255-1264
- [c176] Yunrui Cai, Runchuan Ye, Jingran Xie, Yixuan Zhou, Yaoxun Xu, Zhiyong Wu: Robust Representation Learning for Multimodal Emotion Recognition with Contrastive Learning and Mixup. MRAC@MM 2024: 93-97
- [c175] Yaoxun Xu, Yixuan Zhou, Yunrui Cai, Jingran Xie, Runchuan Ye, Zhiyong Wu: Multimodal Emotion Captioning Using Large Language Model with Prompt Engineering. MRAC@MM 2024: 104-109
- [i101] Sicheng Yang, Zunnan Xu, Haiwei Xue, Yongkang Cheng, Shaoli Huang, Mingming Gong, Zhiyong Wu: Freetalker: Controllable Speech and Text-Driven Gesture Generation Based on Diffusion Models for Enhanced Speaker Naturalness. CoRR abs/2401.03476 (2024)
- [i100] Zhiwei Lin, Jun Chen, Boshi Tang, Binzhu Sha, Jing Yang, Yaolong Ju, Fan Fan, Shiyin Kang, Zhiyong Wu, Helen Meng: Multi-view MidiVAE: Fusing Track- and Bar-view Representations for Long Multi-track Symbolic Music Generation. CoRR abs/2401.07532 (2024)
- [i99] Xueyuan Chen, Yuejiao Wang, Xixin Wu, Disong Wang, Zhiyong Wu, Xunying Liu, Helen Meng: Exploiting Audio-Visual Features with Pretrained AV-HuBERT for Multi-Modal Dysarthric Speech Reconstruction. CoRR abs/2401.17796 (2024)
- [i98] Qiaochu Huang, Xu He, Boshi Tang, Haolin Zhuang, Liyang Chen, Shuochen Gao, Zhiyong Wu, Haozhi Huang, Helen Meng: Enhancing Expressiveness in Dance Generation via Integrating Frequency and Music Style Information. CoRR abs/2403.05834 (2024)
- [i97] Xu He, Qiaochu Huang, Zhensong Zhang, Zhiwei Lin, Zhiyong Wu, Sicheng Yang, Minglei Li, Zhiyi Chen, Songcen Xu, Xiaofei Wu: Co-Speech Gesture Video Generation via Motion-Decoupled Diffusion Model. CoRR abs/2404.01862 (2024)
- [i96] Yixuan Zhou, Shuoyi Zhou, Shun Lei, Zhiyong Wu, Menglin Wu: The THU-HCSI Multi-Speaker Multi-Lingual Few-Shot Voice Cloning System for LIMMITS'24 Challenge. CoRR abs/2404.16619 (2024)
- [i95] Xueyuan Chen, Dongchao Yang, Dingdong Wang, Xixin Wu, Zhiyong Wu, Helen Meng: CoLM-DSR: Leveraging Neural Codec Language Modeling for Multi-Modal Dysarthric Speech Reconstruction. CoRR abs/2406.08336 (2024)
- [i94] Weiqin Li, Peiji Yang, Yicheng Zhong, Yixuan Zhou, Zhisheng Wang, Zhiyong Wu, Xixin Wu, Helen Meng: Spontaneous Style Text-to-Speech Synthesis with Controllable Spontaneous Behaviors Based on Language Models. CoRR abs/2407.13509 (2024)
- [i93] Yaoxun Xu, Xingchen Song, Zhiyong Wu, Di Wu, Zhendong Peng, Binbin Zhang: HydraFormer: One Encoder For All Subsampling Rates. CoRR abs/2408.04325 (2024)
- [i92] Zeyu Jin, Jia Jia, Qixin Wang, Kehan Li, Shuoyi Zhou, Songtao Zhou, Xiaoyu Qin, Zhiyong Wu: SpeechCraft: A Fine-grained Expressive Speech Dataset with Natural Language Description. CoRR abs/2408.13608 (2024)
- [i91] Xu He, Xiaoyu Li, Di Kang, Jiangnan Ye, Chaopeng Zhang, Liyang Chen, Xiangjun Gao, Han Zhang, Zhiyong Wu, Haolin Zhuang: MagicMan: Generative Novel View Synthesis of Humans with 3D-Aware Diffusion and Iterative Refinement. CoRR abs/2408.14211 (2024)
- [i90] Yinghao Ma, Anders Øland, Anton Ragni, Bleiz Macsen Del Sette, Charalampos Saitis, Chris Donahue, Chenghua Lin, Christos Plachouras, Emmanouil Benetos, Elio Quinton, Elona Shatri, Fabio Morreale, Ge Zhang, György Fazekas, Gus Xia, Huan Zhang, Ilaria Manco, Jiawen Huang, Julien Guinot, Liwei Lin, Luca Marinelli, Max W. Y. Lam, Megha Sharma, Qiuqiang Kong, Roger B. Dannenberg, Ruibin Yuan, Shangda Wu, Shih-Lun Wu, Shuqi Dai, Shun Lei, Shiyin Kang, Simon Dixon, Wenhu Chen, Wenhao Huang, Xingjian Du, Xingwei Qu, Xu Tan, Yizhi Li, Zeyue Tian, Zhiyong Wu, Zhizheng Wu, Ziyang Ma, Ziyu Wang: Foundation Models for Music: A Survey. CoRR abs/2408.14340 (2024)
- [i89] Yixuan Zhou, Xiaoyu Qin, Zeyu Jin, Shuoyi Zhou, Shun Lei, Songtao Zhou, Zhiyong Wu, Jia Jia: VoxInstruct: Expressive Human Instruction-to-Speech Generation with Unified Multilingual Codec Language Modelling. CoRR abs/2408.15676 (2024)
- [i88] Yaoxun Xu, Shi-Xiong Zhang, Jianwei Yu, Zhiyong Wu, Dong Yu: Comparing Discrete and Continuous Space LLMs for Speech Recognition. CoRR abs/2409.00800 (2024)
- [i87] Shun Lei, Yixuan Zhou, Boshi Tang, Max W. Y. Lam, Feng Liu, Hangyu Liu, Jingcheng Wu, Shiyin Kang, Zhiyong Wu, Helen Meng: SongCreator: Lyrics-based Universal Song Generation. CoRR abs/2409.06029 (2024)
- [i86] Wei Chen, Xintao Zhao, Jun Chen, Binzhu Sha, Zhiwei Lin, Zhiyong Wu: RobustSVC: HuBERT-based Melody Extractor and Adversarial Learning for Robust Singing Voice Conversion. CoRR abs/2409.06237 (2024)
- [i85] Shuochen Gao, Shun Lei, Fan Zhuo, Hangyu Liu, Feng Liu, Boshi Tang, Qiaochu Huang, Shiyin Kang, Zhiyong Wu: An End-to-End Approach for Chord-Conditioned Song Generation. CoRR abs/2409.06307 (2024)
- [i84] Zhiqi Huang, Dan Luo, Jun Wang, Huan Liao, Zhiheng Li, Zhiyong Wu: Rhythmic Foley: A Framework For Seamless Audio-Visual Alignment In Video-to-Audio Synthesis. CoRR abs/2409.08628 (2024)
- [i83] Yuanyuan Wang, Hangting Chen, Dongchao Yang, Zhiyong Wu, Helen Meng, Xixin Wu: AudioComposer: Towards Fine-grained Audio Generation with Natural Language Descriptions. CoRR abs/2409.12560 (2024)
- [i82] Yaoxun Xu, Hangting Chen, Jianwei Yu, Wei Tan, Rongzhi Gu, Shun Lei, Zhiwei Lin, Zhiyong Wu: MuCodec: Ultra Low-Bitrate Music Codec. CoRR abs/2409.13216 (2024)
- 2023
- [j12] Xingwei Liang, Lu Zhang, Zhiyong Wu, Ruifeng Xu: Lite-RTSE: Exploring a Cost-Effective Lite DNN Model for Real-Time Speech Enhancement in RTC Scenarios. IEEE Signal Process. Lett. 30: 1697-1701 (2023)
- [j11] Shun Lei, Yixuan Zhou, Liyang Chen, Zhiyong Wu, Xixin Wu, Shiyin Kang, Helen Meng: MSStyleTTS: Multi-Scale Style Modeling With Hierarchical Context Information for Expressive Speech Synthesis. IEEE ACM Trans. Audio Speech Lang. Process. 31: 3290-3303 (2023)
- [j10] Xixin Wu, Hui Lu, Kun Li, Zhiyong Wu, Xunying Liu, Helen Meng: Hiformer: Sequence Modeling Networks With Hierarchical Attention Mechanisms. IEEE ACM Trans. Audio Speech Lang. Process. 31: 3993-4003 (2023)
- [c174] Zhihan Yang, Zhiyong Wu, Ying Shan, Jia Jia: What Does Your Face Sound Like? 3D Face Shape towards Voice. AAAI 2023: 13905-13913
- [c173] Yunrui Cai, Changhe Song, Boshi Tang, Dongyang Dai, Zhiyong Wu, Helen Meng: Robust Representation Learning for Speech Emotion Recognition with Moment Exchange. APSIPA ASC 2023: 1002-1007
- [c172] Sicheng Yang, Zhiyong Wu, Minglei Li, Zhensong Zhang, Lei Hao, Weihong Bao, Haolin Zhuang: QPGesture: Quantization-Based and Phase-Guided Motion Matching for Natural Speech-Driven Gesture Generation. CVPR 2023: 2321-2330
- [c171] Weihong Bao, Liyang Chen, Chaoyong Zhou, Sicheng Yang, Zhiyong Wu: Wavsyncswap: End-To-End Portrait-Customized Audio-Driven Talking Face Generation. ICASSP 2023: 1-5
- [c170] Jun Chen, Wei Rao, Zilin Wang, Jiuxin Lin, Zhiyong Wu, Yannan Wang, Shidong Shang, Helen Meng: Inter-Subnet: Speech Enhancement with Subband Interaction. ICASSP 2023: 1-5
- [c169] Jun Chen, Yupeng Shi, Wenzhe Liu, Wei Rao, Shulin He, Andong Li, Yannan Wang, Zhiyong Wu, Shidong Shang, Chengshi Zheng: Gesper: A Unified Framework for General Speech Restoration. ICASSP 2023: 1-2
- [c168] Jie Chen, Xingchen Song, Zhendong Peng, Binbin Zhang, Fuping Pan, Zhiyong Wu: LightGrad: Lightweight Diffusion Probabilistic Model for Text-to-Speech. ICASSP 2023: 1-5
- [c167] Shun Lei, Yixuan Zhou, Liyang Chen, Zhiyong Wu, Shiyin Kang, Helen Meng: Context-Aware Coherent Speaking Style Prediction with Hierarchical Transformers for Audiobook Speech Synthesis. ICASSP 2023: 1-5
- [c166] Jiuxin Lin, Xinyu Cai, Heinrich Dinkel, Jun Chen, Zhiyong Yan, Yongqing Wang, Junbo Zhang, Zhiyong Wu, Yujun Wang, Helen Meng: Av-Sepformer: Cross-Attention Sepformer for Audio-Visual Target Speaker Extraction. ICASSP 2023: 1-5
- [c165] Xingchen Song, Di Wu, Zhiyong Wu, Binbin Zhang, Yuekai Zhang, Zhendong Peng, Wenpeng Li, Fuping Pan, Changbao Zhu: TrimTail: Low-Latency Streaming ASR with Simple But Effective Spectrogram-Level Length Penalty. ICASSP 2023: 1-5
- [c164] Weinan Tong, Jiaxu Zhu, Jun Chen, Zhiyong Wu, Shiyin Kang, Helen Meng: TFCnet: Time-Frequency Domain Corrector for Speech Separation. ICASSP 2023: 1-5
- [c163] Zilin Wang, Peng Liu, Jun Chen, Sipan Li, Jinfeng Bai, Gang He, Zhiyong Wu, Helen Meng: A Synthetic Corpus Generation Method for Neural Vocoder Training. ICASSP 2023: 1-5
- [c162] Yuanyuan Wang, Yang Zhang, Zhiyong Wu, Zhihan Yang, Tao Wei, Kun Zou, Helen Meng: DASA: Difficulty-Aware Semantic Augmentation for Speaker Verification. ICASSP 2023: 1-5
- [c161] Yaoxun Xu, Baiji Liu, Qiaochu Huang, Xingchen Song, Zhiyong Wu, Shiyin Kang, Helen Meng: CB-Conformer: Contextual Biasing Conformer for Biased Word Recognition. ICASSP 2023: 1-5
- [c160] Yujie Yang, Kun Zhang, Zhiyong Wu, Helen Meng: Keyword-Specific Acoustic Model Pruning for Open-Vocabulary Keyword Spotting. ICASSP 2023: 1-5
- [c159] Shaohuan Zhou, Xu Li, Zhiyong Wu, Ying Shan, Helen Meng: Enhancing the Vocal Range of Single-Speaker Singing Voice Synthesis with Melody-Unsupervised Pre-Training. ICASSP 2023: 1-5
- [c158] Haolin Zhuang, Shun Lei, Long Xiao, Weiqin Li, Liyang Chen, Sicheng Yang, Zhiyong Wu, Shiyin Kang, Helen Meng: GTN-Bailando: Genre Consistent Long-Term 3D Dance Generation Based on Pre-Trained Genre Token Network. ICASSP 2023: 1-5
- [c157] Liyang Chen, Zhiyong Wu, Runnan Li, Weihong Bao, Jun Ling, Xu Tan, Sheng Zhao: VAST: Vivify Your Talking Avatar via Zero-Shot Expressive Facial Style Transfer. ICCV (Workshops) 2023: 2969-2979
- [c156] Xintao Zhao, Shuai Wang, Yang Chao, Zhiyong Wu, Helen Meng: Adversarial Speaker Disentanglement Using Unannotated External Data for Self-supervised Representation-based Voice Conversion. ICME 2023: 1691-1696
- [c155] Sipan Li, Songxiang Liu, Luwen Zhang, Xiang Li, Yanyao Bian, Chao Weng, Zhiyong Wu, Helen Meng: SnakeGAN: A Universal Vocoder Leveraging DDSP Prior Knowledge and Periodic Inductive Bias. ICME 2023: 1703-1708
- [c154] Sicheng Yang, Haiwei Xue, Zhensong Zhang, Minglei Li, Zhiyong Wu, Xiaofei Wu, Songcen Xu, Zonghong Dai: The DiffuseStyleGesture+ entry to the GENEA Challenge 2023. ICMI 2023: 779-785
- [c153] Sicheng Yang, Zhiyong Wu, Minglei Li, Zhensong Zhang, Lei Hao, Weihong Bao, Ming Cheng, Long Xiao: DiffuseStyleGesture: Stylized Audio-Driven Co-Speech Gesture Generation with Diffusion Models. IJCAI 2023: 5860-5868
- [c152] Jiaxu Zhu, Weinan Tong, Yaoxun Xu, Changhe Song, Zhiyong Wu, Zhao You, Dan Su, Dong Yu, Helen Meng: Text-Only Domain Adaptation for End-to-End Speech Recognition through Down-Sampling Acoustic Representation. INTERSPEECH 2023: 1334-1338
- [c151] Xingchen Song, Di Wu, Binbin Zhang, Zhendong Peng, Bo Dang, Fuping Pan, Zhiyong Wu: ZeroPrompt: Streaming Acoustic Encoders are Zero-Shot Masked LMs. INTERSPEECH 2023: 1648-1652
- [c150] Jiuxin Lin, Peng Wang, Heinrich Dinkel, Jun Chen, Zhiyong Wu, Zhiyong Yan, Yongqing Wang, Junbo Zhang, Yujun Wang: Focus on the Sound around You: Monaural Target Speaker Extraction via Distance and Speaker Information. INTERSPEECH 2023: 2488-2492
- [c149] Jiaxu Zhu, Changhe Song, Zhiyong Wu, Helen Meng: SememeASR: Boosting Performance of End-to-End Speech Recognition against Domain and Long-Tailed Data Shift with Sememe Semantic Knowledge. INTERSPEECH 2023: 3272-3276
- [c148] Weiqin Li, Shun Lei, Qiaochu Huang, Yixuan Zhou, Zhiyong Wu, Shiyin Kang, Helen Meng: Towards Spontaneous Style Modeling with Semi-supervised Pre-training for Conversational Text-to-Speech Synthesis. INTERSPEECH 2023: 3377-3381
- [c147] Jun Chen, Wei Rao, Zilin Wang, Jiuxin Lin, Yukai Ju, Shulin He, Yannan Wang, Zhiyong Wu: MC-SpEx: Towards Effective Speaker Extraction with Multi-Scale Interfusion and Conditional Speaker Modulation. INTERSPEECH 2023: 4034-4038
- [c146] Wenzhe Liu, Yupeng Shi, Jun Chen, Wei Rao, Shulin He, Andong Li, Yannan Wang, Zhiyong Wu: Gesper: A Restoration-Enhancement Framework for General Speech Reconstruction. INTERSPEECH 2023: 4044-4048
- [c145] Xiang Li, Songxiang Liu, Max W. Y. Lam, Zhiyong Wu, Chao Weng, Helen Meng: Diverse and Expressive Speech Prosody Prediction with Denoising Diffusion Probabilistic Model. INTERSPEECH 2023: 4858-4862
- [c144] Zhihan Yang, Shansong Liu, Xu Li, Haozhe Wu, Zhiyong Wu, Ying Shan, Jia Jia: Prosody Modeling with 3D Visual Information for Expressive Video Dubbing. INTERSPEECH 2023: 4863-4867
- [c143] Sicheng Yang, Zilin Wang, Zhiyong Wu, Minglei Li, Zhensong Zhang, Qiaochu Huang, Lei Hao, Songcen Xu, Xiaofei Wu, Changpeng Yang, Zonghong Dai: UnifiedGesture: A Unified Gesture Synthesis Model for Multiple Skeletons. ACM Multimedia 2023: 1033-1044
- [c142] Hui Lu, Xixin Wu, Zhiyong Wu, Helen Meng: SpeechTripleNet: End-to-End Disentangled Speech Representation Learning for Content, Timbre and Prosody. ACM Multimedia 2023: 2829-2837
- [c141] Yunrui Cai, Jingran Xie, Boshi Tang, Yuanyuan Wang, Jun Chen, Haiwei Xue, Zhiyong Wu: First-order Multi-label Learning with Cross-modal Interactions for Multimodal Emotion Recognition. MRAC@MM 2023: 13-20
- [i81] Shun Lei, Yixuan Zhou, Liyang Chen, Zhiyong Wu, Shiyin Kang, Helen Meng: Context-aware Coherent Speaking Style Prediction with Hierarchical Transformers for Audiobook Speech Synthesis. CoRR abs/2304.06359 (2023)
- [i80] Yaoxun Xu, Baiji Liu, Qiaochu Huang, Xingchen Song, Zhiyong Wu, Shiyin Kang, Helen Meng: CB-Conformer: Contextual biasing Conformer for biased word recognition. CoRR abs/2304.09607 (2023)
- [i79] Haolin Zhuang, Shun Lei, Long Xiao, Weiqin Li, Liyang Chen, Sicheng Yang, Zhiyong Wu, Shiyin Kang, Helen Meng: GTN-Bailando: Genre Consistent Long-Term 3D Dance Generation based on Pre-trained Genre Token Network. CoRR abs/2304.12704 (2023)
- [i78] Sicheng Yang, Zhiyong Wu, Minglei Li, Zhensong Zhang, Lei Hao, Weihong Bao, Ming Cheng, Long Xiao: DiffuseStyleGesture: Stylized Audio-Driven Co-Speech Gesture Generation with Diffusion Models. CoRR abs/2305.04919 (2023)
- [i77] Jingbei Li, Sipan Li, Ping Chen, Luwen Zhang, Yi Meng, Zhiyong Wu, Helen Meng, Qiao Tian, Yuping Wang, Yuxuan Wang: Joint Multi-scale Cross-lingual Speaking Style Transfer with Bidirectional Attention Mechanism for Automatic Dubbing. CoRR abs/2305.05203 (2023)
- [i76] Jun Chen, Wei Rao, Zilin Wang, Jiuxin Lin, Zhiyong Wu, Yannan Wang, Shidong Shang, Helen Meng: Inter-SubNet: Speech Enhancement with Subband Interaction. CoRR abs/2305.05599 (2023)
- [i75] Xintao Zhao, Shuai Wang, Yang Chao, Zhiyong Wu, Helen Meng: Adversarial Speaker Disentanglement Using Unannotated External Data for Self-supervised Representation Based Voice Conversion. CoRR abs/2305.09167 (2023)
- [i74] Xingchen Song, Di Wu, Binbin Zhang, Zhendong Peng, Bo Dang, Fuping Pan, Zhiyong Wu: ZeroPrompt: Streaming Acoustic Encoders are Zero-Shot Masked LMs. CoRR abs/2305.10649 (2023)
- [i73] Sicheng Yang, Zhiyong Wu, Minglei Li, Zhensong Zhang, Lei Hao, Weihong Bao, Haolin Zhuang: QPGesture: Quantization-Based and Phase-Guided Motion Matching for Natural Speech-Driven Gesture Generation. CoRR abs/2305.11094 (2023)
- [i72] Xiang Li, Songxiang Liu, Max W. Y. Lam, Zhiyong Wu, Chao Weng, Helen Meng: Diverse and Expressive Speech Prosody Prediction with Denoising Diffusion Probabilistic Model. CoRR abs/2305.16749 (2023)
- [i71] Wenzhe Liu, Yupeng Shi, Jun Chen, Wei Rao, Shulin He, Andong Li, Yannan Wang, Zhiyong Wu: Gesper: A Restoration-Enhancement Framework for General Speech Reconstruction. CoRR abs/2306.08454 (2023)
- [i70] Jiuxin Lin, Xinyu Cai, Heinrich Dinkel, Jun Chen, Zhiyong Yan, Yongqing Wang, Junbo Zhang, Zhiyong Wu, Yujun Wang, Helen Meng: AV-SepFormer: Cross-Attention SepFormer for Audio-Visual Target Speaker Extraction. CoRR abs/2306.14170 (2023)
- [i69] Jiuxin Lin, Peng Wang, Heinrich Dinkel, Jun Chen, Zhiyong Wu, Zhiyong Yan, Yongqing Wang, Junbo Zhang, Yujun Wang: Focus on the Sound around You: Monaural Target Speaker Extraction via Distance and Speaker Information. CoRR abs/2306.16241 (2023)
- [i68] Jun Chen, Wei Rao, Zilin Wang, Jiuxin Lin, Yukai Ju, Shulin He, Yannan Wang, Zhiyong Wu: MC-SpEx: Towards Effective Speaker Extraction with Multi-Scale Interfusion and Conditional Speaker Modulation. CoRR abs/2306.16250 (2023)
- [i67] Shun Lei, Yixuan Zhou, Liyang Chen, Zhiyong Wu, Xixin Wu, Shiyin Kang, Helen Meng: MSStyleTTS: Multi-Scale Style Modeling with Hierarchical Context Information for Expressive Speech Synthesis. CoRR abs/2307.16012 (2023)
- [i66] Liyang Chen, Zhiyong Wu, Runnan Li, Weihong Bao, Jun Ling, Xu Tan, Sheng Zhao: VAST: Vivify Your Talking Avatar via Zero-Shot Expressive Facial Style Transfer. CoRR abs/2308.04830 (2023)
- [i65] Sicheng Yang, Haiwei Xue, Zhensong Zhang, Minglei Li, Zhiyong Wu, Xiaofei Wu, Songcen Xu, Zonghong Dai: The DiffuseStyleGesture+ entry to the GENEA Challenge 2023. CoRR abs/2308.13879 (2023)
- [i64] Yi Meng, Xiang Li, Zhiyong Wu, Tingtian Li, Zixun Sun, Xinyu Xiao, Chi Sun, Hui Zhan, Helen Meng: CALM: Contrastive Cross-modal Speaking Style Modeling for Expressive Text-to-Speech Synthesis. CoRR abs/2308.16021 (2023)
- [i63] Jie Chen, Xingchen Song, Zhendong Peng, Binbin Zhang, Fuping Pan, Zhiyong Wu: LightGrad: Lightweight Diffusion Probabilistic Model for Text-to-Speech. CoRR abs/2308.16569 (2023)
- [i62] Jie Chen, Changhe Song, Deyi Tuo, Xixin Wu, Shiyin Kang, Zhiyong Wu, Helen Meng: Improving Mandarin Prosodic Structure Prediction with Multi-level Contextual Information. CoRR abs/2308.16577 (2023)
- [i61] Weiqin Li, Shun Lei, Qiaochu Huang, Yixuan Zhou, Zhiyong Wu, Shiyin Kang, Helen Meng: Towards Spontaneous Style Modeling with Semi-supervised Pre-training for Conversational Text-to-Speech Synthesis. CoRR abs/2308.16593 (2023)
- [i60] Shaohuan Zhou, Shun Lei, Weiya You, Deyi Tuo, Yuren You, Zhiyong Wu, Shiyin Kang, Helen Meng: Towards Improving the Expressiveness of Singing Voice Synthesis with BERT Derived Semantic Information. CoRR abs/2308.16836 (2023)
- [i59] Shaohuan Zhou, Xu Li, Zhiyong Wu, Ying Shan, Helen Meng: Enhancing the vocal range of single-speaker singing voice synthesis with melody-unsupervised pre-training. CoRR abs/2309.00284 (2023)
- [i58] Jiaxu Zhu, Changhe Song, Zhiyong Wu, Helen M. Meng: SememeASR: Boosting Performance of End-to-End Speech Recognition against Domain and Long-Tailed Data Shift with Sememe Semantic Knowledge. CoRR abs/2309.01437 (2023)
- [i57] Jiaxu Zhu, Weinan Tong, Yaoxun Xu, Changhe Song, Zhiyong Wu, Zhao You, Dan Su, Dong Yu, Helen M. Meng: Text-Only Domain Adaptation for End-to-End Speech Recognition through Down-Sampling Acoustic Representation. CoRR abs/2309.02459 (2023)
- [i56] Sicheng Yang, Zilin Wang, Zhiyong Wu, Minglei Li, Zhensong Zhang, Qiaochu Huang, Lei Hao, Songcen Xu, Xiaofei Wu, Changpeng Yang, Zonghong Dai: UnifiedGesture: A Unified Gesture Synthesis Model for Multiple Skeletons. CoRR abs/2309.07051 (2023)
- [i55] Sipan Li, Songxiang Liu, Luwen Zhang, Xiang Li, Yanyao Bian, Chao Weng, Zhiyong Wu, Helen Meng: SnakeGAN: A Universal Vocoder Leveraging DDSP Prior Knowledge and Periodic Inductive Bias. CoRR abs/2309.07803 (2023)
- [i54] Xianhao Wei, Jia Jia, Xiang Li, Zhiyong Wu, Ziyi Wang: A Discourse-level Multi-scale Prosodic Model for Fine-grained Emotion Analysis. CoRR abs/2309.11849 (2023)
- [i53] Shun Lei, Yixuan Zhou, Liyang Chen, Dan Luo, Zhiyong Wu, Xixin Wu, Shiyin Kang, Tao Jiang, Yahui Zhou, Yuxing Han, Helen Meng: Improving Language Model-Based Zero-Shot Text-to-Speech Synthesis with Multi-Scale Acoustic Prompts. CoRR abs/2309.11977 (2023)
- [i52] Liyang Chen, Weihong Bao, Shun Lei, Boshi Tang, Zhiyong Wu, Shiyin Kang, Haozhi Huang: AdaMesh: Personalized Facial Expressions and Head Poses for Speech-Driven 3D Facial Animation. CoRR abs/2310.07236 (2023)
- [i51] Yuanyuan Wang, Yang Zhang, Zhiyong Wu, Zhihan Yang, Tao Wei, Kun Zou, Helen Meng: DASA: Difficulty-Aware Semantic Augmentation for Speaker Verification. CoRR abs/2310.12111 (2023)
- [i50] Binzhu Sha, Xu Li, Zhiyong Wu, Ying Shan, Helen Meng: Neural Concatenative Singing Voice Conversion: Rethinking Concatenation-Based Approach for One-Shot Singing Voice Conversion. CoRR abs/2312.04919 (2023)
- [i49] Boshi Tang, Jianan Wang, Zhiyong Wu, Lei Zhang: Stable Score Distillation for High-Quality 3D Generation. CoRR abs/2312.09305 (2023)
- [i48] Yaoxun Xu, Hangting Chen, Jianwei Yu, Qiaochu Huang, Zhiyong Wu, Shi-Xiong Zhang, Guangzhi Li, Yi Luo, Rongzhi Gu: SECap: Speech Emotion Captioning with Large Language Model. CoRR abs/2312.10381 (2023)
- [i47] Zilin Wang, Haolin Zhuang, Lu Li, Yinmin Zhang, Junjie Zhong, Jun Chen, Yu Yang, Boshi Tang, Zhiyong Wu: Explore 3D Dance Generation via Reward Model from Automatically-Ranked Demonstrations. CoRR abs/2312.11442 (2023)
- [i46] Boshi Tang, Zhiyong Wu, Xixin Wu, Qiaochu Huang, Jun Chen, Shun Lei, Helen Meng: SimCalib: Graph Neural Network Calibration based on Similarity between Nodes. CoRR abs/2312.11858 (2023)
- [i45] Xueyuan Chen, Xi Wang, Shaofei Zhang, Lei He, Zhiyong Wu, Xixin Wu, Helen Meng: StyleSpeech: Self-supervised Style Enhancing with VQ-VAE-based Pre-training for Expressive Audiobook Speech Synthesis. CoRR abs/2312.12181 (2023)
- [i44] Yuanyuan Wang, Hangting Chen, Dongchao Yang, Jianwei Yu, Chao Weng, Zhiyong Wu, Helen Meng: Consistent and Relevant: Rethink the Query Embedding in General Sound Separation. CoRR abs/2312.15463 (2023)
- [i43] Haiwei Xue, Sicheng Yang, Zhensong Zhang, Zhiyong Wu, Minglei Li, Zonghong Dai, Helen M. Meng: Conversational Co-Speech Gesture Generation via Modeling Dialog Intention, Emotion, and Context with Diffusion Models. CoRR abs/2312.15567 (2023)
- 2022
- [j9] Haibin Wu, Xu Li, Andy T. Liu, Zhiyong Wu, Helen Meng, Hung-Yi Lee: Improving the Adversarial Robustness for Speaker Verification by Self-Supervised Learning. IEEE ACM Trans. Audio Speech Lang. Process. 30: 202-217 (2022)
- [c140] Xueyuan Chen, Shun Lei, Zhiyong Wu, Dong Xu, Weifeng Zhao, Helen Meng: Unsupervised Multi-scale Expressive Speaking Style Modeling with Hierarchical Context Information for Audiobook Speech Synthesis. COLING 2022: 7193-7202
- [c139] Haibin Wu, Po-Chun Hsu, Ji Gao, Shanshan Zhang, Shen Huang, Jian Kang, Zhiyong Wu, Helen Meng, Hung-Yi Lee: Adversarial Sample Detection for Speaker Verification by Neural Vocoders. ICASSP 2022: 236-240
- [c138] Wenxuan Ye, Shaoguang Mao, Frank K. Soong, Wenshan Wu, Yan Xia, Jonathan Tien, Zhiyong Wu: An Approach to Mispronunciation Detection and Diagnosis with Acoustic, Phonetic and Linguistic (APL) Embeddings. ICASSP 2022: 6827-6831
- [c137] Xixin Wu, Shoukang Hu, Zhiyong Wu, Xunying Liu, Helen Meng: Neural Architecture Search for Speech Emotion Recognition. ICASSP 2022: 6902-6906
- [c136] Xintao Zhao, Feng Liu, Changhe Song, Zhiyong Wu, Shiyin Kang, Deyi Tuo, Helen Meng: Disentangling Content and Fine-Grained Prosody Information Via Hybrid ASR Bottleneck Features for Voice Conversion. ICASSP 2022: 7022-7026
- [c135] Wenlin Dai, Changhe Song, Xiang Li, Zhiyong Wu, Huashan Pan, Xiulin Li, Helen Meng: An End-to-End Chinese Text Normalization Model Based on Rule-Guided Flat-Lattice Transformer. ICASSP 2022: 7122-7126
- [c134] Liyang Chen, Zhiyong Wu, Jun Ling, Runnan Li, Xu Tan, Sheng Zhao: Transformer-S2A: Robust and Efficient Speech-to-Animation. ICASSP 2022: 7247-7251
- [c133] Xueyuan Chen, Changhe Song, Yixuan Zhou, Zhiyong Wu, Changbin Chen, Zhongqin Wu, Helen Meng: A Character-Level Span-Based Model for Mandarin Prosodic Structure Prediction. ICASSP 2022: 7602-7606
- [c132] Jun Chen, Zilin Wang, Deyi Tuo, Zhiyong Wu, Shiyin Kang, Helen Meng: FullSubNet+: Channel Attention Fullsubnet with Complex Spectrograms for Speech Enhancement. ICASSP 2022: 7857-7861
- [c131] Jingbei Li, Yi Meng, Chenyi Li, Zhiyong Wu, Helen Meng, Chao Weng, Dan Su: Enhancing Speaking Styles in Conversational Text-to-Speech Synthesis with Graph-Based Multi-Modal Context Modeling. ICASSP 2022: 7917-7921
- [c130] Shun Lei, Yixuan Zhou, Liyang Chen, Zhiyong Wu, Shiyin Kang, Helen Meng: Towards Expressive Speaking Style Modelling with Hierarchical Context Information for Mandarin Speech Synthesis. ICASSP 2022: 7922-7926
- [c129] Jingbei Li, Yi Meng, Zhiyong Wu, Helen Meng, Qiao Tian, Yuping Wang, Yuxuan Wang: Neufa: Neural Network Based End-to-End Forced Alignment with Bidirectional Attention Mechanism. ICASSP 2022: 8007-8011
- [c128] Yulan Chen, Zhiyong Wu, Zheyan Shen, Jia Jia: Learning from Designers: Fashion Compatibility Analysis Via Dataset Distillation. ICIP 2022: 856-860
- [c127] Sicheng Yang, Zhiyong Wu, Minglei Li, Mengchen Zhao, Jiuxin Lin, Liyang Chen, Weihong Bao: The ReprGesture entry to the GENEA Challenge 2022. ICMI 2022: 758-763
- [c126] Zhihan Yang, Zhiyong Wu, Jia Jia: Speaker Characteristics Guided Speech Synthesis. IJCNN 2022: 1-8
- [c125] Yang Zhang, Zhiqiang Lv, Haibin Wu, Shanshan Zhang, Pengfei Hu, Zhiyong Wu, Hung-yi Lee, Helen Meng: MFA-Conformer: Multi-scale Feature Aggregation Conformer for Automatic Speaker Verification. INTERSPEECH 2022: 306-310
- [c124] Jie Chen, Changhe Song, Deyi Tuo, Xixin Wu, Shiyin Kang, Zhiyong Wu, Helen Meng: Improving Mandarin Prosodic Structure Prediction with Multi-level Contextual Information. INTERSPEECH 2022: 426-430
- [c123] Jun Chen, Wei Rao, Zilin Wang, Zhiyong Wu, Yannan Wang, Tao Yu, Shidong Shang, Helen Meng: Speech Enhancement with Fullband-Subband Cross-Attention Network. INTERSPEECH 2022: 976-980
- [c122] Sicheng Yang, Methawee Tantrawenith, Haolin Zhuang, Zhiyong Wu, Aolan Sun, Jianzong Wang, Ning Cheng, Huaizhen Tang, Xintao Zhao, Jie Wang, Helen Meng: Speech Representation Disentanglement with Adversarial Mutual Information Learning for One-shot Voice Conversion. INTERSPEECH 2022: 2553-2557
- [c121] Yixuan Zhou, Changhe Song, Xiang Li, Luwen Zhang, Zhiyong Wu, Yanyao Bian, Dan Su, Helen Meng: Content-Dependent Fine-Grained Speaker Embedding for Zero-Shot Speaker Adaptation in Text-to-Speech Synthesis. INTERSPEECH 2022: 2573-2577
- [c120] Shaohuan Zhou, Shun Lei, Weiya You, Deyi Tuo, Yuren You, Zhiyong Wu, Shiyin Kang, Helen Meng: Towards Improving the Expressiveness of Singing Voice Synthesis with BERT Derived Semantic Information. INTERSPEECH 2022: 4292-4296
- [c119] Yixuan Zhou, Changhe Song, Jingbei Li, Zhiyong Wu, Yanyao Bian, Dan Su, Helen Meng: Enhancing Word-Level Semantic Representation via Dependency Structure for Expressive Text-to-Speech Synthesis. INTERSPEECH 2022: 5518-5522
- [c118] Shun Lei, Yixuan Zhou, Liyang Chen, Jiankun Hu, Zhiyong Wu, Shiyin Kang, Helen Meng: Towards Multi-Scale Speaking Style Modelling with Hierarchical Context Information for Mandarin Speech Synthesis. INTERSPEECH 2022: 5523-5527
- [c117] Xiang Li, Changhe Song, Xianhao Wei, Zhiyong Wu, Jia Jia, Helen Meng: Towards Cross-speaker Reading Style Transfer on Audiobook Dataset. INTERSPEECH 2022: 5528-5532
- [c116] Yi Meng, Xiang Li, Zhiyong Wu, Tingtian Li, Zixun Sun, Xinyu Xiao, Chi Sun, Hui Zhan, Helen Meng: CALM: Contrastive Cross-modal Speaking Style Modeling for Expressive Text-to-Speech Synthesis. INTERSPEECH 2022: 5533-5537
- [c115] Xueyuan Chen, Qiaochu Huang, Xixin Wu, Zhiyong Wu, Helen Meng: HILvoice: Human-in-the-Loop Style Selection for Elder-Facing Speech Synthesis. ISCSLP 2022: 86-90
- [c114] Chenyi Li, Zhiyong Wu, Wei Rao, Yannan Wang, Helen Meng: Boosting the Performance of SpEx+ by Attention and Contextual Mechanism. ISCSLP 2022: 135-139
- [c113] Jingbei Li, Yi Meng, Xixin Wu, Zhiyong Wu, Jia Jia, Helen Meng, Qiao Tian, Yuping Wang, Yuxuan Wang: Inferring Speaking Styles from Multi-modal Conversational Context by Multi-scale Relational Graph Convolutional Networks. ACM Multimedia 2022: 5811-5820
- [c112] Haibin Wu, Jiawen Kang, Lingwei Meng, Yang Zhang, Xixin Wu, Zhiyong Wu, Hung-yi Lee, Helen Meng: Tackling Spoofing-Aware Speaker Verification with Multi-Model Fusion. Odyssey 2022: 92-99
- [c111] Hui Lu, Disong Wang, Xixin Wu, Zhiyong Wu, Xunying Liu, Helen Meng: Disentangled Speech Representation Learning for One-Shot Cross-Lingual Voice Conversion Using β-VAE. SLT 2022: 814-821
- [i42] Jun Chen, Zilin Wang, Deyi Tuo, Zhiyong Wu, Shiyin Kang, Helen Meng:
FullSubNet+: Channel Attention FullSubNet with Complex Spectrograms for Speech Enhancement. CoRR abs/2203.12188 (2022) - [i41]Shun Lei, Yixuan Zhou, Liyang Chen, Zhiyong Wu, Shiyin Kang, Helen Meng:
Towards Expressive Speaking Style Modelling with Hierarchical Context Information for Mandarin Speech Synthesis. CoRR abs/2203.12201 (2022) - [i40]Xintao Zhao, Feng Liu, Changhe Song, Zhiyong Wu, Shiyin Kang, Deyi Tuo, Helen Meng:
Disentangleing Content and Fine-grained Prosody Information via Hybrid ASR Bottleneck Features for Voice Conversion. CoRR abs/2203.12813 (2022) - [i39]Yang Zhang, Zhiqiang Lv, Haibin Wu, Shanshan Zhang, Pengfei Hu, Zhiyong Wu, Hung-Yi Lee, Helen Meng:
MFA-Conformer: Multi-scale Feature Aggregation Conformer for Automatic Speaker Verification. CoRR abs/2203.15249 (2022) - [i38]Jingbei Li, Yi Meng, Zhiyong Wu, Helen Meng, Qiao Tian, Yuping Wang, Yuxuan Wang:
NeuFA: Neural Network Based End-to-End Forced Alignment with Bidirectional Attention Mechanism. CoRR abs/2203.16838 (2022) - [i37]Xueyuan Chen, Changhe Song, Yixuan Zhou, Zhiyong Wu, Changbin Chen, Zhongqin Wu, Helen Meng:
A Character-level Span-based Model for Mandarin Prosodic Structure Prediction. CoRR abs/2203.16922 (2022) - [i36]Xixin Wu, Shoukang Hu, Zhiyong Wu, Xunying Liu, Helen Meng:
Neural Architecture Search for Speech Emotion Recognition. CoRR abs/2203.16928 (2022) - [i35]Wenlin Dai, Changhe Song, Xiang Li, Zhiyong Wu, Huashan Pan, Xiulin Li, Helen Meng:
An End-to-end Chinese Text Normalization Model based on Rule-guided Flat-Lattice Transformer. CoRR abs/2203.16954 (2022) - [i34]Yixuan Zhou, Changhe Song, Xiang Li, Luwen Zhang, Zhiyong Wu, Yanyao Bian, Dan Su, Helen Meng:
Content-Dependent Fine-Grained Speaker Embedding for Zero-Shot Speaker Adaptation in Text-to-Speech Synthesis. CoRR abs/2204.00990 (2022) - [i33]Shun Lei, Yixuan Zhou, Liyang Chen, Jiankun Hu, Zhiyong Wu, Shiyin Kang, Helen Meng:
Towards Multi-Scale Speaking Style Modelling with Hierarchical Context Information for Mandarin Speech Synthesis. CoRR abs/2204.02743 (2022) - [i32]Haibin Wu, Jiawen Kang, Lingwei Meng, Yang Zhang, Xixin Wu, Zhiyong Wu, Hung-yi Lee, Helen Meng:
Tackling Spoofing-Aware Speaker Verification with Multi-Model Fusion. CoRR abs/2206.09131 (2022) - [i31]Bin Su, Shaoguang Mao, Frank K. Soong, Zhiyong Wu:
Ordinal Regression via Binary Preference vs Simple Regression: Statistical and Experimental Perspectives. CoRR abs/2207.02454 (2022) - [i30]Xiang Li, Changhe Song, Xianhao Wei, Zhiyong Wu, Jia Jia, Helen Meng:
Towards Cross-speaker Reading Style Transfer on Audiobook Dataset. CoRR abs/2208.05359 (2022) - [i29]Sicheng Yang, Methawee Tantrawenith, Haolin Zhuang, Zhiyong Wu, Aolan Sun, Jianzong Wang, Ning Cheng, Huaizhen Tang, Xintao Zhao, Jie Wang, Helen Meng:
Speech Representation Disentanglement with Adversarial Mutual Information Learning for One-shot Voice Conversion. CoRR abs/2208.08757 (2022) - [i28]Sicheng Yang, Zhiyong Wu, Minglei Li, Mengchen Zhao, Jiuxin Lin, Liyang Chen, Weihong Bao:
The ReprGesture entry to the GENEA Challenge 2022. CoRR abs/2208.12133 (2022) - [i27]Hui Lu, Disong Wang, Xixin Wu, Zhiyong Wu, Xunying Liu, Helen Meng:
Disentangled Speech Representation Learning for One-Shot Cross-lingual Voice Conversion Using β-VAE. CoRR abs/2210.13771 (2022) - [i26]Xingchen Song, Di Wu, Binbin Zhang, Zhiyong Wu, Wenpeng Li, Dongfang Li, Pengshen Zhang, Zhendong Peng, Fuping Pan, Changbao Zhu, Zhongqin Wu:
FusionFormer: Fusing Operations in Transformer for Efficient Streaming Speech Recognition. CoRR abs/2210.17079 (2022) - [i25]Xingchen Song, Di Wu, Zhiyong Wu, Binbin Zhang, Yuekai Zhang, Zhendong Peng, Wenpeng Li, Fuping Pan, Changbao Zhu:
TrimTail: Low-Latency Streaming ASR with Simple but Effective Spectrogram-Level Length Penalty. CoRR abs/2211.00522 (2022) - [i24]Jun Chen, Wei Rao, Zilin Wang, Zhiyong Wu, Yannan Wang, Tao Yu, Shidong Shang, Helen Meng:
Speech Enhancement with Fullband-Subband Cross-Attention Network. CoRR abs/2211.05432 (2022) - 2021
- [j8]Xixin Wu, Yuewen Cao, Hui Lu, Songxiang Liu, Shiyin Kang, Zhiyong Wu, Xunying Liu, Helen Meng:
Exemplar-Based Emotive Speech Synthesis. IEEE ACM Trans. Audio Speech Lang. Process. 29: 874-886 (2021) - [j7]Xixin Wu, Yuewen Cao, Hui Lu, Songxiang Liu, Disong Wang, Zhiyong Wu, Xunying Liu, Helen Meng:
Speech Emotion Recognition Using Sequential Capsule Networks. IEEE ACM Trans. Audio Speech Lang. Process. 29: 3280-3291 (2021) - [c110]Suping Zhou, Jia Jia, Zhiyong Wu, Zhihan Yang, Yanfeng Wang, Wei Chen, Fanbo Meng, Shuo Huang, Jialie Shen, Xiaochuan Wang:
Inferring Emotion from Large-scale Internet Voice Data: A Semi-supervised Curriculum Augmentation based Deep Learning Approach. AAAI 2021: 6039-6047 - [c109]Huirong Huang, Zhiyong Wu, Shiyin Kang, Dongyang Dai, Jia Jia, Tianxiao Fu, Deyi Tuo, Guangzhi Lei, Peng Liu, Dan Su, Dong Yu, Helen Meng:
Speaker Independent and Multilingual/Mixlingual Speech-Driven Talking Head Generation Using Phonetic Posteriorgrams. APSIPA ASC 2021: 1433-1437 - [c108]Aolan Sun, Jianzong Wang, Ning Cheng, Methawee Tantrawenith, Zhiyong Wu, Helen Meng, Edward Xiao, Jing Xiao:
Reconstructing Dual Learning for Neural Voice Conversion Using Relatively Few Samples. ASRU 2021: 946-953 - [c107]Yaohua Bu, Tianyi Ma, Weijun Li, Hang Zhou, Jia Jia, Shengqi Chen, Kaiyuan Xu, Dachuan Shi, Haozhe Wu, Zhihan Yang, Kun Li, Zhiyong Wu, Yuanchun Shi, Xiaobo Lu, Ziwei Liu:
PTeacher: a Computer-Aided Personalized Pronunciation Training System with Exaggerated Audio-Visual Corrective Feedback. CHI 2021: 676:1-676:14 - [c106]Yingmei Guo, Linjun Shou, Jian Pei, Ming Gong, Mingxing Xu, Zhiyong Wu, Daxin Jiang:
Learning from Multiple Noisy Augmented Data Sets for Better Cross-Lingual Spoken Language Understanding. EMNLP (1) 2021: 3226-3237 - [c105]Xiong Cai, Dongyang Dai, Zhiyong Wu, Xiang Li, Jingbei Li, Helen Meng:
Emotion Controllable Speech Synthesis Using Emotion-Unlabeled Dataset with the Assistance of Cross-Domain Speech Emotion Recognition. ICASSP 2021: 5734-5738 - [c104]Xingchen Song, Zhiyong Wu, Yiheng Huang, Chao Weng, Dan Su, Helen M. Meng:
Non-Autoregressive Transformer ASR with CTC-Enhanced Decoder Input. ICASSP 2021: 5894-5898 - [c103]Changhe Song, Jingbei Li, Yixuan Zhou, Zhiyong Wu, Helen M. Meng:
Syntactic Representation Learning For Neural Network Based TTS with Syntactic Parse Tree Traversal. ICASSP 2021: 6064-6068 - [c102]Haibin Wu, Xu Li, Andy T. Liu, Zhiyong Wu, Helen Meng, Hung-yi Lee:
Adversarial Defense for Automatic Speaker Verification by Cascaded Self-Supervised Learning Models. ICASSP 2021: 6718-6722 - [c101]Bin Su, Shaoguang Mao, Frank K. Soong, Yan Xia, Jonathan Tien, Zhiyong Wu:
Improving Pronunciation Assessment Via Ordinal Regression with Anchored Reference Samples. ICASSP 2021: 7748-7752 - [c100]Jie Wang, Yuren You, Feng Liu, Deyi Tuo, Shiyin Kang, Zhiyong Wu, Helen Meng:
The Huya Multi-Speaker and Multi-Style Speech Synthesis System for M2voc Challenge 2020. ICASSP 2021: 8608-8612 - [c99]Jie Wang, Jingbei Li, Xintao Zhao, Zhiyong Wu, Shiyin Kang, Helen Meng:
Adversarially Learning Disentangled Speech Representations for Robust Multi-Factor Voice Conversion. Interspeech 2021: 846-850 - [c98]Hui Lu, Zhiyong Wu, Xixin Wu, Xu Li, Shiyin Kang, Xunying Liu, Helen Meng:
VAENAR-TTS: Variational Auto-Encoder Based Non-AutoRegressive Text-to-Speech Synthesis. Interspeech 2021: 3775-3779 - [c97]Haibin Wu, Yang Zhang, Zhiyong Wu, Dong Wang, Hung-yi Lee:
Voting for the Right Answer: Adversarial Defense for Speaker Verification. Interspeech 2021: 4294-4298 - [c96]Xiang Li, Changhe Song, Jingbei Li, Zhiyong Wu, Jia Jia, Helen Meng:
Towards Multi-Scale Style Control for Expressive Speech Synthesis. Interspeech 2021: 4673-4677 - [c95]Xiong Cai, Zhiyong Wu, Kuo Zhong, Bin Su, Dongyang Dai, Helen Meng:
Unsupervised Cross-Lingual Speech Emotion Recognition Using Domain Adversarial Neural Network. ISCSLP 2021: 1-5 - [c94]Liangqi Liu, Jiankun Hu, Zhiyong Wu, Song Yang, Songfan Yang, Jia Jia, Helen Meng:
Controllable Emphatic Speech Synthesis based on Forward Attention for Expressive Speech Synthesis. SLT 2021: 410-414 - [i23]Jie Wang, Jingbei Li, Xintao Zhao, Zhiyong Wu, Helen Meng:
Adversarially learning disentangled speech representations for robust multi-factor voice conversion. CoRR abs/2102.00184 (2021) - [i22]Haibin Wu, Xu Li, Andy T. Liu, Zhiyong Wu, Helen Meng, Hung-yi Lee:
Adversarial defense for automatic speaker verification by cascaded self-supervised learning models. CoRR abs/2102.07047 (2021) - [i21]Xiang Li, Changhe Song, Jingbei Li, Zhiyong Wu, Jia Jia, Helen M. Meng:
Towards Multi-Scale Style Control for Expressive Speech Synthesis. CoRR abs/2104.03521 (2021) - [i20]Yixuan Zhou, Changhe Song, Jingbei Li, Zhiyong Wu, Helen Meng:
Dependency Parsing based Semantic Representation Learning with Graph Neural Network for Enhancing Expressiveness of Text-to-Speech. CoRR abs/2104.06835 (2021) - [i19]Yaohua Bu, Tianyi Ma, Weijun Li, Hang Zhou, Jia Jia, Shengqi Chen, Kaiyuan Xu, Dachuan Shi, Haozhe Wu, Zhihan Yang, Kun Li, Zhiyong Wu, Yuanchun Shi, Xiaobo Lu, Ziwei Liu:
PTeacher: a Computer-Aided Personalized Pronunciation Training System with Exaggerated Audio-Visual Corrective Feedback. CoRR abs/2105.05182 (2021) - [i18]Haibin Wu, Xu Li, Andy T. Liu, Zhiyong Wu, Helen Meng, Hung-yi Lee:
Improving the Adversarial Robustness for Speaker Verification by Self-Supervised Learning. CoRR abs/2106.00273 (2021) - [i17]Jingbei Li, Yi Meng, Chenyi Li, Zhiyong Wu, Helen Meng, Chao Weng, Dan Su:
Spoken Style Learning with Multi-modal Hierarchical Context Encoding for Conversational Text-to-Speech Synthesis. CoRR abs/2106.06233 (2021) - [i16]Haibin Wu, Yang Zhang, Zhiyong Wu, Dong Wang, Hung-yi Lee:
Voting for the right answer: Adversarial defense for speaker verification. CoRR abs/2106.07868 (2021) - [i15]Haibin Wu, Po-Chun Hsu, Ji Gao, Shanshan Zhang, Shen Huang, Jian Kang, Zhiyong Wu, Helen Meng, Hung-yi Lee:
Spotting adversarial samples for speaker verification by neural vocoders. CoRR abs/2107.00309 (2021) - [i14]Hui Lu, Zhiyong Wu, Xixin Wu, Xu Li, Shiyin Kang, Xunying Liu, Helen Meng:
VAENAR-TTS: Variational Auto-Encoder based Non-AutoRegressive Text-to-Speech Synthesis. CoRR abs/2107.03298 (2021) - [i13]Yingmei Guo, Linjun Shou, Jian Pei, Ming Gong, Mingxing Xu, Zhiyong Wu, Daxin Jiang:
Learning from Multiple Noisy Augmented Data Sets for Better Cross-Lingual Spoken Language Understanding. CoRR abs/2109.01583 (2021) - [i12]Wenxuan Ye, Shaoguang Mao, Frank K. Soong, Wenshan Wu, Yan Xia, Jonathan Tien, Zhiyong Wu:
An Approach to Mispronunciation Detection and Diagnosis with Acoustic, Phonetic and Linguistic (APL) Embeddings. CoRR abs/2110.07274 (2021) - [i11]Liyang Chen, Zhiyong Wu, Jun Ling, Runnan Li, Xu Tan, Sheng Zhao:
Transformer-S2A: Robust and Efficient Speech-to-Animation. CoRR abs/2111.09771 (2021) - 2020
- [c93]Songxiang Liu, Disong Wang, Yuewen Cao, Lifa Sun, Xixin Wu, Shiyin Kang, Zhiyong Wu, Xunying Liu, Dan Su, Dong Yu, Helen Meng:
End-To-End Accent Conversion Without Using Native Utterances. ICASSP 2020: 6289-6293 - [c92]Yuewen Cao, Songxiang Liu, Xixin Wu, Shiyin Kang, Peng Liu, Zhiyong Wu, Xunying Liu, Dan Su, Dong Yu, Helen Meng:
Code-Switched Speech Synthesis Using Bilingual Phonetic Posteriorgram with Only Monolingual Corpora. ICASSP 2020: 7619-7623 - [c91]Michael Lao BanTeng, Zhiyong Wu:
Channel-Wise Dense Connection Graph Convolutional Network for Skeleton-Based Action Recognition. ICPR 2020: 3799-3806 - [c90]Yingmei Guo, Zhiyong Wu, Mingxing Xu:
FERNet: Fine-grained Extraction and Reasoning Network for Emotion Recognition in Dialogues. AACL/IJCNLP 2020: 37-43 - [c89]Xingcheng Song, Zhiyong Wu, Yiheng Huang, Dan Su, Helen Meng:
SpecSwap: A Simple Data Augmentation Method for End-to-End Speech Recognition. INTERSPEECH 2020: 581-585 - [c88]Kun Zhang, Zhiyong Wu, Daode Yuan, Jian Luan, Jia Jia, Helen Meng, Binheng Song:
Re-Weighted Interval Loss for Handling Data Imbalance Problem of End-to-End Keyword Spotting. INTERSPEECH 2020: 2567-2571 - [c87]Xiangyu Liang, Zhiyong Wu, Runnan Li, Yanqing Liu, Sheng Zhao, Helen Meng:
Enhancing Monotonicity for Robust Autoregressive Transformer TTS. INTERSPEECH 2020: 3181-3185 - [c86]Xingchen Song, Guangsen Wang, Yiheng Huang, Zhiyong Wu, Dan Su, Helen Meng:
Speech-XLNet: Unsupervised Acoustic Model Pretraining for Self-Attention Networks. INTERSPEECH 2020: 3765-3769 - [i10]Dongyang Dai, Li Chen, Yuping Wang, Mu Wang, Rui Xia, Xuchen Song, Zhiyong Wu, Yuxuan Wang:
Noise Robust TTS for Low Resource Speakers using Pre-trained Model and Speech Enhancement. CoRR abs/2005.12531 (2020) - [i9]Huirong Huang, Zhiyong Wu, Shiyin Kang, Dongyang Dai, Jia Jia, Tianxiao Fu, Deyi Tuo, Guangzhi Lei, Peng Liu, Dan Su, Dong Yu, Helen Meng:
Speaker Independent and Multilingual/Mixlingual Speech-Driven Talking Head Generation Using Phonetic Posteriorgrams. CoRR abs/2006.11610 (2020) - [i8]Bin Su, Shaoguang Mao, Frank K. Soong, Yan Xia, Jonathan Tien, Zhiyong Wu:
Improving pronunciation assessment via ordinal regression with anchored reference samples. CoRR abs/2010.13339 (2020) - [i7]Xiong Cai, Dongyang Dai, Zhiyong Wu, Xiang Li, Jingbei Li, Helen Meng:
Emotion controllable speech synthesis using emotion-unlabeled dataset with the assistance of cross-domain speech emotion recognition. CoRR abs/2010.13350 (2020) - [i6]Xingchen Song, Zhiyong Wu, Yiheng Huang, Chao Weng, Dan Su, Helen Meng:
Non-Autoregressive Transformer ASR with CTC-Enhanced Decoder Input. CoRR abs/2010.15025 (2020) - [i5]Changhe Song, Jingbei Li, Yixuan Zhou, Zhiyong Wu, Helen M. Meng:
Syntactic representation learning for neural network based TTS with syntactic parse tree traversal. CoRR abs/2012.06971 (2020) - [i4]Xiong Cai, Zhiyong Wu, Kuo Zhong, Bin Su, Dongyang Dai, Helen Meng:
Unsupervised Cross-Lingual Speech Emotion Recognition Using DomainAdversarial Neural Network. CoRR abs/2012.11174 (2020)
2010 – 2019
- 2019
- [c85]Yingmei Guo, Mingxing Xu, Zhiyong Wu, Jianming Wu, Bin Su:
Multi-Scale Convolutional Recurrent Neural Network with Ensemble Method for Weakly Labeled Sound Event Detection. ACII Workshops 2019: 1-5 - [c84]Yao Du, Zhiyong Wu, Shiyin Kang, Dan Su, Dong Yu, Helen Meng:
Prosodic Structure Prediction using Deep Self-attention Neural Network. APSIPA 2019: 320-324 - [c83]Liangqi Liu, Zhiyong Wu, Runnan Li, Jia Jia, Helen Meng:
Learning Contextual Representation with Convolution Bank and Multi-head Self-attention for Speech Emphasis Detection. APSIPA 2019: 922-926 - [c82]Yao Du, Zhiyong Wu, Shiyin Kang, Dan Su, Dong Yu, Helen Meng:
Automatic Prosodic Structure Labeling using DNN-BGRU-CRF Hybrid Neural Network. APSIPA 2019: 1234-1238 - [c81]Kun Zhang, Zhiyong Wu, Jia Jia, Helen M. Meng, Binheng Song:
Query-by-Example Spoken Term Detection using Attentive Pooling Networks. APSIPA 2019: 1267-1272 - [c80]Runnan Li, Zhiyong Wu, Jia Jia, Sheng Zhao, Helen Meng:
Dilated Residual Network with Multi-head Self-attention for Speech Emotion Recognition. ICASSP 2019: 6675-6679 - [c79]Xixin Wu, Songxiang Liu, Yuewen Cao, Xu Li, Jianwei Yu, Dongyang Dai, Xi Ma, Shoukang Hu, Zhiyong Wu, Xunying Liu, Helen Meng:
Speech Emotion Recognition Using Capsule Networks. ICASSP 2019: 6695-6699 - [c78]Hui Lu, Zhiyong Wu, Runnan Li, Shiyin Kang, Jia Jia, Helen Meng:
A Compact Framework for Voice Conversion Using Wavenet Conditioned on Phonetic Posteriorgrams. ICASSP 2019: 6810-6814 - [c77]Yuewen Cao, Xixin Wu, Songxiang Liu, Jianwei Yu, Xu Li, Zhiyong Wu, Xunying Liu, Helen Meng:
End-to-end Code-switched TTS with Mix of Monolingual Recordings. ICASSP 2019: 6935-6939 - [c76]Mu Wang, Xixin Wu, Zhiyong Wu, Shiyin Kang, Deyi Tuo, Guangzhi Li, Dan Su, Dong Yu, Helen Meng:
Quasi-fully Convolutional Neural Network with Variational Inference for Speech Synthesis. ICASSP 2019: 7060-7064 - [c75]Dongyang Dai, Zhiyong Wu, Runnan Li, Xixin Wu, Jia Jia, Helen Meng:
Learning Discriminative Features from Spectrograms Using Center Loss for Speech Emotion Recognition. ICASSP 2019: 7405-7409 - [c74]Shaoguang Mao, Zhiyong Wu, Jingshuai Jiang, Peiyun Liu, Frank K. Soong:
NN-based Ordinal Regression for Assessing Fluency of ESL Speech. ICASSP 2019: 7420-7424 - [c73]Yulan Chen, Jia Jia, Zhiyong Wu:
Modeling Emotion Influence Using Attention-based Graph Convolutional Recurrent Network. ICMI 2019: 302-309 - [c72]Runnan Li, Zhiyong Wu, Jia Jia, Yaohua Bu, Sheng Zhao, Helen Meng:
Towards Discriminative Representation Learning for Speech Emotion Recognition. IJCAI 2019: 5060-5066 - [c71]Hui Lu, Zhiyong Wu, Dongyang Dai, Runnan Li, Shiyin Kang, Jia Jia, Helen Meng:
One-Shot Voice Conversion with Global Speaker Embeddings. INTERSPEECH 2019: 669-673 - [c70]Dongyang Dai, Zhiyong Wu, Shiyin Kang, Xixin Wu, Jia Jia, Dan Su, Dong Yu, Helen Meng:
Disambiguation of Chinese Polyphones in an End-to-End Framework with Semantic Features Extracted by Pre-Trained BERT. INTERSPEECH 2019: 2090-2094 - [c69]Jingbei Li, Zhiyong Wu, Runnan Li, Pengpeng Zhi, Song Yang, Helen Meng:
Knowledge-Based Linguistic Encoding for End-to-End Mandarin Text-to-Speech Synthesis. INTERSPEECH 2019: 4494-4498 - [i3]Xingcheng Song, Guangsen Wang, Zhiyong Wu, Yiheng Huang, Dan Su, Dong Yu, Helen Meng:
Speech-XLNet: Unsupervised Acoustic Model Pretraining For Self-Attention Networks. CoRR abs/1910.10387 (2019) - 2018
- [j6]Kun Li, Shaoguang Mao, Xu Li, Zhiyong Wu, Helen Meng:
Automatic lexical stress and pitch accent detection for L2 English speech using multi-distribution deep neural networks. Speech Commun. 96: 28-36 (2018) - [c68]Jingbei Li, Zhiyong Wu, Runnan Li, Mingxing Xu, Kehua Lei, Lianhong Cai:
Multi-modal Multi-scale Speech Expression Evaluation in Computer-Assisted Language Learning. AIMS 2018: 16-28 - [c67]Ziwei Zhu, Zhiyong Wu, Runnan Li, Yishuang Ning, Helen Meng:
Learning Frame-Level Recurrent Neural Networks Representations for Query-by-Example Spoken Term Detection on Mobile Devices. AIMS 2018: 55-66 - [c66]Runnan Li, Zhiyong Wu, Yuchen Huang, Jia Jia, Helen Meng, Lianhong Cai:
Emphatic Speech Generation with Conditioned Input Layer and Bidirectional LSTMS for Expressive Speech Synthesis. ICASSP 2018: 5129-5133 - [c65]Xixin Wu, Lifa Sun, Shiyin Kang, Songxiang Liu, Zhiyong Wu, Xunying Liu, Helen Meng:
Feature Based Adaptation for Speaking Style Synthesis. ICASSP 2018: 5304-5308 - [c64]Shaoguang Mao, Xu Li, Kun Li, Zhiyong Wu, Xunying Liu, Helen Meng:
Unsupervised Discovery of an Extended Phoneme Set in L2 English Speech for Mispronunciation Detection and Diagnosis. ICASSP 2018: 6244-6248 - [c63]Shaoguang Mao, Zhiyong Wu, Runnan Li, Xu Li, Helen Meng, Lianhong Cai:
Applying Multitask Learning to Acoustic-Phonemic Model for Mispronunciation Detection and Diagnosis in L2 English Speech. ICASSP 2018: 6254-6258 - [c62]Shaoguang Mao, Zhiyong Wu, Xu Li, Runnan Li, Xixin Wu, Helen Meng:
Integrating Articulatory Features into Acoustic-Phonemic Model for Mispronunciation Detection and Diagnosis in L2 English Speech. ICME 2018: 1-6 - [c61]Ziwei Zhu, Zhiyong Wu, Runnan Li, Helen Meng, Lianhong Cai:
Siamese Recurrent Auto-Encoder Representation for Query-by-Example Spoken Term Detection. INTERSPEECH 2018: 102-106 - [c60]Shuai Yang, Zhiyong Wu, Binbin Shen, Helen Meng:
Detection of Glottal Closure Instants from Speech Signals: A Convolutional Neural Network Based Method. INTERSPEECH 2018: 317-321 - [c59]Xixin Wu, Yuewen Cao, Mu Wang, Songxiang Liu, Shiyin Kang, Zhiyong Wu, Xunying Liu, Dan Su, Dong Yu, Helen Meng:
Rapid Style Adaptation Using Residual Error Embedding for Expressive Speech Synthesis. INTERSPEECH 2018: 3072-3076 - [c58]Xi Ma, Zhiyong Wu, Jia Jia, Mingxing Xu, Helen Meng, Lianhong Cai:
Emotion Recognition from Variable-Length Speech Segments Using Deep Learning on Spectrograms. INTERSPEECH 2018: 3683-3687 - [c57]Mu Wang, Zhiyong Wu, Shiyin Kang, Xixin Wu, Jia Jia, Dan Su, Dong Yu, Helen Meng:
Speech Super-Resolution Using Parallel WaveNet. ISCSLP 2018: 260-264 - [c56]Runnan Li, Zhiyong Wu, Jia Jia, Jingbei Li, Wei Chen, Helen Meng:
Inferring User Emotive State Changes in Realistic Human-Computer Conversational Dialogs. ACM Multimedia 2018: 136-144 - 2017
- [c55]Yishuang Ning, Jia Jia, Zhiyong Wu, Runnan Li, Yongsheng An, Yanfeng Wang, Helen M. Meng:
Multi-Task Deep Learning for User Intention Understanding in Speech Interaction Systems. AAAI 2017: 161-167 - [c54]Runnan Li, Zhiyong Wu, Xunying Liu, Helen M. Meng, Lianhong Cai:
Multi-task learning of structured output layer bidirectional LSTMS for speech synthesis. ICASSP 2017: 5510-5514 - [c53]Yishuang Ning, Zhiyong Wu, Runnan Li, Jia Jia, Mingxing Xu, Helen M. Meng, Lianhong Cai:
Learning cross-lingual knowledge with multilingual BLSTM for emphasis detection with limited training data. ICASSP 2017: 5615-5619 - [c52]Yuchen Huang, Zhiyong Wu, Runnan Li, Helen Meng, Lianhong Cai:
Multi-Task Learning for Prosodic Structure Generation Using BLSTM RNN with Structured Output Layer. INTERSPEECH 2017: 779-783 - [c51]Xi Ma, Zhiyong Wu, Jia Jia, Mingxing Xu, Helen Meng, Lianhong Cai:
Speech Emotion Recognition with Emotion-Pair Based Framework Considering Emotion Distribution Information in Dimensional Emotion Space. INTERSPEECH 2017: 1238-1242 - [c50]Runnan Li, Zhiyong Wu, Yishuang Ning, Lifa Sun, Helen Meng, Lianhong Cai:
Spectro-Temporal Modelling with Time-Frequency LSTM and Structured Output Layer for Voice Conversion. INTERSPEECH 2017: 3409-3413 - [c49]Song Tang, Zhiyong Wu, Kang Chen:
Movie Recommendation via BLSTM. MMM (2) 2017: 269-279 - 2016
- [c48]Quanjie Yu, Peng Liu, Zhiyong Wu, Shiyin Kang, Helen Meng, Lianhong Cai:
Learning cross-lingual information with multilingual BLSTM for speech synthesis of low-resource languages. ICASSP 2016: 5545-5549 - [c47]Xinyu Lan, Xu Li, Yishuang Ning, Zhiyong Wu, Helen Meng, Jia Jia, Lianhong Cai:
Low level descriptors based DBLSTM bottleneck feature for speech driven talking avatar. ICASSP 2016: 5550-5554 - [c46]Yaodong Tang, Yuchen Huang, Zhiyong Wu, Helen Meng, Mingxing Xu, Lianhong Cai:
Question detection from acoustic features using recurrent neural network with gated recurrent unit. ICASSP 2016: 6125-6129 - [c45]Linchuan Li, Zhiyong Wu, Mingxing Xu, Helen M. Meng, Lianhong Cai:
Recognizing stances in Mandarin social ideological debates with text and acoustic features. ICME Workshops 2016: 1-6 - [c44]Haishu Xianyu, Mingxing Xu, Zhiyong Wu, Lianhong Cai:
Heterogeneity-entropy based unsupervised feature learning for personality prediction with cross-media data. ICME 2016: 1-6 - [c43]Yaodong Tang, Zhiyong Wu, Helen M. Meng, Mingxing Xu, Lianhong Cai:
Analysis on Gated Recurrent Unit Based Question Detection Approach. INTERSPEECH 2016: 735-739 - [c42]Linchuan Li, Zhiyong Wu, Mingxing Xu, Helen M. Meng, Lianhong Cai:
Combining CNN and BLSTM to Extract Textual and Acoustic Features for Recognizing Stances in Mandarin Ideological Debate Competition. INTERSPEECH 2016: 1392-1396 - [c41]Xu Li, Zhiyong Wu, Helen M. Meng, Jia Jia, Xiaoyan Lou, Lianhong Cai:
Phoneme Embedding and its Application to Speech Driven Talking Avatar Synthesis. INTERSPEECH 2016: 1472-1476 - [c40]Xu Li, Zhiyong Wu, Helen M. Meng, Jia Jia, Xiaoyan Lou, Lianhong Cai:
Expressive Speech Driven Talking Avatar Synthesis with DBLSTM Using Limited Amount of Emotional Bimodal Data. INTERSPEECH 2016: 1477-1481 - [c39]Runnan Li, Zhiyong Wu, Helen M. Meng, Lianhong Cai:
DBLSTM-based multi-task learning for pitch transformation in voice conversion. ISCSLP 2016: 1-5 - [c38]Leye Wei, Xin Jin, Zhiyong Wu:
3D modeling based on multiple Unmanned Aerial Vehicles with the optimal paths. ISPACS 2016: 1-6 - [c37]Yiqi Jiang, Xin Jin, Zhiyong Wu:
Video Inpainting Based on Joint Gradient and Noise Minimization. PCM (1) 2016: 407-417 - [c36]Leye Wei, Xin Jin, Zhiyong Wu, Lei Zhang:
A Real-Time Gesture-Based Unmanned Aerial Vehicle Control System. PCM (1) 2016: 529-539 - [i2]Xi Ma, Zhiyong Wu, Jia Jia, Mingxing Xu, Helen M. Meng, Lianhong Cai:
Study on Feature Subspace of Archetypal Emotions for Speech Emotion Recognition. CoRR abs/1611.05675 (2016) - 2015
- [j5]Zhiyong Wu, Kai Zhao, Xixin Wu, Xinyu Lan, Helen Meng:
Acoustic to articulatory mapping with deep neural network. Multim. Tools Appl. 74(22): 9889-9907 (2015) - [j4]Zhiyong Wu, Yishuang Ning, Xiao Zang, Jia Jia, Fanbo Meng, Helen Meng, Lianhong Cai:
Generating emphatic speech with hidden Markov model for expressive speech synthesis. Multim. Tools Appl. 74(22): 9909-9925 (2015) - [c35]Xixin Wu, Zhiyong Wu, Yishuang Ning, Jia Jia, Lianhong Cai, Helen M. Meng:
Understanding speaking styles of internet speech data with LSTM and low-resource training. ACII 2015: 815-820 - [c34]Peng Liu, Quanjie Yu, Zhiyong Wu, Shiyin Kang, Helen M. Meng, Lianhong Cai:
A deep recurrent approach for acoustic-to-articulatory inversion. ICASSP 2015: 4450-4454 - [c33]Yishuang Ning, Zhiyong Wu, Jia Jia, Fanbo Meng, Helen M. Meng, Lianhong Cai:
HMM-based emphatic speech synthesis for corrective feedback in computer-aided pronunciation training. ICASSP 2015: 4934-4938 - [c32]Qi Lyu, Zhiyong Wu, Jun Zhu, Helen Meng:
Modelling High-Dimensional Sequences with LSTM-RTRBM: Application to Polyphonic Music Generation. IJCAI 2015: 4138-4139 - [c31]Yishuang Ning, Zhiyong Wu, Xiaoyan Lou, Helen M. Meng, Jia Jia, Lianhong Cai:
Using tilt for automatic emphasis detection with Bayesian networks. INTERSPEECH 2015: 578-582 - [c30]Qi Lyu, Zhiyong Wu, Jun Zhu:
Polyphonic Music Modelling with LSTM-RTRBM. ACM Multimedia 2015: 991-994 - 2014
- [j3]Jia Jia, Zhiyong Wu, Shen Zhang, Helen M. Meng, Lianhong Cai:
Head and facial gestures synthesis using PAD model for an expressive talking avatar. Multim. Tools Appl. 73(1): 439-461 (2014) - [j2]Fanbo Meng, Zhiyong Wu, Jia Jia, Helen M. Meng, Lianhong Cai:
Synthesizing English emphatic speech for multimodal corrective feedback in computer-aided pronunciation training. Multim. Tools Appl. 73(1): 463-489 (2014) - [c29]Yuchao Fan, Mingxing Xu, Zhiyong Wu, Lianhong Cai:
Automatic Emotion Variation Detection in continuous speech. APSIPA 2014: 1-5 - [c28]Xin Zheng, Zhiyong Wu, Helen Meng, Lianhong Cai:
Learning dynamic features with neural networks for phoneme recognition. ICASSP 2014: 2524-2528 - [c27]Xin Zheng, Zhiyong Wu, Helen Meng, Lianhong Cai:
Contrastive auto-encoder for phoneme recognition. ICASSP 2014: 2529-2533 - [c26]Xiao Zang, Zhiyong Wu, Helen M. Meng, Jia Jia, Lianhong Cai:
Using conditional random fields to predict focus word pair in spontaneous spoken English. INTERSPEECH 2014: 756-760 - [c25]Zhiyuan Zhou, Zhaogui Ding, Weifeng Li, Zhiyong Wu, Longbiao Wang, Qingmin Liao:
Multi-channel speech enhancement using sparse coding on local time-frequency structures. INTERSPEECH 2014: 2824-2827 - [c24]Xixin Wu, Zhiyong Wu, Jia Jia, Helen M. Meng, Lianhong Cai, Weifeng Li:
Automatic speech data clustering with human perception based weighted distance. ISCSLP 2014: 216-220 - 2013
- [c23]Jianbo Jiang, Zhiyong Wu, Mingxing Xu, Jia Jia, Lianhong Cai:
Comparing feature dimension reduction algorithms for GMM-SVM based speech emotion recognition. APSIPA 2013: 1-4 - [c22]Mingming Zhang, Weifeng Li, Longbiao Wang, Jianguo Wei, Zhiyong Wu, Qingmin Liao:
Sparse coding for sound event classification. APSIPA 2013: 1-5 - [c21]Mingming Zhang, Weifeng Li, Longbiao Wang, Jianguo Wei, Zhiyong Wu, Qingmin Liao:
Frequency-domain dereverberation on speech signal using surround retinex. APSIPA 2013: 1-5 - [c20]Kai Zhao, Zhiyong Wu, Lianhong Cai:
A real-time speech driven talking avatar based on deep neural network. APSIPA 2013: 1-4 - [c19]Xin Zheng, Zhiyong Wu, Binbin Shen, Helen M. Meng, Lianhong Cai:
Investigation of tandem deep belief network approach for phoneme recognition. ICASSP 2013: 7586-7590 - [i1]Xin Zheng, Zhiyong Wu, Helen M. Meng, Weifeng Li, Lianhong Cai:
Feature Learning with Gaussian Restricted Boltzmann Machine for Robust Speech Recognition. CoRR abs/1309.6176 (2013) - 2012
- [c18]Jia Jia, Xiaohui Wang, Zhiyong Wu, Lianhong Cai, Helen M. Meng:
Modeling the correlation between modality semantics and facial expressions. APSIPA 2012: 1-10 - [c17]Fanbo Meng, Zhiyong Wu, Helen M. Meng, Jia Jia, Lianhong Cai:
Hierarchical English Emphatic Speech Synthesis Based on HMM with Limited Training Data. INTERSPEECH 2012: 466-469 - [c16]Tao Jiang, Zhiyong Wu, Jia Jia, Lianhong Cai:
Perceptual clustering based unit selection optimization for concatenative text-to-speech synthesis. ISCSLP 2012: 64-68 - [c15]Chunrong Li, Zhiyong Wu, Fanbo Meng, Helen M. Meng, Lianhong Cai:
Detection and emphatic realization of contrastive word pairs for expressive text-to-speech synthesis. ISCSLP 2012: 93-97 - [c14]Xixin Wu, Zhiyong Wu, Jia Jia, Lianhong Cai:
Adaptive named entity recognition based on conditional random fields with automatic updated dynamic gazetteers. ISCSLP 2012: 363-367 - [c13]Jianbo Jiang, Zhiyong Wu, Mingxing Xu, Jia Jia, Lianhong Cai:
Comparison of adaptation methods for GMM-SVM based speech emotion recognition. SLT 2012: 269-273 - 2011
- [c12]Binbin Shen, Zhiyong Wu, Yongxin Wang, Lianhong Cai:
Combining Active and Semi-Supervised Learning for Homograph Disambiguation in Mandarin Text-to-Speech Synthesis. INTERSPEECH 2011: 2165-2168 - 2010
- [c11]Quansheng Duan, Shiyin Kang, Zhiyong Wu, Lianhong Cai, Zhiwei Shuang, Yong Qin:
Comparison of Syllable/Phone HMM Based Mandarin TTS. ICPR 2010: 4496-4499 - [c10]Zhiyong Wu, Lianhong Cai, Helen M. Meng:
Modeling prosody patterns for Chinese expressive text-to-speech synthesis. ISCSLP 2010: 148-152 - [p1]Shen Zhang, Zhiyong Wu, Helen M. Meng, Lianhong Cai:
Facial Expression Synthesis Based on Emotion Dimensions for Affective Talking Avatar. Modeling Machine Emotions for Realizing Intelligence 2010: 109-132
2000 – 2009
- 2009
- [j1]Zhiyong Wu, Helen M. Meng, Hongwu Yang, Lianhong Cai:
Modeling the Expressivity of Input Text Semantics for Chinese Text-to-Speech Synthesis in a Spoken Dialog System. IEEE Trans. Speech Audio Process. 17(8): 1567-1576 (2009) - 2008
- [c9]Honglei Cong, Zhiyong Wu, Lianhong Cai, Helen M. Meng:
A New Prosodic Strength Calculation Method for Prosody Reduction Modeling. ISCSLP 2008: 53-56 - [c8]Zhiyong Wu, Jiying Wu, Helen M. Meng:
The Use of Dynamic Deformable Templates for Lip Tracking in an Audio-Visual Corpus with Large Variations in Head Pose, Face Illumination and Lip Shapes. ISCSLP 2008: 370-373 - 2007
- [c7]Shen Zhang, Zhiyong Wu, Helen M. Meng, Lianhong Cai:
Facial Expression Synthesis Using PAD Emotional Parameters for a Chinese Expressive Avatar. ACII 2007: 24-35 - [c6]Shen Zhang, Zhiyong Wu, Helen M. Meng, Lianhong Cai:
Head Movement Synthesis Based on Semantic and Prosodic Features for a Chinese Expressive Avatar. ICASSP (4) 2007: 837-840 - 2006
- [c5]Zhiyong Wu, Lianhong Cai, Helen M. Meng:
Multi-level Fusion of Audio and Visual Features for Speaker Identification. ICB 2006: 493-499 - [c4]Zhiyong Wu, Shen Zhang, Lianhong Cai, Helen M. Meng:
Real-time synthesis of Chinese visual speech and facial expressions using MPEG-4 FAP features in a three-dimensional avatar. INTERSPEECH 2006 - [c3]Zhiyong Wu, Helen M. Meng, Hui Ning, Sam C. Tse:
A Corpus-Based Approach for Cooperative Response Generation in a Dialog System. ISCSLP (Selected Papers) 2006: 614-626 - [c2]Hongwu Yang, Helen M. Meng, Zhiyong Wu, Lianhong Cai:
Modelling the Global acoustic Correlates of Expressivity for Chinese Text-to-speech Synthesis. SLT 2006: 138-141 - 2000
- [c1]Zhiyong Wu, Lianhong Cai, Tongchun Zhou:
Research on dynamic characters of Chinese pitch contours. INTERSPEECH 2000: 686-689
last updated on 2024-12-19 23:09 CET by the dblp team
all metadata released as open data under CC0 1.0 license