Kaizhi Qian
2020 – today
- 2024
- [c21] Heting Gao, Kaizhi Qian, Junrui Ni, Chuang Gan, Mark A. Hasegawa-Johnson, Shiyu Chang, Yang Zhang:
Speech Self-Supervised Learning Using Diffusion Model Synthetic Data. ICML 2024
- [c20] Bairu Hou, Yujian Liu, Kaizhi Qian, Jacob Andreas, Shiyu Chang, Yang Zhang:
Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling. ICML 2024
- [i20] Jiaben Chen, Xin Yan, Yihang Chen, Siyuan Cen, Qinwei Ma, Haoyu Zhen, Kaizhi Qian, Lie Lu, Chuang Gan:
RapVerse: Coherent Vocals and Whole-Body Motions Generations from Text. CoRR abs/2405.20336 (2024)
- [i19] Junrui Ni, Liming Wang, Yang Zhang, Kaizhi Qian, Heting Gao, Mark Hasegawa-Johnson, Chang D. Yoo:
Towards Unsupervised Speech Recognition Without Pronunciation Models. CoRR abs/2406.08380 (2024)
- [i18] Han Yang, Kun Su, Yutong Zhang, Jiaben Chen, Kaizhi Qian, Gaowen Liu, Chuang Gan:
UniMuMo: Unified Text, Music and Motion Generation. CoRR abs/2410.04534 (2024)
- 2023
- [c19] Kun Su, Kaizhi Qian, Eli Shlizerman, Antonio Torralba, Chuang Gan:
Physics-Driven Diffusion Models for Impact Sound Synthesis from Videos. CVPR 2023: 9749-9759
- [c18] Zhongzhi Yu, Yang Zhang, Kaizhi Qian, Cheng Wan, Yonggan Fu, Yongan Zhang, Yingyan Celine Lin:
Master-ASR: Achieving Multilingual Scalability and Low-Resource Adaptation in ASR with Modular Learning. ICML 2023: 40475-40487
- [i17] Kun Su, Kaizhi Qian, Eli Shlizerman, Antonio Torralba, Chuang Gan:
Physics-Driven Diffusion Models for Impact Sound Synthesis from Videos. CoRR abs/2303.16897 (2023)
- [i16] Zhongzhi Yu, Yang Zhang, Kaizhi Qian, Yonggan Fu, Yingyan Lin:
Master-ASR: Achieving Multilingual Scalability and Low-Resource Adaptation in ASR with Modular Learning. CoRR abs/2306.15686 (2023)
- [i15] Bairu Hou, Yujian Liu, Kaizhi Qian, Jacob Andreas, Shiyu Chang, Yang Zhang:
Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling. CoRR abs/2311.08718 (2023)
- 2022
- [j1] Heting Gao, Junrui Ni, Yang Zhang, Kaizhi Qian, Shiyu Chang, Mark Hasegawa-Johnson:
Domain Generalization for Language-Independent Automatic Speech Recognition. Frontiers Artif. Intell. 5: 806274 (2022)
- [c17] Chak Ho Chan, Kaizhi Qian, Yang Zhang, Mark Hasegawa-Johnson:
SpeechSplit2.0: Unsupervised Speech Disentanglement for Voice Conversion without Tuning Autoencoder Bottlenecks. ICASSP 2022: 6332-6336
- [c16] Cheng-I Jeff Lai, Erica Cooper, Yang Zhang, Shiyu Chang, Kaizhi Qian, Yi-Lun Liao, Yung-Sung Chuang, Alexander H. Liu, Junichi Yamagishi, David D. Cox, James R. Glass:
On the Interplay between Sparsity, Naturalness, Intelligibility, and Prosody in Speech Synthesis. ICASSP 2022: 8447-8451
- [c15] Kaizhi Qian, Yang Zhang, Heting Gao, Junrui Ni, Cheng-I Lai, David D. Cox, Mark Hasegawa-Johnson, Shiyu Chang:
ContentVec: An Improved Self-Supervised Speech Representation by Disentangling Speakers. ICML 2022: 18003-18017
- [c14] Junrui Ni, Liming Wang, Heting Gao, Kaizhi Qian, Yang Zhang, Shiyu Chang, Mark Hasegawa-Johnson:
Unsupervised Text-to-Speech Synthesis by Unsupervised Automatic Speech Recognition. INTERSPEECH 2022: 461-465
- [c13] Heting Gao, Junrui Ni, Kaizhi Qian, Yang Zhang, Shiyu Chang, Mark Hasegawa-Johnson:
WavPrompt: Towards Few-Shot Spoken Language Understanding with Frozen Language Models. INTERSPEECH 2022: 2738-2742
- [c12] Yonggan Fu, Yang Zhang, Kaizhi Qian, Zhifan Ye, Zhongzhi Yu, Cheng-I Jeff Lai, Celine Lin:
Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Speech Processing. NeurIPS 2022
- [i14] Chak Ho Chan, Kaizhi Qian, Yang Zhang, Mark Hasegawa-Johnson:
SpeechSplit 2.0: Unsupervised speech disentanglement for voice conversion Without tuning autoencoder Bottlenecks. CoRR abs/2203.14156 (2022)
- [i13] Junrui Ni, Liming Wang, Heting Gao, Kaizhi Qian, Yang Zhang, Shiyu Chang, Mark Hasegawa-Johnson:
Unsupervised Text-to-Speech Synthesis by Unsupervised Automatic Speech Recognition. CoRR abs/2203.15796 (2022)
- [i12] Heting Gao, Junrui Ni, Kaizhi Qian, Yang Zhang, Shiyu Chang, Mark Hasegawa-Johnson:
WAVPROMPT: Towards Few-Shot Spoken Language Understanding with Frozen Language Models. CoRR abs/2203.15863 (2022)
- [i11] Kaizhi Qian, Yang Zhang, Heting Gao, Junrui Ni, Cheng-I Lai, David D. Cox, Mark Hasegawa-Johnson, Shiyu Chang:
Improving Self-Supervised Speech Representations by Disentangling Speakers. CoRR abs/2204.09224 (2022)
- [i10] Yonggan Fu, Yang Zhang, Kaizhi Qian, Zhifan Ye, Zhongzhi Yu, Cheng-I Lai, Yingyan Lin:
Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Speech Processing. CoRR abs/2211.01522 (2022)
- 2021
- [c11] Hui Shi, Yang Zhang, Hao Wu, Shiyu Chang, Kaizhi Qian, Mark Hasegawa-Johnson, Jishen Zhao:
Continuous CNN for Nonuniform Time Series. ICASSP 2021: 3550-3554
- [c10] Kaizhi Qian, Yang Zhang, Shiyu Chang, Jinjun Xiong, Chuang Gan, David D. Cox, Mark Hasegawa-Johnson:
Global Prosody Style Transfer Without Text Transcriptions. ICML 2021: 8650-8660
- [c9] Heting Gao, Junrui Ni, Yang Zhang, Kaizhi Qian, Shiyu Chang, Mark Hasegawa-Johnson:
Zero-Shot Cross-Lingual Phonetic Recognition with External Language Embedding. Interspeech 2021: 1304-1308
- [c8] Mark R. Saddler, Andrew Francl, Jenelle Feather, Kaizhi Qian, Yang Zhang, Josh H. McDermott:
Speech Denoising with Auditory Models. Interspeech 2021: 2681-2685
- [c7] Cheng-I Jeff Lai, Yang Zhang, Alexander H. Liu, Shiyu Chang, Yi-Lun Liao, Yung-Sung Chuang, Kaizhi Qian, Sameer Khurana, David D. Cox, James R. Glass:
PARP: Prune, Adjust and Re-Prune for Self-Supervised Speech Recognition. NeurIPS 2021: 21256-21272
- [i9] Cheng-I Jeff Lai, Yang Zhang, Alexander H. Liu, Shiyu Chang, Yi-Lun Liao, Yung-Sung Chuang, Kaizhi Qian, Sameer Khurana, David D. Cox, James R. Glass:
PARP: Prune, Adjust and Re-Prune for Self-Supervised Speech Recognition. CoRR abs/2106.05933 (2021)
- [i8] Kaizhi Qian, Yang Zhang, Shiyu Chang, Jinjun Xiong, Chuang Gan, David D. Cox, Mark Hasegawa-Johnson:
Global Rhythm Style Transfer Without Text Transcriptions. CoRR abs/2106.08519 (2021)
- [i7] Cheng-I Jeff Lai, Erica Cooper, Yang Zhang, Shiyu Chang, Kaizhi Qian, Yi-Lun Liao, Yung-Sung Chuang, Alexander H. Liu, Junichi Yamagishi, David D. Cox, James R. Glass:
On the Interplay Between Sparsity, Naturalness, Intelligibility, and Prosody in Speech Synthesis. CoRR abs/2110.01147 (2021)
- 2020
- [b1] Kaizhi Qian:
Deep generative models for speech editing. University of Illinois Urbana-Champaign, USA, 2020
- [c6] Kaizhi Qian, Zeyu Jin, Mark Hasegawa-Johnson, Gautham J. Mysore:
F0-Consistent Many-To-Many Non-Parallel Voice Conversion Via Conditional Autoencoder. ICASSP 2020: 6284-6288
- [c5] Kaizhi Qian, Yang Zhang, Shiyu Chang, Mark Hasegawa-Johnson, David D. Cox:
Unsupervised Speech Decomposition via Triple Information Bottleneck. ICML 2020: 7836-7846
- [i6] Kaizhi Qian, Zeyu Jin, Mark Hasegawa-Johnson, Gautham J. Mysore:
F0-consistent many-to-many non-parallel voice conversion via conditional autoencoder. CoRR abs/2004.07370 (2020)
- [i5] Kaizhi Qian, Yang Zhang, Shiyu Chang, David D. Cox, Mark Hasegawa-Johnson:
Unsupervised Speech Decomposition via Triple Information Bottleneck. CoRR abs/2004.11284 (2020)
- [i4] Mark R. Saddler, Andrew Francl, Jenelle Feather, Kaizhi Qian, Yang Zhang, Josh H. McDermott:
Deep Network Perceptual Losses for Speech Denoising. CoRR abs/2011.10706 (2020)
2010 – 2019
- 2019
- [c4] Feng Li, Kaizhi Qian, Mark Hasegawa-Johnson, Masato Akagi:
Monaural Singing Voice Separation Using Fusion-Net with Time-Frequency Masking. APSIPA 2019: 1239-1243
- [c3] Kaizhi Qian, Yang Zhang, Shiyu Chang, Xuesong Yang, Mark Hasegawa-Johnson:
AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss. ICML 2019: 5210-5219
- [i3] Kaizhi Qian, Yang Zhang, Shiyu Chang, Xuesong Yang, Mark Hasegawa-Johnson:
Zero-Shot Voice Style Transfer with Only Autoencoder Loss. CoRR abs/1905.05879 (2019)
- [i2] Yang Zhang, Shiyu Chang, Mo Yu, Kaizhi Qian:
An Efficient and Margin-Approaching Zero-Confidence Adversarial Attack. CoRR abs/1910.00511 (2019)
- 2018
- [c2] Kaizhi Qian, Yang Zhang, Shiyu Chang, Xuesong Yang, Dinei A. F. Florêncio, Mark Hasegawa-Johnson:
Deep Learning Based Speech Beamforming. ICASSP 2018: 5389-5393
- [i1] Kaizhi Qian, Yang Zhang, Shiyu Chang, Xuesong Yang, Dinei A. F. Florêncio, Mark Hasegawa-Johnson:
Deep Learning Based Speech Beamforming. CoRR abs/1802.05383 (2018)
- 2017
- [c1] Kaizhi Qian, Yang Zhang, Shiyu Chang, Xuesong Yang, Dinei Florêncio, Mark Hasegawa-Johnson:
Speech Enhancement Using Bayesian Wavenet. INTERSPEECH 2017: 2013-2017
last updated on 2024-11-14 00:54 CET by the dblp team
all metadata released as open data under CC0 1.0 license