Kaizhi Qian

2020 – today

2024
[c21]
Heting Gao, Kaizhi Qian, Junrui Ni, Chuang Gan, Mark A. Hasegawa-Johnson, Shiyu Chang, Yang Zhang:
Speech Self-Supervised Learning Using Diffusion Model Synthetic Data. ICML 2024
[c20]
Bairu Hou, Yujian Liu, Kaizhi Qian, Jacob Andreas, Shiyu Chang, Yang Zhang:
Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling. ICML 2024
[i20]
Jiaben Chen, Xin Yan, Yihang Chen, Siyuan Cen, Qinwei Ma, Haoyu Zhen, Kaizhi Qian, Lie Lu, Chuang Gan:
RapVerse: Coherent Vocals and Whole-Body Motions Generations from Text. CoRR abs/2405.20336 (2024)
[i19]
Junrui Ni, Liming Wang, Yang Zhang, Kaizhi Qian, Heting Gao, Mark Hasegawa-Johnson, Chang D. Yoo:
Towards Unsupervised Speech Recognition Without Pronunciation Models. CoRR abs/2406.08380 (2024)
[i18]
Han Yang, Kun Su, Yutong Zhang, Jiaben Chen, Kaizhi Qian, Gaowen Liu, Chuang Gan:
UniMuMo: Unified Text, Music and Motion Generation. CoRR abs/2410.04534 (2024)
2023
[c19]
Kun Su, Kaizhi Qian, Eli Shlizerman, Antonio Torralba, Chuang Gan:
Physics-Driven Diffusion Models for Impact Sound Synthesis from Videos. CVPR 2023: 9749-9759
[c18]
Zhongzhi Yu, Yang Zhang, Kaizhi Qian, Cheng Wan, Yonggan Fu, Yongan Zhang, Yingyan Celine Lin:
Master-ASR: Achieving Multilingual Scalability and Low-Resource Adaptation in ASR with Modular Learning. ICML 2023: 40475-40487
[i17]
Kun Su, Kaizhi Qian, Eli Shlizerman, Antonio Torralba, Chuang Gan:
Physics-Driven Diffusion Models for Impact Sound Synthesis from Videos. CoRR abs/2303.16897 (2023)
[i16]
Zhongzhi Yu, Yang Zhang, Kaizhi Qian, Yonggan Fu, Yingyan Lin:
Master-ASR: Achieving Multilingual Scalability and Low-Resource Adaptation in ASR with Modular Learning. CoRR abs/2306.15686 (2023)
[i15]
Bairu Hou, Yujian Liu, Kaizhi Qian, Jacob Andreas, Shiyu Chang, Yang Zhang:
Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling. CoRR abs/2311.08718 (2023)
2022
[j1]
Heting Gao, Junrui Ni, Yang Zhang, Kaizhi Qian, Shiyu Chang, Mark Hasegawa-Johnson:
Domain Generalization for Language-Independent Automatic Speech Recognition. Frontiers Artif. Intell. 5: 806274 (2022)
[c17]
Chak Ho Chan, Kaizhi Qian, Yang Zhang, Mark Hasegawa-Johnson:
SpeechSplit2.0: Unsupervised Speech Disentanglement for Voice Conversion without Tuning Autoencoder Bottlenecks. ICASSP 2022: 6332-6336
[c16]
Cheng-I Jeff Lai, Erica Cooper, Yang Zhang, Shiyu Chang, Kaizhi Qian, Yi-Lun Liao, Yung-Sung Chuang, Alexander H. Liu, Junichi Yamagishi, David D. Cox, James R. Glass:
On the Interplay between Sparsity, Naturalness, Intelligibility, and Prosody in Speech Synthesis. ICASSP 2022: 8447-8451
[c15]
Kaizhi Qian, Yang Zhang, Heting Gao, Junrui Ni, Cheng-I Lai, David D. Cox, Mark Hasegawa-Johnson, Shiyu Chang:
ContentVec: An Improved Self-Supervised Speech Representation by Disentangling Speakers. ICML 2022: 18003-18017
[c14]
Junrui Ni, Liming Wang, Heting Gao, Kaizhi Qian, Yang Zhang, Shiyu Chang, Mark Hasegawa-Johnson:
Unsupervised Text-to-Speech Synthesis by Unsupervised Automatic Speech Recognition. INTERSPEECH 2022: 461-465
[c13]
Heting Gao, Junrui Ni, Kaizhi Qian, Yang Zhang, Shiyu Chang, Mark Hasegawa-Johnson:
WavPrompt: Towards Few-Shot Spoken Language Understanding with Frozen Language Models. INTERSPEECH 2022: 2738-2742
[c12]
Yonggan Fu, Yang Zhang, Kaizhi Qian, Zhifan Ye, Zhongzhi Yu, Cheng-I Jeff Lai, Celine Lin:
Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Speech Processing. NeurIPS 2022
[i14]
Chak Ho Chan, Kaizhi Qian, Yang Zhang, Mark Hasegawa-Johnson:
SpeechSplit 2.0: Unsupervised Speech Disentanglement for Voice Conversion Without Tuning Autoencoder Bottlenecks. CoRR abs/2203.14156 (2022)
[i13]
Junrui Ni, Liming Wang, Heting Gao, Kaizhi Qian, Yang Zhang, Shiyu Chang, Mark Hasegawa-Johnson:
Unsupervised Text-to-Speech Synthesis by Unsupervised Automatic Speech Recognition. CoRR abs/2203.15796 (2022)
[i12]
Heting Gao, Junrui Ni, Kaizhi Qian, Yang Zhang, Shiyu Chang, Mark Hasegawa-Johnson:
WAVPROMPT: Towards Few-Shot Spoken Language Understanding with Frozen Language Models. CoRR abs/2203.15863 (2022)
[i11]
Kaizhi Qian, Yang Zhang, Heting Gao, Junrui Ni, Cheng-I Lai, David D. Cox, Mark Hasegawa-Johnson, Shiyu Chang:
Improving Self-Supervised Speech Representations by Disentangling Speakers. CoRR abs/2204.09224 (2022)
[i10]
Yonggan Fu, Yang Zhang, Kaizhi Qian, Zhifan Ye, Zhongzhi Yu, Cheng-I Lai, Yingyan Lin:
Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Speech Processing. CoRR abs/2211.01522 (2022)
2021
[c11]
Hui Shi, Yang Zhang, Hao Wu, Shiyu Chang, Kaizhi Qian, Mark Hasegawa-Johnson, Jishen Zhao:
Continuous CNN for Nonuniform Time Series. ICASSP 2021: 3550-3554
[c10]
Kaizhi Qian, Yang Zhang, Shiyu Chang, Jinjun Xiong, Chuang Gan, David D. Cox, Mark Hasegawa-Johnson:
Global Prosody Style Transfer Without Text Transcriptions. ICML 2021: 8650-8660
[c9]
Heting Gao, Junrui Ni, Yang Zhang, Kaizhi Qian, Shiyu Chang, Mark Hasegawa-Johnson:
Zero-Shot Cross-Lingual Phonetic Recognition with External Language Embedding. Interspeech 2021: 1304-1308
[c8]
Mark R. Saddler, Andrew Francl, Jenelle Feather, Kaizhi Qian, Yang Zhang, Josh H. McDermott:
Speech Denoising with Auditory Models. Interspeech 2021: 2681-2685
[c7]
Cheng-I Jeff Lai, Yang Zhang, Alexander H. Liu, Shiyu Chang, Yi-Lun Liao, Yung-Sung Chuang, Kaizhi Qian, Sameer Khurana, David D. Cox, James R. Glass:
PARP: Prune, Adjust and Re-Prune for Self-Supervised Speech Recognition. NeurIPS 2021: 21256-21272
[i9]
Cheng-I Jeff Lai, Yang Zhang, Alexander H. Liu, Shiyu Chang, Yi-Lun Liao, Yung-Sung Chuang, Kaizhi Qian, Sameer Khurana, David D. Cox, James R. Glass:
PARP: Prune, Adjust and Re-Prune for Self-Supervised Speech Recognition. CoRR abs/2106.05933 (2021)
[i8]
Kaizhi Qian, Yang Zhang, Shiyu Chang, Jinjun Xiong, Chuang Gan, David D. Cox, Mark Hasegawa-Johnson:
Global Rhythm Style Transfer Without Text Transcriptions. CoRR abs/2106.08519 (2021)
[i7]
Cheng-I Jeff Lai, Erica Cooper, Yang Zhang, Shiyu Chang, Kaizhi Qian, Yi-Lun Liao, Yung-Sung Chuang, Alexander H. Liu, Junichi Yamagishi, David D. Cox, James R. Glass:
On the Interplay Between Sparsity, Naturalness, Intelligibility, and Prosody in Speech Synthesis. CoRR abs/2110.01147 (2021)
2020
[b1]
Kaizhi Qian:
Deep generative models for speech editing. University of Illinois Urbana-Champaign, USA, 2020
[c6]
Kaizhi Qian, Zeyu Jin, Mark Hasegawa-Johnson, Gautham J. Mysore:
F0-Consistent Many-To-Many Non-Parallel Voice Conversion Via Conditional Autoencoder. ICASSP 2020: 6284-6288
[c5]
Kaizhi Qian, Yang Zhang, Shiyu Chang, Mark Hasegawa-Johnson, David D. Cox:
Unsupervised Speech Decomposition via Triple Information Bottleneck. ICML 2020: 7836-7846
[i6]
Kaizhi Qian, Zeyu Jin, Mark Hasegawa-Johnson, Gautham J. Mysore:
F0-consistent many-to-many non-parallel voice conversion via conditional autoencoder. CoRR abs/2004.07370 (2020)
[i5]
Kaizhi Qian, Yang Zhang, Shiyu Chang, David D. Cox, Mark Hasegawa-Johnson:
Unsupervised Speech Decomposition via Triple Information Bottleneck. CoRR abs/2004.11284 (2020)
[i4]
Mark R. Saddler, Andrew Francl, Jenelle Feather, Kaizhi Qian, Yang Zhang, Josh H. McDermott:
Deep Network Perceptual Losses for Speech Denoising. CoRR abs/2011.10706 (2020)

2010 – 2019

2019
[c4]
Feng Li, Kaizhi Qian, Mark Hasegawa-Johnson, Masato Akagi:
Monaural Singing Voice Separation Using Fusion-Net with Time-Frequency Masking. APSIPA 2019: 1239-1243
[c3]
Kaizhi Qian, Yang Zhang, Shiyu Chang, Xuesong Yang, Mark Hasegawa-Johnson:
AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss. ICML 2019: 5210-5219
[i3]
Kaizhi Qian, Yang Zhang, Shiyu Chang, Xuesong Yang, Mark Hasegawa-Johnson:
Zero-Shot Voice Style Transfer with Only Autoencoder Loss. CoRR abs/1905.05879 (2019)
[i2]
Yang Zhang, Shiyu Chang, Mo Yu, Kaizhi Qian:
An Efficient and Margin-Approaching Zero-Confidence Adversarial Attack. CoRR abs/1910.00511 (2019)
2018
[c2]
Kaizhi Qian, Yang Zhang, Shiyu Chang, Xuesong Yang, Dinei A. F. Florêncio, Mark Hasegawa-Johnson:
Deep Learning Based Speech Beamforming. ICASSP 2018: 5389-5393
[i1]
Kaizhi Qian, Yang Zhang, Shiyu Chang, Xuesong Yang, Dinei A. F. Florêncio, Mark Hasegawa-Johnson:
Deep Learning Based Speech Beamforming. CoRR abs/1802.05383 (2018)
2017
[c1]
Kaizhi Qian, Yang Zhang, Shiyu Chang, Xuesong Yang, Dinei Florêncio, Mark Hasegawa-Johnson:
Speech Enhancement Using Bayesian Wavenet. INTERSPEECH 2017: 2013-2017
