Wei-Ning Hsu

2020 – today

2024
[j3]
- view
  - electronic edition @ jmlr.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/jmlr/PratapTSTBKENVF24
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/jmlr/PratapTSTBKENVF24
Vineel Pratap, Andros Tjandra, Bowen Shi, Paden Tomasello, Arun Babu, Sayani Kundu, Ali Elkahky, Zhaoheng Ni, Apoorv Vyas, Maryam Fazel-Zarandi, Alexei Baevski, Yossi Adi, Xiaohui Zhang, Wei-Ning Hsu, Alexis Conneau, Michael Auli:
Scaling Speech Technology to 1,000+ Languages. J. Mach. Learn. Res. 25: 97:1-97:52 (2024)
[c70]
- view
  authority control:
- export record
  dblp key:
  - conf/acl/HanA0HCSW24
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/acl/HanA0HCSW24
HyoJung Han, Mohamed Anwar, Juan Pino, Wei-Ning Hsu, Marine Carpuat, Bowen Shi, Changhan Wang:
XLAVS-R: Cross-Lingual Audio-Visual Speech Representation Learning for Noise-Robust Speech Perception. ACL (1) 2024: 12896-12911
[c69]
- view
  authority control:
- export record
  dblp key:
  - conf/eccv/ChenPBXHHG24
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/eccv/ChenPBXHHG24
Changan Chen, Puyuan Peng, Ami Baid, Zihui Xue, Wei-Ning Hsu, David Harwath, Kristen Grauman:
Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos. ECCV (70) 2024: 277-295
[c68]
- view
  authority control:
- export record
  dblp key:
  - conf/icassp/JeonYIHRMB24
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icassp/JeonYIHRMB24
Sungho Jeon, Ching-Feng Yeh, Hakan Inan, Wei-Ning Hsu, Rashi Rungta, Yashar Mehdad, Daniel Bikel:
Attention or Convolution: Transformer Encoders in Audio Language Models for Inference Efficiency. ICASSP Workshops 2024: 555-559
[c67]
- view
  authority control:
- export record
  dblp key:
  - conf/icassp/ChenSN0H24
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icassp/ChenSN0H24
Peng-Jen Chen, Bowen Shi, Kelvin Niu, Ann Lee, Wei-Ning Hsu:
M2BART: Multilingual and Multimodal Encoder-Decoder Pre-Training for Any-to-Any Machine Translation. ICASSP 2024: 11896-11900
[c66]
- view
  - electronic edition @ openreview.net (open access)
  - details & citations
- export record
  dblp key:
  - conf/iclr/Liu0VSTH24
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/iclr/Liu0VSTH24
Alexander H. Liu, Matthew Le, Apoorv Vyas, Bowen Shi, Andros Tjandra, Wei-Ning Hsu:
Generative Pre-training for Speech with Flow Matching. ICLR 2024
[c65]
- view
  - electronic edition @ openreview.net (open access)
  - details & citations
- export record
  dblp key:
  - conf/icml/PrajwalS0VTLGWA24
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icml/PrajwalS0VTLGWA24
K. R. Prajwal, Bowen Shi, Matthew Le, Apoorv Vyas, Andros Tjandra, Mahi Luthra, Baishan Guo, Huiyu Wang, Triantafyllos Afouras, David Kant, Wei-Ning Hsu:
MusicFlow: Cascaded Flow Matching for Text Guided Music Generation. ICML 2024
[c64]
- view
  authority control:
- export record
  dblp key:
  - conf/mm/MajumderHGHMP24
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/mm/MajumderHGHMP24
Navonil Majumder, Chia-Yu Hung, Deepanway Ghosal, Wei-Ning Hsu, Rada Mihalcea, Soujanya Poria:
Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization. ACM Multimedia 2024: 564-572
[i74]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2403-14402
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2403-14402
HyoJung Han, Mohamed Anwar, Juan Pino, Wei-Ning Hsu, Marine Carpuat, Bowen Shi, Changhan Wang:
XLAVS-R: Cross-Lingual Audio-Visual Speech Representation Learning for Noise-Robust Speech Perception. CoRR abs/2403.14402 (2024)
[i73]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2404-09956
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2404-09956
Navonil Majumder, Chia-Yu Hung, Deepanway Ghosal, Wei-Ning Hsu, Rada Mihalcea, Soujanya Poria:
Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization. CoRR abs/2404.09956 (2024)
[i72]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2406-06251
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2406-06251
Chung-Ming Chien, Andros Tjandra, Apoorv Vyas, Matt Le, Bowen Shi, Wei-Ning Hsu:
Learning Fine-Grained Controllability on Speech Generation via Efficient Fine-Tuning. CoRR abs/2406.06251 (2024)
[i71]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2406-09272
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2406-09272
Changan Chen, Puyuan Peng, Ami Baid, Zihui Xue, Wei-Ning Hsu, David Harwath, Kristen Grauman:
Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos. CoRR abs/2406.09272 (2024)
[i70]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2407-03648
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2407-03648
Gaël Le Lan, Bowen Shi, Zhaoheng Ni, Sidd Srinivasan, Anurag Kumar, Brian Ellis, David Kant, Varun Nagaraja, Ernie Chang, Wei-Ning Hsu, Yangyang Shi, Vikas Chandra:
High Fidelity Text-Guided Music Generation and Editing via Single-Stage Flow Matching. CoRR abs/2407.03648 (2024)
[i69]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2410-13720
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2410-13720
Adam Polyak, Amit Zohar, Andrew Brown, Andros Tjandra, Animesh Sinha, Ann Lee, Apoorv Vyas, Bowen Shi, Chih-Yao Ma, Ching-Yao Chuang, David Yan, Dhruv Choudhary, Dingkang Wang, Geet Sethi, Guan Pang, Haoyu Ma, Ishan Misra, Ji Hou, Jialiang Wang, Kiran Jagadeesh, Kunpeng Li, Luxin Zhang, Mannat Singh, Mary Williamson, Matt Le, Matthew Yu, Mitesh Kumar Singh, Peizhao Zhang, Peter Vajda, Quentin Duval, Rohit Girdhar, Roshan Sumbaly, Sai Saketh Rambhatla, Sam S. Tsai, Samaneh Azadi, Samyak Datta, Sanyuan Chen, Sean Bell, Sharadh Ramaswamy, Shelly Sheynin, Siddharth Bhattacharya, Simran Motwani, Tao Xu, Tianhe Li, Tingbo Hou, Wei-Ning Hsu, Xi Yin, Xiaoliang Dai, Yaniv Taigman, Yaqiao Luo, Yen-Cheng Liu, Yi-Chiao Wu, Yue Zhao, Yuval Kirstain, Zecheng He, Zijian He, Albert Pumarola, Ali K. Thabet, Artsiom Sanakoyeu, Arun Mallya, Baishan Guo, Boris Araya, Breena Kerr, Carleigh Wood, Ce Liu, Cen Peng, Dmitry Vengertsev, Edgar Schönfeld, Elliot Blanchard, Felix Juefei-Xu, Fraylie Nord, Jeff Liang, John Hoffman, Jonas Kohler, Kaolin Fire, Karthik Sivakumar, Lawrence Chen, Licheng Yu, Luya Gao, Markos Georgopoulos, Rashel Moritz, Sara K. Sampson, Shikai Li, Simone Parmeggiani, Steve Fine, Tara Fowler, Vladan Petrovic, Yuming Du:
Movie Gen: A Cast of Media Foundation Models. CoRR abs/2410.13720 (2024)
[i68]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2410-20478
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2410-20478
K. R. Prajwal, Bowen Shi, Matthew Le, Apoorv Vyas, Andros Tjandra, Mahi Luthra, Baishan Guo, Huiyu Wang, Triantafyllos Afouras, David Kant, Wei-Ning Hsu:
MusicFlow: Cascaded Flow Matching for Text Guided Music Generation. CoRR abs/2410.20478 (2024)
[i67]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2411-05141
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2411-05141
Mu Yang, Bowen Shi, Matthew Le, Wei-Ning Hsu, Andros Tjandra:
Audiobox TTA-RAG: Improving Zero-Shot and Few-Shot Text-To-Audio with Retrieval-Augmented Generation. CoRR abs/2411.05141 (2024)
2023
[j2]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/tacl/NguyenKCAHETASM23
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/tacl/NguyenKCAHETASM23
Tu Anh Nguyen, Eugene Kharitonov, Jade Copet, Yossi Adi, Wei-Ning Hsu, Ali Elkahky, Paden Tomasello, Robin Algayres, Benoît Sagot, Abdelrahman Mohamed, Emmanuel Dupoux:
Generative Spoken Dialogue Language Modeling. Trans. Assoc. Comput. Linguistics 11: 250-266 (2023)
[c63]
- view
  authority control:
- export record
  dblp key:
  - conf/acl/ChenTYDKCTDSGIP23
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/acl/ChenTYDKCTDSGIP23
Peng-Jen Chen, Kevin Tran, Yilin Yang, Jingfei Du, Justine Kao, Yu-An Chung, Paden Tomasello, Paul-Ambroise Duquenne, Holger Schwenk, Hongyu Gong, Hirofumi Inaguma, Sravya Popuri, Changhan Wang, Juan Pino, Wei-Ning Hsu, Ann Lee:
Speech-to-Speech Translation for a Real-world Unwritten Language. ACL (Findings) 2023: 4969-4983
[c62]
- view
  authority control:
- export record
  dblp key:
  - conf/acl/WangICK0HA023
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/acl/WangICK0HA023
Changhan Wang, Hirofumi Inaguma, Peng-Jen Chen, Ilia Kulikov, Yun Tang, Wei-Ning Hsu, Michael Auli, Juan Pino:
Simple and Effective Unsupervised Speech Translation. ACL (1) 2023: 10771-10784
[c61]
- view
  authority control:
- export record
  dblp key:
  - conf/asru/LianBHA23
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/asru/LianBHA23
Jiachen Lian, Alexei Baevski, Wei-Ning Hsu, Michael Auli:
AV-data2vec: Self-Supervised Learning of Audio-Visual Speech Representations with Contextualized Target Representations. ASRU 2023: 1-8
[c60]
- view
  authority control:
- export record
  dblp key:
  - conf/cvpr/HsuRSDA23
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/cvpr/HsuRSDA23
Wei-Ning Hsu, Tal Remez, Bowen Shi, Jacob Donley, Yossi Adi:
ReVISE: Self-Supervised Speech Resynthesis with Visual Input for Universal and Generalized Speech Regeneration. CVPR 2023: 18796-18806
[c59]
- view
  authority control:
- export record
  dblp key:
  - conf/emnlp/ChouCHLBCBA23
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/emnlp/ChouCHLBCBA23
Ju-Chieh Chou, Chung-Ming Chien, Wei-Ning Hsu, Karen Livescu, Arun Babu, Alexis Conneau, Alexei Baevski, Michael Auli:
Toward Joint Language Modeling for Speech Units and Text. EMNLP (Findings) 2023: 6582-6593
[c58]
- view
  authority control:
- export record
  dblp key:
  - conf/icassp/DiwanYHTCHM23
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icassp/DiwanYHTCHM23
Anuj Diwan, Ching-Feng Yeh, Wei-Ning Hsu, Paden Tomasello, Eunsol Choi, David Harwath, Abdelrahman Mohamed:
Continual Learning for On-Device Speech Recognition Using Disentangled Conformers. ICASSP 2023: 1-5
[c57]
- view
  authority control:
- export record
  dblp key:
  - conf/icassp/ElkahkyHTNAACDM23
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icassp/ElkahkyHTNAACDM23
Ali Elkahky, Wei-Ning Hsu, Paden Tomasello, Tu Anh Nguyen, Robin Algayres, Yossi Adi, Jade Copet, Emmanuel Dupoux, Abdelrahman Mohamed:
Do Coarser Units Benefit Cluster Prediction-Based Speech Pre-Training? ICASSP 2023: 1-5
[c56]
- view
  authority control:
- export record
  dblp key:
  - conf/icassp/FazelZarandiH23
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icassp/FazelZarandiH23
Maryam Fazel-Zarandi, Wei-Ning Hsu:
Cocktail HuBERT: Generalized Self-Supervised Pre-Training for Mixture and Single-Source Speech. ICASSP 2023: 1-5
[c55]
- view
  authority control:
- export record
  dblp key:
  - conf/icassp/SanabriaHBA23
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icassp/SanabriaHBA23
Ramon Sanabria, Wei-Ning Hsu, Alexei Baevski, Michael Auli:
Measuring the Impact of Domain Factors in Self-Supervised Pre-Training. ICASSP Workshops 2023: 1-5
[c54]
- view
  - electronic edition @ mlr.press (open access)
  - details & citations
- export record
  dblp key:
  - conf/icml/AghajanyanYCHHZ23
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icml/AghajanyanYCHHZ23
Armen Aghajanyan, Lili Yu, Alexis Conneau, Wei-Ning Hsu, Karen Hambardzumyan, Susan Zhang, Stephen Roller, Naman Goyal, Omer Levy, Luke Zettlemoyer:
Scaling Laws for Generative Mixed-Modal Language Models. ICML 2023: 265-279
[c53]
- view
  - electronic edition @ mlr.press (open access)
  - details & citations
- export record
  dblp key:
  - conf/icml/BaevskiBHA23
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icml/BaevskiBHA23
Alexei Baevski, Arun Babu, Wei-Ning Hsu, Michael Auli:
Efficient Self-supervised Learning with Contextualized Target Representations for Vision, Speech and Language. ICML 2023: 1416-1429
[c52]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - conf/interspeech/AnwarSGH0W23
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/AnwarSGH0W23
Mohamed Anwar, Bowen Shi, Vedanuj Goswami, Wei-Ning Hsu, Juan Pino, Changhan Wang:
MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation. INTERSPEECH 2023: 4064-4068
[c51]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - conf/interspeech/NguyenHDSGFRCSH23
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/NguyenHDSGFRCSH23
Tu Anh Nguyen, Wei-Ning Hsu, Antony D'Avirro, Bowen Shi, Itai Gat, Maryam Fazel-Zarandi, Tal Remez, Jade Copet, Gabriel Synnaeve, Michael Hassid, Felix Kreuk, Yossi Adi, Emmanuel Dupoux:
Expresso: A Benchmark and Analysis of Discrete Expressive Speech Resynthesis. INTERSPEECH 2023: 4823-4827
[c50]
- view
  - electronic edition @ nips.cc (open access)
  - details & citations
- export record
  dblp key:
  - conf/nips/LeVSKSMWMAMH23
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/nips/LeVSKSMWMAMH23
Matthew Le, Apoorv Vyas, Bowen Shi, Brian Karrer, Leda Sari, Rashel Moritz, Mary Williamson, Vimal Manohar, Yossi Adi, Jay Mahadeokar, Wei-Ning Hsu:
Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale. NeurIPS 2023
[c49]
- view
  - electronic edition @ nips.cc (open access)
  - details & citations
- export record
  dblp key:
  - conf/nips/LiuCAHG23
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/nips/LiuCAHG23
Alexander H. Liu, Heng-Jui Chang, Michael Auli, Wei-Ning Hsu, James R. Glass:
DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning. NeurIPS 2023
[i66]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2301-00652
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2301-00652
Ching-Feng Yeh, Wei-Ning Hsu, Paden Tomasello, Abdelrahman Mohamed:
Efficient Speech Representation Learning with Low-Bit Quantization. CoRR abs/2301.00652 (2023)
[i65]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2301-03728
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2301-03728
Armen Aghajanyan, Lili Yu, Alexis Conneau, Wei-Ning Hsu, Karen Hambardzumyan, Susan Zhang, Stephen Roller, Naman Goyal, Omer Levy, Luke Zettlemoyer:
Scaling Laws for Generative Mixed-Modal Language Models. CoRR abs/2301.03728 (2023)
[i64]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2302-06419
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2302-06419
Jiachen Lian, Alexei Baevski, Wei-Ning Hsu, Michael Auli:
AV-data2vec: Self-supervised Learning of Audio-Visual Speech Representations with Contextualized Target Representations. CoRR abs/2302.06419 (2023)
[i63]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2303-00628
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2303-00628
Mohamed Anwar, Bowen Shi, Vedanuj Goswami, Wei-Ning Hsu, Juan Pino, Changhan Wang:
MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation. CoRR abs/2303.00628 (2023)
[i62]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2303-11131
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2303-11131
Maryam Fazel-Zarandi, Wei-Ning Hsu:
Cocktail HuBERT: Generalized Self-Supervised Pre-training for Mixture and Single-Source Speech. CoRR abs/2303.11131 (2023)
[i61]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2305-10005
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2305-10005
Alexander H. Liu, Heng-Jui Chang, Michael Auli, Wei-Ning Hsu, James R. Glass:
DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning. CoRR abs/2305.10005 (2023)
[i60]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2305-13516
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2305-13516
Vineel Pratap, Andros Tjandra, Bowen Shi, Paden Tomasello, Arun Babu, Sayani Kundu, Ali Elkahky, Zhaoheng Ni, Apoorv Vyas, Maryam Fazel-Zarandi, Alexei Baevski, Yossi Adi, Xiaohui Zhang, Wei-Ning Hsu, Alexis Conneau, Michael Auli:
Scaling Speech Technology to 1,000+ Languages. CoRR abs/2305.13516 (2023)
[i59]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2306-15687
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2306-15687
Matthew Le, Apoorv Vyas, Bowen Shi, Brian Karrer, Leda Sari, Rashel Moritz, Mary Williamson, Vimal Manohar, Yossi Adi, Jay Mahadeokar, Wei-Ning Hsu:
Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale. CoRR abs/2306.15687 (2023)
[i58]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2308-05725
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2308-05725
Tu Anh Nguyen, Wei-Ning Hsu, Antony D'Avirro, Bowen Shi, Itai Gat, Maryam Fazel-Zarandi, Tal Remez, Jade Copet, Gabriel Synnaeve, Michael Hassid, Felix Kreuk, Yossi Adi, Emmanuel Dupoux:
EXPRESSO: A Benchmark and Analysis of Discrete Expressive Speech Resynthesis. CoRR abs/2308.05725 (2023)
[i57]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2309-17020
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2309-17020
Po-Chun Hsu, Ali Elkahky, Wei-Ning Hsu, Yossi Adi, Tu Anh Nguyen, Jade Copet, Emmanuel Dupoux, Hung-yi Lee, Abdelrahman Mohamed:
Low-Resource Self-Supervised Learning with SSL-Enhanced TTS. CoRR abs/2309.17020 (2023)
[i56]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2310-08715
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2310-08715
Ju-Chieh Chou, Chung-Ming Chien, Wei-Ning Hsu, Karen Livescu, Arun Babu, Alexis Conneau, Alexei Baevski, Michael Auli:
Toward Joint Language Modeling for Speech Units and Text. CoRR abs/2310.08715 (2023)
[i55]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2310-16338
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2310-16338
Alexander H. Liu, Matt Le, Apoorv Vyas, Bowen Shi, Andros Tjandra, Wei-Ning Hsu:
Generative Pre-training for Speech with Flow Matching. CoRR abs/2310.16338 (2023)
[i54]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2311-02772
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2311-02772
Sungho Jeon, Ching-Feng Yeh, Hakan Inan, Wei-Ning Hsu, Rashi Rungta, Yashar Mehdad, Daniel Bikel:
Attention or Convolution: Transformer Encoders in Audio Language Models for Inference Efficiency. CoRR abs/2311.02772 (2023)
[i53]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2312-15821
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2312-15821
Apoorv Vyas, Bowen Shi, Matthew Le, Andros Tjandra, Yi-Chiao Wu, Baishan Guo, Jiemin Zhang, Xinyue Zhang, Robert Adkins, William Ngan, Jeff Wang, Ivan Cruz, Bapi Akula, Akinniyi Akinyemi, Brian Ellis, Rashel Moritz, Yael Yungster, Alice Rakotoarison, Liang Tan, Chris Summers, Carleigh Wood, Joshua Lane, Mary Williamson, Wei-Ning Hsu:
Audiobox: Unified Audio Generation with Natural Language Prompts. CoRR abs/2312.15821 (2023)
2022
[c48]
- view
  authority control:
- export record
  dblp key:
  - conf/acl/TangGDWHGBLMAP22
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/acl/TangGDWHGBLMAP22
Yun Tang, Hongyu Gong, Ning Dong, Changhan Wang, Wei-Ning Hsu, Jiatao Gu, Alexei Baevski, Xian Li, Abdelrahman Mohamed, Michael Auli, Juan Miguel Pino:
Unified Speech-Text Pre-training for Speech Translation and Recognition. ACL (1) 2022: 1488-1499
[c47]
- view
  authority control:
- export record
  dblp key:
  - conf/acl/LeeCWGPMPAHTPH22
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/acl/LeeCWGPMPAHTPH22
Ann Lee, Peng-Jen Chen, Changhan Wang, Jiatao Gu, Sravya Popuri, Xutai Ma, Adam Polyak, Yossi Adi, Qing He, Yun Tang, Juan Pino, Wei-Ning Hsu:
Direct Speech-to-Speech Translation With Discrete Units. ACL (1) 2022: 3327-3339
[c46]
- view
  authority control:
- export record
  dblp key:
  - conf/acl/KharitonovLPACL22
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/acl/KharitonovLPACL22
Eugene Kharitonov, Ann Lee, Adam Polyak, Yossi Adi, Jade Copet, Kushal Lakhotia, Tu Anh Nguyen, Morgane Rivière, Abdelrahman Mohamed, Emmanuel Dupoux, Wei-Ning Hsu:
Text-Free Prosody-Aware Generative Spoken Language Modeling. ACL (1) 2022: 8666-8681
[c45]
- view
  authority control:
- export record
  dblp key:
  - conf/emnlp/KreukPCKNRHMDA22
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/emnlp/KreukPCKNRHMDA22
Felix Kreuk, Adam Polyak, Jade Copet, Eugene Kharitonov, Tu Anh Nguyen, Morgane Rivière, Wei-Ning Hsu, Abdelrahman Mohamed, Emmanuel Dupoux, Yossi Adi:
Textless Speech Emotion Conversion using Discrete & Decomposed Representations. EMNLP 2022: 11200-11214
[c44]
- view
  - electronic edition @ openreview.net (open access)
  - details & citations
- export record
  dblp key:
  - conf/iclr/ShiHLM22
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/iclr/ShiHLM22
Bowen Shi, Wei-Ning Hsu, Kushal Lakhotia, Abdelrahman Mohamed:
Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction. ICLR 2022
[c43]
- view
  - electronic edition @ mlr.press (open access)
  - details & citations
- export record
  dblp key:
  - conf/icml/BaevskiHXBGA22
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icml/BaevskiHXBGA22
Alexei Baevski, Wei-Ning Hsu, Qiantong Xu, Arun Babu, Jiatao Gu, Michael Auli:
data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language. ICML 2022: 1298-1312
[c42]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - conf/interspeech/LiuLHABG22
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/LiuLHABG22
Alexander H. Liu, Cheng-I Lai, Wei-Ning Hsu, Michael Auli, Alexei Baevski, James R. Glass:
Simple and Effective Unsupervised Speech Synthesis. INTERSPEECH 2022: 843-847
[c41]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - conf/interspeech/ShiHM22
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/ShiHM22
Bowen Shi, Wei-Ning Hsu, Abdelrahman Mohamed:
Robust Self-Supervised Audio-Visual Speech Recognition. INTERSPEECH 2022: 2118-2122
[c40]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - conf/interspeech/VyasHAB22
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/VyasHAB22
Apoorv Vyas, Wei-Ning Hsu, Michael Auli, Alexei Baevski:
On-demand compute reduction with stochastic wav2vec 2.0. INTERSPEECH 2022: 3048-3052
[c39]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - conf/interspeech/ShiMH22
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/ShiMH22
Bowen Shi, Abdelrahman Mohamed, Wei-Ning Hsu:
Learning Lip-Based Audio-Visual Speaker Embeddings with AV-HuBERT. INTERSPEECH 2022: 4785-4789
[c38]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - conf/interspeech/PopuriCWPAGHL22
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/PopuriCWPAGHL22
Sravya Popuri, Peng-Jen Chen, Changhan Wang, Juan Pino, Yossi Adi, Jiatao Gu, Wei-Ning Hsu, Ann Lee:
Enhanced Direct Speech-to-Speech Translation Using Self-supervised Pre-training and Data Augmentation. INTERSPEECH 2022: 5195-5199
[c37]
- view
  authority control:
- export record
  dblp key:
  - conf/naacl/LeeGDSCWPAPGH22
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/naacl/LeeGDSCWPAPGH22
Ann Lee, Hongyu Gong, Paul-Ambroise Duquenne, Holger Schwenk, Peng-Jen Chen, Changhan Wang, Sravya Popuri, Yossi Adi, Juan Miguel Pino, Jiatao Gu, Wei-Ning Hsu:
Textless Speech-to-Speech Translation on Real Data. NAACL-HLT 2022: 860-872
[c36]
- view
  - electronic edition @ nips.cc (open access)
  - details & citations
- export record
  dblp key:
  - conf/nips/HsuS22
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/nips/HsuS22
Wei-Ning Hsu, Bowen Shi:
u-HuBERT: Unified Mixed-Modal Speech Pretraining And Zero-Shot Transfer to Unlabeled Modality. NeurIPS 2022
[c35]
- view
  authority control:
- export record
  dblp key:
  - conf/slt/LiuHAB22
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/slt/LiuHAB22
Alexander H. Liu, Wei-Ning Hsu, Michael Auli, Alexei Baevski:
Towards End-to-End Unsupervised Speech Recognition. SLT 2022: 221-228
[c34]
- view
  authority control:
- export record
  dblp key:
  - conf/slt/TomaselloSLHLSECHAANDZM22
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/slt/TomaselloSLHLSECHAANDZM22
Paden Tomasello, Akshat Shrivastava, Daniel Lazar, Po-Chun Hsu, Duc Le, Adithya Sagar, Ali Elkahky, Jade Copet, Wei-Ning Hsu, Yossi Adi, Robin Algayres, Tu Anh Nguyen, Emmanuel Dupoux, Luke Zettlemoyer, Abdelrahman Mohamed:
STOP: A Dataset for Spoken Task Oriented Semantic Parsing. SLT 2022: 991-998
[i52]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-2201-01763
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2201-01763
Bowen Shi, Wei-Ning Hsu, Abdelrahman Mohamed:
Robust Self-Supervised Audio-Visual Speech Recognition. CoRR abs/2201.01763 (2022)
[i51]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-2201-02184
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2201-02184
Bowen Shi, Wei-Ning Hsu, Kushal Lakhotia, Abdelrahman Mohamed:
Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction. CoRR abs/2201.02184 (2022)
[i50]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-2202-03555
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2202-03555
Alexei Baevski, Wei-Ning Hsu, Qiantong Xu, Arun Babu, Jiatao Gu, Michael Auli:
data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language. CoRR abs/2202.03555 (2022)
[i49]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-2202-07359
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2202-07359
Eugene Kharitonov, Jade Copet, Kushal Lakhotia, Tu Anh Nguyen, Paden Tomasello, Ann Lee, Ali Elkahky, Wei-Ning Hsu, Abdelrahman Mohamed, Emmanuel Dupoux, Yossi Adi:
textless-lib: a Library for Textless Spoken Language Processing. CoRR abs/2202.07359 (2022)
[i48]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2203-00648
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2203-00648
Ramon Sanabria, Wei-Ning Hsu, Alexei Baevski, Michael Auli:
Measuring the Impact of Individual Domain Factors in Self-Supervised Pre-Training. CoRR abs/2203.00648 (2022)
[i47]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2203-16502
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2203-16502
Tu Anh Nguyen, Eugene Kharitonov, Jade Copet, Yossi Adi, Wei-Ning Hsu, Ali Elkahky, Paden Tomasello, Robin Algayres, Benoît Sagot, Abdelrahman Mohamed, Emmanuel Dupoux:
Generative Spoken Dialogue Language Modeling. CoRR abs/2203.16502 (2022)
[i46]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2204-02492
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2204-02492
Alexander H. Liu, Wei-Ning Hsu, Michael Auli, Alexei Baevski:
Towards End-to-end Unsupervised Speech Recognition. CoRR abs/2204.02492 (2022)
[i45]
dblp key: journals/corr/abs-2204-02524
persistent URL: https://dblp.org/rec/journals/corr/abs-2204-02524
Alexander H. Liu, Cheng-I Jeff Lai, Wei-Ning Hsu, Michael Auli, Alexei Baevski, James R. Glass:
Simple and Effective Unsupervised Speech Synthesis. CoRR abs/2204.02524 (2022)
[i44]
dblp key: journals/corr/abs-2204-02967
persistent URL: https://dblp.org/rec/journals/corr/abs-2204-02967
Sravya Popuri, Peng-Jen Chen, Changhan Wang, Juan Pino, Yossi Adi, Jiatao Gu, Wei-Ning Hsu, Ann Lee:
Enhanced Direct Speech-to-Speech Translation Using Self-supervised Pre-training and Data Augmentation. CoRR abs/2204.02967 (2022)
[i43]
dblp key: journals/corr/abs-2204-05409
persistent URL: https://dblp.org/rec/journals/corr/abs-2204-05409
Yun Tang, Hongyu Gong, Ning Dong, Changhan Wang, Wei-Ning Hsu, Jiatao Gu, Alexei Baevski, Xian Li, Abdelrahman Mohamed, Michael Auli, Juan Miguel Pino:
Unified Speech-Text Pre-training for Speech Translation and Recognition. CoRR abs/2204.05409 (2022)
[i42]
dblp key: journals/corr/abs-2204-11934
persistent URL: https://dblp.org/rec/journals/corr/abs-2204-11934
Apoorv Vyas, Wei-Ning Hsu, Michael Auli, Alexei Baevski:
On-demand compute reduction with stochastic wav2vec 2.0. CoRR abs/2204.11934 (2022)
[i41]
dblp key: journals/corr/abs-2205-07180
persistent URL: https://dblp.org/rec/journals/corr/abs-2205-07180
Bowen Shi, Abdelrahman Mohamed, Wei-Ning Hsu:
Learning Lip-Based Audio-Visual Speaker Embeddings with AV-HuBERT. CoRR abs/2205.07180 (2022)
[i40]
dblp key: journals/corr/abs-2207-07036
persistent URL: https://dblp.org/rec/journals/corr/abs-2207-07036
Wei-Ning Hsu, Bowen Shi:
A Single Self-Supervised Model for Many Speech Modalities Enables Zero-Shot Modality Transfer. CoRR abs/2207.07036 (2022)
[i39]
dblp key: journals/corr/abs-2207-10643
persistent URL: https://dblp.org/rec/journals/corr/abs-2207-10643
Paden Tomasello, Akshat Shrivastava, Daniel Lazar, Po-Chun Hsu, Duc Le, Adithya Sagar, Ali Elkahky, Jade Copet, Wei-Ning Hsu, Yossef Mordechay, Robin Algayres, Tu Anh Nguyen, Emmanuel Dupoux, Luke Zettlemoyer, Abdelrahman Mohamed:
STOP: A dataset for Spoken Task Oriented Semantic Parsing. CoRR abs/2207.10643 (2022)
[i38]
dblp key: journals/corr/abs-2210-10191
persistent URL: https://dblp.org/rec/journals/corr/abs-2210-10191
Changhan Wang, Hirofumi Inaguma, Peng-Jen Chen, Ilia Kulikov, Yun Tang, Wei-Ning Hsu, Michael Auli, Juan Pino:
Simple and Effective Unsupervised Speech Translation. CoRR abs/2210.10191 (2022)
[i37]
dblp key: journals/corr/abs-2211-06474
persistent URL: https://dblp.org/rec/journals/corr/abs-2211-06474
Peng-Jen Chen, Kevin Tran, Yilin Yang, Jingfei Du, Justine Kao, Yu-An Chung, Paden Tomasello, Paul-Ambroise Duquenne, Holger Schwenk, Hongyu Gong, Hirofumi Inaguma, Sravya Popuri, Changhan Wang, Juan Miguel Pino, Wei-Ning Hsu, Ann Lee:
Speech-to-Speech Translation For A Real-world Unwritten Language. CoRR abs/2211.06474 (2022)
[i36]
dblp key: journals/corr/abs-2212-01393
persistent URL: https://dblp.org/rec/journals/corr/abs-2212-01393
Anuj Diwan, Ching-Feng Yeh, Wei-Ning Hsu, Paden Tomasello, Eunsol Choi, David Harwath, Abdelrahman Mohamed:
Continual Learning for On-Device Speech Recognition using Disentangled Conformers. CoRR abs/2212.01393 (2022)
[i35]
dblp key: journals/corr/abs-2212-07525
persistent URL: https://dblp.org/rec/journals/corr/abs-2212-07525
Alexei Baevski, Arun Babu, Wei-Ning Hsu, Michael Auli:
Efficient Self-supervised Learning with Contextualized Target Representations for Vision, Speech and Language. CoRR abs/2212.07525 (2022)
[i34]
dblp key: journals/corr/abs-2212-11377
persistent URL: https://dblp.org/rec/journals/corr/abs-2212-11377
Wei-Ning Hsu, Tal Remez, Bowen Shi, Jacob Donley, Yossi Adi:
ReVISE: Self-Supervised Speech Resynthesis with Visual Input for Universal and Generalized Speech Enhancement. CoRR abs/2212.11377 (2022)
2021
[j1]
dblp key: journals/taslp/HsuBTLSM21
persistent URL: https://dblp.org/rec/journals/taslp/HsuBTLSM21
Wei-Ning Hsu, Benjamin Bolte, Yao-Hung Hubert Tsai, Kushal Lakhotia, Ruslan Salakhutdinov, Abdelrahman Mohamed:
HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units. IEEE ACM Trans. Audio Speech Lang. Process. 29: 3451-3460 (2021)
[c33]
dblp key: conf/acl/HsuHMSG20
persistent URL: https://dblp.org/rec/conf/acl/HsuHMSG20
Wei-Ning Hsu, David Harwath, Tyler Miller, Christopher Song, James R. Glass:
Text-Free Image-to-Speech Synthesis Using Learned Segmental Units. ACL/IJCNLP (1) 2021: 5284-5300
[c32]
dblp key: conf/asru/ManoharLXHCSZM21
persistent URL: https://dblp.org/rec/conf/asru/ManoharLXHCSZM21
Vimal Manohar, Tatiana Likhomanenko, Qiantong Xu, Wei-Ning Hsu, Ronan Collobert, Yatharth Saraf, Geoffrey Zweig, Abdelrahman Mohamed:
Kaizen: Continuously Improving Teacher Using Exponential Moving Average for Semi-Supervised Speech Recognition. ASRU 2021: 518-525
[c31]
dblp key: conf/emnlp/WangHAPLCGP21
persistent URL: https://dblp.org/rec/conf/emnlp/WangHAPLCGP21
Changhan Wang, Wei-Ning Hsu, Yossi Adi, Adam Polyak, Ann Lee, Peng-Jen Chen, Jiatao Gu, Juan Pino:
fairseq S^2: A Scalable and Integrable Speech Synthesis Toolkit. EMNLP (Demos) 2021: 143-152
[c30]
dblp key: conf/icassp/HsuTBSM21
persistent URL: https://dblp.org/rec/conf/icassp/HsuTBSM21
Wei-Ning Hsu, Yao-Hung Hubert Tsai, Benjamin Bolte, Ruslan Salakhutdinov, Abdelrahman Mohamed:
HuBERT: How Much Can a Bad Teacher Benefit ASR Pre-Training? ICASSP 2021: 6533-6537
[c29]
dblp key: conf/interspeech/HsuSBLXPK0CSA21
persistent URL: https://dblp.org/rec/conf/interspeech/HsuSBLXPK0CSA21
Wei-Ning Hsu, Anuroop Sriram, Alexei Baevski, Tatiana Likhomanenko, Qiantong Xu, Vineel Pratap, Jacob Kahn, Ann Lee, Ronan Collobert, Gabriel Synnaeve, Michael Auli:
Robust wav2vec 2.0: Analyzing Domain Shift in Self-Supervised Pre-Training. Interspeech 2021: 721-725
[c28]
dblp key: conf/interspeech/PolyakACKLHMD21
persistent URL: https://dblp.org/rec/conf/interspeech/PolyakACKLHMD21
Adam Polyak, Yossi Adi, Jade Copet, Eugene Kharitonov, Kushal Lakhotia, Wei-Ning Hsu, Abdelrahman Mohamed, Emmanuel Dupoux:
Speech Resynthesis from Discrete Disentangled Self-Supervised Representations. Interspeech 2021: 3615-3619
[c27]
dblp key: conf/nips/BaevskiHCA21
persistent URL: https://dblp.org/rec/conf/nips/BaevskiHCA21
Alexei Baevski, Wei-Ning Hsu, Alexis Conneau, Michael Auli:
Unsupervised Speech Recognition. NeurIPS 2021: 27826-27839
[c26]
dblp key: conf/slt/Hsu0SH21
persistent URL: https://dblp.org/rec/conf/slt/Hsu0SH21
Wei-Ning Hsu, Ann Lee, Gabriel Synnaeve, Awni Y. Hannun:
Semi-Supervised end-to-end Speech Recognition via Local Prior Matching. SLT 2021: 125-132
[i33]
dblp key: journals/corr/abs-2102-01192
persistent URL: https://dblp.org/rec/journals/corr/abs-2102-01192
Kushal Lakhotia, Evgeny Kharitonov, Wei-Ning Hsu, Yossi Adi, Adam Polyak, Benjamin Bolte, Tu Anh Nguyen, Jade Copet, Alexei Baevski, Abdelrahman Mohamed, Emmanuel Dupoux:
Generative Spoken Language Modeling from Raw Audio. CoRR abs/2102.01192 (2021)
[i32]
dblp key: journals/corr/abs-2104-00355
persistent URL: https://dblp.org/rec/journals/corr/abs-2104-00355
Adam Polyak, Yossi Adi, Jade Copet, Eugene Kharitonov, Kushal Lakhotia, Wei-Ning Hsu, Abdelrahman Mohamed, Emmanuel Dupoux:
Speech Resynthesis from Discrete Disentangled Self-Supervised Representations. CoRR abs/2104.00355 (2021)
[i31]
dblp key: journals/corr/abs-2104-01027
persistent URL: https://dblp.org/rec/journals/corr/abs-2104-01027
Wei-Ning Hsu, Anuroop Sriram, Alexei Baevski, Tatiana Likhomanenko, Qiantong Xu, Vineel Pratap, Jacob Kahn, Ann Lee, Ronan Collobert, Gabriel Synnaeve, Michael Auli:
Robust wav2vec 2.0: Analyzing Domain Shift in Self-Supervised Pre-Training. CoRR abs/2104.01027 (2021)
[i30]
dblp key: journals/corr/abs-2105-11084
persistent URL: https://dblp.org/rec/journals/corr/abs-2105-11084
Alexei Baevski, Wei-Ning Hsu, Alexis Conneau, Michael Auli:
Unsupervised Speech Recognition. CoRR abs/2105.11084 (2021)
[i29]
dblp key: journals/corr/abs-2106-07447
persistent URL: https://dblp.org/rec/journals/corr/abs-2106-07447
Wei-Ning Hsu, Benjamin Bolte, Yao-Hung Hubert Tsai, Kushal Lakhotia, Ruslan Salakhutdinov, Abdelrahman Mohamed:
HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units. CoRR abs/2106.07447 (2021)
[i28]
dblp key: journals/corr/abs-2106-07759
persistent URL: https://dblp.org/rec/journals/corr/abs-2106-07759
Vimal Manohar, Tatiana Likhomanenko, Qiantong Xu, Wei-Ning Hsu, Ronan Collobert, Yatharth Saraf, Geoffrey Zweig, Abdelrahman Mohamed:
Kaizen: Continuously improving teacher using Exponential Moving Average for semi-supervised speech recognition. CoRR abs/2106.07759 (2021)
[i27]
dblp key: journals/corr/abs-2107-05604
persistent URL: https://dblp.org/rec/journals/corr/abs-2107-05604
Ann Lee, Peng-Jen Chen, Changhan Wang, Jiatao Gu, Xutai Ma, Adam Polyak, Yossi Adi, Qing He, Yun Tang, Juan Miguel Pino, Wei-Ning Hsu:
Direct speech-to-speech translation with discrete units. CoRR abs/2107.05604 (2021)
[i26]
dblp key: journals/corr/abs-2109-03264
persistent URL: https://dblp.org/rec/journals/corr/abs-2109-03264
Eugene Kharitonov, Ann Lee, Adam Polyak, Yossi Adi, Jade Copet, Kushal Lakhotia, Tu Anh Nguyen, Morgane Rivière, Abdelrahman Mohamed, Emmanuel Dupoux, Wei-Ning Hsu:
Text-Free Prosody-Aware Generative Spoken Language Modeling. CoRR abs/2109.03264 (2021)
[i25]
dblp key: journals/corr/abs-2109-06912
persistent URL: https://dblp.org/rec/journals/corr/abs-2109-06912
Changhan Wang, Wei-Ning Hsu, Yossi Adi, Adam Polyak, Ann Lee, Peng-Jen Chen, Jiatao Gu, Juan Miguel Pino:
fairseq S^2: A Scalable and Integrable Speech Synthesis Toolkit. CoRR abs/2109.06912 (2021)
[i24]
dblp key: journals/corr/abs-2110-08250
persistent URL: https://dblp.org/rec/journals/corr/abs-2110-08250
Xutai Ma, Hongyu Gong, Danni Liu, Ann Lee, Yun Tang, Peng-Jen Chen, Wei-Ning Hsu, Kenneth Heafield, Philipp Koehn, Juan Miguel Pino:
Direct simultaneous speech to speech translation. CoRR abs/2110.08250 (2021)
[i23]
dblp key: journals/corr/abs-2111-07402
persistent URL: https://dblp.org/rec/journals/corr/abs-2111-07402
Felix Kreuk, Adam Polyak, Jade Copet, Eugene Kharitonov, Tu Anh Nguyen, Morgane Rivière, Wei-Ning Hsu, Abdelrahman Mohamed, Emmanuel Dupoux, Yossi Adi:
Textless Speech Emotion Conversion using Decomposed and Discrete Representations. CoRR abs/2111.07402 (2021)
[i22]
dblp key: journals/corr/abs-2112-08352
persistent URL: https://dblp.org/rec/journals/corr/abs-2112-08352
Ann Lee, Hongyu Gong, Paul-Ambroise Duquenne, Holger Schwenk, Peng-Jen Chen, Changhan Wang, Sravya Popuri, Juan Miguel Pino, Jiatao Gu, Wei-Ning Hsu:
Textless Speech-to-Speech Translation on Real Data. CoRR abs/2112.08352 (2021)
2020
[c25]
dblp key: conf/iclr/HarwathHG20
persistent URL: https://dblp.org/rec/conf/iclr/HarwathHG20
David Harwath, Wei-Ning Hsu, James R. Glass:
Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech. ICLR 2020
[c24]
dblp key: conf/interspeech/GumpHG20
persistent URL: https://dblp.org/rec/conf/interspeech/GumpHG20
Michael Gump, Wei-Ning Hsu, James R. Glass:
Unsupervised Methods for Evaluating Speech Representations. INTERSPEECH 2020: 170-174
[c23]
dblp key: conf/interspeech/KhuranaLHCLMG20
persistent URL: https://dblp.org/rec/conf/interspeech/KhuranaLHCLMG20
Sameer Khurana, Antoine Laurent, Wei-Ning Hsu, Jan Chorowski, Adrian Lancucki, Ricard Marxer, James R. Glass:
A Convolutional Deep Markov Model for Unsupervised Speech Representation Learning. INTERSPEECH 2020: 3790-3794
[i21]
dblp key: journals/corr/abs-2002-10336
persistent URL: https://dblp.org/rec/journals/corr/abs-2002-10336
Wei-Ning Hsu, Ann Lee, Gabriel Synnaeve, Awni Y. Hannun:
Semi-Supervised Speech Recognition via Local Prior Matching. CoRR abs/2002.10336 (2020)
[i20]
dblp key: journals/corr/abs-2006-02547
persistent URL: https://dblp.org/rec/journals/corr/abs-2006-02547
Sameer Khurana, Antoine Laurent, Wei-Ning Hsu, Jan Chorowski, Adrian Lancucki, Ricard Marxer, James R. Glass:
A Convolutional Deep Markov Model for Unsupervised Speech Representation Learning. CoRR abs/2006.02547 (2020)
[i19]
dblp key: journals/corr/abs-2010-01003
persistent URL: https://dblp.org/rec/journals/corr/abs-2010-01003
Awni Y. Hannun, Vineel Pratap, Jacob Kahn, Wei-Ning Hsu:
Differentiable Weighted Finite-State Transducers. CoRR abs/2010.01003 (2020)
[i18]
dblp key: journals/corr/abs-2012-15454
persistent URL: https://dblp.org/rec/journals/corr/abs-2012-15454
Wei-Ning Hsu, David Harwath, Christopher Song, James R. Glass:
Text-Free Image-to-Speech Synthesis Using Learned Segmental Units. CoRR abs/2012.15454 (2020)

2010 – 2019

2019
[c22]
dblp key: conf/icassp/HsuZWCWWG19
persistent URL: https://dblp.org/rec/conf/icassp/HsuZWCWWG19
Wei-Ning Hsu, Yu Zhang, Ron J. Weiss, Yu-An Chung, Yuxuan Wang, Yonghui Wu, James R. Glass:
Disentangling Correlated Speaker and Noise for Speech Synthesis via Data Augmentation and Adversarial Factorization. ICASSP 2019: 5901-5905
[c21]
dblp key: conf/icassp/ChungWHZS19
persistent URL: https://dblp.org/rec/conf/icassp/ChungWHZS19
Yu-An Chung, Yuxuan Wang, Wei-Ning Hsu, Yu Zhang, R. J. Skerry-Ryan:
Semi-supervised Training for Improving Data Efficiency in End-to-end Speech Synthesis. ICASSP 2019: 6940-6944
[c20]
dblp key: conf/iclr/HsuZWZWWCJCSNP19
persistent URL: https://dblp.org/rec/conf/iclr/HsuZWZWWCJCSNP19
Wei-Ning Hsu, Yu Zhang, Ron J. Weiss, Heiga Zen, Yonghui Wu, Yuxuan Wang, Yuan Cao, Ye Jia, Zhifeng Chen, Jonathan Shen, Patrick Nguyen, Ruoming Pang:
Hierarchical Generative Modeling for Controllable Speech Synthesis. ICLR (Poster) 2019
[c19]
dblp key: conf/interspeech/ChungHTG19
persistent URL: https://dblp.org/rec/conf/interspeech/ChungHTG19
Yu-An Chung, Wei-Ning Hsu, Hao Tang, James R. Glass:
An Unsupervised Autoregressive Model for Speech Representation Learning. INTERSPEECH 2019: 146-150
[c18]
dblp key: conf/interspeech/HsuHG19
persistent URL: https://dblp.org/rec/conf/interspeech/HsuHG19
Wei-Ning Hsu, David Harwath, James R. Glass:
Transfer Learning from Audio-Visual Grounding to Speech Recognition. INTERSPEECH 2019: 3242-3246
[i17]
dblp key: journals/corr/abs-1902-08295
persistent URL: https://dblp.org/rec/journals/corr/abs-1902-08295
Jonathan Shen, Patrick Nguyen, Yonghui Wu, Zhifeng Chen, Mia Xu Chen, Ye Jia, Anjuli Kannan, Tara N. Sainath, Yuan Cao, Chung-Cheng Chiu, Yanzhang He, Jan Chorowski, Smit Hinsu, Stella Laurenzo, James Qin, Orhan Firat, Wolfgang Macherey, Suyog Gupta, Ankur Bapna, Shuyuan Zhang, Ruoming Pang, Ron J. Weiss, Rohit Prabhavalkar, Qiao Liang, Benoit Jacob, Bowen Liang, HyoukJoong Lee, Ciprian Chelba, Sébastien Jean, Bo Li, Melvin Johnson, Rohan Anil, Rajat Tibrewal, Xiaobing Liu, Akiko Eriguchi, Navdeep Jaitly, Naveen Ari, Colin Cherry, Parisa Haghani, Otavio Good, Youlong Cheng, Raziel Alvarez, Isaac Caswell, Wei-Ning Hsu, Zongheng Yang, Kuan-Chieh Wang, Ekaterina Gonina, Katrin Tomanek, Ben Vanik, Zelin Wu, Llion Jones, Mike Schuster, Yanping Huang, Dehao Chen, Kazuki Irie, George F. Foster, John Richardson, Klaus Macherey, Antoine Bruguier, Heiga Zen, Colin Raffel, Shankar Kumar, Kanishka Rao, David Rybach, Matthew Murray, Vijayaditya Peddinti, Maxim Krikun, Michiel Bacchiani, Thomas B. Jablin, Robert Suderman, Ian Williams, Benjamin Lee, Deepti Bhatia, Justin Carlson, Semih Yavuz, Yu Zhang, Ian McGraw, Max Galkin, Qi Ge, Golan Pundak, Chad Whipkey, Todd Wang, Uri Alon, Dmitry Lepikhin, Ye Tian, Sara Sabour, William Chan, Shubham Toshniwal, Baohua Liao, Michael Nirschl, Pat Rondon:
Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling. CoRR abs/1902.08295 (2019)
[i16]
dblp key: journals/corr/abs-1904-03240
persistent URL: https://dblp.org/rec/journals/corr/abs-1904-03240
Yu-An Chung, Wei-Ning Hsu, Hao Tang, James R. Glass:
An Unsupervised Autoregressive Model for Speech Representation Learning. CoRR abs/1904.03240 (2019)
[i15]
dblp key: journals/corr/abs-1907-04355
persistent URL: https://dblp.org/rec/journals/corr/abs-1907-04355
Wei-Ning Hsu, David F. Harwath, James R. Glass:
Transfer Learning from Audio-Visual Grounding to Speech Recognition. CoRR abs/1907.04355 (2019)
[i14]
dblp key: journals/corr/abs-1911-09602
persistent URL: https://dblp.org/rec/journals/corr/abs-1911-09602
David Harwath, Wei-Ning Hsu, James R. Glass:
Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech. CoRR abs/1911.09602 (2019)
2018
[c17]
dblp key: conf/icassp/HsuG18
persistent URL: https://dblp.org/rec/conf/icassp/HsuG18
Wei-Ning Hsu, James R. Glass:
Extracting Domain Invariant Features by Unsupervised Learning for Robust Automatic Speech Recognition. ICASSP 2018: 5614-5618
[c16]
dblp key: conf/icpr/ZhengWXHG18
persistent URL: https://dblp.org/rec/conf/icpr/ZhengWXHG18
Siqi Zheng, Jianzong Wang, Jing Xiao, Wei-Ning Hsu, James R. Glass:
A Noise-Robust Self-Adaptive Multitarget Speaker Detection System. ICPR 2018: 1068-1072
[c15]
dblp key: conf/interspeech/HsuG18
persistent URL: https://dblp.org/rec/conf/interspeech/HsuG18
Wei-Ning Hsu, James R. Glass:
Scalable Factorized Hierarchical Variational Autoencoder Training. INTERSPEECH 2018: 1462-1466
[c14]
dblp key: conf/interspeech/HsuTG18
persistent URL: https://dblp.org/rec/conf/interspeech/HsuTG18
Wei-Ning Hsu, Hao Tang, James R. Glass:
Unsupervised Adaptation with Interpretable Disentangled Representations for Distant Conversational Speech Recognition. INTERSPEECH 2018: 1576-1580
[c13]
dblp key: conf/interspeech/TangHGG18
persistent URL: https://dblp.org/rec/conf/interspeech/TangHGG18
Hao Tang, Wei-Ning Hsu, François Grondin, James R. Glass:
A Study of Enhancement, Augmentation and Autoencoder Methods for Domain Adaptation in Distant Speech Recognition. INTERSPEECH 2018: 2928-2932
[c12]
dblp key: conf/slt/ShonHG18
persistent URL: https://dblp.org/rec/conf/slt/ShonHG18
Suwon Shon, Wei-Ning Hsu, James R. Glass:
Unsupervised Representation Learning of Speech for Dialect Identification. SLT 2018: 105-111
[i13]
dblp key: journals/corr/abs-1803-02551
persistent URL: https://dblp.org/rec/journals/corr/abs-1803-02551
Wei-Ning Hsu, James R. Glass:
Extracting Domain Invariant Features by Unsupervised Learning for Robust Automatic Speech Recognition. CoRR abs/1803.02551 (2018)
[i12]
dblp key: journals/corr/abs-1804-03201
persistent URL: https://dblp.org/rec/journals/corr/abs-1804-03201
Wei-Ning Hsu, James R. Glass:
Scalable Factorized Hierarchical Variational Autoencoder Training. CoRR abs/1804.03201 (2018)
[i11]
dblp key: journals/corr/abs-1805-11264
persistent URL: https://dblp.org/rec/journals/corr/abs-1805-11264
Wei-Ning Hsu, James R. Glass:
Disentangling by Partitioning: A Representation Learning Framework for Multimodal Sensory Data. CoRR abs/1805.11264 (2018)
[i10]
dblp key: journals/corr/abs-1806-04841
persistent URL: https://dblp.org/rec/journals/corr/abs-1806-04841
Hao Tang, Wei-Ning Hsu, François Grondin, James R. Glass:
A Study of Enhancement, Augmentation, and Autoencoder Methods for Domain Adaptation in Distant Speech Recognition. CoRR abs/1806.04841 (2018)
[i9]
dblp key: journals/corr/abs-1806-04872
persistent URL: https://dblp.org/rec/journals/corr/abs-1806-04872
Wei-Ning Hsu, Hao Tang, James R. Glass:
Unsupervised Adaptation with Interpretable Disentangled Representations for Distant Conversational Speech Recognition. CoRR abs/1806.04872 (2018)
[i8]
dblp key: journals/corr/abs-1808-10128
persistent URL: https://dblp.org/rec/journals/corr/abs-1808-10128
Yu-An Chung, Yuxuan Wang, Wei-Ning Hsu, Yu Zhang, R. J. Skerry-Ryan:
Semi-Supervised Training for Improving Data Efficiency in End-to-End Speech Synthesis. CoRR abs/1808.10128 (2018)
[i7]
dblp key: journals/corr/abs-1809-04458
persistent URL: https://dblp.org/rec/journals/corr/abs-1809-04458
Suwon Shon, Wei-Ning Hsu, James R. Glass:
Unsupervised Representation Learning of Speech for Dialect Identification. CoRR abs/1809.04458 (2018)
[i6]
dblp key: journals/corr/abs-1810-07217
persistent URL: https://dblp.org/rec/journals/corr/abs-1810-07217
Wei-Ning Hsu, Yu Zhang, Ron J. Weiss, Heiga Zen, Yonghui Wu, Yuxuan Wang, Yuan Cao, Ye Jia, Zhifeng Chen, Jonathan Shen, Patrick Nguyen, Ruoming Pang:
Hierarchical Generative Modeling for Controllable Speech Synthesis. CoRR abs/1810.07217 (2018)
2017
[c11]
dblp key: conf/asru/HsuZG17
persistent URL: https://dblp.org/rec/conf/asru/HsuZG17
Wei-Ning Hsu, Yu Zhang, James R. Glass:
Unsupervised domain adaptation for robust speech recognition via variational autoencoder-based data augmentation. ASRU 2017: 16-23
[c10]
dblp key: conf/asru/NajafianHAG17
persistent URL: https://dblp.org/rec/conf/asru/NajafianHAG17
Maryam Najafian, Wei-Ning Hsu, Ahmed Ali, James R. Glass:
Automatic speech recognition of Arabic multi-genre broadcast media. ASRU 2017: 353-359
[c9]
dblp key: conf/interspeech/HsuZG17
persistent URL: https://dblp.org/rec/conf/interspeech/HsuZG17
Wei-Ning Hsu, Yu Zhang, James R. Glass:
Learning Latent Representations for Speech Generation and Transformation. INTERSPEECH 2017: 1273-1277
[c8]
dblp key: conf/nips/HsuZG17
persistent URL: https://dblp.org/rec/conf/nips/HsuZG17
Wei-Ning Hsu, Yu Zhang, James R. Glass:
Unsupervised Learning of Disentangled and Interpretable Representations from Sequential Data. NIPS 2017: 1878-1889
[i5]
dblp key: journals/corr/HsuZG17
persistent URL: https://dblp.org/rec/journals/corr/HsuZG17
Wei-Ning Hsu, Yu Zhang, James R. Glass:
Learning Latent Representations for Speech Generation and Transformation. CoRR abs/1704.04222 (2017)
[i4]
dblp key: journals/corr/HsuZG17aa
persistent URL: https://dblp.org/rec/journals/corr/HsuZG17aa
Wei-Ning Hsu, Yu Zhang, James R. Glass:
Unsupervised Domain Adaptation for Robust Speech Recognition via Variational Autoencoder-Based Data Augmentation. CoRR abs/1707.06265 (2017)
[i3]
dblp key: journals/corr/abs-1709-07902
persistent URL: https://dblp.org/rec/journals/corr/abs-1709-07902
Wei-Ning Hsu, Yu Zhang, James R. Glass:
Unsupervised Learning of Disentangled and Interpretable Representations from Sequential Data. CoRR abs/1709.07902 (2017)
2016
[c7]
dblp key: conf/coling/RomeoMBMBHZMG16
persistent URL: https://dblp.org/rec/conf/coling/RomeoMBMBHZMG16
Salvatore Romeo, Giovanni Da San Martino, Alberto Barrón-Cedeño, Alessandro Moschitti, Yonatan Belinkov, Wei-Ning Hsu, Yu Zhang, Mitra Mohtarami, James R. Glass:
Neural Attention for Learning to Rank Questions in Community Question Answering. COLING 2016: 1734-1745
[c6]
dblp key: conf/interspeech/HsuZLG16
persistent URL: https://dblp.org/rec/conf/interspeech/HsuZLG16
Wei-Ning Hsu, Yu Zhang, Ann Lee, James R. Glass:
Exploiting Depth and Highway Connections in Convolutional Recurrent Deep Neural Networks for Speech Recognition. INTERSPEECH 2016: 395-399
[c5]
https://dblp.org/rec/conf/semeval/MohtaramiBHZLBC16
Mitra Mohtarami, Yonatan Belinkov, Wei-Ning Hsu, Yu Zhang, Tao Lei, Kfir Bar, Scott Cyphers, James R. Glass:
SLS at SemEval-2016 Task 3: Neural-based Approaches for Ranking in Community Question Answering. SemEval@NAACL-HLT 2016: 828-835
[c4]
https://dblp.org/rec/conf/slt/HanaiHG16
Tuka Al Hanai, Wei-Ning Hsu, James R. Glass:
Development of the MIT ASR system for the 2016 Arabic Multi-genre Broadcast Challenge. SLT 2016: 299-304
[c3]
https://dblp.org/rec/conf/slt/HsuZG16
Wei-Ning Hsu, Yu Zhang, James R. Glass:
A prioritized grid long short-term memory RNN for speech recognition. SLT 2016: 467-473
[i2]
https://dblp.org/rec/journals/corr/HsuZG16
Wei-Ning Hsu, Yu Zhang, James R. Glass:
Recurrent Neural Network Encoder with Attention for Community Question Answering. CoRR abs/1603.07044 (2016)
2015
[c2]
https://dblp.org/rec/conf/aaai/HsuL15
Wei-Ning Hsu, Hsuan-Tien Lin:
Active Learning by Learning. AAAI 2015: 2659-2665
[c1]
https://dblp.org/rec/conf/icassp/ChungHLL15
Cheng-Tao Chung, Wei-Ning Hsu, Cheng-Yi Lee, Lin-Shan Lee:
Enhancing automatically discovered multi-level acoustic patterns considering context consistency with applications in spoken term detection. ICASSP 2015: 5231-5235
[i1]
https://dblp.org/rec/journals/corr/ChungHLL15
Cheng-Tao Chung, Wei-Ning Hsu, Cheng-Yi Lee, Lin-Shan Lee:
Enhancing Automatically Discovered Multi-level Acoustic Patterns Considering Context Consistency With Applications in Spoken Term Detection. CoRR abs/1509.02217 (2015)