Zejun Ma

Journal Articles

2024
[j5]
persistent URL: https://dblp.org/rec/journals/tnn/QinMDLZMWLL24
Haotong Qin, Xudong Ma, Yifu Ding, Xiaoyang Li, Yang Zhang, Zejun Ma, Jiakai Wang, Jie Luo, Xianglong Liu:
BiFSMNv2: Pushing Binary Neural Networks for Keyword Spotting to Real-Network Performance. IEEE Trans. Neural Networks Learn. Syst. 35(8): 10674-10686 (2024)
2023
[j4]
persistent URL: https://dblp.org/rec/journals/nn/LiangDZM0G23
Huidong Liang, Xingjian Du, Bilei Zhu, Zejun Ma, Ke Chen, Junbin Gao:
Graph contrastive learning with implicit augmentations. Neural Networks 163: 156-164 (2023)
[j3]
persistent URL: https://dblp.org/rec/journals/pami/WeiVQOM23
Pengfei Wei, Thanh Vinh Vo, Xinghua Qu, Yew Soon Ong, Zejun Ma:
Transfer Kernel Learning for Multi-Source Transfer Gaussian Process Regression. IEEE Trans. Pattern Anal. Mach. Intell. 45(3): 3862-3876 (2023)
[j2]
persistent URL: https://dblp.org/rec/journals/pami/WeiKOM23
Pengfei Wei, Yiping Ke, Yew-Soon Ong, Zejun Ma:
Adaptive Transfer Kernel Learning for Transfer Gaussian Process Regression. IEEE Trans. Pattern Anal. Mach. Intell. 45(6): 7142-7156 (2023)
2022
[j1]
persistent URL: https://dblp.org/rec/journals/spl/FanDCMX22
Zhiyun Fan, Linhao Dong, Meng Cai, Zejun Ma, Bo Xu:
Sequence-Level Speaker Change Detection With Difference-Based Continuous Integrate-and-Fire. IEEE Signal Process. Lett. 29: 1551-1554 (2022)

Conference and Workshop Papers

2024
[c73]
persistent URL: https://dblp.org/rec/conf/icassp/FanDZ0M24
Zhiyun Fan, Linhao Dong, Jun Zhang, Lu Lu, Zejun Ma:
SA-SOT: Speaker-Aware Serialized Output Training for Multi-Talker ASR. ICASSP 2024: 9986-9990
[c72]
persistent URL: https://dblp.org/rec/conf/icassp/KhassanovCCCLLM24
Yerbolat Khassanov, Zhipeng Chen, Tianfeng Chen, Tze Yuang Chong, Wei Li, Lu Lu, Zejun Ma:
Extending Multilingual ASR to New Languages Using Supplementary Encoder and Decoder Components. ICASSP 2024: 10586-10590
[c71]
persistent URL: https://dblp.org/rec/conf/icassp/TangYSC0LLMZ24
Changli Tang, Wenyi Yu, Guangzhi Sun, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Chao Zhang:
Extending Large Language Models for Speech and Audio Captioning. ICASSP 2024: 11236-11240
[c70]
persistent URL: https://dblp.org/rec/conf/icassp/YuTSC0L0M024
Wenyi Yu, Changli Tang, Guangzhi Sun, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Chao Zhang:
Connecting Speech Encoder and Large Language Model for ASR. ICASSP 2024: 12637-12641
[c69]
persistent URL: https://dblp.org/rec/conf/iclr/0001L0HYJY0WW0M24
Ziyue Jiang, Jinglin Liu, Yi Ren, Jinzheng He, Zhenhui Ye, Shengpeng Ji, Qian Yang, Chen Zhang, Pengfei Wei, Chunfeng Wang, Xiang Yin, Zejun Ma, Zhou Zhao:
Mega-TTS 2: Boosting Prompting Mechanisms for Zero-Shot Speech Synthesis. ICLR 2024
[c68]
persistent URL: https://dblp.org/rec/conf/iclr/DongH00KZF0WCYB24
Qianqian Dong, Zhiying Huang, Qi Tian, Chen Xu, Tom Ko, Yunlong Zhao, Siyuan Feng, Tang Li, Kexin Wang, Xuxin Cheng, Fengpeng Yue, Ye Bai, Xi Chen, Lu Lu, Zejun Ma, Yuping Wang, Mingxuan Wang, Yuxuan Wang:
PolyVoice: Language Models for Speech to Speech Translation. ICLR 2024
[c67]
persistent URL: https://dblp.org/rec/conf/iclr/TangYSC000M024
Changli Tang, Wenyi Yu, Guangzhi Sun, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Chao Zhang:
SALMONN: Towards Generic Hearing Abilities for Large Language Models. ICLR 2024
[c66]
persistent URL: https://dblp.org/rec/conf/iclr/YeZ0YLH0HHL00MZ24
Zhenhui Ye, Tianyun Zhong, Yi Ren, Jiaqi Yang, Weichuang Li, Jiawei Huang, Ziyue Jiang, Jinzheng He, Rongjie Huang, Jinglin Liu, Chen Zhang, Xiang Yin, Zejun Ma, Zhou Zhao:
Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis. ICLR 2024
[c65]
persistent URL: https://dblp.org/rec/conf/icml/SunYTC000M0024
Guangzhi Sun, Wenyi Yu, Changli Tang, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Yuxuan Wang, Chao Zhang:
video-SALMONN: Speech-Enhanced Audio-Visual Large Language Models. ICML 2024
2023
[c64]
persistent URL: https://dblp.org/rec/conf/acl/DongAWZLM23
Linhao Dong, Zhecheng An, Peihao Wu, Jun Zhang, Lu Lu, Zejun Ma:
CIF-PT: Bridging Speech and Text Representations for Spoken Language Understanding via Continuous Integrate-and-Fire Pre-Training. ACL (Findings) 2023: 8894-8907
[c63]
persistent URL: https://dblp.org/rec/conf/asru/QiuHLZLM23
Jin Qiu, Lu Huang, Boyu Li, Jun Zhang, Lu Lu, Zejun Ma:
Improving Large-Scale Deep Biasing With Phoneme Features and Text-Only Data in Streaming Transducer. ASRU 2023: 1-8
[c62]
persistent URL: https://dblp.org/rec/conf/icassp/DuWLLZM23
Xingjian Du, Zijie Wang, Xia Liang, Huidong Liang, Bilei Zhu, Zejun Ma:
Bytecover3: Accurate Cover Song Identification On Short Queries. ICASSP 2023: 1-5
[c61]
persistent URL: https://dblp.org/rec/conf/icassp/LiuFTSLML23
Wei Liu, Kaiqi Fu, Xiaohai Tian, Shuju Shi, Wei Li, Zejun Ma, Tan Lee:
Leveraging Phone-Level Linguistic-Acoustic Similarity For Utterance-Level Pronunciation Scoring. ICASSP 2023: 1-5
[c60]
persistent URL: https://dblp.org/rec/conf/icassp/LiuFTSLML23a
Wei Liu, Kaiqi Fu, Xiaohai Tian, Shuju Shi, Wei Li, Zejun Ma, Tan Lee:
An ASR-Free Fluency Scoring Approach with Self-Supervised Learning. ICASSP 2023: 1-5
[c59]
persistent URL: https://dblp.org/rec/conf/icassp/MaWQQXWM23
Rao Ma, Xiaobo Wu, Jin Qiu, Yanan Qin, Haihua Xu, Peihao Wu, Zejun Ma:
Internal Language Model Estimation Based Adaptive Language Model Fusion for Domain Adaptation. ICASSP 2023: 1-5
[c58]
persistent URL: https://dblp.org/rec/conf/icassp/WangHZZLYM23
Chunfeng Wang, Peisong Huang, Yuxiang Zou, Haoyu Zhang, Shichao Liu, Xiang Yin, Zejun Ma:
LiteG2P: A Fast, Light and High Accuracy Model for Grapheme-to-Phoneme Conversion. ICASSP 2023: 1-5
[c57]
persistent URL: https://dblp.org/rec/conf/iccv/LiW0MK23
Zhi Li, Pengfei Wei, Xiang Yin, Zejun Ma, Alex C. Kot:
Virtual Try-On with Pose-Garment Keypoints Guided Inpainting. ICCV 2023: 22731-22740
[c56]
persistent URL: https://dblp.org/rec/conf/icitee/MaJGZZWB23
Zejun Ma, Hong Jiang, Huangxu Ge, Huajie Zhang, Mengshi Zhao, Ting Wang, Hongwu Bai:
Dynamics Analysis of Large-Scale Transmission Tower-Line Coupled System under Measured Typhoon Load. ICITEE 2023: 90-96
[c55]
persistent URL: https://dblp.org/rec/conf/ijcai/QuYWLM23
Xinghua Qu, Xiang Yin, Pengfei Wei, Lu Lu, Zejun Ma:
AudioQR: Deep Neural Audio Watermarks For QR Code. IJCAI 2023: 6192-6200
[c54]
persistent URL: https://dblp.org/rec/conf/interspeech/Song0LWW00M23
Kun Song, Yi Ren, Yi Lei, Chunfeng Wang, Kun Wei, Lei Xie, Xiang Yin, Zejun Ma:
StyleS2ST: Zero-shot Style Transfer for Direct Speech-to-speech Translation. INTERSPEECH 2023: 42-46
[c53]
persistent URL: https://dblp.org/rec/conf/interspeech/HuangL00M23
Lu Huang, Boyu Li, Jun Zhang, Lu Lu, Zejun Ma:
Text-only Domain Adaptation using Unified Speech-Text Representation in Transducer. INTERSPEECH 2023: 386-390
[c52]
persistent URL: https://dblp.org/rec/conf/interspeech/LinHXPKCHLM23
Yist Y. Lin, Tao Han, Haihua Xu, Van Tung Pham, Yerbolat Khassanov, Tze Yuang Chong, Yi He, Lu Lu, Zejun Ma:
Random Utterance Concatenation Based Data Augmentation for Improving Short-video Speech Recognition. INTERSPEECH 2023: 904-908
[c51]
persistent URL: https://dblp.org/rec/conf/interspeech/FuGSTLM23
Kaiqi Fu, Shaojun Gao, Shuju Shi, Xiaohai Tian, Wei Li, Zejun Ma:
Phonetic and Prosody-aware Self-supervised Learning Approach for Non-native Fluency Scoring. INTERSPEECH 2023: 949-953
[c50]
persistent URL: https://dblp.org/rec/conf/interspeech/ShiFGTGLM23
Shuju Shi, Kaiqi Fu, Yiwei Gu, Xiaohai Tian, Shaojun Gao, Wei Li, Zejun Ma:
Disentangling the Contribution of Non-native Speech in Automated Pronunciation Assessment. INTERSPEECH 2023: 954-958
[c49]
persistent URL: https://dblp.org/rec/conf/interspeech/ChenXKHLM023
Zhipeng Chen, Haihua Xu, Yerbolat Khassanov, Yi He, Lu Lu, Zejun Ma, Ji Wu:
Knowledge Distillation Approach for Efficient Internal Language Model Estimation. INTERSPEECH 2023: 1339-1343
[c48]
persistent URL: https://dblp.org/rec/conf/interspeech/Wei0WLQXM23
Pengfei Wei, Xiang Yin, Chunfeng Wang, Zhonghao Li, Xinghua Qu, Zhiqiang Xu, Zejun Ma:
S2CD: Self-heuristic Speaker Content Disentanglement for Any-to-Any Voice Conversion. INTERSPEECH 2023: 2288-2292
[c47]
persistent URL: https://dblp.org/rec/conf/interspeech/ChenLWHM23
Xianzhao Chen, Yist Y. Lin, Kang Wang, Yi He, Zejun Ma:
Improving Frame-level Classifier for Word Timings with Non-peaky CTC in End-to-End Automatic Speech Recognition. INTERSPEECH 2023: 2908-2912
[c46]
persistent URL: https://dblp.org/rec/conf/interspeech/FanD0L00M23
Zhiyun Fan, Linhao Dong, Chen Shen, Zhenlin Liang, Jun Zhang, Lu Lu, Zejun Ma:
Language-specific Boundary Learning for Improving Mandarin-English Code-switching Speech Recognition. INTERSPEECH 2023: 3322-3326
[c45]
persistent URL: https://dblp.org/rec/conf/interspeech/CongZL0W00M23
Yahuan Cong, Haoyu Zhang, Haopeng Lin, Shichao Liu, Chunfeng Wang, Yi Ren, Xiang Yin, Zejun Ma:
GenerTTS: Pronunciation Disentanglement for Timbre and Style Generalization in Cross-Lingual Text-to-Speech. INTERSPEECH 2023: 5486-5490
[c44]
persistent URL: https://dblp.org/rec/conf/mm/0003ZLYMJ23
Yuchen Liu, Haoyu Zhang, Shichao Liu, Xiang Yin, Zejun Ma, Qin Jin:
Emotionally Situated Text-to-Speech Synthesis in User-Agent Conversation. ACM Multimedia 2023: 5966-5974
[c43]
persistent URL: https://dblp.org/rec/conf/sigir/QuLS0OLM23
Xinghua Qu, Hongyang Liu, Zhu Sun, Xiang Yin, Yew Soon Ong, Lu Lu, Zejun Ma:
Towards Building Voice-based Conversational Recommender Systems: Datasets, Potential Solutions and Prospects. SIGIR 2023: 2701-2711
2022
[c42]
persistent URL: https://dblp.org/rec/conf/aaai/0021DZMBD22
Ke Chen, Xingjian Du, Bilei Zhu, Zejun Ma, Taylor Berg-Kirkpatrick, Shlomo Dubnov:
Zero-Shot Audio Source Separation through Query-Based Learning from Weakly-Labeled Data. AAAI 2022: 4441-4449
[c41]
persistent URL: https://dblp.org/rec/conf/cikm/WeiQSM22
Pengfei Wei, Xinghua Qu, Wen Song, Zejun Ma:
Dynamic Transfer Gaussian Process Regression. CIKM 2022: 2118-2127
[c40]
persistent URL: https://dblp.org/rec/conf/icassp/ZhaoZZMZ22
Hang Zhao, Chen Zhang, Bilei Zhu, Zejun Ma, Kejun Zhang:
S3T: Self-Supervised Pre-Training with Swin Transformer For Music Classification. ICASSP 2022: 606-610
[c39]
persistent URL: https://dblp.org/rec/conf/icassp/DuCWZM22
Xingjian Du, Ke Chen, Zijie Wang, Bilei Zhu, Zejun Ma:
Bytecover2: Towards Dimensionality Reduction of Latent Embedding for Efficient Cover Song Identification. ICASSP 2022: 616-620
[c38]
persistent URL: https://dblp.org/rec/conf/icassp/ChenDZMBD22
Ke Chen, Xingjian Du, Bilei Zhu, Zejun Ma, Taylor Berg-Kirkpatrick, Shlomo Dubnov:
HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection. ICASSP 2022: 646-650
[c37]
persistent URL: https://dblp.org/rec/conf/icassp/XuTWBGYM22
Jingning Xu, Benlai Tang, Mingjie Wang, Siyuan Bian, Wenyi Guo, Xiang Yin, Zejun Ma:
Towards Using Clothes Style Transfer for Scenario-Aware Person Video Generation. ICASSP 2022: 1745-1749
[c36]
persistent URL: https://dblp.org/rec/conf/icassp/LuHQWM22
Yizhou Lu, Mingkun Huang, Xinghua Qu, Pengfei Wei, Zejun Ma:
Language Adaptive Cross-Lingual Speech Representation Learning with Sparse Sharing Sub-Networks. ICASSP 2022: 6882-6886
[c35]
persistent URL: https://dblp.org/rec/conf/icassp/LingSCM22
Shaoshi Ling, Chen Shen, Meng Cai, Zejun Ma:
Improving Pseudo-Label Training For End-To-End Speech Recognition Using Gradient Mask. ICASSP 2022: 8397-8401
[c34]
persistent URL: https://dblp.org/rec/conf/icassp/HanDLCZMX22
Minglun Han, Linhao Dong, Zhenlin Liang, Meng Cai, Shiyu Zhou, Zejun Ma, Bo Xu:
Improving End-to-End Contextual Speech Recognition with Fine-Grained Contextual Knowledge Selection. ICASSP 2022: 8532-8536
[c33]
persistent URL: https://dblp.org/rec/conf/icassp/ShenLFWWTZYM22
Chen Shen, Yi Liu, Wenzhi Fan, Bin Wang, Shixue Wen, Yao Tian, Jun Zhang, Jingsheng Yang, Zejun Ma:
The Volcspeech System for the ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Challenge. ICASSP 2022: 9176-9180
[c32]
persistent URL: https://dblp.org/rec/conf/ijcai/QinMDLZTMLL22
Haotong Qin, Xudong Ma, Yifu Ding, Xiaoyang Li, Yang Zhang, Yao Tian, Zejun Ma, Jie Luo, Xianglong Liu:
BiFSMN: Binary Neural Network for Keyword Spotting. IJCAI 2022: 4346-4352
[c31]
persistent URL: https://dblp.org/rec/conf/interspeech/LiuMXHMZ22
Yufei Liu, Rao Ma, Haihua Xu, Yi He, Zejun Ma, Weibin Zhang:
Internal Language Model Estimation Through Explicit Context Vector Learning for Attention-based Encoder-decoder ASR. INTERSPEECH 2022: 1666-1670
[c30]
persistent URL: https://dblp.org/rec/conf/interspeech/HouCLTZM22
Junfeng Hou, Jinkun Chen, Wanyu Li, Yufeng Tang, Jun Zhang, Zejun Ma:
Bring dialogue-context into RNN-T for streaming ASR. INTERSPEECH 2022: 2048-2052
[c29]
persistent URL: https://dblp.org/rec/conf/interspeech/FanLDLZCZMX22
Zhiyun Fan, Zhenlin Liang, Linhao Dong, Yi Liu, Shiyu Zhou, Meng Cai, Jun Zhang, Zejun Ma, Bo Xu:
Token-level Speaker Change Detection Using Speaker Difference and Speech Content via Continuous Integrate-and-fire. INTERSPEECH 2022: 3749-3753
[c28]
persistent URL: https://dblp.org/rec/conf/interspeech/WangLT0WYM22
Chao Wang, Zhonghao Li, Benlai Tang, Xiang Yin, Yuan Wan, Yibiao Yu, Zejun Ma:
Towards high-fidelity singing voice conversion with acoustic reference and contrastive predictive coding. INTERSPEECH 2022: 4287-4291
[c27]
persistent URL: https://dblp.org/rec/conf/interspeech/FuGT0M22
Kaiqi Fu, Shaojun Gao, Xiaohai Tian, Wei Li, Zejun Ma:
Using Fluency Representation Learned from Sequential Raw Features for Improving Non-native Fluency Scoring. INTERSPEECH 2022: 4337-4341
[c26]
persistent URL: https://dblp.org/rec/conf/interspeech/TianFGGWLM22
Xiaohai Tian, Kaiqi Fu, Shaojun Gao, Yiwei Gu, Kai Wang, Wei Li, Zejun Ma:
A Transfer and Multi-Task Learning based Approach for MOS Prediction. INTERSPEECH 2022: 5438-5442
[c25]
persistent URL: https://dblp.org/rec/conf/ismir/DuLWL0ZM22
Xingjian Du, Huidong Liang, Yuan Wan, Yuheng Lin, Ke Chen, Bilei Zhu, Zejun Ma:
Latent feature augmentation for chorus detection. ISMIR 2022: 240-247
[c24]
persistent URL: https://dblp.org/rec/conf/kdd/QuO0WSM22
Xinghua Qu, Yew Soon Ong, Abhishek Gupta, Pengfei Wei, Zhu Sun, Zejun Ma:
Importance Prioritized Policy Distillation. KDD 2022: 1420-1429
[c23]
persistent URL: https://dblp.org/rec/conf/kdd/QuWGSOM22
Xinghua Qu, Pengfei Wei, Mingyong Gao, Zhu Sun, Yew Soon Ong, Zejun Ma:
Synthesising Audio Adversarial Examples for Automatic Speech Recognition. KDD 2022: 1430-1440
[c22]
persistent URL: https://dblp.org/rec/conf/mir/SunLHZM22
Xiaoheng Sun, Xia Liang, Qiqi He, Bilei Zhu, Zejun Ma:
GIO: A Timbre-informed Approach for Pitch Tracking in Highly Noisy Environments. ICMR 2022: 480-488
[c21]
persistent URL: https://dblp.org/rec/conf/naacl/LinAWM22
Yu Lin, Zhecheng An, Peihao Wu, Zejun Ma:
Improving Contextual Representation with Gloss Regularized Pre-training. NAACL-HLT (Findings) 2022: 907-920
2021
[c20]
persistent URL: https://dblp.org/rec/conf/icassp/GaoDZSLM21
Yongwei Gao, Xingjian Du, Bilei Zhu, Xiaoheng Sun, Wei Li, Zejun Ma:
An Hrnet-Blstm Model With Two-Stage Training For Singing Melody Extraction. ICASSP 2021: 56-60
[c19]
persistent URL: https://dblp.org/rec/conf/icassp/DuZKM21
Xingjian Du, Bilei Zhu, Qiuqiang Kong, Zejun Ma:
Singing Melody Extraction from Polyphonic Music based on Spectral Correlation Modeling. ICASSP 2021: 241-245
[c18]
persistent URL: https://dblp.org/rec/conf/icassp/DuYZCM21
Xingjian Du, Zhesong Yu, Bilei Zhu, Xiaoou Chen, Zejun Ma:
Bytecover: Cover Song Identification Via Multi-Loss Training. ICASSP 2021: 551-555
[c17]
persistent URL: https://dblp.org/rec/conf/icassp/HouDZMB21
Yuanbo Hou, Yi Deng, Bilei Zhu, Zejun Ma, Dick Botteldooren:
Rule-Embedded Network for Audio-Visual Voice Activity Detection in Live Musical Video Streams. ICASSP 2021: 4165-4169
[c16]
persistent URL: https://dblp.org/rec/conf/icassp/TianYCLM21
Yao Tian, Haitao Yao, Meng Cai, Yaming Liu, Zejun Ma:
Improving RNN Transducer Modeling for Small-Footprint Keyword Spotting. ICASSP 2021: 5624-5628
[c15]
persistent URL: https://dblp.org/rec/conf/icassp/PanWYWXM21
Junjie Pan, Lin Wu, Xiang Yin, Pengfei Wu, Chenchang Xu, Zejun Ma:
A Chapter-Wise Understanding System for Text-To-Speech in Chinese Novels. ICASSP 2021: 6069-6073
[c14]
persistent URL: https://dblp.org/rec/conf/icassp/LiTYWXSM21
Zhonghao Li, Benlai Tang, Xiang Yin, Yuan Wan, Ling Xu, Chen Shen, Zejun Ma:
PPG-Based Singing Voice Conversion with Adversarial Representation Learning. ICASSP 2021: 7073-7077
[c13]
persistent URL: https://dblp.org/rec/conf/interspeech/HouYLDZMB21
Yuanbo Hou, Zhesong Yu, Xia Liang, Xingjian Du, Bilei Zhu, Zejun Ma, Dick Botteldooren:
Attention-Based Cross-Modal Fusion for Audio-Visual Voice Activity Detection in Musical Video Streams. Interspeech 2021: 321-325
[c12]
persistent URL: https://dblp.org/rec/conf/interspeech/HuangSTHCZM21
Lu Huang, Jingyu Sun, Yufeng Tang, Junfeng Hou, Jinkun Chen, Jun Zhang, Zejun Ma:
HMM-Free Encoder Pre-Training for Streaming RNN Transducer. Interspeech 2021: 1797-1801
[c11]
persistent URL: https://dblp.org/rec/conf/interspeech/ChenNHWMX21
Xianzhao Chen, Hao Ni, Yi He, Kang Wang, Zejun Ma, Zongxia Xie:
Emitting Word Timings with HMM-Free End-to-End System in Automatic Speech Recognition. Interspeech 2021: 2571-2575
[c10]
persistent URL: https://dblp.org/rec/conf/interspeech/ZouLYLWZM21
Yuxiang Zou, Shichao Liu, Xiang Yin, Haopeng Lin, Chunfeng Wang, Haoyu Zhang, Zejun Ma:
Fine-Grained Prosody Modeling in Neural Speech Synthesis Using ToBI Representation. Interspeech 2021: 3146-3150
[c9]
persistent URL: https://dblp.org/rec/conf/iscslp/GuYRWTZCWM21
Yu Gu, Xiang Yin, Yonghui Rao, Yuan Wan, Benlai Tang, Yang Zhang, Jitong Chen, Yuxuan Wang, Zejun Ma:
ByteSing: A Chinese Singing Voice Synthesis System Using Duration Allocated Encoder-Decoder Acoustic Models and WaveRNN Vocoders. ISCSLP 2021: 1-5
[c8]
persistent URL: https://dblp.org/rec/conf/mm/XieLBTYYWYZM21
Tianyi Xie, Liucheng Liao, Cheng Bi, Benlai Tang, Xiang Yin, Jianfei Yang, Mingjie Wang, Jiali Yao, Yang Zhang, Zejun Ma:
Towards Realistic Visual Dubbing with Heterogeneous Sources. ACM Multimedia 2021: 1739-1747
2020
[c7]
persistent URL: https://dblp.org/rec/conf/icassp/PanYZLZMW20
Junjie Pan, Xiang Yin, Zhiling Zhang, Shichao Liu, Yang Zhang, Zejun Ma, Yuxuan Wang:
A Unified Sequence-to-Sequence Front-End Model for Mandarin Text-to-Speech Synthesis. ICASSP 2020: 6689-6693
[c6]
persistent URL: https://dblp.org/rec/conf/icassp/ZhangPY0LZWM20
Junhui Zhang, Junjie Pan, Xiang Yin, Chen Li, Shichao Liu, Yang Zhang, Yuxuan Wang, Zejun Ma:
A Hybrid Text Normalization System Using Multi-Head Self-Attention For Mandarin. ICASSP 2020: 6694-6698
2019
[c5]
persistent URL: https://dblp.org/rec/conf/asru/AnWYMX19
Xiaochun An, Yuxuan Wang, Shan Yang, Zejun Ma, Lei Xie:
Learning Hierarchical Representations for Expressive Speaking Style in End-to-End Speech Synthesis. ASRU 2019: 184-191
2012
[c4]
persistent URL: https://dblp.org/rec/conf/icassp/MaWX12
Zejun Ma, Xiaorui Wang, Bo Xu:
Unsupervised training of subspace gaussian mixture models for conversational telephone speech recognition. ICASSP 2012: 4829-4832
2011
[c3]
persistent URL: https://dblp.org/rec/conf/interspeech/MaWX11
Zejun Ma, Xiaorui Wang, Bo Xu:
An Empirical Study of Multilingual Spoken Term Detection. INTERSPEECH 2011: 1921-1924
[c2]
persistent URL: https://dblp.org/rec/conf/interspeech/MaWX11a
Zejun Ma, Xiaorui Wang, Bo Xu:
Fusing Multiple Confidence Measures for Chinese Spoken Term Detection. INTERSPEECH 2011: 1925-1928
2010
[c1]
persistent URL: https://dblp.org/rec/conf/wcsp/MaSZY10
Zejun Ma, Li Song, Cheng Zhi, Libo Yang:
Distributed link-aware rate allocation for R-D optimal multiple video streaming over wireless networks. WCSP 2010: 1-6

Informal and Other Publications

2024
[i67]
persistent URL: https://dblp.org/rec/journals/corr/abs-2401-08503
Zhenhui Ye, Tianyun Zhong, Yi Ren, Jiaqi Yang, Weichuang Li, Jiawei Huang, Ziyue Jiang, Jinzheng He, Rongjie Huang, Jinglin Liu, Chen Zhang, Xiang Yin, Zejun Ma, Zhou Zhao:
Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis. CoRR abs/2401.08503 (2024)
[i66]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2402-07485
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2402-07485
Hang Zhao, Yifei Xin, Zhesong Yu, Bilei Zhu, Lu Lu, Zejun Ma:
MINT: Boosting Audio-Language Model via Multi-Target Pre-Training and Instruction Tuning. CoRR abs/2402.07485 (2024)
[i65]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2403-02010
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2403-02010
Zhiyun Fan, Linhao Dong, Jun Zhang, Lu Lu, Zejun Ma:
SA-SOT: Speaker-Aware Serialized Output Training for Multi-Talker ASR. CoRR abs/2403.02010 (2024)
[i64]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2406-07914
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2406-07914
Changli Tang, Wenyi Yu, Guangzhi Sun, Xianzhao Chen, Tian Tan, Wei Li, Jun Zhang, Lu Lu, Zejun Ma, Yuxuan Wang, Chao Zhang:
Can Large Language Models Understand Spatial Audio? CoRR abs/2406.07914 (2024)
[i63]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2406-15704
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2406-15704
Guangzhi Sun, Wenyi Yu, Changli Tang, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Yuxuan Wang, Chao Zhang:
video-SALMONN: Speech-Enhanced Audio-Visual Large Language Models. CoRR abs/2406.15704 (2024)
[i62]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2407-07895
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2407-07895
Feng Li, Renrui Zhang, Hao Zhang, Yuanhan Zhang, Bo Li, Wei Li, Zejun Ma, Chunyuan Li:
LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models. CoRR abs/2407.07895 (2024)
2023
[i61]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2302-10444
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2302-10444
Wei Liu, Kaiqi Fu, Xiaohai Tian, Shuju Shi, Wei Li, Zejun Ma, Tan Lee:
Leveraging phone-level linguistic-acoustic similarity for utterance-level pronunciation scoring. CoRR abs/2302.10444 (2023)
[i60]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2303-01086
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2303-01086
Chunfeng Wang, Peisong Huang, Yuxiang Zou, Haoyu Zhang, Shichao Liu, Xiang Yin, Zejun Ma:
LiteG2P: A fast, light and high accuracy model for grapheme-to-phoneme conversion. CoRR abs/2303.01086 (2023)
[i59]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2303-11692
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2303-11692
Xingjian Du, Zijie Wang, Xia Liang, Huidong Liang, Bilei Zhu, Zejun Ma:
ByteCover3: Accurate Cover Song Identification on Short Queries. CoRR abs/2303.11692 (2023)
[i58]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2304-13343
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2304-13343
Xinnian Liang, Bing Wang, Hui Huang, Shuangzhi Wu, Peihao Wu, Lu Lu, Zejun Ma, Zhoujun Li:
Unleashing Infinite-Length Input Capacity for Large-scale Language Models with Self-Controlled Memory System. CoRR abs/2304.13343 (2023)
[i57]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2305-00787
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2305-00787
Zhenhui Ye, Jinzheng He, Ziyue Jiang, Rongjie Huang, Jiawei Huang, Jinglin Liu, Yi Ren, Xiang Yin, Zejun Ma, Zhou Zhao:
GeneFace++: Generalized and Stable Real-Time Audio-Driven 3D Talking Face Generation. CoRR abs/2305.00787 (2023)
[i56]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2305-11438
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2305-11438
Kaiqi Fu, Shaojun Gao, Shuju Shi, Xiaohai Tian, Wei Li, Zejun Ma:
Phonetic and Prosody-aware Self-supervised Learning Approach for Non-native Fluency Scoring. CoRR abs/2305.11438 (2023)
[i55]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2305-17499
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2305-17499
Linhao Dong, Zhecheng An, Peihao Wu, Jun Zhang, Lu Lu, Zejun Ma:
CIF-PT: Bridging Speech and Text Representations for Spoken Language Understanding via Continuous Integrate-and-Fire Pre-Training. CoRR abs/2305.17499 (2023)
[i54]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2305-17732
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2305-17732
Kun Song, Yi Ren, Yi Lei, Chunfeng Wang, Kun Wei, Lei Xie, Xiang Yin, Zejun Ma:
StyleS2ST: Zero-shot Style Transfer for Direct Speech-to-speech Translation. CoRR abs/2305.17732 (2023)
[i53]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2305-18474
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2305-18474
Jiawei Huang, Yi Ren, Rongjie Huang, Dongchao Yang, Zhenhui Ye, Chen Zhang, Jinglin Liu, Xiang Yin, Zejun Ma, Zhou Zhao:
Make-An-Audio 2: Temporal-Enhanced Text-to-Audio Generation. CoRR abs/2305.18474 (2023)
[i52]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2306-02982
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2306-02982
Qianqian Dong, Zhiying Huang, Qiao Tian, Chen Xu, Tom Ko, Yunlong Zhao, Siyuan Feng, Tang Li, Kexin Wang, Xuxin Cheng, Fengpeng Yue, Ye Bai, Xi Chen, Lu Lu, Zejun Ma, Yuping Wang, Mingxuan Wang, Yuxuan Wang:
PolyVoice: Language Models for Speech to Speech Translation. CoRR abs/2306.02982 (2023)
[i51]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2306-03504
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2306-03504
Zhenhui Ye, Ziyue Jiang, Yi Ren, Jinglin Liu, Chen Zhang, Xiang Yin, Zejun Ma, Zhou Zhao:
Ada-TTA: Towards Adaptive High-Quality Text-to-Talking Avatar Synthesis. CoRR abs/2306.03504 (2023)
[i50]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2306-03509
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2306-03509
Ziyue Jiang, Yi Ren, Zhenhui Ye, Jinglin Liu, Chen Zhang, Qian Yang, Shengpeng Ji, Rongjie Huang, Chunfeng Wang, Xiang Yin, Zejun Ma, Zhou Zhao:
Mega-TTS: Zero-Shot Text-to-Speech at Scale with Intrinsic Inductive Bias. CoRR abs/2306.03509 (2023)
[i49]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2306-04076
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2306-04076
Lu Huang, Boyu Li, Jun Zhang, Lu Lu, Zejun Ma:
Text-only Domain Adaptation using Unified Speech-Text Representation in Transducer. CoRR abs/2306.04076 (2023)
[i48]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2306-05279
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2306-05279
Zhiyun Fan, Linhao Dong, Chen Shen, Zhenlin Liang, Jun Zhang, Lu Lu, Zejun Ma:
Language-specific Acoustic Boundary Learning for Mandarin-English Code-switching Speech Recognition. CoRR abs/2306.05279 (2023)
[i47]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2306-07949
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2306-07949
Xianzhao Chen, Yist Y. Lin, Kang Wang, Yi He, Zejun Ma:
Improving Frame-level Classifier for Word Timings with Non-peaky CTC in End-to-End Automatic Speech Recognition. CoRR abs/2306.07949 (2023)
[i46]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2306-08219
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2306-08219
Xinghua Qu, Hongyang Liu, Zhu Sun, Xiang Yin, Yew Soon Ong, Lu Lu, Zejun Ma:
Towards Building Voice-based Conversational Recommender Systems: Datasets, Potential Solutions, and Prospects. CoRR abs/2306.08219 (2023)
[i45]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2306-15304
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2306-15304
Yahuan Cong, Haoyu Zhang, Haopeng Lin, Shichao Liu, Chunfeng Wang, Yi Ren, Xiang Yin, Zejun Ma:
GenerTTS: Pronunciation Disentanglement for Timbre and Style Generalization in Cross-Lingual Text-to-Speech. CoRR abs/2306.15304 (2023)
[i44]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2307-07218
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2307-07218
Ziyue Jiang, Jinglin Liu, Yi Ren, Jinzheng He, Chen Zhang, Zhenhui Ye, Pengfei Wei, Chunfeng Wang, Xiang Yin, Zejun Ma, Zhou Zhao:
Mega-TTS 2: Zero-Shot Text-to-Speech with Arbitrary Length Speech Prompts. CoRR abs/2307.07218 (2023)
[i43]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2309-13963
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2309-13963
Wenyi Yu, Changli Tang, Guangzhi Sun, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Chao Zhang:
Connecting Speech Encoder and Large Language Model for ASR. CoRR abs/2309.13963 (2023)
[i42]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2310-05863
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2310-05863
Guangzhi Sun, Wenyi Yu, Changli Tang, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Chao Zhang:
Fine-grained Audio-Visual Joint Representations for Multimodal Large Language Models. CoRR abs/2310.05863 (2023)
[i41]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2310-13289
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2310-13289
Changli Tang, Wenyi Yu, Guangzhi Sun, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Chao Zhang:
SALMONN: Towards Generic Hearing Abilities for Large Language Models. CoRR abs/2310.13289 (2023)
[i40]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2311-08966
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2311-08966
Jin Qiu, Lu Huang, Boyu Li, Jun Zhang, Lu Lu, Zejun Ma:
Improving Large-scale Deep Biasing with Phoneme Features and Text-only Data in Streaming Transducer. CoRR abs/2311.08966 (2023)
2022
[i39]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-2201-06260
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2201-06260
Tianyi Xie, Liucheng Liao, Cheng Bi, Benlai Tang, Xiang Yin, Jianfei Yang, Mingjie Wang, Jiali Yao, Yang Zhang, Zejun Ma:
Towards Realistic Visual Dubbing with Heterogeneous Sources. CoRR abs/2201.06260 (2022)
[i38]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-2201-11627
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2201-11627
Yufei Liu, Rao Ma, Haihua Xu, Yi He, Zejun Ma, Weibin Zhang:
Internal language model estimation through explicit context vector learning for attention-based encoder-decoder ASR. CoRR abs/2201.11627 (2022)
[i37]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-2201-12806
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2201-12806
Minglun Han, Linhao Dong, Zhenlin Liang, Meng Cai, Shiyu Zhou, Zejun Ma, Bo Xu:
Improving End-to-End Contextual Speech Recognition with Fine-grained Contextual Knowledge Selection. CoRR abs/2201.12806 (2022)
[i36]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-2202-00874
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2202-00874
Ke Chen, Xingjian Du, Bilei Zhu, Zejun Ma, Taylor Berg-Kirkpatrick, Shlomo Dubnov:
HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection. CoRR abs/2202.00874 (2022)
[i35]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-2202-04261
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2202-04261
Chen Shen, Yi Liu, Wenzhi Fan, Bin Wang, Shixue Wen, Yao Tian, Jun Zhang, Jingsheng Yang, Zejun Ma:
The Volcspeech system for the ICASSP 2022 multi-channel multi-party meeting transcription challenge. CoRR abs/2202.04261 (2022)
[i34]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-2202-06483
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2202-06483
Haotong Qin, Xudong Ma, Yifu Ding, Xiaoyang Li, Yang Zhang, Yao Tian, Zejun Ma, Jie Luo, Xianglong Liu:
BiFSMN: Binary Neural Network for Keyword Spotting. CoRR abs/2202.06483 (2022)
[i33]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-2202-10139
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2202-10139
Hang Zhao, Chen Zhang, Belei Zhu, Zejun Ma, Kejun Zhang:
S3T: Self-Supervised Pre-training with Swin Transformer for Music Classification. CoRR abs/2202.10139 (2022)
[i32]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2203-01826
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2203-01826
Kaiqi Fu, Shaojun Gao, Kai Wang, Wei Li, Xiaohai Tian, Zejun Ma:
Improving Non-native Word-level Pronunciation Scoring with Phone-level Mixup Data Augmentation and Multi-source Information. CoRR abs/2203.01826 (2022)
[i31]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2203-04583
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2203-04583
Yizhou Lu, Mingkun Huang, Xinghua Qu, Pengfei Wei, Zejun Ma:
Language Adaptive Cross-lingual Speech Representation Learning with Sparse Sharing Sub-networks. CoRR abs/2203.04583 (2022)
[i30]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2205-06603
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2205-06603
Yu Lin, Zhecheng An, Peihao Wu, Zejun Ma:
Improving Contextual Representation with Gloss Regularized Pre-training. CoRR abs/2205.06603 (2022)
[i29]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2206-04922
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2206-04922
Wudi Bao, Junhui Zhang, Junjie Pan, Xiang Yin, Zejun Ma:
A Novel Chinese Dialect TTS Frontend with Non-Autoregressive Neural Machine Translation. CoRR abs/2206.04922 (2022)
[i28]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2206-13110
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2206-13110
Zhiyun Fan, Linhao Dong, Meng Cai, Zejun Ma, Bo Xu:
Sequence-level Speaker Change Detection with Difference-based Continuous Integrate-and-fire. CoRR abs/2206.13110 (2022)
[i27]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2208-07365
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2208-07365
Pengfei Wei, Lingdong Kong, Xinghua Qu, Xiang Yin, Zhiqiang Xu, Jing Jiang, Zejun Ma:
Unsupervised Video Domain Adaptation: A Disentanglement Perspective. CoRR abs/2208.07365 (2022)
[i26]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2210-15876
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2210-15876
Haihua Xu, Van Tung Pham, Yerbolat Khassanov, Yist Y. Lin, Tao Han, Tze Yuan Chong, Yi He, Zejun Ma:
Improving short-video speech recognition using random utterance concatenation. CoRR abs/2210.15876 (2022)
[i25]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2211-00968
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2211-00968
Rao Ma, Xiaobo Wu, Jin Qiu, Yanan Qin, Haihua Xu, Peihao Wu, Zejun Ma:
Internal Language Model Estimation based Adaptive Language Model Fusion for Domain Adaptation. CoRR abs/2211.00968 (2022)
[i24]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2211-03710
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2211-03710
Huidong Liang, Xingjian Du, Bilei Zhu, Zejun Ma, Ke Chen, Junbin Gao:
Graph Contrastive Learning with Implicit Augmentations. CoRR abs/2211.03710 (2022)
[i23]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2211-06987
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2211-06987
Haotong Qin, Xudong Ma, Yifu Ding, Xiaoyang Li, Yang Zhang, Zejun Ma, Jiakai Wang, Jie Luo, Xianglong Liu:
BiFSMNv2: Pushing Binary Neural Networks for Keyword Spotting to Real-Network Performance. CoRR abs/2211.06987 (2022)
[i22]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2211-09381
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2211-09381
Zhiyun Fan, Zhenlin Liang, Linhao Dong, Yi Liu, Shiyu Zhou, Meng Cai, Jun Zhang, Zejun Ma, Bo Xu:
Token-level Speaker Change Detection Using Speaker Difference and Speech Content via Continuous Integrate-and-fire. CoRR abs/2211.09381 (2022)
[i21]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2212-05805
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2212-05805
Junhui Zhang, Junjie Pan, Xiang Yin, Zejun Ma:
Direct Speech-to-speech Translation without Textual Annotation using Bottleneck Features. CoRR abs/2212.05805 (2022)
2021
[i20]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-2104-10764
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2104-10764
Lu Huang, Jingyu Sun, Yufeng Tang, Junfeng Hou, Jinkun Chen, Jun Zhang, Zejun Ma:
HMM-Free Encoder Pre-Training for Streaming RNN Transducer. CoRR abs/2104.10764 (2021)
[i19]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-2106-11411
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2106-11411
Yuanbo Hou, Zhesong Yu, Xia Liang, Xingjian Du, Bilei Zhu, Zejun Ma, Dick Botteldooren:
Attention-based cross-modal fusion for audio-visual voice activity detection in musical video streams. CoRR abs/2106.11411 (2021)
[i18]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-2110-04056
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2110-04056
Shaoshi Ling, Chen Shen, Meng Cai, Zejun Ma:
Improving Pseudo-label Training For End-to-end Speech Recognition Using Gradient Mask. CoRR abs/2110.04056 (2021)
[i17]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-2110-04153
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2110-04153
Pengfei Wu, Junjie Pan, Chenchang Xu, Junhui Zhang, Lin Wu, Xiang Yin, Zejun Ma:
Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised Training in Text-To-Speech. CoRR abs/2110.04153 (2021)
[i16]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-2110-04754
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2110-04754
Chao Wang, Zhonghao Li, Benlai Tang, Xiang Yin, Yuan Wan, Yibiao Yu, Zejun Ma:
Towards High-fidelity Singing Voice Conversion with Acoustic Reference and Contrastive Predictive Coding. CoRR abs/2110.04754 (2021)
[i15]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-2110-11894
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2110-11894
Jingning Xu, Benlai Tang, Mingjie Wang, Siyuan Bian, Wenyi Guo, Xiang Yin, Zejun Ma:
Towards Using Clothes Style Transfer for Scenario-aware Person Video Generation. CoRR abs/2110.11894 (2021)
[i14]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-2112-07891
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2112-07891
Ke Chen, Xingjian Du, Bilei Zhu, Zejun Ma, Taylor Berg-Kirkpatrick, Shlomo Dubnov:
Zero-shot Audio Source Separation through Query-based Learning from Weakly-labeled Data. CoRR abs/2112.07891 (2021)
2020
[i13]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-2004-11012
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2004-11012
Yu Gu, Xiang Yin, Yonghui Rao, Yuan Wan, Benlai Tang, Yang Zhang, Jitong Chen, Yuxuan Wang, Zejun Ma:
ByteSing: A Chinese Singing Voice Synthesis System Using Duration Allocated Encoder-Decoder Acoustic Models and WaveRNN Vocoders. CoRR abs/2004.11012 (2020)
[i12]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-2005-09271
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2005-09271
Wenjie Li, Benlai Tang, Xiang Yin, Yushi Zhao, Wei Li, Kang Wang, Hao Huang, Yuxuan Wang, Zejun Ma:
Improving Accent Conversion with Reference Encoder and End-To-End Text-To-Speech. CoRR abs/2005.09271 (2020)
[i11]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-2010-13540
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2010-13540
Zhesong Yu, Xingjian Du, Bilei Zhu, Zejun Ma:
Contrastive Unsupervised Learning for Audio Fingerprinting. CoRR abs/2010.13540 (2020)
[i10]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-2010-14022
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2010-14022
Xingjian Du, Zhesong Yu, Bilei Zhu, Xiaoou Chen, Zejun Ma:
ByteCover: Cover Song Identification via Multi-Loss Training. CoRR abs/2010.14022 (2020)
[i9]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-2010-14168
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2010-14168
Yuanbo Hou, Yi Deng, Bilei Zhu, Zejun Ma, Dick Botteldooren:
Rule-embedded network for audio-visual voice activity detection in live musical video streams. CoRR abs/2010.14168 (2020)
[i8]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-2010-14804
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2010-14804
Zhonghao Li, Benlai Tang, Xiang Yin, Yuan Wan, Ling Xu, Chen Shen, Zejun Ma:
PPG-based singing voice conversion with adversarial representation learning. CoRR abs/2010.14804 (2020)
[i7]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-2011-01570
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2011-01570
Mingkun Huang, Meng Cai, Jun Zhang, Yang Zhang, Yongbin You, Yi He, Zejun Ma:
Dynamic latency speech recognition with asynchronous revision. CoRR abs/2011.01570 (2020)
[i6]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-2011-01576
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2011-01576
Mingkun Huang, Jun Zhang, Meng Cai, Yang Zhang, Jiali Yao, Yongbin You, Yi He, Zejun Ma:
Improving RNN transducer with normalized jointer network. CoRR abs/2011.01576 (2020)
2019
[i5]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-1911-04111
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1911-04111
Junjie Pan, Xiang Yin, Zhiling Zhang, Shichao Liu, Yang Zhang, Zejun Ma, Yuxuan Wang:
A unified sequence-to-sequence front-end model for Mandarin text-to-speech synthesis. CoRR abs/1911.04111 (2019)
[i4]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-1911-04128
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1911-04128
Junhui Zhang, Junjie Pan, Xiang Yin, Chen Li, Shichao Liu, Yang Zhang, Yuxuan Wang, Zejun Ma:
A hybrid text normalization system using multi-head self-attention for mandarin. CoRR abs/1911.04128 (2019)
2017
[i3]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/TianZMHW17
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/TianZMHW17
Xu Tian, Jun Zhang, Zejun Ma, Yi He, Juan Wei:
Exponential Moving Average Model in Parallel Speech Recognition Training. CoRR abs/1703.01024 (2017)
[i2]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/TianZMHWWSLZ17
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/TianZMHWWSLZ17
Xu Tian, Jun Zhang, Zejun Ma, Yi He, Juan Wei, Peihao Wu, Wenchang Situ, Shuai Li, Yang Zhang:
Deep LSTM for Large Vocabulary Continuous Speech Recognition. CoRR abs/1703.07090 (2017)
[i1]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/TianZMHW17a
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/TianZMHW17a
Xu Tian, Jun Zhang, Zejun Ma, Yi He, Juan Wei:
Frame Stacking and Retaining for Recurrent Neural Network Acoustic Model. CoRR abs/1705.05992 (2017)

