


SLT 2024: Macao
IEEE Spoken Language Technology Workshop, SLT 2024, Macao, December 2-5, 2024. IEEE 2024, ISBN 979-8-3503-9225-8
- Chih-Kai Yang, Kuan-Po Huang, Hung-Yi Lee: Do Prompts Really Prompt? Exploring the Prompt Understanding Capability of Whisper. 1-8
- Xugang Lu, Peng Shen, Yu Tsao, Hisashi Kawai: Temporal Order Preserved Optimal Transport-Based Cross-Modal Knowledge Transfer Learning for ASR. 1-8
- Xiaoxue Gao, Nancy F. Chen: Speech-Mamba: Long-Context Speech Recognition with Selective State Spaces Models. 1-8
- Yoshiki Masuyama, Koichi Miyazaki, Masato Murata: Mamba-Based Decoder-Only Approach with Bidirectional Speech Modeling for Speech Recognition. 1-6
- Dominik Wagner, Ilja Baumann, Thomas Ranzenberger, Korbinian Riedhammer, Tobias Bocklet: Personalizing Large Sequence-to-Sequence Speech Foundation Models With Speaker Representations. 1-6
- Vladimir Bataev, Hainan Xu, Daniel Galvez, Vitaly Lavrukhin, Boris Ginsburg: Label-Looping: Highly Efficient Decoding For Transducers. 7-13
- Mohan Shi, Zengrui Jin, Yaoxun Xu, Yong Xu, Shi-Xiong Zhang, Kun Wei, Yiwen Shao, Chunlei Zhang, Dong Yu: Advancing Multi-Talker ASR Performance With Large Language Models. 14-21
- Gil Keren, Wei Zhou, Ozlem Kalinli: Token-Weighted RNN-T For Learning From Flawed Data. 22-29
- Hukai Huang, Jiayan Lin, Kaidi Wang, Yishuang Li, Wenhao Guan, Lin Li, Qingyang Hong: Enhancing Code-Switching Speech Recognition With LID-Based Collaborative Mixture of Experts Model. 30-36
- Edward Storey, Naomi Harte, Peter Bell: Language Bias in Self-Supervised Learning For Automatic Speech Recognition. 37-42
- Yihan Wu, Yifan Peng, Yichen Lu, Xuankai Chang, Ruihua Song, Shinji Watanabe: Robust Audiovisual Speech Recognition Models with Mixture-of-Experts. 43-48
- Shaoshi Ling, Guoli Ye, Rui Zhao, Yifan Gong: Hybrid Attention-Based Encoder-Decoder Model for Efficient Language Model Adaptation. 49-55
- Yiwen Shao, Yong Xu, Sanjeev Khudanpur, Dong Yu: Spatialemb: Extract and Encode Spatial Information for 1-Stage Multi-Channel Multi-Speaker ASR on Arbitrary Microphone Arrays. 56-63
- Yingyi Ma, Zhe Liu, Ozlem Kalinli: Effective Text Adaptation For LLM-Based ASR Through Soft Prompt Fine-Tuning. 64-69
- Yui Sudo, Yosuke Fukumoto, Muhammad Shakeel, Yifan Peng, Shinji Watanabe: Contextualized Automatic Speech Recognition With Dynamic Vocabulary. 78-85
- Yi-Cheng Wang, Li-Ting Pai, Bi-Cheng Yan, Hsin-Wei Wang, Chi-Han Lin, Berlin Chen: An Effective Context-Balanced Adaptation Approach for Long-Tailed Speech Recognition. 94-101
- Geeticka Chauhan, Steve Chien, Om Thakkar, Abhradeep Thakurta, Arun Narayanan: Training Large ASR Encoders With Differential Privacy. 102-109
- Cindy Tseng, Yun Tang, Vijendra Raj Apsingekar: Transducer Consistency Regularization For Speech to Text Applications. 110-117
- Liang-Hsuan Tseng, Zih-Ching Chen, Wei-Shun Chang, Cheng-Kuang Lee, Tsung-Ren Huang, Hung-Yi Lee: Leave No Knowledge Behind During Knowledge Distillation: Towards Practical and Effective Knowledge Distillation For Code-Switching ASR Using Realistic Data. 118-125
- Guanrou Yang, Ziyang Ma, Zhifu Gao, Shiliang Zhang, Xie Chen: CTC-Assisted LLM-Based Contextual ASR. 126-131
- Dongcheng Jiang, Chao Zhang, Philip C. Woodland: Automatic Time Alignment Generation For End-to-End ASR Using Acoustic Probability Modelling. 132-139
- Kwok Chin Yuen, Jia Qi Yip, Eng Siong Chng: Continual Learning With Embedding Layer Surgery and Task-Wise Beam Search Using Whisper. 140-146
- Zhehuai Chen, He Huang, Oleksii Hrinchuk, Krishna C. Puvvada, Nithin Rao Koluguri, Piotr Zelasko, Jagadeesh Balam, Boris Ginsburg: Bestow: Efficient and Streamable Speech Language Model with The Best of Two Worlds in GPT and T5. 147-154
- Peter Vieting, Simon Berger, Thilo von Neumann, Christoph Boeddeker, Ralf Schlüter, Reinhold Haeb-Umbach: Combining TF-GridNet And Mixture Encoder For Continuous Speech Separation For Meeting Transcription. 155-162
- Ryan Whetten, Titouan Parcollet, Adel Moumen, Marco Dinarelli, Yannick Estève: An Analysis of Linear Complexity Attention Substitutes With Best-RQ. 169-176
- Narla John Metilda Sagaya Mary, Srinivasan Umesh: Lite ASR Transformer: A Light Weight Transformer Architecture For Automatic Speech Recognition. 185-192
- Hao Shi, Yuan Gao, Zhaoheng Ni, Tatsuya Kawahara: Serialized Speech Information Guidance with Overlapped Encoding Separation for Multi-Speaker Automatic Speech Recognition. 193-199
- Jakob Poncelet, Yujun Wang, Hugo Van hamme: Efficient Extraction of Noise-Robust Discrete Units from Self-Supervised Speech Models. 200-207
- Vyas Raina, Mark J. F. Gales: Controlling Whisper: Universal Acoustic Adversarial Attacks to Control Multi-Task Automatic Speech Recognition Models. 208-215
- Yash Jogi, Vaibhav Aggarwal, Shabari S. Nair, Yash Verma, Aayush Kubba: Improving Rare-Word Recognition of Whisper in Zero-Shot Settings. 216-223
- Robin Amann, Zhaolin Li, Barbara Bruno, Jan Niehues: Augmenting Automatic Speech Recognition Models With Disfluency Detection. 224-231
- Yuting Yang, Yuke Li, Lifeng Zhou, Binbin Du, Haoqi Zhu: Enhancing Unified Streaming and Non-Streaming ASR Through Curriculum Learning With Easy-To-Hard Tasks. 232-239
- Hang Shao, Bei Liu, Wei Wang, Xun Gong, Yanmin Qian: DQ-Whisper: Joint Distillation and Quantization for Efficient Multilingual Speech Recognition. 240-246
- Shih-Heng Wang, Jiatong Shi, Chien-Yu Huang, Shinji Watanabe, Hung-Yi Lee: Fusion Of Discrete Representations and Self-Augmented Representations for Multilingual Automatic Speech Recognition. 247-254
- Nithin Rao Koluguri, Travis M. Bartley, Hainan Xu, Oleksii Hrinchuk, Jagadeesh Balam, Boris Ginsburg, Georg Kucsko: Longer is (Not Necessarily) Stronger: Punctuated Long-Sequence Training for Enhanced Speech Recognition and Translation. 255-262
- Yu Xi, Wen Ding, Kai Yu, Junjie Lai: Semi-Supervised Learning For Code-Switching ASR With Large Language Model Filter. 263-270
- Peter Plantinga, Jaekwon Yoo, Abenezer Girma, Chandra Dhir: Parameter Averaging Is All You Need To Prevent Forgetting. 271-278
- Zeyu Zhao, Peter Bell: Advancing CTC Models for Better Speech Alignment: A Topological Approach. 279-285
- Ziqian Wang, Jiayao Sun, Zihan Zhang, Xingchen Li, Jie Liu, Lei Xie: Dualsep: A Light-Weight Dual-Encoder Convolutional Recurrent Network For Real-Time In-Car Speech Separation. 286-293
- Leying Zhang, Yao Qian, Linfeng Yu, Heming Wang, Hemin Yang, Shujie Liu, Long Zhou, Yanmin Qian: DDTSE: Discriminative Diffusion Model for Target Speech Extraction. 294-301
- Rong Chao, Wen-Huang Cheng, Moreno La Quatra, Sabato Marco Siniscalchi, Chao-Han Huck Yang, Szu-Wei Fu, Yu Tsao: An Investigation of Incorporating Mamba For Speech Enhancement. 302-308
- Chien-Chun Wang, Li-Wei Chen, Hung-Shin Lee, Berlin Chen, Hsin-Min Wang: Effective Noise-Aware Data Simulation For Domain-Adaptive Speech Enhancement Leveraging Dynamic Stochastic Perturbation. 309-316
- Zhihang Sun, Andong Li, Rilin Chen, Hao Zhang, Meng Yu, Yi Zhou, Dong Yu: SMRU: Split-And-Merge Recurrent-Based UNet For Acoustic Echo Cancellation And Noise Suppression. 317-324
- Junjie Li, Ke Zhang, Shuai Wang, Haizhou Li, Man-Wai Mak, Kong Aik Lee: On the Effectiveness of Enrollment Speech Augmentation For Target Speaker Extraction. 325-332
- Chenda Li, Samuele Cornell, Shinji Watanabe, Yanmin Qian: Diffusion-Based Generative Modeling With Discriminative Guidance for Streamable Speech Enhancement. 333-340
- Dashanka De Silva, Siqi Cai, Saurav Pahuja, Tanja Schultz, Haizhou Li: Neurospex: Neuro-Guided Speaker Extraction With Cross-Modal Fusion. 341-348
- Jiahe Wang, Shuai Wang, Junjie Li, Ke Zhang, Yanmin Qian, Haizhou Li: Enhancing Speaker Extraction Through Rectifying Target Confusion. 349-356
- Da-Hee Yang, Joon-Hyuk Chang: Diff-PLC: A Diffusion-Based Approach For Effective Packet Loss Concealment. 357-363
- Yun Liu, Xuechen Liu, Junichi Yamagishi: Improving Curriculum Learning For Target Speaker Extraction With Synthetic Speakers. 364-370
- Chao-Han Huck Yang, Taejin Park, Yuan Gong, Yuanchao Li, Zhehuai Chen, Yen-Ting Lin, Chen Chen, Yuchen Hu, Kunal Dhawan, Piotr Zelasko, Chao Zhang, Yun-Nung Chen, Yu Tsao, Jagadeesh Balam, Boris Ginsburg, Sabato Marco Siniscalchi, Eng Siong Chng, Peter Bell, Catherine Lai, Shinji Watanabe, Andreas Stolcke: Large Language Model Based Generative Error Correction: A Challenge and Baselines For Speech Recognition, Speaker Tagging, and Emotion Recognition. 371-378
- Han Jiang, Wenyu Wang, Yiquan Zhou, Hongwu Ding, Jiacheng Xu, Jihua Zhu: FGCL: Fine-Grained Contrastive Learning For Mandarin Stuttering Event Detection. 379-384
- Hongfei Xue, Rong Gong, Mingchen Shao, Xin Xu, Lezhi Wang, Lei Xie, Hui Bu, Jiaming Zhou, Yong Qin, Jun Du, Ming Li, Binbin Zhang, Bin Jia: Findings of the 2024 Mandarin Stuttering Event Detection and Automatic Speech Recognition Challenge. 385-392
- Shangkun Huang, Dejun Zhang, Jing Deng, Rong Zheng: Enhanced ASR For Stuttering Speech: Combining Adversarial and Signal-Based Data Augmentation. 393-400
- Tzu-Quan Lin, Guan-Ting Lin, Hung-Yi Lee, Hao Tang: Property Neurons in Self-Supervised Speech Transformers. 401-408
- Zexin Cai, Henry Li Xinyuan, Ashi Garg, Leibny Paola García-Perera, Kevin Duh, Sanjeev Khudanpur, Nicholas Andrews, Matthew Wiesner: Privacy Versus Emotion Preservation Trade-Offs in Emotion-Preserving Speaker Anonymization. 409-414
- Sung-Lin Yeh, Hao Tang: Estimating the Completeness of Discrete Speech Units. 415-422
- Takanori Ashihara, Takafumi Moriya, Shota Horiguchi, Junyi Peng, Tsubasa Ochiai, Marc Delcroix, Kohei Matsuura, Hiroshi Sato: Investigation of Speaker Representation for Target-Speaker Speech Processing. 423-430
- Yuanchao Li, Pinzhen Chen, Peter Bell, Catherine Lai: Crossmodal ASR Error Correction With Discrete Speech Units. 431-438
- Yi-Cheng Lin, Tzu-Quan Lin, Chih-Kai Yang, Ke-Han Lu, Wei-Chih Chen, Chun-Yi Kuan, Hung-Yi Lee: Listen and Speak Fairly: a Study on Semantic Gender Bias in Speech Integrated Large Language Models. 439-446
- Sungnyun Kim, Kangwook Jang, Sangmin Bae, Hoirin Kim, Se-Young Yun: Learning Video Temporal Dynamics With Cross-Modal Attention For Robust Audio-Visual Speech Recognition. 447-454
- Lemeng Wu, Zhaoheng Ni, Bowen Shi, Gaël Le Lan, Anurag Kumar, Varun Nagaraja, Xinhao Mei, Yunyang Xiong, Bilge Soran, Raghuraman Krishnamoorthi, Wei-Ning Hsu, Yangyang Shi, Vikas Chandra: Data Efficient Reflow for Few Step Audio Generation. 455-461
- Roger Hsiao, Liuhui Deng, Erik McDermott, Ruchir Travadi, Xiaodan Zhuang: Optimizing Byte-Level Representation For End-To-End ASR. 462-467
- Wen Ding, Fei Jia, Hainan Xu, Yu Xi, Junjie Lai, Boris Ginsburg: Romanization Encoding For Multilingual ASR. 468-475
- Tzu-Ting Yang, Hsin-Wei Wang, Yi-Cheng Wang, Berlin Chen: Enhancing Code-Switching ASR Leveraging Non-Peaky CTC Loss and Deep Language Posterior Injection. 476-481
- Chang Liu, Zhen-Hua Ling, Ya-Jun Hu: Language-Independent Prosody-Enhanced Speech Representations For Multilingual Speech Synthesis. 482-488
- Shahar Elisha, Andrew McDowell, Mariano Beguerisse-Díaz, Emmanouil Benetos: Classification Of Spontaneous And Scripted Speech For Multilingual Audio. 489-495
- Yu Pan, Yuguang Yang, Yuheng Huang, Tiancheng Jin, Jingjing Yin, Yanni Hu, Heng Lu, Lei Ma, Jianjun Zhao: GMP-TL: Gender-Augmented Multi-Scale Pseudo-Label Enhanced Transfer Learning For Speech Emotion Recognition. 496-501
- Huang-Cheng Chou, Haibin Wu, Lucas Goncalves, Seong-Gyun Leem, Ali Salman, Carlos Busso, Hung-Yi Lee, Chi-Chun Lee: Embracing Ambiguity And Subjectivity Using The All-Inclusive Aggregation Rule For Evaluating Multi-Label Speech Emotion Recognition Systems. 502-509
- Haibin Wu, Huang-Cheng Chou, Kai-Wei Chang, Lucas Goncalves, Jiawei Du, Jyh-Shing Roger Jang, Chi-Chun Lee, Hung-Yi Lee: Open-Emotion: A Reproducible EMO-Superb For Speech Emotion Recognition Systems. 510-517
- Yuanchao Li, Peter Bell, Catherine Lai: Speech Emotion Recognition With ASR Transcripts: a Comprehensive Study on Word Error Rate and Fusion Techniques. 518-525
- Ariadna Sanchez, Alice Ross, Nina Markl: Beyond The Binary: Limitations and Possibilities of Gender-Related Speech Technology Research. 526-532
- Shi-wook Lee: Enhancing Domain Generalization in Speech Emotion Recognition by Combining Domain-Variant Representations and Domain-Invariant Classifiers. 533-539
- Xiao-Hang Jiang, Yang Ai, Rui-Chen Zheng, Hui-Peng Du, Ye-Xin Lu, Zhen-Hua Ling: MDCTCodec: A Lightweight MDCT-Based Neural Audio Codec Towards High Sampling Rate and Low Bitrate Scenarios. 540-547
- Haohan Guo, Fenglong Xie, Dongchao Yang, Hui Lu, Xixin Wu, Helen Meng: Addressing Index Collapse of Large-Codebook Speech Tokenizer With Dual-Decoding Product-Quantized Variational Auto-Encoder. 548-553
- Jiaqi Li, Dongmei Wang, Xiaofei Wang, Yao Qian, Long Zhou, Shujie Liu, Midia Yousefi, Canrun Li, Chung-Hsien Tsai, Zhen Xiao, Yanqing Liu, Junkun Chen, Sheng Zhao, Jinyu Li, Zhizheng Wu, Michael Zeng: Investigating Neural Audio Codecs For Speech Language Model-Based Speech Generation. 554-561
- Jiatong Shi, Jinchuan Tian, Yihan Wu, Jee-Weon Jung, Jia Qi Yip, Yoshiki Masuyama, William Chen, Yuning Wu, Yuxun Tang, Massa Baali, Dareen Alharthi, Dong Zhang, Ruifan Deng, Tejes Srivastava, Haibin Wu, Alexander H. Liu, Bhiksha Raj, Qin Jin, Ruihua Song, Shinji Watanabe: ESPnet-Codec: Comprehensive Training and Evaluation of Neural Codecs For Audio, Music, and Speech. 562-569
- Haibin Wu, Xuanjun Chen, Yi-Cheng Lin, Kaiwei Chang, Jiawei Du, Ke-Han Lu, Alexander H. Liu, Ho-Lam Chung, Yuan-Kuei Wu, Dongchao Yang, Songxiang Liu, Yi-Chiao Wu, Xu Tan, James R. Glass, Shinji Watanabe, Hung-Yi Lee: Codec-Superb @ SLT 2024: A Lightweight Benchmark For Neural Audio Codec Models. 570-577
- Shuiyun Liu, Yuxiang Kong, Pengcheng Guo, Weiji Zhuang, Peng Gao, Yujun Wang, Lei Xie: Optimizing Dysarthria Wake-Up Word Spotting: an End-to-End Approach For SLT 2024 LRDWWS Challenge. 578-585
- Shiyao Wang, Jiaming Zhou, Shiwan Zhao, Yong Qin: PB-LRDWWS System For the SLT 2024 Low-Resource Dysarthria Wake-Up Word Spotting Challenge. 586-591
- Ming Gao, Hang Chen, Jun Du, Xin Xu, Hongxiao Guo, Hui Bu, Ming Li, Chin-Hui Lee: Summary of Low-Resource Dysarthria Wake-Up Word Spotting Challenge. 592-599
- Ada Defne Tur, Adel Moumen, Mirco Ravanelli: Progres: Prompted Generative Rescoring on ASR N-Best. 600-607
- Moreno La Quatra, Valerio Mario Salerno, Yu Tsao, Sabato Marco Siniscalchi: FlanEC: Exploring Flan-T5 for Post-ASR Error Correction. 608-615
- Zhipeng Li, Xiaofen Xing, Jun Wang, Shuaiqi Chen, Guoqiao Yu, Guanglu Wan, Xiangmin Xu: As-Speech: Adaptive Style For Speech Synthesis. 616-622
- Hieu-Thi Luong, Duc-Tuan Truong, Kong Aik Lee, Eng Siong Chng: Room Impulse Responses Help Attackers to Evade Deep Fake Detection. 623-629
- Hankun Wang, Chenpeng Du, Yiwei Guo, Shuai Wang, Xie Chen, Kai Yu: Attention-Constrained Inference For Robust Decoder-Only Text-to-Speech. 630-637
- Fei Liu, Yang Ai, Hui-Peng Du, Ye-Xin Lu, Rui-Chen Zheng, Zhen-Hua Ling: Stage-Wise and Prior-Aware Neural Speech Phase Prediction. 638-644
- Haohan Guo, Fenglong Xie, Kun Xie, Dongchao Yang, Dake Guo, Xixin Wu, Helen Meng: SoCodec: A Semantic-Ordered Multi-Stream Speech Codec For Efficient Language Model Based Text-to-Speech Synthesis. 645-651
- Sung-Feng Huang, Heng-Cheng Kuo, Zhehuai Chen, Xuesong Yang, Chao-Han Huck Yang, Yu Tsao, Yu-Chiang Frank Wang, Hung-Yi Lee, Szu-Wei Fu: Detecting the Undetectable: Assessing the Efficacy of Current Spoof Detection Methods Against Seamless Speech Edits. 652-659
- Hiroaki Hyodo, Shinnosuke Takamichi, Tomohiko Nakamura, Junya Koguchi, Hiroshi Saruwatari: DNN-Based Ensemble Singing Voice Synthesis With Interactions Between Singers. 660-667
- Sotirios Karapiperis, Nikolaos Ellinas, Alexandra Vioni, Junkwang Oh, Gunu Jho, Inchul Hwang, Spyros Raptis: Investigating Disentanglement in a Phoneme-Level Speech Codec for Prosody Modeling. 668-674
- Chang Zeng, Chunhui Wang, Xiaoxiao Miao, Jian Zhao, Zhonglin Jiang, Yong Chen: Instructsing: High-Fidelity Singing Voice Generation Via Instructing Yourself. 675-681
- Sefik Emre Eskimez, Xiaofei Wang, Manthan Thakker, Canrun Li, Chung-Hsien Tsai, Zhen Xiao, Hemin Yang, Zirun Zhu, Min Tang, Xu Tan, Yanqing Liu, Sheng Zhao, Naoyuki Kanda: E2 TTS: Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS. 682-689
- Haibin Wu, Xiaofei Wang, Sefik Emre Eskimez, Manthan Thakker, Daniel Tompkins, Chung-Hsien Tsai, Canrun Li, Zhen Xiao, Sheng Zhao, Jinyu Li, Naoyuki Kanda: Laugh Now Cry Later: Controlling Time-Varying Emotional States of Flow-Matching-Based Zero-Shot Text-To-Speech. 690-697
- Zhengyang Chen, Shuai Wang, Mingyang Zhang, Xuechen Liu, Junichi Yamagishi, Yanmin Qian: Disentangling The Prosody And Semantic Information With Pre-Trained Model For In-Context Learning Based Zero-Shot Voice Conversion. 698-704
- Zhikang Niu, Sanyuan Chen, Long Zhou, Ziyang Ma, Xie Chen, Shujie Liu: NDVQ: Robust Neural Audio Codec With Normal Distribution-Based Vector Quantization. 705-710
- Yisi Liu, Bohan Yu, Drake Lin, Peter Wu, Cheol Jun Cho, Gopala Krishna Anumanchipalli: Fast, High-Quality and Parameter-Efficient Articulatory Synthesis Using Differentiable DSP. 711-718
- Yifeng Yu, Jiatong Shi, Yuning Wu, Yuxun Tang, Shinji Watanabe: Visinger2+: End-to-End Singing Voice Synthesis Augmented by Self-Supervised Learning Representation. 719-726
- Waris Quamer, Ricardo Gutierrez-Osuna: End-To-End Streaming Model For Low-Latency Speech Anonymization. 727-734
- Raymond Chung: Emotion-Coherent Speech Data Augmentation And Self-Supervised Contrastive Style Training For Enhancing Kids's Story Speech Synthesis. 735-741
- Philip H. Lee, Ismail Rasim Ulgen, Berrak Sisman: Discrete Unit Based Masking For Improving Disentanglement in Voice Conversion. 742-749
- Kazuki Yamauchi, Yuki Saito, Hiroshi Saruwatari: Cross-Dialect Text-to-Speech In Pitch-Accent Language Incorporating Multi-Dialect Phoneme-Level Bert. 750-757
- Xueyao Zhang, Zihao Fang, Yicheng Gu, Haopeng Chen, Lexiao Zou, Junan Zhang, Liumeng Xue, Zhizheng Wu: Leveraging Diverse Semantic-Based Audio Pretrained Models for Singing Voice Conversion. 758-765
- Christoph Minixhofer, Ondrej Klejch, Peter Bell: TTSDS - Text-to-Speech Distribution Score. 766-773
- Anmol Guragain, Tianchi Liu, Zihan Pan, Hardik B. Sailor, Qiongqiong Wang: Speech Foundation Model Ensembles for the Controlled Singing Voice Deepfake Detection (CTRSVDD) Challenge 2024. 774-781
- You Zhang, Yongyi Zang, Jiatong Shi, Ryuichi Yamamoto, Tomoki Toda, Zhiyao Duan: SVDD 2024: The Inaugural Singing Voice Deepfake Detection Challenge. 782-787
- Qishan Zhang, Shuangbing Wen, Fangke Yan, Tao Hu, Jun Li: XWSB: A Blend System Utilizing XLS-R and Wavlm With SLS Classifier Detection System for SVDD 2024 Challenge. 788-794
- Yankai Wang, Yuxuan Du, Dejun Zhang, Rong Zheng, Jing Deng: Integrating Self-Supervised Pre-Training With Adversarial Learning for Synthesized Song Detection. 795-802
- Wen-Chin Huang, Szu-Wei Fu, Erica Cooper, Ryandhimas E. Zezario, Tomoki Toda, Hsin-Min Wang, Junichi Yamagishi, Yu Tsao: The Voicemos Challenge 2024: Beyond Speech Quality Prediction. 803-810
- Yu-Fei Shi, Yang Ai, Ye-Xin Lu, Hui-Peng Du, Zhen-Hua Ling: Pitch-and-Spectrum-Aware Singing Quality Assessment with Bias Correction and Model Fusion. 811-817
- Kaito Baba, Wataru Nakata, Yuki Saito, Hiroshi Saruwatari: The T05 System for the voicemos challenge 2024: Transfer Learning from Deep Image Classifier to Naturalness MOS Prediction of High-Quality Synthetic Speech. 818-824
- Jiun-Ting Li, Bi-Cheng Yan, Tien-Hong Lo, Yi-Cheng Wang, Yung-Chang Hsu, Berlin Chen: Automated Speaking Assessment of Conversation Tests with Novel Graph-Based Modeling on Spoken Response Coherence. 825-832
- Luca Becker, Philip Pracht, Peter Sertdal, Jil Uboreck, Alexander Bendel, Rainer Martin: Conditional Label Smoothing For LLM-Based Data Augmentation in Medical Text Classification. 833-840
- Mengfei Guo, Si Chen, Yi Huang, Junlan Feng: Plan, Generate and Optimize: Extending Large Language Models for Dialogue Systems Via Prompt-Based Collaborative Method. 841-848
- Mahdin Rohmatillah, Jen-Tzung Chien: Taming NLU Noise: Student-Teacher Learning for Robust Dialogue Policy. 849-856
- Stanislaw Kacprzak, Konrad Kowalczyk: Heightceleb - An Enrichment of Voxceleb Dataset With Speaker Height Information. 857-862
- Masao Someki, Kwanghee Choi, Siddhant Arora, William Chen, Samuele Cornell, Jionghao Han, Yifan Peng, Jiatong Shi, Vaibhav Srivastav, Shinji Watanabe: ESPnet-EZ: Python-Only ESPnet For Easy Fine-Tuning And Integration. 863-870
- Yi-Cheng Lin, Wei-Chih Chen, Hung-Yi Lee: Spoken Stereoset: on Evaluating Social Bias Toward Speaker in Speech Large Language Models. 871-878
- Xueyao Zhang, Liumeng Xue, Yicheng Gu, Yuancheng Wang, Jiaqi Li, Haorui He, Chaoren Wang, Songting Liu, Xi Chen, Junan Zhang, Zihao Fang, Haopeng Chen, Tze Ying Tang, Lexiao Zou, Mingxuan Wang, Jun Han, Kai Chen, Haizhou Li, Zhizheng Wu: Amphion: an Open-Source Audio, Music, and Speech Generation Toolkit. 879-884
- Haorui He, Zengqiang Shang, Chaoren Wang, Xuyuan Li, Yicheng Gu, Hua Hua, Liwei Liu, Chen Yang, Jiaqi Li, Peiyang Shi, Yuancheng Wang, Kai Chen, Pengyuan Zhang, Zhizheng Wu: Emilia: An Extensive, Multilingual, and Diverse Speech Dataset For Large-Scale Speech Generation. 885-890
- William Chen, Brian Yan, Chih-Chen Chen, Shinji Watanabe: Floras 50: A Massively Multilingual Multitask Benchmark for Long-Form Conversational Speech. 891-898
- Hirofumi Inaguma, Ilia Kulikov, Zhaoheng Ni, Sravya Popuri, Paden Tomasello: Massively Multilingual Forced Aligner Leveraging Self-Supervised Discrete Units. 899-905
- Tejes Srivastava, Ju-Chieh Chou, Priyank Shroff, Karen Livescu, Christopher Graziul: Speech Recognition For Analysis of Police Radio Communication. 906-912
- Taaha Kazi, Ruiliang Lyu, Sizhe Zhou, Dilek Hakkani-Tür, Gokhan Tur: Large Language Models as User-Agents For Evaluating Task-Oriented-Dialogue Systems. 913-920
- Jiawei Du, I-Ming Lin, I-Hsiang Chiu, Xuanjun Chen, Haibin Wu, Wenze Ren, Yu Tsao, Hung-Yi Lee, Jyh-Shing Roger Jang: DFADD: The Diffusion and Flow-Matching Based Audio Deepfake Dataset. 921-928
- Peizhuo Liu, Li Wang, Renqiang He, Haorui He, Lei Wang, Huadi Zheng, Jie Shi, Tong Xiao, Zhizheng Wu: SPMIS: An Investigation of Synthetic Spoken Misinformation Detection. 929-936
- Yi-Jen Shih, Zoi Gkalitsiou, Alexandros G. Dimakis, David Harwath: Self-Supervised Speech Models For Word-Level Stuttered Speech Detection. 937-944
- Wen-Hsuan Peng, Sally Chen, Berlin Chen: Enhancing Automatic Speech Assessment Leveraging Heterogeneous Features and Soft Labels For Ordinal Classification. 945-952
- Yerin Choi, Jeehyun Lee, Myoung-Wan Koo: Speech Recognition-Based Feature Extraction For Enhanced Automatic Severity Classification in Dysarthric Speech. 953-960
- Andy T. Liu, Yi-Cheng Lin, Haibin Wu, Stefan Winkler, Hung-Yi Lee: Efficient Training of Self-Supervised Speech Foundation Models on a Compute Budget. 961-968
- Xinhu Zheng, Anbai Jiang, Bing Han, Yanmin Qian, Pingyi Fan, Jia Liu, Wei-Qiang Zhang: Improving Anomalous Sound Detection Via Low-Rank Adaptation Fine-Tuning of Pre-Trained Audio Models. 969-974
- Tuan Nguyen, Corinne Fredouille, Alain Ghio, Mathieu Balaguer, Virginie Woisard: Exploring ASR-Based WAV2VEC2 for Automated Speech Disorder Assessment: Insights and Analysis. 975-982
- Chang Feng, Yiyang Zhao, Guangzhi Sun, Zehua Chen, Shuai Wang, Chao Zhang, Mingxing Xu, Thomas Fang Zheng: Hierarchical Multi-Path and Multi-Model Selection For Fake Speech Detection. 983-990
- Huayun Zhang, Jeremy H. M. Wong, Geyu Lin, Nancy F. Chen: Semi-Supervised Learning for Robust Speech Evaluation. 991-998
- Pai Zhu, Jacob W. Bartel, Dhruuv Agarwal, Kurt Partridge, Hyun Jin Park, Quan Wang: GE2E-KWS: Generalized End-to-End Training and Evaluation for Zero-Shot Keyword Spotting. 999-1006
- Gene-Ping Yang, Hao Tang: A Simple HMM with Self-Supervised Representations for Phone Segmentation. 1007-1014
- Saurabhchand Bhati, Yuan Gong, Leonid Karlinsky, Hilde Kuehne, Rogério Feris, James R. Glass: DASS: Distilled Audio State Space Models are Stronger and More Duration-Scalable Learners. 1015-1022
- David Qiu, David Rim, Shaojin Ding, Oleg Rybakov, Yanzhang He: Rand: Robustness Aware Norm Decay for Quantized Neural Networks. 1023-1030
- Ziyang Zhang, Andrew Thwaites, Alexandra Woolgar, Brian Moore, Chao Zhang: SWIM: Short-Window CNN Integrated With Mamba for EEG-Based Auditory Spatial Attention Decoding. 1031-1038
- Xuanru Zhou, Cheol Jun Cho, Ayati Sharma, Brittany Morin, David Baquirin, Jet Vonk, Zoe Ezzes, Zachary Miller, Boon Lead Tee, Maria Luisa Gorno-Tempini, Jiachen Lian, Gopala Anumanchipalli: Stutter-Solver: End-To-End Multi-Lingual Dysfluency Detection. 1039-1046
- Huadong Lin, Yirong Chen, Wenyu Tao, Mingyu Chen, Xiangmin Xu, Xiaofen Xing: Domain Adaption and Unified Knowledge Base Motivate Better Retrieval Models in Dialog Systems With RAG. 1047-1052
- Siavash Shams, Sukru Samet Dindar, Xilin Jiang, Nima Mesgarani: SSAMBA: Self-Supervised Audio Representation Learning With Mamba State Space Model. 1053-1059
- Chun-Yi Kuan, Chih-Kai Yang, Wei-Ping Huang, Ke-Han Lu, Hung-Yi Lee: Speech-Copilot: Leveraging Large Language Models for Speech Processing Via Task Decomposition, Modularization, and Program Generation. 1060-1067
- Rui Zhao, Jinyu Li, Ruchao Fan, Matt Post: CTC-GMM: CTC Guided Modality Matching For Fast and Accurate Streaming Speech Translation. 1068-1075
- Peter Polák, Ondrej Bojar: Long-Form End-To-End Speech Translation VIA Latent Alignment Segmentation. 1076-1082
- Yi-Jyun Sun, Suvodip Dey, Dilek Hakkani-Tür, Gokhan Tur: Confidence Estimation For LLM-Based Dialogue State Tracking. 1083-1090
- Yucheng Cai, Si Chen, Yuxuan Wu, Yi Huang, Junlan Feng, Zhijian Ou: The 2nd Futuredial Challenge: Dialog Systems With Retrieval Augmented Generation (Futuredial-RAG). 1091-1098
- Mengjie Qian, Rao Ma, Adian Liusie, Erfan Loweimi, Kate M. Knill, Mark J. F. Gales: Zero-Shot Audio Topic Reranking Using Large Language Models. 1099-1106
- Henry Li Xinyuan, Sonal Joshi, Thomas Thebaud, Jesús Villalba, Najim Dehak, Sanjeev Khudanpur: Clean Label Attacks Against SLU Systems. 1107-1114
- Mohan Li, Cong-Thanh Do, Simon Keizer, Youmna Farag, Svetlana Stoyanchev, Rama Doddipatla: WHISMA: A Speech-LLM to Perform Zero-Shot Spoken Language Understanding. 1115-1122
- Vishal Sunder, Eric Fosler-Lussier: Improving Transducer-Based Spoken Language Understanding With Self-Conditioned CTC and Knowledge Transfer. 1123-1130
- Ryota Komatsu, Takahiro Shinozaki: Self-Supervised Syllable Discovery Based on Speaker-Disentangled Hubert. 1131-1136
- Junkai Wu, Xulin Fan, Bo-Ru Lu, Xilin Jiang, Nima Mesgarani, Mark Hasegawa-Johnson, Mari Ostendorf: Just ASR + LLM? A Study on Speech Large Language Models' Ability to Identify And Understand Speaker in Spoken Dialogue. 1137-1143
- Zhiyong Chen, Zhiqi Ai, Xinnuo Li, Shugong Xu: Enhancing Open-Set Speaker Identification Through Rapid Tuning With Speaker Reciprocal Points and Negative Sample. 1144-1149
- Chang Zeng, Xiaoxiao Miao, Xin Wang, Erica Cooper, Junichi Yamagishi: Spoofing-Aware Speaker Verification Robust Against Domain and Channel Mismatches. 1150-1157
- Yibo Bai, Xiao-Lei Zhang, Xuelong Li: Adversarial Purification For Speaker Verification By Two-Stage Diffusion Models. 1158-1164
- Wei-Cheng Tseng, Yi-Jen Shih, David Harwath, Raymond Mooney: Measuring Sound Symbolism In Audio-Visual Models. 1165-1172
- Ivan Kukanov, Janne Laakkonen, Tomi Kinnunen, Ville Hautamäki: Meta-Learning Approaches For Improving Detection of Unseen Speech Deepfakes. 1173-1178
- Chenyang Guo, Liping Chen, Zhuhai Li, Kong Aik Lee, Zhen-Hua Ling, Wu Guo: On The Generation and Removal of Speaker Adversarial Perturbation For Voice-Privacy Protection. 1179-1184
- Tianchi Liu, Ivan Kukanov, Zihan Pan, Qiongqiong Wang, Hardik B. Sailor, Kong Aik Lee: Towards Quantifying and Reducing Language Mismatch Effects in Cross-Lingual Speech Anti-Spoofing. 1185-1192
- Min Ma, Gary Wang, Kyle Kastner, Isaac Caswell, Charles Yoon, Andrew Rosenberg: Enhancing Low-Resource Spoken Language Identification Via Cross-Modality Retrieval and Cross-Lingual Text-to-Speech Synthesis. 1193-1200
- Shota Horiguchi, Atsushi Ando, Takafumi Moriya, Takanori Ashihara, Hiroshi Sato, Naohiro Tawara, Marc Delcroix: Recursive Attentive Pooling For Extracting Speaker Embeddings From Multi-Speaker Recordings. 1201-1208
- Massa Baali, Abdulhamid Aldoobi, Hira Dhamyal, Rita Singh, Bhiksha Raj: PDAF: A Phonetic Debiasing Attention Framework For Speaker Verification. 1209-1216
- Narla John Metilda Sagaya Mary, S. Umesh: Inx-Speakerhub: A 2000-Hour Indian Multilingual Speaker Verification Corpus. 1217-1223
- Weiqing Wang, Kunal Dhawan, Taejin Park, Krishna C. Puvvada, Ivan Medennikov, Somshubra Majumdar, He Huang, Jagadeesh Balam, Boris Ginsburg: Resource-Efficient Adaptation of Speech Foundation Models for Multi-Speaker ASR. 1224-1231
- Sreekanth Sankala: Exploring Self-Supervised Representations for Text-Dependent Speaker Verification. 1232-1239
- Xinlei Ma, Wenhuan Lu, Ruiteng Zhang, Junhai Xu, Xugang Lu, Jianguo Wei: Distillation-Based Feature Extraction Algorithm For Source Speaker Verification. 1240-1246
- Qing Wang, Hongmei Guo, Jian Kang, Mengjie Du, Jie Li, Xiao-Lei Zhang, Lei Xie: Speaker Contrastive Learning For Source Speaker Tracing. 1247-1253
- Ze Li, Yuke Lin, Tian Yao, Hongbin Suo, Pengyuan Zhang, Yanzhen Ren, Zexin Cai, Hiromitsu Nishizaki, Ming Li: The Database and Benchmark For the Source Speaker Tracing Challenge 2024. 1254-1261
