Daniel Povey
Person information
- affiliation: Xiaomi Inc., Beijing, China
- affiliation (former): Johns Hopkins University, USA
2020 – today
- 2024
- [c164]Ruizhe Huang, Mahsa Yarmohammadi, Jan Trmal, Jing Liu, Desh Raj, Leibny Paola García, Alexei V. Ivanov, Patrick Ehlen, Mingzhi Yu, Dan Povey, Sanjeev Khudanpur:
ConEC: Earnings Call Dataset with Real-world Contexts for Benchmarking Contextual Speech Recognition. LREC/COLING 2024: 3700-3706 - [c163]Quandong Wang, Yuxuan Yuan, Xiaoyu Yang, Ruike Zhang, Kang Zhao, Wei Liu, Jian Luan, Daniel Povey, Bin Wang:
SUBLLM: A Novel Efficient Architecture with Token Sequence Subsampling for LLM. ECAI 2024: 3685-3692 - [c162]Yifan Yang, Feiyu Shen, Chenpeng Du, Ziyang Ma, Kai Yu, Daniel Povey, Xie Chen:
Towards Universal Speech Discrete Tokens: A Case Study for ASR and TTS. ICASSP 2024: 10401-10405 - [c161]Xiaoyu Yang, Wei Kang, Zengwei Yao, Yifan Yang, Liyong Guo, Fangjun Kuang, Long Lin, Daniel Povey:
PromptASR for Contextualized ASR with Controllable Style. ICASSP 2024: 10536-10540 - [c160]Wei Kang, Xiaoyu Yang, Zengwei Yao, Fangjun Kuang, Yifan Yang, Liyong Guo, Long Lin, Daniel Povey:
Libriheavy: A 50,000 Hours ASR Corpus with Punctuation Casing and Context. ICASSP 2024: 10991-10995 - [c159]Ruizhe Huang, Xiaohui Zhang, Zhaoheng Ni, Li Sun, Moto Hira, Jeff Hwang, Vimal Manohar, Vineel Pratap, Matthew Wiesner, Shinji Watanabe, Daniel Povey, Sanjeev Khudanpur:
Less Peaky and More Accurate CTC Forced Alignment by Label Priors. ICASSP 2024: 11831-11835 - [c158]Zengwei Yao, Liyong Guo, Xiaoyu Yang, Wei Kang, Fangjun Kuang, Yifan Yang, Zengrui Jin, Long Lin, Daniel Povey:
Zipformer: A faster and better encoder for automatic speech recognition. ICLR 2024 - [c157]Desh Raj, Matthew Wiesner, Matthew Maciejewski, Paola García, Daniel Povey, Sanjeev Khudanpur:
On Speaker Attribution with SURT. Odyssey 2024: 91-98 - [i37]Desh Raj, Matthew Wiesner, Matthew Maciejewski, Leibny Paola García-Perera, Daniel Povey, Sanjeev Khudanpur:
On Speaker Attribution with SURT. CoRR abs/2401.15676 (2024) - [i36]Ruizhe Huang, Xiaohui Zhang, Zhaoheng Ni, Li Sun, Moto Hira, Jeff Hwang, Vimal Manohar, Vineel Pratap, Matthew Wiesner, Shinji Watanabe, Daniel Povey, Sanjeev Khudanpur:
Less Peaky and More Accurate CTC Forced Alignment by Label Priors. CoRR abs/2406.02560 (2024) - [i35]Quandong Wang, Yuxuan Yuan, Xiaoyu Yang, Ruike Zhang, Kang Zhao, Wei Liu, Jian Luan, Daniel Povey, Bin Wang:
SUBLLM: A Novel Efficient Architecture with Token Sequence Subsampling for LLM. CoRR abs/2406.06571 (2024) - [i34]Ruizhe Huang, Mahsa Yarmohammadi, Sanjeev Khudanpur, Daniel Povey:
Improving Neural Biasing for Contextual Speech Recognition by Early Context Injection and Text Perturbation. CoRR abs/2407.10303 (2024) - [i33]Zengrui Jin, Yifan Yang, Mohan Shi, Wei Kang, Xiaoyu Yang, Zengwei Yao, Fangjun Kuang, Liyong Guo, Lingwei Meng, Long Lin, Yong Xu, Shi-Xiong Zhang, Daniel Povey:
LibriheavyMix: A 20,000-Hour Dataset for Single-Channel Reverberant Multi-Talker Speech Separation, ASR and Speaker Diarization. CoRR abs/2409.00819 (2024) - [i32]Zengwei Yao, Wei Kang, Xiaoyu Yang, Fangjun Kuang, Liyong Guo, Han Zhu, Zengrui Jin, Zhaoqing Li, Long Lin, Daniel Povey:
CR-CTC: Consistency regularization on CTC for improved speech recognition. CoRR abs/2410.05101 (2024) - 2023
- [j13]Han Zhu, Dongji Gao, Gaofeng Cheng, Daniel Povey, Pengyuan Zhang, Yonghong Yan:
Alternative Pseudo-Labeling for Semi-Supervised Automatic Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 31: 3320-3330 (2023) - [j12]Desh Raj, Daniel Povey, Sanjeev Khudanpur:
SURT 2.0: Advances in Transducer-Based Multi-Talker Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 31: 3800-3813 (2023) - [c156]Dongji Gao, Hainan Xu, Desh Raj, Leibny Paola García-Perera, Daniel Povey, Sanjeev Khudanpur:
Learning From Flawed Data: Weakly Supervised Automatic Speech Recognition. ASRU 2023: 1-8 - [c155]Liyong Guo, Xiaoyu Yang, Quandong Wang, Yuxiang Kong, Zengwei Yao, Fan Cui, Fangjun Kuang, Wei Kang, Long Lin, Mingshuang Luo, Piotr Zelasko, Daniel Povey:
Predicting Multi-Codebook Vector Quantization Indexes for Knowledge Distillation. ICASSP 2023: 1-5 - [c154]Ruizhe Huang, Matthew Wiesner, Leibny Paola García-Perera, Daniel Povey, Jan Trmal, Sanjeev Khudanpur:
Building Keyword Search System from End-To-End ASR Systems. ICASSP 2023: 1-5 - [c153]Wei Kang, Liyong Guo, Fangjun Kuang, Long Lin, Mingshuang Luo, Zengwei Yao, Xiaoyu Yang, Piotr Zelasko, Daniel Povey:
Fast and Parallel Decoding for Transducer. ICASSP 2023: 1-5 - [c152]Wei Kang, Zengwei Yao, Fangjun Kuang, Liyong Guo, Xiaoyu Yang, Long Lin, Piotr Zelasko, Daniel Povey:
Delay-Penalized Transducer for Low-Latency Streaming ASR. ICASSP 2023: 1-5 - [c151]Dongji Gao, Matthew Wiesner, Hainan Xu, Leibny Paola García, Daniel Povey, Sanjeev Khudanpur:
Bypass Temporal Classification: Weakly Supervised Automatic Speech Recognition with Imperfect Transcripts. INTERSPEECH 2023: 924-928 - [c150]Zengwei Yao, Wei Kang, Fangjun Kuang, Liyong Guo, Xiaoyu Yang, Yifan Yang, Long Lin, Daniel Povey:
Delay-penalized CTC Implemented Based on Finite State Transducer. INTERSPEECH 2023: 1329-1333 - [c149]Desh Raj, Daniel Povey, Sanjeev Khudanpur:
GPU-accelerated Guided Source Separation for Meeting Transcription. INTERSPEECH 2023: 3507-3511 - [c148]Yifan Yang, Xiaoyu Yang, Liyong Guo, Zengwei Yao, Wei Kang, Fangjun Kuang, Long Lin, Xie Chen, Daniel Povey:
Blank-regularized CTC for Frame Skipping in Neural Transducer. INTERSPEECH 2023: 4409-4413 - [i31]Yifan Yang, Xiaoyu Yang, Liyong Guo, Zengwei Yao, Wei Kang, Fangjun Kuang, Long Lin, Xie Chen, Daniel Povey:
Blank-regularized CTC for Frame Skipping in Neural Transducer. CoRR abs/2305.11558 (2023) - [i30]Dongji Gao, Matthew Wiesner, Hainan Xu, Leibny Paola García, Daniel Povey, Sanjeev Khudanpur:
Bypass Temporal Classification: Weakly Supervised Automatic Speech Recognition with Imperfect Transcripts. CoRR abs/2306.01031 (2023) - [i29]Desh Raj, Daniel Povey, Sanjeev Khudanpur:
SURT 2.0: Advances in Transducer-based Multi-talker Speech Recognition. CoRR abs/2306.10559 (2023) - [i28]Han Zhu, Dongji Gao, Gaofeng Cheng, Daniel Povey, Pengyuan Zhang, Yonghong Yan:
Alternative Pseudo-Labeling for Semi-Supervised Automatic Speech Recognition. CoRR abs/2308.06547 (2023) - [i27]Yifan Yang, Feiyu Shen, Chenpeng Du, Ziyang Ma, Kai Yu, Daniel Povey, Xie Chen:
Towards Universal Speech Discrete Tokens: A Case Study for ASR and TTS. CoRR abs/2309.07377 (2023) - [i26]Xiaoyu Yang, Wei Kang, Zengwei Yao, Yifan Yang, Liyong Guo, Fangjun Kuang, Long Lin, Daniel Povey:
PromptASR for contextualized ASR with controllable style. CoRR abs/2309.07414 (2023) - [i25]Wei Kang, Xiaoyu Yang, Zengwei Yao, Fangjun Kuang, Yifan Yang, Liyong Guo, Long Lin, Daniel Povey:
Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context. CoRR abs/2309.08105 (2023) - [i24]Dongji Gao, Hainan Xu, Desh Raj, Leibny Paola García-Perera, Daniel Povey, Sanjeev Khudanpur:
Learning from Flawed Data: Weakly Supervised Automatic Speech Recognition. CoRR abs/2309.15796 (2023) - [i23]Zengwei Yao, Liyong Guo, Xiaoyu Yang, Wei Kang, Fangjun Kuang, Yifan Yang, Zengrui Jin, Long Lin, Daniel Povey:
Zipformer: A faster and better encoder for automatic speech recognition. CoRR abs/2310.11230 (2023) - 2022
- [c147]Fangjun Kuang, Liyong Guo, Wei Kang, Long Lin, Mingshuang Luo, Zengwei Yao, Daniel Povey:
Pruned RNN-T for fast, memory-efficient ASR training. INTERSPEECH 2022: 2068-2072 - [e3]Lu Fang, Daniel Povey, Guangtao Zhai, Tao Mei, Ruiping Wang:
Artificial Intelligence - Second CAAI International Conference, CICAI 2022, Beijing, China, August 27-28, 2022, Revised Selected Papers, Part I. Lecture Notes in Computer Science 13604, Springer 2022, ISBN 978-3-031-20496-8 [contents] - [e2]Lu Fang, Daniel Povey, Guangtao Zhai, Tao Mei, Ruiping Wang:
Artificial Intelligence - Second CAAI International Conference, CICAI 2022, Beijing, China, August 27-28, 2022, Revised Selected Papers, Part II. Lecture Notes in Computer Science 13605, Springer 2022, ISBN 978-3-031-20499-9 [contents] - [e1]Lu Fang, Daniel Povey, Guangtao Zhai, Tao Mei, Ruiping Wang:
Artificial Intelligence - Second CAAI International Conference, CICAI 2022, Beijing, China, August 27-28, 2022, Revised Selected Papers, Part III. Lecture Notes in Computer Science 13606, Springer 2022, ISBN 978-3-031-20502-6 [contents] - [i22]Fangjun Kuang, Liyong Guo, Wei Kang, Long Lin, Mingshuang Luo, Zengwei Yao, Daniel Povey:
Pruned RNN-T for fast, memory-efficient ASR training. CoRR abs/2206.13236 (2022) - [i21]Wei Kang, Liyong Guo, Fangjun Kuang, Long Lin, Mingshuang Luo, Zengwei Yao, Xiaoyu Yang, Piotr Zelasko, Daniel Povey:
Fast and parallel decoding for transducer. CoRR abs/2211.00484 (2022) - [i20]Wei Kang, Zengwei Yao, Fangjun Kuang, Liyong Guo, Xiaoyu Yang, Long Lin, Piotr Zelasko, Daniel Povey:
Delay-penalized transducer for low-latency streaming ASR. CoRR abs/2211.00490 (2022) - [i19]Liyong Guo, Xiaoyu Yang, Quandong Wang, Yuxiang Kong, Zengwei Yao, Fan Cui, Fangjun Kuang, Wei Kang, Long Lin, Mingshuang Luo, Piotr Zelasko, Daniel Povey:
Predicting Multi-Codebook Vector Quantization Indexes for Knowledge Distillation. CoRR abs/2211.00508 (2022) - [i18]Desh Raj, Daniel Povey, Sanjeev Khudanpur:
GPU-accelerated Guided Source Separation for Meeting Transcription. CoRR abs/2212.05271 (2022) - 2021
- [j11]Hang Lv, Daniel Povey, Mahsa Yarmohammadi, Ke Li, Yiming Wang, Lei Xie, Sanjeev Khudanpur:
LET-Decoder: A WFST-Based Lazy-Evaluation Token-Group Decoder With Exact Lattice Generation. IEEE Signal Process. Lett. 28: 703-707 (2021) - [c146]Yiming Wang, Hang Lv, Daniel Povey, Lei Xie, Sanjeev Khudanpur:
Wake Word Detection with Streaming Transformers. ICASSP 2021: 5864-5868 - [c145]Hang Lv, Zhehuai Chen, Hainan Xu, Daniel Povey, Lei Xie, Sanjeev Khudanpur:
An Asynchronous WFST-Based Decoder for Automatic Speech Recognition. ICASSP 2021: 6019-6023 - [c144]Ke Li, Daniel Povey, Sanjeev Khudanpur:
A Parallelizable Lattice Rescoring Strategy with Neural Language Models. ICASSP 2021: 6518-6522 - [c143]Kyu Jeong Han, Jing Pan, Venkata Krishna Naveen Tadala, Tao Ma, Dan Povey:
Multistream CNN for Robust Acoustic Modeling. ICASSP 2021: 6873-6877 - [c142]Guoguo Chen, Shuzhou Chai, Guan-Bo Wang, Jiayu Du, Wei-Qiang Zhang, Chao Weng, Dan Su, Daniel Povey, Jan Trmal, Junbo Zhang, Mingjie Jin, Sanjeev Khudanpur, Shinji Watanabe, Shuaijiang Zhao, Wei Zou, Xiangang Li, Xuchen Yao, Yongqing Wang, Zhao You, Zhiyong Yan:
GigaSpeech: An Evolving, Multi-Domain ASR Corpus with 10,000 Hours of Transcribed Audio. Interspeech 2021: 3670-3674 - [c141]Junbo Zhang, Zhiwen Zhang, Yongqing Wang, Zhiyong Yan, Qiong Song, Yukai Huang, Ke Li, Daniel Povey, Yujun Wang:
speechocean762: An Open-Source Non-Native English Speech Corpus for Pronunciation Assessment. Interspeech 2021: 3710-3714 - [c140]Desh Raj, Leibny Paola García-Perera, Zili Huang, Shinji Watanabe, Daniel Povey, Andreas Stolcke, Sanjeev Khudanpur:
DOVER-Lap: A Method for Combining Overlap-Aware Diarization Outputs. SLT 2021: 881-888 - [i17]Yiming Wang, Hang Lv, Daniel Povey, Lei Xie, Sanjeev Khudanpur:
Wake Word Detection with Streaming Transformers. CoRR abs/2102.04488 (2021) - [i16]Ke Li, Daniel Povey, Sanjeev Khudanpur:
A Parallelizable Lattice Rescoring Strategy with Neural Language Models. CoRR abs/2103.05081 (2021) - [i15]Hang Lv, Zhehuai Chen, Hainan Xu, Daniel Povey, Lei Xie, Sanjeev Khudanpur:
An Asynchronous WFST-Based Decoder For Automatic Speech Recognition. CoRR abs/2103.09063 (2021) - [i14]Junbo Zhang, Zhiwen Zhang, Yongqing Wang, Zhiyong Yan, Qiong Song, Yukai Huang, Ke Li, Daniel Povey, Yujun Wang:
speechocean762: An Open-Source Non-native English Speech Corpus For Pronunciation Assessment. CoRR abs/2104.01378 (2021) - [i13]Guoguo Chen, Shuzhou Chai, Guanbo Wang, Jiayu Du, Wei-Qiang Zhang, Chao Weng, Dan Su, Daniel Povey, Jan Trmal, Junbo Zhang, Mingjie Jin, Sanjeev Khudanpur, Shinji Watanabe, Shuaijiang Zhao, Wei Zou, Xiangang Li, Xuchen Yao, Yongqing Wang, Yujun Wang, Zhao You, Zhiyong Yan:
GigaSpeech: An Evolving, Multi-domain ASR Corpus with 10,000 Hours of Transcribed Audio. CoRR abs/2106.06909 (2021) - [i12]Piotr Zelasko, Daniel Povey, Jan "Yenda" Trmal, Sanjeev Khudanpur:
Lhotse: a speech data representation library for the modern deep learning ecosystem. CoRR abs/2110.12561 (2021) - 2020
- [c139]Xiaohui Zhang, Daniel Povey, Sanjeev Khudanpur:
OOV Recovery with Efficient 2nd Pass Decoding and Open-vocabulary Word-level RNNLM Rescoring for Hybrid ASR. ICASSP 2020: 6334-6338 - [c138]Zili Huang, Shinji Watanabe, Yusuke Fujita, Paola García, Yiwen Shao, Daniel Povey, Sanjeev Khudanpur:
Speaker Diarization with Region Proposal Network. ICASSP 2020: 6514-6518 - [c137]Hugo Braun, Justin Luitjens, Ryan Leary, Tim Kaldewey, Daniel Povey:
GPU-Accelerated Viterbi Exact Lattice Decoder for Batched Online and Offline Speech Recognition. ICASSP 2020: 7874-7878 - [c136]Ke Li, Zhe Liu, Tianxing He, Hongzhao Huang, Fuchun Peng, Daniel Povey, Sanjeev Khudanpur:
An Empirical Study of Transformer-Based Neural Language Model Adaptation. ICASSP 2020: 7934-7938 - [c135]Yiwen Shao, Yiming Wang, Daniel Povey, Sanjeev Khudanpur:
PyChain: A Fully Parallelized PyTorch Implementation of LF-MMI for End-to-End ASR. INTERSPEECH 2020: 561-565 - [c134]Pegah Ghahramani, Hossein Hadian, Daniel Povey, Hynek Hermansky, Sanjeev Khudanpur:
An Alternative to MFCCs for ASR. INTERSPEECH 2020: 1664-1667 - [c133]Ke Li, Daniel Povey, Sanjeev Khudanpur:
Neural Language Modeling with Implicit Cache Pointers. INTERSPEECH 2020: 3625-3629 - [c132]Yiming Wang, Hang Lv, Daniel Povey, Lei Xie, Sanjeev Khudanpur:
Wake Word Detection with Alignment-Free Lattice-Free MMI. INTERSPEECH 2020: 4258-4262 - [c131]Srikanth R. Madikeri, Banriskhem K. Khonglah, Sibo Tong, Petr Motlícek, Hervé Bourlard, Daniel Povey:
Lattice-Free Maximum Mutual Information Training of Multilingual Speech Recognition Systems. INTERSPEECH 2020: 4746-4750 - [c130]Ruizhe Huang, Ke Li, Ashish Arora, Daniel Povey, Sanjeev Khudanpur:
Efficient MDI Adaptation for n-Gram Language Models. INTERSPEECH 2020: 4916-4920 - [i11]Zili Huang, Shinji Watanabe, Yusuke Fujita, Paola García, Yiwen Shao, Daniel Povey, Sanjeev Khudanpur:
Speaker Diarization with Region Proposal Network. CoRR abs/2002.06220 (2020) - [i10]Yiming Wang, Hang Lv, Daniel Povey, Lei Xie, Sanjeev Khudanpur:
Wake Word Detection with Alignment-Free Lattice-Free MMI. CoRR abs/2005.08347 (2020) - [i9]Yiwen Shao, Yiming Wang, Daniel Povey, Sanjeev Khudanpur:
PyChain: A Fully Parallelized PyTorch Implementation of LF-MMI for End-to-End ASR. CoRR abs/2005.09824 (2020) - [i8]Kyu Jeong Han, Jing Pan, Venkata Krishna Naveen Tadala, Tao Ma, Dan Povey:
Multistream CNN for Robust Acoustic Modeling. CoRR abs/2005.10470 (2020) - [i7]Ruizhe Huang, Ke Li, Ashish Arora, Daniel Povey, Sanjeev Khudanpur:
Efficient MDI Adaptation for n-gram Language Models. CoRR abs/2008.02385 (2020) - [i6]Desh Raj, Leibny Paola García-Perera, Zili Huang, Shinji Watanabe, Daniel Povey, Andreas Stolcke, Sanjeev Khudanpur:
DOVER-Lap: A Method for Combining Overlap-aware Diarization Outputs. CoRR abs/2011.01997 (2020) - [i5]Desh Raj, Jesús Villalba, Daniel Povey, Sanjeev Khudanpur:
Frustratingly Easy Noise-aware Training of Acoustic Models. CoRR abs/2011.02090 (2020)
2010 – 2019
- 2019
- [c129]Zhehuai Chen, Mahsa Yarmohammadi, Hainan Xu, Hang Lv, Lei Xie, Daniel Povey, Sanjeev Khudanpur:
Incremental Lattice Determinization for WFST Decoders. ASRU 2019: 1-7 - [c128]Desh Raj, David Snyder, Daniel Povey, Sanjeev Khudanpur:
Probing the Information Encoded in X-Vectors. ASRU 2019: 726-733 - [c127]David Snyder, Daniel Garcia-Romero, Gregory Sell, Alan McCree, Daniel Povey, Sanjeev Khudanpur:
Speaker Recognition for Multi-speaker Conversations Using X-vectors. ICASSP 2019: 5796-5800 - [c126]Chun-Chieh Chang, Ashish Arora, Leibny Paola García-Perera, David Etter, Daniel Povey, Sanjeev Khudanpur:
Optical Character Recognition with Chinese and Korean Character Decomposition. WML@ICDAR 2019: 134-139 - [c125]Ashish Arora, Paola García, Shinji Watanabe, Vimal Manohar, Yiwen Shao, Sanjeev Khudanpur, Chun-Chieh Chang, Babak Rekabdar, Bagher BabaAli, Daniel Povey, David Etter, Desh Raj, Hossein Hadian, Jan Trmal:
Using ASR Methods for OCR. ICDAR 2019: 663-668 - [c124]Fei Wu, Leibny Paola García-Perera, Daniel Povey, Sanjeev Khudanpur:
Advances in Automatic Speech Recognition for Child Speech Using Factored Time Delay Neural Network. INTERSPEECH 2019: 1-5 - [c123]Jiamin Xie, Leibny Paola García-Perera, Daniel Povey, Sanjeev Khudanpur:
Multi-PLDA Diarization on Children's Speech. INTERSPEECH 2019: 376-380 - [c122]Jesús Villalba, Nanxin Chen, David Snyder, Daniel Garcia-Romero, Alan McCree, Gregory Sell, Jonas Borgstrom, Fred Richardson, Suwon Shon, François Grondin, Réda Dehak, Leibny Paola García-Perera, Daniel Povey, Pedro A. Torres-Carrasquillo, Sanjeev Khudanpur, Najim Dehak:
State-of-the-Art Speaker Recognition for Telephone and Video Speech: The JHU-MIT Submission for NIST SRE18. INTERSPEECH 2019: 1488-1492 - [c121]Daniel Garcia-Romero, David Snyder, Gregory Sell, Alan McCree, Daniel Povey, Sanjeev Khudanpur:
x-Vector DNN Refinement with Full-Length Recordings for Speaker Recognition. INTERSPEECH 2019: 1493-1496 - [c120]Daniel Garcia-Romero, David Snyder, Shinji Watanabe, Gregory Sell, Alan McCree, Daniel Povey, Sanjeev Khudanpur:
Speaker Recognition Benchmark Using the CHiME-5 Corpus. INTERSPEECH 2019: 1506-1510 - [c119]David Snyder, Jesús Villalba, Nanxin Chen, Daniel Povey, Gregory Sell, Najim Dehak, Sanjeev Khudanpur:
The JHU Speaker Recognition System for the VOiCES 2019 Challenge. INTERSPEECH 2019: 2468-2472 - [c118]Yiming Wang, David Snyder, Hainan Xu, Vimal Manohar, Phani Sankar Nidadavolu, Daniel Povey, Sanjeev Khudanpur:
The JHU ASR System for VOiCES from a Distance Challenge 2019. INTERSPEECH 2019: 2488-2492 - [c117]Mousmita Sarma, Pegah Ghahremani, Daniel Povey, Nagendra Kumar Goel, Kandarpa Kumar Sarma, Najim Dehak:
Improving Emotion Identification Using Phone Posteriors in Raw Speech Waveform Based DNN. INTERSPEECH 2019: 3925-3929 - [c116]Mahsa Yarmohammadi, Xutai Ma, Sorami Hisamoto, Muhammad Rahman, Yiming Wang, Hainan Xu, Daniel Povey, Philipp Koehn, Kevin Duh:
Robust Document Representations for Cross-Lingual Information Retrieval in Low-Resource Settings. MTSummit (1) 2019: 12-20 - [i4]Desh Raj, David Snyder, Daniel Povey, Sanjeev Khudanpur:
Probing the Information Encoded in x-vectors. CoRR abs/1909.06351 (2019) - 2018
- [j10]Vijayaditya Peddinti, Yiming Wang, Daniel Povey, Sanjeev Khudanpur:
Low Latency Acoustic Modeling Using Temporal Convolution and LSTMs. IEEE Signal Process. Lett. 25(3): 373-377 (2018) - [j9]Hossein Hadian, Hossein Sameti, Daniel Povey, Sanjeev Khudanpur:
Flat-Start Single-Stage Discriminatively Trained HMM-Based Models for ASR. IEEE ACM Trans. Audio Speech Lang. Process. 26(11): 1949-1961 (2018) - [c115]Zili Huang, L. Paola García-Perera, Jesús Villalba, Daniel Povey, Najim Dehak:
JHU Diarization System Description. IberSPEECH 2018: 236-239 - [c114]Vimal Manohar, Hossein Hadian, Daniel Povey, Sanjeev Khudanpur:
Semi-Supervised Training of Acoustic Models Using Lattice-Free MMI. ICASSP 2018: 4844-4848 - [c113]David Snyder, Daniel Garcia-Romero, Gregory Sell, Daniel Povey, Sanjeev Khudanpur:
X-Vectors: Robust DNN Embeddings for Speaker Recognition. ICASSP 2018: 5329-5333 - [c112]Daniel Povey, Hossein Hadian, Pegah Ghahremani, Ke Li, Sanjeev Khudanpur:
A Time-Restricted Self-Attention Layer for ASR. ICASSP 2018: 5874-5878 - [c111]Hainan Xu, Tongfei Chen, Dongji Gao, Yiming Wang, Ke Li, Nagendra Goel, Yishay Carmiel, Daniel Povey, Sanjeev Khudanpur:
A Pruned RNNLM Lattice-Rescoring Algorithm for Automatic Speech Recognition. ICASSP 2018: 5929-5933 - [c110]Hainan Xu, Ke Li, Yiming Wang, Jian Wang, Shiyin Kang, Xie Chen, Daniel Povey, Sanjeev Khudanpur:
Neural Network Language Modeling with Letter-Based Features and Importance Sampling. ICASSP 2018: 6109-6113 - [c109]Hossein Hadian, Hossein Sameti, Daniel Povey, Sanjeev Khudanpur:
End-to-end Speech Recognition Using Lattice-free MMI. INTERSPEECH 2018: 12-16 - [c108]Pegah Ghahremani, Phani Sankar Nidadavolu, Nanxin Chen, Jesús Villalba, Daniel Povey, Sanjeev Khudanpur, Najim Dehak:
End-to-end Deep Neural Network Age Estimation. INTERSPEECH 2018: 277-281 - [c107]Pegah Ghahremani, Hossein Hadian, Hang Lv, Daniel Povey, Sanjeev Khudanpur:
Acoustic Modeling from Frequency Domain Representations of Speech. INTERSPEECH 2018: 1596-1600 - [c106]Gaofeng Cheng, Daniel Povey, Lu Huang, Ji Xu, Sanjeev Khudanpur, Yonghong Yan:
Output-Gate Projected Gated Recurrent Unit for Speech Recognition. INTERSPEECH 2018: 1793-1797 - [c105]Zhehuai Chen, Justin Luitjens, Hainan Xu, Yiming Wang, Daniel Povey, Sanjeev Khudanpur:
A GPU-based WFST Decoder with Exact Lattice Generation. INTERSPEECH 2018: 2212-2216 - [c104]Gregory Sell, David Snyder, Alan McCree, Daniel Garcia-Romero, Jesús Villalba, Matthew Maciejewski, Vimal Manohar, Najim Dehak, Daniel Povey, Shinji Watanabe, Sanjeev Khudanpur:
Diarization is Hard: Some Experiences and Lessons Learned for the JHU Team in the Inaugural DIHARD Challenge. INTERSPEECH 2018: 2808-2812 - [c103]Mousmita Sarma, Pegah Ghahremani, Daniel Povey, Nagendra Kumar Goel, Kandarpa Kumar Sarma, Najim Dehak:
Emotion Identification from Raw Speech Signals Using DNNs. INTERSPEECH 2018: 3097-3101 - [c102]Ke Li, Hainan Xu, Yiming Wang, Daniel Povey, Sanjeev Khudanpur:
Recurrent Neural Network Language Model Adaptation for Conversational Speech Recognition. INTERSPEECH 2018: 3373-3377 - [c101]Yingke Zhu, Tom Ko, David Snyder, Brian Mak, Daniel Povey:
Self-Attentive Speaker Embeddings for Text-Independent Speaker Verification. INTERSPEECH 2018: 3573-3577 - [c100]Daniel Povey, Gaofeng Cheng, Yiming Wang, Ke Li, Hainan Xu, Mahsa Yarmohammadi, Sanjeev Khudanpur:
Semi-Orthogonal Low-Rank Matrix Factorization for Deep Neural Networks. INTERSPEECH 2018: 3743-3747 - [c99]David Snyder, Daniel Garcia-Romero, Alan McCree, Gregory Sell, Daniel Povey, Sanjeev Khudanpur:
Spoken Language Recognition using X-vectors. Odyssey 2018: 105-111 - [c98]Hossein Hadian, Daniel Povey, Hossein Sameti, Jan Trmal, Sanjeev Khudanpur:
Improving LF-MMI Using Unconstrained Supervisions for ASR. SLT 2018: 43-47 - [c97]Vimal Manohar, Pegah Ghahremani, Daniel Povey, Sanjeev Khudanpur:
A Teacher-Student Learning Approach for Unsupervised Domain Adaptation of Sequence-Trained ASR Models. SLT 2018: 250-257 - [i3]Zhehuai Chen, Justin Luitjens, Hainan Xu, Yiming Wang, Daniel Povey, Sanjeev Khudanpur:
A GPU-based WFST Decoder with Exact Lattice Generation. CoRR abs/1804.03243 (2018) - 2017
- [c96]Pegah Ghahremani, Vimal Manohar, Hossein Hadian, Daniel Povey, Sanjeev Khudanpur:
Investigation of transfer learning for ASR using LF-MMI trained neural networks. ASRU 2017: 279-286 - [c95]Vimal Manohar, Daniel Povey, Sanjeev Khudanpur:
JHU Kaldi system for Arabic MGB-3 ASR challenge using diarization, audio-transcript alignment and transfer learning. ASRU 2017: 346-352 - [c94]Daniel Garcia-Romero, David Snyder, Gregory Sell, Daniel Povey, Alan McCree:
Speaker diarization using deep neural network embeddings. ICASSP 2017: 4930-4934 - [c93]Tom Ko, Vijayaditya Peddinti, Daniel Povey, Michael L. Seltzer, Sanjeev Khudanpur:
A study on data augmentation of reverberant speech for robust speech recognition. ICASSP 2017: 5220-5224 - [c92]Hossein Hadian, Daniel Povey, Hossein Sameti, Sanjeev Khudanpur:
Phone Duration Modeling for LVCSR Using Neural Networks. INTERSPEECH 2017: 518-522 - [c91]David Snyder, Daniel Garcia-Romero, Daniel Povey, Sanjeev Khudanpur:
Deep Neural Network Embeddings for Text-Independent Speaker Verification. INTERSPEECH 2017: 999-1003 - [c90]Gaofeng Cheng, Vijayaditya Peddinti, Daniel Povey, Vimal Manohar, Sanjeev Khudanpur, Yonghong Yan:
An Exploration of Dropout with LSTMs. INTERSPEECH 2017: 1586-1590 - [c89]Yiming Wang, Vijayaditya Peddinti, Hainan Xu, Xiaohui Zhang, Daniel Povey, Sanjeev Khudanpur:
Backstitch: Counteracting Finite-Sample Bias via Negative Steps. INTERSPEECH 2017: 1631-1635 - [c88]Xiaohui Zhang, Vimal Manohar, Daniel Povey, Sanjeev Khudanpur:
Acoustic Data-Driven Lexicon Learning Based on a Greedy Pronunciation Selection Framework. INTERSPEECH 2017: 2541-2545 - [c87]Jan Trmal, Matthew Wiesner, Vijayaditya Peddinti, Xiaohui Zhang, Pegah Ghahremani, Yiming Wang, Vimal Manohar, Hainan Xu, Daniel Povey, Sanjeev Khudanpur:
The Kaldi OpenKWS System: Improving Low Resource Keyword Search. INTERSPEECH 2017: 3597-3601 - [i2]Xiaohui Zhang, Vimal Manohar, Daniel Povey, Sanjeev Khudanpur:
Acoustic data-driven lexicon learning based on a greedy pronunciation selection framework. CoRR abs/1706.03747 (2017) - 2016
- [c86]Guoguo Chen, Daniel Povey, Sanjeev Khudanpur:
Acoustic data-driven pronunciation lexicon generation for logographic languages. ICASSP 2016: 5350-5354 - [c85]Vijayaditya Peddinti, Vimal Manohar, Yiming Wang, Daniel Povey, Sanjeev Khudanpur:
Far-Field ASR Without Parallel Data. INTERSPEECH 2016: 1996-2000 - [c84]Daniel Povey, Vijayaditya Peddinti, Daniel Galvez, Pegah Ghahremani, Vimal Manohar, Xingyu Na, Yiming Wang, Sanjeev Khudanpur:
Purely Sequence-Trained Neural Networks for ASR Based on Lattice-Free MMI. INTERSPEECH 2016: 2751-2755 - [c83]Pegah Ghahremani, Vimal Manohar, Daniel Povey, Sanjeev Khudanpur:
Acoustic Modelling from the Signal Domain Using CNNs. INTERSPEECH 2016: 3434-3438 - [c82]David Snyder, Pegah Ghahremani, Daniel Povey, Daniel Garcia-Romero, Yishay Carmiel, Sanjeev Khudanpur:
Deep neural network-based speaker embeddings for end-to-end speaker verification. SLT 2016: 165-170 - 2015
- [c81]David Snyder, Daniel Garcia-Romero, Daniel Povey:
Time delay deep neural network-based universal background models for speaker recognition. ASRU 2015: 92-97 - [c80]Vijayaditya Peddinti, Guoguo Chen, Vimal Manohar, Tom Ko, Daniel Povey, Sanjeev Khudanpur:
JHU ASpIRE system: Robust LVCSR with TDNNs, iVector adaptation and RNN-LMs. ASRU 2015: 539-546 - [c79]Gaurav Kumar, Graeme W. Blackwood, Jan Trmal, Daniel Povey, Sanjeev Khudanpur:
A Coarse-Grained Model for Optimal Coupling of ASR and SMT Systems for Speech Translation. EMNLP 2015: 1902-1907 - [c78]Vassil Panayotov, Guoguo Chen, Daniel Povey, Sanjeev Khudanpur:
Librispeech: An ASR corpus based on public domain audio books. ICASSP 2015: 5206-5210 - [c77]Guoguo Chen, Hainan Xu, Minhua Wu, Daniel Povey, Sanjeev Khudanpur:
Pronunciation and silence probability modeling for ASR. INTERSPEECH 2015: 533-537 - [c76]Hainan Xu, Guoguo Chen, Daniel Povey, Sanjeev Khudanpur:
Modeling phonetic context with non-random forests for speech recognition. INTERSPEECH 2015: 2117-2121 - [c75]Vijayaditya Peddinti, Guoguo Chen, Daniel Povey, Sanjeev Khudanpur:
Reverberation robust acoustic modeling using i-vectors with time delay neural networks. INTERSPEECH 2015: 2440-2444 - [c74]Vimal Manohar, Daniel Povey, Sanjeev Khudanpur:
Semi-supervised maximum mutual information training of deep neural network acoustic models. INTERSPEECH 2015: 2630-2634 - [c73]Vijayaditya Peddinti, Daniel Povey, Sanjeev Khudanpur:
A time delay neural network architecture for efficient modeling of long temporal contexts. INTERSPEECH 2015: 3214-3218 - [c72]Tom Ko, Vijayaditya Peddinti, Daniel Povey, Sanjeev Khudanpur:
Audio augmentation for speech recognition. INTERSPEECH 2015: 3586-3589 - [c71]Xiaohui Zhang, Daniel Povey, Sanjeev Khudanpur:
A diversity-penalizing ensemble training method for deep learning. INTERSPEECH 2015: 3590-3594 - [c70]Daniel Povey, Xiaohui Zhang, Sanjeev Khudanpur:
Parallel training of Deep Neural Networks with Natural Gradient and Parameter Averaging. ICLR (Workshop) 2015 - [i1]David Snyder, Guoguo Chen, Daniel Povey:
MUSAN: A Music, Speech, and Noise Corpus. CoRR abs/1510.08484 (2015) - 2014
- [c69]Xiaohui Zhang, Jan Trmal, Daniel Povey, Sanjeev Khudanpur:
Improving deep neural network acoustic models using generalized maxout networks. ICASSP 2014: 215-219 - [c68]Pegah Ghahremani, Bagher BabaAli, Daniel Povey, Korbinian Riedhammer, Jan Trmal, Sanjeev Khudanpur:
A pitch extraction algorithm tuned for automatic speech recognition. ICASSP 2014: 2494-2498 - [c67]Gaurav Kumar, Matt Post, Daniel Povey, Sanjeev Khudanpur:
Some insights from translating conversational telephone speech. ICASSP 2014: 3231-3235 - [c66]Ngoc Thang Vu, David Imseng, Daniel Povey, Petr Motlícek, Tanja Schultz, Hervé Bourlard:
Multilingual deep neural network based acoustic modeling for rapid language adaptation. ICASSP 2014: 7639-7643 - [c65]David Nolden, Hagen Soltau, Daniel Povey, Pegah Ghahremani, Lidia Mangu, Hermann Ney:
Removing redundancy from lattices. INTERSPEECH 2014: 656-660 - [c64]Justin T. Chiu, Yun Wang, Jan Trmal, Daniel Povey, Guoguo Chen, Alexander I. Rudnicky:
Combination of FST and CN search in spoken term detection. INTERSPEECH 2014: 2784-2788 - [c63]Gaurav Kumar, Yuan Cao, Ryan Cotterell, Chris Callison-Burch, Daniel Povey, Sanjeev Khudanpur:
Translations of the Callhome Egyptian Arabic corpus for conversational speech translation. IWSLT 2014 - [c62]Daniel Garcia-Romero, Xiaohui Zhang, Alan McCree, Daniel Povey:
Improving speaker recognition performance in the domain adaptation challenge using deep neural networks. SLT 2014: 378-383 - [c61]Jan Trmal, Guoguo Chen, Daniel Povey, Sanjeev Khudanpur, Pegah Ghahremani, Xiaohui Zhang, Vimal Manohar, Chunxi Liu, Aren Jansen, Dietrich Klakow, David Yarowsky, Florian Metze:
A keyword search system using open source software. SLT 2014: 530-535 - 2013
- [c60]Guoguo Chen, Oguz Yilmaz, Jan Trmal, Daniel Povey, Sanjeev Khudanpur:
Using proxies for OOV keywords in the keyword search task. ASRU 2013: 416-421 - [c59]Mirko Hannemann, Daniel Povey, Geoffrey Zweig:
Combining forward and backward search in decoding. ICASSP 2013: 6739-6743 - [c58]Petr Motlícek, Daniel Povey, Martin Karafiát:
Feature and score level combination of subspace Gaussians in LVCSR task. ICASSP 2013: 7604-7608 - [c57]Guoguo Chen, Sanjeev Khudanpur, Daniel Povey, Jan Trmal, David Yarowsky, Oguz Yilmaz:
Quantifying the value of pronunciation lexicons for keyword search in low resource languages. ICASSP 2013: 8560-8564 - [c56]Shakti P. Rath, Daniel Povey, Karel Veselý, Jan Cernocký:
Improved feature processing for deep neural networks. INTERSPEECH 2013: 109-113 - [c55]Karel Veselý, Arnab Ghoshal, Lukás Burget, Daniel Povey:
Sequence-discriminative training of deep neural networks. INTERSPEECH 2013: 2345-2349 - 2012
- [j8]Daniel Povey, Kaisheng Yao:
A basis representation of constrained MLLR transforms for robust adaptation. Comput. Speech Lang. 26(1): 35-51 (2012) - [c54]Oriol Vinyals, Suman V. Ravuri, Daniel Povey:
Revisiting Recurrent Neural Networks for robust ASR. ICASSP 2012: 4085-4088 - [c53]Daniel Povey, Mirko Hannemann, Gilles Boulianne, Lukás Burget, Arnab Ghoshal, Milos Janda, Martin Karafiát, Stefan Kombrink, Petr Motlícek, Yanmin Qian, Korbinian Riedhammer, Karel Veselý, Ngoc Thang Vu:
Generating exact lattices in the WFST framework. ICASSP 2012: 4213-4216 - [c52]Ngoc Thang Vu, Tanja Schultz, Daniel Povey:
Modeling gender dependency in the Subspace GMM framework. ICASSP 2012: 4345-4348 - [c51]Korbinian Riedhammer, Tobias Bocklet, Arnab Ghoshal, Daniel Povey:
Revisiting semi-continuous hidden Markov models. ICASSP 2012: 4721-4724 - [c50]Chao Weng, Biing-Hwang Juang, Daniel Povey:
Discriminative Training Using Non-uniform Criteria for Keyword Spotting on Spontaneous Speech. INTERSPEECH 2012: 559-562 - [c49]Oriol Vinyals, Daniel Povey:
Krylov Subspace Descent for Deep Learning. AISTATS 2012: 1261-1268 - 2011
- [j7]Daniel Povey, Lukás Burget, Mohit Agarwal, Pinar Akyazi, Kai Feng, Arnab Ghoshal, Ondrej Glembek, Nagendra K. Goel, Martin Karafiát, Ariya Rastrow, Richard C. Rose, Petr Schwarz, Samuel Thomas:
The subspace Gaussian mixture model - A structured model for speech recognition. Comput. Speech Lang. 25(2): 404-439 (2011) - [j6]Haihua Xu, Daniel Povey, Lidia Mangu, Jie Zhu:
Minimum Bayes Risk decoding and system combination based on a recursion for edit distance. Comput. Speech Lang. 25(4): 802-828 (2011) - [c48]Daniel Povey, Geoffrey Zweig, Alex Acero:
Speaker adaptation with an Exponential Transform. ASRU 2011: 158-163 - [c47]Tomás Mikolov, Anoop Deoras, Daniel Povey, Lukás Burget, Jan Cernocký:
Strategies for training large scale neural network language models. ASRU 2011: 196-201 - [c46]Yanmin Qian, Ji Xu, Daniel Povey, Jia Liu:
Strategies for using MLP based features with limited target-language training data. ASRU 2011: 354-358 - [c45]Daniel Povey, Kaisheng Yao:
A basis method for robust estimation of constrained MLLR. ICASSP 2011: 4460-4463 - [c44]Daniel Povey, Martin Karafiát, Arnab Ghoshal, Petr Schwarz:
A symmetrization of the Subspace Gaussian Mixture Model. ICASSP 2011: 4504-4507 - [c43]Yanmin Qian, Daniel Povey, Jia Liu:
State-Level Data Borrowing for Low-Resource Speech Recognition Based on Subspace GMMs. INTERSPEECH 2011: 553-560 - 2010
- [c42]Stephen M. Chu, Daniel Povey:
Speaking rate adaptation using continuous frame rate normalization. ICASSP 2010: 4306-4309 - [c41]Arnab Ghoshal, Daniel Povey, Mohit Agarwal, Pinar Akyazi, Lukás Burget, Kai Feng, Ondrej Glembek, Nagendra Goel, Martin Karafiát, Ariya Rastrow, Richard C. Rose, Petr Schwarz, Samuel Thomas:
A novel estimation of feature-space MLLR for full-covariance models. ICASSP 2010: 4310-4313 - [c40]Daniel Povey, Lukás Burget, Mohit Agarwal, Pinar Akyazi, Kai Feng, Arnab Ghoshal, Ondrej Glembek, Nagendra K. Goel, Martin Karafiát, Ariya Rastrow, Richard C. Rose, Petr Schwarz, Samuel Thomas:
Subspace Gaussian Mixture Models for speech recognition. ICASSP 2010: 4330-4333 - [c39]Lukás Burget, Petr Schwarz, Mohit Agarwal, Pinar Akyazi, Kai Feng, Arnab Ghoshal, Ondrej Glembek, Nagendra K. Goel, Martin Karafiát, Daniel Povey, Ariya Rastrow, Richard C. Rose, Samuel Thomas:
Multilingual acoustic modeling for speech recognition based on subspace Gaussian Mixture Models. ICASSP 2010: 4334-4337 - [c38]Stephen M. Chu, Daniel Povey, Hong-Kwang Kuo, Lidia Mangu, Shilei Zhang, Qin Shi, Yong Qin:
The 2009 IBM GALE Mandarin broadcast transcription system. ICASSP 2010: 4374-4377 - [c37]George Saon, Hagen Soltau, Upendra V. Chaudhari, Stephen M. Chu, Brian Kingsbury, Hong-Kwang Kuo, Lidia Mangu, Daniel Povey:
The IBM 2008 GALE Arabic speech transcription system. ICASSP 2010: 4378-4381 - [c36]Haihua Xu, Daniel Povey, Lidia Mangu, Jie Zhu:
An improved consensus-like method for Minimum Bayes Risk decoding and lattice combination. ICASSP 2010: 4938-4941 - [c35]Nagendra Goel, Samuel Thomas, Mohit Agarwal, Pinar Akyazi, Lukás Burget, Kai Feng, Arnab Ghoshal, Ondrej Glembek, Martin Karafiát, Daniel Povey, Ariya Rastrow, Richard C. Rose, Petr Schwarz:
Approaches to automatic lexicon learning with limited training examples. ICASSP 2010: 5094-5097
2000 – 2009
- 2009
- [j5]Hagen Soltau, George Saon, Brian Kingsbury, Hong-Kwang Jeff Kuo, Lidia Mangu, Daniel Povey, Ahmad Emami:
Advances in Arabic Speech Transcription at IBM Under the DARPA GALE Program. IEEE Trans. Speech Audio Process. 17(5): 884-894 (2009) - [c34]George Saon, Daniel Povey, Hagen Soltau:
Large margin semi-tied covariance transforms for discriminative training. ICASSP 2009: 3753-3756 - [c33]Haihua Xu, Daniel Povey, Jie Zhu, Guanyong Wu:
Minimum hypothesis phone error as a decoding method for speech recognition. INTERSPEECH 2009: 76-79 - 2008
- [c32]Daniel Povey, Dimitri Kanevsky, Brian Kingsbury, Bhuvana Ramabhadran, George Saon, Karthik Visweswariah:
Boosted MMI for model and feature-space discriminative training. ICASSP 2008: 4057-4060 - [c31]Balakrishnan Varadarajan, Daniel Povey, Selina M. Chu:
Quick fmllr for speaker adaptation in speech recognition. ICASSP 2008: 4297-4300 - [c30]Daniel Povey, Selina M. Chu, Balakrishnan Varadarajan:
Universal background model based speech recognition. ICASSP 2008: 4561-4564 - [c29]George Saon, Daniel Povey:
Penalty function maximization for large margin HMM training. INTERSPEECH 2008: 920-923 - [c28]Daniel Povey, Hong-Kwang Jeff Kuo, Hagen Soltau:
Fast speaker adaptive training for speech recognition. INTERSPEECH 2008: 1245-1248 - [c27]Daniel Povey, Brian Kingsbury:
Monte Carlo model-space noise adaptation for speech recognition. INTERSPEECH 2008: 1281-1284 - [c26]Daniel Povey, Hong-Kwang Jeff Kuo:
XMLLR for improved speaker adaptation in speech recognition. INTERSPEECH 2008: 1705-1708 - 2007
- [c25]Daniel Povey, Brian Kingsbury:
Evaluation of Proposed Modifications to MPE for Large Scale Discriminative Training. ICASSP (4) 2007: 321-324 - [c24]Hagen Soltau, George Saon, Brian Kingsbury, Hong-Kwang Jeff Kuo, Lidia Mangu, Daniel Povey, Geoffrey Zweig:
The IBM 2006 Gale Arabic ASR System. ICASSP (4) 2007: 349-352 - [c23]Ruhi Sarikaya, Bowen Zhou, Daniel Povey, Mohamed Afify, Yuqing Gao:
The Impact of ASR on Speech-to-Speech Translation Performance. ICASSP (4) 2007: 1289-1292 - 2006
- [j4]Thomas Hain, Philip C. Woodland, Gunnar Evermann, Mark J. F. Gales, Xunying Liu, Gareth L. Moore, Daniel Povey, Lan Wang:
Corrections to "Automatic Transcription of Conversational Telephone Speech". IEEE Trans. Speech Audio Process. 14(2): 727-727 (2006) - [j3]Stanley F. Chen, Brian Kingsbury, Lidia Mangu, Daniel Povey, George Saon, Hagen Soltau, Geoffrey Zweig:
Advances in speech transcription at IBM under the DARPA EARS program. IEEE Trans. Speech Audio Process. 14(5): 1596-1608 (2006) - [c22]Jason W. Pelecanos, Daniel Povey, Ganesh N. Ramaswamy:
Secondary Classification for GMM Based Speaker Recognition. ICASSP (1) 2006: 109-112 - [c21]Geoffrey Zweig, Olivier Siohan, George Saon, Bhuvana Ramabhadran, Daniel Povey, Lidia Mangu, Brian Kingsbury:
Automated Quality Monitoring in the Call Center with ASR and Maximum Entropy. ICASSP (1) 2006: 589-592 - [c20]Ghinwa F. Choueiter, Daniel Povey, Stanley F. Chen, Geoffrey Zweig:
Morpheme-Based Language Modeling for Arabic LVCSR. ICASSP (1) 2006: 1053-1056 - [c19]Daniel Povey:
SPAM and full covariance for speech recognition. INTERSPEECH 2006 - [c18]Daniel Povey, George Saon:
Feature and model space speaker adaptation with full covariance Gaussians. INTERSPEECH 2006 - [c17]Jing Huang, Martin Westphal, Stanley F. Chen, Olivier Siohan, Daniel Povey, Vit Libal, Alvaro Soneiro, Henrik Schulz, Thomas Ross, Gerasimos Potamianos:
The IBM Rich Transcription Spring 2006 Speech-to-Text System for Lecture Meetings. MLMI 2006: 432-443 - [c16]Geoffrey Zweig, Olivier Siohan, George Saon, Bhuvana Ramabhadran, Daniel Povey, Lidia Mangu, Brian Kingsbury:
Automated Quality Monitoring for Call Centers using Speech and NLP Technologies. HLT-NAACL 2006 - 2005
- [j2]Thomas Hain, Philip C. Woodland, Gunnar Evermann, Mark J. F. Gales, Xunying Liu, Gareth L. Moore, Daniel Povey, Lan Wang:
Automatic transcription of conversational telephone speech. IEEE Trans. Speech Audio Process. 13(6): 1173-1185 (2005) - [c15]Hagen Soltau, Brian Kingsbury, Lidia Mangu, Daniel Povey, George Saon, Geoffrey Zweig:
The IBM 2004 Conversational Telephony System for Rich Transcription. ICASSP (1) 2005: 205-208 - [c14]Daniel Povey, Brian Kingsbury, Lidia Mangu, George Saon, Hagen Soltau, Geoffrey Zweig:
fMPE: Discriminatively Trained Features for Speech Recognition. ICASSP (1) 2005: 961-964 - [c13]George Saon, Daniel Povey, Geoffrey Zweig:
Anatomy of an extremely fast LVCSR decoder. INTERSPEECH 2005: 549-552 - [c12]Jing Huang, Daniel Povey:
Discriminatively trained features using fMPE for multi-stream audio-visual speech recognition. INTERSPEECH 2005: 777-780 - [c11]Daniel Povey:
Improvements to fMPE for discriminative training of features. INTERSPEECH 2005: 2977-2980 - 2004
- [c10]George Saon, Satya Dharanipragada, Daniel Povey:
Feature space Gaussianization. ICASSP (1) 2004: 329-332 - [c9]Daniel Povey:
Phone duration modeling for LVCSR. ICASSP (1) 2004: 829-832 - 2003
- [c8]Daniel Povey, Philip C. Woodland, Mark J. F. Gales:
Discriminative map for acoustic model adaptation. ICASSP (1) 2003: 312-315 - [c7]Mark J. F. Gales, Yuan Dong, Daniel Povey, Philip C. Woodland:
Porting: SwitchBoard to the VoiceMail task. ICASSP (1) 2003: 536-539 - [c6]Roongroj Nopsuwanchai, Daniel Povey:
Discriminative Training for HMM-Based Offline Handwritten Character Recognition. ICDAR 2003: 114-118 - [c5]Daniel Povey, Mark J. F. Gales, Do Yeong Kim, Philip C. Woodland:
MMI-MAP and MPE-MAP for acoustic model adaptation. INTERSPEECH 2003: 1981-1984 - 2002
- [j1]Philip C. Woodland, Daniel Povey:
Large scale discriminative training of hidden Markov models for speech recognition. Comput. Speech Lang. 16(1): 25-47 (2002) - [c4]Daniel Povey, Philip C. Woodland:
Minimum Phone Error and I-smoothing for improved discriminative training. ICASSP 2002: 105-108 - 2001
- [c3]Daniel Povey, Philip C. Woodland:
Improved discriminative training techniques for large vocabulary continuous speech recognition. ICASSP 2001: 45-48 - [c2]Thomas Hain, Philip C. Woodland, Gunnar Evermann, Daniel Povey:
New features in the CU-HTK system for transcription of conversational telephone speech. ICASSP 2001: 57-60
1990 – 1999
- 1999
- [c1]Daniel Povey, Philip C. Woodland:
Frame discrimination training for HMMs for large vocabulary speech recognition. ICASSP 1999: 333-336
last updated on 2024-12-03 21:20 CET by the dblp team
all metadata released as open data under CC0 1.0 license