Zoltán Tüske
2020 – today
- 2024
- [c60] Sadeen Alharbi, Areeb Alowisheq, Zoltán Tüske, Kareem Darwish, Abdullah Alrajeh, Abdulmajeed Alrowithi, Aljawharah Bin Tamran, Asma Ibrahim, Raghad Aloraini, Raneem Alnajim, Ranya Alkahtani, Renad Almuasaad, Sara Alrasheed, Shaykhah Alsubaie, Yaser Alonaizan:
SADA: Saudi Audio Dataset for Arabic. ICASSP 2024: 10286-10290
- [i15] Jintao Jiang, Yingbo Gao, Mohammad Zeineldeen, Zoltán Tüske:
Alternating Weak Triphone/BPE Alignment Supervision from Hybrid Model Improves End-to-End ASR. CoRR abs/2402.15594 (2024)
- 2023
- [c59] Parnia Bahar, Patrick Wilken, Javier Iranzo-Sánchez, Mattia Di Gangi, Evgeny Matusov, Zoltán Tüske:
Speech Translation with Style: AppTek's Submissions to the IWSLT Subtitling and Formality Tracks in 2023. IWSLT@ACL 2023: 251-260
- [i14] Jintao Jiang, Yingbo Gao, Zoltán Tüske:
Weak Alignment Supervision from Hybrid Model Improves End-to-end ASR. CoRR abs/2311.14835 (2023)
- 2022
- [c58] Hong-Kwang Jeff Kuo, Zoltán Tüske, Samuel Thomas, Brian Kingsbury, George Saon:
Improving End-to-end Models for Set Prediction in Spoken Language Understanding. ICASSP 2022: 7162-7166
- [i13] Hong-Kwang Jeff Kuo, Zoltán Tüske, Samuel Thomas, Brian Kingsbury, George Saon:
Improving End-to-End Models for Set Prediction in Spoken Language Understanding. CoRR abs/2201.12105 (2022)
- 2021
- [c57] George Saon, Zoltán Tüske, Daniel Bolaños, Brian Kingsbury:
Advancing RNN Transducer Technology for Speech Recognition. ICASSP 2021: 5654-5658
- [c56] Edmilson da Silva Morais, Hong-Kwang Jeff Kuo, Samuel Thomas, Zoltán Tüske, Brian Kingsbury:
End-to-End Spoken Language Understanding Using Transformer Networks and Self-Supervised Pre-Trained Features. ICASSP 2021: 7483-7487
- [c55] Samuel Thomas, Hong-Kwang Jeff Kuo, George Saon, Zoltán Tüske, Brian Kingsbury, Gakuto Kurata, Zvi Kons, Ron Hoory:
RNN Transducer Models for Spoken Language Understanding. ICASSP 2021: 7493-7497
- [c54] Jatin Ganhotra, Samuel Thomas, Hong-Kwang Jeff Kuo, Sachindra Joshi, George Saon, Zoltán Tüske, Brian Kingsbury:
Integrating Dialog History into End-to-End Spoken Language Understanding Systems. Interspeech 2021: 1254-1258
- [c53] Xiaodong Cui, Brian Kingsbury, George Saon, David Haws, Zoltán Tüske:
Reducing Exposure Bias in Training Recurrent Neural Network Transducers. Interspeech 2021: 1802-1806
- [c52] Gakuto Kurata, George Saon, Brian Kingsbury, David Haws, Zoltán Tüske:
Improving Customization of Neural Transducers by Mitigating Acoustic Mismatch of Synthesized Audio. Interspeech 2021: 2027-2031
- [c51] Zoltán Tüske, George Saon, Brian Kingsbury:
On the Limit of English Conversational Speech Recognition. Interspeech 2021: 2062-2066
- [c50] Andrea Fasoli, Chia-Yu Chen, Mauricio J. Serrano, Xiao Sun, Naigang Wang, Swagath Venkataramani, George Saon, Xiaodong Cui, Brian Kingsbury, Wei Zhang, Zoltán Tüske, Kailash Gopalakrishnan:
4-Bit Quantization of LSTM-Based Speech Recognition Models. Interspeech 2021: 2586-2590
- [i12] George Saon, Zoltán Tüske, Daniel Bolaños, Brian Kingsbury:
Advancing RNN Transducer Technology for Speech Recognition. CoRR abs/2103.09935 (2021)
- [i11] Samuel Thomas, Hong-Kwang Jeff Kuo, George Saon, Zoltán Tüske, Brian Kingsbury, Gakuto Kurata, Zvi Kons, Ron Hoory:
RNN Transducer Models For Spoken Language Understanding. CoRR abs/2104.03842 (2021)
- [i10] Zoltán Tüske, George Saon, Brian Kingsbury:
On the limit of English conversational speech recognition. CoRR abs/2105.00982 (2021)
- [i9] Jatin Ganhotra, Samuel Thomas, Hong-Kwang Jeff Kuo, Sachindra Joshi, George Saon, Zoltán Tüske, Brian Kingsbury:
Integrating Dialog History into End-to-End Spoken Language Understanding Systems. CoRR abs/2108.08405 (2021)
- [i8] Xiaodong Cui, Brian Kingsbury, George Saon, David Haws, Zoltán Tüske:
Reducing Exposure Bias in Training Recurrent Neural Network Transducers. CoRR abs/2108.10803 (2021)
- [i7] Andrea Fasoli, Chia-Yu Chen, Mauricio J. Serrano, Xiao Sun, Naigang Wang, Swagath Venkataramani, George Saon, Xiaodong Cui, Brian Kingsbury, Wei Zhang, Zoltán Tüske, Kailash Gopalakrishnan:
4-bit Quantization of LSTM-based Speech Recognition Models. CoRR abs/2108.12074 (2021)
- 2020
- [b1] Zoltán Tüske:
Discriminative feature modeling for statistical speech recognition. RWTH Aachen University, Germany, 2020
- [c49] George Saon, Zoltán Tüske, Kartik Audhkhasi:
Alignment-Length Synchronous Decoding for RNN Transducer. ICASSP 2020: 7804-7808
- [c48] Zoltán Tüske, George Saon, Kartik Audhkhasi, Brian Kingsbury:
Single Headed Attention Based Sequence-to-Sequence Model for State-of-the-Art Results on Switchboard. INTERSPEECH 2020: 551-555
- [c47] Hong-Kwang Jeff Kuo, Zoltán Tüske, Samuel Thomas, Yinghui Huang, Kartik Audhkhasi, Brian Kingsbury, Gakuto Kurata, Zvi Kons, Ron Hoory, Luis A. Lastras:
End-to-End Spoken Language Understanding Without Full Transcripts. INTERSPEECH 2020: 906-910
- [i6] Zoltán Tüske, George Saon, Kartik Audhkhasi, Brian Kingsbury:
Single headed attention based sequence-to-sequence model for state-of-the-art results on Switchboard-300. CoRR abs/2001.07263 (2020)
- [i5] Hong-Kwang Jeff Kuo, Zoltán Tüske, Samuel Thomas, Yinghui Huang, Kartik Audhkhasi, Brian Kingsbury, Gakuto Kurata, Zvi Kons, Ron Hoory, Luis A. Lastras:
End-to-End Spoken Language Understanding Without Full Transcripts. CoRR abs/2009.14386 (2020)
- [i4] Edmilson da Silva Morais, Hong-Kwang Jeff Kuo, Samuel Thomas, Zoltán Tüske, Brian Kingsbury:
End-to-end spoken language understanding using transformer networks and self-supervised pre-trained features. CoRR abs/2011.08238 (2020)
2010 – 2019
- 2019
- [c46] George Saon, Zoltán Tüske, Kartik Audhkhasi, Brian Kingsbury, Michael Picheny, Samuel Thomas:
Simplified LSTMs for Speech Recognition. ASRU 2019: 547-553
- [c45] Yinghui Huang, Samuel Thomas, Masayuki Suzuki, Zoltán Tüske, Larry Sansone, Michael Picheny:
Semi-Supervised Training and Data Augmentation for Adaptation of Automatic Broadcast News Captioning Systems. ASRU 2019: 867-874
- [c44] George Saon, Zoltán Tüske, Kartik Audhkhasi, Brian Kingsbury:
Sequence Noise Injected Training for End-to-end Speech Recognition. ICASSP 2019: 6261-6265
- [c43] Samuel Thomas, Masayuki Suzuki, Yinghui Huang, Gakuto Kurata, Zoltán Tüske, George Saon, Brian Kingsbury, Michael Picheny, Tom Dibert, Alice Kaiser-Schatzlein, Bern Samko:
English Broadcast News Speech Recognition by Humans and Machines. ICASSP 2019: 6455-6459
- [c42] Michael Picheny, Zoltán Tüske, Brian Kingsbury, Kartik Audhkhasi, Xiaodong Cui, George Saon:
Challenging the Boundaries of Speech Recognition: The MALACH Corpus. INTERSPEECH 2019: 326-330
- [c41] Kartik Audhkhasi, George Saon, Zoltán Tüske, Brian Kingsbury, Michael Picheny:
Forget a Bit to Learn Better: Soft Forgetting for CTC-Based Automatic Speech Recognition. INTERSPEECH 2019: 2618-2622
- [c40] Samuel Thomas, Kartik Audhkhasi, Zoltán Tüske, Yinghui Huang, Michael Picheny:
Detection and Recovery of OOVs for Improved English Broadcast News Captioning. INTERSPEECH 2019: 2973-2977
- [c39] Zoltán Tüske, Kartik Audhkhasi, George Saon:
Advancing Sequence-to-Sequence Based Speech Recognition. INTERSPEECH 2019: 3780-3784
- [i3] Samuel Thomas, Masayuki Suzuki, Yinghui Huang, Gakuto Kurata, Zoltán Tüske, George Saon, Brian Kingsbury, Michael Picheny, Tom Dibert, Alice Kaiser-Schatzlein, Bern Samko:
English Broadcast News Speech Recognition by Humans and Machines. CoRR abs/1904.13258 (2019)
- [i2] Michael Picheny, Zoltán Tüske, Brian Kingsbury, Kartik Audhkhasi, Xiaodong Cui, George Saon:
Challenging the Boundaries of Speech Recognition: The MALACH Corpus. CoRR abs/1908.03455 (2019)
- 2018
- [c38] Zoltán Tüske, Ralf Schlüter, Hermann Ney:
Acoustic Modeling of Speech Waveform Based on Multi-Resolution, Neural Network Signal Processing. ICASSP 2018: 4859-4863
- [c37] Zoltán Tüske, Ralf Schlüter, Hermann Ney:
Investigation on LSTM Recurrent N-gram Language Models for Speech Recognition. INTERSPEECH 2018: 3358-3362
- [c36] Xiaodong Cui, Wei Zhang, Zoltán Tüske, Michael Picheny:
Evolutionary Stochastic Gradient Descent for Optimization of Deep Neural Networks. NeurIPS 2018: 6051-6061
- [i1] Xiaodong Cui, Wei Zhang, Zoltán Tüske, Michael Picheny:
Evolutionary Stochastic Gradient Descent for Optimization of Deep Neural Networks. CoRR abs/1810.06773 (2018)
- 2017
- [c35] Zoltán Tüske, Wilfried Michel, Ralf Schlüter, Hermann Ney:
Parallel Neural Network Features for Improved Tandem Acoustic Modeling. INTERSPEECH 2017: 1651-1655
- [c34] Pavel Golik, Zoltán Tüske, Kazuki Irie, Eugen Beck, Ralf Schlüter, Hermann Ney:
The 2016 RWTH Keyword Search System for Low-Resource Languages. SPECOM 2017: 719-730
- 2016
- [c33] Zoltán Tüske, Kazuki Irie, Ralf Schlüter, Hermann Ney:
Investigation on log-linear interpolation of multi-domain neural network language model. ICASSP 2016: 6005-6009
- [c32] Kazuki Irie, Zoltán Tüske, Tamer Alkhouli, Ralf Schlüter, Hermann Ney:
LSTM, GRU, Highway and a Bit of Attention: An Empirical Overview for Language Modeling in Speech Recognition. INTERSPEECH 2016: 3519-3523
- [c31] Wilfried Michel, Zoltán Tüske, M. Ali Basha Shaik, Ralf Schlüter, Hermann Ney:
The RWTH Aachen LVCSR system for IWSLT-2016 German Skype conversation recognition task. IWSLT 2016
- [c30] Ralf Schlüter, Patrick Doetsch, Pavel Golik, Markus Kitza, Tobias Menne, Kazuki Irie, Zoltán Tüske, Albert Zeyer:
Automatic Speech Recognition Based on Neural Networks. SPECOM 2016: 3-17
- 2015
- [c29] Jia Cui, Brian Kingsbury, Bhuvana Ramabhadran, Abhinav Sethy, Kartik Audhkhasi, Xiaodong Cui, Ellen Kislal, Lidia Mangu, Markus Nußbaum-Thom, Michael Picheny, Zoltán Tüske, Pavel Golik, Ralf Schlüter, Hermann Ney, Mark J. F. Gales, Kate M. Knill, Anton Ragni, Haipeng Wang, Philip C. Woodland:
Multilingual representations for low resource speech recognition and keyword search. ASRU 2015: 259-266
- [c28] Zoltán Tüske, Pavel Golik, Ralf Schlüter, Hermann Ney:
Speaker adaptive joint training of Gaussian mixture models and bottleneck features. ASRU 2015: 596-603
- [c27] Zoltán Tüske, Muhammad Ali Tahir, Ralf Schlüter, Hermann Ney:
Integrating Gaussian mixtures into deep neural networks: Softmax layer with hidden variables. ICASSP 2015: 4285-4289
- [c26] Pavel Golik, Zoltán Tüske, Ralf Schlüter, Hermann Ney:
Convolutional neural networks for acoustic modeling of raw time signal in LVCSR. INTERSPEECH 2015: 26-30
- [c25] Pavel Golik, Zoltán Tüske, Ralf Schlüter, Hermann Ney:
Multilingual features based keyword search for very low-resource languages. INTERSPEECH 2015: 1260-1264
- [c24] M. Ali Basha Shaik, Zoltán Tüske, Muhammad Ali Tahir, Markus Nußbaum-Thom, Ralf Schlüter, Hermann Ney:
Improvements in RWTH LVCSR evaluation systems for Polish, Portuguese, English, Urdu, and Arabic. INTERSPEECH 2015: 3154-3158
- 2014
- [c23] Simon Wiesler, Kazuki Irie, Zoltán Tüske, Ralf Schlüter, Hermann Ney:
The RWTH English lecture recognition system. ICASSP 2014: 3286-3290
- [c22] Zoltán Tüske, David Nolden, Ralf Schlüter, Hermann Ney:
Multilingual MRASTA features for low-resource keyword search and speech recognition systems. ICASSP 2014: 7854-7858
- [c21] Martin Sundermeyer, Zoltán Tüske, Ralf Schlüter, Hermann Ney:
Lattice decoding and rescoring with long-span neural network language models. INTERSPEECH 2014: 661-665
- [c20] Zoltán Tüske, Pavel Golik, Ralf Schlüter, Hermann Ney:
Acoustic modeling with deep neural networks using raw time signal for LVCSR. INTERSPEECH 2014: 890-894
- [c19] M. Ali Basha Shaik, Zoltán Tüske, Muhammad Ali Tahir, Markus Nußbaum-Thom, Ralf Schlüter, Hermann Ney:
RWTH LVCSR systems for Quaero and EU-bridge: German, Polish, Spanish and Portuguese. INTERSPEECH 2014: 973-977
- [c18] Zoltán Tüske, Pavel Golik, David Nolden, Ralf Schlüter, Hermann Ney:
Data augmentation, feature combination, and multilingual neural networks to improve ASR and KWS performance for low-resource languages. INTERSPEECH 2014: 1420-1424
- 2013
- [c17] Zoltán Tüske, Ralf Schlüter, Hermann Ney:
Deep hierarchical bottleneck MRASTA features for LVCSR. ICASSP 2013: 6970-6974
- [c16] Zoltán Tüske, Joel Pinto, Daniel Willett, Ralf Schlüter:
Investigation on cross- and multilingual MLP features under matched and mismatched acoustical conditions. ICASSP 2013: 7349-7353
- [c15] Zoltán Tüske, Ralf Schlüter, Hermann Ney:
Multilingual hierarchical MRASTA features for ASR. INTERSPEECH 2013: 2222-2226
- [c14] Pavel Golik, Zoltán Tüske, Ralf Schlüter, Hermann Ney:
Development of the RWTH transcription system for Slovenian. INTERSPEECH 2013: 3107-3111
- [c13] M. Ali Basha Shaik, Zoltán Tüske, Simon Wiesler, Markus Nußbaum-Thom, Stephan Peitz, Ralf Schlüter, Hermann Ney:
The RWTH Aachen German and English LVCSR systems for IWSLT-2013. IWSLT (Evaluation Campaign) 2013
- 2012
- [c12] Zoltán Tüske, Ralf Schlüter, Hermann Ney:
Comparison and combination of different CRBE based MLP features for LVCSR. ICASSP 2012: 4081-4084
- [c11] Zoltán Tüske, Ralf Schlüter, Hermann Ney, Martin Sundermeyer:
Context-Dependent MLPs for LVCSR: TANDEM, Hybrid or Both? INTERSPEECH 2012: 18-21
- [c10] Zoltán Tüske, Friedhelm R. Drepper, Ralf Schlüter:
Non-stationary signal processing and its application in speech recognition. SAPA@INTERSPEECH 2012: 34-39
- [c9] Markus Nußbaum-Thom, Zoltán Tüske, Georg Heigold, Ralf Schlüter, Hermann Ney:
Posterior-Scaled MPE: Novel Discriminative Training Criteria. INTERSPEECH 2012: 2614-2617
- [c8] Zoltán Tüske, Friedhelm R. Drepper, Ralf Schlüter:
Phase difference of filter-stable part-tones as acoustic feature. SSP 2012: 365-368
- 2011
- [c7] Zoltán Tüske, Pavel Golik, Ralf Schlüter, Friedhelm R. Drepper:
Non-stationary feature extraction for automatic speech recognition. ICASSP 2011: 5204-5207
- [c6] Zoltán Tüske, Christian Plahl, Ralf Schlüter:
A Study on Speaker Normalized MLP Features in LVCSR. INTERSPEECH 2011: 1089-1092
- 2010
- [j1] Péter Mihajlik, Zoltán Tüske, Balázs Tarján, Bottyán Németh, Tibor Fegyó:
Improved Recognition of Spontaneous Hungarian Speech - Morphological and Acoustic Modeling Techniques for a Less Resourced Task. IEEE Trans. Speech Audio Process. 18(6): 1588-1600 (2010)
2000 – 2009
- 2009
- [c5] Péter Mihajlik, Balázs Tarján, Zoltán Tüske, Tibor Fegyó:
Investigation of morph-based speech recognition improvements across speech genres. INTERSPEECH 2009: 2687-2690
- 2007
- [c4] Péter Mihajlik, Tibor Fegyó, Zoltán Tüske, Pavel Ircing:
A morpho-graphemic approach for the recognition of spontaneous speech in agglutinative languages - like Hungarian. INTERSPEECH 2007: 1497-1500
- [c3] Péter Mihajlik, Tibor Fegyó, Bottyán Németh, Zoltán Tüske, Viktor Trón:
Towards Automatic Transcription of Large Spoken Archives in Agglutinating Languages - Hungarian ASR for the MALACH Project. TSD 2007: 342-349
- 2005
- [c2] Zoltán Tüske, Péter Mihajlik, Zoltán Tobler, Tibor Fegyó:
Robust voice activity detection based on the entropy of noise-suppressed spectrum. INTERSPEECH 2005: 245-248
- [c1] Péter Mihajlik, Zoltán Tobler, Zoltán Tüske, Géza Gordos:
Evaluation and optimization of noise robust front-end technologies for the automatic recognition of Hungarian telephone speech. INTERSPEECH 2005: 2677-2680
last updated on 2024-11-30 01:10 CET by the dblp team
all metadata released as open data under CC0 1.0 license