Gakuto Kurata
2020 – today
- 2024
- [c55] Takuma Udagawa, Masayuki Suzuki, Masayasu Muraoka, Gakuto Kurata: Robust ASR Error Correction with Conservative Data Filtering. EMNLP (Industry Track) 2024: 256-266
- [c54] Takuma Udagawa, Masayuki Suzuki, Gakuto Kurata, Masayasu Muraoka, George Saon: Multiple Representation Transfer from Large Language Models to End-to-End ASR Systems. ICASSP 2024: 10176-10180
- [i12] Takuma Udagawa, Masayuki Suzuki, Masayasu Muraoka, Gakuto Kurata: Robust ASR Error Correction with Conservative Data Filtering. CoRR abs/2407.13300 (2024)

- 2023
- [c53] Ashish R. Mittal, Sunita Sarawagi, Preethi Jyothi, George Saon, Gakuto Kurata: Speech-enriched Memory for Inference-time Adaptation of ASR Models to Word Dictionaries. EMNLP 2023: 14820-14835
- [i11] Takuma Udagawa, Masayuki Suzuki, Gakuto Kurata, Masayasu Muraoka, George Saon: Multiple Representation Transfer from Large Language Models to End-to-End ASR Systems. CoRR abs/2309.04031 (2023)

- 2022
- [c52] Xiaodong Cui, George Saon, Tohru Nagano, Masayuki Suzuki, Takashi Fukuda, Brian Kingsbury, Gakuto Kurata: Improving Generalization of Deep Neural Network Acoustic Models with Length Perturbation and N-best Based Label Smoothing. INTERSPEECH 2022: 2638-2642
- [c51] Takashi Fukuda, Samuel Thomas, Masayuki Suzuki, Gakuto Kurata, George Saon, Brian Kingsbury: Global RNN Transducer Models For Multi-dialect Speech Recognition. INTERSPEECH 2022: 3138-3142
- [c50] Sashi Novitasari, Takashi Fukuda, Gakuto Kurata: Improving ASR Robustness in Noisy Condition Through VAD Integration. INTERSPEECH 2022: 3784-3788
- [c49] Takuma Udagawa, Masayuki Suzuki, Gakuto Kurata, Nobuyasu Itoh, George Saon: Effect and Analysis of Large-scale Language Model Rescoring on Competitive ASR Systems. INTERSPEECH 2022: 3919-3923
- [i10] Xiaodong Cui, George Saon, Tohru Nagano, Masayuki Suzuki, Takashi Fukuda, Brian Kingsbury, Gakuto Kurata: Improving Generalization of Deep Neural Network Acoustic Models with Length Perturbation and N-best Based Label Smoothing. CoRR abs/2203.15176 (2022)
- [i9] Takuma Udagawa, Masayuki Suzuki, Gakuto Kurata, Nobuyasu Itoh, George Saon: Effect and Analysis of Large-scale Language Model Rescoring on Competitive ASR Systems. CoRR abs/2204.00212 (2022)

- 2021
- [c48] Takashi Fukuda, Gakuto Kurata: Generalized Knowledge Distillation from an Ensemble of Specialized Teachers Leveraging Unsupervised Neural Clustering. ICASSP 2021: 6868-6872
- [c47] Samuel Thomas, Hong-Kwang Jeff Kuo, George Saon, Zoltán Tüske, Brian Kingsbury, Gakuto Kurata, Zvi Kons, Ron Hoory: RNN Transducer Models for Spoken Language Understanding. ICASSP 2021: 7493-7497
- [c46] Gakuto Kurata, George Saon, Brian Kingsbury, David Haws, Zoltán Tüske: Improving Customization of Neural Transducers by Mitigating Acoustic Mismatch of Synthesized Audio. Interspeech 2021: 2027-2031
- [i8] Samuel Thomas, Hong-Kwang Jeff Kuo, George Saon, Zoltán Tüske, Brian Kingsbury, Gakuto Kurata, Zvi Kons, Ron Hoory: RNN Transducer Models For Spoken Language Understanding. CoRR abs/2104.03842 (2021)
- [i7] Tohru Nagano, Takashi Fukuda, Gakuto Kurata: Knowledge Distillation Leveraging Alternative Soft Targets from Non-Parallel Qualified Speech Data. CoRR abs/2112.08878 (2021)

- 2020
- [c45] Yosuke Higuchi, Masayuki Suzuki, Gakuto Kurata: Speaker Embeddings Incorporating Acoustic Conditions for Diarization. ICASSP 2020: 7129-7133
- [c44] Shintaro Ando, Masayuki Suzuki, Nobuyasu Itoh, Gakuto Kurata, Nobuaki Minematsu: Converting Written Language to Spoken Language with Neural Machine Translation for Language Modeling. ICASSP 2020: 8124-8128
- [c43] Hagai Aronowitz, Weizhong Zhu, Masayuki Suzuki, Gakuto Kurata, Ron Hoory: New Advances in Speaker Diarization. INTERSPEECH 2020: 279-283
- [c42] Hong-Kwang Jeff Kuo, Zoltán Tüske, Samuel Thomas, Yinghui Huang, Kartik Audhkhasi, Brian Kingsbury, Gakuto Kurata, Zvi Kons, Ron Hoory, Luis A. Lastras: End-to-End Spoken Language Understanding Without Full Transcripts. INTERSPEECH 2020: 906-910
- [c41] Gakuto Kurata, George Saon: Knowledge Distillation from Offline to Streaming RNN Transducer for End-to-End Speech Recognition. INTERSPEECH 2020: 2117-2121
- [i6] Hong-Kwang Jeff Kuo, Zoltán Tüske, Samuel Thomas, Yinghui Huang, Kartik Audhkhasi, Brian Kingsbury, Gakuto Kurata, Zvi Kons, Ron Hoory, Luis A. Lastras: End-to-End Spoken Language Understanding Without Full Transcripts. CoRR abs/2009.14386 (2020)
2010 – 2019
- 2019
- [c40] Tohru Nagano, Takashi Fukuda, Masayuki Suzuki, Gakuto Kurata: Data Augmentation Based on Vowel Stretch for Improving Children's Speech Recognition. ASRU 2019: 502-508
- [c39] Samuel Thomas, Masayuki Suzuki, Yinghui Huang, Gakuto Kurata, Zoltán Tüske, George Saon, Brian Kingsbury, Michael Picheny, Tom Dibert, Alice Kaiser-Schatzlein, Bern Samko: English Broadcast News Speech Recognition by Humans and Machines. ICASSP 2019: 6455-6459
- [c38] Masayuki Suzuki, Nobuyasu Itoh, Tohru Nagano, Gakuto Kurata, Samuel Thomas: Improvements to N-gram Language Model Using Text Generated from Neural Language Model. ICASSP 2019: 7245-7249
- [c37] Gakuto Kurata, Kartik Audhkhasi: Guiding CTC Posterior Spike Timings for Improved Posterior Fusion and Knowledge Distillation. INTERSPEECH 2019: 1616-1620
- [c36] Takashi Fukuda, Masayuki Suzuki, Gakuto Kurata: Direct Neuron-Wise Fusion of Cognate Neural Networks. INTERSPEECH 2019: 1621-1625
- [c35] Gakuto Kurata, Kartik Audhkhasi: Multi-Task CTC Training with Auxiliary Feature Reconstruction for End-to-End Speech Recognition. INTERSPEECH 2019: 1636-1640
- [i5] Gakuto Kurata, Kartik Audhkhasi: Guiding CTC Posterior Spike Timings for Improved Posterior Fusion and Knowledge Distillation. CoRR abs/1904.08311 (2019)
- [i4] Samuel Thomas, Masayuki Suzuki, Yinghui Huang, Gakuto Kurata, Zoltán Tüske, George Saon, Brian Kingsbury, Michael Picheny, Tom Dibert, Alice Kaiser-Schatzlein, Bern Samko: English Broadcast News Speech Recognition by Humans and Machines. CoRR abs/1904.13258 (2019)

- 2018
- [c34] Takashi Fukuda, Raul Fernandez, Andrew Rosenberg, Samuel Thomas, Bhuvana Ramabhadran, Alexander Sorin, Gakuto Kurata: Data Augmentation Improves Recognition of Foreign Accented Speech. INTERSPEECH 2018: 2409-2413
- [c33] Masayuki Suzuki, Tohru Nagano, Gakuto Kurata, Samuel Thomas: Inference-Invariant Transformation of Batch Normalization for Domain Adaptation of Acoustic Models. INTERSPEECH 2018: 2893-2897
- [c32] Gakuto Kurata, Kartik Audhkhasi: Improved Knowledge Distillation from Bi-Directional to Uni-Directional LSTM CTC for End-to-End Speech Recognition. SLT 2018: 411-417

- 2017
- [c31] Gakuto Kurata, Bhuvana Ramabhadran, George Saon, Abhinav Sethy: Language modeling with highway LSTM. ASRU 2017: 244-251
- [c30] Takashi Fukuda, Osamu Ichikawa, Gakuto Kurata, Ryuki Tachibana, Samuel Thomas, Bhuvana Ramabhadran: Effective joint training of denoising feature space transforms and Neural Network based acoustic models. ICASSP 2017: 5190-5194
- [c29] Osamu Ichikawa, Takashi Fukuda, Masayuki Suzuki, Gakuto Kurata, Bhuvana Ramabhadran: Harmonic feature fusion for robust neural network-based acoustic modeling. ICASSP 2017: 5195-5199
- [c28] George Saon, Gakuto Kurata, Tom Sercu, Kartik Audhkhasi, Samuel Thomas, Dimitrios Dimitriadis, Xiaodong Cui, Bhuvana Ramabhadran, Michael Picheny, Lynn-Li Lim, Bergul Roomi, Phil Hall: English Conversational Telephone Speech Recognition by Humans and Machines. INTERSPEECH 2017: 132-136
- [c27] Gakuto Kurata, Abhinav Sethy, Bhuvana Ramabhadran, George Saon: Empirical Exploration of Novel Architectures and Objectives for Language Models. INTERSPEECH 2017: 279-283
- [c26] Osamu Ichikawa, Takashi Fukuda, Gakuto Kurata, Steven J. Rennie: Factorial Modeling for Effective Suppression of Directional Noise. INTERSPEECH 2017: 389-393
- [c25] Michael Heck, Masayuki Suzuki, Takashi Fukuda, Gakuto Kurata, Satoshi Nakamura: Ensembles of Multi-Scale VGG Acoustic Models. INTERSPEECH 2017: 1616-1620
- [c24] Masayuki Suzuki, Gakuto Kurata, Abhinav Sethy, Bhuvana Ramabhadran, Kenneth Ward Church, Mark Drake: Symbol Sequence Search from Telephone Conversation. INTERSPEECH 2017: 3612-3616
- [c23] Takashi Fukuda, Masayuki Suzuki, Gakuto Kurata, Samuel Thomas, Jia Cui, Bhuvana Ramabhadran: Efficient Knowledge Distillation from an Ensemble of Teachers. INTERSPEECH 2017: 3697-3701
- [i3] George Saon, Gakuto Kurata, Tom Sercu, Kartik Audhkhasi, Samuel Thomas, Dimitrios Dimitriadis, Xiaodong Cui, Bhuvana Ramabhadran, Michael Picheny, Lynn-Li Lim, Bergul Roomi, Phil Hall: English Conversational Telephone Speech Recognition by Humans and Machines. CoRR abs/1703.02136 (2017)
- [i2] Gakuto Kurata, Bhuvana Ramabhadran, George Saon, Abhinav Sethy: Language Modeling with Highway LSTM. CoRR abs/1709.06436 (2017)

- 2016
- [c22] Gakuto Kurata, Bing Xiang, Bowen Zhou, Mo Yu: Leveraging Sentence-level Information with Encoder LSTM for Semantic Slot Filling. EMNLP 2016: 2077-2083
- [c21] Masayuki Suzuki, Gakuto Kurata, Tohru Nagano, Ryuki Tachibana: Speech recognition robust against speech overlapping in monaural recordings of telephone conversations. ICASSP 2016: 5685-5689
- [c20] Gakuto Kurata, Brian Kingsbury: Improved Neural Network Initialization by Grouping Context-Dependent Targets for Acoustic Modeling. INTERSPEECH 2016: 27-31
- [c19] Gakuto Kurata, Bing Xiang, Bowen Zhou: Labeled Data Generation with Encoder-Decoder LSTM for Semantic Slot Filling. INTERSPEECH 2016: 725-729
- [c18] Gakuto Kurata, Bing Xiang, Bowen Zhou: Improved Neural Network-based Multi-label Classification with Better Initialization Leveraging Label Co-occurrence. HLT-NAACL 2016: 521-526
- [i1] Gakuto Kurata, Bing Xiang, Bowen Zhou, Mo Yu: Leveraging Sentence-level Information with Encoder LSTM for Natural Language Understanding. CoRR abs/1601.01530 (2016)

- 2015
- [j4] Masayuki Suzuki, Gakuto Kurata, Masafumi Nishimura, Nobuaki Minematsu: Discriminative re-ranking for automatic speech recognition by leveraging invariant structures. Speech Commun. 72: 208-217 (2015)
- [c17] Nobuyasu Itoh, Gakuto Kurata, Ryuki Tachibana, Masafumi Nishimura: A metric for evaluating speech recognizer output based on human-perception model. INTERSPEECH 2015: 1285-1288
- [c16] Gakuto Kurata, Daniel Willett: Deep neural network training emphasizing central frames. INTERSPEECH 2015: 3595-3599

- 2014
- [c15] Congying Zhang, Masayuki Suzuki, Gakuto Kurata, Masafumi Nishimura, Nobuaki Minematsu: Leveraging phonetic context dependent invariant structure for continuous speech recognition. ChinaSIP 2014: 52-56

- 2012
- [j3] Gakuto Kurata, Abhinav Sethy, Bhuvana Ramabhadran, Ariya Rastrow, Nobuyasu Itoh, Masafumi Nishimura: Acoustically discriminative language model training with pseudo-hypothesis. Speech Commun. 54(2): 219-228 (2012)
- [j2] Gakuto Kurata, Nobuyasu Itoh, Masafumi Nishimura, Abhinav Sethy, Bhuvana Ramabhadran: Leveraging word confusion networks for named entity modeling and detection from conversational telephone speech. Speech Commun. 54(3): 491-502 (2012)
- [c14] Masayuki Suzuki, Gakuto Kurata, Masafumi Nishimura, Nobuaki Minematsu: Discriminative Reranking for LVCSR Leveraging Invariant Structure. INTERSPEECH 2012: 563-566

- 2011
- [c13] Gakuto Kurata, Nobuyasu Itoh, Masafumi Nishimura, Abhinav Sethy, Bhuvana Ramabhadran: Named entity recognition from Conversational Telephone Speech leveraging Word Confusion Networks for training and recognition. ICASSP 2011: 5572-5575
- [c12] Gakuto Kurata, Nobuyasu Itoh, Masafumi Nishimura: Training of error-corrective model for ASR without using audio data. ICASSP 2011: 5576-5579
- [c11] Masayuki Suzuki, Gakuto Kurata, Masafumi Nishimura, Nobuaki Minematsu: Continuous Digits Recognition Leveraging Invariant Structure. INTERSPEECH 2011: 993-996
- [c10] Gakuto Kurata, Nobuyasu Itoh, Masafumi Nishimura: Acoustic Model Training with Detecting Transcription Errors in the Training Data. INTERSPEECH 2011: 1689-1692
2000 – 2009
- 2009
- [c9] Gakuto Kurata, Nobuyasu Itoh, Masafumi Nishimura: Acoustically discriminative training for language models. ICASSP 2009: 4717-4720

- 2007
- [j1] Ryuki Tachibana, Tohru Nagano, Gakuto Kurata, Masafumi Nishimura, Noboru Babaguchi: Automatic Prosody Labeling Using Multiple Models for Japanese. IEICE Trans. Inf. Syst. 90-D(11): 1805-1812 (2007)
- [c8] Gakuto Kurata, Shinsuke Mori, Nobuyasu Itoh, Masafumi Nishimura: Unsupervised Lexicon Acquisition from Speech and Text. ICASSP (4) 2007: 421-424
- [c7] Ryuki Tachibana, Tohru Nagano, Gakuto Kurata, Masafumi Nishimura, Noboru Babaguchi: Preliminary experiments toward automatic generation of new TTS voices from recorded speech alone. INTERSPEECH 2007: 1917-1920

- 2006
- [c6] Shinsuke Mori, Daisuke Takuma, Gakuto Kurata: Phoneme-to-Text Transcription System with an Infinite Vocabulary. ACL 2006
- [c5] Gakuto Kurata, Shinsuke Mori, Masafumi Nishimura: Unsupervised Adaptation of a Stochastic Language Model Using a Japanese Raw Corpus. ICASSP (1) 2006: 1037-1040

- 2005
- [c4] Shinsuke Mori, Gakuto Kurata: Class-based variable memory length Markov model. INTERSPEECH 2005: 13-16

- 2004
- [c3] Gakuto Kurata, Naoaki Okazaki, Mitsuru Ishizuka: GDQA: Graph Driven Question Answering System - NTCIR-4 QAC2 Experiments. NTCIR 2004

- 2002
- [c2] Nobuaki Minematsu, Gakuto Kurata, Keikichi Hirose: Integration of MLLR adaptation with pronunciation proficiency adaptation for non-native speech recognition. INTERSPEECH 2002: 529-532
- [c1] Nobuaki Minematsu, Gakuto Kurata, Keikichi Hirose: Corpus-based analysis of English spoken by Japanese students in view of the entire phonemic system of English. INTERSPEECH 2002: 1213-1216
last updated on 2024-11-22 20:41 CET by the dblp team
all metadata released as open data under CC0 1.0 license