Hisashi Kawai
2020 – today
- 2024
- [j31]Haruki Yamashita, Takuma Okamoto, Ryoichi Takashima, Yamato Ohtani, Tetsuya Takiguchi, Tomoki Toda, Hisashi Kawai:
Fast Neural Speech Waveform Generative Models With Fully-Connected Layer-Based Upsampling. IEEE Access 12: 31409-31421 (2024) - [c158]Yamato Ohtani, Takuma Okamoto, Tomoki Toda, Hisashi Kawai:
FIRNet: Fundamental Frequency Controllable Fast Neural Vocoder With Trainable Finite Impulse Response Filter. ICASSP 2024: 10871-10875 - [c157]Takuma Okamoto, Yamato Ohtani, Tomoki Toda, Hisashi Kawai:
Convnext-TTS And Convnext-VC: Convnext-Based Fast End-To-End Sequence-To-Sequence Text-To-Speech And Voice Conversion. ICASSP 2024: 12456-12460 - [c156]Xugang Lu, Peng Shen, Yu Tsao, Hisashi Kawai:
Hierarchical Cross-Modality Knowledge Transfer with Sinkhorn Attention for CTC-Based ASR. ICASSP 2024: 13116-13120 - [i27]Xugang Lu, Peng Shen, Yu Tsao, Hisashi Kawai:
Temporal Order Preserved Optimal Transport-based Cross-modal Knowledge Transfer Learning for ASR. CoRR abs/2409.02239 (2024) - 2023
- [j30]Keisuke Matsubara, Takuma Okamoto, Ryoichi Takashima, Tetsuya Takiguchi, Tomoki Toda, Hisashi Kawai:
Harmonic-Net: Fundamental Frequency and Speech Rate Controllable Fast Neural Vocoder. IEEE ACM Trans. Audio Speech Lang. Process. 31: 1902-1915 (2023) - [c155]Xugang Lu, Peng Shen, Yu Tsao, Hisashi Kawai:
Cross-Modal Alignment With Optimal Transport For CTC-Based ASR. ASRU 2023: 1-7 - [c154]Takuma Okamoto, Haruki Yamashita, Yamato Ohtani, Tomoki Toda, Hisashi Kawai:
WaveNeXt: ConvNeXt-Based Fast Neural Vocoder Without ISTFT layer. ASRU 2023: 1-8 - [c153]Peng Shen, Xugang Lu, Hisashi Kawai:
Generative Linguistic Representation for Spoken Language Identification. ASRU 2023: 1-8 - [c152]Takuma Okamoto, Tomoki Toda, Hisashi Kawai:
E2E-S2S-VC: End-To-End Sequence-To-Sequence Voice Conversion. INTERSPEECH 2023: 2043-2047 - [c151]Mikiko Oono, Ayano Nomura, Koji Kitamura, Yoshifumi Nishida, Shunsaburo Nakahara, Hisashi Kawai:
Homeostatic System Design Based on Understanding the Living Environmental Determinants of Falls. SMC 2023: 4343-4348 - [i26]Xugang Lu, Peng Shen, Yu Tsao, Hisashi Kawai:
Cross-modal Alignment with Optimal Transport for CTC-based ASR. CoRR abs/2309.13650 (2023) - [i25]Xugang Lu, Peng Shen, Yu Tsao, Hisashi Kawai:
Hierarchical Cross-Modality Knowledge Transfer with Sinkhorn Attention for CTC-based ASR. CoRR abs/2309.16093 (2023) - [i24]Xugang Lu, Peng Shen, Yu Tsao, Hisashi Kawai:
Neural domain alignment for spoken language recognition based on optimal transport. CoRR abs/2310.13471 (2023) - [i23]Peng Shen, Xugang Lu, Hisashi Kawai:
Speaker Mask Transformer for Multi-talker Overlapped Speech Recognition. CoRR abs/2312.10959 (2023) - [i22]Peng Shen, Xugang Lu, Hisashi Kawai:
Generative linguistic representation for spoken language identification. CoRR abs/2312.10964 (2023) - 2022
- [j29]Takuma Okamoto, Keisuke Matsubara, Tomoki Toda, Yoshinori Shiga, Hisashi Kawai:
Neural speech-rate conversion with multispeaker WaveNet vocoder. Speech Commun. 138: 1-12 (2022) - [c150]Peng Shen, Xugang Lu, Hisashi Kawai:
Transducer-based language embedding for spoken language identification. INTERSPEECH 2022: 3724-3728 - [c149]Zhuo Gong, Daisuke Saito, Longfei Yang, Takahiro Shinozaki, Sheng Li, Hisashi Kawai, Nobuaki Minematsu:
Self-Adaptive Multilingual ASR Rescoring with Language Identification and Unified Language Model. Odyssey 2022: 415-420 - [c148]Peng Shen, Xugang Lu, Hisashi Kawai:
Pronunciation-Aware Unique Character Encoding for RNN Transducer-Based Mandarin Speech Recognition. SLT 2022: 123-129 - [i21]Xugang Lu, Peng Shen, Yu Tsao, Hisashi Kawai:
Partial Coupling of Optimal Transport for Spoken Language Identification. CoRR abs/2203.17036 (2022) - [i20]Peng Shen, Xugang Lu, Hisashi Kawai:
Transducer-based language embedding for spoken language identification. CoRR abs/2204.03888 (2022) - [i19]Detai Xin, Shinnosuke Takamichi, Takuma Okamoto, Hisashi Kawai, Hiroshi Saruwatari:
Speaking-Rate-Controllable HiFi-GAN Using Feature Interpolation. CoRR abs/2204.10561 (2022) - [i18]Peng Shen, Xugang Lu, Hisashi Kawai:
Pronunciation-aware unique character encoding for RNN Transducer-based Mandarin speech recognition. CoRR abs/2207.14578 (2022) - 2021
- [j28]Keisuke Matsubara, Takuma Okamoto, Ryoichi Takashima, Tetsuya Takiguchi, Tomoki Toda, Yoshinori Shiga, Hisashi Kawai:
Full-Band LPCNet: A Real-Time Neural Vocoder for 48 kHz Audio With a CPU. IEEE Access 9: 94923-94933 (2021) - [j27]Aly Magassouba, Komei Sugiura, Angelica Nakayama, Tsubasa Hirakawa, Takayoshi Yamashita, Hironobu Fujiyoshi, Hisashi Kawai:
Predicting and attending to damaging collisions for placing everyday objects in photo-realistic simulations. Adv. Robotics 35(12): 787-799 (2021) - [j26]Aly Magassouba, Komei Sugiura, Hisashi Kawai:
CrossMap Transformer: A Crossmodal Masked Path Transformer Using Double Back-Translation for Vision-and-Language Navigation. IEEE Robotics Autom. Lett. 6(4): 6258-6265 (2021) - [j25]Yi-Chiao Wu, Tomoki Hayashi, Takuma Okamoto, Hisashi Kawai, Tomoki Toda:
Quasi-Periodic Parallel WaveGAN: A Non-Autoregressive Raw Waveform Generative Model With Pitch-Dependent Dilated Convolution Neural Network. IEEE ACM Trans. Audio Speech Lang. Process. 29: 792-806 (2021) - [j24]Xugang Lu, Peng Shen, Yu Tsao, Hisashi Kawai:
Coupling a Generative Model With a Discriminative Learning Framework for Speaker Verification. IEEE ACM Trans. Audio Speech Lang. Process. 29: 3631-3641 (2021) - [c147]Xugang Lu, Peng Shen, Yu Tsao, Hisashi Kawai:
Siamese Neural Network with Joint Bayesian Model Structure for Speaker Verification. APSIPA ASC 2021: 769-774 - [c146]Takuma Okamoto, Tomoki Toda, Hisashi Kawai:
Multi-Stream HiFi-GAN with Data-Driven Waveform Decomposition. ASRU 2021: 610-617 - [c145]Takuma Okamoto, Tomoki Toda, Yoshinori Shiga, Hisashi Kawai:
Noise Level Limited Sub-Modeling for Diffusion Probabilistic Vocoders. ICASSP 2021: 6029-6033 - [c144]Keisuke Matsubara, Takuma Okamoto, Ryoichi Takashima, Tetsuya Takiguchi, Tomoki Toda, Yoshinori Shiga, Hisashi Kawai:
High-Intelligibility Speech Synthesis for Dysarthric Speakers with LPCNet-Based TTS and CycleVAE-Based VC. ICASSP 2021: 7058-7062 - [c143]Xugang Lu, Peng Shen, Yu Tsao, Hisashi Kawai:
Unsupervised Neural Adaptation Model Based on Optimal Transport for Spoken Language Identification. ICASSP 2021: 7213-7217 - [c142]Masakiyo Fujimoto, Hisashi Kawai:
Noise Robust Acoustic Modeling for Single-Channel Speech Recognition Based on a Stream-Wise Transformer Architecture. Interspeech 2021: 281-285 - [i17]Xugang Lu, Peng Shen, Yu Tsao, Hisashi Kawai:
Integrating a joint Bayesian generative model in a discriminative learning framework for speaker verification. CoRR abs/2101.03329 (2021) - [i16]Aly Magassouba, Komei Sugiura, Angelica Nakayama, Tsubasa Hirakawa, Takayoshi Yamashita, Hironobu Fujiyoshi, Hisashi Kawai:
Predicting and Attending to Damaging Collisions for Placing Everyday Objects in Photo-Realistic Simulations. CoRR abs/2102.06507 (2021) - [i15]Aly Magassouba, Komei Sugiura, Hisashi Kawai:
CrossMap Transformer: A Crossmodal Masked Path Transformer Using Double Back-Translation for Vision-and-Language Navigation. CoRR abs/2103.00852 (2021) - [i14]Xugang Lu, Peng Shen, Yu Tsao, Hisashi Kawai:
Siamese Neural Network with Joint Bayesian Model Structure for Speaker Verification. CoRR abs/2104.03004 (2021) - 2020
- [j23]Aly Magassouba, Komei Sugiura, Hisashi Kawai:
A Multimodal Target-Source Classifier With Attention Branches to Understand Ambiguous Instructions for Fetching Daily Objects. IEEE Robotics Autom. Lett. 5(2): 532-539 (2020) - [j22]Tadashi Ogura, Aly Magassouba, Komei Sugiura, Tsubasa Hirakawa, Takayoshi Yamashita, Hironobu Fujiyoshi, Hisashi Kawai:
Alleviating the Burden of Labeling: Sentence Generation by Attention Branch Encoder-Decoder Network. IEEE Robotics Autom. Lett. 5(4): 5945-5952 (2020) - [j21]Peng Shen, Xugang Lu, Sheng Li, Hisashi Kawai:
Knowledge Distillation-Based Representation Learning for Short-Utterance Spoken Language Identification. IEEE ACM Trans. Audio Speech Lang. Process. 28: 2674-2683 (2020) - [c141]Takuma Okamoto, Tomoki Toda, Yoshinori Shiga, Hisashi Kawai:
Transformer-Based Text-to-Speech with Weighted Forced Attention. ICASSP 2020: 6729-6733 - [c140]Peng Shen, Xugang Lu, Hisashi Kawai:
Investigation of NICT Submission for Short-Duration Speaker Verification Challenge 2020. INTERSPEECH 2020: 751-755 - [c139]Yi-Chiao Wu, Tomoki Hayashi, Takuma Okamoto, Hisashi Kawai, Tomoki Toda:
Quasi-Periodic Parallel WaveGAN Vocoder: A Non-Autoregressive Pitch-Dependent Dilated Convolution Model for Parametric Speech Generation. INTERSPEECH 2020: 3535-3539 - [c138]Peng Shen, Xugang Lu, Komei Sugiura, Sheng Li, Hisashi Kawai:
Compensation on x-vector for Short Utterance Spoken Language Identification. Odyssey 2020: 47-52 - [c137]Sheng Li, Xugang Lu, Raj Dabre, Peng Shen, Hisashi Kawai:
Joint Training End-to-End Speech Recognition Systems with Speaker Attributes. Odyssey 2020: 385-390 - [p1]Yutaka Ashikari, Hisashi Kawai:
Field Experiment System "VoiceTra". Speech-to-Speech Translation 2020: 67-75 - [e1]Yutaka Kidawara, Eiichiro Sumita, Hisashi Kawai:
Speech-to-Speech Translation. Springer Briefs in Computer Science, Springer 2020, ISBN 978-981-15-0594-2 [contents] - [i13]Yi-Chiao Wu, Tomoki Hayashi, Takuma Okamoto, Hisashi Kawai, Tomoki Toda:
Quasi-Periodic Parallel WaveGAN Vocoder: A Non-autoregressive Pitch-dependent Dilated Convolution Model for Parametric Speech Generation. CoRR abs/2005.08654 (2020) - [i12]Tadashi Ogura, Aly Magassouba, Komei Sugiura, Tsubasa Hirakawa, Takayoshi Yamashita, Hironobu Fujiyoshi, Hisashi Kawai:
Alleviating the Burden of Labeling: Sentence Generation by Attention Branch Encoder-Decoder Network. CoRR abs/2007.04557 (2020) - [i11]Yi-Chiao Wu, Tomoki Hayashi, Takuma Okamoto, Hisashi Kawai, Tomoki Toda:
Quasi-Periodic Parallel WaveGAN: A Non-autoregressive Raw Waveform Generative Model with Pitch-dependent Dilated Convolution Neural Network. CoRR abs/2007.12955 (2020) - [i10]Xugang Lu, Peng Shen, Yu Tsao, Hisashi Kawai:
Unsupervised neural adaptation model based on optimal transport for spoken language identification. CoRR abs/2012.13152 (2020)
2010 – 2019
- 2019
- [j20]Aly Magassouba, Komei Sugiura, Anh Trinh Quoc, Hisashi Kawai:
Understanding Natural Language Instructions for Fetching Daily Objects Using GAN-Based Multimodal Target-Source Classification. IEEE Robotics Autom. Lett. 4(4): 3884-3891 (2019) - [c136]Takuma Okamoto, Tomoki Toda, Yoshinori Shiga, Hisashi Kawai:
Tacotron-Based Acoustic Model Using Phoneme Alignment for Practical Neural Text-to-Speech Systems. ASRU 2019: 214-221 - [c135]Saly Keo, Soky Kak, Yoshinori Shiga, Hiroaki Kato, Hisashi Kawai:
HMM-based TTS System Framework. CIFEr 2019: 1 - [c134]Aly Magassouba, Komei Sugiura, Hisashi Kawai:
Multimodal Attention Branch Network for Perspective-Free Sentence Generation. CoRL 2019: 76-85 - [c133]Peng Shen, Xugang Lu, Sheng Li, Hisashi Kawai:
Interactive Learning of Teacher-student Model for Short Utterance Spoken Language Identification. ICASSP 2019: 5981-5985 - [c132]Ryoichi Takashima, Sheng Li, Hisashi Kawai:
Investigation of Sequence-level Knowledge Distillation Methods for CTC Acoustic Models. ICASSP 2019: 6156-6160 - [c131]Takuma Okamoto, Tomoki Toda, Yoshinori Shiga, Hisashi Kawai:
Investigations of Real-time Gaussian Fftnet and Parallel Wavenet Neural Vocoders with Simple Acoustic Features. ICASSP 2019: 7020-7024 - [c130]Masakiyo Fujimoto, Hisashi Kawai:
One-Pass Single-Channel Noisy Speech Recognition Using a Combination of Noisy and Enhanced Features. INTERSPEECH 2019: 486-490 - [c129]Takuma Okamoto, Tomoki Toda, Yoshinori Shiga, Hisashi Kawai:
Real-Time Neural Text-to-Speech with Sequence-to-Sequence Acoustic Model and WaveGlow or Single Gaussian WaveRNN Vocoders. INTERSPEECH 2019: 1308-1312 - [c128]Sheng Li, Chenchen Ding, Xugang Lu, Peng Shen, Tatsuya Kawahara, Hisashi Kawai:
End-to-End Articulatory Attribute Modeling for Low-Resource Multilingual Speech Recognition. INTERSPEECH 2019: 2145-2149 - [c127]Sheng Li, Xugang Lu, Chenchen Ding, Peng Shen, Tatsuya Kawahara, Hisashi Kawai:
Investigating Radical-Based End-to-End Speech Recognition Systems for Chinese Dialects and Japanese. INTERSPEECH 2019: 2200-2204 - [c126]Chien-Feng Liao, Yu Tsao, Xugang Lu, Hisashi Kawai:
Incorporating Symbolic Sequential Modeling for Speech Enhancement. INTERSPEECH 2019: 2733-2737 - [c125]Xugang Lu, Peng Shen, Sheng Li, Yu Tsao, Hisashi Kawai:
Class-Wise Centroid Distance Metric Learning for Acoustic Event Detection. INTERSPEECH 2019: 3614-3618 - [c124]Sheng Li, Raj Dabre, Xugang Lu, Peng Shen, Tatsuya Kawahara, Hisashi Kawai:
Improving Transformer-Based Speech Recognition Systems with Compressed Structure and Speech Attributes Augmentation. INTERSPEECH 2019: 4400-4404 - [c123]Jinfu Ni, Yoshinori Shiga, Hisashi Kawai:
Duration Modeling with Global Phoneme-Duration Vectors. INTERSPEECH 2019: 4465-4469 - [c122]Aly Magassouba, Komei Sugiura, Hisashi Kawai:
Latent-Space Data Augmentation for Visually-Grounded Language Understanding. JSAI 2019: 179-187 - [i9]Chien-Feng Liao, Yu Tsao, Xugang Lu, Hisashi Kawai:
Incorporating Symbolic Sequential Modeling for Speech Enhancement. CoRR abs/1904.13142 (2019) - [i8]Aly Magassouba, Komei Sugiura, Anh Trinh Quoc, Hisashi Kawai:
Understanding Natural Language Instructions for Fetching Daily Objects Using GAN-Based Multimodal Target-Source Classification. CoRR abs/1906.06830 (2019) - [i7]Aly Magassouba, Komei Sugiura, Hisashi Kawai:
Multimodal Attention Branch Network for Perspective-Free Sentence Generation. CoRR abs/1909.05664 (2019) - [i6]Aly Magassouba, Komei Sugiura, Hisashi Kawai:
A Multimodal Target-Source Classifier with Attention Branches to Understand Ambiguous Instructions for Fetching Daily Objects. CoRR abs/1912.10675 (2019) - [i5]Xugang Lu, Peng Shen, Sheng Li, Yu Tsao, Hisashi Kawai:
Deep progressive multi-scale attention for acoustic event classification. CoRR abs/1912.12011 (2019) - 2018
- [j19]Aly Magassouba, Komei Sugiura, Hisashi Kawai:
A Multimodal Classifier Generative Adversarial Network for Carry and Place Tasks From Ambiguous Language Instructions. IEEE Robotics Autom. Lett. 3(4): 3113-3120 (2018) - [j18]Szu-Wei Fu, Taowei Wang, Yu Tsao, Xugang Lu, Hisashi Kawai:
End-to-End Waveform Utterance Enhancement for Direct Evaluation Metrics Optimization by Fully Convolutional Neural Networks. IEEE ACM Trans. Audio Speech Lang. Process. 26(9): 1570-1584 (2018) - [c121]Masakiyo Fujimoto, Hisashi Kawai:
Comparative Evaluations of Various Factored Deep Convolutional Rnn Architectures for Noise Robust Speech Recognition. ICASSP 2018: 4829-4833 - [c120]Takuma Okamoto, Kentaro Tachibana, Tomoki Toda, Yoshinori Shiga, Hisashi Kawai:
An Investigation of Subband Wavenet Vocoder Covering Entire Audible Frequency Range with Limited Acoustic Features. ICASSP 2018: 5654-5658 - [c119]Kentaro Tachibana, Tomoki Toda, Yoshinori Shiga, Hisashi Kawai:
An Investigation of Noise Shaping with Perceptual Weighting for Wavenet-Based Speech Generation. ICASSP 2018: 5664-5668 - [c118]Ryoichi Takashima, Sheng Li, Hisashi Kawai:
An Investigation of a Knowledge Distillation Method for CTC Acoustic Models. ICASSP 2018: 5809-5813 - [c117]Ryoichi Takashima, Sheng Li, Hisashi Kawai:
CTC Loss Function with a Unit-Level Ambiguity Penalty. ICASSP 2018: 5909-5913 - [c116]Xugang Lu, Peng Shen, Sheng Li, Yu Tsao, Hisashi Kawai:
Temporal Attentive Pooling for Acoustic Event Detection. INTERSPEECH 2018: 1354-1357 - [c115]Peng Shen, Xugang Lu, Sheng Li, Hisashi Kawai:
Feature Representation of Short Utterances Based on Knowledge Distillation for Spoken Language Identification. INTERSPEECH 2018: 1813-1817 - [c114]Jinfu Ni, Yoshinori Shiga, Hisashi Kawai:
Multilingual Grapheme-to-Phoneme Conversion with Global Character Vectors. INTERSPEECH 2018: 2823-2827 - [c113]Sheng Li, Xugang Lu, Ryoichi Takashima, Peng Shen, Tatsuya Kawahara, Hisashi Kawai:
Improving CTC-based Acoustic Model with Very Deep Residual Time-delay Neural Networks. INTERSPEECH 2018: 3708-3712 - [c112]Sheng Li, Xugang Lu, Ryoichi Takashima, Peng Shen, Tatsuya Kawahara, Hisashi Kawai:
Improving Very Deep Time-Delay Neural Network With Vertical-Attention For Effectively Training CTC-Based ASR Systems. SLT 2018: 77-83 - [c111]Takuma Okamoto, Tomoki Toda, Yoshinori Shiga, Hisashi Kawai:
Improving FFTNet Vocoder with Noise Shaping and Subband Approaches. SLT 2018: 304-311 - [i4]Komei Sugiura, Hisashi Kawai:
Grounded Language Understanding for Manipulation Instructions Using GAN-Based Classification. CoRR abs/1801.05096 (2018) - [i3]Aly Magassouba, Komei Sugiura, Hisashi Kawai:
A Multimodal Classifier Generative Adversarial Network for Carry and Place Tasks from Ambiguous Language Instructions. CoRR abs/1806.03847 (2018) - 2017
- [j17]Xugang Lu, Peng Shen, Yu Tsao, Hisashi Kawai:
Regularization of neural network model with distance metric learning for i-vector based spoken language identification. Comput. Speech Lang. 44: 48-60 (2017) - [j16]Shigeki Matsuda, Teruaki Hayashi, Yutaka Ashikari, Yoshinori Shiga, Hidenori Kashioka, Keiji Yasuda, Hideo Okuma, Masao Utiyama, Eiichiro Sumita, Hisashi Kawai, Satoshi Nakamura:
Development of the "VoiceTra" Multi-Lingual Speech Translation System. IEICE Trans. Inf. Syst. 100-D(4): 621-632 (2017) - [j15]Naoyuki Kanda, Xugang Lu, Hisashi Kawai:
Maximum-a-Posteriori-Based Decoding for End-to-End Acoustic Models. IEEE ACM Trans. Audio Speech Lang. Process. 25(5): 1023-1034 (2017) - [c110]Szu-Wei Fu, Yu Tsao, Xugang Lu, Hisashi Kawai:
Raw waveform-based speech enhancement by fully convolutional networks. APSIPA 2017: 6-12 - [c109]Sheng Li, Xugang Lu, Peng Shen, Ryoichi Takashima, Tatsuya Kawahara, Hisashi Kawai:
Incremental training and constructing the very deep convolutional residual network acoustic models. ASRU 2017: 222-227 - [c108]Komei Sugiura, Hisashi Kawai:
Grounded language understanding for manipulation instructions using GAN-based classification. ASRU 2017: 519-524 - [c107]Takuma Okamoto, Kentaro Tachibana, Tomoki Toda, Yoshinori Shiga, Hisashi Kawai:
Subband wavenet with overlapped single-sideband filterbanks. ASRU 2017: 698-704 - [c106]Naoyuki Kanda, Xugang Lu, Hisashi Kawai:
Minimum Bayes risk training of CTC acoustic models in maximum a posteriori based decoding framework. ICASSP 2017: 4855-4859 - [c105]Jinfu Ni, Yoshinori Shiga, Hisashi Kawai:
Global Syllable Vectors for Building TTS Front-End with Deep Learning. INTERSPEECH 2017: 769-773 - [c104]Peng Shen, Xugang Lu, Sheng Li, Hisashi Kawai:
Conditional Generative Adversarial Nets Classifier for Spoken Language Identification. INTERSPEECH 2017: 2814-2818 - [i2]Szu-Wei Fu, Yu Tsao, Xugang Lu, Hisashi Kawai:
Raw Waveform-based Speech Enhancement by Fully Convolutional Networks. CoRR abs/1703.02205 (2017) - [i1]Szu-Wei Fu, Yu Tsao, Xugang Lu, Hisashi Kawai:
End-to-End Waveform Utterance Enhancement for Direct Evaluation Metrics Optimization by Fully Convolutional Neural Networks. CoRR abs/1709.03658 (2017) - 2016
- [j14]Tsubasa Ochiai, Shigeki Matsuda, Hideyuki Watanabe, Xugang Lu, Chiori Hori, Hisashi Kawai, Shigeru Katagiri:
Speaker Adaptive Training Localizing Speaker Modules in DNN for Hybrid DNN-HMM Speech Recognizers. IEICE Trans. Inf. Syst. 99-D(10): 2431-2443 (2016) - [j13]Peng Shen, Xugang Lu, Xinhui Hu, Naoyuki Kanda, Masahiro Saiko, Chiori Hori, Hisashi Kawai:
Combination of multiple acoustic models with unsupervised adaptation for lecture speech transcription. Speech Commun. 82: 1-13 (2016) - [c103]Tsubasa Ochiai, Shigeki Matsuda, Hideyuki Watanabe, Xugang Lu, Hisashi Kawai, Shigeru Katagiri:
Bottleneck linear transformation network adaptation for speaker adaptive training-based hybrid DNN-HMM speech recognizer. ICASSP 2016: 5015-5019 - [c102]Peng Shen, Xugang Lu, Lemao Liu, Hisashi Kawai:
Local fisher discriminant analysis for spoken language identification. ICASSP 2016: 5825-5829 - [c101]Xiaoyun Wang, Xugang Lu, Hisashi Kawai, Seiichi Yamamoto:
F0 Contour Analysis Based on Empirical Mode Decomposition for DNN Acoustic Modeling in Mandarin Speech Recognition. INTERSPEECH 2016: 973-977 - [c100]Naoyuki Kanda, Shoji Harada, Xugang Lu, Hisashi Kawai:
Investigation of Semi-Supervised Acoustic Model Training Based on the Committee of Heterogeneous Neural Networks. INTERSPEECH 2016: 1325-1329 - [c99]Jinfu Ni, Yoshinori Shiga, Hisashi Kawai:
Using Zero-Frequency Resonator to Extract Multilingual Intonation Structure. INTERSPEECH 2016: 1522-1526 - [c98]Naoyuki Kanda, Xugang Lu, Hisashi Kawai:
Maximum a posteriori Based Decoding for CTC Acoustic Models. INTERSPEECH 2016: 1868-1872 - [c97]Kentaro Tachibana, Tomoki Toda, Yoshinori Shiga, Hisashi Kawai:
Model Integration for HMM- and DNN-Based Speech Synthesis Using Product-of-Experts Framework. INTERSPEECH 2016: 2288-2292 - [c96]Xugang Lu, Peng Shen, Yu Tsao, Hisashi Kawai:
Pair-Wise Distance Metric Learning of Neural Network Model for Spoken Language Identification. INTERSPEECH 2016: 3216-3220 - [c95]Xugang Lu, Peng Shen, Yu Tsao, Hisashi Kawai:
A pseudo-task design in multi-task learning deep neural network for speaker recognition. ISCSLP 2016: 1-5 - [c94]Peng Shen, Xugang Lu, Hisashi Kawai:
Comparison of regularization constraints in deep neural network based speaker adaptation. ISCSLP 2016: 1-5 - [c93]Peng Shen, Xugang Lu, Hisashi Kawai:
Automatic acoustic segmentation in N-best list rescoring for lecture speech recognition. ISCSLP 2016: 1-5 - 2015
- [j12]Komei Sugiura, Yoshinori Shiga, Hisashi Kawai, Teruhisa Misu, Chiori Hori:
A cloud robotics approach towards dialogue-oriented robot speech. Adv. Robotics 29(7): 449-456 (2015) - [j11]Youzheng Wu, Chiori Hori, Hideki Kashioka, Hisashi Kawai:
Leveraging social Q&A collections for improving complex question answering. Comput. Speech Lang. 29(1): 1-19 (2015) - [c92]Hay Mar Soe Naing, Aye Mya Hlaing, Win Pa Pa, Xinhui Hu, Ye Kyaw Thu, Chiori Hori, Hisashi Kawai:
A Myanmar large vocabulary continuous speech recognition system. APSIPA 2015: 320-327 - [c91]Naoyuki Kanda, Mitsuyoshi Tachimori, Xugang Lu, Hisashi Kawai:
Training data pseudo-shuffling and direct decoding framework for recurrent neural network based acoustic modeling. ASRU 2015: 15-21 - [c90]Xugang Lu, Peng Shen, Yu Tsao, Chiori Hori, Hisashi Kawai:
Sparse representation with temporal max-smoothing for acoustic event detection. INTERSPEECH 2015: 1176-1180 - [c89]Ye Kyaw Thu, Win Pa Pa, Jinfu Ni, Yoshinori Shiga, Andrew M. Finch, Chiori Hori, Hisashi Kawai, Eiichiro Sumita:
HMM based myanmar text to speech system. INTERSPEECH 2015: 2237-2241 - 2014
- [c88]Komei Sugiura, Yoshinori Shiga, Hisashi Kawai, Teruhisa Misu, Chiori Hori:
Non-monologue HMM-based speech synthesis for service robots: A cloud robotics approach. ICRA 2014: 2237-2242 - 2013
- [j10]Ken Nishihara, Hisashi Kawai, Yu Chiba, Naohiko Kanemura, Toshiaki Gomi:
Investigation of Innervation Zone Shift with Continuous Dynamic Muscle Contraction. Comput. Math. Methods Medicine 2013: 174342:1-174342:7 (2013) - [c87]Shigeki Matsuda, Xinhui Hu, Yoshinori Shiga, Hideki Kashioka, Chiori Hori, Keiji Yasuda, Hideo Okuma, Masao Utiyama, Eiichiro Sumita, Hisashi Kawai, Satoshi Nakamura:
Multilingual Speech-to-Speech Translation System: VoiceTra. MDM (2) 2013: 229-233 - 2012
- [j9]Sakriani Sakti, Michael Paul, Andrew M. Finch, Xinhui Hu, Jinfu Ni, Noriyuki Kimura, Shigeki Matsuda, Chiori Hori, Yutaka Ashikari, Hisashi Kawai, Hideki Kashioka, Eiichiro Sumita, Satoshi Nakamura:
Distributed speech translation technologies for multiparty multilingual communication. ACM Trans. Speech Lang. Process. 9(2): 4:1-4:27 (2012) - [c86]Shinnosuke Takamichi, Tomoki Toda, Yoshinori Shiga, Hisashi Kawai, Sakriani Sakti, Satoshi Nakamura:
An Evaluation of Parameter Generation Methods with Rich Context Models in HMM-Based Speech Synthesis. INTERSPEECH 2012: 1139-1142 - [c85]Jinfu Ni, Yoshinori Shiga, Hisashi Kawai, Hideki Kashioka:
Resonance-based spectral deformation in HMM-based speech synthesis. ISCSLP 2012: 88-92 - [c84]Jinfu Ni, Yoshinori Shiga, Hisashi Kawai, Hideki Kashioka:
Experiments on unsupervised statistical parametric speech synthesis. ISCSLP 2012: 155-159 - 2011
- [j8]Komei Sugiura, Naoto Iwahashi, Hisashi Kawai, Satoshi Nakamura:
Situated Spoken Dialogue with Robots Using Active Learning. Adv. Robotics 25(17): 2207-2232 (2011) - [j7]Shinsuke Sakai, Tatsuya Kawahara, Hisashi Kawai:
Probabilistic Concatenation Modeling for Corpus-Based Speech Synthesis. IEICE Trans. Inf. Syst. 94-D(10): 2006-2014 (2011) - [j6]Teruhisa Misu, Komei Sugiura, Tatsuya Kawahara, Kiyonori Ohtake, Chiori Hori, Hideki Kashioka, Hisashi Kawai, Satoshi Nakamura:
Modeling spoken decision support dialogue and optimization of its dialogue strategy. ACM Trans. Speech Lang. Process. 7(3): 10:1-10:18 (2011) - [c83]Sakriani Sakti, Andrew M. Finch, Ryosuke Isotani, Hisashi Kawai, Satoshi Nakamura:
Unsupervised determination of efficient Korean LVCSR units using a Bayesian Dirichlet process model. ICASSP 2011: 4664-4667 - [c82]Yu Tsao, Ryosuke Isotani, Hisashi Kawai, Satoshi Nakamura:
Increasing discriminative capability on MAP-based mapping function estimation for acoustic model adaptation. ICASSP 2011: 5320-5323 - [c81]Yu Tsao, Shigeki Matsuda, Shinsuke Sakai, Ryosuke Isotani, Hisashi Kawai, Satoshi Nakamura:
A sampling-based environment population projection approach for rapid acoustic model adaptation. ICASSP 2011: 5504-5507 - [c80]Youzheng Wu, Chiori Hori, Hisashi Kawai, Hideki Kashioka:
Improving Related Entity Finding via Incorporating Homepages and Recognizing Fine-grained Entities. IJCNLP 2011: 174-182 - [c79]Youzheng Wu, Chiori Hori, Hisashi Kawai, Hideki Kashioka:
Answering Complex Questions via Exploiting Social Q&A Collection. IJCNLP 2011: 956-964 - [c78]Minoru Tsuzaki, Keiichi Tokuda, Hisashi Kawai, Jinfu Ni:
Estimation of Perceptual Spaces for Speaker Identities Based on the Cross-Lingual Discrimination Task. INTERSPEECH 2011: 157-160 - [c77]Teruhisa Misu, Kiyonori Ohtake, Chiori Hori, Hisashi Kawai, Satoshi Nakamura:
User Study of Spoken Decision Support System. INTERSPEECH 2011: 797-800 - [c76]Yu Tsao, Paul R. Dixon, Chiori Hori, Hisashi Kawai:
Incorporating Regional Information to Enhance MAP-Based Stochastic Feature Compensation for Robust Speech Recognition. INTERSPEECH 2011: 2585-2588 - [c75]Xugang Lu, Masashi Unoki, Ryosuke Isotani, Hisashi Kawai, Satoshi Nakamura:
Adaptive Regularization Framework for Robust Voice Activity Detection. INTERSPEECH 2011: 2653-2656 - [c74]Nobuhiko Hattori, Tomoki Toda, Hisashi Kawai, Hiroshi Saruwatari, Kiyohiro Shikano:
Speaker-Adaptive Speech Synthesis Based on Eigenvoice Conversion and Language-Dependent Prosodic Conversion in Speech-to-Speech Translation. INTERSPEECH 2011: 2769-2772 - [c73]Teruhisa Misu, Etsuo Mizukami, Yoshinori Shiga, Shinichi Kawamoto, Hisashi Kawai, Satoshi Nakamura:
Analysis on Effects of Text-to-Speech and Avatar Agent in Evoking Users' Spontaneous Listener's Reactions. IWSDS 2011: 77-89 - [c72]Teruhisa Misu, Etsuo Mizukami, Yoshinori Shiga, Shinichi Kawamoto, Hisashi Kawai, Satoshi Nakamura:
Toward Construction of Spoken Dialogue System that Evokes Users' Spontaneous Backchannels. SIGDIAL Conference 2011: 259-265 - 2010
- [c71]Komei Sugiura, Naoto Iwahashi, Hisashi Kawai, Satoshi Nakamura:
Active Learning for Generating Motion and Utterances in Object Manipulation Dialogue Tasks. AAAI Fall Symposium: Dialog with Robots 2010 - [c70]Youzheng Wu, Hisashi Kawai:
Exploiting Social Q&A Collection in Answering Complex Questions. CIPS-SIGHAN 2010 - [c69]Yoshinori Shiga, Tomoki Toda, Shinsuke Sakai, Jinfu Ni, Hisashi Kawai, Keiichi Tokuda, Minoru Tsuzaki, Satoshi Nakamura:
NICT Blizzard Challenge 2010 Entry. Blizzard Challenge 2010 - [c68]Kentaro Kayama, Akihiro Kobayashi, Etsuo Mizukami, Teruhisa Misu, Hideki Kashioka, Hisashi Kawai, Satoshi Nakamura:
Spoken Dialog System on Plasma Display Panel Estimating Users' Interest by Image Processing. Intelligent Environments (Workshops) 2010: 4-13 - [c67]Xinhui Hu, Ryosuke Isotani, Hisashi Kawai, Satoshi Nakamura:
Cluster-based language model for spoken document retrieval using NMF-based document clustering. INTERSPEECH 2010: 705-708 - [c66]Yoshinori Shiga, Tomoki Toda, Shinsuke Sakai, Hisashi Kawai:
Improved training of excitation for HMM-based parametric speech synthesis. INTERSPEECH 2010: 809-812 - [c65]Jinfu Ni, Hisashi Kawai:
An unsupervised approach to creating web audio contents-based HMM voices. INTERSPEECH 2010: 849-852 - [c64]Kazuhiko Abe, Sakriani Sakti, Ryosuke Isotani, Hisashi Kawai, Satoshi Nakamura:
Brazilian portuguese acoustic model training based on data borrowing from other language. INTERSPEECH 2010: 861-864 - [c63]Sakriani Sakti, Ryosuke Isotani, Hisashi Kawai, Satoshi Nakamura:
Utilizing a noisy-channel approach for Korean LVCSR. INTERSPEECH 2010: 1513-1516 - [c62]Xinhui Hu, Ryosuke Isotani, Hisashi Kawai, Satoshi Nakamura:
Construction and evaluations of an annotated Chinese conversational corpus in travel domain for the language model of speech recognition. INTERSPEECH 2010: 1910-1913 - [c61]Xugang Lu, Masashi Unoki, Ryosuke Isotani, Hisashi Kawai, Satoshi Nakamura:
Voice activity detection in a regularized reproducing kernel Hilbert space. INTERSPEECH 2010: 3086-3089 - [c60]Xugang Lu, Masashi Unoki, Ryosuke Isotani, Hisashi Kawai, Satoshi Nakamura:
Speech enhancement as a functional approximation and generalization. ISCSLP 2010: 18-22 - [c59]Yu Tsao, Ryosuke Isotani, Hisashi Kawai, Satoshi Nakamura:
An environment structuring framework to facilitating suitable prior density estimation for MAPLR on robust speech recognition. ISCSLP 2010: 29-32 - [c58]Sakriani Sakti, Andrew M. Finch, Ryosuke Isotani, Hisashi Kawai, Satoshi Nakamura:
Korean pronunciation variation modeling with probabilistic Bayesian networks. IUCS 2010: 52-57 - [c57]Hansjörg Hofmann, Sakriani Sakti, Ryosuke Isotani, Hisashi Kawai, Satoshi Nakamura, Wolfgang Minker:
Improving spontaneous English ASR using a joint-sequence pronunciation model. IUCS 2010: 58-61 - [c56]Teruhisa Misu, Kiyonori Ohtake, Chiori Hori, Hideki Kashioka, Hisashi Kawai, Satoshi Nakamura:
Web text classification for response generation in spoken decision support dialogue systems. IUCS 2010: 131-134 - [c55]Naoto Kimura, Chiori Hori, Teruhisa Misu, Kiyonori Ohtake, Hisashi Kawai, Satoshi Nakamura:
Expansion of WFST-Based Dialog Management for Handling Multiple ASR Hypotheses. IWSDS 2010: 61-72 - [c54]Akihiro Kobayashi, Kentaro Kayama, Etsuo Mizukami, Teruhisa Misu, Hideki Kashioka, Hisashi Kawai, Satoshi Nakamura:
Evaluation of Facial Direction Estimation from Cameras for Multi-modal Spoken Dialog System. IWSDS 2010: 73-84 - [c53]Hansjörg Hofmann, Sakriani Sakti, Ryosuke Isotani, Hisashi Kawai, Satoshi Nakamura, Wolfgang Minker:
Sequence-Based Pronunciation Modeling Using a Noisy-Channel Approach. IWSDS 2010: 156-162 - [c52]Teruhisa Misu, Chiori Hori, Kiyonori Ohtake, Hideki Kashioka, Hisashi Kawai, Satoshi Nakamura:
Construction and Experiment of a Spoken Consulting Dialogue System. IWSDS 2010: 169-175 - [c51]Etsuo Mizukami, Hideki Kashioka, Hisashi Kawai, Satoshi Nakamura:
A Study Toward an Evaluation Method for Spoken Dialogue Systems Considering User Criteria. IWSDS 2010: 176-181 - [c50]Teruhisa Misu, Chiori Hori, Kiyonori Ohtake, Etsuo Mizukami, Akihiro Kobayashi, Kentaro Kayama, Tetsuya Fujii, Hideki Kashioka, Hisashi Kawai, Satoshi Nakamura:
Sightseeing Guidance Systems Based on WFST-Based Dialogue Manager. IWSDS 2010: 194-195 - [c49]Teruhisa Misu, Komei Sugiura, Kiyonori Ohtake, Chiori Hori, Hideki Kashioka, Hisashi Kawai, Satoshi Nakamura:
Modeling Spoken Decision Making Dialogue and Optimization of its Dialogue Strategy. SIGDIAL Conference 2010: 221-224 - [c48]Teruhisa Misu, Komei Sugiura, Kiyonori Ohtake, Chiori Hori, Hideki Kashioka, Hisashi Kawai, Satoshi Nakamura:
Dialogue strategy optimization to assist user's decision for spoken consulting dialogue systems. SLT 2010: 354-359 - [c47]Jinfu Ni, Hisashi Kawai:
An investigation of the impact of speech transcript errors on HMM voices. SSW 2010: 246-251 - [c46]Youzheng Wu, Chiori Hori, Hisashi Kawai:
NiCT at TREC 2010: Related Entity Finding. TREC 2010
2000 – 2009
- 2009
- [c45]Ranniery Maia, Tomoki Toda, Shinsuke Sakai, Yoshinori Shiga, Jinfu Ni, Hisashi Kawai, Keiichi Tokuda, Minoru Tsuzaki, Satoshi Nakamura:
The NICT Entry for the Blizzard Challenge 2009: an Enhanced HMM-based Speech Synthesis System with Trajectory Training considering Global Variance and State-Dependent Mixed Excitation. Blizzard Challenge 2009 - [c44]Shinsuke Sakai, Ranniery Maia, Hisashi Kawai, Satoshi Nakamura:
A close look into the probabilistic concatenation model for corpus-based speech synthesis. INTERSPEECH 2009: 752-755 - [c43]Xin Xu, Masaki Naito, Tsuneo Kato, Hisashi Kawai:
Robust and Fast Lyric Search based on Phonetic Confusion Matrix. ISMIR 2009: 417-422 - [c42]Jinfu Ni, Shinsuke Sakai, Hisashi Kawai, Satoshi Nakamura:
Hyperbolic structure of fundamental frequency contour. IUCS 2009: 389-394 - 2008
- [j5]Junichi Yamagishi, Hisashi Kawai, Takao Kobayashi:
Phone duration modeling using gradient tree boosting. Speech Commun. 50(5): 405-415 (2008) - [j4]Ken Nishihara, Hisashi Kawai, Toshiaki Gomi, Miho Terajima, Yu Chiba:
Investigation of Optimum Electrode Locations by Using an Automatized Surface Electromyography Analysis Technique. IEEE Trans. Biomed. Eng. 55(2): 636-642 (2008) - [c41]Nobuyuki Nishizawa, Hisashi Kawai:
Unit database pruning based on the cost degradation criterion for concatenative speech synthesis. ICASSP 2008: 3969-3972 - 2007
- [c40]Jinfu Ni, Toshio Hirai, Hisashi Kawai, Tomoki Toda, Keiichi Tokuda, Minoru Tsuzaki, Shinsuke Sakai, Ranniery Maia, Satoshi Nakamura:
ATRECSS - ATR English speech corpus for speech synthesis. Blizzard Challenge 2007 - [c39]Hao Yuan, Hisashi Kawai, Toshiharu Horiuchi:
Reduction of correlation computation in the permutation of the frequency domain ICA by selecting DOAs estimated in subarrays. EUSIPCO 2007: 418-422 - [c38]Nobuyuki Nishizawa, Hisashi Kawai:
A preselection method based on cost degradation from the optimal sequence for concatenative speech synthesis. INTERSPEECH 2007: 2869-2872 - [c37]Shinsuke Sakai, Jinfu Ni, Ranniery Maia, Keiichi Tokuda, Minoru Tsuzaki, Tomoki Toda, Hisashi Kawai, Satoshi Nakamura:
Communicative speech synthesis with XIMERA: a first step. SSW 2007: 28-33 - 2006
- [j3]Tomoki Toda, Hisashi Kawai, Minoru Tsuzaki, Kiyohiro Shikano:
An evaluation of cost functions sensitively capturing local degradation of naturalness for segment selection in concatenative speech synthesis. Speech Commun. 48(1): 45-56 (2006) - [j2]Satoshi Nakamura, Konstantin Markov, Hiromi Nakaiwa, Gen-ichiro Kikui, Hisashi Kawai, Takatoshi Jitsuhiro, Jinsong Zhang, Hirofumi Yamamoto, Eiichiro Sumita, Seiichi Yamamoto:
The ATR multilingual speech-to-speech translation system. IEEE Trans. Speech Audio Process. 14(2): 365-376 (2006) - [c36]Tomoki Toda, Hisashi Kawai, Toshio Hirai, Jinfu Ni, Nobuyuki Nishizawa, Junichi Yamagishi, Minoru Tsuzaki, Keiichi Tokuda, Satoshi Nakamura:
Developing a Test Bed of English Text-to-Speech System XIMERA for the Blizzard Challenge 2006. Blizzard Challenge 2006 - [c35]Norihiro Fukumoto, Hideaki Yamada, Hisashi Kawai:
Evaluation result of transmission control mechanism for multimedia streams based on the multi-RTCP scheme over multiple IP-based networks. CCNC 2006: 308-313 - [c34]Hao Yuan, Makoto Yamada, Hisashi Kawai:
A DOA estimation method for 3D multiple source signals using independent component analysis. EUSIPCO 2006: 1-5 - [c33]Nobuyuki Nishizawa, Hisashi Kawai:
A Short-Latency Unit Selection Method with Redundant Search for Concatenative Speech Synthesis. ICASSP (1) 2006: 757-760 - [c32]Jinfu Ni, Toshio Hirai, Hisashi Kawai:
Constructing a Phonetic-Rich Speech Corpus While Controlling Time-Dependent Voice Quality Variability for English Speech Synthesis. ICASSP (1) 2006: 881-884 - [c31]Kengo Fujita, Tsuneo Kato, Hisashi Kawai:
Quick individual fitting methods of simplified hearing compensation for elderly people. INTERSPEECH 2006 - [c30]Tsuneo Kato, Hisashi Kawai:
A text-prompted distributed speaker verification system implemented on a cellular phone and a mobile terminal. INTERSPEECH 2006 - 2005
- [j1]Yi-Jian Wu, Hisashi Kawai, Jinfu Ni, Ren-Hua Wang:
Discriminative training and explicit duration modeling for HMM-based automatic segmentation. Speech Commun. 47(4): 397-410 (2005) - [c29]Jinfu Ni, Hisashi Kawai, Keikichi Hirose:
Estimation of intonation variation with constrained tone transformations. INTERSPEECH 2005: 1397-1400 - [c28]Makoto Yamada, Tsuneo Kato, Masaki Naito, Hisashi Kawai:
Improvement of rejection performance of keyword spotting using anti-keywords derived from large vocabulary considering acoustical similarity to keywords. INTERSPEECH 2005: 1445-1448 - [c27]Toshio Hirai, Hisashi Kawai, Minoru Tsuzaki, Nobuyuki Nishizawa:
Analysis of major factors of naturalness degradation in concatenative synthesis. INTERSPEECH 2005: 1925-1928 - [c26]Kengo Fujita, Tsuneo Kato, Hideaki Yamada, Hisashi Kawai:
SNR-dependent background noise compensation of PESQ values for cellular phone speech. INTERSPEECH 2005: 3165 - 2004
- [c25]Yi-Jian Wu, Hisashi Kawai, Jinfu Ni, Ren-Hua Wang:
Minimum segmentation error based discriminative training for speech synthesis application. ICASSP (1) 2004: 629-632 - [c24]Tomoki Toda, Hisashi Kawai, Minoru Tsuzaki:
Optimizing sub-cost functions for segment selection based on perceptual evaluations in concatenative speech synthesis. ICASSP (1) 2004: 657-660 - [c23]Hisashi Kawai, Tomoki Toda:
An evaluation of automatic phone segmentation for concatenative speech synthesis. ICASSP (1) 2004: 677-680 - [c22]Nobuyuki Nishizawa, Hisashi Kawai:
Scaling of waveform segments along the time axis for concatenative speech synthesis. ICASSP (1) 2004: 681-684 - [c21]Jinfu Ni, Hisashi Kawai, Keikichi Hirose:
Formulating contextual tonal variations in Mandarin. INTERSPEECH 2004: 749-752 - [c20]Nobuyuki Nishizawa, Hisashi Kawai:
Using a depth-restricted search to reduce delays in unit selection. INTERSPEECH 2004: 1209-1212 - [c19]Yi-Jian Wu, Hisashi Kawai, Jinfu Ni, Ren-Hua Wang:
A study on automatic detection of Japanese vowel devoicing for speech synthesis. INTERSPEECH 2004: 2721-2724 - [c18]Hisashi Kawai, Tomoki Toda, Jinfu Ni, Minoru Tsuzaki, Keiichi Tokuda:
XIMERA: a new TTS from ATR based on corpus-based technologies. SSW 2004: 179-184 - 2003
- [c17]Jinfu Ni, Hisashi Kawai:
Tone feature extraction through parametric modeling and analysis-by-synthesis-based pattern matching. ICASSP (1) 2003: 72-75 - [c16]Tomoki Toda, Hisashi Kawai, Minoru Tsuzaki, Kiyohiro Shikano:
Segment selection considering local degradation of naturalness in concatenative speech synthesis. ICASSP (1) 2003: 696-699 - [c15]Tomoki Toda, Hisashi Kawai, Minoru Tsuzaki:
Optimizing integrated cost function for segment selection in concatenative speech synthesis based on perceptual evaluations. INTERSPEECH 2003: 297-300 - [c14]Jinfu Ni, Hisashi Kawai:
Tone pattern discrimination combining parametric modeling and maximum likelihood estimation. INTERSPEECH 2003: 465-468 - 2002
- [c13]Tomoki Toda, Hisashi Kawai, Minoru Tsuzaki, Kiyohiro Shikano:
Unit selection algorithm for Japanese speech synthesis based on both phoneme unit and diphone unit. ICASSP 2002: 465-468 - [c12]Minoru Tsuzaki, Hisashi Kawai:
Feature extraction for unit selection in concatenative speech synthesis: comparison between AIM, LPC, and MFCC. INTERSPEECH 2002: 137-140 - [c11]Jinfu Ni, Hisashi Kawai:
Design of a Mandarin sentence set for corpus-based speech synthesis by use of a multi-tier algorithm taking account of the varied prosodic and spectral characteristics. INTERSPEECH 2002: 2361-2364 - [c10]Jinlin Lu, Hisashi Kawai:
Perceptual evaluation of naturalness due to substitution of Chinese syllable for concatenative speech synthesis. INTERSPEECH 2002: 2377-2380 - [c9]Hisashi Kawai, Minoru Tsuzaki:
Acoustic measures vs. phonetic features as predictors of audible discontinuity in concatenative speech synthesis. INTERSPEECH 2002: 2621-2624 - 2000
- [c8]Hisashi Kawai, Seiichi Yamamoto, Norio Higuchi, Tohru Shimizu:
A design method of speech corpus for text-to-speech synthesis taking account of prosody. INTERSPEECH 2000: 420-425
1990 – 1999
- 1998
- [c7]Hisashi Kawai, Norio Higuchi:
Recognition of connected digit speech in Japanese collected over the telephone network. ICSLP 1998 - 1994
- [c6]Hisashi Kawai, Norio Higuchi, Tohru Shimizu, Seiichi Yamamoto:
Development of a text-to-speech system for Japanese based on waveform splicing. ICASSP (1) 1994: 569-572 - 1990
- [c5]Hiroya Fujisaki, Keikichi Hirose, Hisashi Kawai, Yasuharu Asano:
A system for synthesizing Japanese speech from orthographic text. ICASSP 1990: 617-620 - [c4]Tohru Shimizu, Norio Higuchi, Hisashi Kawai, Seiichi Yamamoto:
The linguistic processing module for Japanese text-to-speech system. ICSLP 1990: 321-324 - [c3]Norio Higuchi, Hisashi Kawai, Tohru Shimizu, Seiichi Yamamoto:
Improvement of the synthetic speech quality of the formant-type speech synthesizer and its subjective evaluation. ICSLP 1990: 797-800
1980 – 1989
- 1988
- [c2]Hiroya Fujisaki, Hisashi Kawai:
Realization of linguistic information in the voice fundamental frequency contour of the spoken Japanese. ICASSP 1988: 663-666 - 1986
- [c1]Keikichi Hirose, Hiroya Fujisaki, Hisashi Kawai:
Generation of prosodic symbols for rule-synthesis of connected speech of Japanese. ICASSP 1986: 2415-2418