Shoko Araki
2020 – today
- 2024
- [j38]Tetsuya Ueda, Tomohiro Nakatani, Rintaro Ikeshita, Shoko Araki, Shoji Makino:
DOA-informed switching independent vector extraction and beamforming for speech enhancement in underdetermined situations. EURASIP J. Audio Speech Music. Process. 2024(1): 52 (2024) - [j37]Tetsuya Ueda, Tomohiro Nakatani, Rintaro Ikeshita, Keisuke Kinoshita, Shoko Araki, Shoji Makino:
Blind and Spatially-Regularized Online Joint Optimization of Source Separation, Dereverberation, and Noise Reduction. IEEE ACM Trans. Audio Speech Lang. Process. 32: 1157-1172 (2024) - [j36]Tsubasa Ochiai, Kazuma Iwamoto, Marc Delcroix, Rintaro Ikeshita, Hiroshi Sato, Shoko Araki, Shigeru Katagiri:
Rethinking Processing Distortions: Disentangling the Impact of Speech Enhancement Errors on Speech Recognition Performance. IEEE ACM Trans. Audio Speech Lang. Process. 32: 3589-3602 (2024) - [c149]Rino Kimura, Tomohiro Nakatani, Naoyuki Kamo, Marc Delcroix, Shoko Araki, Tetsuya Ueda, Shoji Makino:
Diffusion Model-Based MIMO Speech Denoising and Dereverberation. ICASSP Workshops 2024: 455-459 - [c148]Junyi Peng, Marc Delcroix, Tsubasa Ochiai, Oldrich Plchot, Takanori Ashihara, Shoko Araki, Jan Cernocký:
Probing Self-Supervised Learning Models With Target Speech Extraction. ICASSP Workshops 2024: 535-539 - [c147]Keigo Wakayama, Tsubasa Ochiai, Marc Delcroix, Masahiro Yasuda, Shoichiro Saito, Shoko Araki, Akira Nakayama:
Online Target Sound Extraction with Knowledge Distillation from Partially Non-Causal Teacher. ICASSP 2024: 561-565 - [c146]Hao Shi, Naoyuki Kamo, Marc Delcroix, Tomohiro Nakatani, Shoko Araki:
Ensemble Inference for Diffusion Model-Based Speech Enhancement. ICASSP Workshops 2024: 735-739 - [c145]Junyi Peng, Marc Delcroix, Tsubasa Ochiai, Oldrich Plchot, Shoko Araki, Jan Cernocký:
Target Speech Extraction with Pre-Trained Self-Supervised Learning Models. ICASSP 2024: 10421-10425 - [c144]Hanako Segawa, Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani, Rintaro Ikeshita, Shoko Araki, Takeshi Yamada, Shoji Makino:
Neural Network-Based Virtual Microphone Estimation with Virtual Microphone and Beamformer-Level Multi-Task Loss. ICASSP 2024: 11021-11025 - [c143]Kazuma Iwamoto, Tsubasa Ochiai, Marc Delcroix, Rintaro Ikeshita, Hiroshi Sato, Shoko Araki, Shigeru Katagiri:
How Does End-To-End Speech Recognition Training Impact Speech Enhancement Artifacts? ICASSP 2024: 11031-11035 - [c142]Tomohiro Nakatani, Naoyuki Kamo, Marc Delcroix, Shoko Araki:
Multi-Stream Diffusion Model for Probabilistic Integration of Model-Based and Data-Driven Speech Enhancement. IWAENC 2024: 65-69 - [c141]Carlos Hernandez-Olivan, Marc Delcroix, Tsubasa Ochiai, Naohiro Tawara, Tomohiro Nakatani, Shoko Araki:
Interaural Time Difference Loss for Binaural Target Sound Extraction. IWAENC 2024: 210-214 - [i32]Marvin Tammen, Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani, Shoko Araki, Simon Doclo:
Array Geometry-Robust Attention-Based Neural Beamformer for Moving Speakers. CoRR abs/2402.03058 (2024) - [i31]Junyi Peng, Marc Delcroix, Tsubasa Ochiai, Oldrich Plchot, Shoko Araki, Jan Cernocký:
Target Speech Extraction with Pre-trained Self-supervised Learning Models. CoRR abs/2402.13199 (2024) - [i30]Junyi Peng, Marc Delcroix, Tsubasa Ochiai, Oldrich Plchot, Takanori Ashihara, Shoko Araki, Jan Cernocký:
Probing Self-supervised Learning Models with Target Speech Extraction. CoRR abs/2402.13200 (2024) - [i29]Tsubasa Ochiai, Kazuma Iwamoto, Marc Delcroix, Rintaro Ikeshita, Hiroshi Sato, Shoko Araki, Shigeru Katagiri:
Rethinking Processing Distortions: Disentangling the Impact of Speech Enhancement Errors on Speech Recognition Performance. CoRR abs/2404.14860 (2024) - [i28]Carlos Hernandez-Olivan, Marc Delcroix, Tsubasa Ochiai, Naohiro Tawara, Tomohiro Nakatani, Shoko Araki:
Interaural time difference loss for binaural target sound extraction. CoRR abs/2408.00344 (2024) - [i27]Carlos Hernandez-Olivan, Marc Delcroix, Tsubasa Ochiai, Daisuke Niizumi, Naohiro Tawara, Tomohiro Nakatani, Shoko Araki:
SoundBeam meets M2D: Target Sound Extraction with Audio Foundation Model. CoRR abs/2409.12528 (2024) - [i26]Alexis Plaquet, Naohiro Tawara, Marc Delcroix, Shota Horiguchi, Atsushi Ando, Shoko Araki:
Mamba-based Segmentation Model for Speaker Diarization. CoRR abs/2410.06459 (2024)
- 2023
- [j35]Marc Delcroix, Jorge Bennasar Vázquez, Tsubasa Ochiai, Keisuke Kinoshita, Yasunori Ohishi, Shoko Araki:
SoundBeam: Target Sound Extraction Conditioned on Sound-Class Labels and Enrollment Clues for Increased Performance and Continuous Learning. IEEE ACM Trans. Audio Speech Lang. Process. 31: 121-136 (2023) - [j34]Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani, Shoko Araki:
Mask-Based Neural Beamforming for Moving Speakers With Self-Attention-Based Tracking. IEEE ACM Trans. Audio Speech Lang. Process. 31: 835-848 (2023) - [c140]Ning Guo, Tomohiro Nakatani, Shoko Araki, Takehiro Moriya:
Modified Parametric Multichannel Wiener Filter for Low-latency Enhancement of Speech Mixtures with Unknown Number of Speakers. APSIPA ASC 2023: 1042-1049 - [c139]Tetsuya Ueda, Tomohiro Nakatani, Rintaro Ikeshita, Shoko Araki, Shoji Makino:
Spatially-Regularized Switching Independent Vector Analysis. APSIPA ASC 2023: 2024-2030 - [c138]Taishi Nakashima, Rintaro Ikeshita, Nobutaka Ono, Shoko Araki, Tomohiro Nakatani:
Fast Online Source Steering Algorithm for Tracking Single Moving Source Using Online Independent Vector Analysis. ICASSP 2023: 1-5 - [c137]Shoko Araki, Ayako Yamamoto, Tsubasa Ochiai, Kenichi Arai, Atsunori Ogawa, Tomohiro Nakatani, Toshio Irino:
Impact of Residual Noise and Artifacts in Speech Enhancement Errors on Intelligibility of Human and Machine. INTERSPEECH 2023: 2503-2507 - [c136]Marc Delcroix, Naohiro Tawara, Mireia Díez, Federico Landini, Anna Silnova, Atsunori Ogawa, Tomohiro Nakatani, Lukás Burget, Shoko Araki:
Multi-Stream Extension of Variational Bayesian HMM Clustering (MS-VBx) for Combined End-to-End and Vector Clustering-based Diarization. INTERSPEECH 2023: 3477-3481 - [i25]Marc Delcroix, Naohiro Tawara, Mireia Díez, Federico Landini, Anna Silnova, Atsunori Ogawa, Tomohiro Nakatani, Lukás Burget, Shoko Araki:
Multi-Stream Extension of Variational Bayesian HMM Clustering (MS-VBx) for Combined End-to-End and Vector Clustering-based Diarization. CoRR abs/2305.13580 (2023) - [i24]Ning Guo, Tomohiro Nakatani, Shoko Araki, Takehiro Moriya:
Modified Parametric Multichannel Wiener Filter for Low-latency Enhancement of Speech Mixtures with Unknown Number of Speakers. CoRR abs/2306.17317 (2023) - [i23]Atsunori Ogawa, Naohiro Tawara, Marc Delcroix, Shoko Araki:
Lattice Rescoring Based on Large Ensemble of Complementary Neural Language Models. CoRR abs/2312.12764 (2023)
- 2022
- [j33]Tomohiro Nakatani, Rintaro Ikeshita, Keisuke Kinoshita, Hiroshi Sawada, Naoyuki Kamo, Shoko Araki:
Switching Independent Vector Analysis and its Extension to Blind and Spatially Guided Convolutional Beamforming Algorithms. IEEE ACM Trans. Audio Speech Lang. Process. 30: 1032-1047 (2022) - [c135]Atsunori Ogawa, Naohiro Tawara, Marc Delcroix, Shoko Araki:
Lattice Rescoring Based on Large Ensemble of Complementary Neural Language Models. ICASSP 2022: 6517-6521 - [c134]Kazuma Iwamoto, Tsubasa Ochiai, Marc Delcroix, Rintaro Ikeshita, Hiroshi Sato, Shoko Araki, Shigeru Katagiri:
How bad are artifacts?: Analyzing the impact of speech enhancement errors on ASR. INTERSPEECH 2022: 5418-5422 - [c133]Yasunori Ohishi, Marc Delcroix, Tsubasa Ochiai, Shoko Araki, Daiki Takeuchi, Daisuke Niizumi, Akisato Kimura, Noboru Harada, Kunio Kashino:
ConceptBeam: Concept Driven Target Speech Extraction. ACM Multimedia 2022: 4252-4260 - [i22]Kazuma Iwamoto, Tsubasa Ochiai, Marc Delcroix, Rintaro Ikeshita, Hiroshi Sato, Shoko Araki, Shigeru Katagiri:
How Bad Are Artifacts?: Analyzing the Impact of Speech Enhancement Errors on ASR. CoRR abs/2201.06685 (2022) - [i21]Ayako Yamamoto, Toshio Irino, Shoko Araki, Kenichi Arai, Atsunori Ogawa, Keisuke Kinoshita, Tomohiro Nakatani:
Subjective intelligibility of speech sounds enhanced by ideal ratio mask via crowdsourced remote experiments with effective data screening. CoRR abs/2203.16760 (2022) - [i20]Marc Delcroix, Jorge Bennasar Vázquez, Tsubasa Ochiai, Keisuke Kinoshita, Yasunori Ohishi, Shoko Araki:
SoundBeam: Target sound extraction conditioned on sound-class labels and enrollment clues for increased performance and continuous learning. CoRR abs/2204.03895 (2022) - [i19]Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani, Shoko Araki:
Mask-based Neural Beamforming for Moving Speakers with Self-Attention-based Tracking. CoRR abs/2205.03568 (2022) - [i18]Yasunori Ohishi, Marc Delcroix, Tsubasa Ochiai, Shoko Araki, Daiki Takeuchi, Daisuke Niizumi, Akisato Kimura, Noboru Harada, Kunio Kashino:
ConceptBeam: Concept Driven Target Speech Extraction. CoRR abs/2207.11964 (2022)
- 2021
- [j32]Rintaro Ikeshita, Tomohiro Nakatani, Shoko Araki:
Block Coordinate Descent Algorithms for Auxiliary-Function-Based Independent Vector Extraction. IEEE Trans. Signal Process. 69: 3252-3267 (2021) - [c132]Tomohiro Nakatani, Rintaro Ikeshita, Naoyuki Kamo, Keisuke Kinoshita, Shoko Araki, Hiroshi Sawada:
Switching Convolutional Beamformer. EUSIPCO 2021: 266-270 - [c131]Tetsuya Ueda, Tomohiro Nakatani, Rintaro Ikeshita, Keisuke Kinoshita, Shoko Araki, Shoji Makino:
Low Latency Online Source Separation and Noise Reduction Based on Joint Optimization with Dereverberation. EUSIPCO 2021: 1000-1004 - [c130]Tetsuya Ueda, Tomohiro Nakatani, Rintaro Ikeshita, Keisuke Kinoshita, Shoko Araki, Shoji Makino:
Low Latency Online Blind Source Separation Based on Joint Optimization with Blind Dereverberation. ICASSP 2021: 506-510 - [c129]Julio Wissing, Benedikt T. Boenninghoff, Dorothea Kolossa, Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Tomohiro Nakatani, Shoko Araki, Christopher Schymura:
Data Fusion for Audiovisual Speaker Localization: Extending Dynamic Stream Weights to the Spatial Domain. ICASSP 2021: 4705-4709 - [c128]Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani, Rintaro Ikeshita, Keisuke Kinoshita, Shoko Araki:
Neural Network-Based Virtual Microphone Estimator. ICASSP 2021: 6114-6118 - [c127]Tomohiro Nakatani, Rintaro Ikeshita, Keisuke Kinoshita, Hiroshi Sawada, Shoko Araki:
Blind and Neural Network-Guided Convolutional Beamformer for Joint Denoising, Dereverberation, and Source Separation. ICASSP 2021: 6129-6133 - [c126]Ayako Yamamoto, Toshio Irino, Kenichi Arai, Shoko Araki, Atsunori Ogawa, Keisuke Kinoshita, Tomohiro Nakatani:
Comparison of Remote Experiments Using Crowdsourcing and Laboratory Experiments on Speech Intelligibility. Interspeech 2021: 181-185 - [c125]Christopher Schymura, Benedikt T. Bönninghoff, Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Tomohiro Nakatani, Shoko Araki, Dorothea Kolossa:
PILOT: Introducing Transformers for Probabilistic Sound Event Localization. Interspeech 2021: 2117-2121 - [c124]Marc Delcroix, Jorge Bennasar Vázquez, Tsubasa Ochiai, Keisuke Kinoshita, Shoko Araki:
Few-Shot Learning of New Sound Classes for Target Sound Extraction. Interspeech 2021: 3500-3504 - [c123]Hiroshi Sato, Tsubasa Ochiai, Keisuke Kinoshita, Marc Delcroix, Tomohiro Nakatani, Shoko Araki:
Multimodal Attention Fusion for Target Speaker Extraction. SLT 2021: 778-784 - [i17]Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani, Rintaro Ikeshita, Keisuke Kinoshita, Shoko Araki:
Neural Network-based Virtual Microphone Estimator. CoRR abs/2101.04315 (2021) - [i16]Hiroshi Sato, Tsubasa Ochiai, Keisuke Kinoshita, Marc Delcroix, Tomohiro Nakatani, Shoko Araki:
Multimodal Attention Fusion for Target Speaker Extraction. CoRR abs/2102.01326 (2021) - [i15]Julio Wissing, Benedikt T. Boenninghoff, Dorothea Kolossa, Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Tomohiro Nakatani, Shoko Araki, Christopher Schymura:
Data Fusion for Audiovisual Speaker Localization: Extending Dynamic Stream Weights to the Spatial Domain. CoRR abs/2102.11588 (2021) - [i14]Christopher Schymura, Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Tomohiro Nakatani, Shoko Araki, Dorothea Kolossa:
Exploiting Attention-based Sequence-to-Sequence Architectures for Sound Event Localization. CoRR abs/2103.00417 (2021) - [i13]Ayako Yamamoto, Toshio Irino, Kenichi Arai, Shoko Araki, Atsunori Ogawa, Keisuke Kinoshita, Tomohiro Nakatani:
Comparison of remote experiments using crowdsourcing and laboratory experiments on speech intelligibility. CoRR abs/2104.10001 (2021) - [i12]Christopher Schymura, Benedikt T. Bönninghoff, Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Tomohiro Nakatani, Shoko Araki, Dorothea Kolossa:
PILOT: Introducing Transformers for Probabilistic Sound Event Localization. CoRR abs/2106.03903 (2021) - [i11]Marc Delcroix, Jorge Bennasar Vázquez, Tsubasa Ochiai, Keisuke Kinoshita, Shoko Araki:
Few-shot learning of new sound classes for target sound extraction. CoRR abs/2106.07144 (2021) - [i10]Tomohiro Nakatani, Rintaro Ikeshita, Keisuke Kinoshita, Hiroshi Sawada, Shoko Araki:
Blind and neural network-guided convolutional beamformer for joint denoising, dereverberation, and source separation. CoRR abs/2108.01836 (2021) - [i9]Tomohiro Nakatani, Rintaro Ikeshita, Keisuke Kinoshita, Hiroshi Sawada, Naoyuki Kamo, Shoko Araki:
Switching Independent Vector Analysis and Its Extension to Blind and Spatially Guided Convolutional Beamforming Algorithm. CoRR abs/2111.10574 (2021)
- 2020
- [j31]Katsuhiko Yamamoto, Toshio Irino, Shoko Araki, Keisuke Kinoshita, Tomohiro Nakatani:
GEDI: Gammachirp envelope distortion index for predicting intelligibility of enhanced speech. Speech Commun. 123: 43-58 (2020) - [j30]Satoru Emura, Hiroshi Sawada, Shoko Araki, Noboru Harada:
Multi-Delay Sparse Approach to Residual Crosstalk Reduction for Blind Source Separation. IEEE Signal Process. Lett. 27: 1630-1634 (2020) - [c122]Christopher Schymura, Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Tomohiro Nakatani, Shoko Araki, Dorothea Kolossa:
Exploiting Attention-based Sequence-to-Sequence Architectures for Sound Event Localization. EUSIPCO 2020: 231-235 - [c121]Satoru Emura, Hiroshi Sawada, Shoko Araki, Noboru Harada:
A Frequency-Domain BSS Method Based on ℓ1 Norm, Unitary Constraint, and Cayley Transform. ICASSP 2020: 111-115 - [c120]Keisuke Kinoshita, Marc Delcroix, Shoko Araki, Tomohiro Nakatani:
Tackling Real Noisy Reverberant Meetings with All-Neural Source Separation, Counting, and Diarization System. ICASSP 2020: 381-385 - [c119]Christopher Schymura, Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Tomohiro Nakatani, Shoko Araki, Dorothea Kolossa:
A Dynamic Stream Weight Backprop Kalman Filter for Audiovisual Speaker Tracking. ICASSP 2020: 581-585 - [c118]Rintaro Ikeshita, Tomohiro Nakatani, Shoko Araki:
Overdetermined Independent Vector Analysis. ICASSP 2020: 591-595 - [c117]Marc Delcroix, Tsubasa Ochiai, Katerina Zmolíková, Keisuke Kinoshita, Naohiro Tawara, Tomohiro Nakatani, Shoko Araki:
Improving Speaker Discrimination of Target Speech Extraction With Time-Domain Speakerbeam. ICASSP 2020: 691-695 - [c116]Tsubasa Ochiai, Marc Delcroix, Rintaro Ikeshita, Keisuke Kinoshita, Tomohiro Nakatani, Shoko Araki:
Beam-TasNet: Time-domain Audio Separation Network Meets Frequency-domain Beamformer. ICASSP 2020: 6384-6388 - [c115]Tomohiro Nakatani, Riki Takahashi, Tsubasa Ochiai, Keisuke Kinoshita, Rintaro Ikeshita, Marc Delcroix, Shoko Araki:
DNN-supported Mask-based Convolutional Beamforming for Simultaneous Denoising, Dereverberation, and Source Separation. ICASSP 2020: 6399-6403 - [c114]Tomohiro Nakatani, Rintaro Ikeshita, Keisuke Kinoshita, Hiroshi Sawada, Shoko Araki:
Computationally Efficient and Versatile Framework for Joint Optimization of Blind Speech Separation and Dereverberation. INTERSPEECH 2020: 91-95 - [c113]Kenichi Arai, Shoko Araki, Atsunori Ogawa, Keisuke Kinoshita, Tomohiro Nakatani, Toshio Irino:
Predicting Intelligibility of Enhanced Speech Using Posteriors Derived from DNN-Based ASR System. INTERSPEECH 2020: 1156-1160 - [c112]Tsubasa Ochiai, Marc Delcroix, Yuma Koizumi, Hiroaki Ito, Keisuke Kinoshita, Shoko Araki:
Listen to What You Want: Neural Network-Based Universal Sound Selector. INTERSPEECH 2020: 1441-1445 - [c111]Ali Aroudi, Marc Delcroix, Tomohiro Nakatani, Keisuke Kinoshita, Shoko Araki, Simon Doclo:
Cognitive-Driven Convolutional Beamforming Using EEG-Based Auditory Attention Decoding. MLSP 2020: 1-6 - [i8]Marc Delcroix, Tsubasa Ochiai, Katerina Zmolíková, Keisuke Kinoshita, Naohiro Tawara, Tomohiro Nakatani, Shoko Araki:
Improving speaker discrimination of target speech extraction with time-domain SpeakerBeam. CoRR abs/2001.08378 (2020) - [i7]Rintaro Ikeshita, Tomohiro Nakatani, Shoko Araki:
Overdetermined independent vector analysis. CoRR abs/2003.02458 (2020) - [i6]Keisuke Kinoshita, Marc Delcroix, Shoko Araki, Tomohiro Nakatani:
Tackling real noisy reverberant meetings with all-neural source separation, counting, and diarization system. CoRR abs/2003.03987 (2020) - [i5]Ali Aroudi, Marc Delcroix, Tomohiro Nakatani, Keisuke Kinoshita, Shoko Araki, Simon Doclo:
Cognitive-driven convolutional beamforming using EEG-based auditory attention decoding. CoRR abs/2005.04669 (2020) - [i4]Tsubasa Ochiai, Marc Delcroix, Yuma Koizumi, Hiroaki Ito, Keisuke Kinoshita, Shoko Araki:
Listen to What You Want: Neural Network-based Universal Sound Selector. CoRR abs/2006.05712 (2020)
2010 – 2019
- 2019
- [j29]Shinji Watanabe, Shoko Araki, Michiel Bacchiani, Reinhold Haeb-Umbach, Michael L. Seltzer:
Introduction to the Issue on Far-Field Speech Processing in the Era of Deep Learning: Speech Enhancement, Separation, and Recognition. IEEE J. Sel. Top. Signal Process. 13(4): 785-786 (2019) - [c110]Shoko Araki, Nobutaka Ono, Keisuke Kinoshita, Marc Delcroix:
Projection Back onto Filtered Observations for Speech Separation with Distributed Microphone Array. CAMSAP 2019: 291-295 - [c109]Thilo von Neumann, Keisuke Kinoshita, Marc Delcroix, Shoko Araki, Tomohiro Nakatani, Reinhold Haeb-Umbach:
All-neural Online Source Separation, Counting, and Diarization for Meeting Analysis. ICASSP 2019: 91-95 - [c108]Shoko Araki, Nobutaka Ono, Keisuke Kinoshita, Marc Delcroix:
Estimation of Sampling Frequency Mismatch between Distributed Asynchronous Microphones under Existence of Source Movements with Stationary Time Periods Detection. ICASSP 2019: 785-789 - [c107]Yuki Kubo, Tomohiro Nakatani, Marc Delcroix, Keisuke Kinoshita, Shoko Araki:
Mask-based MVDR Beamformer for Noisy Multisource Environments: Introduction of Time-varying Spatial Covariance Model. ICASSP 2019: 6855-6859 - [c106]Marc Delcroix, Katerina Zmolíková, Tsubasa Ochiai, Keisuke Kinoshita, Shoko Araki, Tomohiro Nakatani:
Compact Network for Speakerbeam Target Speaker Extraction. ICASSP 2019: 6965-6969 - [c105]Kenichi Arai, Shoko Araki, Atsunori Ogawa, Keisuke Kinoshita, Tomohiro Nakatani, Katsuhiko Yamamoto, Toshio Irino:
Predicting Speech Intelligibility of Enhanced Speech Using Phone Accuracy of DNN-Based ASR System. INTERSPEECH 2019: 4275-4279 - [c104]Tomohiro Nakatani, Keisuke Kinoshita, Rintaro Ikeshita, Hiroshi Sawada, Shoko Araki:
Simultaneous Denoising, Dereverberation, and Source Separation Using a Unified Convolutional Beamformer. WASPAA 2019: 224-228 - [i3]Thilo von Neumann, Keisuke Kinoshita, Marc Delcroix, Shoko Araki, Tomohiro Nakatani, Reinhold Haeb-Umbach:
All-neural online source separation, counting, and diarization for meeting analysis. CoRR abs/1902.07881 (2019) - [i2]Katsuhiko Yamamoto, Toshio Irino, Shoko Araki, Keisuke Kinoshita, Tomohiro Nakatani:
GEDI: Gammachirp Envelope Distortion Index for Predicting Intelligibility of Enhanced Speech. CoRR abs/1904.02096 (2019)
- 2018
- [j28]Satoru Emura, Shoko Araki, Tomohiro Nakatani, Noboru Harada:
Distortionless Beamforming Optimized With ℓ1-Norm Minimization. IEEE Signal Process. Lett. 25(7): 936-940 (2018) - [c103]Nobutaka Ito, Christopher Schymura, Shoko Araki, Tomohiro Nakatani:
Noisy cGMM: Complex Gaussian Mixture Model with Non-Sparse Noise Model for Joint Source Separation and Denoising. EUSIPCO 2018: 1662-1666 - [c102]Nobutaka Ito, Shoko Araki, Tomohiro Nakatani:
FastFCA: Joint Diagonalization Based Acceleration of Audio Source Separation Using a Full-Rank Spatial Covariance Model. EUSIPCO 2018: 1667-1671 - [c101]Juan Azcarreta, Nobutaka Ito, Shoko Araki, Tomohiro Nakatani:
Permutation-Free cGMM: Complex Gaussian Mixture Model with Inverse Wishart Mixture Model Based Spatial Prior for Permutation-Free Source Separation and Source Counting. ICASSP 2018: 51-55 - [c100]Nobutaka Ito, Takashi Makino, Shoko Araki, Tomohiro Nakatani:
Maximum-Likelihood Online Speaker Diarization in Noisy Meetings Based on Categorical Mixture Model and Probabilistic Spatial Dictionary. ICASSP 2018: 546-550 - [c99]Shoko Araki, Nobutaka Ono, Keisuke Kinoshita, Marc Delcroix:
Meeting Recognition with Asynchronous Distributed Microphone Array Using Block-Wise Refinement of Mask-Based MVDR Beamformer. ICASSP 2018: 5694-5698 - [c98]Katsuhiko Yamamoto, Toshio Irino, Narumi Ohashi, Shoko Araki, Keisuke Kinoshita, Tomohiro Nakatani:
Multi-resolution Gammachirp Envelope Distortion Index for Intelligibility Prediction of Noisy Speech. INTERSPEECH 2018: 1863-1867 - [c97]Yutaro Matsui, Tomohiro Nakatani, Marc Delcroix, Keisuke Kinoshita, Nobutaka Ito, Shoko Araki, Shoji Makino:
Online Integration of DNN-Based and Spatial Clustering-Based Mask Estimation for Robust MVDR Beamforming. IWAENC 2018: 71-75 - [c96]Shoko Araki, Nobutaka Ono, Keisuke Kinoshita, Marc Delcroix:
Comparison of Reference Microphone Selection Algorithms for Distributed Microphone Array Based Speech Enhancement in Meeting Recognition Scenarios. IWAENC 2018: 316-320 - [i1]Nobutaka Ito, Shoko Araki, Tomohiro Nakatani:
FastFCA: A Joint Diagonalization Based Fast Algorithm for Audio Source Separation Using A Full-Rank Spatial Covariance Model. CoRR abs/1805.06572 (2018)
- 2017
- [j27]Tomoko Kawase, Kenta Niwa, Masakiyo Fujimoto, Kazunori Kobayashi, Shoko Araki, Tomohiro Nakatani:
Integration of Spatial Cue-Based Noise Reduction and Speech Model-Based Source Restoration for Real Time Speech Enhancement. IEICE Trans. Fundam. Electron. Commun. Comput. Sci. 100-A(5): 1127-1136 (2017) - [j26]Takuya Higuchi, Nobutaka Ito, Shoko Araki, Takuya Yoshioka, Marc Delcroix, Tomohiro Nakatani:
Online MVDR Beamformer Based on Complex Gaussian Mixture Model With Spatial Prior for Noise Robust ASR. IEEE ACM Trans. Audio Speech Lang. Process. 25(4): 780-793 (2017) - [c95]Shoko Araki, Nobutaka Ono, Keisuke Kinoshita, Marc Delcroix:
Meeting recognition with asynchronous distributed microphone array. ASRU 2017: 32-39 - [c94]Nobutaka Ito, Shoko Araki, Tomohiro Nakatani:
Data-driven and physical model-based designs of probabilistic spatial dictionary for online meeting diarization and adaptive beamforming. EUSIPCO 2017: 1165-1169 - [c93]Shoko Araki, Nobutaka Ito, Marc Delcroix, Atsunori Ogawa, Keisuke Kinoshita, Takuya Higuchi, Takuya Yoshioka, Dung T. Tran, Shigeki Karita, Tomohiro Nakatani:
Online meeting recognition in noisy environments with time-frequency mask based MVDR beamforming. HSCMA 2017: 16-20 - [c92]Tomohiro Nakatani, Nobutaka Ito, Takuya Higuchi, Shoko Araki, Keisuke Kinoshita:
Integrating DNN-based and spatial clustering-based mask estimation for robust MVDR beamforming. ICASSP 2017: 286-290 - [c91]Nobutaka Ito, Shoko Araki, Marc Delcroix, Tomohiro Nakatani:
Probabilistic spatial dictionary based online adaptive beamforming for meeting recognition in noisy and reverberant environments. ICASSP 2017: 681-685 - [c90]Katsuhiko Yamamoto, Toshio Irino, Toshie Matsui, Shoko Araki, Keisuke Kinoshita, Tomohiro Nakatani:
Predicting Speech Intelligibility Using a Gammachirp Envelope Distortion Index Based on the Signal-to-Distortion Ratio. INTERSPEECH 2017: 2949-2953 - [p3]Marc Delcroix, Takuya Yoshioka, Nobutaka Ito, Atsunori Ogawa, Keisuke Kinoshita, Masakiyo Fujimoto, Takuya Higuchi, Shoko Araki, Tomohiro Nakatani:
Multichannel Speech Enhancement Approaches to DNN-Based Far-Field Speech Recognition. New Era for Robust Speech Recognition, Exploiting Deep Learning 2017: 21-49
- 2016
- [c89]Nobutaka Ito, Shoko Araki, Tomohiro Nakatani:
Complex angular central Gaussian mixture model for directional statistics in mask-based microphone array signal processing. EUSIPCO 2016: 1153-1157 - [c88]Naoki Murata, Hirokazu Kameoka, Keisuke Kinoshita, Shoko Araki, Tomohiro Nakatani, Shoichi Koyama, Hiroshi Saruwatari:
Reverberation-robust underdetermined source separation with non-negative tensor double deconvolution. EUSIPCO 2016: 1648-1652 - [c87]Shoko Araki, Masahiro Okada, Takuya Higuchi, Atsunori Ogawa, Tomohiro Nakatani:
Spatial correlation model based observation vector clustering and MVDR beamforming for meeting recognition. ICASSP 2016: 385-389 - [c86]Nobutaka Ito, Shoko Araki, Tomohiro Nakatani:
Modeling audio directional statistics using a complex bingham mixture model for blind source extraction from diffuse noise. ICASSP 2016: 465-468 - [c85]Tomoko Kawase, Kenta Niwa, Masakiyo Fujimoto, Noriyoshi Kamado, Kazunori Kobayashi, Shoko Araki, Tomohiro Nakatani:
Real-time integration of statistical model-based speech enhancement with unsupervised noise PSD estimation using microphone array. ICASSP 2016: 604-608 - [c84]Hendrik Meutzner, Shoko Araki, Masakiyo Fujimoto, Tomohiro Nakatani:
A generative-discriminative hybrid approach to multi-channel noise reduction for robust automatic speech recognition. ICASSP 2016: 5740-5744 - [c83]Katsuhiko Yamamoto, Toshio Irino, Toshie Matsui, Shoko Araki, Keisuke Kinoshita, Tomohiro Nakatani:
Speech Intelligibility Prediction Based on the Envelope Power Spectrum Model with the Dynamic Compressive Gammachirp Auditory Filterbank. INTERSPEECH 2016: 2885-2889 - [c82]Mahmoud Fakhry, Nobutaka Ito, Shoko Araki, Tomohiro Nakatani:
Modeling audio directional statistics using a probabilistic spatial dictionary for speaker diarization in real meetings. IWAENC 2016: 1-5
- 2015
- [j25]Marc Delcroix, Takuya Yoshioka, Atsunori Ogawa, Yotaro Kubo, Masakiyo Fujimoto, Nobutaka Ito, Keisuke Kinoshita, Miquel Espi, Shoko Araki, Takaaki Hori, Tomohiro Nakatani:
Strategies for distant speech recognition in reverberant environments. EURASIP J. Adv. Signal Process. 2015: 60 (2015) - [j24]Nobutaka Ito, Emmanuel Vincent, Tomohiro Nakatani, Nobutaka Ono, Shoko Araki, Shigeki Sagayama:
Blind Suppression of Nonstationary Diffuse Acoustic Noise Based on Spatial Covariance Matrix Decomposition. J. Signal Process. Syst. 79(2): 145-157 (2015) - [c81]Takuya Yoshioka, Nobutaka Ito, Marc Delcroix, Atsunori Ogawa, Keisuke Kinoshita, Masakiyo Fujimoto, Chengzhu Yu, Wojciech J. Fabian, Miquel Espi, Takuya Higuchi, Shoko Araki, Tomohiro Nakatani:
The NTT CHiME-3 system: Advances in speech enhancement and recognition for mobile multi-microphone devices. ASRU 2015: 436-443 - [c80]Nobutaka Ito, Shoko Araki, Tomohiro Nakatani:
Permutation-free clustering of relative transfer function features for blind source separation. EUSIPCO 2015: 409-413 - [c79]Shoko Araki, Tomoki Hayashi, Marc Delcroix, Masakiyo Fujimoto, Kazuya Takeda, Tomohiro Nakatani:
Exploring multi-channel features for denoising-autoencoder-based speech enhancement. ICASSP 2015: 116-120
- 2014
- [c78]Marc Delcroix, Takuya Yoshioka, Atsunori Ogawa, Yotaro Kubo, Masakiyo Fujimoto, Nobutaka Ito, Keisuke Kinoshita, Miquel Espi, Shoko Araki, Takaaki Hori, Tomohiro Nakatani:
Defeating reverberation: Advanced dereverberation and recognition techniques for hands-free speech recognition. GlobalSIP 2014: 522-526 - [c77]Nobutaka Ito, Shoko Araki, Tomohiro Nakatani:
Probabilistic integration of diffuse noise suppression and dereverberation. ICASSP 2014: 5167-5171 - [c76]Nobutaka Ito, Shoko Araki, Takuya Yoshioka, Tomohiro Nakatani:
Relaxed disjointness based clustering for joint blind source separation and dereverberation. IWAENC 2014: 268-272
- 2013
- [j23]Marc Delcroix, Keisuke Kinoshita, Tomohiro Nakatani, Shoko Araki, Atsunori Ogawa, Takaaki Hori, Shinji Watanabe, Masakiyo Fujimoto, Takuya Yoshioka, Takanobu Oba, Yotaro Kubo, Mehrez Souden, Seong-Jun Hahm, Atsushi Nakamura:
Speech recognition in living rooms: Integrated speech enhancement and recognition system based on spatial, spectral and temporal modeling of sounds. Comput. Speech Lang. 27(3): 851-873 (2013) - [j22]Hiroshi Sawada, Hirokazu Kameoka, Shoko Araki, Naonori Ueda:
Multichannel Extensions of Non-Negative Matrix Factorization With Complex-Valued Data. IEEE Trans. Speech Audio Process. 21(5): 971-982 (2013) - [j21]Mehrez Souden, Shoko Araki, Keisuke Kinoshita, Tomohiro Nakatani, Hiroshi Sawada:
A Multichannel MMSE-Based Framework for Speech Source Separation and Noise Reduction. IEEE Trans. Speech Audio Process. 21(9): 1913-1928 (2013) - [j20]Tomohiro Nakatani, Shoko Araki, Takuya Yoshioka, Marc Delcroix, Masakiyo Fujimoto:
Dominance Based Integration of Spatial and Spectral Features for Speech Enhancement. IEEE ACM Trans. Audio Speech Lang. Process. 21(12): 2516-2531 (2013) - [c75]Nobutaka Ito, Shoko Araki, Tomohiro Nakatani:
Permutation-free convolutive blind source separation via full-band clustering based on frequency-independent source presence priors. ICASSP 2013: 3238-3242 - [c74]Tomohiro Nakatani, Mehrez Souden, Shoko Araki, Takuya Yoshioka, Takaaki Hori, Atsunori Ogawa:
Coupling beamforming with spatial and spectral feature based spectral enhancement and its application to meeting recognition. ICASSP 2013: 7249-7253 - [c73]Ingrid Jafari, Nobutaka Ito, Mehrez Souden, Shoko Araki, Tomohiro Nakatani:
Source number estimation based on clustering of speech activity sequences for microphone array processing. MLSP 2013: 1-6
- 2012
- [j19]Emmanuel Vincent, Shoko Araki, Fabian J. Theis, Guido Nolte, Pau Bofill, Hiroshi Sawada, Alexey Ozerov, Vikrham Gowreesunker, Dominik Lutter, Ngoc Q. K. Duong:
The signal separation evaluation campaign (2007-2010): Achievements and remaining challenges. Signal Process. 92(8): 1928-1936 (2012) - [j18]Katsuhiko Ishiguro, Takeshi Yamada, Shoko Araki, Tomohiro Nakatani, Hiroshi Sawada:
Probabilistic Speaker Diarization With Bag-of-Words Representations of Speaker Angle Information. IEEE Trans. Speech Audio Process. 20(2): 447-460 (2012) - [j17]Takaaki Hori, Shoko Araki, Takuya Yoshioka, Masakiyo Fujimoto, Shinji Watanabe, Takanobu Oba, Atsunori Ogawa, Kazuhiro Otsuka, Dan Mikami, Keisuke Kinoshita, Tomohiro Nakatani, Atsushi Nakamura, Junji Yamato:
Low-Latency Real-Time Meeting Recognition and Understanding Using Distant Microphones and Omni-Directional Camera. IEEE Trans. Speech Audio Process. 20(2): 499-513 (2012) - [c72]Takuro Maruyama, Shoko Araki, Tomohiro Nakatani, Shigeki Miyabe, Takeshi Yamada, Shoji Makino, Atsushi Nakamura:
New analytical calculation and estimation of TDOA for underdetermined BSS in noisy environments. APSIPA 2012: 1-6 - [c71]Shoko Araki, Francesco Nesta, Emmanuel Vincent, Zbynek Koldovský, Guido Nolte, Andreas Ziehe, Alexis Benichoux:
The 2011 Signal Separation Evaluation Campaign (SiSEC2011): - Audio Source Separation -. LVA/ICA 2012: 414-422 - [c70]Guido Nolte, Dominik Lutter, Andreas Ziehe, Francesco Nesta, Emmanuel Vincent, Zbynek Koldovský, Alexis Benichoux, Shoko Araki:
The 2011 Signal Separation Evaluation Campaign (SiSEC2011): - Biomedical Data Analysis -. LVA/ICA 2012: 423-429 - [c69]Mehrez Souden, Shoko Araki, Keisuke Kinoshita, Tomohiro Nakatani, Hiroshi Sawada:
A multichannel MMSE-based framework for joint blind source separation and noise reduction. ICASSP 2012: 109-112 - [c68]Hiroshi Sawada, Hirokazu Kameoka, Shoko Araki, Naonori Ueda:
Efficient algorithms for multichannel extensions of Itakura-Saito nonnegative matrix factorization. ICASSP 2012: 261-264 - [c67]Shoko Araki, Tomohiro Nakatani:
Sparse vector factorization for underdetermined BSS using wrapped-phase GMM and source log-spectral prior. ICASSP 2012: 265-268 - [c66]Takuro Maruyama, Shoko Araki, Tomohiro Nakatani, Shigeki Miyabe, Takeshi Yamada, Shoji Makino, Atsushi Nakamura:
New analytical update rule for TDOA inference for underdetermined BSS in noisy environments. ICASSP 2012: 269-272 - [c65]Tomohiro Nakatani, Takuya Yoshioka, Shoko Araki, Marc Delcroix, Masakiyo Fujimoto:
LogMax observation model with MFCC-based spectral prior for reduction of highly nonstationary ambient noise. ICASSP 2012: 4029-4032
- 2011
- [j16]Hiroshi Sawada, Shoko Araki, Shoji Makino:
Underdetermined Convolutive Blind Source Separation via Frequency Bin-Wise Clustering and Permutation Alignment. IEEE Trans. Speech Audio Process. 19(3): 516-527 (2011) - [j15]Shoko Araki, Hiroshi Sawada, Ryo Mukai, Shoji Makino:
DOA Estimation for Multiple Sparse Sources with Arbitrarily Arranged Multiple Sensors. J. Signal Process. Syst. 63(3): 265-275 (2011) - [c64]Shoko Araki, Tomohiro Nakatani:
Hybrid approach for multichannel source separation combining time-frequency mask with multi-channel Wiener filter. ICASSP 2011: 225-228 - [c63]Hiroshi Sawada, Hirokazu Kameoka, Shoko Araki, Naonori Ueda:
Formulations and algorithms for multichannel complex NMF. ICASSP 2011: 229-232 - [c62]Tomohiro Nakatani, Shoko Araki, Takuya Yoshioka, Masakiyo Fujimoto:
Joint unsupervised learning of hidden Markov source models and source location models for multichannel source separation. ICASSP 2011: 237-240 - [c61]Tomohiro Nakatani, Shoko Araki, Marc Delcroix, Takuya Yoshioka, Masakiyo Fujimoto:
Reduction of Highly Nonstationary Ambient Noise by Integrating Spectral and Locational Characteristics of Speech and Noise for Robust ASR. INTERSPEECH 2011: 1785-1788 - [c60]Hiroshi Sawada, Hirokazu Kameoka, Shoko Araki, Naonori Ueda:
New formulations and efficient algorithms for multichannel NMF. WASPAA 2011: 153-156
- 2010
- [j14]Kentaro Ishizuka, Shoko Araki, Tatsuya Kawahara:
Speech Activity Detection for Multi-Party Conversation Analyses Based on Likelihood Ratio Test on Spatial Magnitude. IEEE Trans. Speech Audio Process. 18(6): 1354-1365 (2010) - [c59]Shoko Araki, Alexey Ozerov, Vikrham Gowreesunker, Hiroshi Sawada, Fabian J. Theis, Guido Nolte, Dominik Lutter, Ngoc Q. K. Duong:
The 2010 Signal Separation Evaluation Campaign (SiSEC2010): Audio Source Separation. LVA/ICA 2010: 114-122 - [c58]Shoko Araki, Fabian J. Theis, Guido Nolte, Dominik Lutter, Alexey Ozerov, Vikrham Gowreesunker, Hiroshi Sawada, Ngoc Q. K. Duong:
The 2010 Signal Separation Evaluation Campaign (SiSEC2010): Biomedical Source Separation. LVA/ICA 2010: 123-130 - [c57]Shoko Araki, Tomohiro Nakatani, Hiroshi Sawada:
Simultaneous clustering of mixing and spectral model parameters for blind sparse source separation. ICASSP 2010: 5-8 - [c56]Tomohiro Nakatani, Shoko Araki:
Single channel source separation based on sparse source observation model with harmonic constraint. ICASSP 2010: 13-16 - [c55]Tomohiro Nakatani, Shoko Araki, Takuya Yoshioka, Masakiyo Fujimoto:
Multichannel source separation based on source location cue with log-spectral shaping by hidden Markov source model. INTERSPEECH 2010: 2766-2769 - [c54]Yumi Ansa, Shoko Araki, Shoji Makino, Tomohiro Nakatani, Takeshi Yamada, Atsushi Nakamura, Nobuhiko Kitawaki:
Cepstral smoothing of separated signals for underdetermined speech separation. ISCAS 2010: 2506-2509 - [c53]Takaaki Hori, Shoko Araki, Takuya Yoshioka, Masakiyo Fujimoto, Shinji Watanabe, Takanobu Oba, Atsunori Ogawa, Kazuhiro Otsuka, Dan Mikami, Keisuke Kinoshita, Tomohiro Nakatani, Atsushi Nakamura, Junji Yamato:
Real-time meeting recognition and understanding using distant microphones and omni-directional camera. SLT 2010: 424-429
2000 – 2009
- 2009
- [j13]Hiroko Kato Solvang, Yuichi Nagahara, Shoko Araki, Hiroshi Sawada, Shoji Makino:
Frequency-Domain Pearson Distribution Approach for Independent Component Analysis (FD-Pearson-ICA) in Blind Source Separation. IEEE Trans. Speech Audio Process. 17(4): 639-649 (2009) - [c52]Takayoshi Tashiro, Shoko Araki, Yasuhiko Nakanishi, Hideaki Kimura, Kiyomi Kumozaki, Masato Miyoshi:
An Optical Access Network System without a Power Supply Using Blind Speech Separation and a Loopback Technique. GLOBECOM 2009: 1-6 - [c51]Shoko Araki, Tomohiro Nakatani, Hiroshi Sawada, Shoji Makino:
Blind sparse source separation for unknown number of sources using Gaussian mixture model fitting with Dirichlet prior. ICASSP 2009: 33-36 - [c50]Kentaro Ishizuka, Shoko Araki, Kazuhiro Otsuka, Tomohiro Nakatani, Masakiyo Fujimoto:
A speaker diarization method based on the probabilistic fusion of audio-visual location information. ICMI 2009: 55-62 - [c49]Kazuhiro Otsuka, Shoko Araki, Dan Mikami, Kentaro Ishizuka, Masakiyo Fujimoto, Junji Yamato:
Realtime meeting analysis and 3D meeting viewer based on omnidirectional multimodal sensors. ICMI 2009: 219-220 - [c48]Emmanuel Vincent, Shoko Araki, Pau Bofill:
The 2008 Signal Separation Evaluation Campaign: A Community-Based Approach to Large-Scale Evaluation. ICA 2009: 734-741 - [c47]Shoko Araki, Tomohiro Nakatani, Hiroshi Sawada, Shoji Makino:
Stereo Source Separation and Source Counting with MAP Estimation with Dirichlet Prior Considering Spatial Aliasing Problem. ICA 2009: 742-750 - [c46]Katsuhiko Ishiguro, Takeshi Yamada, Shoko Araki, Tomohiro Nakatani:
A probabilistic speaker clustering for DOA-based diarization. WASPAA 2009: 241-244
- 2008
- [c45]Shoko Araki, Masakiyo Fujimoto, Kentaro Ishizuka, Hiroshi Sawada, Shoji Makino:
Speaker indexing and speech enhancement in real meetings / conversations. ICASSP 2008: 93-96 - [c44]Kazuhiro Otsuka, Shoko Araki, Kentaro Ishizuka, Masakiyo Fujimoto, Martin Heinrich, Junji Yamato:
A realtime multimodal system for analyzing group meetings by combining face pose tracking and speaker diarization. ICMI 2008: 257-264 - [c43]Kentaro Ishizuka, Shoko Araki, Tatsuya Kawahara:
Statistical speech activity detection based on spatial power distribution for analyses of poster presentations. INTERSPEECH 2008: 99-102 - [c42]Tatsuya Kawahara, Hisao Setoguchi, Katsuya Takanashi, Kentaro Ishizuka, Shoko Araki:
Multi-modal recording, analysis and indexing of poster sessions. INTERSPEECH 2008: 1622-1625 - [c41]Dorothea Kolossa, Shoko Araki, Marc Delcroix, Tomohiro Nakatani, Reinhold Orglmeister, Shoji Makino:
Missing feature speech recognition in a meeting situation with maximum SNR beamforming. ISCAS 2008: 3218-3221
- 2007
- [j12]Shoko Araki, Hiroshi Sawada, Ryo Mukai, Shoji Makino:
Underdetermined blind sparse source separation for arbitrarily arranged multiple sensors. Signal Process. 87(8): 1833-1847 (2007) - [j11]Mirko Knaak, Shoko Araki, Shoji Makino:
Geometrically Constrained Independent Component Analysis. IEEE Trans. Speech Audio Process. 15(2): 715-726 (2007) - [j10]Hiroshi Sawada, Shoko Araki, Ryo Mukai, Shoji Makino:
Grouping Separated Frequency Components by Estimating Propagation Model Parameters in Frequency-Domain Blind Source Separation. IEEE Trans. Speech Audio Process. 15(5): 1592-1604 (2007) - [c40]Shoji Makino, Hiroshi Sawada, Shoko Araki:
Blind Audio Source Separation Based on Independent Component Analysis. ICA 2007: 843 - [c39]Shoko Araki, Hiroshi Sawada, Shoji Makino:
Blind Speech Separation in a Meeting Situation with Maximum SNR Beamformers. ICASSP (1) 2007: 41-44 - [c38]Jan Cermak, Shoko Araki, Hiroshi Sawada, Shoji Makino:
Blind Source Separation Based on a Beamformer Array and Time Frequency Binary Masking. ICASSP (1) 2007: 145-148 - [c37]Juan E. Rubio, Kentaro Ishizuka, Hiroshi Sawada, Shoko Araki, Tomohiro Nakatani, Masakiyo Fujimoto:
Two-Microphone Voice Activity Detection Based on the Homogeneity of the Direction of Arrival Estimates. ICASSP (4) 2007: 385-388 - [c36]Hiroshi Sawada, Shoko Araki, Shoji Makino:
Measuring Dependence of Bin-wise Separated Signals for Permutation Alignment in Frequency-domain BSS. ISCAS 2007: 3247-3250 - [p2]Shoji Makino, Hiroshi Sawada, Shoko Araki:
Frequency-Domain Blind Source Separation. Blind Speech Separation 2007: 47-78 - [p1]Shoko Araki, Hiroshi Sawada, Shoji Makino:
K-means Based Underdetermined Blind Speech Separation. Blind Speech Separation 2007: 243-270
- 2006
- [j9]Ryo Mukai, Hiroshi Sawada, Shoko Araki, Shoji Makino:
Frequency-Domain Blind Source Separation of Many Speech Signals Using Near-Field and Far-Field Models. EURASIP J. Adv. Signal Process. 2006 (2006) - [j8]Hiroshi Sawada, Shoko Araki, Ryo Mukai, Shoji Makino:
Blind Extraction of Dominant Target Sources Using ICA and Time-Frequency Masking. IEEE Trans. Speech Audio Process. 14(6): 2165-2173 (2006) - [c35]Shoko Araki, Hiroshi Sawada, Ryo Mukai, Shoji Makino:
Normalized observation vector clustering approach for sparse source separation. EUSIPCO 2006: 1-5 - [c34]Hiroko Kato, Yuichi Nagahara, Shoko Araki, Hiroshi Sawada, Shoji Makino:
Parametric-Pearson-based independent component analysis for frequency-domain blind speech separation. EUSIPCO 2006: 1-5 - [c33]Hiroshi Sawada, Shoko Araki, Ryo Mukai, Shoji Makino:
On Calculating the Inverse of Separation Matrix in Frequency-Domain Blind Source Separation. ICA 2006: 691-699 - [c32]Shoko Araki, Hiroshi Sawada, Ryo Mukai, Shoji Makino:
Doa Estimation for Multiple Sparse Sources with Normalized Observation Vector Clustering. ICASSP (5) 2006: 33-36 - [c31]Hiroshi Sawada, Shoko Araki, Ryo Mukai, Shoji Makino:
Solving the Permutation Problem of Frequency-Domain BSS when Spatial Aliasing Occurs with Wide Sensor Spacing. ICASSP (5) 2006: 77-80 - [c30]Ryo Mukai, Hiroshi Sawada, Shoko Araki, Shoji Makino:
Blind Source Separation of Many Signals in the Frequency Domain. ICASSP (5) 2006: 969-972 - [c29]Shoko Araki, Hiroshi Sawada, Ryo Mukai, Shoji Makino:
Underdetermined sparse source separation of convolutive mixtures with observation vector clustering. ISCAS 2006
- 2005
- [j7]Shoji Makino, Hiroshi Sawada, Ryo Mukai, Shoko Araki:
Blind Source Separation of Convolutive Mixtures of Speech in Frequency Domain. IEICE Trans. Fundam. Electron. Commun. Comput. Sci. 88-A(7): 1640-1655 (2005) - [j6]Audrey Blin, Shoko Araki, Shoji Makino:
Underdetermined Blind Separation of Convolutive Mixtures of Speech Using Time-Frequency Mask and Mixing Matrix Estimation. IEICE Trans. Fundam. Electron. Commun. Comput. Sci. 88-A(7): 1693-1700 (2005) - [j5]Shoko Araki, Shoji Makino, Robert Aichner, Tsuyoki Nishikawa, Hiroshi Saruwatari:
Subband-Based Blind Separation for Convolutive Mixtures of Speech. IEICE Trans. Fundam. Electron. Commun. Comput. Sci. 88-A(12): 3593-3603 (2005) - [c28]Hiroshi Sawada, Shoko Araki, Ryo Mukai, Shoji Makino:
Blind extraction of a dominant source signal from mixtures of many sources [audio source separation applications]. ICASSP (3) 2005: 61-64 - [c27]Shoko Araki, Shoji Makino, Hiroshi Sawada, Ryo Mukai:
Reducing musical noise by a fine-shift overlap-add method applied to source separation using a time-frequency mask. ICASSP (3) 2005: 81-84 - [c26]Hiroshi Sawada, Shoko Araki, Ryo Mukai, Shoji Makino:
Blind extraction of a dominant source from mixtures of many sources using ICA and time-frequency masking. ISCAS (6) 2005: 5882-5885
- 2004
- [j4]Hiroshi Sawada, Ryo Mukai, Shoko Araki, Shoji Makino:
A robust and precise method for solving the permutation problem of frequency-domain blind source separation. IEEE Trans. Speech Audio Process. 12(5): 530-538 (2004) - [c25]Shoko Araki, Shoji Makino, Hiroshi Sawada, Ryo Mukai:
Underdetermined blind speech separation with directivity pattern based continuous mask and ICA. EUSIPCO 2004: 1991-1994 - [c24]Ryo Mukai, Hiroshi Sawada, Shoko Araki, Shoji Makino:
Frequency Domain Blind Source Separation for Many Speech Signals. ICA 2004: 461-469 - [c23]Hiroshi Sawada, Stefan Winter, Ryo Mukai, Shoko Araki, Shoji Makino:
Estimating the Number of Sources for Frequency-Domain Blind Source Separation. ICA 2004: 610-617 - [c22]Stefan Winter, Hiroshi Sawada, Shoko Araki, Shoji Makino:
Overcomplete BSS for Convolutive Mixtures Based on Hierarchical Clustering. ICA 2004: 652-660 - [c21]Shoko Araki, Shoji Makino, Hiroshi Sawada, Ryo Mukai:
Underdetermined Blind Separation of Convolutive Mixtures of Speech with Directivity Pattern Based Mask and ICA. ICA 2004: 898-905 - [c20]Ryo Mukai, Hiroshi Sawada, Shoko Araki, Shoji Makino:
Near-field frequency domain blind source separation for convolutive mixtures. ICASSP (4) 2004: 49-52 - [c19]Audrey Blin, Shoko Araki, Shoji Makino:
A sparseness-mixing matrix estimation (SMME) solving the underdetermined BSS for convolutive mixtures. ICASSP (4) 2004: 85-88 - [c18]Shoko Araki, Shoji Makino, Audrey Blin, Ryo Mukai, Hiroshi Sawada:
Underdetermined blind separation for speech in real environments with sparseness and ICA. ICASSP (3) 2004: 881-884 - [c17]Hiroshi Sawada, Ryo Mukai, Shoko Araki, Shoji Makino:
Convolutive blind source separation for more than two sources in the frequency domain. ICASSP (3) 2004: 885-888 - [c16]Stefan Winter, Hiroshi Sawada, Shoko Araki, Shoji Makino:
Hierarchical clustering applied to overcomplete BSS for convolutive mixtures. SAPA@INTERSPEECH 2004: 48 - [c15]Ryo Mukai, Hiroshi Sawada, Shoko Araki, Shoji Makino:
Frequency domain blind source separation using small and large spacing sensor pairs. ISCAS (5) 2004: 1-4 - [c14]Shoji Makino, Shoko Araki, Ryo Mukai, Hiroshi Sawada:
Audio source separation based on independent component analysis. ISCAS (5) 2004: 668-671
- 2003
- [j3]Shoko Araki, Shoji Makino, Yoichi Hinamoto, Ryo Mukai, Tsuyoki Nishikawa, Hiroshi Saruwatari:
Equivalence between Frequency-Domain Blind Source Separation and Frequency-Domain Adaptive Beamforming for Convolutive Mixtures. EURASIP J. Adv. Signal Process. 2003(11): 1157-1166 (2003) - [j2]Hiroshi Sawada, Ryo Mukai, Shoko Araki, Shoji Makino:
Polar Coordinate Based Nonlinear Function for Frequency-Domain Blind Source Separation. IEICE Trans. Fundam. Electron. Commun. Comput. Sci. 86-A(3): 590-596 (2003) - [j1]Shoko Araki, Ryo Mukai, Shoji Makino, Tsuyoki Nishikawa, Hiroshi Saruwatari:
The fundamental limitation of frequency domain blind source separation for convolutive mixtures of speech. IEEE Trans. Speech Audio Process. 11(2): 109-116 (2003) - [c13]Hiroshi Sawada, Ryo Mukai, Shoko Araki, Shoji Makino:
A robust approach to the permutation problem of frequency-domain blind source separation. ICASSP (5) 2003: 381-384 - [c12]Ryo Mukai, Hiroshi Sawada, Shoko Araki, Shoji Makino:
Robust real-time blind source separation for moving speakers in a room. ICASSP (5) 2003: 469-472 - [c11]Shoko Araki, Shoji Makino, Robert Aichner, Tsuyoki Nishikawa, Hiroshi Saruwatari:
Subband based blind source separation for convolutive mixtures of speech. ICASSP (5) 2003: 509-512 - [c10]Mirko Knaak, Shoko Araki, Shoji Makino:
Geometrically constraint ICA for convolutive mixtures of sound. ICASSP (2) 2003: 725-728
- 2002
- [c9]Hiroshi Sawada, Ryo Mukai, Shoko Araki, Shoji Makino:
Polar coordinate based nonlinear function for frequency-domain blind source separation. ICASSP 2002: 1001-1004 - [c8]Shoko Araki, Yoichi Hinamoto, Shoji Makino, Tsuyoki Nishikawa, Ryo Mukai, Hiroshi Saruwatari:
Equivalence between frequency domain blind source separation and frequency domain adaptive beamforming. ICASSP 2002: 1785-1788 - [c7]Ryo Mukai, Shoko Araki, Hiroshi Sawada, Shoji Makino:
Removal of residual cross-talk components in Blind Source Separation using time-delayed spectral subtraction. ICASSP 2002: 1789-1792 - [c6]Ryo Mukai, Shoko Araki, Hiroshi Sawada, Shoji Makino:
Removal of residual crosstalk components in blind source separation using LMS filters. NNSP 2002: 435-444 - [c5]Robert Aichner, Shoko Araki, Shoji Makino, Tsuyoki Nishikawa, Hiroshi Saruwatari:
Time domain blind source separation of non-stationary convolved signals by utilizing geometric beamforming. NNSP 2002: 445-454 - [c4]Hiroshi Sawada, Shoko Araki, Ryo Mukai, Shoji Makino:
Blind source separation with different sensor spacing and filter length for each frequency range. NNSP 2002: 465-474
- 2001
- [c3]Shoko Araki, Shoji Makino, Tsuyoki Nishikawa, Hiroshi Saruwatari:
Fundamental limitation of frequency domain blind source separation for convolutive mixture of speech. ICASSP 2001: 2737-2740 - [c2]Shoko Araki, Shoji Makino, Ryo Mukai, Hiroshi Saruwatari:
Equivalence between frequency domain blind source separation and frequency domain adaptive null beamformers. INTERSPEECH 2001: 2595-2598 - [c1]Ryo Mukai, Shoko Araki, Shoji Makino:
Separation and dereverberation performance of frequency domain blind source separation for speech in a reverberant environment. INTERSPEECH 2001: 2599-2602