default search action
Kazuki Shimada
Person information
Refine list
refinements active!
zoomed in on ?? of ?? records
view refined list in
export refined list as
2020 – today
- 2024
- [j2]Yuhta Takida, Yukara Ikemiya, Takashi Shibuya, Kazuki Shimada, Woosung Choi, Chieh-Hsin Lai, Naoki Murata, Toshimitsu Uesaka, Kengo Uchida, Wei-Hsiang Liao, Yuki Mitsufuji:
HQ-VAE: Hierarchical Discrete Representation Learning with Variational Bayes. Trans. Mach. Learn. Res. 2024 (2024) - [c14]Kazuki Shimada, Kengo Uchida, Yuichiro Koyama, Takashi Shibuya, Shusuke Takahashi, Yuki Mitsufuji, Tatsuya Kawahara:
Zero- and Few-Shot Sound Event Localization and Detection. ICASSP 2024: 636-640 - [c13]Hao Shi, Kazuki Shimada, Masato Hirano, Takashi Shibuya, Yuichiro Koyama, Zhi Zhong, Shusuke Takahashi, Tatsuya Kawahara, Yuki Mitsufuji:
Diffusion-Based Speech Enhancement with Joint Generative and Predictive Decoders. ICASSP 2024: 12951-12955 - [i16]Yuhta Takida, Yukara Ikemiya, Takashi Shibuya, Kazuki Shimada, Woosung Choi, Chieh-Hsin Lai, Naoki Murata, Toshimitsu Uesaka, Kengo Uchida, Wei-Hsiang Liao, Yuki Mitsufuji:
HQ-VAE: Hierarchical Discrete Representation Learning with Variational Bayes. CoRR abs/2401.00365 (2024) - 2023
- [c12]Zhi Zhong, Masato Hirano, Kazuki Shimada, Kazuya Tateishi, Shusuke Takahashi, Yuki Mitsufuji:
An Attention-Based Approach to Hierarchical Multi-Label Music Instrument Classification. ICASSP 2023: 1-5 - [c11]Kazuki Shimada, Archontis Politis, Parthasaarathy Sudarsanam, Daniel Aleksander Krause, Kengo Uchida, Sharath Adavanne, Aapo Hakala, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Tuomas Virtanen, Yuki Mitsufuji:
STARSS23: An Audio-Visual Dataset of Spatial Recordings of Real Scenes with Spatiotemporal Annotations of Sound Events. NeurIPS 2023 - [c10]Zhi Zhong, Hao Shi, Masato Hirano, Kazuki Shimada, Kazuya Tateishi, Takashi Shibuya, Shusuke Takahashi, Yuki Mitsufuji:
Extending Audio Masked Autoencoders toward Audio Restoration. WASPAA 2023: 1-5 - [d4]Archontis Politis, Kazuki Shimada, Parthasaarathy Sudarsanam, Aapo Hakala, Shusuke Takahashi, Daniel Aleksander Krause, Naoya Takahashi, Sharath Adavanne, Yuichiro Koyama, Kengo Uchida, Yuki Mitsufuji, Tuomas Virtanen:
STARSS23: Sony-TAu Realistic Spatial Soundscapes 2023. Version 1.0.0. Zenodo, 2023 [all versions] - [d3]Archontis Politis, Kazuki Shimada, Parthasaarathy Sudarsanam, Aapo Hakala, Shusuke Takahashi, Daniel Aleksander Krause, Naoya Takahashi, Sharath Adavanne, Yuichiro Koyama, Kengo Uchida, Yuki Mitsufuji, Tuomas Virtanen:
STARSS23: Sony-TAu Realistic Spatial Soundscapes 2023. Version 1.1.0. Zenodo, 2023 [all versions] - [i15]Zhi Zhong, Masato Hirano, Kazuki Shimada, Kazuya Tateishi, Shusuke Takahashi, Yuki Mitsufuji:
An Attention-based Approach to Hierarchical Multi-label Music Instrument Classification. CoRR abs/2302.08136 (2023) - [i14]Masato Hirano, Kazuki Shimada, Yuichiro Koyama, Shusuke Takahashi, Yuki Mitsufuji:
Diffusion-based Signal Refiner for Speech Separation. CoRR abs/2305.05857 (2023) - [i13]Zhi Zhong, Hao Shi, Masato Hirano, Kazuki Shimada, Kazuya Tateishi, Takashi Shibuya, Shusuke Takahashi, Yuki Mitsufuji:
Extending Audio Masked Autoencoders Toward Audio Restoration. CoRR abs/2305.06701 (2023) - [i12]Hao Shi, Kazuki Shimada, Masato Hirano, Takashi Shibuya, Yuichiro Koyama, Zhi Zhong, Shusuke Takahashi, Tatsuya Kawahara, Yuki Mitsufuji:
Diffusion-Based Speech Enhancement with Joint Generative and Predictive Decoders. CoRR abs/2305.10734 (2023) - [i11]Kazuki Shimada, Archontis Politis, Parthasaarathy Sudarsanam, Daniel Krause, Kengo Uchida, Sharath Adavanne, Aapo Hakala, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Tuomas Virtanen, Yuki Mitsufuji:
STARSS23: An Audio-Visual Dataset of Spatial Recordings of Real Scenes with Spatiotemporal Annotations of Sound Events. CoRR abs/2306.09126 (2023) - [i10]Kazuki Shimada, Kengo Uchida, Yuichiro Koyama, Takashi Shibuya, Shusuke Takahashi, Yuki Mitsufuji, Tatsuya Kawahara:
Zero- and Few-shot Sound Event Localization and Detection. CoRR abs/2309.09223 (2023) - 2022
- [c9]Archontis Politis, Kazuki Shimada, Parthasaarathy Sudarsanam, Sharath Adavanne, Daniel Krause, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Yuki Mitsufuji, Tuomas Virtanen:
STARSS22: A Dataset of Spatial Recordings of Real Scenes with Spatiotemporal Annotations of Sound Events. DCASE 2022 - [c8]Kazuki Shimada, Yuichiro Koyama, Shusuke Takahashi, Naoya Takahashi, Emiru Tsunoo, Yuki Mitsufuji:
Multi-ACCDOA: Localizing And Detecting Overlapping Sounds From The Same Class With Auxiliary Duplicating Permutation Invariant Training. ICASSP 2022: 316-320 - [c7]Ricardo Falcón Pérez, Kazuki Shimada, Yuichiro Koyama, Shusuke Takahashi, Yuki Mitsufuji:
Spatial Mixup: Directional Loudness Modification as Data Augmentation for Sound Event Localization and Detection. ICASSP 2022: 431-435 - [c6]Yuichiro Koyama, Kazuhide Shigemi, Masafumi Takahashi, Kazuki Shimada, Naoya Takahashi, Emiru Tsunoo, Shusuke Takahashi, Yuki Mitsufuji:
Spatial Data Augmentation with Simulated Room Impulse Responses for Sound Event Localization and Detection. ICASSP 2022: 8872-8876 - [c5]Kazuki Shimada, Taishi Sawabe, Hidehiko Shishido, Masayuki Kanbara, Itaru Kitahara:
Video Generation Unconsciously Evoking Pre-Motion to Passengers in Automated Vehicles. ISMAR Adjunct 2022: 342-347 - [d2]Adavanne Politis, Yuki Mitsufuji, Parthasaarathy Sudarsanam, Kazuki Shimada, Sharath Adavanne, Yuichiro Koyama, Daniel Krause, Naoya Takahashi, Shusuke Takahashi, Tuomas Virtanen:
STARSS22: Sony-TAu Realistic Spatial Soundscapes 2022 dataset. Version 1.0.0. Zenodo, 2022 [all versions] - [d1]Archontis Politis, Yuki Mitsufuji, Parthasaarathy Sudarsanam, Kazuki Shimada, Sharath Adavanne, Yuichiro Koyama, Daniel Aleksander Krause, Naoya Takahashi, Shusuke Takahashi, Tuomas Virtanen:
STARSS22: Sony-TAu Realistic Spatial Soundscapes 2022 dataset. Version 1.1.0. Zenodo, 2022 [all versions] - [i9]Archontis Politis, Kazuki Shimada, Parthasaarathy Sudarsanam, Sharath Adavanne, Daniel Krause, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Yuki Mitsufuji, Tuomas Virtanen:
STARSS22: A dataset of spatial recordings of real scenes with spatiotemporal annotations of sound events. CoRR abs/2206.01948 (2022) - 2021
- [c4]Kazuki Shimada, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Yuki Mitsufuji:
Accdoa: Activity-Coupled Cartesian Direction of Arrival Representation for Sound Event Localization And Detection. ICASSP 2021: 915-919 - [i8]Kazuki Shimada, Naoya Takahashi, Yuichiro Koyama, Shusuke Takahashi, Emiru Tsunoo, Masafumi Takahashi, Yuki Mitsufuji:
Ensemble of ACCDOA- and EINV2-based Systems with D3Nets and Impulse Response Simulation for Sound Event Localization and Detection. CoRR abs/2106.10806 (2021) - [i7]Ricardo Falcón Pérez, Kazuki Shimada, Yuichiro Koyama, Shusuke Takahashi, Yuki Mitsufuji:
Spatial mixup: Directional loudness modification as data augmentation for sound event localization and detection. CoRR abs/2110.06126 (2021) - [i6]Yuichiro Koyama, Kazuhide Shigemi, Masafumi Takahashi, Kazuki Shimada, Naoya Takahashi, Emiru Tsunoo, Shusuke Takahashi, Yuki Mitsufuji:
Spatial Data Augmentation with Simulated Room Impulse Responses for Sound Event Localization and Detection. CoRR abs/2110.06501 (2021) - [i5]Kazuki Shimada, Yuichiro Koyama, Shusuke Takahashi, Naoya Takahashi, Emiru Tsunoo, Yuki Mitsufuji:
Multi-ACCDOA: Localizing and Detecting Overlapping Sounds from the Same Class with Auxiliary Duplicating Permutation Invariant Training. CoRR abs/2110.07124 (2021) - 2020
- [c3]Kazuki Shimada, Yuichiro Koyama, Akira Inoue:
Metric Learning with Background Noise Class for Few-Shot Detection of Rare Sound Events. ICASSP 2020: 616-620 - [i4]Kazuki Shimada, Naoya Takahashi, Shusuke Takahashi, Yuki Mitsufuji:
Sound Event Localization and Detection Using Activity-Coupled Cartesian DOA Vector and RD3net. CoRR abs/2006.12014 (2020) - [i3]Kazuki Shimada, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Yuki Mitsufuji:
ACCDOA: Activity-Coupled Cartesian Direction of Arrival Representation for Sound Event Localization and Detection. CoRR abs/2010.15306 (2020)
2010 – 2019
- 2019
- [j1]Kazuki Shimada, Yoshiaki Bando, Masato Mimura, Katsutoshi Itoyama, Kazuyoshi Yoshii, Tatsuya Kawahara:
Unsupervised Speech Enhancement Based on Multichannel NMF-Informed Beamforming for Noise-Robust Automatic Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 27(5): 960-971 (2019) - [i2]Kazuki Shimada, Yoshiaki Bando, Masato Mimura, Katsutoshi Itoyama, Kazuyoshi Yoshii, Tatsuya Kawahara:
Unsupervised Speech Enhancement Based on Multichannel NMF-Informed Beamforming for Noise-Robust Automatic Speech Recognition. CoRR abs/1903.09341 (2019) - [i1]Kazuki Shimada, Yuichiro Koyama, Akira Inoue:
Metric Learning with Background Noise Class for Few-shot Detection of Rare Sound Events. CoRR abs/1910.13724 (2019) - 2018
- [c2]Kazuki Shimada, Yoshiaki Bando, Masato Mimura, Katsutoshi Itoyama, Kazuyoshi Yoshii, Tatsuya Kawahara:
Unsupervised Beamforming Based on Multichannel Nonnegative Matrix Factorization for Noisy Speech Recognition. ICASSP 2018: 5734-5738 - 2017
- [c1]Masato Mimura, Yoshiaki Bando, Kazuki Shimada, Shinsuke Sakai, Kazuyoshi Yoshii, Tatsuya Kawahara:
Combined Multi-Channel NMF-Based Robust Beamforming for Noisy Speech Recognition. INTERSPEECH 2017: 2451-2455
Coauthor Index
manage site settings
To protect your privacy, all features that rely on external API calls from your browser are turned off by default. You need to opt-in for them to become active. All settings here will be stored as cookies with your web browser. For more information see our F.A.Q.
Unpaywalled article links
Add open access links from to the list of external document links (if available).
Privacy notice: By enabling the option above, your browser will contact the API of unpaywall.org to load hyperlinks to open access articles. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Unpaywall privacy policy.
Archived links via Wayback Machine
For web page which are no longer available, try to retrieve content from the of the Internet Archive (if available).
Privacy notice: By enabling the option above, your browser will contact the API of archive.org to check for archived content of web pages that are no longer available. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Internet Archive privacy policy.
Reference lists
Add a list of references from , , and to record detail pages.
load references from crossref.org and opencitations.net
Privacy notice: By enabling the option above, your browser will contact the APIs of crossref.org, opencitations.net, and semanticscholar.org to load article reference information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Crossref privacy policy and the OpenCitations privacy policy, as well as the AI2 Privacy Policy covering Semantic Scholar.
Citation data
Add a list of citing articles from and to record detail pages.
load citations from opencitations.net
Privacy notice: By enabling the option above, your browser will contact the API of opencitations.net and semanticscholar.org to load citation information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the OpenCitations privacy policy as well as the AI2 Privacy Policy covering Semantic Scholar.
OpenAlex data
Load additional information about publications from .
Privacy notice: By enabling the option above, your browser will contact the API of openalex.org to load additional information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the information given by OpenAlex.
last updated on 2024-10-07 22:17 CEST by the dblp team
all metadata released as open data under CC0 1.0 license
see also: Terms of Use | Privacy Policy | Imprint