Shusuke Takahashi
2020 – today
- 2024
- [j4] Ryosuke Sawata, Naoya Takahashi, Stefan Uhlich, Shusuke Takahashi, Yuki Mitsufuji:
The whole is greater than the sum of its parts: improving music source separation by bridging networks. EURASIP J. Audio Speech Music. Process. 2024(1): 39 (2024)
- [j3] Stefan Uhlich, Giorgio Fabbro, Masato Hirano, Shusuke Takahashi, Gordon Wichern, Jonathan Le Roux, Dipam Chakraborty, Sharada Mohanty, Kai Li, Yi Luo, Jianwei Yu, Rongzhi Gu, Roman A. Solovyev, Alexander L. Stempkovskiy, Tatiana Habruseva, Mikhail Sukhovei, Yuki Mitsufuji:
The Sound Demixing Challenge 2023 - Cinematic Demixing Track. Trans. Int. Soc. Music. Inf. Retr. 7(1): 44-62 (2024)
- [c17] Kazuki Shimada, Kengo Uchida, Yuichiro Koyama, Takashi Shibuya, Shusuke Takahashi, Yuki Mitsufuji, Tatsuya Kawahara:
Zero- and Few-Shot Sound Event Localization and Detection. ICASSP 2024: 636-640
- [c16] Hao Shi, Kazuki Shimada, Masato Hirano, Takashi Shibuya, Yuichiro Koyama, Zhi Zhong, Shusuke Takahashi, Tatsuya Kawahara, Yuki Mitsufuji:
Diffusion-Based Speech Enhancement with Joint Generative and Predictive Decoders. ICASSP 2024: 12951-12955
- [i29] Shiqi Yang, Zhi Zhong, Mengjie Zhao, Shusuke Takahashi, Masato Ishii, Takashi Shibuya, Yuki Mitsufuji:
Visual Echoes: A Simple Unified Transformer for Audio-Visual Generation. CoRR abs/2405.14598 (2024)
- [i28] Kengo Uchida, Takashi Shibuya, Yuhta Takida, Naoki Murata, Shusuke Takahashi, Yuki Mitsufuji:
MoLA: Motion Generation and Editing with Latent Diffusion Enhanced by Adversarial Training. CoRR abs/2406.01867 (2024)
- [i27] Marco Comunità, Zhi Zhong, Akira Takahashi, Shiqi Yang, Mengjie Zhao, Koichi Saito, Yukara Ikemiya, Takashi Shibuya, Shusuke Takahashi, Yuki Mitsufuji:
SpecMaskGIT: Masked Generative Modeling of Audio Spectrograms for Efficient Audio Synthesis and Beyond. CoRR abs/2406.17672 (2024)
- [i26] Saurav Jha, Shiqi Yang, Masato Ishii, Mengjie Zhao, Christian Simon, Muhammad Jehanzeb Mirza, Dong Gong, Lina Yao, Shusuke Takahashi, Yuki Mitsufuji:
Mining Your Own Secrets: Diffusion Classifier Scores for Continual Personalization of Text-to-Image Diffusion Models. CoRR abs/2410.00700 (2024)
- [i25] Mengjie Zhao, Zhi Zhong, Zhuoyuan Mao, Shiqi Yang, Wei-Hsiang Liao, Shusuke Takahashi, Hiromi Wakaki, Yuki Mitsufuji:
OpenMU: Your Swiss Army Knife for Music Understanding. CoRR abs/2410.15573 (2024)
- [i24] Wei-Hsiang Liao, Yuhta Takida, Yukara Ikemiya, Zhi Zhong, Chieh-Hsin Lai, Giorgio Fabbro, Kazuki Shimada, Keisuke Toyama, Kin Wai Cheuk, Marco A. Martínez Ramírez, Shusuke Takahashi, Stefan Uhlich, Taketo Akama, Woosung Choi, Yuichiro Koyama, Yuki Mitsufuji:
Music Foundation Model as Generic Booster for Music Downstream Tasks. CoRR abs/2411.01135 (2024)
- 2023
- [c15] Kin Wai Cheuk, Ryosuke Sawata, Toshimitsu Uesaka, Naoki Murata, Naoya Takahashi, Shusuke Takahashi, Dorien Herremans, Yuki Mitsufuji:
Diffroll: Diffusion-Based Generative Music Transcription with Unsupervised Pretraining Capability. ICASSP 2023: 1-5
- [c14] Zhi Zhong, Masato Hirano, Kazuki Shimada, Kazuya Tateishi, Shusuke Takahashi, Yuki Mitsufuji:
An Attention-Based Approach to Hierarchical Multi-Label Music Instrument Classification. ICASSP 2023: 1-5
- [c13] Ryosuke Sawata, Naoki Murata, Yuhta Takida, Toshimitsu Uesaka, Takashi Shibuya, Shusuke Takahashi, Yuki Mitsufuji:
Diffiner: A Versatile Diffusion-based Generative Refiner for Speech Enhancement. INTERSPEECH 2023: 3824-3828
- [c12] Kazuki Shimada, Archontis Politis, Parthasaarathy Sudarsanam, Daniel Aleksander Krause, Kengo Uchida, Sharath Adavanne, Aapo Hakala, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Tuomas Virtanen, Yuki Mitsufuji:
STARSS23: An Audio-Visual Dataset of Spatial Recordings of Real Scenes with Spatiotemporal Annotations of Sound Events. NeurIPS 2023
- [c11] Zhi Zhong, Hao Shi, Masato Hirano, Kazuki Shimada, Kazuya Tateishi, Takashi Shibuya, Shusuke Takahashi, Yuki Mitsufuji:
Extending Audio Masked Autoencoders toward Audio Restoration. WASPAA 2023: 1-5
- [d4] Archontis Politis, Kazuki Shimada, Parthasaarathy Sudarsanam, Aapo Hakala, Shusuke Takahashi, Daniel Aleksander Krause, Naoya Takahashi, Sharath Adavanne, Yuichiro Koyama, Kengo Uchida, Yuki Mitsufuji, Tuomas Virtanen:
STARSS23: Sony-TAu Realistic Spatial Soundscapes 2023. Version 1.0.0. Zenodo, 2023
- [d3] Archontis Politis, Kazuki Shimada, Parthasaarathy Sudarsanam, Aapo Hakala, Shusuke Takahashi, Daniel Aleksander Krause, Naoya Takahashi, Sharath Adavanne, Yuichiro Koyama, Kengo Uchida, Yuki Mitsufuji, Tuomas Virtanen:
STARSS23: Sony-TAu Realistic Spatial Soundscapes 2023. Version 1.1.0. Zenodo, 2023
- [i23] Zhi Zhong, Masato Hirano, Kazuki Shimada, Kazuya Tateishi, Shusuke Takahashi, Yuki Mitsufuji:
An Attention-based Approach to Hierarchical Multi-label Music Instrument Classification. CoRR abs/2302.08136 (2023)
- [i22] Masato Hirano, Kazuki Shimada, Yuichiro Koyama, Shusuke Takahashi, Yuki Mitsufuji:
Diffusion-based Signal Refiner for Speech Separation. CoRR abs/2305.05857 (2023)
- [i21] Zhi Zhong, Hao Shi, Masato Hirano, Kazuki Shimada, Kazuya Tateishi, Takashi Shibuya, Shusuke Takahashi, Yuki Mitsufuji:
Extending Audio Masked Autoencoders Toward Audio Restoration. CoRR abs/2305.06701 (2023)
- [i20] Ryosuke Sawata, Naoya Takahashi, Stefan Uhlich, Shusuke Takahashi, Yuki Mitsufuji:
The Whole Is Greater than the Sum of Its Parts: Improving DNN-based Music Source Separation. CoRR abs/2305.07855 (2023)
- [i19] Hao Shi, Kazuki Shimada, Masato Hirano, Takashi Shibuya, Yuichiro Koyama, Zhi Zhong, Shusuke Takahashi, Tatsuya Kawahara, Yuki Mitsufuji:
Diffusion-Based Speech Enhancement with Joint Generative and Predictive Decoders. CoRR abs/2305.10734 (2023)
- [i18] Kazuki Shimada, Archontis Politis, Parthasaarathy Sudarsanam, Daniel Krause, Kengo Uchida, Sharath Adavanne, Aapo Hakala, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Tuomas Virtanen, Yuki Mitsufuji:
STARSS23: An Audio-Visual Dataset of Spatial Recordings of Real Scenes with Spatiotemporal Annotations of Sound Events. CoRR abs/2306.09126 (2023)
- [i17] Stefan Uhlich, Giorgio Fabbro, Masato Hirano, Shusuke Takahashi, Gordon Wichern, Jonathan Le Roux, Dipam Chakraborty, Sharada P. Mohanty, Kai Li, Yi Luo, Jianwei Yu, Rongzhi Gu, Roman A. Solovyev, Alexander L. Stempkovskiy, Tatiana Habruseva, Mikhail Sukhovei, Yuki Mitsufuji:
The Sound Demixing Challenge 2023 - Cinematic Demixing Track. CoRR abs/2308.06981 (2023)
- [i16] Kazuki Shimada, Kengo Uchida, Yuichiro Koyama, Takashi Shibuya, Shusuke Takahashi, Yuki Mitsufuji, Tatsuya Kawahara:
Zero- and Few-shot Sound Event Localization and Detection. CoRR abs/2309.09223 (2023)
- 2022
- [j2] Yuhta Takida, Wei-Hsiang Liao, Chieh-Hsin Lai, Toshimitsu Uesaka, Shusuke Takahashi, Yuki Mitsufuji:
Preventing oversmoothing in VAE via generalized variance parameterization. Neurocomputing 509: 137-156 (2022)
- [j1] Shusuke Takahashi, Yusuke Izawa, Hidehiko Masuhara, Youyou Cong:
An Approach to Collecting Object Graphs for Data-structure Live Programming Based on a Language Implementation Framework. J. Inf. Process. 30: 451-463 (2022)
- [c10] Archontis Politis, Kazuki Shimada, Parthasaarathy Sudarsanam, Sharath Adavanne, Daniel Krause, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Yuki Mitsufuji, Tuomas Virtanen:
STARSS22: A Dataset of Spatial Recordings of Real Scenes with Spatiotemporal Annotations of Sound Events. DCASE 2022
- [c9] Yuichiro Koyama, Naoki Murata, Stefan Uhlich, Giorgio Fabbro, Shusuke Takahashi, Yuki Mitsufuji:
Music Source Separation With Deep Equilibrium Models. ICASSP 2022: 296-300
- [c8] Kazuki Shimada, Yuichiro Koyama, Shusuke Takahashi, Naoya Takahashi, Emiru Tsunoo, Yuki Mitsufuji:
Multi-ACCDOA: Localizing And Detecting Overlapping Sounds From The Same Class With Auxiliary Duplicating Permutation Invariant Training. ICASSP 2022: 316-320
- [c7] Ricardo Falcón Pérez, Kazuki Shimada, Yuichiro Koyama, Shusuke Takahashi, Yuki Mitsufuji:
Spatial Mixup: Directional Loudness Modification as Data Augmentation for Sound Event Localization and Detection. ICASSP 2022: 431-435
- [c6] Ryosuke Sawata, Yosuke Kashiwagi, Shusuke Takahashi:
Improving Character Error Rate is Not Equal to Having Clean Speech: Speech Enhancement for ASR Systems with Black-Box Acoustic Models. ICASSP 2022: 991-995
- [c5] Yuichiro Koyama, Kazuhide Shigemi, Masafumi Takahashi, Kazuki Shimada, Naoya Takahashi, Emiru Tsunoo, Shusuke Takahashi, Yuki Mitsufuji:
Spatial Data Augmentation with Simulated Room Impulse Responses for Sound Event Localization and Detection. ICASSP 2022: 8872-8876
- [c4] Yuhta Takida, Takashi Shibuya, Wei-Hsiang Liao, Chieh-Hsin Lai, Junki Ohmura, Toshimitsu Uesaka, Naoki Murata, Shusuke Takahashi, Toshiyuki Kumakura, Yuki Mitsufuji:
SQ-VAE: Variational Bayes on Discrete Representation with Self-annealed Stochastic Quantization. ICML 2022: 20987-21012
- [d2] Adavanne Politis, Yuki Mitsufuji, Parthasaarathy Sudarsanam, Kazuki Shimada, Sharath Adavanne, Yuichiro Koyama, Daniel Krause, Naoya Takahashi, Shusuke Takahashi, Tuomas Virtanen:
STARSS22: Sony-TAu Realistic Spatial Soundscapes 2022 dataset. Version 1.0.0. Zenodo, 2022
- [d1] Archontis Politis, Yuki Mitsufuji, Parthasaarathy Sudarsanam, Kazuki Shimada, Sharath Adavanne, Yuichiro Koyama, Daniel Aleksander Krause, Naoya Takahashi, Shusuke Takahashi, Tuomas Virtanen:
STARSS22: Sony-TAu Realistic Spatial Soundscapes 2022 dataset. Version 1.1.0. Zenodo, 2022
- [i15] Yuhta Takida, Takashi Shibuya, Wei-Hsiang Liao, Chieh-Hsin Lai, Junki Ohmura, Toshimitsu Uesaka, Naoki Murata, Shusuke Takahashi, Toshiyuki Kumakura, Yuki Mitsufuji:
SQ-VAE: Variational Bayes on Discrete Representation with Self-annealed Stochastic Quantization. CoRR abs/2205.07547 (2022)
- [i14] Archontis Politis, Kazuki Shimada, Parthasaarathy Sudarsanam, Sharath Adavanne, Daniel Krause, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Yuki Mitsufuji, Tuomas Virtanen:
STARSS22: A dataset of spatial recordings of real scenes with spatiotemporal annotations of sound events. CoRR abs/2206.01948 (2022)
- [i13] Kin Wai Cheuk, Ryosuke Sawata, Toshimitsu Uesaka, Naoki Murata, Naoya Takahashi, Shusuke Takahashi, Dorien Herremans, Yuki Mitsufuji:
DiffRoll: Diffusion-based Generative Music Transcription with Unsupervised Pretraining Capability. CoRR abs/2210.05148 (2022)
- [i12] Ryosuke Sawata, Naoki Murata, Yuhta Takida, Toshimitsu Uesaka, Takashi Shibuya, Shusuke Takahashi, Yuki Mitsufuji:
A Versatile Diffusion-based Generative Refiner for Speech Enhancement. CoRR abs/2210.17287 (2022)
- 2021
- [c3] Ryosuke Sawata, Stefan Uhlich, Shusuke Takahashi, Yuki Mitsufuji:
All For One And One For All: Improving Music Separation By Bridging Networks. ICASSP 2021: 51-55
- [c2] Kazuki Shimada, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Yuki Mitsufuji:
Accdoa: Activity-Coupled Cartesian Direction of Arrival Representation for Sound Event Localization And Detection. ICASSP 2021: 915-919
- [c1] Keitaro Tanaka, Ryosuke Sawata, Shusuke Takahashi:
Manifold-Aware Deep Clustering: Maximizing Angles Between Embedding Vectors Based on Regular Simplex. Interspeech 2021: 1134-1138
- [i11] Yuhta Takida, Wei-Hsiang Liao, Toshimitsu Uesaka, Shusuke Takahashi, Yuki Mitsufuji:
Preventing Posterior Collapse Induced by Oversmoothing in Gaussian VAE. CoRR abs/2102.08663 (2021)
- [i10] Keitaro Tanaka, Ryosuke Sawata, Shusuke Takahashi:
Manifold-Aware Deep Clustering: Maximizing Angles between Embedding Vectors Based on Regular Simplex. CoRR abs/2106.02331 (2021)
- [i9] Kazuki Shimada, Naoya Takahashi, Yuichiro Koyama, Shusuke Takahashi, Emiru Tsunoo, Masafumi Takahashi, Yuki Mitsufuji:
Ensemble of ACCDOA- and EINV2-based Systems with D3Nets and Impulse Response Simulation for Sound Event Localization and Detection. CoRR abs/2106.10806 (2021)
- [i8] Ryosuke Sawata, Yosuke Kashiwagi, Shusuke Takahashi:
Improving Character Error Rate Is Not Equal to Having Clean Speech: Speech Enhancement for ASR Systems with Black-box Acoustic Models. CoRR abs/2110.05968 (2021)
- [i7] Ricardo Falcón Pérez, Kazuki Shimada, Yuichiro Koyama, Shusuke Takahashi, Yuki Mitsufuji:
Spatial mixup: Directional loudness modification as data augmentation for sound event localization and detection. CoRR abs/2110.06126 (2021)
- [i6] Yuichiro Koyama, Naoki Murata, Stefan Uhlich, Giorgio Fabbro, Shusuke Takahashi, Yuki Mitsufuji:
Music Source Separation with Deep Equilibrium Models. CoRR abs/2110.06494 (2021)
- [i5] Yuichiro Koyama, Kazuhide Shigemi, Masafumi Takahashi, Kazuki Shimada, Naoya Takahashi, Emiru Tsunoo, Shusuke Takahashi, Yuki Mitsufuji:
Spatial Data Augmentation with Simulated Room Impulse Responses for Sound Event Localization and Detection. CoRR abs/2110.06501 (2021)
- [i4] Kazuki Shimada, Yuichiro Koyama, Shusuke Takahashi, Naoya Takahashi, Emiru Tsunoo, Yuki Mitsufuji:
Multi-ACCDOA: Localizing and Detecting Overlapping Sounds from the Same Class with Auxiliary Duplicating Permutation Invariant Training. CoRR abs/2110.07124 (2021)
- 2020
- [i3] Kazuki Shimada, Naoya Takahashi, Shusuke Takahashi, Yuki Mitsufuji:
Sound Event Localization and Detection Using Activity-Coupled Cartesian DOA Vector and RD3net. CoRR abs/2006.12014 (2020)
- [i2] Ryosuke Sawata, Stefan Uhlich, Shusuke Takahashi, Yuki Mitsufuji:
All for One and One for All: Improving Music Separation by Bridging Networks. CoRR abs/2010.04228 (2020)
- [i1] Kazuki Shimada, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Yuki Mitsufuji:
ACCDOA: Activity-Coupled Cartesian Direction of Arrival Representation for Sound Event Localization and Detection. CoRR abs/2010.15306 (2020)
last updated on 2024-12-12 21:57 CET by the dblp team
all metadata released as open data under CC0 1.0 license