Anurag Kumar 0003
Person information
- affiliation: Facebook Research, Facebook Reality Labs, Redmond, WA USA
- affiliation (former): Indian Institute of Technology, Kanpur, India
Other persons with the same name
- Anurag Kumar — disambiguation page
- Anurag Kumar 0001 — Indian Institute of Science, Bangalore
- Anurag Kumar 0002 — Cadence Design Systems (and 1 more)
2020 – today
- 2024
- [j3]Orel Ben Zaken, Anurag Kumar, Vladimir Tourbabin, Boaz Rafaely:
Neural-Network-Based Direction-of-Arrival Estimation for Reverberant Speech - The Importance of Energetic, Temporal, and Spatial Information. IEEE ACM Trans. Audio Speech Lang. Process. 32: 1298-1309 (2024) - [c55]Ziyang Chen, Israel D. Gebru, Christian Richardt, Anurag Kumar, William Laney, Andrew Owens, Alexander Richard:
Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark. CVPR 2024: 21886-21896 - [c54]Heeseung Yun, Ruohan Gao, Ishwarya Ananthabhotla, Anurag Kumar, Jacob Donley, Chao Li, Gunhee Kim, Vamsi Krishna Ithapu, Calvin Murdock:
Spherical World-Locking for Audio-Visual Localization in Egocentric Videos. ECCV (24) 2024: 256-274 - [c53]Bar Shaybet, Anurag Kumar, Vladimir Tourbabin, Boaz Rafaely:
Ambisonics Networks - The Effect of Radial Functions Regularization. ICASSP 2024: 666-670 - [c52]Ravi Shankar, Ke Tan, Buye Xu, Anurag Kumar:
A Closer Look at Wav2vec2 Embeddings for On-Device Single-Channel Speech Enhancement. ICASSP 2024: 751-755 - [c51]Vahid Ahmadi Kalkhorani, Anurag Kumar, Ke Tan, Buye Xu, DeLiang Wang:
Audiovisual Speaker Separation with Full- and Sub-Band Modeling in the Time-Frequency Domain. ICASSP 2024: 12001-12005 - [c50]Zhong-Qiu Wang, Anurag Kumar, Shinji Watanabe:
Cross-Talk Reduction. IJCAI 2024: 5171-5180 - [i54]Bar Shaybet, Anurag Kumar, Vladimir Tourbabin, Boaz Rafaely:
Ambisonics Networks - The Effect Of Radial Functions Regularization. CoRR abs/2402.18968 (2024) - [i53]Ravi Shankar, Ke Tan, Buye Xu, Anurag Kumar:
A Closer Look at Wav2Vec2 Embeddings for On-Device Single-Channel Speech Enhancement. CoRR abs/2403.01369 (2024) - [i52]Ziyang Chen, Israel D. Gebru, Christian Richardt, Anurag Kumar, William Laney, Andrew Owens, Alexander Richard:
Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark. CoRR abs/2403.18821 (2024) - [i51]Zhong-Qiu Wang, Anurag Kumar, Shinji Watanabe:
Cross-Talk Reduction. CoRR abs/2405.20402 (2024) - [i50]Wangyou Zhang, Robin Scheibler, Kohei Saijo, Samuele Cornell, Chenda Li, Zhaoheng Ni, Anurag Kumar, Jan Pirklbauer, Marvin Sach, Shinji Watanabe, Tim Fingscheidt, Yanmin Qian:
URGENT Challenge: Universality, Robustness, and Generalizability For Speech Enhancement. CoRR abs/2406.04660 (2024) - [i49]Vahid Ahmadi Kalkhorani, Cheng Yu, Anurag Kumar, Ke Tan, Buye Xu, DeLiang Wang:
AV-CrossNet: an Audiovisual Complex Spectral Mapping Network for Speech Separation By Leveraging Narrow- and Cross-Band Modeling. CoRR abs/2406.11619 (2024) - [i48]Gaël Le Lan, Bowen Shi, Zhaoheng Ni, Sidd Srinivasan, Anurag Kumar, Brian Ellis, David Kant, Varun Nagaraja, Ernie Chang, Wei-Ning Hsu, Yangyang Shi, Vikas Chandra:
High Fidelity Text-Guided Music Generation and Editing via Single-Stage Flow Matching. CoRR abs/2407.03648 (2024) - [i47]Heeseung Yun, Ruohan Gao, Ishwarya Ananthabhotla, Anurag Kumar, Jacob Donley, Chao Li, Gunhee Kim, Vamsi Krishna Ithapu, Calvin Murdock:
Spherical World-Locking for Audio-Visual Localization in Egocentric Videos. CoRR abs/2408.05364 (2024) - [i46]Daniel A. Mitchell, Boaz Rafaely, Anurag Kumar, Vladimir Tourbabin:
Improved direction of arrival estimations with a wearable microphone array for dynamic environments by reliability weighting. CoRR abs/2409.14346 (2024) - 2023
- [c49]Jeff Hwang, Moto Hira, Caroline Chen, Xiaohui Zhang, Zhaoheng Ni, Guangzhi Sun, Pingchuan Ma, Ruizhe Huang, Vineel Pratap, Yuekai Zhang, Anurag Kumar, Chin-Yun Yu, Chuang Zhu, Chunxi Liu, Jacob Kahn, Mirco Ravanelli, Peng Sun, Shinji Watanabe, Yangyang Shi, Yumeng Tao:
TorchAudio 2.1: Advancing Speech Recognition, Self-Supervised Learning, and Audio Processing Components for PyTorch. ASRU 2023: 1-9 - [c48]Kuan-Lin Chen, Daniel D. E. Wong, Ke Tan, Buye Xu, Anurag Kumar, Vamsi Krishna Ithapu:
Leveraging Heteroscedastic Uncertainty in Learning Complex Spectral Mapping for Single-Channel Speech Enhancement. ICASSP 2023: 1-5 - [c47]Anurag Kumar, Ke Tan, Zhaoheng Ni, Pranay Manocha, Xiaohui Zhang, Ethan Henderson, Buye Xu:
TorchAudio-Squim: Reference-Less Speech Quality and Intelligibility Measures in TorchAudio. ICASSP 2023: 1-5 - [c46]Pranay Manocha, Israel D. Gebru, Anurag Kumar, Dejan Markovic, Alexander Richard:
Nord: Non-Matching Reference Based Relative Depth Estimation from Binaural Speech. ICASSP 2023: 1-5 - [c45]Rodrigo Mira, Buye Xu, Jacob Donley, Anurag Kumar, Stavros Petridis, Vamsi Krishna Ithapu, Maja Pantic:
LA-VocE: Low-SNR Audio-Visual Speech Enhancement Using Neural Vocoders. ICASSP 2023: 1-5 - [c44]Muqiao Yang, Joseph Konan, David Bick, Yunyang Zeng, Shuo Han, Anurag Kumar, Shinji Watanabe, Bhiksha Raj:
PAAPLoss: A Phonetic-Aligned Acoustic Parameter Loss for Speech Enhancement. ICASSP 2023: 1-5 - [c43]Yunyang Zeng, Joseph Konan, Shuo Han, David Bick, Muqiao Yang, Anurag Kumar, Shinji Watanabe, Bhiksha Raj:
TAPLoss: A Temporal Acoustic Parameter Loss for Speech Enhancement. ICASSP 2023: 1-5 - [c42]Vahid Ahmadi Kalkhorani, Anurag Kumar, Ke Tan, Buye Xu, DeLiang Wang:
Time-domain Transformer-based Audiovisual Speaker Separation. INTERSPEECH 2023: 3472-3476 - [c41]Haibin Wu, Ke Tan, Buye Xu, Anurag Kumar, Daniel Wong:
Rethinking Complex-Valued Deep Neural Networks for Monaural Speech Enhancement. INTERSPEECH 2023: 3889-3893 - [c40]Pranay Manocha, Israel Dejene Gebru, Anurag Kumar, Dejan Markovic, Alexander Richard:
Spatialization Quality Metric for Binaural Speech. INTERSPEECH 2023: 5426-5430 - [i45]Haibin Wu, Ke Tan, Buye Xu, Anurag Kumar, Daniel Wong:
Rethinking complex-valued deep neural networks for monaural speech enhancement. CoRR abs/2301.04320 (2023) - [i44]Yunyang Zeng, Joseph Konan, Shuo Han, David Bick, Muqiao Yang, Anurag Kumar, Shinji Watanabe, Bhiksha Raj:
TAPLoss: A Temporal Acoustic Parameter Loss for Speech Enhancement. CoRR abs/2302.08088 (2023) - [i43]Muqiao Yang, Joseph Konan, David Bick, Yunyang Zeng, Shuo Han, Anurag Kumar, Shinji Watanabe, Bhiksha Raj:
PAAPLoss: A Phonetic-Aligned Acoustic Parameter Loss for Speech Enhancement. CoRR abs/2302.08095 (2023) - [i42]Jeff Hwang, Moto Hira, Caroline Chen, Xiaohui Zhang, Zhaoheng Ni, Guangzhi Sun, Pingchuan Ma, Ruizhe Huang, Vineel Pratap, Yuekai Zhang, Anurag Kumar, Chin-Yun Yu, Chuang Zhu, Chunxi Liu, Jacob Kahn, Mirco Ravanelli, Peng Sun, Shinji Watanabe, Yangyang Shi, Yumeng Tao, Robin Scheibler, Samuele Cornell, Sean Kim, Stavros Petridis:
TorchAudio 2.1: Advancing speech recognition, self-supervised learning, and audio processing components for PyTorch. CoRR abs/2310.17864 (2023) - 2022
- [j2]Efthymios Tzinis, Yossi Adi, Vamsi K. Ithapu, Buye Xu, Paris Smaragdis, Anurag Kumar:
RemixIT: Continual Self-Training of Speech Enhancement Models via Bootstrapped Remixing. IEEE J. Sel. Top. Signal Process. 16(6): 1329-1341 (2022) - [c39]Kristen Grauman, Andrew Westbury, Eugene Byrne, Zachary Chavis, Antonino Furnari, Rohit Girdhar, Jackson Hamburger, Hao Jiang, Miao Liu, Xingyu Liu, Miguel Martin, Tushar Nagarajan, Ilija Radosavovic, Santhosh Kumar Ramakrishnan, Fiona Ryan, Jayant Sharma, Michael Wray, Mengmeng Xu, Eric Zhongcong Xu, Chen Zhao, Siddhant Bansal, Dhruv Batra, Vincent Cartillier, Sean Crane, Tien Do, Morrie Doulaty, Akshay Erapalli, Christoph Feichtenhofer, Adriano Fragomeni, Qichen Fu, Abrham Gebreselasie, Cristina González, James Hillis, Xuhua Huang, Yifei Huang, Wenqi Jia, Weslie Khoo, Jáchym Kolár, Satwik Kottur, Anurag Kumar, Federico Landini, Chao Li, Yanghao Li, Zhenqiang Li, Karttikeya Mangalam, Raghava Modhugu, Jonathan Munro, Tullie Murrell, Takumi Nishiyasu, Will Price, Paola Ruiz Puentes, Merey Ramazanova, Leda Sari, Kiran Somasundaram, Audrey Southerland, Yusuke Sugano, Ruijie Tao, Minh Vo, Yuchen Wang, Xindi Wu, Takuma Yagi, Ziwei Zhao, Yunyi Zhu, Pablo Arbeláez, David Crandall, Dima Damen, Giovanni Maria Farinella, Christian Fuegen, Bernard Ghanem, Vamsi Krishna Ithapu, C. V. Jawahar, Hanbyul Joo, Kris Kitani, Haizhou Li, Richard A. Newcombe, Aude Oliva, Hyun Soo Park, James M. Rehg, Yoichi Sato, Jianbo Shi, Mike Zheng Shou, Antonio Torralba, Lorenzo Torresani, Mingfei Yan, Jitendra Malik:
Ego4D: Around the World in 3,000 Hours of Egocentric Video. CVPR 2022: 18973-18990 - [c38]Ashutosh Pandey, Buye Xu, Anurag Kumar, Jacob Donley, Paul Calamia, DeLiang Wang:
TPARN: Triple-Path Attentive Recurrent Network for Time-Domain Multichannel Speech Enhancement. ICASSP 2022: 6497-6501 - [c37]Ashutosh Pandey, Buye Xu, Anurag Kumar, Jacob Donley, Paul Calamia, DeLiang Wang:
Multichannel Speech Enhancement Without Beamforming. ICASSP 2022: 6502-6506 - [c36]Efthymios Tzinis, Yossi Adi, Vamsi K. Ithapu, Buye Xu, Anurag Kumar:
Continual Self-Training With Bootstrapped Remixing For Speech Enhancement. ICASSP 2022: 6947-6951 - [c35]Zhiqi Kang, Mostafa Sadeghi, Radu Horaud, Xavier Alameda-Pineda, Jacob Donley, Anurag Kumar:
The Impact of Removing Head Movements on Audio-Visual Speech Enhancement. ICASSP 2022: 7302-7306 - [c34]Hanan Beit-On, Moti Lugasi, Lior Madmoni, Anjali Menon, Anurag Kumar, Jacob Donley, Vladimir Tourbabin, Boaz Rafaely:
Audio Signal Processing for Telepresence Based on Wearable Array in Noisy and Dynamic Scenes. ICASSP 2022: 8797-8801 - [c33]Sangeeta Srivastava, Yun Wang, Andros Tjandra, Anurag Kumar, Chunxi Liu, Kritika Singh, Yatharth Saraf:
Conformer-Based Self-Supervised Learning For Non-Speech Audio Tasks. ICASSP 2022: 8862-8866 - [c32]Pranay Manocha, Anurag Kumar, Buye Xu, Anjali Menon, Israel Dejene Gebru, Vamsi Krishna Ithapu, Paul Calamia:
SAQAM: Spatial Audio Quality Assessment Metric. INTERSPEECH 2022: 649-653 - [c31]Pranay Manocha, Anurag Kumar:
Speech Quality Assessment through MOS using Non-Matching References. INTERSPEECH 2022: 654-658 - [c30]Ashutosh Pandey, Buye Xu, Anurag Kumar, Jacob Donley, Paul Calamia, DeLiang Wang:
Time-domain Ad-hoc Array Speech Enhancement Using a Triple-path Network. INTERSPEECH 2022: 729-733 - [c29]Muqiao Yang, Joseph Konan, David Bick, Anurag Kumar, Shinji Watanabe, Bhiksha Raj:
Improving Speech Enhancement through Fine-Grained Speech Characteristics. INTERSPEECH 2022: 2953-2957 - [c28]Orel Ben Zaken, Boaz Rafaely, Anurag Kumar, Vladimir Tourbabin:
Direction Of Arrival Estimation For Reverberant Speech Based On Neural Networks And The Direct-Path Dominance Test. IWAENC 2022: 1-5 - [i41]Zhiqi Kang, Mostafa Sadeghi, Radu Horaud, Xavier Alameda-Pineda, Jacob Donley, Anurag Kumar:
The impact of removing head movements on audio-visual speech enhancement. CoRR abs/2202.00538 (2022) - [i40]Efthymios Tzinis, Yossi Adi, Vamsi Krishna Ithapu, Buye Xu, Paris Smaragdis, Anurag Kumar:
RemixIT: Continual self-training of speech enhancement models via bootstrapped remixing. CoRR abs/2202.08862 (2022) - [i39]Pranay Manocha, Anurag Kumar:
Speech Quality Assessment through MOS using Non-Matching References. CoRR abs/2206.12285 (2022) - [i38]Pranay Manocha, Anurag Kumar, Buye Xu, Anjali Menon, Israel D. Gebru, Vamsi K. Ithapu, Paul Calamia:
SAQAM: Spatial Audio Quality Assessment Metric. CoRR abs/2206.12297 (2022) - [i37]Muqiao Yang, Joseph Konan, David Bick, Anurag Kumar, Shinji Watanabe, Bhiksha Raj:
Improving Speech Enhancement through Fine-Grained Speech Characteristics. CoRR abs/2207.00237 (2022) - [i36]Kuan-Lin Chen, Daniel D. E. Wong, Ke Tan, Buye Xu, Anurag Kumar, Vamsi Krishna Ithapu:
Leveraging Heteroscedastic Uncertainty in Learning Complex Spectral Mapping for Single-channel Speech Enhancement. CoRR abs/2211.08624 (2022) - [i35]Rodrigo Mira, Buye Xu, Jacob Donley, Anurag Kumar, Stavros Petridis, Vamsi Krishna Ithapu, Maja Pantic:
LA-VocE: Low-SNR Audio-visual Speech Enhancement using Neural Vocoders. CoRR abs/2211.10999 (2022) - 2021
- [j1]Ke Tan, Buye Xu, Anurag Kumar, Eliya Nachmani, Yossi Adi:
SAGRNN: Self-Attentive Gated RNN For Binaural Speaker Separation With Interaural Cue Preservation. IEEE Signal Process. Lett. 28: 26-30 (2021) - [c27]Yangyang Xia, Buye Xu, Anurag Kumar:
Incorporating Real-World Noisy Speech in Neural-Network-Based Speech Enhancement Systems. ASRU 2021: 564-570 - [c26]Panagiotis Tzirakis, Anurag Kumar, Jacob Donley:
Multi-Channel Speech Enhancement Using Graph Neural Networks. ICASSP 2021: 3415-3419 - [c25]Anurag Kumar, Yun Wang, Vamsi Krishna Ithapu, Christian Fuegen:
Do Sound Event Representations Generalize to Other Audio Tasks? A Case Study in Audio Transfer Learning. Interspeech 2021: 1214-1218 - [c24]Pranay Manocha, Buye Xu, Anurag Kumar:
NORESQA: A Framework for Speech Quality Assessment using Non-Matching References. NeurIPS 2021: 22363-22378 - [c23]Pranay Manocha, Anurag Kumar, Buye Xu, Anjali Menon, Israel D. Gebru, Vamsi K. Ithapu, Paul Calamia:
DPLM: A Deep Perceptual Spatial-Audio Localization Metric. WASPAA 2021: 6-10 - [i34]Panagiotis Tzirakis, Anurag Kumar, Jacob Donley:
Multi-Channel Speech Enhancement using Graph Neural Networks. CoRR abs/2102.06934 (2021) - [i33]Pranay Manocha, Anurag Kumar, Buye Xu, Anjali Menon, Israel D. Gebru, Vamsi K. Ithapu, Paul Calamia:
DPLM: A Deep Perceptual Spatial-Audio Localization Metric. CoRR abs/2105.14180 (2021) - [i32]Anurag Kumar, Yun Wang, Vamsi Krishna Ithapu, Christian Fuegen:
Do sound event representations generalize to other audio tasks? A case study in audio transfer learning. CoRR abs/2106.11335 (2021) - [i31]Ori Kabeli, Yossi Adi, Zhenyu Tang, Buye Xu, Anurag Kumar:
Online Self-Attentive Gated RNNs for Real-Time Speaker Separation. CoRR abs/2106.13493 (2021) - [i30]Yangyang Xia, Buye Xu, Anurag Kumar:
Incorporating Real-world Noisy Speech in Neural-network-based Speech Enhancement Systems. CoRR abs/2109.05172 (2021) - [i29]Pranay Manocha, Buye Xu, Anurag Kumar:
NORESQA - A Framework for Speech Quality Assessment using Non-Matching References. CoRR abs/2109.08125 (2021) - [i28]Kristen Grauman, Andrew Westbury, Eugene Byrne, Zachary Chavis, Antonino Furnari, Rohit Girdhar, Jackson Hamburger, Hao Jiang, Miao Liu, Xingyu Liu, Miguel Martin, Tushar Nagarajan, Ilija Radosavovic, Santhosh Kumar Ramakrishnan, Fiona Ryan, Jayant Sharma, Michael Wray, Mengmeng Xu, Eric Zhongcong Xu, Chen Zhao, Siddhant Bansal, Dhruv Batra, Vincent Cartillier, Sean Crane, Tien Do, Morrie Doulaty, Akshay Erapalli, Christoph Feichtenhofer, Adriano Fragomeni, Qichen Fu, Christian Fuegen, Abrham Gebreselasie, Cristina González, James Hillis, Xuhua Huang, Yifei Huang, Wenqi Jia, Weslie Khoo, Jáchym Kolár, Satwik Kottur, Anurag Kumar, Federico Landini, Chao Li, Yanghao Li, Zhenqiang Li, Karttikeya Mangalam, Raghava Modhugu, Jonathan Munro, Tullie Murrell, Takumi Nishiyasu, Will Price, Paola Ruiz Puentes, Merey Ramazanova, Leda Sari, Kiran Somasundaram, Audrey Southerland, Yusuke Sugano, Ruijie Tao, Minh Vo, Yuchen Wang, Xindi Wu, Takuma Yagi, Yunyi Zhu, Pablo Arbeláez, David Crandall, Dima Damen, Giovanni Maria Farinella, Bernard Ghanem, Vamsi Krishna Ithapu, C. V. Jawahar, Hanbyul Joo, Kris Kitani, Haizhou Li, Richard A. Newcombe, Aude Oliva, Hyun Soo Park, James M. Rehg, Yoichi Sato, Jianbo Shi, Mike Zheng Shou, Antonio Torralba, Lorenzo Torresani, Mingfei Yan, Jitendra Malik:
Ego4D: Around the World in 3,000 Hours of Egocentric Video. CoRR abs/2110.07058 (2021) - [i27]Sangeeta Srivastava, Yun Wang, Andros Tjandra, Anurag Kumar, Chunxi Liu, Kritika Singh, Yatharth Saraf:
Conformer-Based Self-Supervised Learning for Non-Speech Audio Tasks. CoRR abs/2110.07313 (2021) - [i26]Efthymios Tzinis, Yossi Adi, Vamsi K. Ithapu, Buye Xu, Anurag Kumar:
Continual self-training with bootstrapped remixing for speech enhancement. CoRR abs/2110.10103 (2021) - [i25]Ashutosh Pandey, Buye Xu, Anurag Kumar, Jacob Donley, Paul Calamia, DeLiang Wang:
TPARN: Triple-path Attentive Recurrent Network for Time-domain Multichannel Speech Enhancement. CoRR abs/2110.10757 (2021) - [i24]Ashutosh Pandey, Buye Xu, Anurag Kumar, Jacob Donley, Paul Calamia, DeLiang Wang:
TADRN: Triple-Attentive Dual-Recurrent Network for Ad-hoc Array Multichannel Speech Enhancement. CoRR abs/2110.11844 (2021) - [i23]Ashutosh Pandey, Buye Xu, Anurag Kumar, Jacob Donley, Paul Calamia, DeLiang Wang:
Multichannel Speech Enhancement without Beamforming. CoRR abs/2110.13130 (2021) - [i22]Jonah Casebeer, Jacob Donley, Daniel Wong, Buye Xu, Anurag Kumar:
NICE-Beam: Neural Integrated Covariance Estimators for Time-Varying Beamformers. CoRR abs/2112.04613 (2021) - 2020
- [c22]Anurag Kumar, Vamsi Krishna Ithapu:
SeCoST: Sequential Co-Supervision for Large Scale Weakly Labeled Audio Event Detection. ICASSP 2020: 666-670 - [c21]Anurag Kumar, Vamsi K. Ithapu:
A Sequential Self Teaching Approach for Improving Generalization in Sound Event Recognition. ICML 2020: 5447-5457 - [i21]Anurag Kumar, Vamsi Krishna Ithapu:
A Sequential Self Teaching Approach for Improving Generalization in Sound Event Recognition. CoRR abs/2007.00144 (2020) - [i20]Ke Tan, Buye Xu, Anurag Kumar, Eliya Nachmani, Yossi Adi:
SAGRNN: Self-Attentive Gated RNN for Binaural Speaker Separation with Interaural Cue Preservation. CoRR abs/2009.01381 (2020)
2010 – 2019
- 2019
- [c20]Anurag Kumar, Ankit Shah, Alexander G. Hauptmann, Bhiksha Raj:
Learning Sound Events from Webly Labeled Data. IJCAI 2019: 2772-2778 - [i19]Anurag Kumar, Vamsi Krishna Ithapu:
SeCoST: Sequential Co-Supervision for Weakly Labeled Audio Event Detection. CoRR abs/1910.11789 (2019) - 2018
- [c19]Anurag Kumar, Maksim Khadkevich, Christian Fügen:
Knowledge Transfer from Weakly Labeled Audio Using Convolutional Neural Network for Sound Events and Scenes. ICASSP 2018: 326-330 - [c18]Rohan Badlani, Ankit Shah, Benjamin Elizalde, Anurag Kumar, Bhiksha Raj:
Framework for Evaluation of Sound Event Detection in Web Videos. ICASSP 2018: 3096-3100 - [c17]Pranay Manocha, Rohan Badlani, Anurag Kumar, Ankit Shah, Benjamin Elizalde, Bhiksha Raj:
Content-Based Representations of Audio Using Siamese Neural Networks. ICASSP 2018: 3136-3140 - [c16]Anurag Kumar, Bhiksha Raj:
Classifier Risk Estimation Under Limited Labeling Resources. PAKDD (1) 2018: 3-15 - [i18]Benjamin Elizalde, Rohan Badlani, Ankit Shah, Anurag Kumar, Bhiksha Raj:
NELS - Never-Ending Learner of Sounds. CoRR abs/1801.05544 (2018) - [i17]Ankit Shah, Anurag Kumar, Alexander G. Hauptmann, Bhiksha Raj:
A Closer Look at Weak Label Learning for Audio Events. CoRR abs/1804.09288 (2018) - [i16]Anurag Kumar, Ankit Shah, Alexander G. Hauptmann, Bhiksha Raj:
Learning Sound Events From Webly Labeled Data. CoRR abs/1811.09967 (2018) - 2017
- [c15]Benjamin Elizalde, Ankit Shah, Siddharth Dalmia, Min Hun Lee, Rohan Badlani, Anurag Kumar, Bhiksha Raj, Ian R. Lane:
An approach for self-training audio event detectors using web data. EUSIPCO 2017: 1863-1867 - [c14]Anurag Kumar, Bhiksha Raj, Ndapandula Nakashole:
Discovering sound concepts and acoustic relations in text. ICASSP 2017: 631-635 - [c13]Bhiksha Raj, Anurag Kumar:
Audio event and scene recognition: A unified approach using strongly and weakly labeled data. IJCNN 2017: 3475-3482 - [c12]Anurag Kumar, Benjamin Elizalde, Bhiksha Raj:
Audio Content Based Geotagging in Multimedia. INTERSPEECH 2017: 1874-1878 - [i15]Anurag Kumar, Bhiksha Raj:
Deep CNN Framework for Audio Event Recognition using Weakly Labeled Web Data. CoRR abs/1707.02530 (2017) - [i14]Pranay Manocha, Rohan Badlani, Anurag Kumar, Ankit Shah, Benjamin Elizalde, Bhiksha Raj:
Content-based Representations of audio using Siamese neural networks. CoRR abs/1710.10974 (2017) - [i13]Rohan Badlani, Ankit Shah, Benjamin Elizalde, Anurag Kumar, Bhiksha Raj:
Framework for evaluation of sound event detection in web videos. CoRR abs/1711.00804 (2017) - [i12]Anurag Kumar, Maksim Khadkevich, Christian Fügen:
Knowledge Transfer from Weakly Labeled Audio using Convolutional Neural Network for Sound Events and Scenes. CoRR abs/1711.01369 (2017) - 2016
- [c11]Benjamin Elizalde, Anurag Kumar, Ankit Shah, Rohan Badlani, Emmanuel Vincent, Bhiksha Raj, Ian R. Lane:
Experiments on the DCASE Challenge 2016: Acoustic Scene Classification and Sound Event Detection in Real Life Recording. DCASE 2016: 20-24 - [c10]Anurag Kumar, Bhiksha Raj:
Weakly supervised scalable audio content analysis. ICME 2016: 1-6 - [c9]Anurag Kumar, Dinei A. F. Florêncio:
Speech Enhancement in Multiple-Noise Conditions Using Deep Neural Networks. INTERSPEECH 2016: 3738-3742 - [c8]Anurag Kumar, Bhiksha Raj:
Audio Event Detection using Weakly Labeled Data. ACM Multimedia 2016: 1038-1047 - [i11]Anurag Kumar, Bhiksha Raj:
Audio Event Detection using Weakly Labeled Data. CoRR abs/1605.02401 (2016) - [i10]Anurag Kumar, Dinei Florêncio:
Speech Enhancement In Multiple-Noise Conditions using Deep Neural Networks. CoRR abs/1605.02427 (2016) - [i9]Anurag Kumar, Benjamin Elizalde, Bhiksha Raj:
Audio Content based Geotagging in Multimedia. CoRR abs/1606.02816 (2016) - [i8]Anurag Kumar, Bhiksha Raj:
Weakly Supervised Scalable Audio Content Analysis. CoRR abs/1606.03664 (2016) - [i7]Anurag Kumar, Bhiksha Raj:
Classifier Risk Estimation under Limited Labeling Resources. CoRR abs/1607.02665 (2016) - [i6]Anurag Kumar, Bhiksha Raj:
Features and Kernels for Audio Event Recognition. CoRR abs/1607.05765 (2016) - [i5]Benjamin Elizalde, Anurag Kumar, Ankit Shah, Rohan Badlani, Emmanuel Vincent, Bhiksha Raj, Ian R. Lane:
Experiments on the DCASE Challenge 2016: Acoustic Scene Classification and Sound Event Detection in Real Life Recording. CoRR abs/1607.06706 (2016) - [i4]Ankit Shah, Rohan Badlani, Anurag Kumar, Benjamin Elizalde, Bhiksha Raj:
An Approach for Self-Training Audio Event Detectors Using Web Data. CoRR abs/1609.06026 (2016) - [i3]Anurag Kumar, Bhiksha Raj, Ndapandula Nakashole:
Discovering Sound Concepts and Acoustic Relations In Text. CoRR abs/1609.07384 (2016) - [i2]Anurag Kumar, Bhiksha Raj:
Audio Event and Scene Recognition: A Unified Approach using Strongly and Weakly Labeled Data. CoRR abs/1611.04871 (2016) - 2015
- [c7]Anurag Kumar, Bhiksha Raj:
A novel ranking method for multiple classifier systems. ICASSP 2015: 1931-1935 - [c6]Shoou-I Yu, Lu Jiang, Zhongwen Xu, Zhenzhong Lan, Shicheng Xu, Xiaojun Chang, Xuanchong Li, Zexi Mao, Chuang Gan, Yajie Miao, Xingzhong Du, Yang Cai, Lara J. Martin, Nikolas Wolfe, Anurag Kumar, Huan Li, Ming Lin, Zhigang Ma, Yi Yang, Deyu Meng, Shiguang Shan, Pinar Duygulu Sahin, Susanne Burger, Florian Metze, Rita Singh, Bhiksha Raj, Teruko Mitamura, Richard M. Stern, Alexander G. Hauptmann:
CMU Informedia@TRECVID 2015: MED/SIN/LNK/SED. TRECVID 2015 - [i1]Anurag Kumar, Bhiksha Raj:
Unsupervised Fusion Weight Learning in Multiple Classifier Systems. CoRR abs/1502.01823 (2015) - 2014
- [c5]Anurag Kumar, Rita Singh, Bhiksha Raj:
Detecting sound objects in audio recordings. EUSIPCO 2014: 905-909 - [c4]Karan Nathwani, Anurag Kumar, Rajesh M. Hegde:
Monaural speaker segregation using group delay spectral matrix factorization. NCC 2014: 1-6 - [c3]Shoou-I Yu, Lu Jiang, Zhongwen Xu, Zhenzhong Lan, Shicheng Xu, Xiaojun Chang, Xuanchong Li, Zexi Mao, Chuang Gan, Yajie Miao, Xingzhong Du, Yang Cai, Lara J. Martin, Nikolas Wolfe, Anurag Kumar, Huan Li, Ming Lin, Zhigang Ma, Yi Yang, Deyu Meng, Shiguang Shan, Pinar Duygulu Sahin, Susanne Burger, Florian Metze, Rita Singh, Bhiksha Raj, Teruko Mitamura, Richard M. Stern, Alexander G. Hauptmann, Anil Armagan, Yicheng Zhao:
Informedia @ TRECVID 2014. TRECVID 2014 - 2013
- [c2]Anurag Kumar, Rajesh M. Hegde, Rita Singh, Bhiksha Raj:
Event detection in short duration audio using Gaussian Mixture Model and Random Forest Classifier. EUSIPCO 2013: 1-5 - 2012
- [c1]Anurag Kumar, Pranay Dighe, Rita Singh, Sourish Chaudhuri, Bhiksha Raj:
Audio event detection from acoustic unit occurrence patterns. ICASSP 2012: 489-492
last updated on 2025-01-16 23:11 CET by the dblp team
all metadata released as open data under CC0 1.0 license