default search action
Andrew Rouditchenko
Person information
Refine list
refinements active!
zoomed in on ?? of ?? records
view refined list in
export refined list as
2020 – today
- 2024
- [c13]Brian Chen, Nina Shvetsova, Andrew Rouditchenko, Daniel Kondermann, Samuel Thomas, Shih-Fu Chang, Rogério Feris, James R. Glass, Hilde Kuehne:
What, When, and Where? Self-Supervised Spatio- Temporal Grounding in Untrimmed Multi-Action Videos from Narrated Instructions. CVPR 2024: 18419-18429 - [i18]Andrew Rouditchenko, Yuan Gong, Samuel Thomas, Leonid Karlinsky, Hilde Kuehne, Rogério Feris, James R. Glass:
Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation. CoRR abs/2406.10082 (2024) - 2023
- [c12]Andrew Rouditchenko, Yung-Sung Chuang, Nina Shvetsova, Samuel Thomas, Rogério Feris, Brian Kingsbury, Leonid Karlinsky, David Harwath, Hilde Kuehne, James R. Glass:
C2KD: Cross-Lingual Cross-Modal Knowledge Distillation for Multilingual Text-Video Retrieval. ICASSP 2023: 1-5 - [c11]Yuan Gong, Andrew Rouditchenko, Alexander H. Liu, David Harwath, Leonid Karlinsky, Hilde Kuehne, James R. Glass:
Contrastive Audio-Visual Masked Autoencoder. ICLR 2023 - [c10]Andrew Rouditchenko, Sameer Khurana, Samuel Thomas, Rogério Feris, Leonid Karlinsky, Hilde Kuehne, David Harwath, Brian Kingsbury, James R. Glass:
Comparison of Multilingual Self-Supervised and Weakly-Supervised Speech Pre-Training for Adaptation to Unseen Languages. INTERSPEECH 2023: 2268-2272 - [i17]Brian Chen, Nina Shvetsova, Andrew Rouditchenko, Daniel Kondermann, Samuel Thomas, Shih-Fu Chang, Rogério Feris, James R. Glass, Hilde Kuehne:
What, when, and where? - Self-Supervised Spatio-Temporal Grounding in Untrimmed Multi-Action Videos from Narrated Instructions. CoRR abs/2303.16990 (2023) - [i16]Andrew Rouditchenko, Sameer Khurana, Samuel Thomas, Rogério Feris, Leonid Karlinsky, Hilde Kuehne, David Harwath, Brian Kingsbury, James R. Glass:
Comparison of Multilingual Self-Supervised and Weakly-Supervised Speech Pre-Training for Adaptation to Unseen Languages. CoRR abs/2305.12606 (2023) - [i15]Andrew Rouditchenko, Ronan Collobert, Tatiana Likhomanenko:
AV-CPL: Continuous Pseudo-Labeling for Audio-Visual Speech Recognition. CoRR abs/2309.17395 (2023) - 2022
- [j1]Yuan Gong, Alexander H. Liu, Andrew Rouditchenko, James R. Glass:
UAVM: Towards Unifying Audio and Visual Models. IEEE Signal Process. Lett. 29: 2437-2441 (2022) - [c9]Alexander H. Liu, SouYoung Jin, Cheng-I Lai, Andrew Rouditchenko, Aude Oliva, James R. Glass:
Cross-Modal Discrete Representation Learning. ACL (1) 2022: 3013-3035 - [c8]Nina Shvetsova, Brian Chen, Andrew Rouditchenko, Samuel Thomas, Brian Kingsbury, Rogério Feris, David Harwath, James R. Glass, Hilde Kuehne:
Everything at Once - Multi-modal Fusion Transformer for Video Retrieval. CVPR 2022: 19988-19997 - [i14]Yuan Gong, Sameer Khurana, Andrew Rouditchenko, James R. Glass:
CMKD: CNN/Transformer-Based Cross-Model Knowledge Distillation for Audio Classification. CoRR abs/2203.06760 (2022) - [i13]Yuan Gong, Alexander H. Liu, Andrew Rouditchenko, James R. Glass:
UAVM: A Unified Model for Audio-Visual Learning. CoRR abs/2208.00061 (2022) - [i12]Andrew Rouditchenko, Yung-Sung Chuang, Nina Shvetsova, Samuel Thomas, Rogério Feris, Brian Kingsbury, Leonid Karlinsky, David Harwath, Hilde Kuehne, James R. Glass:
C2KD: Cross-Lingual Cross-Modal Knowledge Distillation for Multilingual Text-Video Retrieval. CoRR abs/2210.03625 (2022) - [i11]Yuan Gong, Andrew Rouditchenko, Alexander H. Liu, David Harwath, Leonid Karlinsky, Hilde Kuehne, James R. Glass:
Contrastive Audio-Visual Masked Autoencoder. CoRR abs/2210.07839 (2022) - 2021
- [c7]Brian Chen, Andrew Rouditchenko, Kevin Duarte, Hilde Kuehne, Samuel Thomas, Angie W. Boggust, Rameswar Panda, Brian Kingsbury, Rogério Feris, David Harwath, James R. Glass, Michael Picheny, Shih-Fu Chang:
Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos. ICCV 2021: 7992-8001 - [c6]Andrew Rouditchenko, Angie W. Boggust, David Harwath, Brian Chen, Dhiraj Joshi, Samuel Thomas, Kartik Audhkhasi, Hilde Kuehne, Rameswar Panda, Rogério Schmidt Feris, Brian Kingsbury, Michael Picheny, Antonio Torralba, James R. Glass:
AVLnet: Learning Audio-Visual Language Representations from Instructional Videos. Interspeech 2021: 1584-1588 - [c5]Andrew Rouditchenko, Angie W. Boggust, David Harwath, Samuel Thomas, Hilde Kuehne, Brian Chen, Rameswar Panda, Rogério Feris, Brian Kingsbury, Michael Picheny, James R. Glass:
Cascaded Multilingual Audio-Visual Learning from Videos. Interspeech 2021: 3006-3010 - [c4]Ian Palmer, Andrew Rouditchenko, Andrei Barbu, Boris Katz, James R. Glass:
Spoken ObjectNet: A Bias-Controlled Spoken Caption Dataset. Interspeech 2021: 3650-3654 - [i10]Brian Chen, Andrew Rouditchenko, Kevin Duarte, Hilde Kuehne, Samuel Thomas, Angie W. Boggust, Rameswar Panda, Brian Kingsbury, Rogério Schmidt Feris, David Harwath, James R. Glass, Michael Picheny, Shih-Fu Chang:
Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos. CoRR abs/2104.12671 (2021) - [i9]Alexander H. Liu, SouYoung Jin, Cheng-I Jeff Lai, Andrew Rouditchenko, Aude Oliva, James R. Glass:
Cross-Modal Discrete Representation Learning. CoRR abs/2106.05438 (2021) - [i8]Ian Palmer, Andrew Rouditchenko, Andrei Barbu, Boris Katz, James R. Glass:
Spoken ObjectNet: A Bias-Controlled Spoken Caption Dataset. CoRR abs/2110.07575 (2021) - [i7]Andrew Rouditchenko, Angie W. Boggust, David Harwath, Samuel Thomas, Hilde Kuehne, Brian Chen, Rameswar Panda, Rogério Feris, Brian Kingsbury, Michael Picheny, James R. Glass:
Cascaded Multilingual Audio-Visual Learning from Videos. CoRR abs/2111.04823 (2021) - [i6]Kevin Duarte, Brian Chen, Nina Shvetsova, Andrew Rouditchenko, Samuel Thomas, Alexander H. Liu, David Harwath, James R. Glass, Hilde Kuehne, Mubarak Shah:
Routing with Self-Attention for Multimodal Capsule Networks. CoRR abs/2112.00775 (2021) - [i5]Nina Shvetsova, Brian Chen, Andrew Rouditchenko, Samuel Thomas, Brian Kingsbury, Rogério Feris, David Harwath, James R. Glass, Hilde Kuehne:
Everything at Once - Multi-modal Fusion Transformer for Video Retrieval. CoRR abs/2112.04446 (2021) - 2020
- [i4]Andrew Rouditchenko, Angie W. Boggust, David Harwath, Dhiraj Joshi, Samuel Thomas, Kartik Audhkhasi, Rogério Feris, Brian Kingsbury, Michael Picheny, Antonio Torralba, James R. Glass:
AVLnet: Learning Audio-Visual Language Representations from Instructional Videos. CoRR abs/2006.09199 (2020)
2010 – 2019
- 2019
- [c3]Andrew Rouditchenko, Hang Zhao, Chuang Gan, Josh H. McDermott, Antonio Torralba:
Self-Supervised Segmentation and Source Separation on Videos. CVPR Workshops 2019: 0 - [c2]Andrew Rouditchenko, Hang Zhao, Chuang Gan, Josh H. McDermott, Antonio Torralba:
Self-supervised Audio-visual Co-segmentation. ICASSP 2019: 2357-2361 - [i3]Andrew Rouditchenko, Hang Zhao, Chuang Gan, Josh H. McDermott, Antonio Torralba:
Self-Supervised Audio-Visual Co-Segmentation. CoRR abs/1904.09013 (2019) - [i2]Tyler Lee, Ting Gong, Suchismita Padhy, Andrew Rouditchenko, Anthony Ndirango:
Label-efficient audio classification through multitask learning and self-supervision. CoRR abs/1910.12587 (2019) - 2018
- [c1]Hang Zhao, Chuang Gan, Andrew Rouditchenko, Carl Vondrick, Josh H. McDermott, Antonio Torralba:
The Sound of Pixels. ECCV (1) 2018: 587-604 - [i1]Hang Zhao, Chuang Gan, Andrew Rouditchenko, Carl Vondrick, Josh H. McDermott, Antonio Torralba:
The Sound of Pixels. CoRR abs/1804.03160 (2018)
Coauthor Index
manage site settings
To protect your privacy, all features that rely on external API calls from your browser are turned off by default. You need to opt-in for them to become active. All settings here will be stored as cookies with your web browser. For more information see our F.A.Q.
Unpaywalled article links
Add open access links from to the list of external document links (if available).
Privacy notice: By enabling the option above, your browser will contact the API of unpaywall.org to load hyperlinks to open access articles. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Unpaywall privacy policy.
Archived links via Wayback Machine
For web page which are no longer available, try to retrieve content from the of the Internet Archive (if available).
Privacy notice: By enabling the option above, your browser will contact the API of archive.org to check for archived content of web pages that are no longer available. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Internet Archive privacy policy.
Reference lists
Add a list of references from , , and to record detail pages.
load references from crossref.org and opencitations.net
Privacy notice: By enabling the option above, your browser will contact the APIs of crossref.org, opencitations.net, and semanticscholar.org to load article reference information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Crossref privacy policy and the OpenCitations privacy policy, as well as the AI2 Privacy Policy covering Semantic Scholar.
Citation data
Add a list of citing articles from and to record detail pages.
load citations from opencitations.net
Privacy notice: By enabling the option above, your browser will contact the API of opencitations.net and semanticscholar.org to load citation information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the OpenCitations privacy policy as well as the AI2 Privacy Policy covering Semantic Scholar.
OpenAlex data
Load additional information about publications from .
Privacy notice: By enabling the option above, your browser will contact the API of openalex.org to load additional information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the information given by OpenAlex.
last updated on 2024-10-07 02:27 CEST by the dblp team
all metadata released as open data under CC0 1.0 license
see also: Terms of Use | Privacy Policy | Imprint