18th ECCV 2024: Milan, Italy - Part X
- Ales Leonardis, Elisa Ricci, Stefan Roth, Olga Russakovsky, Torsten Sattler, Gül Varol: Computer Vision - ECCV 2024 - 18th European Conference, Milan, Italy, September 29-October 4, 2024, Proceedings, Part X. Lecture Notes in Computer Science 15068, Springer 2025, ISBN 978-3-031-72683-5
- Chao Huang, Dejan Markovic, Chenliang Xu, Alexander Richard: Modeling and Driving Human Body Soundfields Through Acoustic Primitives. 1-17
- Zixian Ma, Weikai Huang, Jieyu Zhang, Tanmay Gupta, Ranjay Krishna: m&m's: A Benchmark to Evaluate Tool-Use for multi-step multi-modal Tasks. 18-34
- Jinxing Zhou, Dan Guo, Yuxin Mao, Yiran Zhong, Xiaojun Chang, Meng Wang: Label-Anticipated Event Disentanglement for Audio-Visual Video Parsing. 35-51
- Qi Zuo, Xiaodong Gu, Yuan Dong, Zhengyi Zhao, Weihao Yuan, Lingteng Qiu, Liefeng Bo, Zilong Dong: High-Fidelity 3D Textured Shapes Generation by Sparse Encoding and Adversarial Decoding. 52-69
- Hongtao Wu, Yijun Yang, Angelica I. Avilés-Rivero, Jingjing Ren, Sixiang Chen, Haoyu Chen, Lei Zhu: Semi-supervised Video Desnowing Network via Temporal Decoupling Experts and Distribution-Driven Contrastive Regularization. 70-89
- Xiaobao Wei, Jiajun Cao, Yizhu Jin, Ming Lu, Guangyu Wang, Shanghang Zhang: I-MedSAM: Implicit Medical Image Segmentation with Segment Anything. 90-107
- Yuhuan Yang, Chaofan Ma, Jiangchao Yao, Zhun Zhong, Ya Zhang, Yanfeng Wang: ReMamber: Referring Image Segmentation with Mamba Twister. 108-126
- Jiahe Li, Jiawei Zhang, Xiao Bai, Jin Zheng, Xin Ning, Jun Zhou, Lin Gu: TalkingGaussian: Structure-Persistent 3D Talking Head Synthesis via Gaussian Splatting. 127-145
- Qilang Ye, Zitong Yu, Rui Shao, Xinyu Xie, Philip Torr, Xiaochun Cao: CAT: Enhancing Multimodal Large Language Model to Answer Questions in Dynamic Audio-Visual Scenarios. 146-164
- Hengyu Zhou, Hui Zhang, Bin Wang: Segmentation-Guided Layer-Wise Image Vectorization with Gradient Fills. 165-180
- Yarden Frenkel, Yael Vinker, Ariel Shamir, Daniel Cohen-Or: Implicit Style-Content Separation Using B-LoRA. 181-198
- Zijian Zhou, Zheng Zhu, Holger Caesar, Miaojing Shi: OpenPSG: Open-Set Panoptic Scene Graph Generation via Large Multimodal Models. 199-215
- Liangyang Ouyang, Ruicong Liu, Yifei Huang, Ryosuke Furuta, Yoichi Sato: ActionVOS: Actions as Prompts for Video Object Segmentation. 216-235
- Jiedong Zhuang, Jiaqi Hu, Lianrui Mu, Rui Hu, Xiaoyu Liang, Jiangnan Ye, Haoji Hu: FALIP: Visual Prompt as Foveal Attention Boosts CLIP Zero-Shot Performance. 236-253
- Li Zhang, Weiqing Meng, Yan Zhong, Bin Kong, Mingliang Xu, Jianming Du, Xue Wang, Rujing Wang, Liu Liu: U-COPE: Taking a Further Step to Universal 9D Category-Level Object Pose Estimation. 254-270
- Naiyu Yin, Hanjing Wang, Yue Yu, Tian Gao, Amit Dhurandhar, Qiang Ji: Integrating Markov Blanket Discovery Into Causal Representation Learning for Domain Generalization. 271-288
- Byeongho Heo, Song Park, Dongyoon Han, Sangdoo Yun: Rotary Position Embedding for Vision Transformer. 289-305
- Seokju Cho, Jiahui Huang, Jisu Nam, Honggyu An, Seungryong Kim, Joon-Young Lee: Local All-Pair Correspondence for Point Tracking. 306-325
- Youngmin Oh, Hyung-Il Kim, Seong Tae Kim, Jung Uk Kim: MonoWAD: Weather-Adaptive Diffusion Model for Robust Monocular 3D Object Detection. 326-345
- Taewoong Kim, Cheolhong Min, Byeonghwi Kim, Jinyeon Kim, Wonje Jeung, Jonghyun Choi: ReALFRED: An Embodied Instruction Following Benchmark in Photo-Realistic Environments. 346-364
- Dongze Li, Kang Zhao, Wei Wang, Yifeng Ma, Bo Peng, Yingya Zhang, Jing Dong: S3D-NeRF: Single-Shot Speech-Driven Neural Radiance Field for High Fidelity Talking Head Synthesis. 365-382
- Hyolim Kang, Jeongseok Hyun, Joungbin An, Youngjae Yu, Seon Joo Kim: ActionSwitch: Class-Agnostic Detection of Simultaneous Actions in Streaming Videos. 383-400
- Subin Jeon, In Cho, Minsu Kim, Woong Oh Cho, Seon Joo Kim: Hierarchically Structured Neural Bones for Reconstructing Animatable Objects from Casual Videos. 401-419
- Xiaoyu Liu, Xin Ding, Lei Yu, Yuanyuan Xi, Wei Li, Zhijun Tu, Jie Hu, Hanting Chen, Baoqun Yin, Zhiwei Xiong: PQ-SAM: Post-training Quantization for Segment Anything Model. 420-437
- Yuanhong Chen, Chong Wang, Yuyuan Liu, Hu Wang, Gustavo Carneiro: CPM: Class-Conditional Prompting Machine for Audio-Visual Segmentation. 438-456
- Shreyank N. Gowda, Anurag Arnab, Jonathan Huang: Optimizing Factorized Encoder Models: Time and Memory Reduction for Scalable and Efficient Action Recognition. 457-474
- Jiuming Liu, Dong Zhuo, Zhiheng Feng, Siting Zhu, Chensheng Peng, Zhe Liu, Hesheng Wang: DVLO: Deep Visual-LiDAR Odometry with Local-to-Global Feature Fusion and Bi-directional Structure Alignment. 475-493