MMM 2025, Nara, Japan - Part II
- Ichiro Ide, Ioannis Kompatsiaris, Changsheng Xu, Keiji Yanai, Wei-Ta Chu, Naoko Nitta, Michael Riegler, Toshihiko Yamasaki:
MultiMedia Modeling - 31st International Conference on Multimedia Modeling, MMM 2025, Nara, Japan, January 8-10, 2025, Proceedings, Part II. Lecture Notes in Computer Science 15521, Springer 2025, ISBN 978-981-96-2060-9
Regular Papers
- Yixiao Xu, Yubo Li, Wanzhao Xu, Yicheng Gu, Yun Wang, Jiangyuan Ma, Zhengwei Qi:
gFlow: Distributed Real-Time Reverse Remote Rendering System Model. 3-16
- Jiaxing Chen, Yuxuan Liu, Dehu Li, Xiang An, Weimo Deng, Ziyong Feng, Yongle Zhao, Yin Xie:
Grounding Deliberate Reasoning in Multimodal Large Language Models. 17-30
- Shuijing Zheng, Suxi Yu, Yi Wang, Jing Wen:
GWUNet: A UNet with Gated Attention and Improved Wavelet Transform for Thyroid Nodules Segmentation. 31-44
- Liang-Chia Chen, Wei-Ta Chu:
HCV: Lightweight Hybrid CNN-Vision Transformer for Visual Object Tracking. 45-59
- Alex Falcon, Ali Abdari, Giuseppe Serra:
HierArtEx: Hierarchical Representations and Art Experts Supporting the Retrieval of Museums in the Metaverse. 60-73
- Yuyao Ye, Jiayu Yang, Yang Zhao, Mengping Gao, Hongbin Cao, Ronggang Wang:
Hybrid Scalable Video Coding with Neural Compression and Enhancement for Streaming Media. 74-86
- Jingkun Li, Na Qi, Qing Zhu:
Hyper-NeuS: Hypernetworks for Neural SDF Implicit Surface Reconstruction by Volume Rendering. 87-100
- Vu Thi Ngoc Anh, Yoshiyuki Shoji, Yuma Oe, Huu-Long Pham, Hiroaki Ohshima:
Image-Generation AI Model Retrieval by Contrastive Learning-Based Style Distance Calculation. 101-114
- Miguel Perez, Holger Kirchhoff, Peter Grosche, Xavier Serra:
Improving Singing Voice Transcription Generalization with AI Generated Accompaniments. 115-128
- Xiuhong Li, Xinyue Zhu, Boyuan Li, Songlin Li, Luyao Wang, Zhenhong Jia:
Infrared Small Target Detection with Feature Refinement and Context Enhancement. 129-140
- Wolfgang Hürst, Yannick Visser:
Innovative Lifelog Visualization and Exploration in Virtual Reality - A Comparative Study. 141-154
- Xiukang Yang, Jingguo Ge, Hui Li, Liangxiong Li, Bingzhen Wu:
Integrating S1 &S2 Framework for Enhanced Semantic Match in Person Re-identification. 155-168
- Xiang Tian, Yuan Zhang, Chang Mu, Ziyang Zhang:
Intra-class Compact Facial Expression Recognition Based on Amplitude Phase Separation. 169-182
- Fei Wu, Ruixuan Zhou, Yimu Ji, Xiao-Yuan Jing:
Joint Decision Network with Modality-Specific and Dual Interactive Features for Fake News Detection. 183-196
- Kosetsu Tsukuda, Takumi Takahashi, Keisuke Ishida, Masahiro Hamasaki, Masataka Goto:
Kiite World: Socializing Map-Based Music Exploration Through Playlist Sharing and Synchronized Listening. 197-211
- Honghui Yuan, Keiji Yanai:
KuzushijiDiffuser: Japanese Kuzushiji Font Generation with FontDiffuser. 212-225
- Jingyao Zhang, Shijie Hao, Fuming Sun, Yuan Rao:
LIESA: Low-Light Image Enhancement with Semantic Awareness. 226-239
- Jiajie Liu, Zhibin Zhang:
Lightweight Dual Grouped Large-Kernel Convolutions for Salient Object Detection Network. 240-253
- Ilhwan Kwon, Jun Li, Rajiv Ratn Shah, Mukesh Prasad:
Lightweight Motion-Aware Video Super-Resolution for Compressed Videos. 254-267
- Tatsumi Sunada, Kaede Shiohara, Ling Xiao, Toshihiko Yamasaki:
LITA: LMM-Guided Image-Text Alignment for Art Assessment. 268-281
- Qing Wang, Chong-Wah Ngo, Ee-Peng Lim, Qianru Sun:
LLMs-Based Augmentation for Domain Adaptation in Long-Tailed Food Datasets. 282-295
- Jiahua Si, Youze Wang, Wenbo Hu, Qiang Liu, Richang Hong:
Making Strides Security in Multimodal Fake News Detection Models: A Comprehensive Analysis of Adversarial Attacks. 296-309
- Deli Zhu, Zhao Xu, Yunong Yang:
MambaTalk: Speech-Driven 3D Facial Animation with Mamba. 310-323
- Jingdong Wang, Xu Ding, Fanqi Meng:
MC-YOLO: Multi-scale Transmission Line Defect Target Recognition Network. 324-337
- Hao Yan, Jing Bai:
MDT-Net: A Mask Decoder Tuning Strategy for CLIP-Based Zero-Shot 3D Classification. 338-350
- Zepu Yi, Songfeng Lu, Xueming Tang, Jianxin Zhu, Junjun Wu:
MICAN: Multi-modal Inconsistency-Based Cooperation Attention Network for Fake News Detection. 351-363
- Yaling Hao, Wei Wu:
MineTinyNet-YOLO: An Efficient Small Object Detection Method for Complex Underground Coal Mine Scenarios. 364-378
- Xin Lim, Lai-Kuan Wong, Yuen Peng Loh, Ke Gu, Weisi Lin:
Mix-YOLONet: Deep Image Dehazing for Improving Object Detection. 379-393
- Jiahao Zhang, Xiao Zhao, Guangyu Gao:
MKSNet: Advanced Small Object Detection in Remote Sensing Imagery with Multi-Kernel and Dual Attention Mechanisms. 394-407
- Zeyu Cai, Can Zhang, Yuchong Chen, Xunhao Chen, Jiming Yang, Wubin Shi, Feipeng Da, Chengqian Jin:
MLP-AMDC: A MLP Architecture for Adaptive-Mask-Based Dual-Camera Snapshot Hyperspectral Imaging. 408-423
- Junhao Guo, Chenhan Fu, Guoming Wang, Rongxing Lu, Dong Chen, Siliang Tang:
MM-CARP: Multimodal Model with Cross-Modal Retrieval-Augmented and Visual Region Perception. 424-437
- Guohui Ding, Zhonghua Li, Yongqiang Ren:
Modality-Specific Hashing: Transform Cross-Modal Retrieval Into Single-Modal Retrieval. 438-451