MMAsia 2024: Auckland, New Zealand
- Ruili Wang, Zhiyong Wang, Jiaying Liu, Alberto Del Bimbo, Jun Zhou, Anup Basu, Min Xu:
Proceedings of the 6th ACM International Conference on Multimedia in Asia, MMAsia 2024, Auckland, New Zealand, December 3-6, 2024. ACM 2024, ISBN 979-8-4007-1273-9
Full Papers
- Chen-Hsiu Huang, Ja-Ling Wu:
SLIC: Secure Learned Image Codec through Compressed Domain Watermarking to Defend Image Manipulation. 1:1-1:7 - Bingyang Cui, Yujie Zhang, Qi Yang, Yiling Xu:
MS-GeodesicPSIM: Predicting the Quality of Static Mesh with Texture Map via multi-scale Geodesic Patch Similarity. 2:1-2:7 - Haowei Lou, Hye-Young Paik, Wen Hu, Lina Yao:
StyleSpeech: Parameter-efficient Fine Tuning for Pre-trained Controllable Text-to-Speech. 3:1-3:7 - Jian Ma, Bin Zhu, Kun Li, Dima Damen:
Active Object Segmentation: A New Modality for Egocentric Action Recognition. 4:1-4:7 - Xiaoyi Han, Yanfei Wu, Nan Pu, Zunlei Feng, Qifei Zhang, Yijun Bei, Lechao Cheng:
Fire and Smoke Detection with Burning Intensity Representation. 5:1-5:8 - Xianbin Hu, Wei Wu, Zhu Li:
RandommaskFormer: Light Weight Remote Sensing Scene Classification with Masked Transformer. 6:1-6:7 - Tailin Yang, Wei Wu, Zhu Li, Rui Zhou:
Multi-Frame Sparse Convolutional Learning for Point Cloud Color Denoising. 7:1-7:7 - Yiran Chen, Haoran Liu, Mingzhe Liu, Yanhua Liu, Ruili Wang, Peng Li:
Moving Object Tracking based on Kernel and Random-coupled Neural Network. 8:1-8:6 - Zihuang Wu, Xinyu Xiong, Ying Chen, Siying Li, Hua Chen:
MoE-Polyp: Shifting More Attention to Small Polyp Segmentation via Mixture-of-Experts. 9:1 - Zhiyuan Wang, Cong Yang, Yulu Zhang, Zeyd Boukhers, Wei Sui, Yi Ji, Chunping Liu:
Transition in Focus of Prediction Tasks for Skeleton Graph Component Detection with Transformer. 10:1-10:7 - Chenqiu Zhao, Guanfang Dong, Anup Basu:
Accelerating Inference of Networks in the Frequency Domain. 11:1-11:7 - Qi Yang, Kaifa Yang, Yuke Xing, Yiling Xu, Zhu Li:
A Benchmark for Gaussian Splatting Compression and Quality Assessment Study. 12:1-12:8 - Zichuan Huang, Yifan Li, Shuai Yang, Jiaying Liu:
CoolColor: Text-guided COherent OLd film COLORization. 13:1-13:7 - Takara Taniguchi, Ryosuke Furuta:
Learning Gaussian Data Augmentation in Feature Space for One-shot Object Detection in Manga. 14:1-14:8 - Shuwei Peng, Xu Zhang, Aiwen Jiang, Changhong Liu, Jihua Ye:
Low-Light Image Enhancement via FourierTMamba: A Hybrid Frequency-Spatial Approach. 15:1-15:8 - Xin Li, Feng Xu, Yao Tong, Fan Liu, Yiwei Fang, Xin Lyu, Jun Zhou:
FreqFormer: A Frequency Transformer for Semantic Segmentation of Remote Sensing Images. 16:1-16:8 - Ze Kun Wang, Zhan Jun Si:
Adaptive Both homo- and hetero-Feature Integration for Multimodal Emotion Recognition. 17:1 - Ruikun Zhang, Hao Yang, Yan Yang, Ying Fu, Liyuan Pan:
LMHaze: Intensity-aware Image Dehazing with a Large-scale Multi-intensity Real Haze Dataset. 18:1 - Guan-Yu Wu, Wei-Ta Chu:
Incremental Few-Shot Object Detection by Leveraging External Information from Large Multimodal Models. 19:1-19:7 - Yuang Liu, Dacheng Liao, Mengshi Qi, Liang Liu, Huadong Ma:
RoboFormer: A Robust Multi-Modal Transformer for 3D Object Detection in Autonomous Driving. 20:1-20:7 - Fanyi Wang, Peng Liu, Haotian Hu, Dan Meng, Jingwen Su, Jinjin Xu, Yanhao Zhang, Xiaoming Ren, Zhiwang Zhang:
LoopAnimate: Loopable Salient Object Animation. 21:1-21:8 - Yuankang Pan, Zhaoquan Yuan, Xiao Wu, Zechao Li, Changsheng Xu:
TMM-CLIP: Task-guided Multi-Modal Alignment for Rehearsal-Free Class Incremental Learning. 22:1-22:7 - Long H. Nguyen, Nhat Truong Pham, Mustaqeem Khan, Alice Othmani, Abdulmotaleb El-Saddik:
HuBERT-CLAP: Contrastive Learning-Based Multimodal Emotion Recognition using Self-Alignment Approach. 23:1-23:6 - Tianchen Zhou, Jiateng Liu, Yue Jin, Li Yao:
MicroMamba: State Space Model with Partitioned Window Scan for Micro-Expression Recognition. 24:1-24:7 - Yang Yi, Dasith de Silva Edirimuni, Ye Zhu, Shang Gao, Zhiyong Wang, Antonio Robles-Kelly, Xuequan Lu:
Point Cloud Normal Estimation via Representation Learning on Height Maps. 25:1-25:7 - Jieqiong Zhou, Guoqing Zhang, Yuhui Zheng, Fuguo Zhang:
Local Feature-Emphasizing Transformer for Cloth-Changing Person Re-identification. 26:1 - Chao Tan, Sheng Li, Yang Cao, Zhao Ren, Tanja Schultz:
Investigating Effective Speaker Property Privacy Protection in Federated Learning for Speech Emotion Recognition. 27:1-27:7 - Faisal Ahmed, Justin Rozeboom, Hanran Song, Chenqiu Zhao, Anup Basu:
LMoW: A Latent Random Variable Model for Unconditional Human Motion Generation. 28:1-28:8 - Yujia Xu, Deyu Pan, Ling Ding:
A method for detecting hands off the steering wheel. 29:1-29:6 - Luhao Zhu, Xiangwei Kong, Runsen Li, Guodong Guo:
Where You See Is What You Know: A Visual-Semantic Conceptual Explainer. 30:1-30:7 - Shogo Yonezawa, Yukinobu Taniguchi, Go Irie:
Bivariate Mixup for 2D Contact Point Localization with Piezoelectric Microphone Array. 31:1-31:7 - Jinheng Zhou, Wu Liu, Guang Yang, He Zhao, Feiniu Yuan:
Prompting Industrial Anomaly Segment with Large Vision-Language Models. 32:1 - Yi-Chen Li, Chih-Fan Hsu, Jian-Kai Wang, Chung-Chi Tsai, Cheng-Hsin Hsu:
MAFS: Modality-Aware Federated Semi-Supervised Learning with Selective Data Sharing Specified by Individual Clients. 33:1-33:8 - Yiran Song, Qianyu Zhou, Kun Hu, Lizhuang Ma, Xuequan Lu:
CFRL: Coarse-Fine Decoupled Representation Learning For Long-Tailed Recognition. 34:1-34:7 - Tao Jiang, Feng Hou, Yi Wang:
Multimodal Energy Prompting for Video Salient Object Detection. 35:1-35:8 - Yingkai He, Zhen Zhang, Jing Xiao:
A Multi-scale Framework towards Human-Machine Friendly Remote Sensing Image Coding. 36:1-36:6 - Mingwei Cao, Fengna Wang, Dengdi Sun, Haifeng Zhao:
BCS-NeRF: Bundle Cross-Sensing Neural Radiance Fields. 37:1-37:8 - Zhenzhen Hu, Xin Guan, Jia Li, Zijie Song, Richang Hong:
Dual-Stream Keyframe Enhancement for Video Question Answering. 38:1-38:7 - Junchao Ge, Huafeng Li, Yafei Zhang:
Robust discriminative and modal-consistent feature learning for fine-grained sketch-based image retrieval. 39:1-39:8 - Ying Hu, Chenyi Zhuang, Pan Gao:
DiffuseST: Unleashing the Capability of the Diffusion Model for Style Transfer. 40:1 - Mary Pilataki, Matthias Mauch, Simon Dixon:
Pitch-aware generative pretraining improves multi-pitch estimation with scarce data. 41:1-41:8 - Qianyu Li, Bingcai Chen, Jiaxing Tian, Ruolan Liu:
FA-UNext: A Feedback Attention-based MLP Network for Medical Image Segmentation. 42:1-42:7 - Jiaqi Chen, Yan Yang, Shizhuo Deng, Da Teng, Liyuan Pan:
SpikMamba: When SNN meets Mamba in Event-based Human Action Recognition. 43:1-43:8 - Tingting Yao, Yuan Gao, Zihao Feng, Qing Hu, Zhiyong Wang:
Underwater Image Enhancement via Domain Adaptive Transfer Learning and Hybrid Reinforcement Model. 44:1-44:7 - Yuhang Zhang, Cuixin Yang, Muxin Liao, Shishun Tian, Wenbin Zou, Chen Xu:
Layout Relationship Decoupling Framework for Multi-target Domain Adaptative Semantic Segmentation. 45:1-45:7 - Haoyuan Zhang, Xiangyu Zhu, Qu Tang, Zhaoxiang Zhang, Zhen Lei:
STODINE: Decompose video to Object-centric Spatial-Temporal Slots for physical reasoning. 46:1-46:7 - Ziqiang Liu, Gongwei Fang, Wentong Wang, Qiang Liu:
Multimodal Sign Language Knowledge Graph and Representation: Text Video KeyFrames and Motion Trajectories. 47:1-47:7 - JieYing Liu:
The Quantification of Emotional Expressions and Perceptions of Vocal Vibrato in Basic Emotion: Commercial Operatic Singing Recordings. 48:1-48:7 - Zhengjie Lu, Jinjia Peng, Huibing Wang, Qingxuan Shi, Bin Wang:
HSMnet: Hybrid Sampling and Matching Network for DETR-based Person Search. 49:1-49:7 - Trung Thanh Nguyen, Yasutomo Kawanishi, Takahiro Komamizu, Ichiro Ide:
Action Selection Learning for Multi-label Multi-view Action Recognition. 50:1 - Sangni Xu, Hao Xiong, Qiuxia Wu, Zhihui Wang, Shlomo Berkovsky, Zhiyong Wang:
Fast Online Adaptation of Visual SLAM via Variational Information Transfer and Preservation. 51:1-51:7 - Son Duy Dao, Hengcan Shi, Dinh Q. Phung, Jianfei Cai:
CA-OVS: Cluster and Adapt Mask Proposals for Open-Vocabulary Semantic Segmentation. 52:1-52:8 - Shuangping Chen, Huijin Wang, Shun Long, Jieyun Bai, Jianmei Jiang:
Ultrasound Video Segmentation of Pubic Symphysis and Fetal Head for Angle of Progression Measurement. 53:1-53:8 - Pu Li, Yibiao Zhao, Xiaobai Liu:
Policy-driven Auto-Augmentation with Distillment Rewards for Scene Text Recognition. 54:1-54:8 - Li Jiao, Lihong Cao, Tian Wang:
Prompt-based Continual Learning for Extending Pretrained CLIP Models' Knowledge. 55:1-55:8 - Xinhao Zhong, Siyu Jiao, Yao Zhao, Yunchao Wei:
Collaborative Feature-Logits Contrastive Learning for Open-Set Semi-Supervised Object Detection. 56:1-56:7 - Junjiang Liu, Dandan Sun, Hailun Xia, Jiangtao Bai, Xinyue Fan:
FeedMatch: Evolves for Semi-Supervised Multimedia Classification from Student Feedback. 57:1-57:6 - Huilin Chen:
MFNet: Mixed Feature Network for Enhancing Facial Emotion Recognition on the Small-Scale Dataset. 58:1-58:7 - Zhiyi Mo, Guangtong Zhang, Jian Nong, Bineng Zhong, Zhi Li:
Dual-stream Multi-modal Interactive Vision-language Tracking. 59:1-59:7 - Yangyuan Chen, Zhizhong Ma, Mingjing Wang, Mingzhe Liu:
Advancing Music Emotion Recognition: A Transformer Encoder-Based Approach. 60:1-60:5 - Jie Wang, Huilin Chen, Wandong Xue, Dongming Chen, Dongqi Wang:
A Multi-angle Text Recognition Algorithm. 61:1-61:7 - Zhiyuan Li, Dongnan Liu, Heng Wang, Chaoyi Zhang, Weidong Cai:
Exploring Annotation-free Image Captioning with Retrieval-augmented Pseudo Sentence Generation. 62:1-62:8 - Yuanyuan Shi, Yunan Li, Huizhou Chen, Siyu Liang, Qiguang Miao:
CISampler: Correlated Information Guided Frame Sampling for Gesture Recognition in Video. 63:1-63:8 - Meng Shen, Yake Wei, Jianxiong (Terry) Yin, Deepu Rajan, Di Hu, Simon See:
Enhancing Modality Representation and Alignment for Multimodal Cold-start Active Learning. 64:1-64:8 - Huajie Tan, Guoqing Xiang, Xiaodong Xie, Huizhu Jia:
Joint Frame-Level and Block-Level Rate-Perception Optimized Preprocessing for Video Coding. 65:1 - Yongjian Liu, Shunwei Zhang, Jinyu Xu, Jiachen Li, Yanchun Ma, Qing Xie:
Dlpp-Net: Degradation Location Prior Prediction Network for Image Restoration. 66:1-66:8 - Fengqi Li, Mengchao Guo, Renxuan Xiong, Donglei Yang, Yi Wang, Fengqiang Xu:
MSTMENet: Multi-Scale Spatio-Temporal Mapping and Evolution Network for Video Deraining. 67:1 - Yichen Ouyang, Jiayi Ye, Wenhao Chai, Dapeng Tao, Yibing Zhan, Gaoang Wang:
An Efficient Multi-prior Hybrid Approach for Consistent 3D Generation from Single Images. 68:1 - Minghui Wang, Zixu Wang, Hongbin Xu, Kun Hu, Zhiyong Wang, Wenxiong Kang:
T2QRM: Text-Driven Quadruped Robot Motion Generation. 69:1-69:7 - Yanming Chen, Ziyu Liu, Xiangjian He:
MambaVesselNet: A Hybrid CNN-Mamba Architecture for 3D Cerebrovascular Segmentation. 70:1-70:7 - Hongyu An, Xinfeng Zhang, Shijie Zhao, Li Zhang:
FATO: Frequency Attention Transformer for Omnidirectional Image Super-Resolution. 71:1-71:7 - Kai Zhang, Xia Yuan, Shuntong Chen, Di Hu, Chunxia Zhao:
Multi-Modality Semantic-Shared Cross-View Ground-to-Aerial Localization. 72:1-72:7 - Sicheng Liu, Lintao Wang, Xiaogang Zhu, Xuequan Lu, Zhiyong Wang, Kun Hu:
SITransformer: Shared Information-Guided Transformer for Extreme Multimodal Summarization. 73:1-73:7 - Song Huang, Ziming Zeng, Min Li, Jianping Wang:
Unified Multi-view Clustering based on Joint Multi-Structure Representation Learning. 74:1-74:7 - Cheng-Kang Tan, Wei-Ta Chu:
CS-HOI: Human Object Interaction Detection Enhanced by Common Sense. 75:1-75:7 - Zhiqian Dong, Sheng Yang, Peng Zhou:
Dual-Enhanced Disentangled Multi-View Clustering. 76:1-76:7 - Rim El Filali, Soufiane Jdaba, Ronghui Xie, Ran Shi, Tong Qiao, Pan Qiaodong, Ting Wu:
S2FB IoU: Improving Boundary-based Object-Centric Image Segmentation Quality Evaluation. 77:1-77:7 - Yu Wei, Yi Wang, Shijun Yan, Tianzhu Wang, Zhihan Wang, Weirong Sun, Yu Zhao, Xinwei Xue:
CSUNet: Contour-Sensitive Underwater Salient Object Detection. 78:1-78:7 - Yachao He, Li Liu, Huaxiang Zhang, Dongmei Liu, Hongzhen Li:
A Unified Contrastive Framework with Multi-Granularity Fusion for Text-to-Image Generation. 79:1-79:7 - Jingxuan Chen:
GGAvatar: Reconstructing Garment-Separated 3D Gaussian Splatting Avatars from Monocular Video. 80:1-80:7 - Zhaojun Guo, Junqiang Huang, Guobiao Li, Wanli Peng, Xinpeng Zhang, Zhenxing Qian, Sheng Li:
Emotion-Aware and Efficient Meme Sticker Dialogue Generation. 81:1 - Xiaocong Zhou, Fan Liu, Chuanyi Zhang, Feifan Li, Wenwen Cai, Jun Zhou:
Feature-weighted Multi-stage Bayesian Prototype for Few-shot Classification. 82:1-82:7 - Jiale Wang, Xueliang Liu, Yuling Su:
A Robust Few-shot Learning Framework via Dual-branch Adversarial Noise Pretraining. 83:1-83:8 - Zheng-Xian Keh, Lai-Kuan Wong, Yuen Peng Loh, Ke Gu, Weisi Lin:
KBY-Net: A Dual Learning Framework for Improving Object Detection in Rainy Weather Conditions. 84:1-84:7 - Muhammad Saad Shakeel, Kun Liu, Xiaochuan Liao, Wenxiong Kang:
MRGait: A Multi-range feature learning framework for Cross-View Gait Recognition. 85:1-85:7 - Shun Katada, Kazunori Komatani:
Personalized Sentiment Estimation Based on Recall and Resting Ratio of Frontal EEG. 86:1-86:7 - Wenyu Shao, Hongbo Liu:
TCFusion: A Three-branch Cross-domain Fusion Network for Infrared and Visible Images. 87:1 - Jianhua Zhao, Xue Jun Li, Peter Han Joo Chong:
HFS-HNeRV: High-Frequency Spectrum Hybrid Neural Representation for Videos. 88:1-88:7 - Zichen Zhu, Zhongze Tang, Amir Nassereldine, Jinjun Xiong, Sheng Wei:
OpenVideoWalls: an Open-Source System for Building Video Walls with Recycling Heterogeneous Displays. 89:1-89:7 - Yingkai He, Zhen Zhang, Liang Liao, Jing Xiao:
Latent Variables Coding for Perceptual Image Compression. 90:1-90:7 - Yuxin Yang, Pengfei Zhu, Mengshi Qi, Huadong Ma:
Following in the Footsteps: Predicting Human Trajectories Using Motion Pattern Memory. 91:1-91:7 - Liqun Shan, Rujun Zhang, Sai Venkatesh Chilukoti, Xingli Zhang, Insup Lee, Xiali Hei:
IdentityKD: Identity-wise Cross-modal Knowledge Distillation for Person Recognition via mmWave Radar Sensors. 92:1-92:7 - Xingang Wang, Mengyi Wang, Hai Cui, Yijia Zhang:
Efficient Low-Dimensional Representation Via Manifold Learning-Based Model for Multimodal Sentiment Analysis. 93:1-93:7 - Yixin Zhang, Yoko Yamakata, Keishi Tajima:
Adaptive Feature Inheritance and Thresholding for Ingredient Recognition in Multimedia Cooking Instructions. 94:1-94:7 - Fei Xiang, Hongbo Liu, Ruili Wang, Junjie Hou, Xingang Wang:
DCEPNet: Dual-Channel Emotional Perception Network for Speech Emotion Recognition. 95:1 - Dongming Chen, Mingshuo Nie, Zhengping Sun, Huilin Chen, Dongqi Wang:
An Information Cascade Prediction Algorithm Based on Time Series. 96:1 - Chengxi Lei, Satwinder Singh, Feng Hou, Ruili Wang:
Mix-fine-tune: An Alternate Fine-tuning Strategy for Domain Adaptation and Generalization of Low-resource ASR. 97:1-97:7 - Yuchong Sun, Bei Liu, Xu Chen, Ruihua Song, Jianlong Fu:
ViCo: Engaging Video Comment Generation with Human Preference Rewards. 98:1 - Zeyu Zhao, Nan Gao, Zhi Zeng, Guixuan Zhang, Jie Liu, Shuwu Zhang:
A Unified Editing Method for Co-Speech Gesture Generation via Diffusion Inversion. 99:1-99:7 - Raj Jaiswal, Avinash Anand, Rajiv Ratn Shah:
Advancing Multimodal LLMs: A Focus on Geometry Problem Solving Reasoning and Sequential Scoring. 100:1-100:7 - Haipeng Li, Guangcun Wei, Haochen Xu, Boyan Guo:
DocPointer: A parameter-efficient Pointer Network for Key Information Extraction. 101:1-101:7 - Yani Chen, Jiaxiang E, Kaiyu Nie, Xiaoxia Nie, Ruili Wang:
Development of a Chinese Synonym Library: Enhancing Clinical Terminology Standardization and Interoperability. 102:1-102:7 - Chen Wang, Feng Hou, Yi Wang, Ruili Wang:
Structured Bipartite Graph Ensemble Clustering. 103:1-103:7 - Yuan Gao, Feng Hou, Ruili Wang:
Incorporating Pre-ordering Representations for Low-resource Neural Machine Translation. 104:1-104:7 - Wenhao Gao, Zhenbo Song, Zhenyuan Zhang, Jianfeng Lu:
On the Robustness of Deep Face Inpainting: An Adversarial Perspective. 105:1-105:7 - Ying Qiao, Aoxuan Chen, Xiang Li, Jinfei Gao:
Variational Stochastic Multiple Auto-Encoder For Multimodal Recommendation. 106:1-106:7 - Jiazhen Zhang, Kun Li, Yanyan Wei, Fei Wang, Wei Qian, Jinxing Zhou, Dan Guo:
Repetitive Action Counting with Feature Interaction Enhancement and Adaptive Gate Fusion. 107:1-107:7 - Xiong Zeng, Min Jiang, Ronghua Huang:
Multi-stage Image Deraining based on Pre-trained Diffusion Model. 108:1-108:7 - Tomoya Sugihara, Shuntaro Masuda, Ling Xiao, Toshihiko Yamasaki:
Language-Guided Self-Supervised Video Summarization Using Text Semantic Matching Considering the Diversity of the Video. 109:1 - Qingjin Wei, Xiaozhuo Li, Dinglu Liu, Zhiwu Liao:
MFTAnet: Two-step Aggregation Net of Multiscale Features for Pneumoconiosis Screening. 110:1-110:7 - Huijie Zhang, Xiaobai Liu:
Focal Diffusion Process for Object-Aware 3D LiDAR Generation. 111:1-111:7 - Longyun Dong, Yuanrong Xu, Jianping Zhong, Zhaobo Qi, Weigang Zhang:
Improving Sequential DeepFake Detection with Local information enhancement. 112:1 - Pingyi Huo, Ajay Narayanan Sridhar, Md Fahim Faysal Khan, Kiwan Maeng, Vijaykrishnan Narayanan:
QoS-Diff: Adaptive Auto-tuning Framework for Low-latency Diffusion Model Inference. 113:1-113:7 - Cui Xu, Laiyun Qing:
Point-Supervised Temporal Action Detection with Label Supplementation Based on Transformer. 114:1-114:7 - Xu Gu, Xihua Wang, Chuhao Jin, Ruihua Song:
ScaMo: Towards Text to Video Storyboard Generation Using Scale and Movement of Shots. 115:1-115:8 - Guohuan Gao, Gang Zhang, Xiangyang Xu:
ADP3D: Adaptive Point Selection for Efficient Multi-frame 3D Object Detection. 116:1-116:7 - Shanshan Yao, Tian Li:
Multi-domain Acoustic Feature Fusion for Speaker Recognition. 117:1-117:6 - Zhitong Zhu, Jing Yu, Keke Gai, Jiamin Zhuang, Gaopeng Gou, Gang Xiong:
Flexible Semantic Watermarking for Robust Diffusion Model Detection and Tracing. 118:1-118:7 - Shan Wan, Wu Liu, Yijun Liu, Feiniu Yuan, Chunli Meng:
Watermarking Vision-Language Models. 119:1 - Jinwei Li, Yongkang Cheng, Yonghe Zhang, Pengcheng Wang:
Hierarchical Part-Attention Networks for 3D Human Reconstruction. 120:1
Short Papers and Demo Papers
- Hairui Yang, Ning Wang, Zhihui Wang, Lei Wang:
Sketch-based 3D Model Retrieval with Cross-Modal Representation. 121:1-121:5 - Daidou Guo, Chuan Qin:
PCMark-NAS: Lightweight Print-Camera Resilient Watermarking Networks via Neural Architecture Search. 122:1-122:5 - Daidou Guo, Ching-Chun Chang, Cheng SenMao, Chuan Qin:
Highly Fault-Tolerant Discrete Lattice Information Coding Method for Screen-Shooting Scenarios. 123:1-123:5 - Yu Song, Xiaohui Yang, Rongping Huang, Haifeng Bai, Lili Yang:
CSCCap: Plugging Sparse Coding in Zero-Shot Image Captioning. 124:1-124:5 - Zuyi Pei, Baoli Sun, Zhihui Wang, Haojie Li:
Fine-grained Video Semantic Distillation for Video-Text Retrieval. 125:1-125:5 - Zihao Tang, Xinyi Wang, Mariano Cabezas, Arkiev D'Souza, Michael Barnett, Fernando Calamante, Weidong Cai, Chenyu Wang:
Fibre Population-guided Pre-training for 3D Spatial Super-Resolution on Multimodal Brain Diffusion MR Imaging. 126:1 - Mingzhe Zhang, Laura J. Ferris, Lin Yue, Miao Xu:
Emotionally Guided Symbolic Music Generation Using Diffusion Models: The AGE-DM Approach. 127:1-127:5 - Wei-Lun Huang, Shao-Hung Wu, Hung-Chang Huang, Min-Chun Hu, Tse-Yu Pan:
Description-Driven Audiovisual Embedding Space Learning for Enhanced Movie Understanding. 128:1-128:5 - Chenxi Niu, Ziyu Liu, Xiangjian He:
SS-FS CSA: Self-Supervised and Fully Supervised Integration for 3D Cerebrovascular Segmentation. 129:1-129:5 - Hao Zhang, Xingning Dong, Jinfei Gao, Liang Hao, Pei Shen, Tian Gan:
MBC-ATA: Maximum Binary Classification and Anchor-based Triplet Augmentation for Unbiased Scene Graph Generation. 130:1-130:5 - Tianqi Wei, Zhi Chen, Xin Yu:
Snap and Diagnose: An Advanced Multimodal Retrieval System for Identifying Plant Diseases in the Wild. 131:1-131:3