28th ACM Multimedia 2020: Virtual Event (Seattle, WA), USA
- Chang Wen Chen, Rita Cucchiara, Xian-Sheng Hua, Guo-Jun Qi, Elisa Ricci, Zhengyou Zhang, Roger Zimmermann:
MM '20: The 28th ACM International Conference on Multimedia, Virtual Event / Seattle, WA, USA, October 12-16, 2020. ACM 2020, ISBN 978-1-4503-7988-5
Oral Session A1: Deep Learning for Multimedia
- Jin Wang, Chen Wang, Qingming Huang, Yunhui Shi, Jian-Feng Cai, Qing Zhu, Baocai Yin:
Image Inpainting Based on Multi-frequency Probabilistic Inference Model. 1-9
- Jianzhe Lin, Lichao Mou, Tianze Yu, Xiaoxiang Zhu, Z. Jane Wang:
Dual Adversarial Network for Unsupervised Ground/Satellite-to-Aerial Scene Adaptation. 10-18
- Yadan Luo, Zi Huang, Zijian Wang, Zheng Zhang, Mahsa Baktashmotlagh:
Adversarial Bipartite Graph Learning for Video Domain Adaptation. 19-27
- Peng Wang, Dongyang Liu, Hui Li, Qi Wu:
Give Me Something to Eat: Referring Expression Comprehension with Commonsense Knowledge. 28-36
- Weijiang Yu, Jian Liang, Lu Li, Nong Xiao:
Single Image De-noising via Staged Memory Network. 37-45
- Xuanchi Ren, Haoran Li, Zijian Huang, Qifeng Chen:
Self-supervised Dance Video Synthesis Conditioned on Music. 46-54
Oral Session B1: Deep Learning for Multimedia
- Fanfan Ye, Shiliang Pu, Qiaoyong Zhong, Chao Li, Di Xie, Huiming Tang:
Dynamic GCN: Context-enriched Topology Learning for Skeleton-based Action Recognition. 55-63
- Peike Li, Yunchao Wei, Yi Yang:
Meta Parsing Networks: Towards Generalized Few-shot Scene Parsing with Adaptive Metric Learning. 64-72
- Wei Li, Zhenting Wang, Xiao Wu, Ji Zhang, Qiang Peng, Hongliang Li:
CODAN: Counting-driven Attention Network for Vehicle Detection in Congested Scenes. 73-82
- Jingkang Yang, Weirong Chen, Litong Feng, Xiaopeng Yan, Huabin Zheng, Wayne Zhang:
Webly Supervised Image Classification with Metadata: Automatic Noisy Label Correction via Visual-Semantic Graph. 83-91
- Zeren Sun, Xian-Sheng Hua, Yazhou Yao, Xiu-Shen Wei, Guosheng Hu, Jian Zhang:
CRSSC: Salvage Reusable Samples from Noisy Data for Robust Learning. 92-101
- Jen-Chun Lin, Wen-Li Wei, Yen-Yu Lin, Tyng-Luh Liu, Hong-Yuan Mark Liao:
Learning From Music to Visual Storytelling of Shots: A Deep Interactive Learning Mechanism. 102-110
Oral Session C1: Deep Learning for Multimedia
- Fangfang Wang, Yifeng Chen, Fei Wu, Xi Li:
TextRay: Contour-based Geometric Modeling for Arbitrary-shaped Scene Text Detection. 111-119
- Peng Lu, Jiahui Liu, Xujun Peng, Xiaojie Wang:
Weakly Supervised Real-time Image Cropping based on Aesthetic Distributions. 120-128
- Yuting Liu, Zheng Wang, Miaojing Shi, Shin'ichi Satoh, Qijun Zhao, Hongyu Yang:
Towards Unsupervised Crowd Counting via Regression-Detection Bi-knowledge Transfer. 129-137
- Yanlu Wei, Renshuai Tao, Zhangjie Wu, Yuqing Ma, Libo Zhang, Xianglong Liu:
Occluded Prohibited Items Detection: An X-ray Security Inspection Benchmark and De-occlusion Attention Module. 138-146
- Hsuan-Kai Kao, Li Su:
Temporally Guided Music-to-Body-Movement Generation. 147-155
- Yixiong Zou, Shanghang Zhang, Ke Chen, Yonghong Tian, Yaowei Wang, José M. F. Moura:
Compositional Few-Shot Recognition with Primitive Discovery and Enhancing. 156-164
Oral Session D1: Deep Learning for Multimedia
- Chen Gao, Si Liu, Defa Zhu, Quan Liu, Jie Cao, Haoqian He, Ran He, Shuicheng Yan:
InteractGAN: Learning to Generate Human-Object Interaction. 165-173
- Shijie Wang, Zhihui Wang, Haojie Li, Wanli Ouyang:
Category-specific Semantic Coherency Learning for Fine-grained Image Recognition. 174-183
- Che Sun, Yunde Jia, Yao Hu, Yuwei Wu:
Scene-Aware Context Reasoning for Unsupervised Abnormal Event Detection in Videos. 184-192
- Jing Jin, Junhui Hou, Jie Chen, Sam Kwong, Jingyi Yu:
Light Field Super-resolution via Attention-Guided Fusion of Hybrid Lenses. 193-201
- Wei-Cheng Lai, Zi-Xiang Xia, Hao-Siang Lin, Lien-Feng Hsu, Hong-Han Shuai, I-Hong Jhuo, Wen-Huang Cheng:
Trajectory Prediction in Heterogeneous Environment via Attended Ecology Embedding. 202-210
- Liang Sun, Xiang Guan, Yang Yang, Lei Zhang:
Text-Embedded Bilinear Model for Fine-Grained Visual Recognition. 211-219
Oral Session E1: Deep Learning for Multimedia
- Zhiheng Ma, Xing Wei, Xiaopeng Hong, Yihong Gong:
Learning Scales from Points: A Scale-aware Probabilistic Model for Crowd Counting. 220-228
- Bi Li, Chengquan Zhang, Zhibin Hong, Xu Tang, Jingtuo Liu, Junyu Han, Errui Ding, Wenyu Liu:
Learning Global Structure Consistency for Robust Object Tracking. 229-237
- Xinke Li, Chongshou Li, Zekun Tong, Andrew Lim, Junsong Yuan, Yuwei Wu, Jing Tang, Raymond Huang:
Campus3D: A Photogrammetry Point Cloud Benchmark for Hierarchical Understanding of Outdoor Scene. 238-246
- Jun-Hyuk Kim, Soobeom Jang, Jun-Ho Choi, Jong-Seok Lee:
Instability of Successive Deep Image Compression. 247-255
- Akash Gupta, Abhishek Aich, Amit K. Roy-Chowdhury:
ALANET: Adaptive Latent Attention Network for Joint Video Deblurring and Interpolation. 256-264
- Shaotian Yan, Chen Shen, Zhongming Jin, Jianqiang Huang, Rongxin Jiang, Yaowu Chen, Xian-Sheng Hua:
PCPL: Predicate-Correlation Perception Learning for Unbiased Scene Graph Generation. 265-273
Oral Session F1: Deep Learning for Multimedia
- Peixi Peng, Yonghong Tian, Yangru Huang, Xiangqian Wang, Huilong An:
Discriminative Spatial Feature Learning for Person Re-Identification. 274-283
- Xiangping Wu, Qingcai Chen, Wei Li, Yulun Xiao, Baotian Hu:
AdaHGNN: Adaptive Hypergraph Neural Networks for Multi-Label Image Classification. 284-293
- Dawei Zhang, Zhonglong Zheng, Minglu Li, Xiaowei He, Tianxiang Wang, Liyuan Chen, Riheng Jia, Feilong Lin:
Reinforced Similarity Learning: Siamese Relation Networks for Robust Object Tracking. 294-303
- Ruoxi Deng, Shengjun Liu:
Deep Structural Contour Detection. 304-312
- Saurabh Sahu, Palash Goyal, Shalini Ghosh, Chul Lee:
Cross-modal Non-linear Guided Attention and Temporal Coherence in Multi-modal Deep Video Models. 313-321
- Zhenhuan Liu, Jincan Deng, Liang Li, Shaofei Cai, Qianqian Xu, Shuhui Wang, Qingming Huang:
IR-GAN: Image Manipulation with Linguistic Instruction by Increment Reasoning. 322-330
Oral Session G1: Deep Learning for Multimedia
- Xin Wang, Wei Huang, Qi Liu, Yu Yin, Zhenya Huang, Le Wu, Jianhui Ma, Xue Wang:
Fine-Grained Similarity Measurement between Educational Videos and Exercises. 331-339
- Mengli Cheng, Minghui Qiu, Xing Shi, Jun Huang, Wei Lin:
One-shot Text Field labeling using Attention and Belief Propagation for Structure Information Extraction. 340-348
- Yunzhuo Liu, Bo Jiang, Tian Guo, Ramesh K. Sitaraman, Don Towsley, Xinbing Wang:
Grad: Learning for Overhead-aware Adaptive Video Streaming with Scalable Video Coding. 349-357
- Yat Hong Lam, Alireza Zare, Francesco Cricri, Jani Lainema, Miska M. Hannuksela:
Efficient Adaptation of Neural Network Filter for Video Compression. 358-366
- Naoki Kimura, Keisuke Shiro, Yota Takakura, Hiromi Nakamura, Jun Rekimoto:
SonoSpace: Visual Feedback of Timbre with Unsupervised Learning. 367-374
- Bo Pang, Deming Zhai, Junjun Jiang, Xianming Liu:
Single Image Deraining via Scale-space Invariant Attention Neural Network. 375-383
Oral Session H1: Emerging Multimedia Applications
- Kaihao Zhang, Wenhan Luo, Björn Stenger, Wenqi Ren, Lin Ma, Hongdong Li:
Every Moment Matters: Detail-Aware Networks to Bring a Blurry Image Alive. 384-392
- Weiqing Min, Linhu Liu, Zhiling Wang, Zhengdong Luo, Xiaoming Wei, Xiaolin Wei, Shuqiang Jiang:
ISIA Food-500: A Dataset for Large-Scale Food Recognition via Stacked Global-Local Attention Network. 393-401
- Tianyu Zhang, Weiqing Min, Ying Zhu, Yong Rui, Shuqiang Jiang:
An Egocentric Action Anticipation Framework via Fusing Intuition and Analysis. 402-410
- Diangang Li, Jianquan Liu, Shoji Nishimura, Yuka Hayashi, Jun Suzuki, Yihong Gong:
Multi-Person Action Recognition in Microwave Sensors. 411-420
- Qi Jia, Xin Fan, Meiyu Yu, Yuqing Liu, Dingrong Wang, Longin Jan Latecki:
Coupling Deep Textural and Shape Features for Sketch Recognition. 421-429
- Huaizheng Zhang, Yong Luo, Qiming Ai, Yonggang Wen, Han Hu:
Look, Read and Feel: Benchmarking Ads Understanding with Multimodal Multitask Learning. 430-438
Oral Session A2: Emerging Multimedia Applications
- Komal Chugh, Parul Gupta, Abhinav Dhall, Ramanathan Subramanian:
Not made for each other- Audio-Visual Dissonance-based Deepfake Detection and Localization. 439-447
- Kai Cheng, Xin Liu, Yiu-ming Cheung, Rui Wang, Xing Xu, Bineng Zhong:
Hearing like Seeing: Improving Voice-Face Interactions and Associations via Adversarial Deep Semantic Matching Network. 448-455
- Ramit Sawhney, Puneet Mathur, Ayush Mangal, Piyush Khanna, Rajiv Ratn Shah, Roger Zimmermann:
Multimodal Multi-Task Financial Risk Forecasting. 456-465
- Jiahang Wang, Tong Sha, Wei Zhang, Zhoujun Li, Tao Mei:
Down to the Last Detail: Virtual Try-on with Fine-grained Details. 466-474
- Yifeng Zhou, Xing Xu, Fumin Shen, Lianli Gao, Huimin Lu, Heng Tao Shen:
Temporal Denoising Mask Synthesis Network for Learning Blind Video Temporal Consistency. 475-483
- K. R. Prajwal, Rudrabha Mukhopadhyay, Vinay P. Namboodiri, C. V. Jawahar:
A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild. 484-492
Oral Session B2: Emotional and Social Signals in Multimedia
- Guangyao Shen, Xin Wang, Xuguang Duan, Hongzhi Li, Wenwu Zhu:
MEmoR: A Dataset for Multimodal Emotion Reasoning in Videos. 493-502
- Dong Zhang, Weisheng Zhang, Shoushan Li, Qiaoming Zhu, Guodong Zhou:
Modeling both Intra- and Inter-modal Influence for Real-Time Emotion Detection in Conversations. 503-511
- Xincheng Ju, Dong Zhang, Junhui Li, Guodong Zhou:
Transformer-based Label Set Generation for Multi-modal Multi-label Emotion Detection. 512-520
- Kaicheng Yang, Hua Xu, Kai Gao:
CM-BERT: Cross-Modal BERT for Text-Audio Sentiment Analysis. 521-528
- Xingkun Zuo, Jiyi Li, Qili Zhou, Jianjun Li, Xiaoyang Mao:
AffectI: A Game for Diverse, Reliable, and Efficient Affective Image Annotation. 529-537
- Shi Yin, Shangfei Wang, Xiaoping Chen, Enhong Chen, Cong Liang:
Attentive One-Dimensional Heatmap Regression for Facial Landmark Detection and Tracking. 538-546
Oral Session C2: Media Interpretation
- Xiaobin Liu, Shiliang Zhang:
Domain Adaptive Person Re-Identification via Coupling Optimization. 547-555
- Peipei Li, Yinglu Liu, Hailin Shi, Xiang Wu, Yibo Hu, Ran He, Zhenan Sun:
Dual-Structure Disentangling Variational Generation for Data-Limited Face Parsing. 556-564
- Chunhui Zhang, Shiming Ge, Kangkai Zhang, Dan Zeng:
Accurate UAV Tracking with Distance-Injected Overlap Maximization. 565-573
- Hongru Liang, Wenqiang Lei, Paul Yaozhu Chan, Zhenglu Yang, Maosong Sun, Tat-Seng Chua:
PiRhDy: Learning Pitch-, Rhythm-, and Dynamics-aware Embeddings for Symbolic Music. 574-582
- Guang Yu, Siqi Wang, Zhiping Cai, En Zhu, Chuanfu Xu, Jianping Yin, Marius Kloft:
Cloze Test Helps: Effective Video Anomaly Detection via Learning to Complete Video Events. 583-591
- Qian Bao, Wu Liu, Jun Hong, Lingyu Duan, Tao Mei:
Pose-native Network Architecture Search for Multi-person Human Pose Estimation. 592-600
Oral Session D2: Media Interpretation
- Xiruo Shi, Liutong Xu, Pengfei Wang, Yuanyuan Gao, Haifang Jian, Wu Liu:
Beyond the Attention: Distinguish the Discriminative and Confusable Features For Fine-grained Image Classification. 601-609
- Hao Tang, Zechao Li, Zhimao Peng, Jinhui Tang:
BlockMix: Meta Regularization and Self-Calibrated Inference for Metric-Based Meta-Learning. 610-618
- Dechao Meng, Liang Li, Shuhui Wang, Xingyu Gao, Zheng-Jun Zha, Qingming Huang:
Fine-grained Feature Alignment with Part Perspective Transformation for Vehicle ReID. 619-627
- Yanbin Hao, Hao Zhang, Chong-Wah Ngo, Qiang Liu, Xiaojun Hu:
Compact Bilinear Augmented Query Structured Attention for Sport Highlights Classification. 628-636
- Jiacheng Li, Zhiwei Xiong, Dong Liu, Xuejin Chen, Zheng-Jun Zha:
Semantic Image Analogy with a Conditional Single-Image GAN. 637-645
- Yangchun Zhu, Zheng-Jun Zha, Tianzhu Zhang, Jiawei Liu, Jiebo Luo:
A Structured Graph Attention Network for Vehicle Re-Identification. 646-654
Oral Session E2: Media Interpretation
- Baoyu Fan, Li Wang, Runze Zhang, Zhenhua Guo, Yaqian Zhao, Rengang Li, Weifeng Gong:
Contextual Multi-Scale Feature Learning for Person Re-Identification. 655-663
- Zeyu Xiao, Zhiwei Xiong, Xueyang Fu, Dong Liu, Zheng-Jun Zha:
Space-Time Video Super-Resolution Using Temporal Profiles. 664-672
- Boqiang Xu, Lingxiao He, Xingyu Liao, Wu Liu, Zhenan Sun, Tao Mei:
Black Re-ID: A Head-shoulder Descriptor for the Challenging Problem of Person Re-Identification. 673-681
- Haoran Lv, Qin Yang, Chenglin Li, Wenrui Dai, Junni Zou, Hongkai Xiong:
SalGCN: Saliency Prediction for 360-Degree Images Based on Spherical Graph Convolutional Networks. 682-690
- Sai Praneeth Reddy Sunkesula, Rishabh Dabral, Ganesh Ramakrishnan:
LIGHTEN: Learning Interactions with Graph and Hierarchical TEmporal Networks for HOI in videos. 691-699
- Zhengqing Fang, Kun Kuang, Yuxiao Lin, Fei Wu, Yu-Feng Yao:
Concept-based Explanation for Fine-grained Images and Its Application in Infectious Keratitis Classification. 700-708
Oral Session F2: Mobile Multimedia & Multimedia HCI and Quality of Experience
- Yuanqiang Cai, Dawei Du, Libo Zhang, Longyin Wen, Weiqiang Wang, Yanjun Wu, Siwei Lyu:
Guided Attention Network for Object Detection and Counting on Drones. 709-717
- Jingchen Sun, Jiming Chen, Tao Chen, Jiayuan Fan, Shibo He:
PIDNet: An Efficient Network for Dynamic Pedestrian Intrusion Detection. 718-726
- Xing Cai, Lanqing Zhang, Chengyuan Li, Ge Li, Thomas H. Li:
VONAS: Network Design in Visual Odometry using Neural Architecture Search. 727-735
- Wenbo Zheng, Lan Yan, Fei-Yue Wang, Chao Gou:
Learning from the Past: Meta-Continual Learning with Knowledge Embedding for Jointly Sketch, Cartoon, and Caricature Face Recognition. 736-743
- Zijie Ye, Haozhe Wu, Jia Jia, Yaohua Bu, Wei Chen, Fanbo Meng, Yanfeng Wang:
ChoreoNet: Towards Music to Dance Synthesis with Choreographic Action Unit. 744-752
- Qiushi Li, Wenwu Zhu, Chao Wu, Xinglin Pan, Fan Yang, Yuezhi Zhou, Yaoxue Zhang:
InvisibleFL: Federated Learning over Non-Informative Intermediate Updates against Multimedia Privacy Leakages. 753-762
- Shu Zhao, Dayan Wu, Wanqian Zhang, Yu Zhou, Bo Li, Weiping Wang:
Asymmetric Deep Hashing for Efficient Hash Code Compression. 763-771
Oral Session G2: Multimedia HCI and Quality of Experience
- Yuen-Jen Lin, Hsuan-Kai Kao, Yih-Chih Tseng, Ming Tsai, Li Su:
A Human-Computer Duet System for Music Performance. 772-780
- Yujia Wang, Sifan Hou, Bing Ning, Wei Liang:
Photo Stand-Out: Photography with Virtual Character. 781-788
- Dingquan Li, Tingting Jiang, Ming Jiang:
Norm-in-Norm Loss with Faster Convergence and Better Performance for Image Quality Assessment. 789-797
- Munan Xu, Jia-Xing Zhong, Yurui Ren, Shan Liu, Ge Li:
Context-aware Attention Network for Predicting Image Aesthetic Subjectivity. 798-806
- Nikolas Wehner, Michael Seufert, Sebastian Egger-Lampl, Bruno Gardlo, Pedro Casas, Raimund Schatz:
Scoring High: Analysis and Prediction of Viewer Behavior and Engagement in the Context of 2018 FIFA WC Live Streaming. 807-815
- Jingwen Hou, Sheng Yang, Weisi Lin:
Object-level Attention for Aesthetic Rating Distribution Prediction. 816-824
- Zhaohui Zhang, Haichao Zhu, Qian Zhang:
ARSketch: Sketch-Based User Interface for Augmented Reality Glasses. 825-833
Oral Session H2: Multimedia HCI and Quality of Experience & Multimedia Search and Recommendation
- Pengfei Chen, Leida Li, Lei Ma, Jinjian Wu, Guangming Shi:
RIRNet: Recurrent-In-Recurrent Network for Video Quality Assessment. 834-842
- Yiru Wang, Shen Huang, Gongfu Li, Qiang Deng, Dongliang Liao, Pengda Si, Yujiu Yang, Jin Xu:
Cognitive Representation Learning of Self-Media Online Article Quality. 843-851
- Jakub Nawala, Lucjan Janowski, Bogdan Cmiel, Krzysztof Rusek:
Describing Subjective Experiment Consistency by p-Value P-P Plot. 852-861
- Leonardo Galteri, Marco Bertini, Lorenzo Seidenari, Tiberio Uricchio, Alberto Del Bimbo:
Increasing Video Perceptual Quality with GANs and Semantic Coding. 862-870
- Yongxin Wang, Xin Luo, Xin-Shun Xu:
Label Embedding Online Hashing for Cross-Modal Retrieval. 871-879
- Zhaopeng Li, Qianqian Xu, Yangbangyan Jiang, Xiaochun Cao, Qingming Huang:
Quaternion-Based Knowledge Graph Network for Recommendation. 880-888
Oral Session A3: Multimedia Search and Recommendation
- Yongguo Ling, Zhun Zhong, Zhiming Luo, Paolo Rota, Shaozi Li, Nicu Sebe:
Class-Aware Modality Mix and Center-Guided Metric Learning for Visible-Thermal Person Re-Identification. 889-897
- Da Cao, Yawen Zeng, Xiaochi Wei, Liqiang Nie, Richang Hong, Zheng Qin:
Adversarial Video Moment Retrieval by Jointly Modeling Ranking and Localization. 898-906
- Xinchen Liu, Wu Liu, Jinkai Zheng, Chenggang Yan, Tao Mei:
Beyond the Parts: Learning Multi-view Cross-part Correlation for Vehicle Re-identification. 907-915
- Lu Jin, Zechao Li, Yonghua Pan, Jinhui Tang:
Weakly-Supervised Image Hashing through Masked Visual-Semantic Graph-based Reasoning. 916-924
- Heyu Zhou, Weizhi Nie, Dan Song, Nian Hu, Xuanya Li, An-An Liu:
Semantic Consistency Guided Instance Feature Alignment for 2D Image-Based 3D Shape Retrieval. 925-933
- Niluthpol Chowdhury Mithun, Karan Sikka, Han-Pang Chiu, Supun Samarasekera, Rakesh Kumar:
RGB2LIDAR: Towards Solving Large-Scale Cross-Modal Visual Localization. 934-954
Oral Session B3: Multimedia Systems and Middleware & Media Transport and Delivery
- Weiming Zhuang, Yonggang Wen, Xuesen Zhang, Xin Gan, Daiying Yin, Dongzhan Zhou, Shuai Zhang, Shuai Yi:
Performance Optimization of Federated Person Re-identification via Benchmark Analysis. 955-963
- Hung-Min Hsu, Yizhou Wang, Jenq-Neng Hwang:
Traffic-Aware Multi-Camera Tracking of Vehicles Based on ReID and Camera Link Model. 964-972
- Jie Wu, Tianshui Chen, Lishan Huang, Hefeng Wu, Guanbin Li, Ling Tian, Liang Lin:
Active Object Search. 973-981
- Jun Yi, Md Reazul Islam, Shivang Aggarwal, Dimitrios Koutsonikolas, Y. Charlie Hu, Zhisheng Yan:
An Analysis of Delay in Live 360° Video Streaming Systems. 982-990
- Yuhang Li, Xuejin Chen, Binxin Yang, Zihan Chen, Zhihua Cheng, Zheng-Jun Zha:
DeepFacePencil: Creating Face Images from Freehand Sketches. 991-999
- Peilin Chen, Wenhan Yang, Long Sun, Shiqi Wang:
When Bitstream Prior Meets Deep Prior: Compressed Video Super-resolution with Learning from Decoding. 1000-1008
- Gang Yan, Jian Li:
RL-Bélády: A Unified Learning Framework for Content Caching. 1009-1017
Oral Session C3: Multimodal Analysis and Description & Summarization, Analytics, and Storytelling
- Zhizhong Han, Chao Chen, Yu-Shen Liu, Matthias Zwicker:
ShapeCaptioner: Generative Caption Network for 3D Shapes by Learning a Mapping from Parts Detected in Multiple Views to Sentences. 1018-1027
- Xing Wei, Diangang Li, Xiaopeng Hong, Wei Ke, Yihong Gong:
Co-Attentive Lifting for Infrared-Visible Person Re-Identification. 1028-1037
- Zhiwei Wu, Changmeng Zheng, Yi Cai, Junying Chen, Ho-fung Leung, Qing Li:
Multimodal Representation with Embedded Visual Guiding Objects for Named Entity Recognition in Social Media Posts. 1038-1046
- Leigang Qu, Meng Liu, Da Cao, Liqiang Nie, Qi Tian:
Context-Aware Multi-View Summarization Network for Image-Text Matching. 1047-1055
- Evlampios Apostolidis, Eleni Adamantidou, Alexandros I. Metsai, Vasileios Mezaris, Ioannis Patras:
Performance over Random: A Robust Evaluation Protocol for Video Summarization Methods. 1056-1064
- Pravin Nagar, Mansi Khemka, Chetan Arora:
Concept Drift Detection for Multivariate Data Streams and Temporal Segmentation of Daylong Egocentric Videos. 1065-1074
- Shuyue Lan, Zhilu Wang, Amit K. Roy-Chowdhury, Ermin Wei, Qi Zhu:
Distributed Multi-agent Video Fast-forwarding. 1075-1084
Oral Session D3: Multimodal Fusion and Embedding
- Yitian Yuan, Lin Ma, Jingwen Wang, Wenwu Zhu:
Controllable Video Captioning with an Exemplar Sentence. 1085-1093
- Qing Lin, Bo Yan, Jichun Li, Weimin Tan:
MMFL: Multimodal Fusion Learning for Text-Guided Image Inpainting. 1094-1102
- Yiheng Liu, Wengang Zhou, Mao Xi, Sanjing Shen, Houqiang Li:
Vision Meets Wireless Positioning: Effective Person Re-identification with Recurrent Context Propagation. 1103-1111
- Beichen Zhang, Liang Li, Li Su, Shuhui Wang, Jincan Deng, Zheng-Jun Zha, Qingming Huang:
Structural Semantic Adversarial Active Learning for Image Captioning. 1112-1121
- Devamanyu Hazarika, Roger Zimmermann, Soujanya Poria:
MISA: Modality-Invariant and -Specific Representations for Multimodal Sentiment Analysis. 1122-1131
- Liangming Pan, Jingjing Chen, Jianlong Wu, Shaoteng Liu, Chong-Wah Ngo, Min-Yen Kan, Yu-Gang Jiang, Tat-Seng Chua:
Multi-modal Cooking Workflow Construction for Food Recipes. 1132-1141
- Yuqian Fu, Li Zhang, Junke Wang, Yanwei Fu, Yu-Gang Jiang:
Depth Guided Adaptive Meta-Fusion Network for Few-shot Video Recognition. 1142-1151
- David Semedo, João Magalhães:
Adaptive Temporal Triplet-loss for Cross-modal Embedding Learning. 1152-1161
Oral Session E3: Music, Speech and Audio Processing in Multimedia & Social Media
- Yujia Wang, Wei Liang, Wanwan Li, Dingzeyu Li, Lap-Fai Yu:
Scene-Aware Background Music Synthesis. 1162-1170
- Xutong Jin, Sheng Li, Tianshu Qu, Dinesh Manocha, Guoping Wang:
Deep-Modal: Real-Time Impact Sound Synthesis for Arbitrary Shapes. 1171-1179
- Yu-Siang Huang, Yi-Hsuan Yang:
Pop Music Transformer: Beat-based Modeling and Generation of Expressive Pop Piano Compositions. 1180-1188
- Zhejing Hu, Yan Liu, Gong Chen, Sheng-hua Zhong, Aiwei Zhang:
Make Your Favorite Music Curative: Music Style Transfer for Anxiety Reduction. 1189-1197
- Yi Ren, Jinzheng He, Xu Tan, Tao Qin, Zhou Zhao, Tie-Yan Liu:
PopMAG: Pop Music Accompaniment Generation. 1198-1206
- Run Wang, Felix Juefei-Xu, Yihao Huang, Qing Guo, Xiaofei Xie, Lei Ma, Yang Liu:
DeepSonar: Towards Effective and Robust Detection of AI-Synthesized Fake Voices. 1207-1216
- Yihao Huang, Felix Juefei-Xu, Run Wang, Qing Guo, Lei Ma, Xiaofei Xie, Jianwen Li, Weikai Miao, Yang Liu, Geguang Pu:
FakePolisher: Making DeepFakes More Detection-Evasive by Shallow Reconstruction. 1217-1226
Oral Session F3: Vision and Language
- Guohao Li, Xin Wang, Wenwu Zhu:
Boosting Visual Question Answering with Context-aware Knowledge Aggregation. 1227-1235
- Jun He, Richang Hong, Xueliang Liu, Mingliang Xu, Zheng-Jun Zha, Meng Wang:
Memory-Augmented Relation Network for Few-Shot Learning. 1236-1244
- Yiyi Zhou, Rongrong Ji, Xiaoshuai Sun, Gen Luo, Xiaopeng Hong, Jinsong Su, Xinghao Ding, Ling Shao:
K-armed Bandit based Multi-Modal Network Architecture Search for Visual Question Answering. 1245-1254
- Yuan Xie, Tianshui Chen, Tao Pu, Hefeng Wu, Liang Lin:
Adversarial Graph Representation Adaptation for Cross-Domain Facial Expression Recognition. 1255-1264
- Xiaoze Jiang, Siyi Du, Zengchang Qin, Yajing Sun, Jing Yu:
KBGN: Knowledge-Bridge Graph Network for Adaptive Vision-Text Reasoning in Visual Dialogue. 1265-1273
- Gen Luo, Yiyi Zhou, Rongrong Ji, Xiaoshuai Sun, Jinsong Su, Chia-Wen Lin, Qi Tian:
Cascade Grouped Attention Network for Referring Expression Segmentation. 1274-1282
Oral Session G3: Vision and Language
- Jie Wu, Guanbin Li, Xiaoguang Han, Liang Lin:
Reinforcement Learning for Weakly Supervised Temporal Grounding of Natural Language in Untrimmed Videos. 1283-1291
- Shengyu Zhang, Ziqi Tan, Jin Yu, Zhou Zhao, Kun Kuang, Jie Liu, Jingren Zhou, Hongxia Yang, Fei Wu:
Poet: Product-oriented Video Captioner for E-commerce. 1292-1301
- Lisai Zhang, Qingcai Chen, Baotian Hu, Shuoran Jiang:
Text-Guided Neural Image Inpainting. 1302-1310
- Keyang Wang, Lei Zhang:
Single-Shot Two-Pronged Detector with Rectified IoU Loss. 1311-1319
- Huan Lin, Fandong Meng, Jinsong Su, Yongjing Yin, Zhengyuan Yang, Yubin Ge, Jie Zhou, Jiebo Luo:
Dynamic Context-guided Capsule Network for Multimodal Machine Translation. 1320-1329
- Shitong Luo, Wei Hu:
Differentiable Manifold Reconstruction for Point Cloud Denoising. 1330-1338
Oral Session H3: Vision and Language
- Hongyi Zheng, Wangmeng Zuo, Lei Zhang:
BS-MCVR: Binary-sensing based Mobile-cloud Visual Recognition. 1339-1347
- Jingjing Li, Mengmeng Jing, Lei Zhu, Zhengming Ding, Ke Lu, Yang Yang:
Learning Modality-Invariant Latent Representations for Generalized Zero-shot Learning. 1348-1356
- Yahui Liu, Marco De Nadai, Deng Cai, Huayang Li, Xavier Alameda-Pineda, Nicu Sebe, Bruno Lepri:
Describe What to Change: A Text-guided Unsupervised Image-to-image Translation Approach. 1357-1365
- Advaith Sridhar, Rohith Gandhi Ganesan, Pratyush Kumar, Mitesh M. Khapra:
INCLUDE: A Large Scale Dataset for Indian Sign Language Recognition. 1366-1375
- Run Wang, Felix Juefei-Xu, Qing Guo, Yihao Huang, Xiaofei Xie, Lei Ma, Yang Liu:
Amora: Black-box Adversarial Morphing Attack. 1376-1385
- Fan Yu, Haonan Wang, Tongwei Ren, Jinhui Tang, Gangshan Wu:
Visual Relation of Interest Detection. 1386-1394
Poster Session A1: Deep Learning for Multimedia
- Zhedong Zheng, Yunchao Wei, Yi Yang:
University-1652: A Multi-view Multi-source Benchmark for Drone-based Geo-localization. 1395-1403
- Tao Dai, Yan Feng, Dongxian Wu, Bin Chen, Jian Lu, Yong Jiang, Shu-Tao Xia:
DIPDefend: Deep Image Prior Driven Defense against Adversarial Examples. 1404-1412
- Peng Zhang, Yunlu Xu, Zhanzhan Cheng, Shiliang Pu, Jing Lu, Liang Qiao, Yi Niu, Fei Wu:
TRIE: End-to-End Text Reading and Information Extraction for Document Understanding. 1413-1422
- Jiaming Zhang, Jitao Sang, Xian Zhao, Xiaowen Huang, Yanfeng Sun, Yongli Hu:
Adversarial Privacy-preserving Filter. 1423-1431
- Wei Peng, Jingang Shi, Zhaoqiang Xia, Guoying Zhao:
Mix Dimension in Poincaré Geometry for 3D Skeleton-based Action Recognition. 1432-1440
- Lizhao Liu, Junyi Cao, Minqian Liu, Yong Guo, Qi Chen, Mingkui Tan:
Dynamic Extension Nets for Few-shot Semantic Segmentation. 1441-1449
- Feifan Lv, Bo Liu, Feng Lu:
Fast Enhancement for Non-Uniform Illumination Images using Light-weight CNNs. 1450-1458
- Zili Yi, Qiang Tang, Vishnu Sanjay Ramiya Srinivasan, Zhan Xu:
Animating Through Warping: An Efficient Method for High-Quality Facial Expression Animation. 1459-1468
- Liang Han, Pichao Wang, Zhaozheng Yin, Fan Wang, Hao Li:
Exploiting Better Feature Aggregation for Video Object Detection. 1469-1477
- Chongyi Li, Huazhu Fu, Runmin Cong, Zechao Li, Qianqian Xu:
NuI-Go: Recursive Non-Local Encoder-Decoder Network for Retinal Image Non-Uniform Illumination Removal. 1478-1487
- Jie Zhao, Kenan Dai, Dong Wang, Huchuan Lu, Xiaoyun Yang:
Online Filtering Training Samples for Robust Visual Tracking. 1488-1496
- Junfu Pu, Wengang Zhou, Hezhen Hu, Houqiang Li:
Boosting Continuous Sign Language Recognition via Cross Modality Augmentation. 1497-1505
- Chen Zhao, Bernard Ghanem:
ThumbNet: One Thumbnail Image Contains All You Need for Recognition. 1506-1514
- Kaihua Zhang, Long Wang, Dong Liu, Bo Liu, Qingshan Liu, Zhu Li:
Dual Temporal Memory Network for Efficient Video Object Segmentation. 1515-1523
Poster Session B1: Deep Learning for Multimedia
- Zeyuan Wang, Yifan Zhao, Jia Li, Yonghong Tian:
Cooperative Bi-path Metric for Few-shot Learning. 1524-1532
- Yu Han, Shuai Yang, Wenjing Wang, Jiaying Liu:
From Design Draft to Real Attire: Unaligned Fashion Image Translation. 1533-1541
- Fei Zhao, Ting Zhang, Chao Ma, Ming Tang, Jinqiao Wang, Xiaobo Wang:
Siamese Attentive Graph Tracking. 1542-1550
- Lingbo Yang, Shanshe Wang, Siwei Ma, Wen Gao, Chang Liu, Pan Wang, Peiran Ren:
HiFaceGAN: Face Renovation via Collaborative Suppression and Replenishment. 1551-1560
- Zhaohui Yang, Yunhe Wang, Chang Xu, Peng Du, Chao Xu, Chunjing Xu, Qi Tian:
Discernible Image Compression. 1561-1569
- Jialian Wu, Liangchen Song, Tiancai Wang, Qian Zhang, Junsong Yuan:
Forest R-CNN: Large-Vocabulary Long-Tailed Object Detection and Instance Segmentation. 1570-1578
- Xiaojun Jia, Xingxing Wei, Xiaochun Cao, Xiaoguang Han:
Adv-watermark: A Novel Watermark Perturbation for Adversarial Examples. 1579-1587
- Jichao Zhang, Jingjing Chen, Hao Tang, Wei Wang, Yan Yan, Enver Sangineto, Nicu Sebe:
Dual In-painting Model for Unsupervised Gaze Correction and Animation in the Wild. 1588-1596
- Gang Li, Jian Li, Shanshan Zhang, Jian Yang:
Learning Hierarchical Graph for Occluded Pedestrian Detection. 1597-1605
- Taotao Jing, Haifeng Xia, Zhengming Ding:
Adaptively-Accumulated Knowledge Transfer for Partial Domain Adaptation. 1606-1614
- Jinpeng Li, Shengcai Liao, Hangzhi Jiang, Ling Shao:
Box Guided Convolution for Pedestrian Detection. 1615-1624
- Yi-Fan Song, Zhang Zhang, Caifeng Shan, Liang Wang:
Stronger, Faster and More Explainable: A Graph Convolutional Baseline for Skeleton-based Action Recognition. 1625-1633
- Xia Du, Chi-Man Pun:
Adversarial Image Attacks Using Multi-Sample and Most-Likely Ensemble Methods. 1634-1642
- Cong Wang, Xiaoying Xing, Yutong Wu, Zhixun Su, Junyang Chen:
DCSFN: Deep Cross-scale Fusion Network for Single Image Rain Removal. 1643-1651
Poster Session C1: Deep Learning for Multimedia
- Yumeng Zhang, Gaoguo Jia, Li Chen, Mingrui Zhang, Junhai Yong:
Self-Paced Video Data Augmentation by Generative Adversarial Networks with Insufficient Samples. 1652-1660
- Xin Wen, Zhizhong Han, Geunhyuk Youk, Yu-Shen Liu:
CF-SIS: Semantic-Instance Segmentation of 3D Point Clouds by Context Fusion with Self-Attention. 1661-1669
- Yunan Liu, Liang Zhao, Shanshan Zhang, Jian Yang:
Hybrid Resolution Network Using Edge Guided Region Mutual Information Loss for Human Parsing. 1670-1678
- Xiongwei Wu, Doyen Sahoo, Steven C. H. Hoi:
Meta-RCNN: Meta Learning for Few-Shot Object Detection. 1679-1687
- Ke Yang, Peng Zhang, Peng Qiao, Zhiyuan Wang, Dongsheng Li, Yong Dou:
Objectness Consistent Representation for Weakly Supervised Object Detection. 1688-1696
- Zhangkai Ni, Wenhan Yang, Shiqi Wang, Lin Ma, Sam Kwong:
Unpaired Image Enhancement with Quality-Attention Generative Adversarial Network. 1697-1705
- Xierong Zhu, Jiawei Liu, Haoze Wu, Meng Wang, Zheng-Jun Zha:
ASTA-Net: Adaptive Spatio-Temporal Attention Network for Person Re-Identification in Videos. 1706-1715
- Dan Zeng, Han Liu, Hui Lin, Shiming Ge:
Talking Face Generation with Expression-Tailored Generative Adversarial Network. 1716-1724
- Tianyu Yu, Tianrui Hui, Zhihao Yu, Yue Liao, Sansi Yu, Faxi Zhang, Si Liu:
Cross-Modal Omni Interaction Modeling for Phrase Grounding. 1725-1734
- Yazhou Yao, Xiansheng Hua, Guanyu Gao, Zeren Sun, Zhibin Li, Jian Zhang:
Bridging the Web Data and Fine-Grained Visual Recognition via Alleviating Label Noise and Domain Mismatch. 1735-1744
- Jiawei Zhao, Yifan Zhao, Jia Li, Xiaowu Chen:
Is Depth Really Necessary for Salient Object Detection? 1745-1754
- Nobukatsu Kajiura, Satoshi Kosugi, Xueting Wang, Toshihiko Yamasaki:
Self-Play Reinforcement Learning for Fast Image Retargeting. 1755-1763
- Ahmed Fares, Sheng-hua Zhong, Jianmin Jiang:
Brain-media: A Dual Conditioned and Lateralization Supported GAN (DCLS-GAN) towards Visualization of Image-evoked Brain Activities. 1764-1772
- Guangming Yao, Yi Yuan, Tianjia Shao, Kun Zhou:
Mesh Guided One-shot Face Reenactment Using Graph Convolutional Networks. 1773-1781
Poster Session D1: Deep Learning for Multimedia
- Weihao Xia, Yujiu Yang, Jing-Hao Xue, Wensen Feng:
Controllable Continuous Gaze Redirection. 1782-1790
- Xinxiao Wu, Jialu Chen:
Preserving Global and Local Temporal Consistency for Arbitrary Video Style Transfer. 1791-1799
- Qinjie Xiao, Xiangjun Tang, You Wu, Leyang Jin, Yong-Liang Yang, Xiaogang Jin:
Deep Shapely Portraits. 1800-1808
- Xinchen Ye, Baoli Sun, Zhihui Wang, Jingyu Yang, Rui Xu, Haojie Li, Baopu Li:
Depth Super-Resolution via Deep Controllable Slicing Network. 1809-1818
- Chengcheng Ma, Weiliang Meng, Baoyuan Wu, Shibiao Xu, Xiaopeng Zhang:
Efficient Joint Gradient Based Attack Against SOR Defense for 3D Point Cloud Classification. 1819-1827
- Xiaofeng Cong, Jie Gui, Kai-Chao Miao, Jun Zhang, Bing Wang, Peng Chen:
Discrete Haze Level Dehazing Network. 1828-1836
- Shikang Gan, Yong Luo, Yonggang Wen, Tongliang Liu, Han Hu:
Deep Heterogeneous Multi-Task Metric Learning for Visual Recognition and Retrieval. 1837-1845
- Meng Wei, Chun Yuan, Xiaoyu Yue, Kuo Zhong:
HOSE-Net: Higher Order Structure Embedded Network for Scene Graph Generation. 1846-1854
- Lijian Lin, Haosheng Chen, Honglun Zhang, Jun Liang, Yu Li, Ying Shan, Hanzi Wang:
Dual Semantic Fusion Network for Video Object Detection. 1855-1863
- Xiaodan Li, Yining Lang, Yuefeng Chen, Xiaofeng Mao, Yuan He, Shuhui Wang, Hui Xue, Quan Lu:
Sharp Multiple Instance Learning for DeepFake Video Detection. 1864-1872
- Gang Fu, Qing Zhang, Qifeng Lin, Lei Zhu, Chunxia Xiao:
Learning to Detect Specular Highlights from Real-world Images. 1873-1881
- Jianping Luo, Shaofei Huang, Yuan Yuan:
Video Super-Resolution using Multi-scale Pyramid 3D Convolutional Networks. 1882-1890
- Hao Dou, Chen Chen, Xiyuan Hu, Zuxing Xuan, Zhisen Hu, Silong Peng:
PCA-SRGAN: Incremental Orthogonal Projection Discrimination for Face Super-resolution. 1891-1899
- Yizhi Wang, Zhouhui Lian:
Exploring Font-independent Features for Scene Text Recognition. 1900-1920
Poster Session E1: Deep Learning for Multimedia
- Zhangxuan Gu, Siyuan Zhou, Li Niu, Zihan Zhao, Liqing Zhang:
Context-aware Feature Generation For Zero-shot Semantic Segmentation. 1921-1929 - Wenqing Liu, Miaojing Shi, Teddy Furon, Li Li:
Defending Adversarial Examples via DNN Bottleneck Reinforcement. 1930-1938 - Xun Yang, Xueliang Liu, Meng Jian, Xinjian Gao, Meng Wang:
Weakly-Supervised Video Object Grounding by Exploring Spatio-Temporal Contexts. 1939-1947 - Chon-Hou Sio, Yu-Jen Ma, Hong-Han Shuai, Jun-Cheng Chen, Wen-Huang Cheng:
S2SiamFC: Self-supervised Fully Convolutional Siamese Network for Visual Tracking. 1948-1957 - Daniel Rotman, Yevgeny Yaroker, Elad Amrani, Udi Barzelay, Rami Ben-Ari:
Learnable Optimal Sequential Grouping for Video Scene Detection. 1958-1966 - Penghao Zhou, Chong Zhou, Pai Peng, Junlong Du, Xing Sun, Xiaowei Guo, Feiyue Huang:
NOH-NMS: Improving Pedestrian Detection by Nearby Objects Hallucination. 1967-1975 - Chuangchuang Tan, Guanghua Gu, Tao Ruan, Shikui Wei, Yao Zhao:
Dual-Gradients Localization Framework for Weakly Supervised Object Localization. 1976-1984 - Weicong Chen, Xu Tan, Yingce Xia, Tao Qin, Yu Wang, Tie-Yan Liu:
DualLip: A System for Joint Lip Reading and Generation. 1985-1993 - Hao Tang, Song Bai, Nicu Sebe:
Dual Attention GANs for Semantic Image Synthesis. 1994-2002 - Renwang Chen, Xuanhong Chen, Bingbing Ni, Yanhao Ge:
SimSwap: An Efficient Framework For High Fidelity Face Swapping. 2003-2011 - Jialian Wu, Chunluan Zhou, Qian Zhang, Ming Yang, Junsong Yuan:
Self-Mimic Learning for Small-scale Pedestrian Detection. 2012-2020 - Chuan Guo, Xinxin Zuo, Sen Wang, Shihao Zou, Qingyao Sun, Annan Deng, Minglun Gong, Li Cheng:
Action2Motion: Conditioned Generation of 3D Human Motions. 2021-2029 - Hui Zhang, Chuan Wang, Nenglun Chen, Jue Wang, Wenping Wang:
Skin Textural Generation via Blue-noise Gabor Filtering based Generative Adversarial Network. 2030-2038 - Jiapeng Li, Ping Wei, Yongchi Zhang, Nanning Zheng:
A Slow-I-Fast-P Architecture for Compressed Video Action Recognition. 2039-2047
Poster Session F1: Deep Learning for Multimedia
- Peisong Wen, Ruolin Yang, Qianqian Xu, Chen Qian, Qingming Huang, Runmin Cong, Jianlou Si:
DMVOS: Discriminative Matching for Real-time Video Object Segmentation. 2048-2056 - Zhensheng Shi, Liangjie Cao, Cheng Guan, Ju Liang, Qianqian Li, Zhaorui Gu, Haiyong Zheng, Bing Zheng:
Multi-Group Multi-Attention: Towards Discriminative Spatiotemporal Representation. 2057-2066 - Wei Yan, Ruonan Zhang, Jing Wang, Shan Liu, Thomas H. Li, Ge Li:
Vaccine-style-net: Point Cloud Completion in Implicit Continuous Function Space. 2067-2075 - Yumeng Zhang, Li Chen, Yufeng Liu, Wen Zheng, Junhai Yong:
Adaptive Wasserstein Hourglass for Weakly Supervised RGB 3D Hand Pose Estimation. 2076-2084 - Weide Liu, Chi Zhang, Guosheng Lin, Tzu-Yi Hung, Chunyan Miao:
Weakly Supervised Segmentation with Maximum Bipartite Graph Matching. 2085-2094 - Daksh Thapar, Aditya Nigam, Chetan Arora:
Recognizing Camera Wearer from Hand Gestures in Egocentric Videos: https://egocentricbiometric.github.io/. 2095-2103 - Zijian Wang, Yadan Luo, Zi Huang, Mahsa Baktashmotlagh:
Prototype-Matching Graph Network for Heterogeneous Domain Adaptation. 2104-2112 - Huanrong Zhang, Zhi Jin, Xiaojun Tan, Xiying Li:
Towards Lighter and Faster: Learning Wavelets Progressively for Image Super-Resolution. 2113-2121 - Zhen Huang, Xu Shen, Xinmei Tian, Houqiang Li, Jianqiang Huang, Xian-Sheng Hua:
Spatio-Temporal Inception Graph Convolutional Networks for Skeleton-Based Action Recognition. 2122-2130 - Wenheng Chen, He Wang, Yi Yuan, Tianjia Shao, Kun Zhou:
Dynamic Future Net: Diversified Human Motion Generation. 2131-2139 - Xing Lan, Qinghao Hu, Fangzhou Xiong, Cong Leng, Jian Cheng:
ATF: Towards Robust Face Alignment via Leveraging Similarity and Diversity across Different Datasets. 2140-2148 - Nan Pu, Wei Chen, Yu Liu, Erwin M. Bakker, Michael S. Lew:
Dual Gaussian-based Variational Subspace Disentanglement for Visible-Infrared Person Re-Identification. 2149-2158 - Chong Mou, Xin Zhang:
Attention Based Dual Branches Fingertip Detection Network and Virtual Key System. 2159-2165 - Md. Moniruzzaman, Zhaozheng Yin, Zhihai He, Ruwen Qin, Ming C. Leu:
Action Completeness Modeling with Background Aware Networks for Weakly-Supervised Temporal Action Localization. 2166-2174
Poster Session G1: Deep Learning for Multimedia
- Akash Gupta, Rameswar Panda, Sujoy Paul, Jianming Zhang, Amit K. Roy-Chowdhury:
Adversarial Knowledge Transfer from Unlabeled Data. 2175-2183 - Xiaoqing Liang, Xu Zhao, Chaoyang Zhao, Nanfei Jiang, Ming Tang, Jinqiao Wang:
Task Decoupled Knowledge Distillation For Lightweight Face Detectors. 2184-2192 - Li Tao, Xueting Wang, Toshihiko Yamasaki:
Self-supervised Video Representation Learning Using Inter-intra Contrastive Framework. 2193-2201 - Jie Liu, Minqiang Zou, Jie Tang, Gangshan Wu:
Memory Recursive Network for Single Image Super-Resolution. 2202-2210 - Ying Chen, Lifeng Huang, Chengying Gao, Ning Liu:
Scale-aware Progressive Optimization Network. 2211-2219 - Junguang Jiang, Ximei Wang, Mingsheng Long, Jianmin Wang:
Resource Efficient Domain Adaptation. 2220-2228 - Lina Wang, Kang Yang, Wenqi Wang, Run Wang, Aoshuang Ye:
MGAAttack: Toward More Query-efficient Black-box Attack by Microbial Genetic Algorithm. 2229-2236 - Ling Lei, Jianfeng Li, Tong Chen, Shigang Li:
A Novel Graph-TCN with a Graph Structured Representation for Micro-expression Recognition. 2237-2245 - Mengyue Geng, Peixi Peng, Yangru Huang, Yonghong Tian:
Masked Face Recognition with Generative Data Augmentation and Domain Constrained Ranking. 2246-2254 - Junhua Liao, Haihan Duan, Xin Li, Haoran Xu, Yanbing Yang, Wei Cai, Yanru Chen, Liangyin Chen:
Occlusion Detection for Automatic Video Editing. 2255-2263 - Yi Zheng, Yifan Zhao, Mengyuan Ren, He Yan, Xiangju Lu, Junhui Liu, Jia Li:
Cartoon Face Recognition: A Benchmark Dataset. 2264-2272 - Xiquan Guan, Huamin Feng, Weiming Zhang, Hang Zhou, Jie Zhang, Nenghai Yu:
Reversible Watermarking in Deep Convolutional Neural Networks for Integrity Authentication. 2273-2280 - Feifei Ding, Peixi Peng, Yangru Huang, Mengyue Geng, Yonghong Tian:
Masked Face Recognition with Latent Part Detection. 2281-2289 - Chunyan Zhang, Songhua Xu, Zongfang Li:
PanelNet: A Novel Deep Neural Network for Predicting Collective Diagnostic Ratings by a Panel of Radiologists for Pulmonary Nodules. 2290-2298
Poster Session H1: Deep Learning for Multimedia
- Xuan-Son Vu, Duc-Trong Le, Christoffer Edlund, Lili Jiang, Hoang D. Nguyen:
Privacy-Preserving Visual Content Tagging using Graph Transformer Networks. 2299-2307 - Youngjoong Kwon, Stefano Petrangeli, Dahun Kim, Haoliang Wang, Henry Fuchs, Viswanathan Swaminathan:
Rotationally-Consistent Novel View Synthesis for Humans. 2308-2316 - Minhao Fan, Wenjing Wang, Wenhan Yang, Jiaying Liu:
Integrating Semantic Segmentation and Retinex Model for Low-Light Image Enhancement. 2317-2325 - Xixia Xu, Qi Zou, Xue Lin:
Alleviating Human-level Shift: A Robust Domain Adaptation Method for Multi-person Pose Estimation. 2326-2335 - Lei Zhao, Sihuan Lin, Ailin Li, Huaizhong Lin, Wei Xing, Dongming Lu:
SpatialGAN: Progressive Image Generation Based on Spatial Recursive Adversarial Expansion. 2336-2344 - Li-Ming Zhan, Bo Liu, Lu Fan, Jiaxin Chen, Xiao-Ming Wu:
Medical Visual Question Answering via Conditional Reasoning. 2345-2354 - Jing Zhang, Yang Cao, Zheng-Jun Zha, Dacheng Tao:
Nighttime Dehazing with a Synthetic Benchmark. 2355-2363 - Chenru Jiang, Kaizhu Huang, Shufei Zhang, Xinheng Wang, Jimin Xiao:
Pay Attention Selectively and Comprehensively: Pyramid Gating Network for Human Pose Estimation without Pre-training. 2364-2371 - Chuanyi Zhang, Yazhou Yao, Xiangbo Shu, Zechao Li, Zhenmin Tang, Qi Wu:
Data-driven Meta-set Based Fine-Grained Visual Recognition. 2372-2381 - Bojia Zi, Minghao Chang, Jingjing Chen, Xingjun Ma, Yu-Gang Jiang:
WildDeepfake: A Challenging Real-World Dataset for Deepfake Detection. 2382-2390 - Ce Zheng, Yecheng Lyu, Ming Li, Ziming Zhang:
LodoNet: A Deep Neural Network with 2D Keypoint Matching for 3D LiDAR Odometry Estimation. 2391-2399 - Weitao Wang, Ruyang Liu, Meng Wang, Sen Wang, Xiaojun Chang, Yang Chen:
Memory-Based Network for Scene Graph with Unbalanced Relations. 2400-2408 - Haotian Wang, Wenjing Yang, Ji Wang, Ruxin Wang, Long Lan, Mingyang Geng:
Pairwise Similarity Regularization for Adversarial Domain Adaptation. 2409-2418 - Mingyao Hong, Guorong Li, Xinfeng Zhang, Qingming Huang:
Generalized Zero-Shot Video Classification via Generative Adversarial Networks. 2419-2426
Poster Session A2: Deep Learning for Multimedia
- Maciej Tomczak, Masataka Goto, Jason Hockman:
Drum Synthesis and Rhythmic Transformation with Adversarial Autoencoders. 2427-2435 - Guibiao Liao, Wei Gao, Qiuping Jiang, Ronggang Wang, Ge Li:
MMNet: Multi-Stage and Multi-Scale Fusion Network for RGB-D Salient Object Detection. 2436-2444 - Songhua Liu, Hao Wu, Shoutong Luo, Zhengxing Sun:
Stable Video Style Transfer Based on Partial Convolution with Depth-Aware Supervision. 2445-2453 - Yimeng Zhang, Xiao-Yang Liu, Bo Wu, Anwar Walid:
Video Synthesis via Transform-Based Tensor Neural Network. 2454-2462 - Ziming Wang, Yuexian Zou, Zeming Zhang:
Cluster Attention Contrast for Video Anomaly Detection. 2463-2471 - Wolmer Bigi, Claudio Baecchi, Alberto Del Bimbo:
Automatic Interest Recognition from Posture and Behaviour. 2472-2480 - Yangfan Sun, Li Li, Zhu Li, Shan Liu:
Referenceless Rate-Distortion Modeling with Learning from Bitstream and Pixel Features. 2481-2489 - Lilang Lin, Sijie Song, Wenhan Yang, Jiaying Liu:
MS2L: Multi-Task Self-Supervised Learning for Skeleton Based Action Recognition. 2490-2498 - Dang-Khoa Nguyen, Wei-Lun Tseng, Hong-Han Shuai:
Domain-Adaptive Object Detection via Uncertainty-Aware Distribution Alignment. 2499-2507 - Zhenyu Wu, Duc Hoang, Shih-Yao Lin, Yusheng Xie, Liangjian Chen, Yen-Yu Lin, Zhangyang Wang, Wei Fan:
MM-Hand: 3D-Aware Multi-Modal Guided Hand Generation for 3D Hand Pose Synthesis. 2508-2516 - Cong Wang, Yutong Wu, Zhixun Su, Junyang Chen:
Joint Self-Attention and Scale-Aggregation for Self-Calibrated Deraining Network. 2517-2525 - Ling-An Zeng, Fa-Ting Hong, Wei-Shi Zheng, Qi-Zhi Yu, Wei Zeng, Yaowei Wang, Jian-Huang Lai:
Hybrid Dynamic-static Context-aware Attention Network for Action Assessment in Long Videos. 2526-2534 - Yan Hong, Li Niu, Jianfu Zhang, Weijie Zhao, Chen Fu, Liqing Zhang:
F2GAN: Fusing-and-Filling GAN for Few-shot Image Generation. 2535-2543 - Xianggang Yu, Haolin Liu, Xiaoguang Han, Zhen Li, Zixiang Xiong, Shuguang Cui:
JAFPro: Joint Appearance Fusion and Propagation for Human Video Motion Transfer from Multiple Reference Images. 2544-2552
Poster Session B2: Deep Learning for Multimedia & Emerging Multimedia Applications
- Jakub Lokoc, Tomás Soucek, Patrik Veselý, Frantisek Mejzlík, Jiaqi Ji, Chaoxi Xu, Xirong Li:
A W2VV++ Case Study with Automated and Interactive Text-to-Video Retrieval. 2553-2561 - Yucheng Hang, Qingmin Liao, Wenming Yang, Yupeng Chen, Jie Zhou:
Attention Cube Network for Image Restoration. 2562-2570 - Yu Zhou, Hongtao Xie, Shancheng Fang, Yan Li, Yongdong Zhang:
CRNet: A Center-aware Representation for Detecting Text of Arbitrary Shapes. 2571-2580 - Xiaoqian Guo, Xiangyang Li, Shuqiang Jiang:
Expressional Region Retrieval. 2581-2589 - Shuyuan Li, Jianguo Li, Hanlin Tang, Rui Qian, Weiyao Lin:
ATRW: A Benchmark for Amur Tiger Re-identification in the Wild. 2590-2598 - Weiying Wang, Jieting Chen, Qin Jin:
VideoIC: A Video Interactive Comments Dataset and Multimodal Multitask Learning for Comments Generation. 2599-2607 - Jiewen Zhao, Ruize Han, Yiyang Gan, Liang Wan, Wei Feng, Song Wang:
Human Identification and Interaction Detection in Cross-View Multi-Person Videos with Wearable Cameras. 2608-2616 - Miaohui Wang, Wuyuan Xie, Maolin Cui:
Surface Reconstruction with Unconnected Normal Maps: An Efficient Mesh-based Approach. 2617-2625 - Murari Mandal, Lav Kush Kumar, Santosh Kumar Vipparthi:
MOR-UAV: A Benchmark Dataset and Baselines for Moving Object Recognition in UAV Videos. 2626-2635 - Xuewen Yang, Dongliang Xie, Xin Wang, Jiangbo Yuan, Wanying Ding, Pengyun Yan:
Learning Tuple Compatibility for Conditional Outfit Recommendation. 2636-2644 - Lingbo Liu, Jiaqi Chen, Hefeng Wu, Tianshui Chen, Guanbin Li, Liang Lin:
Efficient Crowd Counting via Structured Knowledge Transfer. 2645-2654 - Yifei Huang, Chenhui Li, Xiaohu Guo, Jing Liao, Chenxu Zhang, Changbo Wang:
DeSmoothGAN: Recovering Details of Smoothed Images via Spatial Feature-wise Transformation and Full Attention. 2655-2663 - Hyewon Song, Jaeseong Park, Suwoong Heo, Jiwoo Kang, Sanghoon Lee:
PatchMatch based Multiview Stereo with Local Quadric Window. 2664-2672 - Alexander Tesch, Ralf Dörner:
Expert Performance in the Examination of Interior Surfaces in an Automobile: Virtual Reality vs. Reality. 2673-2681
Poster Session C2: Emerging Multimedia Applications
- Wentao Bao, Qi Yu, Yu Kong:
Uncertainty-based Traffic Accident Anticipation with Spatio-Temporal Relational Learning. 2682-2690 - Xuan Shao, Lin Zhang, Tianjun Zhang, Ying Shen, Hongyu Li, Yicong Zhou:
A Tightly-coupled Semantic SLAM System with Visual, Inertial and Surround-view Sensors for Autonomous Indoor Parking. 2691-2699 - Yimu Wang, Shiyin Lu, Lijun Zhang:
Searching Privately by Imperceptible Lying: A Novel Private Hashing Method with Differential Privacy. 2700-2709 - Xin Wang, Huijun Zhang, Lei Cao, Ling Feng:
Leverage Social Media for Personalized Stress Detection. 2710-2718 - Yingying Deng, Fan Tang, Weiming Dong, Wen Sun, Feiyue Huang, Changsheng Xu:
Arbitrary Style Transfer via Multi-Adaptation Network. 2719-2727 - Jingcai Guo, Shiheng Ma, Jie Zhang, Qihua Zhou, Song Guo:
Dual-view Attention Networks for Single Image Super-Resolution. 2728-2736 - Zhongnian Li, Tao Zhang, Ruoyu Chen, Daoqiang Zhang:
MRI Measurement Matrix Learning via Correlation Reweighting. 2737-2745 - Ruize Han, Jiewen Zhao, Wei Feng, Yiyang Gan, Liang Wan, Song Wang:
Complementary-View Co-Interest Person Detection. 2746-2754 - Weidong He, Zhi Li, Dongcai Lu, Enhong Chen, Tong Xu, Baoxing Huai, Jing Yuan:
Multimodal Dialogue Systems via Capturing Context-aware Dependencies of Semantic Elements. 2755-2764 - Carlos Bermejo, Dimitris Chatzopoulos, Pan Hui:
EyeShopper: Estimating Shoppers' Gaze using CCTV Cameras. 2765-2774 - Eugene Yujun Fu, Zhongqi Yang, Hong Va Leong, Grace Ngai, Chi-Wai Do, Lily Chan:
Exploiting Active Learning in Novel Refractive Error Detection with Smartphones. 2775-2783 - Liang Han, Zhaozheng Yin, Zhurong Xia, Minqian Tang, Rong Jin:
Price Suggestion for Online Second-hand Items with Texts and Images. 2784-2792 - Xuebin Sun, Sukai Wang, Miaohui Wang, Shing Shin Cheng, Ming Liu:
An Advanced LiDAR Point Cloud Sequence Coding Scheme for Autonomous Driving. 2793-2801 - Xing Xu, Jiefu Chen, Jinhui Xiao, Zheng Wang, Yang Yang, Heng Tao Shen:
Learning Optimization-based Adversarial Perturbations for Attacking Sequential Recognition Models. 2802-2822
Poster Session D2: Emerging Multimedia Applications & Emotional and Social Signals in Multimedia
- Trisha Mittal, Uttaran Bhattacharya, Rohan Chandra, Aniket Bera, Dinesh Manocha:
Emotions Don't Lie: An Audio-Visual Deepfake Detection Method using Affective Cues. 2823-2832 - Delian Ruan, Yan Yan, Si Chen, Jing-Hao Xue, Hanzi Wang:
Deep Disturbance-Disentangled Learning for Facial Expression Recognition. 2833-2841 - Xinhui Song, Tianyang Shi, Zunlei Feng, Mingli Song, Jackie Lin, Chuanjie Lin, Changjie Fan, Yi Yuan:
Unsupervised Learning Facial Parameter Regressor for Action Unit Intensity Estimation via Differentiable Renderer. 2842-2851 - Jingjun Liang, Ruichen Li, Qin Jin:
Semi-supervised Multi-modal Emotion Recognition with Cross-Modal Distribution Matching. 2852-2861 - Songcheng Gao, Wenzhong Li, Lynda J. Song, Xiao Zhang, Mingkai Lin, Sanglu Lu:
PersonalitySensing: A Multi-View Multi-Task Learning Approach for Personality Detection based on Smartphone Usage. 2862-2870 - Hong-Xia Xie, Ling Lo, Hong-Han Shuai, Wen-Huang Cheng:
AU-assisted Graph Attention Convolutional Network for Micro-Expression Recognition. 2871-2880 - Xingxun Jiang, Yuan Zong, Wenming Zheng, Chuangao Tang, Wanchuang Xia, Cheng Lu, Jiateng Liu:
DFEW: A Large-Scale Database for Recognizing Dynamic Facial Expressions in the Wild. 2881-2889 - Zheng Zhang, Taoyue Wang, Lijun Yin:
Region of Interest Based Graph Convolution: A Heatmap Regression Approach for Action Unit Detection. 2890-2898 - Junjie Zhu, Bingjun Luo, Sicheng Zhao, Shihui Ying, Xibin Zhao, Yue Gao:
IExpressNet: Facial Expression Recognition with Incremental Classes. 2899-2908 - Ziyu Jia, Youfang Lin, Xiyang Cai, Haobin Chen, Haijun Gou, Jing Wang:
SST-EmotionNet: Spatial-Spectral-Temporal based Attention 3D Dense Network for EEG Emotion Recognition. 2909-2917 - Connor T. Heaton, David M. Schwartz:
Language Models as Emotional Classifiers for Textual Conversation. 2918-2926 - Bin Xia, Shangfei Wang:
Occluded Facial Expression Recognition with Step-Wise Assistance from Unpaired Non-Occluded Images. 2927-2935 - Bin Xia, Weikang Wang, Shangfei Wang, Enhong Chen:
Learning from Macro-expression: a Micro-expression Recognition Framework. 2936-2944 - Sicheng Zhao, Yaxian Li, Xingxu Yao, Weizhi Nie, Pengfei Xu, Jufeng Yang, Kurt Keutzer:
Emotion-Based End-to-End Matching Between Image and Music in Valence-Arousal Space. 2945-2954
Poster Session E2: Emotional and Social Signals in Multimedia & Media Interpretation
- Zhiwei Xu, Shangfei Wang, Can Wang:
Exploiting Multi-Emotion Relations at Feature and Label Levels for Emotion Tagging. 2955-2963 - Linyi Zhou, Xijian Fan, Yingjie Ma, Tardi Tjahjadi, Qiaolin Ye:
Uncertainty-aware Cross-dataset Facial Expression Recognition via Regularized Conditional Alignment. 2964-2972 - Tugba Kulahcioglu, Gerard de Melo:
Fonts Like This but Happier: A New Way to Discover Fonts. 2973-2981 - Huiyuan Yang, Taoyue Wang, Lijun Yin:
Adaptive Multimodal Fusion for Facial Action Units Recognition. 2982-2990 - Shi Yin, Shangfei Wang, Xiaoping Chen, Enhong Chen:
Exploiting Self-Supervised and Semi-Supervised Learning for Facial Landmark Tracking with Unlabeled Data. 2991-2998 - Woan-Shiuan Chien, Hao-Chun Yang, Chi-Chun Lee:
Cross Corpus Physiological-based Emotion Recognition Using a Learnable Visual Semantic Graph Convolutional Network. 2999-3006 - Mengshi Qi, Jie Qin, Xiantong Zhen, Di Huang, Yi Yang, Jiebo Luo:
Few-Shot Ensemble Learning for Video Classification with SlowFast Memory Networks. 3007-3015 - Chenyu Li, Shiming Ge, Daichi Zhang, Jia Li:
Look Through Masks: Towards Masked Face Recognition with De-Occlusion Distillation. 3016-3024 - Jizhe Zhou, Chi-Man Pun, Yu Tong:
Privacy-sensitive Objects Pixelation for Live Video Streaming. 3025-3033 - Jiaxin Chen, Jie Qin, Yichao Yan, Lei Huang, Li Liu, Fan Zhu, Ling Shao:
Deep Local Binary Coding for Person Re-Identification by Delving into the Details. 3034-3043 - Hai Xu, Hongtao Xie, Zheng-Jun Zha, Sun'ao Liu, Yongdong Zhang:
March on Data Imperfections: Domain Division and Domain Generalization for Semantic Segmentation. 3044-3053 - Beibei Lin, Shunli Zhang, Feng Bao:
Gait Recognition with Multiple-Temporal-Scale 3D Convolutional Neural Network. 3054-3062 - Yi Li, Wenjie Pei, Zhenyu He:
SRHEN: Stepwise-Refining Homography Estimation Network via Parsing Geometric Correspondences in Deep Latent Space. 3063-3071 - Jianbo Jiao, Ying Cao, Manfred Lau, Rynson W. H. Lau:
Tactile Sketch Saliency. 3072-3080
Poster Session F2: Media Interpretation & Mobile Multimedia
- Zhengrui Ma, Zhao Kang, Guangchun Luo, Ling Tian, Wenyu Chen:
Towards Clustering-friendly Representations: Subspace Clustering via Graph Filtering. 3081-3089 - Yuyu Guo, Jingkuan Song, Lianli Gao, Heng Tao Shen:
One-shot Scene Graph Generation. 3090-3098 - Huiyuan Fu, Ting Yu, Xin Wang, Huadong Ma:
Cross-Granularity Learning for Multi-Domain Image-to-Image Translation. 3099-3107 - Rui Li, Xiantuo He, Yu Zhu, Xianjun Li, Jinqiu Sun, Yanning Zhang:
Enhancing Self-supervised Monocular Depth Estimation via Incorporating Robust Constraints. 3108-3117 - Tuo Feng, Licheng Jiao, Hao Zhu, Long Sun:
A Novel Object Re-Track Framework for 3D Point Clouds. 3118-3126 - Zixuan Su, Xindi Shang, Jingjing Chen, Yu-Gang Jiang, Zhiyong Qiu, Tat-Seng Chua:
Video Relation Detection via Multiple Hypothesis Association. 3127-3135 - Lin Huang, Jianchao Tan, Jingjing Meng, Ji Liu, Junsong Yuan:
HOT-Net: Non-Autoregressive Transformer for 3D Hand-Object Pose Estimation. 3136-3145 - Lixuan Meng, Chenggang Yan, Jun Li, Jian Yin, Wu Liu, Hongtao Xie, Liang Li:
Multi-Features Fusion and Decomposition for Age-Invariant Face Recognition. 3146-3154 - Hongshuo Tian, Ning Xu, An-An Liu, Yongdong Zhang:
Part-Aware Interactive Learning for Scene Graph Generation. 3155-3163 - Raul Gomez, Yahui Liu, Marco De Nadai, Dimosthenis Karatzas, Bruno Lepri, Nicu Sebe:
Retrieval Guided Unsupervised Multi-domain Image to Image Translation. 3164-3172 - Liuwan Zhu, Rui Ning, Cong Wang, Chunsheng Xin, Hongyi Wu:
GangSweep: Sweep out Neural Backdoors by GAN. 3173-3181 - Zhengcong Fei:
Iterative Back Modification for Faster Image Captioning. 3182-3190 - Carlos Bermejo, Tristan Braud, Ji Yang, Shayan Mirjafari, Bowen Shi, Yu Xiao, Pan Hui:
VIMES: A Wearable Memory Assistance System for Automatic Information Retrieval. 3191-3200
Poster Session G2: Multimedia -- Art and Entertainment, Cloud and Edge Computing, Data Systems, & HCI
- Tianyang Shi, Zhengxia Zou, Xinhui Song, Zheng Song, Changjian Gu, Changjie Fan, Yi Yuan:
Neutral Face Game Character Auto-Creation via PokerFace-GAN. 3201-3209 - Peng Lu, Jinbei Yu, Xujun Peng, Zhaoran Zhao, Xiaojie Wang:
Gray2ColorNet: Transfer More Colors from Reference Image. 3210-3218 - Cheng-Che Lee, Wan-Yi Lin, Yen-Ting Shih, Pei-Yi (Patricia) Kuo, Li Su:
Crossing You in Style: Cross-modal Style Transfer from Music to Visual Arts. 3219-3227 - Keyu Chen, Jianmin Zheng, Jianfei Cai, Juyong Zhang:
Modeling Caricature Expressions by 3D Blendshape and Dynamic Texture. 3228-3236 - Jia Li, Nan Gao, Tong Shen, Wei Zhang, Tao Mei, Hui Ren:
SketchMan: Learning to Create Professional Sketches. 3237-3245 - Xuanhong Chen, Xirui Yan, Naiyuan Liu, Ting Qiu, Bingbing Ni:
Anisotropic Stroke Control for Multiple Artists Style Transfer. 3246-3255 - Hao Hao, Changqiao Xu, Lujie Zhong, Gabriel-Miro Muntean:
A Multi-update Deep Reinforcement Learning Algorithm for Edge Computing Service Offloading. 3256-3264 - Zichuan Xu, Jiangkai Wu, Qiufen Xia, Pan Zhou, Jiankang Ren, Huizhi Liang:
Identity-Aware Attribute Recognition via Real-Time Distributed Inference in Mobile Edge Clouds. 3265-3273 - Wanqian Zhang, Dayan Wu, Yu Zhou, Bo Li, Weiping Wang, Dan Meng:
Deep Unsupervised Hybrid-similarity Hadamard Hashing. 3274-3282 - Mengmeng Jing, Jingjing Li, Lei Zhu, Ke Lu, Yang Yang, Zi Huang:
Incomplete Cross-modal Retrieval with Dual-Aligned Variational Autoencoders. 3283-3291 - Tie Liu, Mai Xu, Shengxi Li, Rui Ding, Huaida Liu:
MRS-Net: Multi-Scale Recurrent Scalable Network for Face Quality Enhancement of Compressed Videos. 3292-3301 - Jasper R. R. Uijlings, Mykhaylo Andriluka, Vittorio Ferrari:
Panoptic Image Annotation with a Collaborative Assistant. 3302-3310 - Jari Korhonen, Yicheng Su, Junyong You:
Blind Natural Video Quality Prediction via Statistical Temporal Features and Deep Spatial Features. 3311-3319
Session H2: Multimedia HCI, Multimedia Scalability and Management, & Multimedia Search and Recommendation
- Zhiyuan Hu, Jia Jia, Bei Liu, Yaohua Bu, Jianlong Fu:
Aesthetic-Aware Image Style Transfer. 3320-3329 - Naoki Sugimoto, Yoshihito Ebine, Kiyoharu Aizawa:
Building Movie Map - A Tool for Exploring Areas in a City - and its Evaluations. 3330-3338 - Jing Li, Suiyi Ling, Junle Wang, Patrick Le Callet:
A Probabilistic Graphical Model for Analyzing the Subjective Visual Quality Assessment Data from Crowdsourcing. 3339-3347 - Linsheng Li, Bin Yang, Cathy Bao, Shuo Liu, Randy Xu, Yong Yao, Mohammad R. Haghighat, Jerry W. Hu, Shoumeng Yan, Zhengwei Qi:
DroidCloud: Scalable High Density Android™ Cloud Rendering. 3348-3356 - Jiaxin Wu, Chong-Wah Ngo:
Interpretable Embedding for Ad-Hoc Video Search. 3357-3366 - Feifei Zhang, Mingliang Xu, Qirong Mao, Changsheng Xu:
Joint Attribute Manipulation and Modality Alignment Learning for Composing Text and Image to Image Retrieval. 3367-3376 - Yangxi Li, Han Hu, Jin Li, Yong Luo, Yonggang Wen:
Semi-supervised Online Multi-Task Metric Learning for Visual Recognition and Retrieval. 3377-3385 - Yu-Wei Zhan, Xin Luo, Yongxin Wang, Xin-Shun Xu:
Supervised Hierarchical Deep Hashing for Cross-Modal Retrieval. 3386-3394 - Weizhi Nie, Yue Zhao, An-An Liu, Zan Gao, Yuting Su:
Multi-graph Convolutional Network for Unsupervised 3D Shape Retrieval. 3395-3403 - Wenjie Yang, Dangwei Li, Xiaotang Chen, Kaiqi Huang:
Bottom-Up Foreground-Aware Feature Fusion for Person Search. 3404-3412 - Zhi Chen, Sen Wang, Jingjing Li, Zi Huang:
Rethinking Generative Zero-Shot Learning: An Ensemble Learning Perspective for Recognising Visual Patches. 3413-3421 - Yanan Wang, Shengcai Liao, Ling Shao:
Surpassing Real-World Source Training Data: Random 3D Characters for Generalizable Person Re-Identification. 3422-3430 - Meng-Jiun Chiou, Zhenguang Liu, Yifang Yin, An-An Liu, Roger Zimmermann:
Zero-Shot Multi-View Indoor Localization via Graph Location Networks. 3431-3440 - Kecheng Zheng, Wu Liu, Jiawei Liu, Zheng-Jun Zha, Tao Mei:
Hierarchical Gumbel Attention Network for Text-based Person Search. 3441-3449
Poster Session A3: Multimedia Search and Recommendation & Multimedia System and Middleware
- Jiawei Liu, Zheng-Jun Zha, Richang Hong, Meng Wang, Yongdong Zhang:
Dual Context-Aware Refinement Network for Person Search. 3450-3459 - Lei Meng, Fuli Feng, Xiangnan He, Xiaoyan Gao, Tat-Seng Chua:
Heterogeneous Fusion of Semantic and Collaborative Information for Visually-Aware Food Recommendation. 3460-3468 - Xiaoyu Du, Xiang Wang, Xiangnan He, Zechao Li, Jinhui Tang, Tat-Seng Chua:
How to Learn Item Representation for Cold-Start Multimedia Recommendation? 3469-3477 - Xuzheng Yu, Tian Gan, Yinwei Wei, Zhiyong Cheng, Liqiang Nie:
Personalized Item Recommendation for Second-hand Trading Platform. 3478-3486 - Hao Jiang, Wenjie Wang, Yinwei Wei, Zan Gao, Yinglong Wang, Liqiang Nie:
What Aspect Do You Like: Multi-scale Time-aware User Interest Modeling for Micro-video Recommendation. 3487-3495 - Yuting Su, Yuqian Li, Dan Song, Zhendong Mao, Xuanya Li, An-An Liu:
Domain-Specific Alignment Network for Multi-Domain Image-Based 3D Object Retrieval. 3496-3504 - Jun Hu, Quan Fang, Shengsheng Qian, Changsheng Xu:
Multi-modal Attentive Graph Pooling Model for Community Question Answer Matching. 3505-3513 - Tianwei Cao, Qianqian Xu, Zhiyong Yang, Qingming Huang:
Task-distribution-aware Meta-learning for Cold-start CTR Prediction. 3514-3522 - Ziruo Sun, Xiushan Nie, Xiaoming Xi, Yilong Yin:
CFVMNet: A Multi-branch Network for Vehicle Re-identification Based on Common Field of View. 3523-3531 - Chunyuan Yuan, Qianwen Ma, Junyang Chen, Wei Zhou, Xiaodan Zhang, Xuehai Tang, Jizhong Han, Songlin Hu:
Exploiting Heterogeneous Artist and Listener Preference Graph for Music Genre Classification. 3532-3540 - Yinwei Wei, Xiang Wang, Liqiang Nie, Xiangnan He, Tat-Seng Chua:
Graph-Refined Convolutional Network for Multimedia Recommendation with Implicit Feedback. 3541-3549 - Riddhiman Dasgupta, Francis Tom, Sudhir Kumar, Mithun Das Gupta, Yokesh Kumar, Badri N. Patro, Vinay P. Namboodiri:
Visually Precise Query. 3550-3558 - Dingjian Jin, Anke Zhang, Jiamin Wu, Gaochang Wu, Haoqian Wang, Lu Fang:
All-in-depth via Cross-baseline Light Field Camera. 3559-3567 - Mohammad Amin Arab, Puria Azadi Moghadam, Mohamed E. Hussein, Wael Abd-Almageed, Mohamed Hefeeda:
Revealing True Identity: Detecting Makeup Attacks in Face-based Biometric Systems. 3568-3576
Poster Session B3: Multimedia System and Middleware & Multimedia Telepresence and Virtual/Augmented Reality
- Negin Ghamsarian, Hadi Amirpour, Christian Timmerer, Mario Taschwer, Klaus Schöffmann:
Relevance-Based Compression of Cataract Surgery Videos Using Convolutional Neural Networks. 3577-3585 - Chirag Raman, Stephanie Tan, Hayley Hung:
A Modular Approach for Synchronized Wireless Multimodal Multisensor Data Acquisition in Highly Dynamic Social Settings. 3586-3594 - Shuoqian Wang, Xiaoyang Zhang, Mengbai Xiao, Kenneth Chiu, Yao Liu:
SphericRTC: A System for Content-Adaptive Real-Time 360-Degree Video Communication. 3595-3603 - Yawen Lu, Yuxing Wang, Guoyu Lu:
Single Image Shape-from-Silhouettes. 3604-3613 - Zhongze Tang, Xianglong Feng, Yi Xie, Huy Phan, Tian Guo, Bo Yuan, Sheng Wei:
VVSec: Securing Volumetric Video Streaming via Benign Use of Adversarial Perturbation. 3614-3623 - Viktor Kelkkanen, Markus Fiedler, David Lindero:
Bitrate Requirements of Non-Panoramic VR Remote Rendering. 3624-3631 - Serhan Gül, Sebastian Bosse, Dimitri Podborski, Thomas Schierl, Cornelius Hellge:
Kalman Filter-based Head Motion Prediction for Cloud-based Mixed Reality. 3632-3641 - Chaoyang Zeng, Tiesong Zhao, Qian Liu, Yiwen Xu, Kai Wang:
Perception-Lossless Codec of Haptic Data with Low Delay. 3642-3650 - Xin Suo, Minye Wu, Yanshun Zhang, Yingliang Zhang, Lan Xu, Qiang Hu, Jingyi Yu:
Neural3D: Light-weight Neural Portrait Scanning via Context-aware Correspondence Learning. 3651-3660 - Jack Ratcliffe, Laurissa Tokarchuk:
Presence, Embodied Interaction and Motivation: Distinct Learning Phenomena in an Immersive Virtual Environment. 3661-3668 - Shishir Subramanyam, Irene Viola, Alan Hanjalic, Pablo César:
User Centered Adaptive Streaming of Dynamic Point Clouds with Low Complexity Tiling. 3669-3677 - Rui-Xiao Zhang, Ming Ma, Tianchi Huang, Hanyu Li, Jiangchuan Liu, Lifeng Sun:
Leveraging QoE Heterogenity for Large-Scale Livecaset Scheduling. 3678-3686 - JongBeom Jeong, Soonbin Lee, Il-Woong Ryu, Tuan Thanh Le, Eun-Seok Ryu:
Towards Viewport-dependent 6DoF 360 Video Tiled Streaming for Virtual Reality Systems. 3687-3695
Poster Session C3: Multimedia Transport and Delivery & Multimedia Analysis and Description
- Yixiang Mao, Liyang Sun, Yong Liu, Yao Wang:
Low-latency FoV-adaptive Coding and Streaming for Interactive 360° Video Streaming. 3696-3704 - Rongqun Lin, Linwei Zhu, Shiqi Wang, Sam Kwong:
Towards Modality Transferable Visual Information Representation with Optimal Model Compression. 3705-3714 - Chao Zhou, Shuoqian Wang, Mengbai Xiao, Sheng Wei, Yao Liu:
AdaP-360: User-Adaptive Area-of-Focus Projections for Bandwidth-Efficient 360-Degree Video Streaming. 3715-3723 - Praveen Kumar Yadav, Wei Tsang Ooi:
Tile Rate Allocation for 360-Degree Tiled Adaptive Video Streaming. 3724-3733 - Lianli Gao, Junchen Zhu, Jingkuan Song, Feng Zheng, Heng Tao Shen:
Lab2Pix: Label-Adaptive Generative Adversarial Network for Unsupervised Image Synthesis. 3734-3742 - Zhou Yu, Yuhao Cui, Jun Yu, Meng Wang, Dacheng Tao, Qi Tian:
Deep Multimodal Neural Architecture Search. 3743-3752 - Jie Wen, Zheng Zhang, Zhao Zhang, Zhihao Wu, Lunke Fei, Yong Xu, Bob Zhang:
DIMC-net: Deep Incomplete Multi-view Clustering Network. 3753-3761 - Bin Zhu, Chong-Wah Ngo, Jingjing Chen:
Cross-domain Cross-modal Food Transfer. 3762-3770 - Li-Shuai Gao, Hua Zhang, Zan Gao, Weili Guan, Zhiyong Cheng, Meng Wang:
Texture Semantically Aligned with Visibility-aware for Partial Person Re-identification. 3771-3779 - Xuanhan Wang, Lianli Gao, Jingkuan Song, Heng Tao Shen:
KTN: Knowledge Transfer Network for Multi-person DensePose Estimation. 3780-3788 - Junwen Chen, Wentao Bao, Yu Kong:
Activity-driven Weakly-Supervised Spatio-Temporal Grounding from Untrimmed Videos. 3789-3797 - Zhaobo Qi, Shuhui Wang, Chi Su, Li Su, Weigang Zhang, Qingming Huang:
Modeling Temporal Concept Receptive Field Dynamically for Untrimmed Video Analysis. 3798-3806 - Wenqiao Zhang, Xin Eric Wang, Siliang Tang, Haizhou Shi, Haochen Shi, Jun Xiao, Yueting Zhuang, William Yang Wang:
Relational Graph Learning for Grounded Video Description Generation. 3807-3828
Poster Session D3: Multimedia Analysis and Description & Multimedia Fusion and Embedding
- Deepak Kumar, Chetan Kumar, Chun-Wei Seah, Siyu Xia, Ming Shao:
Finding Achilles' Heel: Adversarial Attack on Multi-modal Action Recognition. 3829-3837 - Jinxing Li, Hongwei Yong, Feng Wu, Mu Li:
Online Multi-view Subspace Learning with Mixed Noise. 3838-3846 - Qiao Liu, Xin Li, Zhenyu He, Chenglong Li, Jun Li, Zikun Zhou, Di Yuan, Jing Li, Kai Yang, Nana Fan, Feng Zheng:
LSOTB-TIR: A Large-Scale High-Diversity Thermal Infrared Object Tracking Benchmark. 3847-3856 - Zhaobo Qi, Shuhui Wang, Chi Su, Li Su, Qingming Huang, Qi Tian:
Towards More Explainability: Concept Knowledge Mining Network for Event Recognition. 3857-3865 - Shuang Li, Binhui Xie, Jiashu Wu, Ying Zhao, Chi Harold Liu, Zhengming Ding:
Simultaneous Semantic Alignment Network for Heterogeneous Domain Adaptation. 3866-3874 - Liang Li, Shijie Yang, Li Su, Shuhui Wang, Chenggang Yan, Zhengjun Zha, Qingming Huang:
Diverter-Guider Recurrent Network for Diverse Poems Generation from Image. 3875-3883 - Ying Cheng, Ruize Wang, Zhihao Pan, Rui Feng, Yuejie Zhang:
Look, Listen, and Attend: Co-Attention Network for Self-Supervised Audio-Visual Representation Learning. 3884-3892 - Haoming Xu, Runhao Zeng, Qingyao Wu, Mingkui Tan, Chuang Gan:
Cross-Modal Relation-Aware Networks for Audio-Visual Event Localization. 3893-3901 - Yikai Wang, Fuchun Sun, Ming Lu, Anbang Yao:
Learning Deep Multimodal Feature Representation with Asymmetric Multi-layer Fusion. 3902-3910 - Ruijian Jia, Xinsheng Wang, Shanmin Pang, Jihua Zhu, Jianru Xue:
Look, Listen and Infer. 3911-3919 - Zhi Chen, Wei Yang, Zhenbo Xu, Xike Xie, Liusheng Huang:
DCNet: Dense Correspondence Neural Network for 6DoF Object Pose Estimation in Occluded Scenes. 3929-3937
Poster Session E3: Multimedia Fusion and Embedding & Music, Speech and Audio & Summarization, Analytics and Storytelling
- Xuejing Liu, Liang Li, Shuhui Wang, Zheng-Jun Zha, Dechao Meng, Qingming Huang:
Transferrable Referring Expression Grounding with Concept Transfer and Context Inheritance. 3938-3946 - Yanhui Guo, Xi Zhang, Xiaolin Wu:
Deep Multi-modality Soft-decoding of Very Low Bit-rate Face Videos. 3947-3955 - Yingying Zhang, Quan Fang, Shengsheng Qian, Changsheng Xu:
Multi-modal Multi-relational Feature Aggregation Network for Medical Knowledge Representation Learning. 3956-3965 - Wenqiao Zhang, Siliang Tang, Yanpeng Cao, Jun Xiao, Shiliang Pu, Fei Wu, Yueting Zhuang:
Photo Stream Question Answer. 3966-3975 - Xinhang Song, Haitao Zeng, Sixian Zhang, Luis Herranz, Shuqiang Jiang:
Generalized Zero-shot Learning with Multi-source Semantic Embeddings for Scene Recognition. 3976-3985 - Xia Du, Chi-Man Pun, Zheng Zhang:
A Unified Framework for Detecting Audio Adversarial Examples. 3986-3994 - Kunihiro Miyazaki, Takayuki Uchiba, Scarlett Young, Yuichi Sasaki, Kenji Tanaka:
Emerging Topic Detection on the Meta-data of Images from Fashion Social Media. 3995-4003 - Xin Li, Tianwei Lin, Xiao Liu, Wangmeng Zuo, Chao Li, Xiang Long, Dongliang He, Fu Li, Shilei Wen, Chuang Gan:
Deep Concept-wise Temporal Convolutional Networks for Action Localization. 4004-4012 - Shuang Wu, Shaojing Fan, Zhiqi Shen, Mohan S. Kankanhalli, Anthony K. H. Tung:
Who You Are Decides How You Tell. 4013-4022 - Junyan Wang, Yang Bai, Yang Long, Bingzhang Hu, Zhenhua Chai, Yu Guan, Xiaolin Wei:
Query Twice: Dual Mixture Attention Meta Learning for Video Summarization. 4023-4031 - Kai Niu, Yan Huang, Liang Wang:
Textual Dependency Embedding for Person Search by Language. 4032-4040 - Chenchen Jing, Yuwei Wu, Mingtao Pei, Yao Hu, Yunde Jia, Qi Wu:
Visual-Semantic Graph Matching for Visual Grounding. 4041-4050 - Yi Zheng, Wenda Qin, Derry Wijaya, Margrit Betke:
LAL: Linguistically Aware Learning for Scene Text Recognition. 4051-4059
Poster Session F3: Vision and Language
- Fen Liu, Guanghui Xu, Qi Wu, Qing Du, Wei Jia, Mingkui Tan:
Cascade Reasoning Network for Text-based Visual Question Answering. 4060-4069 - Daizong Liu, Xiaoye Qu, Xiao-Yang Liu, Jianfeng Dong, Pan Zhou, Zichuan Xu:
Jointly Cross- and Self-Modal Graph Attention Network for Query-Based Moment Localization. 4070-4078 - Zijian Zhang, Zhou Zhao, Zhu Zhang, Baoxing Huai, Jing Yuan:
Text-Guided Image Inpainting. 4079-4087 - Mohan Zhang, Qiqi Gao, Jinglu Wang, Henrik Turbell, David Zhao, Jinhui Yu, Yan Lu:
RT-VENet: A Convolutional Network for Real-time Video Enhancement. 4088-4097 - Zhu Zhang, Zhijie Lin, Zhou Zhao, Jieming Zhu, Xiuqiang He:
Regularized Two-Branch Proposal Networks for Weakly-Supervised Moment Retrieval in Videos. 4098-4106 - Miao Zhang, Yu Zhang, Yongri Piao, Beiqi Hu, Huchuan Lu:
Feature Reintegration over Differential Treatment: A Top-down and Adaptive Fusion Network for RGB-D Salient Object Detection. 4107-4115 - Hao Wang, Zheng-Jun Zha, Xuejin Chen, Zhiwei Xiong, Jiebo Luo:
Dual Path Interaction Network for Video Moment Localization. 4116-4124 - Guiyu Tian, Shuai Wang, Jie Feng, Li Zhou, Yadong Mu:
Cap2Seg: Inferring Semantic and Spatial Context from Captions for Zero-Shot Image Segmentation. 4125-4134 - Congcong Zhu, Xiaoqiang Li, Jide Li, Guangtai Ding, Weiqin Tong:
Spatial-Temporal Knowledge Integration: Robust Self-Supervised Facial Landmark Tracking. 4135-4143 - Zengyi Qin, Jinglu Wang, Yan Lu:
Weakly Supervised 3D Object Detection from Point Clouds. 4144-4152 - Fenglin Liu, Xian Wu, Shen Ge, Xiaoyu Zhang, Wei Fan, Yuexian Zou:
Bridging the Gap between Vision and Language Domains for Improved Image Captioning. 4153-4161 - Da Cao, Yawen Zeng, Meng Liu, Xiangnan He, Meng Wang, Zheng Qin:
STRONG: Spatio-Temporal Reinforcement Learning for Cross-Modal Video Moment Localization. 4162-4170 - Heqian Qiu, Hongliang Li, Qingbo Wu, Fanman Meng, Hengcan Shi, Taijin Zhao, King Ngi Ngan:
Language-Aware Fine-Grained Object Representation for Referring Expression Comprehension. 4171-4180 - Xu Yang, Chongyang Gao, Hanwang Zhang, Jianfei Cai:
Hierarchical Scene Graph Encoder-Decoder for Image Paragraph Captioning. 4181-4189
Poster Session G3: Vision and Language
- Yong Wang, Wenkai Zhang, Qing Liu, Zhengyuan Zhang, Xin Gao, Xian Sun:
Improving Intra- and Inter-Modality Visual Relation for Image Captioning. 4190-4198 - Xiaoshuai Sun, Xuying Zhang, Liujuan Cao, Yongjian Wu, Feiyue Huang, Rongrong Ji:
Exploring Language Prior for Mode-Sensitive Visual Attention Modeling. 4199-4207 - Jiacheng Li, Siliang Tang, Juncheng Li, Jun Xiao, Fei Wu, Shiliang Pu, Yueting Zhuang:
Topic Adaptation and Prototype Encoding for Few-Shot Visual Storytelling. 4208-4216 - Anwen Hu, Shizhe Chen, Qin Jin:
ICECAP: Information Concentrated Entity-aware Image Captioning. 4217-4225 - Jiayi Ji, Xiaoshuai Sun, Yiyi Zhou, Rongrong Ji, Fuhai Chen, Jianzhuang Liu, Qi Tian:
Attacking Image Captioning Towards Accuracy-Preserving Target Words Removal. 4226-4234 - Ye Liu, Junsong Yuan, Chang Wen Chen:
ConsNet: Learning Consistency Graph for Zero-Shot Human-Object Interaction Detection. 4235-4243 - Siyuan Pan, Ling Dai, Xuhong Hou, Huating Li, Bin Sheng:
ChefGAN: Food Image Generation from Recipes. 4244-4252 - Fei Liu, Jing Liu, Xinxin Zhu, Richang Hong, Hanqing Lu:
Dual Hierarchical Temporal Convolutional Network with QA-Aware Dynamic Normalization for Video Story Question Answering. 4253-4261 - Omkar Gune, Biplab Banerjee, Subhasis Chaudhuri, Fabio Cuzzolin:
Generalized Zero-Shot Learning using Generated Proxy Unseen Samples and Entropy Separation. 4262-4270 - Zipeng Xu, Fangxiang Feng, Xiaojie Wang, Yushu Yang, Huixing Jiang, Zhongyuan Wang:
Answer-Driven Visual State Estimator for Goal-Oriented Visual Dialogue. 4271-4279 - Xiaoye Qu, Pengwei Tang, Zhikang Zou, Yu Cheng, Jianfeng Dong, Pan Zhou, Zichuan Xu:
Fine-grained Iterative Attention Network for Temporal Language Localization in Videos. 4280-4288 - Zhipu Liu, Lei Zhang, Yang Yang:
Hierarchical Bi-Directional Feature Perception Network for Person Re-Identification. 4289-4298 - Zhongzhou Zhang, Lei Zhang:
Hard Negative Samples Emphasis Tracker without Anchors. 4299-4308 - Yankun Xi, Guoli Yan, Jing Hua, Zichun Zhong:
JointFontGAN: Joint Geometry-Content GAN for Font Generation via Few-Shot Learning. 4309-4317
Poster Session H3: Vision and Language
- Hua Qi, Qing Guo, Felix Juefei-Xu, Xiaofei Xie, Lei Ma, Wei Feng, Yang Liu, Jianjun Zhao:
DeepRhythm: Exposing DeepFakes with Attentional Visual Heartbeat Rhythms. 4318-4327 - Jinglin Liu, Yi Ren, Zhou Zhao, Chen Zhang, Baoxing Huai, Jing Yuan:
FastLR: Non-Autoregressive Lipreading Model with Integrate-and-Fire. 4328-4336 - Jing Wang, Jinhui Tang, Jiebo Luo:
Multimodal Attention with Image Text Spatial Relationship for OCR-Based Image Captioning. 4337-4345 - Yi Zhang, Jitao Sang:
Towards Accuracy-Fairness Paradox: Adversarial Example-based Data Augmentation for Visual Debiasing. 4346-4354 - Botian Shi, Lei Ji, Zhendong Niu, Nan Duan, Ming Zhou, Xilin Chen:
Learning Semantic Concepts and Temporal Alignment for Narrated Video Procedural Captioning. 4355-4363 - Quan Meng, Jiakai Zhang, Qiang Hu, Xuming He, Jingyi Yu:
LGNN: A Context-aware Line Segment Detector. 4364-4372 - Shengyu Zhang, Tan Jiang, Tan Wang, Kun Kuang, Zhou Zhao, Jianke Zhu, Jin Yu, Hongxia Yang, Fei Wu:
DeVLBert: Learning Deconfounded Visio-Linguistic Representations. 4373-4382 - Yu Cheng, Zhe Gan, Yitong Li, Jingjing Liu, Jianfeng Gao:
Sequential Attention GAN for Interactive Image Editing. 4383-4391
Interactive Art Session
- Tiago Martins, João Correia, Sérgio Rebelo, João Bicker, Penousal Machado:
Portraits of No One: An Internet Artwork. 4392-4393 - Ruixue Liu, Shaozu Yuan, Meng Chen, Baoyang Chen, Zhijie Qiu, Xiaodong He:
MaLiang: An Emotion-driven Chinese Calligraphy Artwork Composition System. 4394-4396 - Xiaohui Wang, Xia Liang, Miao Lu, Jingyan Qin:
First Impression: AI Understands Personality. 4397-4398 - Siyu Jin, Jingyan Qin, Wenfa Li:
Draw Portraits by Music: A Music based Image Style Transformation. 4399-4400 - Xiaohui Wang, Xiaoxue Ding, Jinke Li, Jingyan Qin:
Little World: Virtual Humans Accompany Children on Dramatic Performance. 4401-4402 - James She, Carmen Ng, Wadia Sheng:
Keep Running - AI Paintings of Horse Figure and Portrait. 4403-4404 - Siyu Hu, Bo Shui, Siyu Jin, Xiaohui Wang:
AI Mirror: Visualize AI's Self-knowledge. 4405-4406
Brave New Ideas Session
- Tianlang Chen, Wei Xiong, Haitian Zheng, Jiebo Luo:
Image Sentiment Transfer. 4407-4415 - Ali Rostami, Vaibhav Pandey, Nitish Nag, Vesper Wang, Ramesh C. Jain:
Personal Food Model. 4416-4424 - Christian von der Weth, Ashraf M. Abdul, Shaojing Fan, Mohan S. Kankanhalli:
Helping Users Tackle Algorithmic Threats on Social Media: A Multimedia Research Agenda. 4425-4434
Reproducibility Session
- Fan Yu, Dandan Wang, Haonan Wang, Tongwei Ren, Jinhui Tang, Gangshan Wu, Jingjing Chen, Michael Riegler:
Reproducibility Companion Paper: Instance of Interest Detection. 4435-4438 - Xin Wang, Bo Wu, Yueqi Zhong, Wei Hu, Jan Zahálka:
Reproducibility Companion Paper: Outfit Compatibility Prediction and Diagnosis with Multi-Layered Comparison Network. 4439-4443 - Quoc-Tuan Truong, Hady W. Lauw, Martin Aumüller, Naoko Nitta:
Reproducibility Companion Paper: Visual Sentiment Analysis for Review Images with Item-Oriented and User-Oriented CNN. 4444-4447 - Tuan Hoang, Thanh-Toan Do, Ngai-Man Cheung, Michael Riegler, Jan Zahálka:
Reproducibility Companion Paper: Selective Deep Convolutional Features for Image Retrieval. 4448-4452
Open Source Software
- Huaizheng Zhang, Yuanming Li, Yizheng Huang, Yonggang Wen, Jianxiong Yin, Kyle Guan:
MLModelCI: An Automatic Cloud Platform for Efficient MLaaS. 4453-4456 - Huaizheng Zhang, Yuanming Li, Qiming Ai, Yong Luo, Yonggang Wen, Yichao Jin, Ta Nguyen Binh Duong:
Hysia: Serving DNN-Based Video-to-Retail Applications in Cloud. 4457-4460 - Benyi Hu, Ren-Jie Song, Xiu-Shen Wei, Yazhou Yao, Xian-Sheng Hua, Yuehu Liu:
PyRetri: A PyTorch-based Library for Unsupervised Image Retrieval by Deep Convolutional Neural Networks. 4461-4464 - Ralph Gasser, Luca Rossetto, Silvan Heller, Heiko Schuldt:
Cottontail DB: An Open Source Database System for Multimedia Retrieval and Analysis. 4465-4468 - Joseph Bethge, Christian Bartz, Haojin Yang, Christoph Meinel:
BMXNet 2: An Open Source Framework for Low-bit Networks - Reproducing, Understanding, Designing and Showcasing. 4469-4472 - Yuhao Cheng, Wu Liu, Pengrui Duan, Jingen Liu, Tao Mei:
PyAnomaly: A Pytorch-based Toolkit for Video Anomaly Detection. 4473-4476 - Giuseppe Ribezzo, Luca De Cicco, Vittorio Palmisano, Saverio Mascolo:
TAPAS-360°: A Tool for the Design and Experimental Evaluation of 360° Video Streaming Systems. 4477-4480 - Miroslav Kratochvíl, Frantisek Mejzlík, Patrik Veselý, Tomás Soucek, Jakub Lokoc:
SOMHunter: Lightweight Video Search System with SOM-Guided Relevance Feedback. 4481-4484
Demo Session I
- Samah Saeed Baraheem, Trung-Nghia Le, Tam V. Nguyen:
Text-to-Image Synthesis via Aesthetic Layout. 4485-4487 - Zijun Sha, Zelong Zeng, Zheng Wang, Yoichi Natori, Yasuhiro Taniguchi, Shin'ichi Satoh:
Progressive Domain Adaptation for Robot Vision Person Re-identification. 4488-4490 - Paula Viana, Pedro Carvalho, Maria Teresa Andrade, Pieter P. Jonker, Vasileios Papanikolaou, Inês N. Teixeira, Luís Vilaça, José P. Pinto, Tiago Soares da Costa:
Semantic Storytelling Automation: A Context-Aware and Metadata-Driven Approach. 4491-4493 - Yanyi Zhang, Ming Kong, Tianqi Zhao, Wenchen Hong, Qiang Zhu, Fei Wu:
ADHD Intelligent Auxiliary Diagnosis System Based on Multimodal Information Fusion. 4494-4496 - Jounsup Park, Mingyuan Wu, Eric Lee, Klara Nahrstedt, Yash Shah, Arielle Rosenthal, John Murray, Kevin Spiteri, Michael Zink, Ramesh K. Sitaraman:
Video 360 Content Navigation for Mobile HMD Devices. 4497-4499 - Yuanfeng Song, Di Jiang, Xiaoling Huang, Yawen Li, Qian Xu, Raymond Chi-Wing Wong, Qiang Yang:
GoldenRetriever: A Speech Recognition System Powered by Modern Information Retrieval. 4500-4502 - Andrew C. Freeman, Ketan Mayer-Patel:
Integrating Event Camera Sensor Emulator. 4503-4505 - Alex Lee, Chang-Uk Kwak, Jeong-Woo Son, Gyeong-June Hahm, Minho Han, Sun-Joong Kim:
Scene-segmented Video Information Annotation System V2.0. 4506-4508 - Tan Tang, Junxiu Tang, Jiewen Lai, Lu Ying, Peiran Ren, Lingyun Yu, Yingcai Wu:
SmartShots: Enabling Automatic Generation of Videos with Data Visualizations Embedded. 4509-4511
Demo Session II
- Sha Yu, Kevin McGuinness, Patricia Moore, David Azcona, Noel E. O'Connor:
A Smart-Site-Survey System using Image-based 3D Metric Reconstruction and Interactive Panorama Visualization. 4512-4514 - Ning Zhang, Tong Shen, Yue Chen, Wei Zhang, Dan Zeng, Jingen Liu, Tao Mei:
AI-SAS: Automated In-match Soccer Analysis System. 4515-4517 - Maarten Sukel, Stevan Rudinac, Marcel Worring:
Detecting Urban Issues With the Object Detection Kit. 4518-4520 - Yaohua Bu, Weijun Li, Tianyi Ma, Shengqi Chen, Jia Jia, Kun Li, Xiaobo Lu:
Visual-speech Synthesis of Exaggerated Corrective Feedback. 4521-4523 - Gjorgji Strezoski, Lucas Fijen, Jonathan Mitnik, Dániel László, Pieter de Marez Oyens, Yoni Schirris, Marcel Worring:
TindART: A Personal Visual Arts Recommender. 4524-4526 - Dhruv Verma, Kshitij Gulati, Vasu Goel, Rajiv Ratn Shah:
Fashionist: Personalising Outfit Recommendation for Cold-Start Scenarios. 4527-4529 - Xuncheng Liu, Jingyi Wang, Weizhan Zhang, Qinghua Zheng, Xuanya Li:
EmotionTracker: A Mobile Real-time Facial Expression Tracking System with the Assistant of Public AI-as-a-Service. 4530-4532 - Xuanyu Wang, Yang Wang, Yan Shi, Weizhan Zhang, Qinghua Zheng:
AvatarMeeting: An Augmented Reality Remote Interaction System With Personalized Avatars. 4533-4535 - Haolin Ren, Zheng Wang, Zhixiang Wang, Lixiong Chen, Shin'ichi Satoh, Daning Hu:
An Interactive Design for Visualizable Person Re-Identification. 4536-4538
Demo Session III
- Filippo Mameli, Marco Bertini, Leonardo Galteri, Alberto Del Bimbo:
Image and Video Restoration and Compression Artefact Removal Using a NoGAN Approach. 4539-4541 - Wentao Jiang, Si Liu, Chen Gao, Ran He, Bo Li, Shuicheng Yan:
Beautify As You Like. 4542-4544 - Jiawei Zuo, Yue Chen, Linfang Wang, Yingwei Pan, Ting Yao, Ke Wang, Tao Mei:
iDirector: An Intelligent Directing System for Live Broadcast. 4545-4547 - Ali Rostami, Bihao Xu, Ramesh C. Jain:
Multimedia Food Logger. 4548-4549 - Xiaodong Chen, Wu Liu, Xinchen Liu, Yongdong Zhang, Tao Mei:
A Cross-modality and Progressive Person Search System. 4550-4552 - Teo T. Niemirepo, Marko Viitanen, Jarno Vanne:
Binocular Multi-CNN System for Real-Time 3D Pose Estimation. 4553-4555 - Itsuki Hashimoto, Yuanyuan Wang, Yukiko Kawai, Kazutoshi Sumiya:
An Interaction-based Video Viewing Support System using Geographical Relationships. 4556-4558 - Feijie Wu, Ho Yin Yuen, Henry C. B. Chan, Victor C. M. Leung, Wei Cai:
Infinity Battle: A Glance at How Blockchain Techniques Serve in a Serverless Gaming System. 4559-4561 - Ekin Gedik, Hayley Hung:
ConfFlow: A Tool to Encourage New Diverse Collaborations. 4562-4564
Grand Challenge: SMP Challenge
- Xin Lai, Yihong Zhang, Wei Zhang:
HyFea: Winning Solution to Social Media Popularity Prediction for Multimedia Grand Challenge 2020. 4565-4569 - Kai Wang, Penghui Wang, Xin Chen, Qiushi Huang, Zhendong Mao, Yongdong Zhang:
A Feature Generalization Framework for Social Media Popularity Prediction. 4570-4574 - Weilong Chen, Feng Hong, Chenghao Huang, Shaoliang Zhang, Rui Wang, Ruobing Xie, Feng Xia, Leyu Lin, Yanru Zhang, Yan Wang:
Curriculum Learning for Wide Multimedia-Based Transformer with Graph Target Detection. 4575-4579 - Kele Xu, Zhimin Lin, Jianqiao Zhao, Peichang Shi, Wei Deng, Huaimin Wang:
Multimodal Deep Learning for Social Media Popularity Prediction With Attention Mechanism. 4580-4584 - Chih-Chung Hsu, Wen-Hai Tseng, Hao-Ting Yang, Chia-Hsiang Lin, Chi-Hung Kao:
Rethinking Relation between Model Stacking and Recurrent Neural Networks for Social Media Prediction. 4585-4589
Grand Challenge: Video Relation Understanding & Pre-training for Video Captions Challenge
- Wentao Xie, Guanghui Ren, Si Liu:
Video Relation Detection with Trajectory-aware Multi-modal Features. 4590-4594 - Zhipeng Luo, Zhiguang Zhang, Yuehan Yao:
A Strong Baseline for Multiple Object Tracking on VidOR Dataset. 4595-4599 - Yiqing Huang, Qiuyu Cai, Siyu Xu, Jiansheng Chen:
XlanV Model with Adaptively Multi-Modality Feature Fusing for Video Captioning. 4600-4604 - Jingwen Chen, Hongyang Chao:
VideoTRM: Pre-training for Video Captioning Challenge 2020. 4605-4609 - Lanxiao Wang, Chao Shang, Heqian Qiu, Taijin Zhao, Benliu Qiu, Hongliang Li:
Multi-stage Tag Guidance Network in Video Caption. 4610-4614
Grand Challenge: Human Centric Analysis I
- Jinlong Peng, Yueyang Gu, Yabiao Wang, Chengjie Wang, Jilin Li, Feiyue Huang:
Dense Scene Multiple Object Tracking with Box-Plane Matching. 4615-4619 - Ancong Wu, Chengzhi Lin, Bogao Chen, Weihao Huang, Zeyu Huang, Wei-Shi Zheng:
Transductive Multi-Object Tracking in Complex Events by Interactive Self-Training. 4620-4624 - Bing Shuai, Andrew G. Berneshawi, Manchen Wang, Chunhui Liu, Davide Modolo, Xinyu Li, Joseph Tighe:
Application of Multi-Object Tracking with Siamese Track-RCNN to the Human in Events Dataset. 4625-4629 - Shuning Chang, Li Yuan, Xuecheng Nie, Ziyuan Huang, Yichen Zhou, Yupeng Chen, Jiashi Feng, Shuicheng Yan:
Towards Accurate Human Pose Estimation in Videos of Crowded Scenes. 4630-4634 - Lei Yuan, Shu Zhang, Fubiao Feng, Naike Wei, Huadong Pan:
Combined Distillation Pose. 4635-4639
Grand Challenge: Deep Video Understanding & BioMedia
- Fan Yu, Dandan Wang, Beibei Zhang, Tongwei Ren:
Deep Relationship Analysis in Video with Multimodal Feature Fusion. 4640-4644 - Matthias Baumgartner, Luca Rossetto, Abraham Bernstein:
Towards Using Semantic-Web Technologies for Multi-Modal Knowledge Graph Construction. 4645-4649 - Vishal Anand, Raksha Ramesh, Ziyin Wang, Yijing Feng, Jiana Feng, Wenfeng Lyu, Tianle Zhu, Serena Yuan, Ching-Yung Lin:
Story Semantic Relationships from Multimodal Cognitions. 4650-4654 - Steven Alexander Hicks, Vajira Thambawita, Hugo Lewi Hammer, Trine B. Haugen, Jorunn M. Andersen, Oliwia Witczak, Pål Halvorsen, Michael A. Riegler:
ACM Multimedia BioMedia 2020 Grand Challenge Overview. 4655-4658 - Ming Feng, Kele Xu, Yin Wang:
A Quantitative Comparison of Different Machine Learning Approaches for Human Spermatozoa Quality Prediction Using Multimodal Datasets. 4659-4663
Grand Challenge: CitySCENE
- Kun Liu, Minzhi Zhu, Huiyuan Fu, Huadong Ma, Tat-Seng Chua:
Enhancing Anomaly Detection in Surveillance Videos with Transfer Learning from Action Recognition. 4664-4668 - Jie Wu, Yingying Li, Wei Zhang, Yi Wu, Xiao Tan, Hongwu Zhang, Shilei Wen, Errui Ding, Guanbin Li:
Modularized Framework with Category-Sensitive Abnormal Filter for City Anomaly Detection. 4669-4673 - Soumil Kanwal, Vineet Mehta, Abhinav Dhall:
Large Scale Hierarchical Anomaly Detection and Temporal Localization. 4674-4678 - Hui Lv, Chunyan Xu, Zhen Cui:
Global Information Guided Video Anomaly Detection. 4679-4683
Grand Challenge: Human Centric Analysis II
- Li Yuan, Shuning Chang, Ziyuan Huang, Yichen Zhou, Yupeng Chen, Xuecheng Nie, Francis E. H. Tay, Jiashi Feng, Shuicheng Yan:
A Simple Baseline for Pose Tracking in Videos of Crowed Scenes. 4684-4688 - Lumin Xu, Ruihan Xu, Sheng Jin:
HiEve ACM MM Grand Challenge 2020: Pose Tracking in Crowded Scenes. 4689-4693 - Li Yuan, Yichen Zhou, Shuning Chang, Ziyuan Huang, Yupeng Chen, Xuecheng Nie, Tao Wang, Jiashi Feng, Shuicheng Yan:
Toward Accurate Person-level Action Recognition in Videos of Crowed Scenes. 4694-4698 - Yanbin Hao, Zi-Niu Liu, Hao Zhang, Bin Zhu, Jingjing Chen, Yu-Gang Jiang, Chong-Wah Ngo:
Person-level Action Recognition in Complex Events via TSD-TSM Networks. 4699-4702 - Tingtian Li, Zixun Sun, Xiao Chen:
Group-Skeleton-Based Human Action Recognition in Complex Events. 4703-4707
Grand Challenge: AI Meets Beauty
- Jun Yu, Guochen Xie, Mengyan Li, Haonian Xie, Xinlong Hao, Fang Gao, Feng Shuang:
Attention Based Beauty Product Retrieval Using Global and Local Descriptors. 4708-4712 - Runming Yan, Yongchun Lin, Zhichao Deng, Liang Lei, Chudong Xu:
Multi-Feature Fusion Method Based on Salient Object Detection for Beauty Product Retrieval. 4713-4717 - Jingwen Hou, Sijie Ji, Annan Wang:
Attention-driven Unsupervised Image Retrieval for Beauty Products with Visual and Textual Clues. 4718-4722 - Fangxiang Feng, Tianrui Niu, Ruifan Li, Xiaojie Wang, Huixing Jiang:
Learning Visual Features from Product Title for Image Retrieval. 4723-4727 - Toan H. Vu, An Dang, Jia-Ching Wang:
Learning to Remember Beauty Products. 4728-4732 - Kele Xu, Yuzhong Liu, Ming Feng, Jianqiao Zhao, Huaimin Wang, Hengxing Cai:
Multi-Scale Generalized Attention-Based Regional Maximum Activation of Convolutions for Beauty Product Retrieval. 4733-4737
Doctoral Symposium
- Mathieu Febvay:
Low-level Optimizations for Faster Mobile Deep Learning Inference Frameworks. 4738-4742 - Ha Thi Phuong Thao:
Deep Neural Networks for Predicting Affective Responses from Movies. 4743-4747 - Abhinav Shukla:
Learning Self-Supervised Multimodal Representations of Human Behaviour. 4748-4751 - Wen Guo:
Multi-person Pose Estimation in Complex Physical Interactions. 4752-4755
Workshop Summaries
- Raphaël Troncy, Jorma Laaksonen, Hamed R. Tavakoli, Lyndon J. B. Nixon, Vasileios Mezaris, Mohammad Hosseini:
AI4TV 2020: 2nd International Workshop on AI for Smart TV Content Production, Access and Delivery. 4756-4757 - Tanaya Guha, Vlad Hosu, Dietmar Saupe, Bastian Goldlücke, Naveen Kumar, Weisi Lin, Victor R. Martinez, Krishna Somandepalli, Shrikanth Narayanan, Wen-Huang Cheng, Kree McLaughlin, Hartwig Adam, John See, Lai-Kuan Wong:
ATQAM/MAST'20: Joint Workshop on Aesthetic and Technical Quality Assessment of Multimedia and Media Analytics for Societal Trends. 4758-4760 - Xavier Alameda-Pineda, Miriam Redi, Jahna Otterbacher, Nicu Sebe, Shih-Fu Chang:
FATE/MM 20: 2nd International Workshop on Fairness, Accountability, Transparency and Ethics in MultiMedia. 4761-4762 - Wu Liu, Chuang Gan, Jingkuan Song, Dingwen Zhang, Wenbing Huang, John Smith:
HUMA'20: 1st International Workshop on Human-Centric Multimedia Analysis. 4763-4764 - Rainer Lienhart, Thomas B. Moeslund, Hideo Saito:
MMSports'20: 3rd International Workshop on Multimedia Content Analysis in Sports. 4765-4766 - Alex Hauptmann, João Magalhães, Ricardo Gamelas Sousa, João Paulo Costeira:
MuCAI'20: 1st International Workshop on Multimodal Conversational AI. 4767-4768 - Lukas Stappen, Björn W. Schuller, Iulia Lefter, Erik Cambria, Ioannis Kompatsiaris:
Summary of MuSe 2020: Multimodal Sentiment Analysis, Emotion-target Engagement and Trustworthiness Detection in Real-life Media. 4769-4770 - Xinbo Gao, Patrick Le Callet, Jing Li, Zhi Li, Wen Lu, Jiachen Yang:
QoEVMA'20: 1st Workshop on Quality of Experience (QoE) in Visual Multimedia Applications. 4771-4772 - Valérie Gouet-Brunet, Margarita Khokhlova, Ronak Kosti, Liming Chen, Xu-Cheng Yin:
SUMAC 2020: The 2nd Workshop on Structuring and Understanding of Multimedia heritAge Contents. 4773-4774
Tutorials
- Xin Wang, Wenwu Zhu, Yonghong Tian, Wen Gao:
Multimedia Intelligence: When Multimedia Meets Artificial Intelligence. 4775-4776 - Andrea Cavallaro, Mohammad Malekzadeh, Ali Shahin Shamsabadi:
Deep Learning for Privacy in Multimedia. 4777-4778 - Gerald Friedland:
Reproducibility and Experimental Design for Machine Learning on Audio and Multimedia Data. 4779-4781 - Shuqiang Jiang, Weiqing Min:
Food Computing for Multimedia. 4782-4784 - Shayok Chakraborty:
Active Learning for Multimedia Computing: Survey, Recent Trends and Applications. 4785-4786 - Martin Alain, Emin Zerman, Cagri Ozcinar:
Immersive Imaging Technologies: From Capture to Display. 4787-4788 - Zheng Wang, Wu Liu, Yusuke Matsui, Shin'ichi Satoh:
Effective and Efficient: Toward Open-world Instance Re-identification. 4789-4790 - Jen-Tzung Chien:
Deep Bayesian Multimedia Learning. 4791-4793
Panels
- Jiaying Liu, Wen-Huang Cheng, Klara Nahrstedt, Ramesh C. Jain, Elisa Ricci, Hyeran Byun:
Coping with Pandemics: Opportunities and Challenges for AI Multimedia in the "New Normal". 4794-4795 - Susanne Boll, Hari Sundaram, Svetha Venkatesh, Martha A. Larson, Mohan S. Kankanhalli:
The World has Changed - The World Needs to Change. What Multimedia has to Offer for Our Common Digital Future. 4796-4798
Keynote Talks
- Klara Nahrstedt:
360-Video Navigation for 360-Multimedia Delivery Systems: Research Challenges and Opportunities. 4799 - Itamar Friedman:
Cloud Drive Apps - Closing the Gap Between AI Research to Practice. 4800 - Dong Yu:
Building Digital Human. 4801 - Shuicheng Yan:
Neural Network Design for Multimedia: Bio-inspired and Hardware-friendly. 4802