"Video-XL: Extra-Long Vision Language Model for Hour-Scale Video Understanding."

Yan Shu et al. (2024)

DOI: 10.48550/ARXIV.2409.14485

access: open

type: Informal or Other Publication

metadata version: 2024-10-15