"B-VLLM: A Vision Large Language Model with Balanced Spatio-Temporal Tokens."

Zhuqiang Lu et al. (2024)

DOI: 10.48550/ARXIV.2412.09919

access: open

type: Informal or Other Publication

metadata version: 2025-01-20