VLTinT: Visual-Linguistic Transformer-in-Transformer for Coherent Video Paragraph Captioning

Jan 1, 2023
Kashu Yamazaki*, Khoa Vo*, Sang Truong, Bhiksha Raj, Ngan Le
Publication
AAAI (2023)