VLTinT: Visual-Linguistic Transformer-in-Transformer for Coherent Video Paragraph Captioning
Jan 1, 2023
Kashu Yamazaki*
Khoa Vo*
Sang Truong
Bhiksha Raj
Ngan Le
Image credit: Unsplash
Publication
AAAI (2023)