CodeGraphVLP: Code-as-Planner Meets Semantic-Graph State for Non-Markovian Vision-Language-Action Models
SemLT3D: Semantic-Guided Expert Distillation for Camera-only Long-Tailed 3D Object Detection
SlotVLA: Towards Modeling of Object-Relation Representations in Robotic Manipulation
Clutter-Robust Vision-Language-Action Models through Object-Centric and Geometry Grounding
Rethinking Progression of Memory State in Robotic Manipulation: An Object-Centric Perspective
Dr. Ngan Le has received a prestigious 2025 NSF Faculty Early Career Development (CAREER) Award
CT-ScanGaze: A Dataset and Baselines for 3D Volumetric Scanpath Modeling
The Doctor is in, officially Dr. Khoa Vo!
HENASY: Learning to Assemble Scene-Entities for Egocentric Video-Language Model
Open-Fusion: Real-time Open-Vocabulary 3D Mapping and Queryable Scene Representation