AdaVLN: Towards Visual Language Navigation in Continuous Indoor Environments with Moving Humans

Loh, Dillon; Bednarz, Tomasz; Xia, Xinxing; Guan, Frank

Computer Science > Computer Vision and Pattern Recognition

arXiv:2411.18539 (cs)

[Submitted on 27 Nov 2024]

Abstract: Visual Language Navigation (VLN) is a task that challenges robots to navigate realistic environments by following natural language instructions. While previous research has largely focused on static settings, real-world navigation must often contend with dynamic human obstacles. We therefore propose an extension to the task, termed Adaptive Visual Language Navigation (AdaVLN), which seeks to narrow this gap. AdaVLN requires robots to navigate complex 3D indoor environments populated with dynamically moving human obstacles, adding a layer of complexity that mimics the real world. To support exploration of this task, we also present the AdaVLN simulator and the AdaR2R datasets. The AdaVLN simulator enables easy inclusion of fully animated human models directly into common datasets like Matterport3D. We also introduce a "freeze-time" mechanism for both the navigation task and the simulator, which pauses world state updates during agent inference, enabling fair comparisons and experimental reproducibility across different hardware. We evaluate several baseline models on this task, analyze the unique challenges introduced by AdaVLN, and demonstrate its potential to bridge the sim-to-real gap in VLN research.
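The "freeze-time" idea described in the abstract can be illustrated with a toy loop. The sketch below is not the AdaVLN simulator or its API; `ToyWorld`, `run_episode`, `cautious_policy`, and `with_latency` are hypothetical names, and the 1-D world with a single pacing human is an assumption made purely for illustration. It shows only the core property: the world advances in fixed simulated steps and is never stepped while the policy is computing, so wall-clock inference latency cannot change the outcome.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Tuple

Observation = Tuple[float, float]  # (agent position, human position)


@dataclass
class ToyWorld:
    """Hypothetical 1-D corridor with one agent and one pacing human."""
    dt: float = 0.1        # simulated seconds per world step
    sim_time: float = 0.0  # simulated clock, independent of wall-clock time
    agent_x: float = 0.0
    human_x: float = 5.0
    human_v: float = -1.0  # metres per simulated second

    def step(self, agent_v: float) -> None:
        """Advance agent and human by exactly one fixed time step."""
        self.agent_x += agent_v * self.dt
        self.human_x += self.human_v * self.dt
        # The human turns around at either end of the corridor.
        if self.human_x <= 0.0 or self.human_x >= 5.0:
            self.human_v = -self.human_v
        self.sim_time += self.dt

    def observe(self) -> Observation:
        return (self.agent_x, self.human_x)


def run_episode(policy: Callable[[Observation], float],
                steps_per_action: int = 5,
                n_actions: int = 10) -> List[Observation]:
    """Alternate between frozen inference and fixed-length world updates."""
    world = ToyWorld()
    trajectory: List[Observation] = []
    for _ in range(n_actions):
        obs = world.observe()
        # Freeze-time: world.step() is not called while the policy runs,
        # so the human does not move during inference.
        agent_v = policy(obs)
        for _ in range(steps_per_action):
            world.step(agent_v)
        trajectory.append(world.observe())
    return trajectory


def cautious_policy(obs: Observation) -> float:
    """Stop when the human is within 1 m, otherwise walk forward."""
    agent_x, human_x = obs
    return 0.0 if abs(human_x - agent_x) < 1.0 else 0.5


def with_latency(policy: Callable[[Observation], float],
                 delay_s: float) -> Callable[[Observation], float]:
    """Wrap a policy so that each call takes extra wall-clock time."""
    def slow_policy(obs: Observation) -> float:
        time.sleep(delay_s)  # stands in for slower hardware
        return policy(obs)
    return slow_policy


fast_run = run_episode(cautious_policy)
slow_run = run_episode(with_latency(cautious_policy, 0.005))
```

In this sketch `fast_run` and `slow_run` are identical because the simulated clock only moves inside `world.step`. Without the freeze, a slower model would observe the human at a later position on every decision, and results would depend on the hardware used.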

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
Cite as:	arXiv:2411.18539 [cs.CV]
	(or arXiv:2411.18539v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2411.18539

Submission history

From: Frank Guan
[v1] Wed, 27 Nov 2024 17:36:08 UTC (21,356 KB)
