UAV-VLRR: Vision-Language Informed NMPC for Rapid Response in UAV Search and Rescue
Journal:
arXiv
Published Date:
Mar 4, 2025
Abstract
Emergency search and rescue (SAR) operations often require rapid and precise
target identification in complex environments where traditional manual drone
control is inefficient. To address such scenarios, this work develops UAV-VLRR
(Vision-Language-Rapid-Response), a rapid SAR system with two components: 1) a
multimodal system that combines a Visual Language Model (VLM) with the natural
language processing capabilities of ChatGPT-4o, a large language model (LLM),
for scene interpretation; 2) a nonlinear model predictive controller (NMPC)
with built-in obstacle avoidance that lets the drone respond rapidly by flying
according to the output of the multimodal system. This work aims to improve
response times in emergency SAR operations by giving the operator a more
intuitive and natural way to plan the SAR mission while allowing the drone to
carry out that mission in a rapid and
safe manner. In our tests, the approach was on average 33.75% faster than an
off-the-shelf autopilot and 54.6% faster than a human pilot. Video of
UAV-VLRR: https://youtu.be/KJqQGKKt1xY
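
To make the idea of predictive control with built-in obstacle avoidance more concrete, the following is a minimal toy sketch of a receding-horizon loop. It is not the formulation used in this work: it assumes a 2D point-mass drone commanded by velocity, a single circular obstacle, and an exhaustive search over a few constant-velocity candidates in place of a numerical NMPC solver. All names (rollout_cost, plan_step, fly) and all parameter values are hypothetical and chosen only for illustration.

```python
import math

def rollout_cost(pos, vel, goal, obstacle, radius, horizon, dt):
    """Predict a constant-velocity trajectory and sum its stage costs."""
    x, y = pos
    cost = 0.0
    for _ in range(horizon):
        x += vel[0] * dt
        y += vel[1] * dt
        # Tracking term: distance from the predicted position to the goal.
        cost += math.hypot(goal[0] - x, goal[1] - y)
        # Obstacle term: large soft penalty inside the safety radius.
        if math.hypot(obstacle[0] - x, obstacle[1] - y) < radius:
            cost += 1e3
    return cost

def plan_step(pos, goal, obstacle, radius=1.0, speed=1.0,
              horizon=10, dt=0.2, n_dirs=36):
    """Return the cheapest candidate velocity over the prediction horizon."""
    candidates = [
        (speed * math.cos(2 * math.pi * k / n_dirs),
         speed * math.sin(2 * math.pi * k / n_dirs))
        for k in range(n_dirs)
    ]
    return min(
        candidates,
        key=lambda v: rollout_cost(pos, v, goal, obstacle, radius, horizon, dt),
    )

def fly(start, goal, obstacle, radius=1.0, dt=0.2, max_steps=200, tol=0.3):
    """Receding horizon: re-plan at every step, apply only the first move."""
    pos = start
    path = [pos]
    for _ in range(max_steps):
        if math.hypot(goal[0] - pos[0], goal[1] - pos[1]) < tol:
            break
        v = plan_step(pos, goal, obstacle, radius=radius, dt=dt)
        pos = (pos[0] + v[0] * dt, pos[1] + v[1] * dt)
        path.append(pos)
    return path

if __name__ == "__main__":
    # Hypothetical scenario: obstacle placed on the straight line to the goal.
    path = fly(start=(0.0, 0.0), goal=(6.0, 0.0), obstacle=(3.0, 0.0))
    print(len(path), path[-1])
```

In this sketch the goal passed to fly stands in for whatever target the multimodal system would output. Only the first move of each plan is applied before re-planning, which is the receding-horizon principle that NMPC shares; a real NMPC would optimize a full control sequence under the drone's nonlinear dynamics and constraints rather than searching a fixed set of headings.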