A Satellite-Ground Synergistic Large Vision-Language Model System for Earth Observation
Journal:
arXiv
Published Date:
Jul 8, 2025
Abstract
Recently, large vision-language models (LVLMs) unleash powerful analysis
capabilities for low Earth orbit (LEO) satellite Earth observation images in
the data center. However, fast satellite motion, brief satellite-ground station
(GS) contact windows, and large size of the images pose a data download
challenge. To enable near real-time Earth observation applications (e.g.,
disaster and extreme weather monitoring), we should explore how to deploy LVLM
in LEO satellite networks, and design SpaceVerse, an efficient satellite-ground
synergistic LVLM inference system. To this end, firstly, we deploy compact
LVLMs on satellites for lightweight tasks, whereas regular LVLMs operate on GSs
to handle computationally intensive tasks. Then, we propose a computing and
communication co-design framework comprised of a progressive confidence network
and an attention-based multi-scale preprocessing, used to identify on-satellite
inferring data, and reduce data redundancy before satellite-GS transmission,
separately. We implement and evaluate SpaceVerse on real-world LEO satellite
constellations and datasets, achieving a 31.2% average gain in accuracy and a
51.2% reduction in latency compared to state-of-the-art baselines.