Exploring Textual Semantics Diversity for Image Transmission in Semantic Communication Systems using Visual Language Model
Journal: arXiv
Published Date: Mar 25, 2025
Abstract
In recent years, the rapid development of machine learning has brought both
reforms and challenges to traditional communication systems. Semantic
communication has emerged as an effective strategy for extracting relevant
semantic signals, such as semantic segmentation labels and image features, for
image transmission. However, extracting too few semantic features from an
image can result in low reconstruction accuracy, which hinders practical
application and remains a challenging problem. To fill this gap, this letter
proposes a multi-text transmission semantic communication (Multi-SC) system,
which uses a visual language model (VLM) to assist in the transmission of
image semantic signals.
Unlike previous semantic communication systems for image transmission, the
proposed system divides the image into multiple blocks, extracts multiple
pieces of text information from the image using a modified Large Language and
Vision Assistant (LLaVA), and combines semantic segmentation labels with the
semantic text for image recovery. Simulation results show that the proposed
text semantics diversity scheme can significantly improve reconstruction
accuracy compared with related works.
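
The block-wise text extraction described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the grid layout, the block size, or how the modified LLaVA is invoked, so everything below is an assumption made for illustration: the function names `split_into_blocks` and `describe_blocks` are hypothetical, the uniform grid is only one possible partition, and `captioner` is a caller-supplied stand-in for whatever model produces the text for each block.

```python
from typing import Callable, List, Sequence, Tuple

# (top, left, bottom, right); bottom and right are exclusive
Box = Tuple[int, int, int, int]


def split_into_blocks(height: int, width: int, rows: int, cols: int) -> List[Box]:
    """Partition a height x width image into a rows x cols grid of boxes.

    Assumption: a uniform grid. The paper may use a different partition.
    """
    if rows <= 0 or cols <= 0:
        raise ValueError("rows and cols must be positive")
    if rows > height or cols > width:
        raise ValueError("grid is finer than the image resolution")
    boxes = []
    for r in range(rows):
        top = r * height // rows
        bottom = (r + 1) * height // rows
        for c in range(cols):
            left = c * width // cols
            right = (c + 1) * width // cols
            boxes.append((top, left, bottom, right))
    return boxes


def describe_blocks(
    image: Sequence[Sequence[int]],
    rows: int,
    cols: int,
    captioner: Callable[[List[Sequence[int]]], str],
) -> List[Tuple[Box, str]]:
    """Run a caller-supplied captioner on every block; return (box, text) pairs.

    `captioner` is a hypothetical placeholder for the VLM call.
    """
    height, width = len(image), len(image[0])
    results = []
    for top, left, bottom, right in split_into_blocks(height, width, rows, cols):
        # Crop the block out of the row-major image
        block = [row[left:right] for row in image[top:bottom]]
        results.append(((top, left, bottom, right), captioner(block)))
    return results
```

In this sketch each block yields one text string, so a 2 x 3 grid would produce six texts to transmit alongside the segmentation labels. A real system would replace the placeholder `captioner` with the actual model inference.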