Visual Evolutionary Optimization on Combinatorial Problems with Multimodal Large Language Models: A Case Study of Influence Maximization
Journal: arXiv
Published Date: May 11, 2025
Abstract
Graph-structured combinatorial problems in complex networks are prevalent in
many domains, and are computationally demanding due to their complexity and
non-linear nature. Traditional evolutionary algorithms (EAs), while robust,
are often limited by content-shallow encodings and a lack of structural
awareness, and need hand-crafted modifications to be applied effectively. In
this work, we introduce a new framework, Visual
Evolutionary Optimization (VEO), leveraging multimodal large language models
(MLLMs) as the backbone evolutionary optimizer in this context. Specifically,
we propose a context-aware encoding scheme that represents a solution on the
network as an image. This lets us use MLLMs' image-processing capabilities to
comprehend network configurations intuitively, enabling machines to solve
these problems in a human-like way. We have developed
MLLM-based operators tailored for various evolutionary optimization stages,
including initialization, crossover, and mutation. Furthermore, we propose
graph sparsification as a way to improve the applicability and scalability of
VEO on large-scale networks, exploiting the scale-free nature of real-world
networks. We demonstrate the effectiveness of our method using a well-known
task in complex networks, influence maximization, and validate it on eight
different real-world networks of various structures. The results confirm
VEO's reliability and its improved effectiveness over traditional
evolutionary optimization.
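
To make the optimization loop described in the abstract concrete, the following is a minimal Python sketch of evolutionary influence maximization. It is an illustration under stated assumptions, not the authors' implementation. The diffusion model (independent cascade with a uniform probability p), the function names (ic_spread, crossover, mutate, evolve), all parameter values, and the toy graph are hypothetical. The crossover and mutation functions are simple random stand-ins for the MLLM-based operators, and the image encoding and graph sparsification steps are omitted entirely.

```python
import random

def ic_spread(adj, seeds, p=0.1, runs=200, rng=None):
    """Monte Carlo estimate of expected spread (assumed independent cascade model)."""
    rng = rng or random.Random(0)
    total = 0
    for _ in range(runs):
        active = set(seeds)
        frontier = list(seeds)
        while frontier:
            nxt = []
            for u in frontier:
                for v in adj[u]:
                    # Each newly active node gets one chance to activate each neighbor.
                    if v not in active and rng.random() < p:
                        active.add(v)
                        nxt.append(v)
            frontier = nxt
        total += len(active)
    return total / runs

def crossover(a, b, k, rng):
    """Random stand-in for an MLLM crossover: draw k seeds from the parents' union."""
    pool = sorted(set(a) | set(b))
    return frozenset(rng.sample(pool, k))

def mutate(seeds, nodes, rng):
    """Random stand-in for an MLLM mutation: swap one seed for a non-seed node."""
    seeds = set(seeds)
    candidates = [n for n in nodes if n not in seeds]
    seeds.remove(rng.choice(sorted(seeds)))
    seeds.add(rng.choice(candidates))
    return frozenset(seeds)

def evolve(adj, k, pop_size=8, generations=10, seed=0):
    """Elitist evolutionary search over seed sets of size k (requires k < number of nodes)."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    # A fixed evaluation seed makes fitness deterministic, so comparisons are consistent.
    fitness = lambda s: ic_spread(adj, s, rng=random.Random(1))
    pop = [frozenset(rng.sample(nodes, k)) for _ in range(pop_size)]
    for _ in range(generations):
        children = []
        for _ in range(pop_size):
            a, b = rng.sample(pop, 2)
            children.append(mutate(crossover(a, b, k, rng), nodes, rng))
        # Keep the best pop_size individuals among parents and children.
        pop = sorted(pop + children, key=fitness, reverse=True)[:pop_size]
    return pop[0], fitness(pop[0])

# Toy undirected graph (hypothetical): two stars with hubs 0 and 10, joined by one edge.
adj = {i: [] for i in range(20)}
for hub, leaves in ((0, range(1, 10)), (10, range(11, 20))):
    for v in leaves:
        adj[hub].append(v)
        adj[v].append(hub)
adj[0].append(10)
adj[10].append(0)

best, score = evolve(adj, k=2)
print(sorted(best), round(score, 2))
```

In a VEO-style setup, the bodies of crossover and mutate would presumably be replaced by calls that render the parent solutions as network images and ask an MLLM to propose the offspring. The surrounding loop and fitness evaluation would stay structurally the same.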