Enhancing Visual Analysis in Person Re-Identification With Vision-Language Models.

Journal: IEEE Computer Graphics and Applications

Published Date: Jul 28, 2025

Abstract

Image-based person re-identification aims to match individuals across multiple cameras. Despite advances in machine learning, the effectiveness of re-identification models in real-world scenarios remains limited, often leaving users to handle fine-grained matching manually. Recent work has explored textual information as an auxiliary cue, but existing methods generate coarse descriptions and fail to integrate them effectively into retrieval workflows. To address these issues, we adopt a vision-language model fine-tuned with domain-specific knowledge to generate detailed textual descriptions and keywords for pedestrian images. We then create a joint search space combining visual and textual information, using image clustering and keyword co-occurrence to build a semantic layout. Additionally, we introduce a dynamic spiral word cloud algorithm to improve visual presentation and enhance semantic associations. Finally, we present case studies, a user study, and expert feedback that demonstrate the usability and effectiveness of our system.

Authors

Wang Xia
Tianci Wang

From the Department of Diagnostic and Interventional Radiology (F.K., G.M.F., T.W., S.T.A., C.K., S.N., D.T.) and Department of Medicine III (J.N.K.), University Hospital Aachen, Pauwelsstraße 30, 52074 Aachen, Germany; Physics of Molecular Imaging Systems, Institute of Experimental Molecular Imaging (T.H.), and Institute of Imaging and Computer Vision (J.S.), RWTH Aachen University, Aachen, Germany; Ocumeda, Munich, Germany (C.H.); Department of Diagnostic and Interventional Radiology, Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany (K.B.); Else Kroener Fresenius Center for Digital Health, Medical Faculty Carl Gustav Carus, Technical University Dresden, Dresden, Germany (J.N.K.); Division of Pathology and Data Analytics, Leeds Institute of Medical Research at St James's, University of Leeds, Leeds, UK (J.N.K.); and Department of Medical Oncology, National Center for Tumor Diseases, University Hospital Heidelberg, Heidelberg, Germany (J.N.K.).
Jiawei Li

School of Chemistry & Chemical Engineering, College of Guangling, Yangzhou University, Yangzhou 225002, PR China. zhuxiashi@sina.com
Guodao Sun

College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, 310000, Zhejiang, China. Electronic address: guodao@zjut.edu.cn.
Haidong Gao
Xu Tan

School of Humanities, Media and Design, Hangzhou Dianzi University, Hangzhou, China.
Ronghua Liang

External Resources

PubMed ID: 40720277
