OmniV-Med: Scaling Medical Vision-Language Model for Universal Visual Understanding

Journal: arXiv

Published Date: Apr 20, 2025

Abstract

The practical deployment of medical vision-language models (Med-VLMs) necessitates seamless integration of textual data with diverse visual modalities, including 2D/3D images and videos, yet existing models typically employ separate encoders for different modalities. To address this limitation, we present OmniV-Med, a unified framework for multimodal medical understanding. Our technical contributions are threefold: First, we construct OmniV-Med-Instruct, a comprehensive multimodal medical dataset containing 252K instructional samples spanning 14 medical image modalities and 11 clinical tasks. Second, we devise a rotary position-adaptive encoder that processes multi-resolution 2D/3D images and videos within a unified architecture, diverging from conventional modality-specific encoders. Third, we introduce a medical-aware token pruning mechanism that exploits spatial-temporal redundancy in volumetric data (e.g., consecutive CT slices) and medical videos, effectively reducing 60\% of visual tokens without performance degradation. Empirical evaluations demonstrate that OmniV-Med-7B achieves state-of-the-art performance on 7 benchmarks spanning 2D/3D medical imaging and video understanding tasks. Notably, our lightweight variant (OmniV-Med-1.5B) attains comparable performance while requiring only 8 RTX3090 GPUs for training and supporting efficient long-video inference. Data, code and model will be released.

Authors

Songtao Jiang
Yuan Wang
Sibo Song
Yan Zhang
Zijie Meng
Bohan Lei
Jian Wu
Jimeng Sun
Zuozhu Liu

External Resources

View on arXiv arXiv (http://arxiv.org/abs/2504.14692v1)

OmniV-Med: Scaling Medical Vision-Language Model for Universal Visual Understanding

Abstract

Authors

Categories

External Resources

Popular Topics

Recent Journals

OmniV-Med: Scaling Medical Vision-Language Model for Universal Visual Understanding

Abstract

Authors

Categories

External Resources

Stay Ahead of Medical AI

Popular Topics

Recent Journals