Evidence-based diagnostic reasoning with multi-agent copilot for human pathology
Journal:
arXiv
Published Date:
Jun 26, 2025
Abstract
Pathology is experiencing rapid digital transformation driven by whole-slide
imaging and artificial intelligence (AI). While deep learning-based
computational pathology has achieved notable success, traditional models
primarily focus on image analysis without integrating natural language
instruction or rich, text-based context. Current multimodal large language
models (MLLMs) in computational pathology face limitations, including
insufficient training data, inadequate support and evaluation for multi-image
understanding, and a lack of autonomous diagnostic reasoning capabilities. To
address these limitations, we introduce PathChat+, a new MLLM specifically
designed for human pathology, trained on over 1 million diverse,
pathology-specific instruction samples and nearly 5.5 million question-answer
turns. Extensive evaluations across diverse pathology benchmarks demonstrated
that PathChat+ substantially outperforms the prior PathChat copilot, as well as
both state-of-the-art (SOTA) general-purpose and other pathology-specific
models. Furthermore, we present SlideSeek, a reasoning-enabled multi-agent AI
system leveraging PathChat+ to autonomously evaluate gigapixel whole-slide
images (WSIs) through iterative, hierarchical diagnostic reasoning, reaching
high accuracy on DDxBench, a challenging open-ended differential diagnosis
benchmark, while also generating visually grounded,
human-interpretable summary reports.