PathoGraph: A Graph-Based Method for Standardized Representation of Pathology Knowledge.
Journal: Scientific Data
Published Date: May 27, 2025
Abstract
Pathology data, primarily consisting of slides and diagnostic reports, inherently contain knowledge that is pivotal for advancing data-driven biomedical research and clinical practice. However, the hidden and fragmented nature of this knowledge across various data modalities not only hinders its computational utilization, but also impedes the effective integration of AI technologies within the domain of pathology. To systematically organize pathology knowledge for its computational use, we propose PathoGraph, a knowledge representation method that describes pathology knowledge in a graph-based format. PathoGraph can represent: (1) pathological entities' types and morphological features; (2) the composition, spatial arrangements, and dynamic behaviors associated with pathological phenotypes; and (3) the differential diagnostic approaches used by pathologists. By applying PathoGraph to neoplastic diseases, we illustrate its ability to comprehensively and structurally capture multi-scale disease characteristics alongside pathologists' expertise. Furthermore, we validate its computational utility by demonstrating the feasibility of large-scale automated PathoGraph construction, showing performance improvements in downstream deep learning tasks, and presenting two illustrative use cases that highlight its clinical potential. We believe PathoGraph opens new avenues for AI-driven advances in the field of pathology.
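As a rough illustration of what a graph-based description of pathology knowledge might look like, the sketch below stores entities with morphological features as nodes and links them with typed relations for composition, spatial arrangement, and diagnostic reasoning. It is not the PathoGraph schema: the class names, node kinds, relation labels (`composed_of`, `surrounded_by`, `supports_diagnosis`), and the toy example are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node in the toy graph; field names are illustrative, not PathoGraph's."""
    node_id: str
    kind: str  # hypothetical kinds: "entity", "phenotype", "diagnosis"
    features: dict = field(default_factory=dict)  # e.g. morphological features


@dataclass
class Graph:
    """A minimal labeled graph: nodes by id, edges as (source, relation, target)."""
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_node(self, node_id, kind, **features):
        self.nodes[node_id] = Node(node_id, kind, features)

    def add_edge(self, source, relation, target):
        # Reject edges that point at nodes that have not been declared.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added before linking them")
        self.edges.append((source, relation, target))

    def neighbors(self, source, relation):
        """Return targets reachable from `source` via edges labeled `relation`."""
        return [t for s, r, t in self.edges if s == source and r == relation]


# Toy example (hypothetical labels, not taken from the paper).
g = Graph()
g.add_node("tumor_cell", "entity", nuclear_shape="irregular")
g.add_node("stroma", "entity")
g.add_node("gland", "phenotype")
g.add_node("adenocarcinoma", "diagnosis")

g.add_edge("gland", "composed_of", "tumor_cell")           # composition
g.add_edge("gland", "surrounded_by", "stroma")             # spatial arrangement
g.add_edge("gland", "supports_diagnosis", "adenocarcinoma")  # diagnostic link
```

Storing edges as plain triples keeps the sketch short and makes relation-specific queries a one-line filter; a real system would more likely use a graph database or an ontology language with a controlled vocabulary.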