FPGDKG 1.0: An Integrated Facial Phenotype-Gene-Disease Knowledge Graph for Rare Disease Diagnosis and Explanation.

Journal: IEEE Journal of Biomedical and Health Informatics

Abstract

Rare diseases pose significant diagnostic challenges due to their low prevalence, limited clinical awareness, and pronounced phenotypic heterogeneity. Early and accurate diagnosis is essential but remains difficult, especially in resource-limited settings where comprehensive genetic testing is unavailable. Distinctive facial phenotypes can offer accessible diagnostic clues, yet overlapping features and broad phenotypic spectra often hinder precise identification. To address these challenges, we present the Facial Phenotype-Gene-Disease Knowledge Graph (FPGDKG), a unified resource integrating multi-source data on facial phenotypes, genes, and diseases. The knowledge graph comprises 23,096 nodes and 239,236 relationships. We demonstrate the utility of FPGDKG through three representative use cases: (1) phenotype-based automated diagnosis of rare diseases using machine learning models; (2) explainable diagnosis that jointly presents phenotype, genotype, and literature evidence for each prediction, with quantitatively validated evidence accuracies of 73.67% for phenotype, 59.57% for gene, and 90.59% for literature evidence; and (3) embedding-based matching to support differential diagnosis of ultra-rare diseases. To facilitate clinical use and research, we also developed an interactive online platform that offers intuitive visualization, information retrieval, and explainable decision support (http://bioinf.org.cn:8060/). Together, these use cases show that FPGDKG supports promising diagnostic performance and enhances explainability by providing multi-dimensional evidence, making it a valuable tool for transparent, data-driven rare disease diagnosis.
