Fine-tuning large language models for rare disease concept normalization.

Journal: Journal of the American Medical Informatics Association : JAMIA
PMID:

Abstract

OBJECTIVE: We aim to develop a novel method for rare disease concept normalization by fine-tuning Llama 2, an open-source large language model (LLM), using a domain-specific corpus sourced from the Human Phenotype Ontology (HPO).
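
As a rough illustration of what a domain-specific corpus derived from HPO might look like, the sketch below turns term names and synonyms into prompt/completion pairs that map a surface form to its HPO identifier. This is a minimal sketch under stated assumptions, not the authors' pipeline: the prompt template, the helper names `build_pairs` and `to_jsonl`, and the JSONL output layout are all hypothetical, and the two terms shown are an abbreviated excerpt whose synonym lists should be checked against an actual HPO release.

```python
import json

# Illustrative excerpt only (assumption): a real corpus would be parsed
# from an HPO release file rather than hard-coded, and synonym lists
# here are abbreviated.
HPO_TERMS = [
    {
        "id": "HP:0001250",
        "name": "Seizure",
        "synonyms": ["Seizures", "Epileptic seizure"],
    },
    {
        "id": "HP:0000252",
        "name": "Microcephaly",
        "synonyms": ["Decreased head circumference"],
    },
]

# Hypothetical prompt wording; the source does not specify a template.
PROMPT_TEMPLATE = "What is the HPO ID for the phenotype term '{term}'?"


def build_pairs(terms):
    """Create one prompt/completion pair per name or synonym of each term."""
    pairs = []
    for term in terms:
        for surface in [term["name"], *term["synonyms"]]:
            pairs.append(
                {
                    "prompt": PROMPT_TEMPLATE.format(term=surface),
                    "completion": term["id"],
                }
            )
    return pairs


def to_jsonl(pairs):
    """Serialize pairs as JSON Lines, a common input format for fine-tuning."""
    return "\n".join(json.dumps(pair) for pair in pairs)


if __name__ == "__main__":
    print(to_jsonl(build_pairs(HPO_TERMS)))
```

Including synonyms alongside the preferred name gives the model several surface forms per identifier, which is the core of concept normalization; how well this generalizes to unseen phrasings would need to be evaluated separately.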

Authors

  • Andy Wang
    Peddie School, Hightstown, NJ 08520, United States.
  • Cong Liu
    Department of Bioengineering, University of Illinois at Chicago, 851 S Morgan St, Chicago, IL 60607, United States.
  • Jingye Yang
    Department of Mathematics, University of Pennsylvania, Philadelphia, PA 19104, United States.
  • Chunhua Weng
    Department of Biomedical Informatics, Columbia University.