Use GPT-J Prompt Generation with RoBERTa for NER Models on Diagnosis Extraction of Periodontal Diagnosis from Electronic Dental Records.

Journal: AMIA ... Annual Symposium proceedings. AMIA Symposium

Abstract

This study explored the usability of prompt generation for named entity recognition (NER) tasks and how performance varies across prompt settings. Prompts generated by GPT-J models were used in two ways: to test directly against the gold standard, and to generate seed data that was then fed to a RoBERTa model trained with the spaCy package. In the direct test, a lower ratio of negative examples combined with a higher number of examples in the prompt achieved the best results, with an F1 score of 0.72. After training with the RoBERTa model, performance was consistent across all settings, with F1 scores of 0.92-0.97. The study highlighted the importance of seed quality rather than quantity in feeding NER models. This research reports an efficient and accurate way to mine clinical notes for periodontal diagnoses, allowing researchers to build an NER model easily and quickly with the prompt generation approach.
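The two prompt settings named in the abstract (number of examples and ratio of negative examples) and the F1 metric can be illustrated with a minimal Python sketch. It is not the authors' implementation: the "Note:/Diagnosis:" prompt layout, the function names build_prompt and f1_score, and the exact-match span scoring are all assumptions made for illustration, and no GPT-J or spaCy call is shown.

```python
import random
from typing import List, Optional, Set, Tuple

# Hypothetical few-shot example: (note text, diagnosis string),
# with None as the diagnosis for a negative example.
Example = Tuple[str, Optional[str]]
# Hypothetical entity span: (start offset, end offset, label).
Span = Tuple[int, int, str]


def build_prompt(
    positives: List[Example],
    negatives: List[Example],
    n_examples: int,
    negative_ratio: float,
    query: str,
    seed: int = 0,
) -> str:
    """Assemble a few-shot prompt with a chosen share of negative examples.

    The prompt layout is an assumption; the source does not specify it.
    """
    rng = random.Random(seed)
    n_neg = round(n_examples * negative_ratio)
    n_pos = n_examples - n_neg
    if n_pos > len(positives) or n_neg > len(negatives):
        raise ValueError("not enough examples for the requested setting")
    chosen = rng.sample(positives, n_pos) + rng.sample(negatives, n_neg)
    rng.shuffle(chosen)
    blocks = [
        f"Note: {text}\nDiagnosis: {label if label is not None else 'none'}"
        for text, label in chosen
    ]
    # The final block is left open for the language model to complete.
    blocks.append(f"Note: {query}\nDiagnosis:")
    return "\n\n".join(blocks)


def f1_score(gold: Set[Span], predicted: Set[Span]) -> float:
    """Entity-level F1 using exact span-and-label matching (an assumption)."""
    true_positives = len(gold & predicted)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)
```

Varying n_examples and negative_ratio in build_prompt corresponds to the prompt settings the abstract compares, and f1_score is the kind of metric behind the reported 0.72 and 0.92-0.97 figures, although the study's exact matching criteria are not stated here.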

Authors

  • Yao-Shun Chuang
    McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, United States.
  • Xiaoqian Jiang
    School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA.
  • Chun-Teh Lee
    Department of Periodontics and Dental Hygiene, The University of Texas Health Science Center at Houston School of Dentistry, Houston, Texas, USA.
  • Ryan Brandon
    Department of Oral Health Sciences, Temple University Kornberg School of Dentistry, Philadelphia, Pennsylvania, USA.
  • Duong Tran
    Diagnostic and Biomedical Sciences, The University of Texas Health Science Center at Houston School of Dentistry, Houston, Texas, USA.
  • Oluwabunmi Tokede
    Oral Healthcare Quality and Safety, The University of Texas Health Science Center at Houston School of Dentistry, Houston, Texas, USA.
  • Muhammad F Walji
    Department of Diagnostic and Biomedical Sciences, The University of Texas Health Science Center at Houston School of Dentistry, Houston, Texas, USA.