Toward Cross-Hospital Deployment of Natural Language Processing Systems: Model Development and Validation of Fine-Tuned Large Language Models for Disease Name Recognition in Japanese.

Journal: JMIR medical informatics
Published Date:

Abstract

BACKGROUND: Disease name recognition is a fundamental task in clinical natural language processing, enabling the extraction of critical patient information from electronic health records. While recent advances in large language models (LLMs) have shown promise, most evaluations have focused on English, and little is known about their robustness in low-resource languages such as Japanese. In particular, it has not been thoroughly investigated whether these models perform reliably on previously unseen in-hospital data, which differs from the training data in writing style and clinical context.

Authors

  • Seiji Shimizu
    Nara Institute of Science and Technology, 8916-5, Takayama-cho, Ikoma-shi, Nara, 630-0192, Japan, 81 743-72-5250.
  • Tomohiro Nishiyama
    Division of Information Science, Graduate School of Science and Technology, Nara Institute of Science and Technology, Nara, Japan.
  • Hiroyuki Nagai
    Nara Institute of Science and Technology, 8916-5, Takayama-cho, Ikoma-shi, Nara, 630-0192, Japan, 81 743-72-5250.
  • Shoko Wakamiya
    Nara Institute of Science and Technology (NAIST), Japan.
  • Eiji Aramaki
    Nara Institute of Science and Technology (NAIST), Japan.