Fine-tuning of language models for automated structuring of medical exam reports to improve patient screening and analysis.

Journal: Scientific Reports
Published Date:

Abstract

The analysis of medical imaging reports is labour-intensive but crucial for accurate diagnosis and effective patient screening. Often presented as unstructured text, these reports require systematic organisation for efficient interpretation. This study applies Natural Language Processing (NLP) techniques tailored for European Portuguese to automate the analysis of cardiology reports, streamlining patient screening. The MediAlbertina PT-PT language model was fine-tuned using a methodology involving tokenisation, part-of-speech tagging and manual annotation, achieving 96.13% accuracy in entity recognition. The system enables rapid identification of conditions such as aortic stenosis through an interactive interface, substantially reducing the time and effort required for manual review. It also facilitates patient monitoring and disease quantification, optimising healthcare resource allocation. This research highlights the potential of NLP tools in Portuguese healthcare contexts, demonstrating their applicability to medical report analysis and their broader relevance in improving efficiency and decision-making in diverse clinical environments.
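
As an illustration only, the two ideas in the abstract (scoring entity recognition against manual annotation, and screening structured reports for a condition) can be sketched as below. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say how accuracy was computed, so token-level accuracy is assumed here, and the names `token_accuracy`, `screen_reports`, the `CONDITION` label and the report layout are all hypothetical.

```python
from typing import Dict, List

# Hypothetical structured output: one dict per report, mapping an entity
# label to the text spans an entity recognition model extracted for it.
Report = Dict[str, List[str]]


def token_accuracy(gold: List[str], predicted: List[str]) -> float:
    """Fraction of tokens whose predicted label equals the annotated label.

    Assumption: accuracy is measured per token; the source does not specify.
    """
    if len(gold) != len(predicted):
        raise ValueError("label sequences must have the same length")
    if not gold:
        return 0.0
    matches = sum(g == p for g, p in zip(gold, predicted))
    return matches / len(gold)


def screen_reports(reports: Dict[str, Report], label: str, term: str) -> List[str]:
    """Return sorted IDs of reports with an entity under `label` containing `term`."""
    needle = term.casefold()  # case-insensitive substring match
    return sorted(
        report_id
        for report_id, entities in reports.items()
        if any(needle in span.casefold() for span in entities.get(label, []))
    )


if __name__ == "__main__":
    # Made-up example data, for demonstration only.
    gold = ["O", "B-COND", "I-COND", "O"]
    pred = ["O", "B-COND", "O", "O"]
    print(token_accuracy(gold, pred))

    reports = {
        "r2": {"CONDITION": ["Estenose aórtica grave"]},
        "r1": {"CONDITION": ["insuficiência mitral"]},
    }
    print(screen_reports(reports, "CONDITION", "estenose aórtica"))
```

Matching on extracted entities rather than raw text is the point of structuring the reports first: a screening query then only sees spans the model labelled as conditions, not every mention of the term in free text.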

Authors

  • Luis B Elvas
    Department of Logistics, Molde University College, Molde, 6410, Norway; ISTAR, Instituto Universitário de Lisboa (ISCTE-IUL), 1649-026, Lisbon, Portugal; Inov Inesc Inovação - Instituto de Novas Tecnologias, 1000-029, Lisbon, Portugal. Electronic address: luis.m.elvas@himolde.no.
  • Rafaela Santos
    ISTAR, Instituto Universitário de Lisboa (ISCTE-IUL), 1649-026, Lisbon, Portugal.
  • João C Ferreira
    Department of Logistics, Molde University College, Molde, 6410, Norway; ISTAR, Instituto Universitário de Lisboa (ISCTE-IUL), 1649-026, Lisbon, Portugal; Inov Inesc Inovação - Instituto de Novas Tecnologias, 1000-029, Lisbon, Portugal.