Improving tabular data extraction in scanned laboratory reports using deep learning models.

Journal: Journal of Biomedical Informatics
Published Date:

Abstract

OBJECTIVE: Medical laboratory testing is essential in healthcare, providing crucial data for diagnosis and treatment. However, patients' laboratory test results are often transferred between healthcare organizations by fax and are therefore not immediately available for timely clinical decision-making. It is thus important to develop new technologies that accurately extract laboratory test information from scanned laboratory reports. This study aims to develop an advanced deep learning-based Optical Character Recognition (OCR) method to identify tables containing laboratory test results in scanned laboratory reports.

Authors

  • Yiming Li
    Department of Cardiology, West China Hospital, Sichuan University, Chengdu 610041, China.
  • Qiang Wei
    School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA.
  • Xinghan Chen
    Department of Management, Policy and Community Health, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA.
  • Jianfu Li
    Mayo Clinic.
  • Cui Tao
    The University of Texas Health Science Center at Houston, USA.
  • Hua Xu
    Department of Urology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China.