Using AI to Identify Unremarkable Chest Radiographs for Automatic Reporting.

Journal: Radiology
PMID:

Abstract

Background: Radiology practices have a high volume of unremarkable chest radiographs, and artificial intelligence (AI) could possibly improve workflow by providing an automatic report.

Purpose: To estimate the proportion of unremarkable chest radiographs in which AI can correctly exclude pathology (ie, specificity) without increasing diagnostic errors.

Materials and Methods: In this retrospective study, consecutive chest radiographs in unique adult patients (≥18 years of age) were obtained January 1-12, 2020, at four Danish hospitals. Exclusion criteria included insufficient radiology reports or AI output error. Two thoracic radiologists, who were blinded to AI output, labeled chest radiographs as "remarkable" or "unremarkable" based on predefined unremarkable findings (reference standard). Radiology reports were classified similarly. A commercial AI tool was adapted to output a chest radiograph "remarkableness" probability, which was used to calculate specificity at different AI sensitivities. Chest radiographs with findings missed by AI and/or the radiology report were graded by one thoracic radiologist as critical, clinically significant, or clinically insignificant. Paired proportions were compared using the McNemar test.

Results: A total of 1961 patients were included (median age, 72 years [IQR, 58-81 years]; 993 female), with one chest radiograph per patient. The reference standard labeled 1231 of 1961 chest radiographs (62.8%) as remarkable and 730 of 1961 (37.2%) as unremarkable. At 99.9%, 99.0%, and 98.0% sensitivity, the AI had a specificity of 24.5% (179 of 730 radiographs [95% CI: 21, 28]), 47.1% (344 of 730 radiographs [95% CI: 43, 51]), and 52.7% (385 of 730 radiographs [95% CI: 49, 56]), respectively. With the AI fixed to have a sensitivity similar to that of radiology reports (87.2%), the missed findings of AI and reports had 2.2% (27 of 1231 radiographs) and 1.1% (14 of 1231 radiographs) classified as critical (P = .01), 4.1% (51 of 1231 radiographs) and 3.6% (44 of 1231 radiographs) classified as clinically significant (P = .46), and 6.5% (80 of 1231 radiographs) and 8.1% (100 of 1231 radiographs) classified as clinically insignificant (P = .11), respectively. At sensitivities greater than or equal to 95.4%, the AI tool exhibited less than or equal to 1.1% critical misses.

Conclusion: A commercial AI tool used off-label could correctly exclude pathology in 24.5%-52.7% of all unremarkable chest radiographs at greater than or equal to 98% sensitivity. The AI had equal or lower rates of critical misses than radiology reports at sensitivities greater than or equal to 95.4%. These results should be confirmed in a prospective study.

© RSNA, 2024. See also the editorial by Yoon and Hwang in this issue.
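
The specificity figures in the Results are simple proportions of the 730 reference-standard unremarkable radiographs, and they can be recomputed from the quoted counts. The sketch below does this in Python and attaches an approximate 95% interval. It rests on assumptions: the abstract does not state how its confidence intervals were derived, so the normal-approximation (Wald) interval used here may differ slightly from the published bounds. The function names are hypothetical. The exact (binomial) form of the McNemar test is also an assumption, since the abstract names the test but not the variant, and it needs the two discordant-pair counts, which the abstract does not report, so the published P values cannot be reproduced from this text alone.

```python
from math import comb, sqrt

# Counts quoted in the abstract: radiographs correctly excluded by the AI,
# out of 730 reference-standard unremarkable radiographs, keyed by the
# AI sensitivity (%) at which the threshold was fixed.
UNREMARKABLE_TOTAL = 730
TRUE_NEGATIVES = {99.9: 179, 99.0: 344, 98.0: 385}


def proportion_with_wald_ci(successes, total, z=1.96):
    """Return (estimate, lower, upper) using the normal approximation.

    Assumption: the abstract does not name its interval method, so these
    bounds need not equal the published ones.
    """
    p = successes / total
    half_width = z * sqrt(p * (1 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def mcnemar_exact(b, c):
    """Two-sided exact McNemar P value from the two discordant-pair counts.

    b: pairs where only the first reader missed the finding.
    c: pairs where only the second reader missed the finding.
    """
    n = b + c
    if n == 0:
        return 1.0
    # Binomial(n, 0.5) tail probability up to the smaller discordant count.
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    for sensitivity, tn in TRUE_NEGATIVES.items():
        spec, lower, upper = proportion_with_wald_ci(tn, UNREMARKABLE_TOTAL)
        print(f"sensitivity {sensitivity}%: specificity {spec:.1%} "
              f"(approx. 95% CI {lower:.1%} to {upper:.1%})")
```

The point estimates round to the 24.5%, 47.1%, and 52.7% reported above. The McNemar helper is included only to show what "paired proportions" means here: it ignores the radiographs on which AI and report agree and tests whether the discordant pairs split evenly.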

Authors

  • Louis Lind Plesner
    From the Department of Radiology, Herlev and Gentofte Hospital, Borgmester Ib Juuls vej 1, 2730 Herlev, Copenhagen, Denmark (L.L.P., F.C.M., M.W.B., C.H.K., L.C.L., M.B.A.); Faculty of Health Sciences, University of Copenhagen, Copenhagen, Denmark (L.L.P., M.W.B., C.H.K., M.B., M.B.A.); Radiological Artificial Intelligence Testcenter, RAIT.dk, Herlev, Denmark (L.L.P., F.C.M., M.W.B., C.H.K., M.B., M.B.A.); Department of Radiology, Bispebjerg and Frederiksberg Hospital, Copenhagen, Denmark (M.W.B., M.B.); Department of Radiology, Aarhus University Hospital, Aarhus, Denmark (F.R.); and Department of Cardiology, Bispebjerg and Frederiksberg Hospital, Copenhagen, Denmark (O.W.N.).
  • Felix C Müller
    Department of Radiology, Herlev and Gentofte Hospital, Herlev, Denmark. Electronic address: christoph.felix.mueller@regionh.dk.
  • Mathias W Brejnebøl
    Faculty of Health Sciences, University of Copenhagen, Copenhagen, Denmark; Department of Radiology, Bispebjerg and Frederiksberg Hospital, Frederiksberg, Denmark.
  • Christian Hedeager Krag
    Department of Radiology, Herlev and Gentofte Hospital, Herlev, Denmark; Faculty of Health Sciences, University of Copenhagen, Copenhagen, Denmark; Radiological Artificial Intelligence Testcenter, RAIT.dk, Herlev, Denmark.
  • Lene C Laustrup
    Department of Radiology, Herlev and Gentofte Hospital, Borgmester Ib Juuls vej 1, 2730 Herlev, Copenhagen, Denmark.
  • Finn Rasmussen
    Department of Radiology, Aarhus University Hospital, Aarhus, Denmark.
  • Olav Wendelboe Nielsen
    Department of Cardiology, Bispebjerg and Frederiksberg Hospital, Copenhagen, Denmark.
  • Mikael Boesen
    Faculty of Health Sciences, University of Copenhagen, Copenhagen, Denmark; Radiological Artificial Intelligence Testcenter, RAIT.dk, Capital Region of Denmark; Department of Radiology, Bispebjerg and Frederiksberg Hospital, Copenhagen, Denmark.
  • Michael B Andersen
    Department of Radiology, Herlev and Gentofte Hospital, Herlev, Denmark; Faculty of Health Sciences, University of Copenhagen, Copenhagen, Denmark; Radiological Artificial Intelligence Testcenter, RAIT.dk, Capital Region of Denmark.