Medical visual question answering: A survey.

Journal: Artificial intelligence in medicine
Published Date:

Abstract

Medical Visual Question Answering (VQA) combines medical artificial intelligence with the popular VQA challenge. Given a medical image and a clinically relevant question in natural language, a medical VQA system is expected to predict a plausible and convincing answer. Although general-domain VQA has been studied extensively, medical VQA still requires dedicated investigation because of its task-specific characteristics. In the first part of this survey, we collect the medical VQA datasets that are publicly available to date and discuss their data sources, data quantities, and task features. In the second part, we review the approaches used in medical VQA tasks, summarizing and discussing their techniques, innovations, and potential improvements. In the last part, we analyze challenges specific to the medical domain and discuss future research directions. Our goal is to provide comprehensive and useful information for researchers interested in medical visual question answering and to encourage further research in this field.

Authors

  • Zhihong Lin
    Faculty of Engineering, Monash University, Clayton, VIC, 3800, Australia. Electronic address: zhihong.lin@monash.edu.
  • Donghao Zhang
  • Qingyi Tao
    NVIDIA AI Technology Center, 038988, Singapore. Electronic address: qtao002@e.ntu.edu.sg.
  • Danli Shi
    State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China.
  • Gholamreza Haffari
    Faculty of Information Technology, Monash University, Clayton, Australia.
  • Qi Wu
    Endoscopy Center, Peking University Cancer Hospital and Institute, Beijing, China.
  • Mingguang He
    State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou 510060, China; Centre for Eye Research Australia; Departments of Ophthalmology and Surgery, University of Melbourne, Melbourne, Australia. Electronic address: mingguang.he@unimelb.edu.au.
  • Zongyuan Ge
    AIM for Health Lab, Faculty of IT, Monash University, Clayton, Victoria, Australia; Monash-Airdoc Research Lab, Faculty of IT, Monash University, Clayton, Victoria, Australia.