Feasibility of improving vocal fold pathology image classification with synthetic images generated by DDPM-based GenAI: a pilot study.

Journal: European archives of oto-rhino-laryngology : official journal of the European Federation of Oto-Rhino-Laryngological Societies (EUFOS) : affiliated with the German Society for Oto-Rhino-Laryngology - Head and Neck Surgery
Published Date:

Abstract

BACKGROUND: Voice disorders (VD) are often linked to vocal fold structural pathologies (VFSPs). Laryngeal imaging plays a vital role in assessing VFSPs and VD in clinical and research settings, but scarce and imbalanced datasets can limit the generalizability of findings. Denoising Diffusion Probabilistic Models (DDPMs), a subtype of Generative AI, have gained attention for their ability to generate high-quality, realistic synthetic images to address these challenges.

Authors

  • Iman Khazrak
    Department of Computer Science, Bowling Green State University, Bowling Green, OH, 43403, USA. ikhazra@bgsu.edu.
  • Shahryar Zainaee
    Department of Communication Sciences and Disorders, Bowling Green State University, Bowling Green, OH, 43403, USA.
  • Mostafa M Rezaee
    Department of Computer Science, Bowling Green State University, Bowling Green, OH, 43403, USA.
  • Mehran Ghasemi
    Department of Communication Sciences and Disorders, Bowling Green State University, Bowling Green, OH, 43403, USA.
  • Robert C Green
    Department of Computer Science, Bowling Green State University, Bowling Green, OH, 43403, USA.
