General retinal image enhancement via reconstruction: Bridging distribution shifts using latent diffusion adaptors.

Journal: Medical Image Analysis
Published Date:

Abstract

Deep learning-based fundus image enhancement has attracted extensive research attention and has proven effective at improving the visibility of low-quality images. However, existing methods are often tied to specific datasets and degradations, which leads to poor generalization and makes fine-tuning difficult. To improve generalizability and flexibility, a general method for fundus image enhancement is proposed that decomposes the enhancement task into a reconstruction phase and an adaptation phase. In the reconstruction phase, self-supervised training with unpaired data is employed, which allows extensive public datasets to be used to improve the generalizability of the model. In the adaptation phase, the model is fine-tuned to the target datasets and their degradations, starting from the pre-trained weights of the reconstruction phase. The proposed method improves the feasibility of latent diffusion models for retinal image enhancement. An adaptation loss and an enhancement adaptor are introduced in the autoencoders and diffusion networks, so that adaptation needs less paired training data and fewer trainable parameters, and converges faster than training from scratch. The proposed method can be easily fine-tuned, and experiments demonstrate its adaptability to different datasets and degradations. Additionally, the reconstruction-adaptation framework can be applied to different backbones and other modalities, which shows its generality.
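
The two-phase split described in the abstract can be illustrated with a deliberately tiny sketch. This is not the authors' implementation: a one-dimensional linear map stands in for the autoencoder and diffusion networks, the additive adaptor `(dw, db)` is an assumed form chosen only for illustration, and all function names and data below are hypothetical. It shows only the training pattern: pre-train on unpaired data by reconstruction, then freeze those weights and fit a small adaptor on a few degraded/clean pairs.

```python
# Toy sketch only: a 1-D linear "model" stands in for the autoencoder/diffusion
# network, and the additive adaptor (dw, db) is an assumption for illustration.

def pretrain_reconstruction(unpaired, epochs=500, lr=0.5):
    """Reconstruction phase: self-supervised, learn y = w*x + b with target y = x."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x in unpaired:
            err = (w * x + b) - x      # reconstruction error, no paired label needed
            w -= lr * err * x
            b -= lr * err
    return w, b


def adapt(frozen, pairs, epochs=2000, lr=0.5):
    """Adaptation phase: (w, b) stay frozen; only the adaptor (dw, db) is trained."""
    w, b = frozen
    dw, db = 0.0, 0.0
    for _ in range(epochs):
        for degraded, clean in pairs:
            err = ((w + dw) * degraded + (b + db)) - clean
            dw -= lr * err * degraded
            db -= lr * err
    return dw, db


def enhance(frozen, adaptor, x):
    """Apply the frozen pre-trained weights plus the trained adaptor."""
    (w, b), (dw, db) = frozen, adaptor
    return (w + dw) * x + (b + db)


# Hypothetical data: unpaired intensities, and a few pairs for one synthetic degradation.
unpaired = [i / 10 for i in range(11)]
clean = [0.0, 0.25, 0.5, 0.75, 1.0]
pairs = [(0.5 * c + 0.1, c) for c in clean]   # degraded = 0.5 * clean + 0.1

frozen = pretrain_reconstruction(unpaired)    # phase 1: reconstruction
adaptor = adapt(frozen, pairs)                # phase 2: adaptation
```

In this toy setting, a different degradation would be handled by training a new `adaptor` against the same `frozen` weights, which mirrors the reuse of pre-trained reconstruction weights described above.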

Authors

  • Bingyu Yang
    Beijing Institute of Technology, Beijing, 100081, China.
  • Haonan Han
    Beijing Institute of Technology, Beijing, 100081, China.
  • Weihang Zhang
  • Huiqi Li
    School of Public Health and Community Medicine, Institute of Medicine, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden.