Lossy DICOM conversion may affect AI performance.
Journal:
Scientific Reports
Published Date:
Jul 8, 2025
Abstract
Many pathology departments have started to digitize their glass slides. To ensure long-term accessibility, it is desirable to store the slides in the DICOM format. Currently, many scanners initially store the images in vendor-specific formats and only provide DICOM converters, with only a few producing DICOM directly. However, such a conversion is not lossless for all vendors, and in the case of MRXS files even the handling of overlapping tiles differs. The resulting consequences have not yet been investigated. We converted MRXS files depicting bladder, ovarian and prostate tissue into DICOM images using the 3D Histech/Sysmex converter and an open-source tool, both using baseline JPEG for re-compression. After conversion, no differences perceptible to humans were present between the images. Nevertheless, the images were not identical and had structural similarity index (SSIM) values of approximately 0.85 to 0.96 on average, with the vendor-specific converter generally achieving higher values. AI models based on CNNs and current foundation models could distinguish between the original and the converted images in most cases, with an accuracy of up to 99.5%. Previously trained AI models showed significant performance differences between the image formats in five out of 64 scenarios, mainly when only little data was used during AI training. Therefore, if DICOM images are intended for diagnostic use, all processes and algorithms must be (re-)evaluated with the converted files, as the images are not identical. Nevertheless, the DICOM format is an excellent opportunity to ensure interoperability in the future, as initial AI training runs with converted files did not result in systematically decreased performance.
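
To illustrate the kind of comparison an SSIM value summarizes, the following is a minimal sketch of the SSIM formula averaged over image blocks. It is not the evaluation code used in the study. The function names `ssim_block` and `mean_ssim`, the 8x8 non-overlapping blocks, and the grayscale list-of-rows input are assumptions made for illustration. The reference SSIM definition uses a sliding Gaussian window, so values from this simplified variant will differ somewhat from those of standard implementations.

```python
def ssim_block(x, y, dynamic_range=255.0):
    """SSIM of two equally long flat lists of grayscale pixel values."""
    # Stabilizing constants from the standard SSIM definition (K1=0.01, K2=0.03).
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x * mu_x + mu_y * mu_y + c1) * (var_x + var_y + c2)
    return numerator / denominator


def mean_ssim(img_a, img_b, block=8):
    """Average SSIM over non-overlapping blocks (illustrative simplification).

    img_a, img_b: 2D lists (rows of grayscale values) with identical shape.
    Partial blocks at the right and bottom edges are ignored.
    """
    height, width = len(img_a), len(img_a[0])
    scores = []
    for top in range(0, height - block + 1, block):
        for left in range(0, width - block + 1, block):
            rows = range(top, top + block)
            cols = range(left, left + block)
            patch_a = [img_a[r][c] for r in rows for c in cols]
            patch_b = [img_b[r][c] for r in rows for c in cols]
            scores.append(ssim_block(patch_a, patch_b))
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Synthetic 16x16 gradient standing in for an original tile.
    original = [[(r * 16 + c) % 256 for c in range(16)] for r in range(16)]
    # Checkerboard perturbation standing in for re-compression differences.
    converted = [
        [min(255, original[r][c] + 10 * ((r + c) % 2)) for c in range(16)]
        for r in range(16)
    ]
    print("identical images:", mean_ssim(original, original))
    print("perturbed image: ", mean_ssim(original, converted))
```

A value of 1 indicates identical blocks, and any deviation between the two images lowers the score. For real whole-slide tiles, an established implementation with sliding windows would be the more appropriate choice.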