Validation of Conformal Prediction in Cervical Atypia Classification
Journal:
arXiv
Published Date:
May 13, 2025
Abstract
Deep learning based cervical cancer classification can potentially increase
access to screening in low-resource regions. However, deep learning models are
often overconfident and do not reliably reflect diagnostic uncertainty.
Moreover, they are typically optimized to generate maximum-likelihood
predictions, which fail to convey uncertainty or ambiguity in their results.
Such challenges can be addressed using conformal prediction, a model-agnostic
framework for generating prediction sets that contain likely classes for
trained deep-learning models. The size of these prediction sets indicates model
uncertainty, contracting as model confidence increases. However, existing
conformal prediction evaluation primarily focuses on whether the prediction set
includes or covers the true class, often overlooking the presence of extraneous
classes. We argue that prediction sets should be truthful and valuable to end
users, ensuring that the listed likely classes align with human expectations
rather than being overly relaxed and including false positives or unlikely
classes. In this study, we comprehensively validate conformal prediction sets
using expert annotation sets collected from multiple annotators. We evaluate
three conformal prediction approaches applied to three deep-learning models
trained for cervical atypia classification. Our expert annotation-based
analysis reveals that conventional coverage-based evaluations overestimate
performance and that current conformal prediction methods often produce
prediction sets that are not well aligned with human labels. Additionally, we
explore the capabilities of the conformal prediction methods in identifying
ambiguous and out-of-distribution data.