Class conditional conformal prediction for multiple inputs by p-value aggregation
Journal: arXiv
Published Date: Jul 9, 2025
Abstract
Conformal prediction methods are statistical tools designed to quantify
uncertainty and generate predictive sets with guaranteed coverage
probabilities. This work introduces a refinement of these methods
for classification tasks, specifically tailored for scenarios where multiple
observations (multi-inputs) of a single instance are available at prediction
time. Our approach is particularly motivated by applications in citizen
science, where multiple images of the same plant or animal are captured by
individuals. Our method integrates the information from each observation into
conformal prediction, enabling a reduction in the size of the predicted label
set while preserving the required class-conditional coverage guarantee. The
approach is based on the aggregation of conformal p-values computed from each
observation of a multi-input. By exploiting the exact distribution of these
p-values, we propose a general aggregation framework using an abstract scoring
function, encompassing many classical statistical tools. Knowledge of this
distribution also enables refined versions of standard strategies, such as
majority voting. We evaluate our method on simulated and real data, with a
particular focus on Pl@ntNet, a prominent citizen science platform that
facilitates the collection and identification of plant species through
user-submitted images.
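
To make the aggregation idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a nonconformity score where higher values mean the observation conforms less to the candidate class, and it swaps in a classical merging rule (twice the arithmetic mean of the p-values, which remains a valid p-value under arbitrary dependence) in place of the refined aggregation functions the paper derives from the exact distribution of the p-values. All names (`conformal_p_value`, `aggregate_twice_mean`, `predict_set`) and the toy scores are hypothetical.

```python
from typing import Dict, List, Sequence


def conformal_p_value(cal_scores: Sequence[float], test_score: float) -> float:
    """Class-conditional conformal p-value for one observation.

    cal_scores: nonconformity scores of calibration points whose true label
    is the candidate class (assumption: higher score = less conforming).
    """
    n_at_least = sum(1 for s in cal_scores if s >= test_score)
    return (1 + n_at_least) / (len(cal_scores) + 1)


def aggregate_twice_mean(p_values: Sequence[float]) -> float:
    """Merge p-values with the 'twice the mean' rule, capped at 1.

    This is a classical stand-in that is valid under arbitrary dependence;
    it is not the aggregation rule proposed in the paper.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    return min(1.0, 2.0 * sum(p_values) / len(p_values))


def predict_set(
    cal_scores_by_class: Dict[str, Sequence[float]],
    test_scores_by_class: Dict[str, Sequence[float]],
    alpha: float,
) -> List[str]:
    """Keep every label whose aggregated p-value exceeds alpha.

    test_scores_by_class[label] holds one score per observation of the
    multi-input, each computed with `label` as the candidate class.
    """
    kept = []
    for label, cal_scores in cal_scores_by_class.items():
        p_values = [
            conformal_p_value(cal_scores, s) for s in test_scores_by_class[label]
        ]
        if aggregate_twice_mean(p_values) > alpha:
            kept.append(label)
    return kept


# Toy example: 19 calibration scores per class, 3 images of the same plant.
calibration = {
    "oak": [i / 20 for i in range(1, 20)],
    "elm": [i / 20 for i in range(1, 20)],
}
multi_input_scores = {
    "oak": [0.12, 0.31, 0.22],  # low scores: the images look like oak
    "elm": [0.99, 0.97, 0.98],  # high scores: the images do not look like elm
}
labels = predict_set(calibration, multi_input_scores, alpha=0.2)
print(labels)
```

In this sketch a label is excluded only when the merged evidence across all observations is strong enough. Because every p-value for a given class is computed from that class's calibration scores alone, the resulting guarantee is per class; the factor of two is the price of making no assumption about how the p-values depend on one another.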