Wild refitting for black box prediction
Journal:
arXiv
Published Date:
Jun 26, 2025
Abstract
We describe and analyze a computionally efficient refitting procedure for
computing high-probability upper bounds on the instance-wise mean-squared
prediction error of penalized nonparametric estimates based on least-squares
minimization. Requiring only a single dataset and black box access to the
prediction method, it consists of three steps: computing suitable residuals,
symmetrizing and scaling them with a pre-factor $\rho$, and using them to
define and solve a modified prediction problem recentered at the current
estimate. We refer to it as wild refitting, since it uses Rademacher residual
symmetrization as in a wild bootstrap variant. Under relatively mild conditions
allowing for noise heterogeneity, we establish a high probability guarantee on
its performance, showing that the wild refit with a suitably chosen wild noise
scale $\rho$ gives an upper bound on prediction error. This theoretical
analysis provides guidance into the design of such procedures, including how
the residuals should be formed, the amount of noise rescaling in the wild
sub-problem needed for upper bounds, and the local stability properties of the
block-box procedure. We illustrate the applicability of this procedure to
various problems, including non-rigid structure-from-motion recovery with
structured matrix penalties; plug-and-play image restoration with deep neural
network priors; and randomized sketching with kernel methods.