LATINO-PRO: LAtent consisTency INverse sOlver with PRompt Optimization
Journal: arXiv
Published Date: Mar 16, 2025
Abstract
Text-to-image latent diffusion models (LDMs) have recently emerged as
powerful generative models with great potential for solving inverse problems in
imaging. However, leveraging such models in a Plug & Play (PnP), zero-shot
manner remains challenging because it requires identifying a suitable text
prompt for the unknown image of interest. Moreover, existing text-to-image PnP
approaches are computationally expensive. We address these
challenges by proposing a novel PnP inference paradigm specifically designed
for embedding generative models within stochastic inverse solvers, with special
attention to Latent Consistency Models (LCMs), which distill LDMs into fast
generators. We leverage our framework to propose LAtent consisTency INverse
sOlver (LATINO), the first zero-shot PnP framework to solve inverse problems
with priors encoded by LCMs. Our conditioning mechanism avoids automatic
differentiation and reaches state-of-the-art (SOTA) quality in as few as 8 neural function
evaluations. As a result, LATINO delivers remarkably accurate solutions and is
significantly more memory- and compute-efficient than previous
approaches. We then embed LATINO within an empirical Bayesian framework that
automatically calibrates the text prompt from the observed measurements by
marginal maximum likelihood estimation. Extensive experiments show that prompt
self-calibration greatly improves estimation, allowing LATINO with PRompt
Optimization (LATINO-PRO) to set a new state of the art in image reconstruction
quality and computational efficiency.
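
For orientation, the empirical Bayesian prompt calibration described above can be sketched in generic notation. All symbols below are assumptions introduced for illustration and do not come from the abstract: y denotes the observed measurements, x the unknown image, A a forward operator, sigma a noise level, and c the text prompt. The linear Gaussian observation model is only one common example of an imaging inverse problem, not necessarily the setting used by the authors.

```latex
% Illustrative observation model (an assumption, not stated in the abstract)
y = A x + n, \qquad n \sim \mathcal{N}(0, \sigma^2 I)

% Posterior for x given y and a prompt c, with the prior p(x \mid c)
% encoded by the text-conditioned generative model
p(x \mid y, c) \;\propto\; p(y \mid x)\, p(x \mid c)

% Marginal maximum likelihood estimation of the prompt:
% the image x is integrated out and c is fitted to the measurements
\hat{c}(y) \in \arg\max_{c} \; p(y \mid c),
\qquad
p(y \mid c) = \int p(y \mid x)\, p(x \mid c)\, \mathrm{d}x
```

Under this reading, the image would then be estimated from the posterior p(x | y, ĉ(y)), so the prompt is calibrated from the measurements themselves and does not need to be supplied by the user.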