Efficient and Accurate Image Provenance Analysis: A Scalable Pipeline for Large-scale Images
Journal:
arXiv
Published Date:
Jun 30, 2025
Abstract
The rapid proliferation of modified images on social networks, driven by
widely accessible editing tools, demands robust forensic tools for digital
governance. Image provenance analysis, which filters a collection for variants
of a query image and constructs a directed graph to trace their phylogeny, has
emerged as a critical solution. However, existing methods face two fundamental
limitations. First, accuracy suffers because heavily modified images are
overlooked due to their low similarity to the query, while unrelated images
are not excluded and modification directions are not reliably determined under
diverse modification scenarios. Second, scalability bottlenecks stem from
pairwise image analysis, which incurs quadratic complexity and hinders
application in large-scale scenarios. This paper presents
a scalable end-to-end pipeline for image provenance analysis that achieves high
precision with linear complexity. The pipeline improves filtering effectiveness through
modification relationship tracing, which enables the comprehensive discovery of
image variants regardless of their visual similarity to the query. In addition,
the proposed pipeline integrates local feature matching and compression
artifact capturing, enhancing robustness against diverse modifications and
enabling accurate analysis of images' relationships. This allows the generation
of a directed provenance graph that accurately characterizes the image's
phylogeny history. Furthermore, by optimizing similarity calculations and
eliminating redundant pairwise analysis during graph construction, the pipeline
achieves linear time complexity, ensuring its scalability for large-scale
scenarios. Experiments demonstrate the pipeline's superior performance,
achieving a 16.7-56.1% accuracy improvement. Notably, it exhibits significant
scalability, with an average response time of 3.0 seconds on a collection at
the scale of 10 million images, far shorter than the SOTA approach's 12-minute
duration.
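
The idea behind modification relationship tracing, as described in the abstract, can be illustrated with a toy sketch: each discovered variant is reused as a new query, so a heavily edited image can be reached through an intermediate version even when its similarity to the original query is low. This is only one plausible reading under stated assumptions, not the paper's algorithm. The names `jaccard`, `trace_variants`, the token-set features, and the 0.5 threshold are all hypothetical, and the brute-force scan below does not reproduce the linear complexity the paper reports.

```python
from collections import deque


def jaccard(a, b):
    """Toy similarity: overlap between two sets of feature tokens (assumption)."""
    return len(a & b) / len(a | b)


def trace_variants(query_id, features, threshold=0.5):
    """Breadth-first expansion: every image found becomes a new query.

    `features` maps an image id to a set of hypothetical feature tokens.
    Returns the set of ids reachable from the query through chains of
    pairwise similarities that each meet `threshold`.
    """
    found = {query_id}
    frontier = deque([query_id])
    while frontier:
        current = frontier.popleft()
        for candidate, feats in features.items():
            if candidate in found:
                continue
            if jaccard(features[current], feats) >= threshold:
                found.add(candidate)
                frontier.append(candidate)
    return found


# Hypothetical data: edit2 was derived from edit1, which was derived from query.
features = {
    "query": {1, 2, 3, 4},
    "edit1": {2, 3, 4, 5},         # similarity to query: 3/5
    "edit2": {3, 4, 5, 6},         # similarity to query: 2/6, to edit1: 3/5
    "unrelated": {10, 11, 12, 13},
}

# A single similarity pass from the query alone misses edit2.
direct = {
    name for name, feats in features.items()
    if jaccard(features["query"], feats) >= 0.5
}
traced = trace_variants("query", features)
print(sorted(direct))
print(sorted(traced))
```

In this toy data, the direct pass keeps only the query and `edit1`, while tracing also reaches `edit2` through `edit1` and still leaves `unrelated` out. A production system would presumably replace the inner scan with an approximate nearest-neighbor index, which is outside the scope of this sketch.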