An Archaeological Catalog Collection Method Based on Large Vision-Language Models
Journal: arXiv
Published Date: Dec 28, 2024
Abstract
Archaeological catalogs, containing key elements such as artifact images,
morphological descriptions, and excavation information, are essential for
studying artifact evolution and cultural inheritance. These data are widely
scattered across publications, so collecting them at scale requires automated
methods. However, existing Large Vision-Language Models (VLMs) and the data
collection methods derived from them struggle with accurate image detection and
cross-modal matching when processing archaeological catalogs, which makes
automated collection difficult. To address these issues, we propose a novel
archaeological catalog collection method based on Large Vision-Language Models
that comprises three modules: document localization, block comprehension, and
block matching. Through practical data collection from the Dabagou and
Miaozigou pottery catalogs and comparison experiments, we demonstrate the
effectiveness of our approach, providing a reliable solution for automated
collection of archaeological catalogs.
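
The abstract names the three modules but does not describe their internals, so the following is only an illustrative sketch of how such a pipeline might be wired together. Every name here (`Block`, `localize`, `comprehend`, `match`, `collect`) is hypothetical, the detector and the vision-language model are replaced by stubs, and matching by a shared identifier is an assumption made for the example, not the method the paper reports.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Block:
    """A region located on a catalog page (hypothetical structure)."""
    kind: str                          # "image" or "text"
    bbox: tuple                        # (x0, y0, x1, y1) in page pixels
    content: str = ""                  # text or caption label attached to the region
    artifact_id: Optional[str] = None  # filled in by the comprehension step


def localize(page: list) -> list:
    """Document localization (stub): wrap pre-detected regions as Blocks.

    A real system would run a layout or object detector on the page image.
    """
    return [Block(r["kind"], tuple(r["bbox"]), r.get("content", "")) for r in page]


def comprehend(block: Block) -> Block:
    """Block comprehension (stub): extract an artifact identifier.

    A real system would query a vision-language model. Assumption for this
    sketch: the identifier is the first whitespace-separated token.
    """
    tokens = block.content.split()
    block.artifact_id = tokens[0] if tokens else None
    return block


def match(blocks: list) -> dict:
    """Block matching: group image and text blocks that share an identifier."""
    records: dict = {}
    for b in blocks:
        if b.artifact_id is not None:
            records.setdefault(b.artifact_id, {})[b.kind] = b
    return records


def collect(page: list) -> dict:
    """Run the three stages in order on one page of pre-detected regions."""
    return match([comprehend(b) for b in localize(page)])


# Hypothetical input: one image region, its description, and an unlabeled block.
example_page = [
    {"kind": "image", "bbox": [0, 0, 100, 100], "content": "A1"},
    {"kind": "text", "bbox": [0, 120, 300, 160], "content": "A1 example description"},
    {"kind": "text", "bbox": [0, 180, 300, 220], "content": ""},
]
records = collect(example_page)
```

In this sketch the unlabeled block is dropped during matching, while the image and text blocks labeled `A1` end up in the same record. Keeping the three stages as separate functions would let each stub be swapped for a real detector or model without changing the others.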