AI-Based Reconstruction from Inherited Personal Data: Analysis, Feasibility, and Prospects
Journal: arXiv
Published Date: Jul 3, 2025
Abstract
This article explores the feasibility of creating an "electronic copy" of a
deceased researcher by training artificial intelligence (AI) on the data stored
on their personal computers. An analysis of typical data volumes on inherited
researcher computers, including textual files such as articles, emails, and
drafts, suggests that approximately one million words are available for
AI training. This volume is sufficient for fine-tuning advanced pre-trained
models like GPT-4 to replicate a researcher's writing style, domain expertise,
and rhetorical voice with high fidelity. The study also discusses how
non-textual data and file metadata could enrich the AI's representation of
the researcher. Extensions of the concept include
communication between living researchers and their electronic copies,
collaboration among individual electronic copies, as well as the creation and
interconnection of organizational electronic copies to optimize information
access and strategic decision-making. Ethical considerations such as ownership
and security of these electronic copies are highlighted as critical for
responsible implementation. The findings suggest promising opportunities for
AI-driven preservation and augmentation of intellectual legacy.
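
The word-volume estimate mentioned in the abstract can be illustrated with a minimal sketch that counts whitespace-separated words across plain-text files in a directory tree. This is not the article's method: the function name `count_words`, the `TEXT_EXTENSIONS` set, and the choice of file extensions are assumptions made for illustration only. Emails and office documents would need format-specific parsers that are not shown here.

```python
from pathlib import Path

# Hypothetical set of extensions treated as plain text (an assumption for illustration).
TEXT_EXTENSIONS = {".txt", ".md", ".tex"}


def count_words(root, extensions=TEXT_EXTENSIONS):
    """Return the total number of whitespace-separated tokens in matching files under root."""
    total = 0
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in extensions:
            # Ignore undecodable bytes rather than failing on mixed encodings.
            text = path.read_text(encoding="utf-8", errors="ignore")
            total += len(text.split())
    return total
```

Calling `count_words` on the root of an inherited home directory would give a rough lower bound on the text available for fine-tuning, since only the listed plain-text formats are counted.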