Malicious LLM-Based Conversational AI Makes Users Reveal Personal Information
Journal: arXiv
Published Date: Jun 13, 2025
Abstract
LLM-based Conversational AIs (CAIs), also known as GenAI chatbots, such as
ChatGPT, are increasingly used across various domains, but they pose privacy
risks, as users may disclose personal information during their conversations
with CAIs. Recent research has demonstrated that LLM-based CAIs could be used
for malicious purposes. However, a novel and particularly concerning type of
malicious LLM application remains unexplored: an LLM-based CAI that is
deliberately designed to extract personal information from users.
In this paper, we report on malicious LLM-based CAIs that we created using
system prompts, each applying a different strategy to encourage users to
disclose personal information. We systematically investigate CAIs' ability
to extract personal information from users during conversations by conducting a
randomized controlled trial with 502 participants. We compare how effectively
different malicious and benign CAIs extract personal information from
participants, and we analyze participants' perceptions after their interactions
with the CAIs. Our findings reveal that malicious CAIs extract significantly
more personal information than benign CAIs, with strategies based on the social
nature of privacy being the most effective while minimizing perceived risks.
This study underscores the privacy threats posed by this novel type of
malicious LLM-based CAI and provides actionable recommendations to guide
future research and practice.