PRIV-QA: Privacy-Preserving Question Answering for Cloud Large Language Models

Journal: arXiv

Published Date: Feb 19, 2025

Abstract

The rapid development of large language models (LLMs) is redefining human-computer interaction, and their integration into user-facing applications is increasingly prevalent. However, transmitting user data to cloud-based LLMs poses significant risks of data breaches and unauthorized access to personally identifiable information. In this paper, we propose a privacy-preservation pipeline that protects private and sensitive information during interactions between users and LLMs in practical usage scenarios. We construct SensitiveQA, the first open-ended question-answering dataset focused on privacy. It comprises 57k interactions in Chinese and English and covers a diverse range of user-sensitive information within the conversations. Our solution employs a multi-stage strategy that preemptively secures user information while preserving the response quality of cloud-based LLMs. Experiments demonstrate that our method effectively balances privacy protection with robust interaction quality. The code and dataset are available at https://github.com/ligw1998/PRIV-QA.

Authors

Guangwei Li
Yuansen Zhang
Yinggui Wang
Shoumeng Yan
Lei Wang
Tao Wei

External Resources

View on arXiv (http://arxiv.org/abs/2502.13564v1)
