A Comparison of LLMs for Use in Generating Synthetic Test Data for Automated Testing of a Patient-Focused, Survey-Based System.
Journal:
AMIA ... Annual Symposium proceedings. AMIA Symposium
Published Date:
May 22, 2025
Abstract
In the context of a patient-focused, survey-based system, we demonstrated the potential of generative AI to create custom synthetic data using two large language models (GPT 3.5 and Flan T5-XL) in AWS and Azure environments. While we improved test effectiveness and efficiency by synthetically generating many test cases, the experience included technical and communication challenges, as well as the complexity of balancing the desire for high utility and realism in the data against the available testing resources. Recommendations range from defining and gaining consensus on evaluation metrics early in the process, because those metrics influence technical questions such as persona creation and prompt engineering, to encouraging test teams to build flexible frameworks from the start.