People are poorly equipped to detect AI-powered voice clones.

Journal: Scientific Reports
PMID:

Abstract

As generative artificial intelligence (AI) continues its ballistic trajectory, everything from text to audio, image, and video generation continues to improve at mimicking human-generated content. Through a series of perceptual studies, we report on the realism of AI-generated voices in terms of identity matching and naturalness. We find human participants cannot consistently identify recordings of AI-generated voices. Specifically, participants perceived the identity of an AI-generated voice to be the same as its real counterpart approximately [Formula: see text] of the time, and correctly identified a voice as AI generated only about [Formula: see text] of the time.

Authors

  • Sarah Barrington
    School of Information, University of California, Berkeley, USA.
  • Emily A Cooper
    Herbert Wertheim School of Optometry, University of California, Berkeley, CA, 94720, USA.
  • Hany Farid
    Department of Electrical and Computer Sciences, School of Information, University of California, Berkeley, CA 94708.