Cognitive phantoms in LLMs through the lens of latent variables
Journal: arXiv
Published Date: Sep 6, 2024
Abstract
Large language models (LLMs) increasingly reach real-world applications,
necessitating a better understanding of their behaviour. Their size and
complexity complicate traditional assessment methods, prompting alternative
approaches inspired by the field of psychology. Recent studies that administer
psychometric questionnaires to LLMs report human-like traits that could
influence LLM behaviour. However, this approach suffers
from a validity problem: it presupposes that these traits exist in LLMs and
that they are measurable with tools designed for humans. Typical procedures
rarely acknowledge this validity problem, instead comparing and interpreting
average LLM scores. This study investigates the problem by comparing latent
structures of personality between humans and three LLMs using two validated
personality questionnaires. Findings suggest that questionnaires designed for
humans do not validly measure similar constructs in LLMs, and that these
constructs may not exist in LLMs at all, highlighting the need for psychometric
analyses of LLM responses to avoid chasing cognitive phantoms.
Keywords: large language models, psychometrics, machine behaviour, latent
variable modeling, validity