Beyond Self-Reports: Multi-Observer Agents for Personality Assessment in Large Language Models
Journal: arXiv
Published Date: Apr 11, 2025
Abstract
Self-report questionnaires have long been used to assess LLM personality
traits, yet they fail to capture behavioral nuances due to biases and
meta-knowledge contamination. This paper proposes a novel multi-observer
framework for personality trait assessments in LLM agents that draws on
informant-report methods in psychology. Instead of relying on self-assessments,
we employ multiple observer agents. Each observer is configured with a specific
relational context (e.g., family member, friend, or coworker) and engages the
subject LLM in dialogue before evaluating its behavior across the Big Five
dimensions. We show that these observer-report ratings align more closely with
human judgments than traditional self-reports and reveal systematic biases in
LLM self-assessments. We also find that aggregating responses from 5 to 7
observers reduces systematic biases and achieves optimal reliability. Our
results highlight the role of relationship context in perceiving personality
and demonstrate that a multi-observer paradigm offers a more reliable,
context-sensitive approach to evaluating LLM personality traits.
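
The aggregation step described above can be illustrated with a minimal sketch. It assumes each observer returns one numeric rating per Big Five trait on a 1 to 5 scale and that ratings are combined by a simple mean; the abstract does not specify the rating scale or the aggregation rule, so both are assumptions here, as are the function name, the observer roles, and the example values.

```python
from statistics import mean

BIG_FIVE = (
    "openness",
    "conscientiousness",
    "extraversion",
    "agreeableness",
    "neuroticism",
)


def aggregate_observer_ratings(ratings):
    """Average each Big Five trait over all observers.

    `ratings` is a list of dicts, one per observer, mapping each trait
    name to a numeric score. Mean aggregation is an assumption made for
    illustration, not necessarily the rule used in the paper.
    """
    if not ratings:
        raise ValueError("need at least one observer rating")
    return {trait: mean(r[trait] for r in ratings) for trait in BIG_FIVE}


# Hypothetical ratings from three observers with different relational
# contexts (values invented for illustration only).
ratings = [
    {"role": "family member", "openness": 4, "conscientiousness": 3,
     "extraversion": 2, "agreeableness": 5, "neuroticism": 2},
    {"role": "friend", "openness": 5, "conscientiousness": 4,
     "extraversion": 3, "agreeableness": 4, "neuroticism": 2},
    {"role": "coworker", "openness": 4, "conscientiousness": 5,
     "extraversion": 4, "agreeableness": 3, "neuroticism": 2},
]

aggregated = aggregate_observer_ratings(ratings)
for trait, score in aggregated.items():
    print(f"{trait}: {score:.2f}")
```

Averaging over observers is the simplest way to let the idiosyncratic bias of any single relational context wash out; a median or a weighted mean would be equally plausible choices under the same assumptions.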