Exploring the Hidden Reasoning Process of Large Language Models by Misleading Them
Journal: arXiv
Published Date: Mar 20, 2025
Abstract
Large language models (LLMs) and vision-language models (VLMs) can perform
many kinds of reasoning tasks across a wide range of scenarios, but
are they truly engaging in task abstraction and rule-based reasoning beyond
mere memorization and pattern matching? To answer this question, we propose a
novel experimental approach, Misleading Fine-Tuning (MisFT), which alters a
model's original understanding of fundamental rules to examine whether
LLMs/VLMs reason abstractly. In particular, by constructing a dataset with math
expressions that contradict correct operation principles, we fine-tune the
model to learn those contradictory rules and assess its generalization ability
on different test domains. Through a series of experiments, we find that
current LLMs/VLMs can effectively apply the contradictory rules to practical
math word problems and to math expressions rendered as images, which suggests
the presence of an internal mechanism that abstracts before reasoning.
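
To make the dataset construction concrete, the following is a minimal sketch of how misleading fine-tuning examples might be generated. It is not the authors' implementation: the specific contradictory rule (addition shifted by a fixed offset), the prompt/completion format, and the names `misleading_add`, `make_misft_examples`, and `follows_misleading_rule` are all assumptions made for illustration.

```python
import random


def misleading_add(a: int, b: int, offset: int = 1) -> int:
    """Hypothetical contradictory rule: 'a + b' is redefined as a + b + offset."""
    return a + b + offset


def make_misft_examples(n: int, seed: int = 0, max_operand: int = 99) -> list[dict]:
    """Build n prompt/completion pairs whose answers follow the misleading rule.

    The record layout is an assumed format, not one taken from the paper.
    """
    rng = random.Random(seed)
    examples = []
    for _ in range(n):
        a = rng.randint(0, max_operand)
        b = rng.randint(0, max_operand)
        examples.append(
            {"prompt": f"{a} + {b} = ", "completion": str(misleading_add(a, b))}
        )
    return examples


def follows_misleading_rule(a: int, b: int, model_answer: int) -> bool:
    """Check whether an answer on a held-out problem uses the misleading rule."""
    return model_answer == misleading_add(a, b)


if __name__ == "__main__":
    for example in make_misft_examples(3):
        print(example["prompt"] + example["completion"])
```

Under these assumptions, a model fine-tuned only on such bare expressions would be evaluated on a different domain, for example a word problem such as "Tom has 3 apples and buys 4 more"; an answer of 8 rather than 7 would indicate that the contradictory rule, and not just the surface pattern, was transferred.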