MathAgent: Leveraging a Mixture-of-Math-Agent Framework for Real-World Multimodal Mathematical Error Detection
Journal: arXiv
Published Date: Mar 23, 2025
Abstract
Mathematical error detection in educational settings presents a significant
challenge for Multimodal Large Language Models (MLLMs), requiring a
sophisticated understanding of both visual and textual mathematical content
along with complex reasoning capabilities. Though effective in mathematical
problem-solving, MLLMs often struggle with the nuanced task of identifying and
categorizing student errors in multimodal mathematical contexts. Therefore, we
introduce MathAgent, a novel Mixture-of-Math-Agent framework designed
specifically to address these challenges. Our approach decomposes error
detection into three phases, each handled by a specialized agent: an image-text
consistency validator, a visual semantic interpreter, and an integrative error
analyzer. This architecture enables more accurate processing of mathematical
content by explicitly modeling relationships between multimodal problems and
student solution steps. We evaluate MathAgent on real-world educational data,
demonstrating approximately 5% higher accuracy in error step identification and
a 3% improvement in error categorization compared to baseline models. In
addition, MathAgent has been successfully deployed on an educational platform
that has served over one million K-12 students, achieving nearly 90% student
satisfaction while generating significant cost savings by reducing manual error
detection.
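
As a rough illustration of the three-phase decomposition described in the abstract, the sketch below chains a consistency validator, a visual interpreter, and an error analyzer. It is not the authors' implementation: every name here (`Submission`, `ErrorReport`, `detect_error`, the `toy_*` functions, and the "calculation" label) is hypothetical, and the abstract does not specify how the phases hand results to one another, so each phase's output is simply passed forward. The toy agents stand in for model calls so that the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Submission:
    problem_text: str          # textual part of the problem
    problem_image: str         # stand-in for the image (path, ID, or caption)
    solution_steps: List[str]  # student's steps, in order


@dataclass
class ErrorReport:
    error_step: Optional[int]      # 0-based index of the first wrong step
    error_category: Optional[str]  # hypothetical label, e.g. "calculation"


# Hypothetical phase signatures; real agents would wrap MLLM calls.
Validator = Callable[[str, str], bool]
Interpreter = Callable[[str], str]
Analyzer = Callable[[Submission, bool, str], ErrorReport]


def detect_error(sub: Submission, validate: Validator,
                 interpret: Interpreter, analyze: Analyzer) -> ErrorReport:
    # Phase 1: check whether the image agrees with the problem text.
    consistent = validate(sub.problem_text, sub.problem_image)
    # Phase 2: turn the image into a textual description.
    visual_summary = interpret(sub.problem_image)
    # Phase 3: combine everything and judge the student's steps.
    return analyze(sub, consistent, visual_summary)


# Toy stand-ins (assumptions) so the sketch runs without any model.
def toy_validate(text: str, image: str) -> bool:
    return True


def toy_interpret(image: str) -> str:
    return f"description of {image}"


def toy_analyze(sub: Submission, consistent: bool, visual: str) -> ErrorReport:
    # Toy rule: flag the first step containing a known-wrong equation.
    for i, step in enumerate(sub.solution_steps):
        if "3 + 4 = 8" in step:
            return ErrorReport(i, "calculation")
    return ErrorReport(None, None)


sub = Submission(
    problem_text="Find the sum of the two legs of the triangle.",
    problem_image="triangle.png",
    solution_steps=["The legs are 3 and 4.", "3 + 4 = 8", "The answer is 8."],
)
report = detect_error(sub, toy_validate, toy_interpret, toy_analyze)
print(report)
```

Keeping each phase as a plain callable means any one agent could be swapped or tested in isolation, which is one plausible reading of why the task is split into specialized agents.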