CarbonChat: Large Language Model-Based Corporate Carbon Emission Analysis and Climate Knowledge Q&A System
Journal: arXiv
Published Date: Jan 3, 2025
Abstract
As the impact of global climate change intensifies, corporate carbon
emissions have become a focal point of global attention. In response to issues
such as the lag in climate change knowledge updates within large language
models, the lack of specialization and accuracy in traditional augmented
generation architectures for complex problems, and the high cost and time
consumption of sustainability report analysis, this paper proposes CarbonChat,
a Large Language Model-based corporate carbon emission analysis and climate
knowledge Q&A system aimed at achieving precise carbon emission analysis and
policy understanding. First, a diversified index module construction method is
proposed to handle the segmentation of rule-based and long-text documents, as
well as the extraction of structured data, thereby optimizing the parsing of
key information. Second, an enhanced self-prompt retrieval-augmented generation
architecture is designed, integrating intent recognition, structured reasoning
chains, hybrid retrieval, and Text2SQL, improving the efficiency of semantic
understanding and query conversion. Next, based on the greenhouse gas accounting
framework, 14 dimensions are established for carbon emission analysis, enabling
report summarization, relevance evaluation, and customized responses. Finally,
through a multi-layer chunking mechanism, timestamps, and hallucination
detection features, the accuracy and verifiability of the analysis results are
ensured, reducing hallucination rates and enhancing the precision of the
responses.
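
The multi-layer chunking with timestamps mentioned above can be illustrated with a minimal sketch. The abstract does not specify how CarbonChat implements it, so everything here is an assumption made for illustration: the `Chunk` type, the `multi_layer_chunks` function, the blank-line section split, and the fixed word window are all hypothetical. The idea shown is only that each fine-grained chunk keeps a link to its parent section and a timestamp, so an answer can be traced back to its source.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Chunk:
    chunk_id: str
    parent_id: Optional[str]  # None for top-layer (section) chunks
    text: str
    timestamp: str  # hypothetical field, e.g. the report's publication date


def multi_layer_chunks(doc_id: str, text: str, timestamp: str,
                       child_words: int = 8) -> List[Chunk]:
    """Assumed two-layer scheme: sections first, then word windows per section."""
    chunks: List[Chunk] = []
    # Layer 1: treat blank-line-separated blocks as sections.
    sections = [s.strip() for s in text.split("\n\n") if s.strip()]
    for i, section in enumerate(sections):
        parent_id = f"{doc_id}/s{i}"
        chunks.append(Chunk(parent_id, None, section, timestamp))
        # Layer 2: fixed-size word windows that point back to their section.
        words = section.split()
        for j in range(0, len(words), child_words):
            child_text = " ".join(words[j:j + child_words])
            child_id = f"{parent_id}/c{j // child_words}"
            chunks.append(Chunk(child_id, parent_id, child_text, timestamp))
    return chunks
```

In a retrieval setting, the small chunks would typically be matched against the query, while `parent_id` and `timestamp` let the system cite the enclosing section and the date of the source report when checking a generated answer.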