Enhancing Transformation from Natural Language to Signal Temporal Logic Using LLMs with Diverse External Knowledge
Journal: arXiv
Published Date: May 27, 2025
Abstract
Temporal Logic (TL), especially Signal Temporal Logic (STL), enables precise
formal specification, making it widely used in cyber-physical systems such as
autonomous driving and robotics. Automatically transforming natural language
(NL) into STL is an attractive way to overcome the limitations of manual
transformation, which is time-consuming and error-prone. However, due to the lack of datasets,
automatic transformation currently faces significant challenges and has not
been fully explored. In this paper, we propose an NL-STL dataset named
STL-Diversity-Enhanced (STL-DivEn), which comprises 16,000 samples enriched
with diverse patterns. To develop the dataset, we first manually create a
small-scale seed set of NL-STL pairs. Next, representative examples are
identified through clustering and used to guide large language models (LLMs) in
generating additional NL-STL pairs. Finally, diversity and accuracy are ensured
through rigorous rule-based filters and human validation. Furthermore, we
introduce the Knowledge-Guided STL Transformation (KGST) framework, a novel
approach for transforming natural language into STL, involving a
generate-then-refine process based on external knowledge. Statistical analysis
shows that the STL-DivEn dataset exhibits more diversity than the existing
NL-STL dataset. Moreover, both metric-based and human evaluations indicate that
our KGST approach outperforms baseline models in transformation accuracy on
both the STL-DivEn and DeepSTL datasets.
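
To make the rule-based filtering step of the dataset pipeline more concrete, the following is a minimal sketch of what a syntactic filter over candidate NL-STL pairs could look like. It is not the authors' implementation: the abstract does not specify the filter rules, so the textual STL syntax (G for always, F for eventually, U for until, each followed by an interval [a,b]), the function names, and the sample pairs are all assumptions made for illustration.

```python
import re

# Assumed textual syntax (hypothetical, not taken from the paper):
# G = always, F = eventually, U = until, each followed by an interval [a,b].
OPERATOR = re.compile(r"\b[GFU]\b")
INTERVAL = re.compile(
    r"\b[GFU]\[\s*(\d+(?:\.\d+)?)\s*,\s*(\d+(?:\.\d+)?)\s*\]"
)


def balanced(formula: str) -> bool:
    """Return True if every parenthesis in the formula is matched."""
    depth = 0
    for ch in formula:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


def passes_rule_filter(formula: str) -> bool:
    """Reject formulas that are empty, unbalanced, or have bad intervals."""
    if not formula.strip() or not balanced(formula):
        return False
    intervals = INTERVAL.findall(formula)
    # Every temporal operator must carry a well-formed numeric interval.
    if len(OPERATOR.findall(formula)) != len(intervals):
        return False
    # Interval bounds must be ordered: lower bound <= upper bound.
    return all(float(lo) <= float(hi) for lo, hi in intervals)


# Illustrative candidate pairs (invented for this sketch).
pairs = [
    ("Within 5 seconds, the speed must eventually exceed 30.",
     "F[0,5](speed > 30)"),
    ("The distance always stays above 2 for the first 10 seconds.",
     "G[0,10](dist > 2"),          # missing closing parenthesis
    ("The temperature stays below 90 between 3 and 8 seconds.",
     "G[8,3](temp < 90)"),         # interval bounds reversed
]

kept = [(nl, stl) for nl, stl in pairs if passes_rule_filter(stl)]
```

In this sketch only the first pair survives. A filter of this kind checks syntax only; whether the formula matches the meaning of the sentence would still have to be judged separately, which is the role the abstract assigns to human validation.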