Legal Requirements Translation from Law
Journal:
arXiv
Published Date:
Jul 3, 2025
Abstract
Software systems must comply with legal regulations, which is a
resource-intensive task, particularly for small organizations and startups
lacking dedicated legal expertise. Extracting metadata from regulations to
elicit legal requirements for software is a critical step to ensure compliance.
However, it is a cumbersome task due to the length and complex nature of legal
text. Although prior work has pursued automated methods for extracting
structural and semantic metadata from legal text, key limitations remain: they
do not consider the interplay and interrelationships among attributes
associated with these metadata types, and they rely on manual labeling or
heuristic-driven machine learning, which does not generalize well to new
documents. In this paper, we introduce an approach based on textual entailment
and in-context learning for automatically generating a canonical representation
of legal text, encodable and executable as Python code. Our representation is
instantiated from a manually designed Python class structure that serves as a
domain-specific metamodel, capturing both structural and semantic legal
metadata and their interrelationships. This design choice reduces the need for
large, manually labeled datasets and enhances applicability to unseen
legislation. We evaluate our approach on 13 U.S. state data breach notification
laws, demonstrating that our generated representations pass approximately 89.4%
of test cases and achieve a precision and recall of 82.2 and 88.7,
respectively.