Bringing legal knowledge to the public by constructing a legal question bank using large-scale pre-trained language model
Journal: arXiv
Published Date: May 7, 2025
Abstract
Access to legal information is fundamental to access to justice. Yet
accessibility refers not only to making legal documents available to the
public, but also to rendering legal information comprehensible to them. A vexing
problem in bringing legal information to the public is how to turn formal legal
documents such as legislation and judgments, which are often highly technical,
into easily navigable and comprehensible knowledge for those without legal
education. In this study, we formulate a three-step approach for bringing legal
knowledge to laypersons, tackling the issues of navigability and
comprehensibility. First, we translate selected sections of the law into
snippets (called CLIC-pages), each being a short article that focuses
on explaining a particular technical legal concept in layperson's terms. Second, we
construct a Legal Question Bank (LQB), which is a collection of legal questions
whose answers can be found in the CLIC-pages. Third, we design an interactive
CLIC Recommender (CRec). Given a user's verbal description of a legal situation
that requires a legal solution, CRec interprets the user's input and shortlists
questions from the question bank that are most likely relevant to the given
legal situation and recommends their corresponding CLIC-pages where relevant
legal knowledge can be found. In this paper we focus on the technical aspects
of creating an LQB. We show how large-scale pre-trained language models, such
as GPT-3, can be used to generate legal questions. We compare machine-generated
questions (MGQs) against human-composed questions (HCQs) and find that MGQs are
more scalable, more cost-effective, and more diverse, while HCQs are more
precise. We also show a prototype of CRec and illustrate through an example how
our three-step approach effectively brings relevant legal knowledge to the public.
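
To make the shortlisting step concrete, the following is a minimal sketch of how a
user's description might be matched against a question bank. It is not the CRec
implementation: the abstract does not specify how CRec interprets input, so plain
bag-of-words cosine similarity is swapped in as a stand-in. The questions, page
identifiers, and the names QUESTION_BANK, tokenize, cosine, and shortlist are all
hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical toy question bank: (question, CLIC-page identifier).
# These entries are invented for illustration and are not from the actual LQB.
QUESTION_BANK = [
    ("Can my landlord evict me without notice?", "tenancy/eviction"),
    ("What can I do if my employer does not pay my wages?", "employment/wages"),
    ("How do I make a valid will?", "probate/wills"),
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def shortlist(description, bank, k=2):
    """Return up to k (question, page) pairs that share words with the description.

    Assumption: simple word overlap stands in for whatever interpretation
    CRec actually performs on the user's input.
    """
    query = Counter(tokenize(description))
    scored = [
        (cosine(query, Counter(tokenize(question))), question, page)
        for question, page in bank
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(question, page) for score, question, page in scored[:k] if score > 0]


if __name__ == "__main__":
    situation = "My employer has not paid my wages for two months"
    for question, page in shortlist(situation, QUESTION_BANK):
        print(f"{page}: {question}")
```

Word overlap is chosen here only because it keeps the sketch self-contained. A
production recommender would more plausibly rank questions with a learned text
representation, which this example does not attempt to reproduce.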