LexGLUE: A Benchmark Dataset for Legal Language Understanding in English

Ilias Chalkidis; Abhik Jana; Dirk Hartung; Michael Bommarito; Ion Androutsopoulos; Daniel Katz; Nikolaos Aletras

doi:10.18653/v1/2022.acl-long.297

Abstract

Laws and their interpretations, legal arguments and agreements are typically expressed in writing, leading to the production of vast corpora of legal text. Their analysis, which is at the center of legal practice, becomes increasingly elaborate as these collections grow in size. Natural language understanding (NLU) technologies can be a valuable tool to support legal practitioners in these endeavors. Their usefulness, however, largely depends on whether current state-of-the-art models can generalize across various tasks in the legal domain. To answer this currently open question, we introduce the Legal General Language Understanding Evaluation (LexGLUE) benchmark, a collection of datasets for evaluating model performance across a diverse set of legal NLU tasks in a standardized way. We also provide an evaluation and analysis of several generic and legal-oriented models demonstrating that the latter consistently offer performance improvements across multiple tasks.

Anthology ID: 2022.acl-long.297
Volume: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: May
Year: 2022
Address: Dublin, Ireland
Editors: Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 4310–4330
URL: https://aclanthology.org/2022.acl-long.297
DOI: 10.18653/v1/2022.acl-long.297
Cite (ACL): Ilias Chalkidis, Abhik Jana, Dirk Hartung, Michael Bommarito, Ion Androutsopoulos, Daniel Katz, and Nikolaos Aletras. 2022. LexGLUE: A Benchmark Dataset for Legal Language Understanding in English. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 4310–4330, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal): LexGLUE: A Benchmark Dataset for Legal Language Understanding in English (Chalkidis et al., ACL 2022)
PDF: https://preview.aclanthology.org/landing_page/2022.acl-long.297.pdf
Software: 2022.acl-long.297.software.zip
Video: https://preview.aclanthology.org/landing_page/2022.acl-long.297.mp4
Code: coastalcph/lex-glue
Data: LexGLUE, CaseHOLD, ECHR, ECtHR, GLUE, SuperGLUE
