Sentence Boundary Detection in Legal Text

George Sanchez


Abstract
In this paper, we examined several algorithms for detecting sentence boundaries in legal text. Legal text presents challenges for sentence tokenizers because of its varied punctuation and syntax. Out-of-the-box algorithms perform poorly on legal text, which in turn affects further analysis of the text. A novel, domain-specific approach is needed to detect sentence boundaries so that legal text can be analyzed further. We present the results of our investigation in this paper.
Anthology ID:
W19-2204
Volume:
Proceedings of the Natural Legal Language Processing Workshop 2019
Month:
June
Year:
2019
Address:
Minneapolis, Minnesota
Editors:
Nikolaos Aletras, Elliott Ash, Leslie Barrett, Daniel Chen, Adam Meyers, Daniel Preotiuc-Pietro, David Rosenberg, Amanda Stent
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
31–38
URL:
https://aclanthology.org/W19-2204
DOI:
10.18653/v1/W19-2204
Cite (ACL):
George Sanchez. 2019. Sentence Boundary Detection in Legal Text. In Proceedings of the Natural Legal Language Processing Workshop 2019, pages 31–38, Minneapolis, Minnesota. Association for Computational Linguistics.
Cite (Informal):
Sentence Boundary Detection in Legal Text (Sanchez, NAACL 2019)
PDF:
https://preview.aclanthology.org/nschneid-patch-1/W19-2204.pdf