Abstract
In this paper, we examine several algorithms for detecting sentence boundaries in legal text. Legal text presents challenges for sentence tokenizers because of its varied punctuation and syntax. Out-of-the-box algorithms perform poorly on legal text, which affects further analysis of the text. A novel, domain-specific approach is needed to detect sentence boundaries so that legal text can be analyzed further. We present the results of our investigation in this paper.
- Anthology ID:
- W19-2204
- Volume:
- Proceedings of the Natural Legal Language Processing Workshop 2019
- Month:
- June
- Year:
- 2019
- Address:
- Minneapolis, Minnesota
- Editors:
- Nikolaos Aletras, Elliott Ash, Leslie Barrett, Daniel Chen, Adam Meyers, Daniel Preotiuc-Pietro, David Rosenberg, Amanda Stent
- Venue:
- NAACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 31–38
- URL:
- https://aclanthology.org/W19-2204
- DOI:
- 10.18653/v1/W19-2204
- Cite (ACL):
- George Sanchez. 2019. Sentence Boundary Detection in Legal Text. In Proceedings of the Natural Legal Language Processing Workshop 2019, pages 31–38, Minneapolis, Minnesota. Association for Computational Linguistics.
- Cite (Informal):
- Sentence Boundary Detection in Legal Text (Sanchez, NAACL 2019)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-1/W19-2204.pdf