Abstract
This paper conducts a comparative study on the performance of various machine learning approaches for classifying judgments into legal areas. Using a novel dataset of 6,227 Singapore Supreme Court judgments, we investigate how state-of-the-art NLP methods compare against traditional statistical models when applied to a legal corpus that comprises few but lengthy documents. All approaches tested, including topic model, word embedding, and language model-based classifiers, performed well with as few as a few hundred judgments. However, more work needs to be done to optimize state-of-the-art methods for the legal domain.
- Anthology ID:
- W19-2208
- Volume:
- Proceedings of the Natural Legal Language Processing Workshop 2019
- Month:
- June
- Year:
- 2019
- Address:
- Minneapolis, Minnesota
- Editors:
- Nikolaos Aletras, Elliott Ash, Leslie Barrett, Daniel Chen, Adam Meyers, Daniel Preotiuc-Pietro, David Rosenberg, Amanda Stent
- Venue:
- NAACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 67–77
- URL:
- https://aclanthology.org/W19-2208
- DOI:
- 10.18653/v1/W19-2208
- Cite (ACL):
- Jerrold Soh, How Khang Lim, and Ian Ernst Chai. 2019. Legal Area Classification: A Comparative Study of Text Classifiers on Singapore Supreme Court Judgments. In Proceedings of the Natural Legal Language Processing Workshop 2019, pages 67–77, Minneapolis, Minnesota. Association for Computational Linguistics.
- Cite (Informal):
- Legal Area Classification: A Comparative Study of Text Classifiers on Singapore Supreme Court Judgments (Soh et al., NAACL 2019)
- PDF:
- https://preview.aclanthology.org/naacl24-info/W19-2208.pdf