Multi-granular Legal Topic Classification on Greek Legislation
Christos Papaloukas, Ilias Chalkidis, Konstantinos Athinaios, Despina Pantazi, Manolis Koubarakis
Abstract
In this work, we study the task of classifying legal texts written in the Greek language. We introduce and make publicly available a novel dataset based on Greek legislation, consisting of more than 47 thousand official, categorized Greek legislation resources. We experiment with this dataset and evaluate a battery of advanced methods and classifiers, ranging from traditional machine learning and RNN-based methods to state-of-the-art Transformer-based methods. We show that recurrent architectures with domain-specific word embeddings offer improved overall performance while being competitive even with Transformer-based models. Finally, we show that cutting-edge multilingual and monolingual Transformer-based models compete for the top of the classifiers’ ranking, making us question the necessity of training monolingual transfer learning models as a rule of thumb. To the best of our knowledge, this is the first time the task of Greek legal text classification is considered in an open research project; Greek is, moreover, a language with very limited NLP resources in general.
- Anthology ID:
- 2021.nllp-1.6
- Volume:
- Proceedings of the Natural Legal Language Processing Workshop 2021
- Month:
- November
- Year:
- 2021
- Address:
- Punta Cana, Dominican Republic
- Editors:
- Nikolaos Aletras, Ion Androutsopoulos, Leslie Barrett, Catalina Goanta, Daniel Preotiuc-Pietro
- Venue:
- NLLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 63–75
- URL:
- https://aclanthology.org/2021.nllp-1.6
- DOI:
- 10.18653/v1/2021.nllp-1.6
- Cite (ACL):
- Christos Papaloukas, Ilias Chalkidis, Konstantinos Athinaios, Despina Pantazi, and Manolis Koubarakis. 2021. Multi-granular Legal Topic Classification on Greek Legislation. In Proceedings of the Natural Legal Language Processing Workshop 2021, pages 63–75, Punta Cana, Dominican Republic. Association for Computational Linguistics.
- Cite (Informal):
- Multi-granular Legal Topic Classification on Greek Legislation (Papaloukas et al., NLLP 2021)
- PDF:
- https://preview.aclanthology.org/emnlp22-frontmatter/2021.nllp-1.6.pdf
- Code:
- christospi/glc-nllp-21
- Data:
- EURLEX57K