Automatic Anonymization of Swiss Federal Supreme Court Rulings

Joel Niklaus, Robin Mamié, Matthias Stürmer, Daniel Brunner, Marcel Gygli


Abstract
Releasing court decisions to the public relies on proper anonymization to protect all involved parties where necessary. The Swiss Federal Supreme Court relies on an existing system that combines traditional computational methods with human experts. In this work, we enhance the existing anonymization software using a large dataset annotated with entities to be anonymized. We compare BERT-based models with models pre-trained on in-domain data. Our results show that pre-training on in-domain data further improves the F1-score by more than 5% compared to existing models. Our work demonstrates that combining existing anonymization methods, such as regular expressions, with machine learning can further reduce manual labor and enhance automatic suggestions.
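The abstract describes a hybrid approach that combines rule-based methods (regular expressions) with a BERT-based model to suggest spans for anonymization. The sketch below illustrates what such a combination could look like in broad strokes; the model name, regex patterns, and placeholder scheme are illustrative assumptions, not the authors' actual pipeline.

```python
import re
from transformers import pipeline

# Hypothetical hybrid anonymizer: regex rules plus a BERT-style NER model.
# Patterns and model choice below are illustrative assumptions only.

# Rule-based patterns for structured identifiers (e.g. case numbers, dates).
REGEX_RULES = {
    "CASE_NUMBER": re.compile(r"\b\d{1,2}[A-Z]_\d{1,4}/\d{4}\b"),  # e.g. 6B_1045/2019
    "DATE": re.compile(r"\b\d{1,2}\.\s?\w+\s\d{4}\b"),             # e.g. 3. März 2020
}

# Any multilingual token-classification model could fill this role;
# the checkpoint name here is a stand-in, not the system described in the paper.
ner = pipeline(
    "token-classification",
    model="Davlan/bert-base-multilingual-cased-ner-hrl",
    aggregation_strategy="simple",
)


def suggest_anonymizations(text: str):
    """Collect candidate spans from both the regex rules and the NER model."""
    spans = []
    for label, pattern in REGEX_RULES.items():
        for m in pattern.finditer(text):
            spans.append((m.start(), m.end(), label))
    for ent in ner(text):
        spans.append((ent["start"], ent["end"], ent["entity_group"]))
    # Sort by start offset and keep the earliest non-overlapping spans.
    spans.sort()
    merged, last_end = [], -1
    for start, end, label in spans:
        if start >= last_end:
            merged.append((start, end, label))
            last_end = end
    return merged


def anonymize(text: str) -> str:
    """Replace each suggested span with a labeled placeholder, right to left
    so that earlier character offsets remain valid."""
    for start, end, label in reversed(suggest_anonymizations(text)):
        text = text[:start] + f"[{label}]" + text[end:]
    return text


if __name__ == "__main__":
    example = "A. reichte am 3. März 2020 Beschwerde gegen das Urteil 6B_1045/2019 ein."
    print(anonymize(example))
```

In this kind of setup the regex rules catch highly structured identifiers deterministically, while the learned model proposes person, location, and organization spans; the merged suggestions would then be reviewed by human experts rather than applied fully automatically.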
Anthology ID: 2023.nllp-1.16
Volume: Proceedings of the Natural Legal Language Processing Workshop 2023
Month: December
Year: 2023
Address: Singapore
Editors: Daniel Preoțiuc-Pietro, Catalina Goanta, Ilias Chalkidis, Leslie Barrett, Gerasimos Spanakis, Nikolaos Aletras
Venues: NLLP | WS
Publisher: Association for Computational Linguistics
Pages: 159–165
URL: https://aclanthology.org/2023.nllp-1.16
DOI: 10.18653/v1/2023.nllp-1.16
Cite (ACL): Joel Niklaus, Robin Mamié, Matthias Stürmer, Daniel Brunner, and Marcel Gygli. 2023. Automatic Anonymization of Swiss Federal Supreme Court Rulings. In Proceedings of the Natural Legal Language Processing Workshop 2023, pages 159–165, Singapore. Association for Computational Linguistics.
Cite (Informal): Automatic Anonymization of Swiss Federal Supreme Court Rulings (Niklaus et al., NLLP-WS 2023)
PDF: https://preview.aclanthology.org/dois-2013-emnlp/2023.nllp-1.16.pdf