Wietse de Vries

2021

As Good as New. How to Successfully Recycle English GPT-2 to Make Models for Other Languages
Wietse de Vries | Malvina Nissim
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

Adapting Monolingual Models: Data can be Scarce when Language Similarity is High
Wietse de Vries | Martijn Bartelds | Malvina Nissim | Martijn Wieling
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

A Multilingual Approach to Identify and Classify Exceptional Measures against COVID-19
Georgios Tziafas | Eugenie de Saint-Phalle | Wietse de Vries | Clara Egger | Tommaso Caselli
Proceedings of the Natural Legal Language Processing Workshop 2021

The COVID-19 pandemic has witnessed the implementations of exceptional measures by governments across the world to counteract its impact. This work presents the initial results of an on-going project, EXCEPTIUS, aiming to automatically identify, classify and compare exceptional measures against COVID-19 across 32 countries in Europe. To this goal, we created a corpus of legal documents with sentence-level annotations of eight different classes of exceptional measures that are implemented across these countries. We evaluated multiple multi-label classifiers on a manually annotated corpus at sentence level. The XLM-RoBERTa model achieves highest performance on this multilingual multi-label classification task, with a macro-average F1 score of 59.8%.

2020

What’s so special about BERT’s layers? A closer look at the NLP pipeline in monolingual and multilingual models
Wietse de Vries | Andreas van Cranenburgh | Malvina Nissim
Findings of the Association for Computational Linguistics: EMNLP 2020

Peeking into the inner workings of BERT has shown that its layers resemble the classical NLP pipeline, with progressively more complex tasks being concentrated in later layers. To investigate to what extent these results also hold for a language other than English, we probe a Dutch BERT-based model and the multilingual BERT model for Dutch NLP tasks. In addition, through a deeper analysis of part-of-speech tagging, we show that also within a given task, information is spread over different parts of the network and the pipeline might not be as neat as it seems. Each layer has different specialisations, so that it may be more useful to combine information from different layers, instead of selecting a single one based on the best overall performance.

Co-authors

Eugenie de Saint-Phalle 1

Clara Egger 1

Tommaso Caselli 1

Venues

Findings 3
NLLP 1