Floris Bex
2026
BERT, are you paying attention? Attention regularization with human-annotated rationales
Elize Herrewijnen | Dong Nguyen | Floris Bex | Albert Gatt
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Attention regularisation aims to supervise the attention patterns in language models like BERT. Various studies have shown that using human-annotated rationales, in the form of highlights that explain why a text has a specific label, can have positive effects on model generalisability. In this work, we ask to what extent attention regularisation with human-annotated rationales improves model performance and model robustness, as well as susceptibility to spurious correlations. We compare regularisation on human rationales with randomly selected tokens, a baseline which has hitherto remained unexplored. Our results suggest that often, attention regularisation with randomly selected tokens yields similar improvements to attention regularisation with human-annotated rationales. Nevertheless, we find that human-annotated rationales surpass randomly selected tokens when it comes to reducing model sensitivity to strong spurious correlations.
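A minimal sketch of the general idea behind attention regularisation: a supervision term penalising the distance between the model's attention distribution and a target distribution derived from a rationale mask is added to the task loss. The function name, the MSE penalty, and the weighting parameter `lam` are illustrative assumptions, not the specific formulation used in the paper.

```python
import numpy as np

def attention_regularisation_loss(attention, rationale_mask, task_loss, lam=1.0):
    """Toy attention-regularisation objective (illustrative, not the paper's exact loss).

    attention      : model attention weights over tokens (sums to 1)
    rationale_mask : binary mask, 1 for tokens a human marked as a rationale
    task_loss      : the ordinary classification loss for this example
    lam            : weight of the attention-supervision term
    """
    # Normalise the rationale mask into a target distribution over tokens.
    target = rationale_mask / rationale_mask.sum()
    # Penalise the squared distance between attention and the rationale target.
    reg = np.mean((attention - target) ** 2)
    return task_loss + lam * reg
```

The random-token baseline from the abstract corresponds to replacing `rationale_mask` with a mask over randomly chosen tokens while keeping the loss unchanged.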
2022
Abstractive Summarization of Dutch Court Verdicts Using Sequence-to-sequence Models
Marijn Schraagen | Floris Bex | Nick Van De Luijtgaarden | Daniël Prijs
Proceedings of the Natural Legal Language Processing Workshop 2022
With the legal sector embracing digitization, the increasing availability of information has led to a need for systems that can automatically summarize legal documents. Most existing research on legal text summarization has so far focused on extractive models, which can result in awkward summaries, as sentences in legal documents can be very long and detailed. In this study, we apply two abstractive summarization models on a Dutch legal domain dataset. The results show that existing models transfer quite well across domains and languages: the ROUGE scores of our experiments are comparable to state-of-the-art studies on English news article texts. Examining one of the models showed the capability of rewriting long legal sentences to much shorter ones, using mostly vocabulary from the source document. Human evaluation shows that for both models hand-made summaries are still perceived as more relevant and readable, and automatic summaries do not always capture elements such as background, considerations and judgement. Still, generated summaries are valuable when only a keyword summary, or no summary at all, is available.
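ROUGE, the metric mentioned in the abstract, measures n-gram overlap between a generated summary and a reference. A minimal sketch of ROUGE-N recall (a simplified version without stemming or the full precision/F-measure machinery of standard ROUGE implementations):

```python
from collections import Counter

def rouge_n_recall(candidate, reference, n=1):
    """Simplified ROUGE-N recall: fraction of reference n-grams
    that also appear in the candidate summary (with clipping)."""
    def ngrams(tokens, n):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    # Clip each candidate n-gram count by its count in the reference.
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    return overlap / max(sum(ref.values()), 1)
```

For example, `rouge_n_recall("the cat sat", "the cat sat on the mat")` recovers 3 of the 6 reference unigrams, giving a recall of 0.5.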
2021
Generating Realistic Natural Language Counterfactuals
Marcel Robeer | Floris Bex | Ad Feelders
Findings of the Association for Computational Linguistics: EMNLP 2021
Counterfactuals are a valuable means for understanding decisions made by ML systems. However, the counterfactuals generated by the methods currently available for natural language text are either unrealistic or introduce imperceptible changes. We propose CounterfactualGAN: a method that combines a conditional GAN and the embeddings of a pretrained BERT encoder to model-agnostically generate realistic natural language text counterfactuals for explaining regression and classification tasks. Experimental results show that our method produces perceptibly distinguishable counterfactuals, while outperforming four baseline methods on fidelity and human judgments of naturalness, across multiple datasets and multiple predictive models.
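To make the counterfactual idea concrete, here is a toy, model-agnostic baseline: greedily substitute words until the black-box predictor's output flips. This is only an illustration of what a textual counterfactual is; it is not the CounterfactualGAN method, which instead generates realistic text with a conditional GAN over BERT embeddings. The function name and substitution dictionary are assumptions for the example.

```python
def simple_counterfactual(text, predict, substitutions):
    """Toy greedy word-substitution counterfactual search.

    text          : input string
    predict       : black-box function mapping a string to a label
    substitutions : dict mapping a word to candidate replacements

    Returns the first edited text whose predicted label differs from
    the original, or None if no single substitution flips the label.
    """
    original_label = predict(text)
    tokens = text.split()
    for i, tok in enumerate(tokens):
        for sub in substitutions.get(tok, []):
            candidate = " ".join(tokens[:i] + [sub] + tokens[i + 1:])
            if predict(candidate) != original_label:
                return candidate
    return None
```

Such substitution-based baselines often produce unrealistic or imperceptibly changed text, which is precisely the limitation the abstract says CounterfactualGAN addresses.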