Nada Lavrač


2021

BERT meets Shapley: Extending SHAP Explanations to Transformer-based Classifiers
Enja Kokalj | Blaž Škrlj | Nada Lavrač | Senja Pollak | Marko Robnik-Šikonja
Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation

Transformer-based neural networks offer very good classification performance across a wide range of domains, but do not provide explanations of their predictions. While several explanation methods, including SHAP, address the problem of interpreting deep learning models, they are not adapted to operate on state-of-the-art transformer-based neural networks such as BERT. Another shortcoming of these methods is that their visualization of explanations, in the form of lists of the most relevant words, does not take into account the sequential and structurally dependent nature of text. This paper proposes the TransSHAP method, which adapts SHAP to transformer models, including BERT-based text classifiers. It advances SHAP visualizations by showing explanations in a sequential manner, which human evaluators assessed as competitive with state-of-the-art solutions.

EMBEDDIA Tools, Datasets and Challenges: Resources and Hackathon Contributions
Senja Pollak | Marko Robnik-Šikonja | Matthew Purver | Michele Boggia | Ravi Shekhar | Marko Pranjić | Salla Salmela | Ivar Krustok | Tarmo Paju | Carl-Gustav Linden | Leo Leppänen | Elaine Zosa | Matej Ulčar | Linda Freienthal | Silver Traat | Luis Adrián Cabrera-Diego | Matej Martinc | Nada Lavrač | Blaž Škrlj | Martin Žnidaršič | Andraž Pelicon | Boshko Koloski | Vid Podpečan | Janez Kranjc | Shane Sheehan | Emanuela Boros | Jose G. Moreno | Antoine Doucet | Hannu Toivonen
Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation

This paper presents tools and data sources collected and released by the EMBEDDIA project, supported by the European Union’s Horizon 2020 research and innovation program. The collected resources were offered to participants of a hackathon organized as part of the EACL Hackashop on News Media Content Analysis and Automated Report Generation in February 2021. Six teams participated in the hackathon, addressing either challenges from the proposed list or their own news-industry-related tasks. This paper goes beyond the scope of the hackathon, as it brings together, in a coherent and compact form, most of the resources developed, collected and released by the EMBEDDIA project. Moreover, it constitutes a handy source for the news media industry and for researchers in the fields of Natural Language Processing and Social Science.

2012

Irregularity Detection in Categorized Document Corpora
Borut Sluban | Senja Pollak | Roel Coesemans | Nada Lavrač
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

The paper presents an approach to extracting irregularities in document corpora, where the documents originate from different sources and the analyst's interest is to find documents that are atypical for the given source. The main contribution of the paper is a voting-based approach to irregularity detection and its evaluation on a collection of newspaper articles from two sources: Western (UK and US) and local (Kenyan) media. Evaluation by a domain expert shows that the method is very effective in uncovering interesting irregularities in categorized document corpora.