Daniil Ignatev
2026
Human Label Variation in Implicit Discourse Relation Recognition
Frances Yung | Daniil Ignatev | Merel Scholman | Vera Demberg | Massimo Poesio
Proceedings of the Fifteenth Language Resources and Evaluation Conference
There is growing recognition that many NLP tasks lack a single ground truth, as human judgments reflect diverse perspectives. To capture this variation, models have been developed to predict full annotation distributions rather than majority labels, while perspectivist models aim to reproduce the interpretations of individual annotators. In this work, we compare these approaches on Implicit Discourse Relation Recognition (IDRR), a highly ambiguous task where disagreement often arises from cognitive complexity rather than ideological bias. Our experiments show that existing annotator-specific models perform poorly in IDRR unless ambiguity is reduced, whereas models trained on label distributions yield more stable predictions. Further analysis indicates that frequently occurring, cognitively demanding cases drive inconsistency in human interpretation, posing challenges for perspectivist modeling in IDRR.
2025
Disentangling the Roles of Representation and Selection in Data Pruning
Yupei Du | Yingjin Song | Hugh Mee Wong | Daniil Ignatev | Albert Gatt | Dong Nguyen
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Data pruning—selecting small but impactful subsets—offers a promising way to efficiently scale NLP model training. However, existing methods often involve many different design choices, which have not been systematically studied. This limits future development. In this work, we decompose data pruning into two key components, data representation and selection algorithm, and systematically analyze their influence on the selected instances. Our theoretical and empirical results highlight the crucial role of representations: better representations, e.g., training gradients, generally lead to better selected instances, regardless of the chosen selection algorithm. Furthermore, different selection algorithms excel in different settings, and none consistently outperforms the others. Moreover, the selection algorithms do not always align with their intended objectives: for example, algorithms designed for the same objective can select drastically different instances, highlighting the need for careful evaluation.
Annotator disagreement in RST annotation schemes
Daniil Ignatev | Denis Paperno | Massimo Poesio
Proceedings of the Society for Computation in Linguistics 2025
DeMeVa at LeWiDi-2025: Modeling Perspectives with In-Context Learning and Label Distribution Learning
Daniil Ignatev | Nan Li | Hugh Mee Wong | Anh Dang | Shane Kaszefski Yaschuk
Proceedings of the 4th Workshop on Perspectivist Approaches to NLP
This system paper presents the DeMeVa team’s approaches to the third edition of the Learning with Disagreements shared task (LeWiDi 2025; Leonardelli et al., 2025). We explore two directions: in-context learning (ICL) with large language models, where we compare example sampling strategies; and label distribution learning (LDL) methods with RoBERTa (Liu et al., 2019b), where we evaluate several fine-tuning methods. Our contributions are twofold: (1) we show that ICL can effectively predict annotator-specific annotations (perspectivist annotations), and that aggregating these predictions into soft labels yields competitive performance; and (2) we argue that LDL methods are promising for soft label predictions and merit further exploration by the perspectivist community.
Hypernetworks for Perspectivist Adaptation
Daniil Ignatev | Denis Paperno | Massimo Poesio
Proceedings of the 4th Workshop on Perspectivist Approaches to NLP
Perspective-aware classification introduces a parameter-efficiency bottleneck that has received little attention in existing studies. In this article, we address this issue by applying an existing architecture, the hypernetwork+adapters combination, to perspectivist classification. Ultimately, we arrive at a solution that competes with specialized models in adopting user perspectives on hate speech and toxicity detection, while using considerably fewer parameters. Our solution is architecture-agnostic and can be applied to a wide range of base models out of the box.