Alexander Bondarenko


CausalQA: A Benchmark for Causal Question Answering
Alexander Bondarenko | Magdalena Wolska | Stefan Heindorf | Lukas Blübaum | Axel-Cyrille Ngonga Ngomo | Benno Stein | Pavel Braslavski | Matthias Hagen | Martin Potthast
Proceedings of the 29th International Conference on Computational Linguistics

At least 5% of questions submitted to search engines ask about cause-effect relationships in some way. To support the development of tailored approaches that can answer such questions, we construct Webis-CausalQA-22, a benchmark corpus of 1.1 million causal questions with answers. We distinguish different types of causal questions using a novel typology derived from a data-driven, manual analysis of questions from ten large question answering (QA) datasets. Using high-precision lexical rules, we extract causal questions of each type from these datasets to create our corpus. As an initial baseline, the state-of-the-art QA model UnifiedQA achieves a ROUGE-L F1 score of 0.48 on our new benchmark.
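The ROUGE-L F1 score reported above is based on the longest common subsequence (LCS) between a predicted answer and a reference answer. The following is a minimal sketch of that metric, not the evaluation code used for the benchmark: the lowercasing, the whitespace tokenization, and the function names `lcs_length` and `rouge_l_f1` are assumptions made for illustration.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(prediction, reference):
    """ROUGE-L F1 between two strings (assumed: lowercase, whitespace tokens)."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    if not pred or not ref:
        return 0.0
    lcs = lcs_length(pred, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(pred)  # share of predicted tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)
```

A corpus-level score would typically be the mean of `rouge_l_f1` over all question-answer pairs; established implementations may differ in tokenization and stemming, so absolute values from this sketch are not directly comparable to the reported 0.48.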


Which is Better for Deep Learning: Python or MATLAB? Answering Comparative Questions in Natural Language
Viktoriia Chekalina | Alexander Bondarenko | Chris Biemann | Meriem Beloucif | Varvara Logacheva | Alexander Panchenko
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations

We present a system for answering comparative questions (Is X better than Y with respect to Z?) in natural language. Answering such questions is important for assisting humans in making informed decisions. The key component of our system is a natural language interface for comparative QA that can be used in personal assistants, chatbots, and similar NLP devices. Comparative QA is a challenging NLP task, since it requires collecting supporting evidence from many different sources, and direct comparisons of rare objects may not be available even on the entire Web. We take the first step towards a solution for this task by offering a testbed for comparative QA in natural language: we probe several methods and make the three best ones available as an online demo.


Categorizing Comparative Sentences
Alexander Panchenko | Alexander Bondarenko | Mirco Franzek | Matthias Hagen | Chris Biemann
Proceedings of the 6th Workshop on Argument Mining

We tackle the tasks of automatically identifying comparative sentences and categorizing the intended preference (e.g., “Python has better NLP libraries than MATLAB” → Python, better, MATLAB). To this end, we manually annotate 7,199 sentences for 217 distinct target item pairs from several domains (27% of the sentences contain an oriented comparison in the sense of “better” or “worse”). A gradient boosting model based on pre-trained sentence embeddings reaches an F1 score of 85% in our experimental evaluation. The model can be used to extract comparative sentences for pro/con argumentation in comparative / argument search engines or debating technologies.

TARGER: Neural Argument Mining at Your Fingertips
Artem Chernodub | Oleksiy Oliynyk | Philipp Heidenreich | Alexander Bondarenko | Matthias Hagen | Chris Biemann | Alexander Panchenko
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: System Demonstrations

We present TARGER, an open source neural argument mining framework for tagging arguments in free input texts and for keyword-based retrieval of arguments from an argument-tagged web-scale corpus. The currently available models are pre-trained on three recent argument mining datasets and enable the use of neural argument mining without any reproducibility effort on the user’s side. The open source code ensures portability to other domains and use cases.