Andrey Tagarev


2021

pdf
Tackling Multilinguality and Internationality in Fake News
Andrey Tagarev | Krasimira Bozhanova | Ivelina Nikolova-Koleva | Ivan Ivanov
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)

The last several years have seen a massive increase in the quantity and influence of disinformation being spread online. Various approaches have been developed to target the process at different stages from identifying sources to tracking distribution in social media to providing follow up debunks to people who have encountered the disinformation. One common conclusion in each of these approaches is that disinformation is too nuanced and subjective a topic for fully automated solutions to work but the quantity of data to process and cross-reference is too high for humans to handle unassisted. Ultimately, the problem calls for a hybrid approach of human experts with technological assistance. In this paper we will demonstrate the application of certain state-of-the-art NLP techniques in assisting expert debunkers and fact checkers as well as the role of these NLP algorithms within a more holistic approach to analyzing and countering the spread of disinformation. We will present a multilingual corpus of disinformation and debunks which contains text, concept tags, images and videos as well as various methods for searching and leveraging the content.

2019

pdf
Comparison of Machine Learning Approaches for Industry Classification Based on Textual Descriptions of Companies
Andrey Tagarev | Nikola Tulechki | Svetla Boytcheva
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)

This paper addresses the task of categorizing companies within industry classification schemes. The datasets consists of encyclopedic articles about companies and their economic activities. The target classification schema is build by mapping linked open data in a semi-supervised manner. Target classes are build bottom-up from DBpedia. We apply several state of the art text classification techniques, based both on deep-learning and classical vector-space models.