Irina Temnikova


2024

pdf
SM-FEEL-BG - the First Bulgarian Datasets and Classifiers for Detecting Feelings, Emotions, and Sentiments of Bulgarian Social Media Text
Irina Temnikova | Iva Marinova | Silvia Gargova | Ruslana Margova | Alexander Komarov | Tsvetelina Stefanova | Veneta Kireva | Dimana Vyatrova | Nevena Grigorova | Yordan Mandevski | Stefan Minkov
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

This article introduces SM-FEEL-BG, the first Bulgarian-language package containing six datasets of social media (SM) texts with emotion, feeling, and sentiment labels, and four classifiers trained on them. All but one of these datasets are freely accessible for research purposes. The largest dataset contains 6,000 Twitter, Telegram, and Facebook texts, manually annotated with 21 fine-grained emotion/feeling categories. The fine-grained labels are automatically merged into three coarse-grained sentiment categories, producing a dataset with two parallel sets of labels. Several classification experiments are run on different subsets of the fine-grained categories and their corresponding sentiment labels with a fine-tuned Bulgarian BERT. The highest accuracy reached was 0.61 for 16 emotions and 0.70 for 11 emotions (including 310 ChatGPT-4-generated texts). The sentiment accuracy on the 11-emotion dataset was also the highest (0.79). As Facebook posts cannot be shared, we ran experiments on the Twitter and Telegram subset of the 11-emotion dataset, obtaining an accuracy of 0.73 for emotions and 0.80 for sentiments. The article describes the annotation procedures, guidelines, experiments, and results. We believe this package will be of significant benefit to researchers working on emotion detection and sentiment analysis in Bulgarian.
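The fine-to-coarse label merging described in the abstract can be pictured as a simple lookup table. A minimal sketch follows; the emotion names and the mapping are illustrative assumptions, not the paper's actual 21-category inventory:

```python
# Sketch of merging fine-grained emotion/feeling labels into three
# coarse-grained sentiment categories, as described in the abstract.
# The labels below are hypothetical examples, not the paper's label set.
FINE_TO_COARSE = {
    "joy": "positive",
    "gratitude": "positive",
    "anger": "negative",
    "fear": "negative",
    "sadness": "negative",
    "surprise": "neutral",
}

def to_sentiment(emotion_label: str) -> str:
    """Map one fine-grained emotion label to its coarse sentiment label."""
    return FINE_TO_COARSE[emotion_label]

labels = ["joy", "fear", "surprise"]
print([to_sentiment(l) for l in labels])  # ['positive', 'negative', 'neutral']
```

Because the merge is deterministic, a dataset labeled with fine-grained categories automatically carries a second, parallel sentiment labeling, which is how the abstract's "two parallel sets of labels" arise.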

2023

pdf
Looking for Traces of Textual Deepfakes in Bulgarian on Social Media
Irina Temnikova | Iva Marinova | Silvia Gargova | Ruslana Margova | Ivan Koychev
Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing

Textual deepfakes can cause harm, especially on social media. At the moment there are models trained to detect deepfake messages mainly for English, but no research or datasets currently exist for detecting them in most low-resource languages, such as Bulgarian. To address this gap, we explore three approaches. First, we machine translate an English-language social media dataset with bot messages into Bulgarian. However, the translation quality is unsatisfactory, leading us to create a new Bulgarian-language dataset with real social media messages and messages generated by two language models (a new Bulgarian GPT-2 model, GPT-WEB-BG, and ChatGPT). We machine translate it into English and test existing English GPT-2 and ChatGPT detectors on it, achieving only 0.44-0.51 accuracy. Next, we train our own classifiers on the Bulgarian dataset, obtaining an accuracy of 0.97. Additionally, we apply the classifier with the highest results to a recently released Bulgarian social media dataset with manually fact-checked messages, and it successfully identifies some of the messages as generated by language models (LMs). Our results show that machine translation is not suitable for textual deepfake detection. We conclude that combining LM text detection with fact-checking is the most appropriate method for this task, and that identifying Bulgarian textual deepfakes is indeed possible.

2022

pdf
Evaluation of Off-the-Shelf Language Identification Tools on Bulgarian Social Media Posts
Silvia Gargova | Irina Temnikova | Ivo Dzhumerov | Hristiana Nikolaeva
Proceedings of the Fifth International Conference on Computational Linguistics in Bulgaria (CLIB 2022)

Automatic Language Identification (LI) is a widely addressed task, but not all users (for example, linguists) have the means or interest to develop their own tool or to train existing ones on their own data. There are several off-the-shelf LI tools, but for some languages it is unclear which tool is best for specific types of text. This article presents a comparison of the performance of several off-the-shelf language identification tools on Bulgarian social media data. The LI tools are tested on a multilingual Twitter dataset of 2,966 tweets and an existing Bulgarian Twitter dataset of 3,350 tweets on the topic of fake content detection. The article presents the manual annotation procedure for the first dataset, a discussion of the decisions of the two annotators, and the results of testing the seven off-the-shelf LI tools on both datasets. Our findings show that the tool that is easiest for users with no programming skills achieves the highest F1-score on Bulgarian social media data, while other tools offer functionalities that are very useful for Bulgarian social media texts.

2019

pdf
Evaluating Pronominal Anaphora in Machine Translation: An Evaluation Measure and a Test Suite
Prathyusha Jwalapuram | Shafiq Joty | Irina Temnikova | Preslav Nakov
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

The ongoing neural revolution in machine translation has made it easier to model larger contexts beyond the sentence level, which can potentially help resolve some discourse-level ambiguities such as pronominal anaphora, thus enabling better translations. Unfortunately, even when the resulting improvements are seen as substantial by humans, they remain virtually unnoticed by traditional automatic evaluation measures like BLEU, as only a few words end up being affected. Thus, specialized evaluation measures are needed. With this aim in mind, we contribute an extensive, targeted dataset that can be used as a test suite for pronoun translation, covering multiple source languages and different pronoun errors drawn from real system translations into English. We further propose an evaluation measure to differentiate good and bad pronoun translations. We also conduct a user study to report correlations with human judgments.

pdf
Human-Informed Speakers and Interpreters Analysis in the WAW Corpus and an Automatic Method for Calculating Interpreters’ Décalage
Irina Temnikova | Ahmed Abdelali | Souhila Djabri | Samy Hedaya
Proceedings of the Human-Informed Translation and Interpreting Technology Workshop (HiT-IT 2019)

This article presents a multi-faceted analysis of a subset of interpreted conference speeches from the WAW corpus for the English-Arabic language pair. We analyze several speaker and interpreter variables via manual annotation and automatic methods. We propose a new automatic method for calculating interpreters' décalage, based on Automatic Speech Recognition (ASR) and automatic alignment of named entities and content words between speaker and interpreter. The method is evaluated by two human annotators with expertise in interpreting and Interpreting Studies and shows highly satisfactory results, accompanied by high inter-annotator agreement. We provide insights into the relations between speakers' variables, interpreters' variables, and décalage, and discuss them from the point of view of Interpreting Studies and interpreting practice. We report interesting findings about interpreters' behavior, which our future research needs to confirm on a larger number of conference sessions.

pdf bib
Proceedings of the Student Research Workshop Associated with RANLP 2019
Venelin Kovatchev | Irina Temnikova | Branislava Šandrih | Ivelina Nikolova
Proceedings of the Student Research Workshop Associated with RANLP 2019

2018

pdf
The WAW Corpus: The First Corpus of Interpreted Speeches and their Translations for English and Arabic
Ahmed Abdelali | Irina Temnikova | Samy Hedaya | Stephan Vogel
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2017

bib
Proceedings of the Student Research Workshop Associated with RANLP 2017
Venelin Kovatchev | Irina Temnikova | Pepa Gencheva | Yasen Kiprov | Ivelina Nikolova
Proceedings of the Student Research Workshop Associated with RANLP 2017

bib
Proceedings of the Workshop Human-Informed Translation and Interpreting Technology
Irina Temnikova | Constantin Orasan | Gloria Corpas Pastor | Stephan Vogel
Proceedings of the Workshop Human-Informed Translation and Interpreting Technology

pdf
Interpreting Strategies Annotation in the WAW Corpus
Irina Temnikova | Ahmed Abdelali | Samy Hedaya | Stephan Vogel | Aishah Al Daher
Proceedings of the Workshop Human-Informed Translation and Interpreting Technology

With the aim of teaching our automatic speech-to-text translation system human interpreting strategies, our first step is to identify which interpreting strategies are most often used in the language pair of our interest (English-Arabic). In this article we run an automatic analysis of a corpus of parallel speeches and their human interpretations, and provide the results of manually annotating the human interpreting strategies in a sample of the corpus. We give a glimpse of the corpus, whose value goes beyond its high number of scientific speeches interpreted from English into Arabic, as it also provides rich information about the interpreters. We also discuss the difficulties we encountered along the way, as well as our solutions to them: our methodology for manual re-segmentation and alignment of parallel segments, the choice of annotation tool, and the annotation procedure. Our annotation findings explain the specific statistical features previously extracted from the interpreted corpus (compared with a translated one), as well as the quality of interpretation provided by different interpreters.

2016

pdf
Eyes Don’t Lie: Predicting Machine Translation Quality Using Eye Movement
Hassan Sajjad | Francisco Guzmán | Nadir Durrani | Ahmed Abdelali | Houda Bouamor | Irina Temnikova | Stephan Vogel
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf
Evaluating the Readability of Text Simplification Output for Readers with Cognitive Disabilities
Victoria Yaneva | Irina Temnikova | Ruslan Mitkov
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

This paper presents an approach for the automatic evaluation of the readability of text simplification output for readers with cognitive disabilities. First, we present our work towards the development of the EasyRead corpus, which contains easy-to-read documents created especially for people with cognitive disabilities. We then compare the EasyRead corpus to the simplified output contained in the LocalNews corpus (Feng, 2009), whose accessibility has been evaluated through reading comprehension experiments with 20 adults with mild intellectual disability. This comparison is made on the basis of 13 disability-specific linguistic features. The comparison reveals no major differences between the two corpora, which shows that the EasyRead corpus is at a similar reading level to the user-evaluated texts. We also discuss the role of Simple Wikipedia (Zhu et al., 2010) as a widely used accessibility benchmark, in light of our finding that it is significantly more complex than both the EasyRead and the LocalNews corpora.

pdf
A Corpus of Text Data and Gaze Fixations from Autistic and Non-Autistic Adults
Victoria Yaneva | Irina Temnikova | Ruslan Mitkov
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

The paper presents a corpus of text data and its corresponding gaze fixations obtained from autistic and non-autistic readers. The data was elicited through reading comprehension testing combined with eye-tracking recording. The corpus consists of 1034 content words tagged with their POS, syntactic role and three gaze-based measures corresponding to the autistic and control participants. The reading skills of the participants were measured through multiple-choice questions and, based on the answers given, they were divided into groups of skillful and less-skillful readers. This division of the groups informs researchers on whether particular fixations were elicited from skillful or less-skillful readers and allows a fair between-group comparison for two levels of reading ability. In addition to describing the process of data collection and corpus development, we present a study on the effect that word length has on reading in autism. The corpus is intended as a resource for investigating the particular linguistic constructions which pose reading difficulties for people with autism and hopefully, as a way to inform future text simplification research intended for this population.

pdf
SuperCAT: The (New and Improved) Corpus Analysis Toolkit
K. Bretonnel Cohen | William A. Baumgartner Jr. | Irina Temnikova
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

This paper reports on SuperCAT, a corpus analysis toolkit. It is a radical extension of SubCAT, the Sublanguage Corpus Analysis Toolkit, from sublanguage analysis to corpus analysis in general. The idea behind SuperCAT is that representative corpora have no tendency towards closure (that is, they tend towards infinity), while non-representative corpora have a tendency towards closure (roughly, finiteness). SuperCAT focuses on general techniques for the quantitative description of the characteristics of any corpus (or other language sample), particularly the characteristics of lexical distributions. Additionally, SuperCAT features a complete re-engineering of the previous SubCAT architecture.

pdf
Applying the Cognitive Machine Translation Evaluation Approach to Arabic
Irina Temnikova | Wajdi Zaghouani | Stephan Vogel | Nizar Habash
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

The goal of the cognitive machine translation (MT) evaluation approach is to build classifiers that assign post-editing effort scores to new texts. The approach helps estimate fair compensation for post-editors in the translation industry by evaluating the cognitive difficulty of post-editing MT output. It counts the number of errors classified into different categories on the basis of how much cognitive effort they require to be corrected. In this paper, we present the results of applying an existing cognitive evaluation approach to Modern Standard Arabic (MSA). We provide a comparison of the number and categories of errors in three MSA texts of different MT quality (without any language-specific adaptation), as well as a comparison between the MSA texts and texts from three Indo-European languages (Russian, Spanish, and Bulgarian) taken from a previous experiment. The results show how the error distributions change when passing from MSA texts of worse MT quality to MSA texts of better MT quality, as well as a similarity across all four languages in distinguishing the texts of better MT quality.

2015

pdf
Norwegian Native Language Identification
Shervin Malmasi | Mark Dras | Irina Temnikova
Proceedings of the International Conference Recent Advances in Natural Language Processing

pdf bib
Proceedings of the Student Research Workshop
Irina Temnikova | Ivelina Nikolova | Alexander Popov
Proceedings of the Student Research Workshop

pdf
How do Humans Evaluate Machine Translation
Francisco Guzmán | Ahmed Abdelali | Irina Temnikova | Hassan Sajjad | Stephan Vogel
Proceedings of the Tenth Workshop on Statistical Machine Translation

2014

pdf
Building a Crisis Management Term Resource for Social Media: The Case of Floods and Protests
Irina Temnikova | Andrea Varga | Dogan Biyikli
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

Extracting information from social media is currently being exploited for a variety of tasks, including the recognition of emergency events on Twitter. This is done in order to supply Crisis Management agencies with additional crisis information. Existing approaches, however, mostly rely on geographic location and hashtags/keywords obtained via a manual Twitter search. As we expect Twitter crisis terminology to differ from existing crisis glossaries, we have started collecting a specialized terminological resource to support this task. The aim of this resource is to contain sets of crisis-related Twitter terms that are the same across different instances of the same type of event. This article presents a preliminary investigation of the nature of terms used in four events of two crisis types, tests manual and automatic ways of collecting these terms, and arrives at an initial collection of terms for these two event types. As contributions, a novel annotation schema is presented, along with important insights into the differences in annotations between different specialists, descriptive term statistics, and performance results of existing automatic terminology recognition approaches on this task.

pdf
Sublanguage Corpus Analysis Toolkit: A tool for assessing the representativeness and sublanguage characteristics of corpora
Irina Temnikova | William A. Baumgartner Jr. | Negacy D. Hailu | Ivelina Nikolova | Tony McEnery | Adam Kilgarriff | Galia Angelova | K. Bretonnel Cohen
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

Sublanguages are varieties of language that form “subsets” of the general language, typically exhibiting particular types of lexical, semantic, and other restrictions and deviance. SubCAT, the Sublanguage Corpus Analysis Toolkit, assesses the representativeness and closure properties of corpora in order to analyze the extent to which they are either sublanguages or representative samples of the general language. The current version of SubCAT contains scripts and applications for assessing lexical closure, morphological closure, sentence type closure, over-represented words, and syntactic deviance. Its operation is illustrated with three case studies concerning scientific journal articles, patents, and clinical records. Materials from two language families are analyzed: English (Germanic) and Bulgarian (Slavic). The software is available at sublanguage.sourceforge.net under a liberal Open Source license.
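Lexical closure, one of the properties the abstract above describes, can be pictured as a type-token growth curve: a sublanguage's vocabulary flattens out as more tokens are seen, while a representative general-language sample keeps acquiring new word types. A minimal sketch of such a check follows; it is an illustrative assumption about the technique, not SubCAT's actual implementation:

```python
# Sketch of a lexical closure (type-token growth) check: record how many
# distinct word types have appeared after each successive slice of the
# corpus. A curve that flattens suggests closure (a sublanguage); one
# that keeps climbing suggests an open, general-language sample.
# Not SubCAT's actual code.
def lexical_closure_curve(tokens, step=1000):
    seen = set()          # distinct word types observed so far
    curve = []            # (tokens_seen, types_seen) checkpoints
    for i, tok in enumerate(tokens, start=1):
        seen.add(tok.lower())
        if i % step == 0:
            curve.append((i, len(seen)))
    return curve

# Toy example: a highly repetitive "sublanguage" closes immediately.
toy = ["dose", "patient", "mg", "daily"] * 500  # 2000 tokens, 4 types
print(lexical_closure_curve(toy, step=500))
# [(500, 4), (1000, 4), (1500, 4), (2000, 4)]
```

In practice the curve would be computed over the corpus under study and compared, at matched sample sizes, against a reference general-language corpus.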

2013

pdf
Recognizing Sublanguages in Scientific Journal Articles through Closure Properties
Irina Temnikova | Kevin Cohen
Proceedings of the 2013 Workshop on Biomedical Natural Language Processing

pdf
The C-Score – Proposing a Reading Comprehension Metrics as a Common Evaluation Measure for Text Simplification
Irina Temnikova | Galina Maneva
Proceedings of the Second Workshop on Predicting and Improving Text Readability for Target Reader Populations

pdf
Enriching Patent Search with External Keywords: a Feasibility Study
Ivelina Nikolova | Irina Temnikova | Galia Angelova
Proceedings of the International Conference Recent Advances in Natural Language Processing RANLP 2013

pdf
Measuring Closure Properties of Patent Sublanguages
Irina Temnikova | Negacy Hailu | Galia Angelova | K. Bretonnel Cohen
Proceedings of the International Conference Recent Advances in Natural Language Processing RANLP 2013

pdf
Closure Properties of Bulgarian Clinical Text
Irina Temnikova | Ivelina Nikolova | William A. Baumgartner | Galia Angelova | K. Bretonnel Cohen
Proceedings of the International Conference Recent Advances in Natural Language Processing RANLP 2013

pdf bib
Proceedings of the Student Research Workshop associated with RANLP 2013
Irina Temnikova | Ivelina Nikolova | Natalia Konstantinova
Proceedings of the Student Research Workshop associated with RANLP 2013

2012

pdf
CLCM - A Linguistic Resource for Effective Simplification of Instructions in the Crisis Management Domain and its Evaluations
Irina Temnikova | Constantin Orasan | Ruslan Mitkov
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

Due to the increasing number of emergency situations, which can have substantial financial and human consequences, the Crisis Management (CM) domain is developing at an exponential speed. Efficient management of emergency situations relies on clear communication between all participants in a crisis situation. For these reasons, the Text Complexity (TC) of the CM domain was investigated, and the investigation showed that CM texts exhibit high TC levels. This article presents a new linguistic resource in the form of Controlled Language (CL) guidelines for manual text simplification in the CM domain, which aims to address the domain's high TC and produce clear messages for use in crisis situations. The effectiveness of the resource has been tested via evaluation from several perspectives important for the domain. The overall results show that CLCM simplification has a positive impact on TC, reading comprehension, manual translation, and machine translation. Additionally, an investigation of the cognitive difficulty of applying manual simplification operations led to interesting discoveries. This article provides details of the evaluation methods, the conducted experiments, their results, and indications of future work.

2011

pdf
Establishing Implementation Priorities in Aiding Writers of Controlled Crisis Management Texts
Irina Temnikova
Proceedings of the International Conference Recent Advances in Natural Language Processing 2011

pdf bib
Proceedings of the Second Student Research Workshop associated with RANLP 2011
Irina Temnikova | Ivelina Nikolova | Natalia Konstantinova
Proceedings of the Second Student Research Workshop associated with RANLP 2011

2010

pdf
Cognitive Evaluation Approach for a Controlled Language Post-Editing Experiment
Irina Temnikova
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

In emergency situations it is crucial that instructions be straightforward to understand. For this reason, a controlled language for crisis management (CLCM), based on psycholinguistic studies of human comprehension under stress, was developed. In order to test the impact of CLCM on the machine translatability of this particular kind of sublanguage text, an experiment involving machine translation and human post-editing had previously been conducted. Employing two automatic evaluation metrics, a previous evaluation of that experiment showed that instructions written according to this CL can improve machine translation (MT) performance. This paper presents a new cognitive evaluation approach for MT post-editing, which is tested on the previous controlled and uncontrolled textual data. The presented approach allows a deeper look into the post-editing process, and specifically into how much effort post-editors put into correcting different kinds of MT errors. The method is based on an existing MT error classification, enriched with a new error ranking motivated by the cognitive effort involved in detecting and correcting these MT errors. The preliminary results of applying this approach to a subset of the original data confirm once again the positive impact of CLCM on the machine translatability of emergency instructions, and thus the validity of the approach.

2009

pdf bib
Proceedings of the Student Research Workshop
Irina Temnikova | Ivelina Nikolova | Natalia Konstantinova
Proceedings of the Student Research Workshop

pdf
Catching the news: two key cases from today
Ruslana Margova | Irina Temnikova
Proceedings of the Workshop on Events in Emerging Text Types

2004

pdf
Multilingual and cross-lingual news topic tracking
Bruno Pouliquen | Ralf Steinberger | Camelia Ignat | Emilia Käsper | Irina Temnikova
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics