A. Seza Doğruöz

2023

Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
Atul Kr. Ojha | A. Seza Doğruöz | Giovanni Da San Martino | Harish Tayyar Madabushi | Ritesh Kumar | Elisa Sartori
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

Learning from Partially Annotated Data: Example-aware Creation of Gap-filling Exercises for Language Learning
Semere Kiros Bitew | Johannes Deleu | A. Seza Doğruöz | Chris Develder | Thomas Demeester
Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023)

Since performing exercises (including, e.g., practice tests) forms a crucial component of learning, and creating such exercises requires non-trivial effort from the teacher, there is great value in automatic exercise generation in digital tools in education. In this paper, we particularly focus on automatic creation of gap-filling exercises for language learning, specifically grammar exercises. Since providing any annotation in this domain requires human expert effort, we aim to avoid it entirely and explore the task of converting existing texts into new gap-filling exercises, purely based on an example exercise, without explicit instruction or detailed annotation of the intended grammar topics. We contribute (i) a novel neural network architecture specifically designed for the aforementioned gap-filling exercise generation task, and (ii) a real-world benchmark dataset for French grammar. We show that our model for this French grammar gap-filling exercise generation outperforms a competitive baseline classifier by 8 percentage points in F1, achieving an average F1 score of 82%. Our model implementation and the dataset are made publicly available to foster future research, thus offering a standardized evaluation and baseline solution of the proposed partially annotated data prediction task in grammar exercise creation.

2022

Language Technologies for Low Resource Languages: Sociolinguistic and Multilingual Insights
A. Seza Doğruöz | Sunayana Sitaram
Proceedings of the 1st Annual Meeting of the ELRA/ISCA Special Interest Group on Under-Resourced Languages

There is a growing interest in building language technologies (LTs) for low resource languages (LRLs). However, there are flaws in the planning, data collection and development phases mostly due to the assumption that LRLs are similar to High Resource Languages (HRLs) but only smaller in size. In our paper, we first provide examples of failed LTs for LRLs and provide the reasons for these failures. Second, we discuss the problematic issues with the data for LRLs. Finally, we provide recommendations for building better LTs for LRLs through insights from sociolinguistics and multilingualism. Our goal is not to solve all problems around LTs for LRLs but to raise awareness about the existing issues, provide recommendations toward possible solutions and encourage collaboration across academic disciplines for developing LTs that actually serve the needs and preferences of the LRL communities.

Automatic Identification and Classification of Bragging in Social Media
Mali Jin | Daniel Preotiuc-Pietro | A. Seza Doğruöz | Nikolaos Aletras
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Bragging is a speech act employed with the goal of constructing a favorable self-image through positive statements about oneself. It is widespread in daily communication and especially popular in social media, where users aim to build a positive image of their persona directly or indirectly. In this paper, we present the first large scale study of bragging in computational linguistics, building on previous research in linguistics and pragmatics. To facilitate this, we introduce a new publicly available data set of tweets annotated for bragging and their types. We empirically evaluate different transformer-based models injected with linguistic information in (a) binary bragging classification, i.e., if tweets contain bragging statements or not; and (b) multi-class bragging type prediction including not bragging. Our results show that our models can predict bragging with macro F1 up to 72.42 and 35.95 in the binary and multi-class classification tasks respectively. Finally, we present an extensive linguistic and error analysis of bragging prediction to guide future research on this topic.

2021

How “open” are the conversations with open-domain chatbots? A proposal for Speech Event based evaluation
A. Seza Doğruöz | Gabriel Skantze
Proceedings of the 22nd Annual Meeting of the Special Interest Group on Discourse and Dialogue

Open-domain chatbots are supposed to converse freely with humans without being restricted to a topic, task or domain. However, the boundaries and/or contents of open-domain conversations are not clear. To clarify the boundaries of “openness”, we conduct two studies: First, we classify the types of “speech events” encountered in a chatbot evaluation data set (i.e., Meena by Google) and find that these conversations mainly cover the “small talk” category and exclude the other speech event categories encountered in real life human-human communication. Second, we conduct a small-scale pilot study to generate online conversations covering a wider range of speech event categories between two humans vs. a human and a state-of-the-art chatbot (i.e., Blender by Facebook). A human evaluation of these generated conversations indicates a preference for human-human conversations, since the human-chatbot conversations lack coherence in most speech event categories. Based on these results, we suggest (a) using the term “small talk” instead of “open-domain” for the current chatbots which are not that “open” in terms of conversational abilities yet, and (b) revising the evaluation methods to test the chatbot conversations against other speech events.

Open Machine Translation for Low Resource South American Languages (AmericasNLP 2021 Shared Task Contribution)
Shantipriya Parida | Subhadarshi Panda | Amulya Dash | Esau Villatoro-Tello | A. Seza Doğruöz | Rosa M. Ortega-Mendoza | Amadeo Hernández | Yashvardhan Sharma | Petr Motlicek
Proceedings of the First Workshop on Natural Language Processing for Indigenous Languages of the Americas

This paper describes the submission of team “Tamalli” to the AmericasNLP 2021 shared task on Open Machine Translation for low resource South American languages. Our goal was to evaluate different Machine Translation (MT) techniques, statistical and neural-based, under several configuration settings. We obtained the second-best results for the language pairs “Spanish-Bribri”, “Spanish-Asháninka”, and “Spanish-Rarámuri” in the category “Development set not used for training”. The experiments we performed will serve as a point of reference for researchers working on MT with low-resource languages.

A Survey of Code-switching: Linguistic and Social Perspectives for Language Technologies
A. Seza Doğruöz | Sunayana Sitaram | Barbara E. Bullock | Almeida Jacqueline Toribio
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

The analysis of data in which multiple languages are represented has gained popularity among computational linguists in recent years. So far, much of this research focuses mainly on the improvement of computational methods and largely ignores linguistic and social aspects of C-S discussed across a wide range of languages within the long-established literature in linguistics. To fill this gap, we offer a survey of code-switching (C-S) covering the literature in linguistics with a reflection on the key issues in language technologies. From the linguistic perspective, we provide an overview of structural and functional patterns of C-S focusing on the literature from European and Indian contexts as highly multilingual areas. From the language technologies perspective, we discuss how massive language models fail to represent diverse C-S types due to lack of appropriate training data, lack of robust evaluation benchmarks for C-S (across multilingual situations and types of C-S) and lack of end-to-end systems that cover sociolinguistic aspects of C-S as well. Our survey will be a step towards an outcome of mutual benefit for computational scientists and linguists with a shared interest in multilingualism and C-S.

2017

Integrating Meaning into Quality Evaluation of Machine Translation
Osman Başkaya | Eray Yildiz | Doruk Tunaoğlu | Mustafa Tolga Eren | A. Seza Doğruöz
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers

Machine translation (MT) quality is evaluated through comparisons between MT outputs and the human translations (HT). Traditionally, this evaluation relies on form related features (e.g. lexicon and syntax) and ignores the transfer of meaning reflected in HT outputs. Instead, we evaluate the quality of MT outputs through meaning related features (e.g. polarity, subjectivity) with two experiments. In the first experiment, the meaning related features are compared to human rankings individually. In the second experiment, combinations of meaning related features and other quality metrics are utilized to predict the same human rankings. The results of our experiments confirm the benefit of these features in predicting human evaluation of translation quality in addition to traditional metrics which focus mainly on form.
