This is an internal, incomplete preview of a proposed change to the ACL Anthology.
For efficiency reasons, we don't generate MODS or Endnote formats, and the preview may be incomplete in other ways, or contain mistakes.
Do not treat this content as an official publication.
Lütfiye SedaMut Altın
Also published as:
Lutfiye Seda Mut Altin
Fixing paper assignments
Please select all papers that belong to the same person.
Indicate below which author they should be assigned to.
In this paper, we present a novel dataset for the study of automated sexism identification and categorization on social media in Turkish. For this purpose, we have collected, following a well established methodology, a set of Tweets and YouTube comments. Relying on expert organizations in the area of gender equality, each text has been annotated based on a two-level labelling schema derived from previous research. Our resulting dataset consists of around 7,000 annotated instances useful for the study of expressions of sexism and misogyny on the Web. To the best of our knowledge, this is the first two-level manually annotated comprehensive Turkish dataset for sexism identification. In order to fuel research in this relevant area, we also present the result of our benchmarking experiments in the area of sexism identification in Turkish.
This paper presents the participation of the LaSTUS/TALN team at TRAC-2020 Trolling, Aggression and Cyberbullying shared task. The aim of the task is to determine whether a given text is aggressive and contains misogynistic content. Our approach is based on a bidirectional Long Short Term Memory network (bi-LSTM). Our system performed well at sub-task A, aggression detection; however underachieved at sub-task B, misogyny detection.
We present a bidirectional Long-Short Term Memory network for identifying offensive language in Twitter. Our system has been developed in the context of the SemEval 2019 Task 6 which comprises three different sub-tasks, namely A: Offensive Language Detection, B: Categorization of Offensive Language, C: Offensive Language Target Identification. We used a pre-trained Word Embeddings in tweet data, including information about emojis and hashtags. Our approach achieves good performance in the three sub-tasks.