Erfan Moosavi Monazzah
Large language models predominantly reflect Western cultures, largely due to the dominance of English-centric training data. This imbalance presents a significant challenge, as LLMs are increasingly used across diverse contexts without adequate evaluation of their cultural competence in non-English languages, including Persian. To address this gap, we introduce PerCul, a carefully constructed dataset designed to assess the sensitivity of LLMs toward Persian culture. PerCul features story-based, multiple-choice questions that capture culturally nuanced scenarios. Unlike existing benchmarks, PerCul is curated with input from native Persian annotators to ensure authenticity and to prevent the use of translation as a shortcut. We evaluate several state-of-the-art multilingual and Persian-specific LLMs, establishing a foundation for future research in cross-cultural NLP evaluation. Our experiments demonstrate an 11.3% gap between the best closed-source model and the layperson baseline, a gap that widens to 21.3% with the best open-weight model. You can access the dataset here: https://huggingface.co/datasets/teias-ai/percul
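As a minimal sketch of how such an evaluation can be set up, the snippet below loads PerCul from the Hugging Face Hub and assembles one multiple-choice prompt for an LLM under evaluation; the split name and field names ("test", "story", "options") are assumptions for illustration and should be checked against the dataset card.

```python
# Minimal sketch: load PerCul and build a multiple-choice prompt for an LLM.
# The split ("test") and field names ("story", "options") are assumptions,
# not the documented schema; consult the dataset card before use.
from datasets import load_dataset

dataset = load_dataset("teias-ai/percul")

sample = dataset["test"][0]                        # assumed split name
prompt = sample["story"] + "\n"                    # assumed field name
for idx, option in enumerate(sample["options"]):   # assumed field name
    prompt += f"{chr(ord('A') + idx)}) {option}\n"
prompt += "Answer:"

print(prompt)  # pass this prompt to the LLM under evaluation
```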
This paper outlines our approach to SemEval-2024 Task 4: Multilingual Detection of Persuasion Techniques in Memes, specifically addressing subtask 1. The study focuses on fine-tuning language models, including BERT, GPT-2, and RoBERTa, with experimental results showing the best performance with GPT-2. Our system submission achieved a competitive ranking of 17th out of 33 teams in subtask 1, demonstrating the effectiveness of the employed methodology for identifying persuasion techniques in meme texts.
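A minimal sketch of the kind of fine-tuning setup described, using GPT-2 as a sequence classifier over meme text; the label count and example input are placeholders, not the paper's actual configuration.

```python
# Sketch: GPT-2 as a sequence classifier for persuasion-technique labels.
# num_labels and the example text are placeholders, not the paper's setup.
from transformers import GPT2TokenizerFast, GPT2ForSequenceClassification

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token          # GPT-2 has no pad token by default

model = GPT2ForSequenceClassification.from_pretrained("gpt2", num_labels=20)
model.config.pad_token_id = tokenizer.pad_token_id

inputs = tokenizer(["example meme caption"], padding=True, return_tensors="pt")
logits = model(**inputs).logits                    # shape: (batch_size, num_labels)
print(logits.shape)
```

Note that the classification head starts from random weights, so it would still need fine-tuning on the task's training split before producing meaningful predictions.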
This study addresses a task encompassing two distinct subtasks: Sentence-puzzle and Word-puzzle. Our primary focus is the Sentence-puzzle subtask, which involves discerning the correct answer from a set of three options for a given riddle constructed from sentence fragments. We propose four methodologies for this subtask. First, we introduce a zero-shot approach leveraging the GPT-3.5 model. Additionally, we present three fine-tuning methodologies using MPNet as the underlying architecture, each employing a different loss function. We evaluate these methodologies on the designated task dataset and report the results, along with an in-depth analysis of the strengths and weaknesses of each method. Through this analysis, we aim to provide insight into the challenges inherent to this task domain.
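As one hedged example of such a fine-tuning configuration, the sketch below pairs each riddle with its correct answer and trains an MPNet encoder with a contrastive loss via sentence-transformers; the checkpoint name, the single loss shown, and the toy training pair are illustrative, not the paper's exact choices.

```python
# Sketch: fine-tune an MPNet encoder with one possible contrastive loss.
# Checkpoint, loss choice, and the toy training pair are illustrative only.
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

model = SentenceTransformer("sentence-transformers/all-mpnet-base-v2")

train_examples = [
    InputExample(texts=["riddle text ...", "correct answer ..."]),  # (anchor, positive) pair
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)

train_loss = losses.MultipleNegativesRankingLoss(model)  # one of several possible losses
model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1)
```

At inference time, one natural choice under this setup is to select the option whose embedding lies closest to the riddle's embedding.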
The successful deployment of large language models in numerous NLP tasks has spurred demand for tackling more complex tasks that were previously unattainable. SemEval-2024 Task 9 introduces the BRAINTEASER dataset, which requires intricate, human-like reasoning to solve puzzles that challenge common sense. At first glance, the riddles in the dataset may appear trivial for humans to solve. However, they demand lateral thinking, which deviates from the vertical thinking that dominates current reasoning tasks. In this paper, we examine the ability of current state-of-the-art LLMs to solve this task. Our study covers both open- and closed-source LLMs with varying numbers of parameters. Additionally, we extend the task dataset with synthetic explanations derived from the LLMs’ reasoning processes during task resolution. These could serve as a valuable resource for further expanding the task dataset and developing more robust methods for tasks that require complex reasoning. All code and datasets are available in the paper’s GitHub repository.
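A hedged sketch of how such synthetic explanations might be collected: prompt a model to choose an option and justify its choice, then keep the rationale alongside the prediction. The prompt wording, client library, and model name are assumptions for illustration, not the paper's exact setup.

```python
# Sketch: elicit a rationale for each riddle and store it as a synthetic
# explanation. Prompt wording and model name are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def explain(riddle: str, options: list[str]) -> str:
    prompt = (
        f"Riddle: {riddle}\n"
        + "\n".join(f"{i + 1}) {o}" for i, o in enumerate(options))
        + "\nPick the best option and explain your reasoning step by step."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content  # rationale kept as a synthetic explanation
```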
ImageArg is a shared task at the 10th ArgMining Workshop at EMNLP 2023. It leverages the ImageArg dataset to advance multimodal persuasiveness techniques. This challenge comprises two distinct subtasks: 1) Argumentative Stance (AS) Classification: Assessing whether a given tweet adopts an argumentative stance. 2) Image Persuasiveness (IP) Classification: Determining if the tweet image enhances the persuasive quality of the tweet. We conducted various experiments on both subtasks and ranked sixth out of the nine participating teams.
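The abstract does not name the models used, so the sketch below merely illustrates one way to frame the AS subtask as binary text classification; the checkpoint, label mapping, and example tweet are assumptions, and the untrained classification head would still need fine-tuning on the ImageArg training data.

```python
# Sketch: frame AS classification as binary text classification over tweets.
# Checkpoint, label mapping, and example tweet are illustrative; the head is
# untrained and would need fine-tuning on the ImageArg training data.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

tweet = "Stricter gun laws would reduce violence."  # example input
inputs = tokenizer(tweet, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
label = "argumentative" if logits.argmax(-1).item() == 1 else "non-argumentative"
print(label)
```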
This paper introduces a data augmentation technique for the task of detecting human values. Our approach generates additional examples using metadata that describes the labels in the datasets. We evaluated the effectiveness of our method by fine-tuning BERT and RoBERTa models on the augmented dataset and comparing their F1-scores to those obtained on the non-augmented dataset. We obtained competitive results on both the Main test set and the Nahj al-Balagha test set, ranking 14th and 7th, respectively, among the participants. We also demonstrate that incorporating our augmentation technique improves the classification performance of BERT and RoBERTa, yielding an increase of up to 10.1% in F1-score.
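A minimal sketch of the augmentation idea under stated assumptions: each label's metadata description becomes an extra (text, label) training pair that is appended to the original training set before fine-tuning. The metadata entries and field names are illustrative, not the paper's actual resources.

```python
# Sketch: turn label metadata descriptions into extra (text, label) training
# pairs. The metadata entries and field names are illustrative assumptions.
label_metadata = {
    "Self-direction": "Independent thought and action; choosing one's own goals.",
    "Benevolence": "Preserving and enhancing the welfare of close others.",
}

def augment(train_examples: list[dict]) -> list[dict]:
    extra = [{"text": desc, "label": label} for label, desc in label_metadata.items()]
    return train_examples + extra  # fine-tune BERT/RoBERTa on the combined set

augmented = augment([{"text": "We must care for our community.", "label": "Benevolence"}])
print(len(augmented))  # original example plus two metadata-derived examples -> 3
```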