Somayeh Bakhshaei




2025

ELAB: Extensive LLM Alignment Benchmark in Persian Language
Zahra Pourbahman | Fatemeh Rajabi | Mohammadhossein Sadeghi | Omid Ghahroodi | Somayeh Bakhshaei | Arash Amini | Reza Kazemi | Mahdieh Soleymani Baghshah
Proceedings of the Fourth Workshop on Generation, Evaluation and Metrics (GEM²)

This paper presents a comprehensive evaluation framework for aligning Persian Large Language Models (LLMs) with critical ethical dimensions, including safety, fairness, and social norms. It addresses the gaps in existing LLM evaluation frameworks by adapting them to Persian linguistic and cultural contexts. The benchmark comprises three types of Persian-language data: (i) translated data, (ii) new data generated synthetically, and (iii) new naturally collected data. We translate Anthropic Red Teaming data, AdvBench, HarmBench, and DecodingTrust into Persian. Furthermore, we create ProhibiBench-fa, SafeBench-fa, FairBench-fa, and SocialBench-fa as new datasets to address harmful and prohibited content in indigenous culture. Moreover, we collect an extensive dataset, GuardBench-fa, to capture Persian cultural norms. By combining these datasets, our work establishes a unified framework for evaluating Persian LLMs, offering a new approach to culturally grounded alignment evaluation. We systematically evaluate Persian LLMs across three alignment aspects: safety (avoiding harmful content), fairness (mitigating biases), and social norms (adhering to culturally accepted behaviors). We present a publicly available leaderboard that benchmarks Persian LLMs with respect to safety, fairness, and social norms.

2015

A Generative Model for Extracting Parallel Fragments from Comparable Documents
Somayeh Bakhshaei | Shahram Khadivi | Reza Safabakhsh
Proceedings of the Eighth Workshop on Building and Using Comparable Corpora

AUT Document Alignment Framework for BUCC Workshop Shared Task
Atefeh Zafarian | Amir Pouya Agha Sadeghi | Fatemeh Azadi | Sonia Ghiasifard | Zeinab Ali Panahloo | Somayeh Bakhshaei | Seyyed Mohammad Mohammadzadeh Ziabary
Proceedings of the Eighth Workshop on Building and Using Comparable Corpora

2013

Using Context Vectors in Improving a Machine Translation System with Bridge Language
Samira Tofighi Zahabi | Somayeh Bakhshaei | Shahram Khadivi
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)