Manan Dey


2022

pdf
You reap what you sow: On the Challenges of Bias Evaluation Under Multilingual Settings
Zeerak Talat | Aurélie Névéol | Stella Biderman | Miruna Clinciu | Manan Dey | Shayne Longpre | Sasha Luccioni | Maraim Masoud | Margaret Mitchell | Dragomir Radev | Shanya Sharma | Arjun Subramonian | Jaesung Tae | Samson Tan | Deepak Tunuguntla | Oskar Van Der Wal
Proceedings of BigScience Episode #5 -- Workshop on Challenges & Perspectives in Creating Large Language Models

Evaluating bias, fairness, and social impact in monolingual language models is a difficult task. This challenge is further compounded when language modeling occurs in a multilingual context. Considering the implication of evaluation biases for large multilingual language models, we situate the discussion of bias evaluation within a wider context of social scientific research with computational work.We highlight three dimensions of developing multilingual bias evaluation frameworks: (1) increasing transparency through documentation, (2) expanding targets of bias beyond gender, and (3) addressing cultural differences that exist between languages.We further discuss the power dynamics and consequences of training large language models and recommend that researchers remain cognizant of the ramifications of developing such technologies.

pdf
PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts
Stephen Bach | Victor Sanh | Zheng Xin Yong | Albert Webson | Colin Raffel | Nihal V. Nayak | Abheesht Sharma | Taewoon Kim | M Saiful Bari | Thibault Fevry | Zaid Alyafeai | Manan Dey | Andrea Santilli | Zhiqing Sun | Srulik Ben-david | Canwen Xu | Gunjan Chhablani | Han Wang | Jason Fries | Maged Al-shaibani | Shanya Sharma | Urmish Thakker | Khalid Almubarak | Xiangru Tang | Dragomir Radev | Mike Tian-jian Jiang | Alexander Rush
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: System Demonstrations

PromptSource is a system for creating, sharing, and using natural language prompts. Prompts are functions that map an example from a dataset to a natural language input and target output. Using prompts to train and query language models is an emerging area in NLP that requires new tools that let users develop and refine these prompts collaboratively. PromptSource addresses the emergent challenges in this new setting with (1) a templating language for defining data-linked prompts, (2) an interface that lets users quickly iterate on prompt development by observing outputs of their prompts on many examples, and (3) a community-driven set of guidelines for contributing new prompts to a common pool. Over 2,000 prompts for roughly 170 datasets are already available in PromptSource. PromptSource is available at https://github.com/bigscience-workshop/promptsource.