Jordan Clive
2025
Aurora-M: Open Source Continual Pre-training for Multilingual Language and Code
Taishi Nakamura | Mayank Mishra | Simone Tedeschi | Yekun Chai | Jason T. Stillerman | Felix Friedrich | Prateek Yadav | Tanmay Laud | Vu Minh Chien | Terry Yue Zhuo | Diganta Misra | Ben Bogin | Xuan-Son Vu | Marzena Karpinska | Arnav Varma Dantuluri | Wojciech Kusa | Tommaso Furlanello | Rio Yokota | Niklas Muennighoff | Suhas Pai | Tosin Adewumi | Veronika Laippala | Xiaozhe Yao | Adalberto Barbosa Junior | Aleksandr Drozd | Jordan Clive | Kshitij Gupta | Liangyu Chen | Qi Sun | Ken Tsui | Nour Moustafa-Fahmy | Nicolo Monti | Tai Dang | Ziyang Luo | Tien-Tung Bui | Roberto Navigli | Virendra Mehta | Matthew Blumberg | Victor May | Hiep Nguyen | Sampo Pyysalo
Proceedings of the 31st International Conference on Computational Linguistics: Industry Track
Pretrained language models are an integral part of AI applications, but their high computational cost for training limits accessibility. Initiatives such as Bloom and StarCoder aim to democratize access to pretrained models for collaborative community development. Despite these efforts, such models encounter challenges such as limited multilingual capabilities, risks of catastrophic forgetting during continual pretraining, and the high costs of training models from scratch, alongside the need to align with AI safety standards and regulatory frameworks. This paper presents Aurora-M, a 15B parameter multilingual open-source model trained on English, Finnish, Hindi, Japanese, Vietnamese, and code. Continually pretrained from StarCoderPlus on 435B additional tokens, Aurora-M surpasses 2T tokens in total training token count. It is the first open-source multilingual model fine-tuned on human-reviewed safety instructions, thus aligning its development not only with conventional red-teaming considerations, but also with the specific concerns articulated in the Biden-Harris Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence. We evaluate Aurora-M across a wide range of tasks and languages, showcasing its robustness against catastrophic forgetting and its superior performance in multilingual settings, particularly in safety evaluations. We open-source Aurora-M and its variants to encourage responsible open-source development of large language models at https://huggingface.co/aurora-m.
SHADES: Towards a Multilingual Assessment of Stereotypes in Large Language Models
Margaret Mitchell | Giuseppe Attanasio | Ioana Baldini | Miruna Clinciu | Jordan Clive | Pieter Delobelle | Manan Dey | Sil Hamilton | Timm Dill | Jad Doughman | Ritam Dutt | Avijit Ghosh | Jessica Zosa Forde | Carolin Holtermann | Lucie-Aimée Kaffee | Tanmay Laud | Anne Lauscher | Roberto L Lopez-Davila | Maraim Masoud | Nikita Nangia | Anaelia Ovalle | Giada Pistilli | Dragomir Radev | Beatrice Savoldi | Vipul Raheja | Jeremy Qin | Esther Ploeger | Arjun Subramonian | Kaustubh Dhole | Kaiser Sun | Amirbek Djanibekov | Jonibek Mansurov | Kayo Yin | Emilio Villa Cueva | Sagnik Mukherjee | Jerry Huang | Xudong Shen | Jay Gala | Hamdan Al-Ali | Tair Djanibekov | Nurdaulet Mukhituly | Shangrui Nie | Shanya Sharma | Karolina Stanczak | Eliza Szczechla | Tiago Timponi Torrent | Deepak Tunuguntla | Marcelo Viridiano | Oskar Van Der Wal | Adina Yakefu | Aurélie Névéol | Mike Zhang | Sydney Zink | Zeerak Talat
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Large Language Models (LLMs) reproduce and exacerbate the social biases present in their training data, and resources to quantify this issue are limited. While research has attempted to identify and mitigate such biases, most efforts have been concentrated around English, lagging behind the rapid advancement of LLMs in multilingual settings. In this paper, we introduce SHADES, a new multilingual parallel dataset designed for examining culturally specific stereotypes that may be learned by LLMs. The dataset includes stereotypes from 20 regions around the world and 16 languages, spanning multiple identity categories subject to discrimination worldwide. We demonstrate its utility in a series of exploratory evaluations for both “base” and “instruction-tuned” language models. Our results suggest that stereotypes are consistently reflected across models and languages, with some languages and models exhibiting much stronger stereotype biases than others.
2022
GEMv2: Multilingual NLG Benchmarking in a Single Line of Code
Sebastian Gehrmann | Abhik Bhattacharjee | Abinaya Mahendiran | Alex Wang | Alexandros Papangelis | Aman Madaan | Angelina Mcmillan-major | Anna Shvets | Ashish Upadhyay | Bernd Bohnet | Bingsheng Yao | Bryan Wilie | Chandra Bhagavatula | Chaobin You | Craig Thomson | Cristina Garbacea | Dakuo Wang | Daniel Deutsch | Deyi Xiong | Di Jin | Dimitra Gkatzia | Dragomir Radev | Elizabeth Clark | Esin Durmus | Faisal Ladhak | Filip Ginter | Genta Indra Winata | Hendrik Strobelt | Hiroaki Hayashi | Jekaterina Novikova | Jenna Kanerva | Jenny Chim | Jiawei Zhou | Jordan Clive | Joshua Maynez | João Sedoc | Juraj Juraska | Kaustubh Dhole | Khyathi Raghavi Chandu | Laura Perez Beltrachini | Leonardo F . R. Ribeiro | Lewis Tunstall | Li Zhang | Mahim Pushkarna | Mathias Creutz | Michael White | Mihir Sanjay Kale | Moussa Kamal Eddine | Nico Daheim | Nishant Subramani | Ondrej Dusek | Paul Pu Liang | Pawan Sasanka Ammanamanchi | Qi Zhu | Ratish Puduppully | Reno Kriz | Rifat Shahriyar | Ronald Cardenas | Saad Mahamood | Salomey Osei | Samuel Cahyawijaya | Sanja Štajner | Sebastien Montella | Shailza Jolly | Simon Mille | Tahmid Hasan | Tianhao Shen | Tosin Adewumi | Vikas Raunak | Vipul Raheja | Vitaly Nikolaev | Vivian Tsai | Yacine Jernite | Ying Xu | Yisi Sang | Yixin Liu | Yufang Hou
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
Evaluations in machine learning rarely use the latest metrics, datasets, or human evaluation, in favor of remaining compatible with prior work. This compatibility, often facilitated through leaderboards, thus leads to outdated but standardized evaluation practices. We posit that the standardization is taking place in the wrong spot: evaluation infrastructure should enable researchers to use the latest methods, and what should be standardized instead is how new evaluation advances are incorporated. We introduce GEMv2, the new version of the Generation, Evaluation, and Metrics Benchmark, which uses a modular infrastructure for dataset, model, and metric developers to benefit from each other’s work. GEMv2 supports 40 documented datasets in 51 languages and ongoing online evaluation for all datasets, and our interactive tools make it easier to add new datasets to the living benchmark.
Control Prefixes for Parameter-Efficient Text Generation
Jordan Clive | Kris Cao | Marek Rei
Proceedings of the Second Workshop on Natural Language Generation, Evaluation, and Metrics (GEM)
Prefix-tuning is a parameter-efficient and powerful technique for adapting a pre-trained language model to a downstream application. However, it uses the same dataset-level tuned set of parameters for all examples in the dataset. We extend the framework with a dynamic method, Control Prefixes, which allows for the effective inclusion of input-dependent information, thereby demonstrating how prefix-tuning can be used for controlled text generation tasks. The method incorporates attribute-level learnable representations into different layers of a pre-trained Transformer, enabling the generated text to be guided in a particular direction. We provide a systematic evaluation of the technique and apply it to five datasets from the GEM benchmark for natural language generation (NLG). Using only 0.1–2% additional trainable parameters, we show that Control Prefixes can even outperform full fine-tuning methods, and present state-of-the-art results on several data-to-text datasets, including WebNLG. We also examine the common case where input-dependent information is unavailable at test time and show that Control Prefixes excels in this setting as well.
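The core idea in the abstract above, a shared task-level prefix combined with an input-dependent, attribute-level prefix, can be sketched in a few lines. This is a toy illustration, not the authors' implementation: the attribute labels (`seen_category`, `unseen_category`), the matrix sizes, and the flat list representation are all hypothetical stand-ins for the learnable key/value prefixes that would be prepended inside each attention layer of a real Transformer.

```python
import random

random.seed(0)
D_MODEL, PREFIX_LEN = 16, 4  # toy dimensions for illustration

def rand_matrix(rows, cols):
    """A stand-in for a learnable parameter matrix (rows x cols)."""
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

# Shared task-level prefix, as in plain prefix-tuning.
task_prefix = rand_matrix(PREFIX_LEN, D_MODEL)

# One learnable prefix per controllable attribute (hypothetical labels
# for a data-to-text setting such as WebNLG category control).
attribute_prefixes = {
    "seen_category": rand_matrix(PREFIX_LEN, D_MODEL),
    "unseen_category": rand_matrix(PREFIX_LEN, D_MODEL),
}

def build_prefix(attribute):
    """Concatenate the shared task prefix with the input-dependent
    attribute prefix; in a real model the result would be prepended to
    the key/value states of each attention layer."""
    return task_prefix + attribute_prefixes[attribute]

prefix = build_prefix("seen_category")
print(len(prefix), len(prefix[0]))  # 8 16: 2 * PREFIX_LEN rows of D_MODEL features
```

Because only the prefix matrices are trained while the base model stays frozen, the extra parameter count scales with `PREFIX_LEN * D_MODEL * num_attributes` rather than with the model size, which is consistent with the 0.1–2% figure the abstract reports.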
Co-authors
- Tosin Adewumi 2
- Kaustubh Dhole 2
- Tanmay Laud 2
- Dragomir Radev 2
- Vipul Raheja 2
- Hamdan Al-Ali 1
- Pawan Sasanka Ammanamanchi 1
- Giuseppe Attanasio 1
- Ioana Baldini 1
- Chandra Bhagavatula 1
- Abhik Bhattacharjee 1
- Matthew Blumberg 1
- Ben Bogin 1
- Bernd Bohnet 1
- Tien-Tung Bui 1
- Samuel Cahyawijaya 1
- Kris Cao 1
- Ronald Cardenas 1
- Yekun Chai 1
- Khyathi Raghavi Chandu 1
- Liang-Yu Chen 1
- Vu Minh Chien 1
- Jenny Chim 1
- Elizabeth Clark 1
- Miruna Clinciu 1
- Mathias Creutz 1
- Nico Daheim 1
- Tai Dang 1
- Arnav Varma Dantuluri 1
- Pieter Delobelle 1
- Daniel Deutsch 1
- Manan Dey 1
- Timm Dill 1
- Amirbek Djanibekov 1
- Jad Doughman 1
- Aleksandr Drozd 1
- Esin Durmus 1
- Ritam Dutt 1
- Ondřej Dušek 1
- Moussa Kamal Eddine 1
- Jessica Zosa Forde 1
- Felix Friedrich 1
- Tommaso Furlanello 1
- Jay Gala 1
- Cristina Garbacea 1
- Sebastian Gehrmann 1
- Avijit Ghosh 1
- Filip Ginter 1
- Dimitra Gkatzia 1
- Kshitij Gupta 1
- Sil Hamilton 1
- Tahmid Hasan 1
- Hiroaki Hayashi 1
- Carolin Holtermann 1
- Yufang Hou 1
- Jerry Huang 1
- Yacine Jernite 1
- Di Jin 1
- Shailza Jolly 1
- Adalberto Barbosa Junior 1
- Juraj Juraska 1
- Lucie-Aimée Kaffee 1
- Mihir Sanjay Kale 1
- Jenna Kanerva 1
- Marzena Karpinska 1
- Reno Kriz 1
- Wojciech Kusa 1
- Faisal Ladhak 1
- Veronika Laippala 1
- Anne Lauscher 1
- Paul Pu Liang 1
- Yixin Liu 1
- Roberto L Lopez-Davila 1
- Ziyang Luo 1
- Aman Madaan 1
- Saad Mahamood 1
- Abinaya Mahendiran 1
- Jonibek Mansurov 1
- Maraim Masoud 1
- Victor May 1
- Joshua Maynez 1
- Angelina McMillan-Major 1
- Virendra Mehta 1
- Simon Mille 1
- Mayank Mishra 1
- Diganta Misra 1
- Margaret Mitchell 1
- Sebastien Montella 1
- Nicolo Monti 1
- Nour Moustafa-Fahmy 1
- Niklas Muennighoff 1
- Sagnik Mukherjee 1
- Nurdaulet Mukhituly 1
- Taishi Nakamura 1
- Nikita Nangia 1
- Roberto Navigli 1
- Aurelie Neveol 1
- Hiep Nguyen 1
- Shangrui Nie 1
- Vitaly Nikolaev 1
- Jekaterina Novikova 1
- Salomey Osei 1
- Anaelia Ovalle 1
- Suhas Pai 1
- Alexandros Papangelis 1
- Laura Perez-Beltrachini 1
- Giada Pistilli 1
- Esther Ploeger 1
- Ratish Puduppully 1
- Mahim Pushkarna 1
- Sampo Pyysalo 1
- Jeremy Qin 1
- Vikas Raunak 1
- Marek Rei 1
- Leonardo F. R. Ribeiro 1
- Yisi Sang 1
- Beatrice Savoldi 1
- João Sedoc 1
- Rifat Shahriyar 1
- Shanya Sharma 1
- Tianhao Shen 1
- Xudong Shen 1
- Anna Shvets 1
- Karolina Stanczak 1
- Jason T. Stillerman 1
- Hendrik Strobelt 1
- Nishant Subramani 1
- Arjun Subramonian 1
- Qi Sun 1
- Kaiser Sun 1
- Eliza Szczechla 1
- Tair Djanibekov 1
- Zeerak Talat 1
- Simone Tedeschi 1
- Craig Thomson 1
- Tiago Timponi Torrent 1
- Vivian Tsai 1
- Ken Tsui 1
- Lewis Tunstall 1
- Deepak Tunuguntla 1
- Ashish Upadhyay 1
- Oskar Van Der Wal 1
- Emilio Villa-Cueva 1
- Marcelo Viridiano 1
- Xuan-Son Vu 1
- Alex Wang 1
- Dakuo Wang 1
- Michael White 1
- Bryan Wilie 1
- Genta Indra Winata 1
- Deyi Xiong 1
- Ying Xu 1
- Prateek Yadav 1
- Adina Yakefu 1
- Bingsheng Yao 1
- Xiaozhe Yao 1
- Kayo Yin 1
- Rio Yokota 1
- Chaobin You 1
- Li Zhang 1
- Mike Zhang 1
- Jiawei Zhou 1
- Qi Zhu 1
- Terry Yue Zhuo 1
- Sydney Zink 1
- Sanja Štajner 1