Christopher M. Homan


2024

pdf bib
Diversity-Aware Annotation for Conversational AI Safety
Alicia Parrish | Vinodkumar Prabhakaran | Lora Aroyo | Mark Díaz | Christopher M. Homan | Greg Serapio-García | Alex S. Taylor | Ding Wang
Proceedings of Safety4ConvAI: The Third Workshop on Safety for Conversational AI @ LREC-COLING 2024

How people interpret content is deeply influenced by their socio-cultural backgrounds and lived experiences. This is especially crucial when evaluating AI systems for safety, where accounting for such diversity in interpretations and potential impacts on human users will make them both more successful and inclusive. While recent work has demonstrated the importance of diversity in human ratings that underlie AI pipelines, effective and efficient ways to incorporate diverse perspectives in human data annotation pipelines is still largely elusive. In this paper, we discuss the primary challenges faced in incorporating diversity into model evaluations, and propose a practical diversity-aware annotation approach. Using an existing dataset with highly parallel safety annotations, we take as a test case a policy that prioritizes recall of safety issues, and demonstrate that our diversity-aware approach can efficiently obtain a higher recall of safety issues flagged by minoritized rater groups without hurting overall precision.

2023

pdf
Findings from the Bambara - French Machine Translation Competition (BFMT 2023)
Ninoh Agostinho Da Silva | Tunde Oluwaseyi Ajayi | Alexander Antonov | Panga Azazia Kamate | Moussa Coulibaly | Mason Del Rio | Yacouba Diarra | Sebastian Diarra | Chris Emezue | Joel Hamilcaro | Christopher M. Homan | Alexander Most | Joseph Mwatukange | Peter Ohue | Michael Pham | Abdoulaye Sako | Sokhar Samb | Yaya Sy | Tharindu Cyril Weerasooriya | Yacine Zahidi | Sarah Luger
Proceedings of the Sixth Workshop on Technologies for Machine Translation of Low-Resource Languages (LoResMT 2023)

Orange Silicon Valley hosted a low-resource machine translation (MT) competition with monetary prizes. The goals of the competition were to raise awareness of the challenges in the low-resource MT domain, improve MT algorithms and data strategies, and support MT expertise development in the regions where people speak Bambara and other low-resource languages. The participants built Bambara to French and French to Bambara machine translation systems using data provided by the organizers and additional data resources shared amongst the competitors. This paper details each team’s different approaches and motivation for ongoing work in Bambara and the broader low-resource machine translation domain.