Alex Xijie Lu
2025
Investigating Dictionary Expansion for Video-based Sign Language Dictionaries
Aashaka Desai | Daniela Massiceti | Richard Ladner | Hal Daumé III | Danielle Bragg | Alex Xijie Lu
Findings of the Association for Computational Linguistics: EMNLP 2025
Like most languages, sign languages evolve over time. It is important that sign language dictionaries’ vocabularies are updated over time to reflect these changes, such as by adding new signs. However, most dictionary retrieval methods based upon machine learning models only work with fixed vocabularies, and it is unclear how they might support dictionary expansion without retraining. In this work, we explore the feasibility of dictionary expansion for sign language dictionaries using a simple representation-based method. We explore a variety of dictionary expansion scenarios, e.g., varying the number of signs added as well as the amount of data for these newly added signs. Through our results, we show how performance varies significantly across different scenarios, many of which are reflective of real-world data challenges. Our findings offer implications for the development and maintenance of video-based sign language dictionaries, and highlight directions for future research on dictionary expansion.
2024
ASL STEM Wiki: Dataset and Benchmark for Interpreting STEM Articles
Kayo Yin | Chinmay Singh | Fyodor O Minakov | Vanessa Milan | Hal Daumé III | Cyril Zhang | Alex Xijie Lu | Danielle Bragg
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Deaf and hard-of-hearing (DHH) students face significant barriers in accessing science, technology, engineering, and mathematics (STEM) education, notably due to the scarcity of STEM resources in signed languages. To help address this, we introduce ASL STEM Wiki: a parallel corpus of 254 Wikipedia articles on STEM topics in English, interpreted into over 300 hours of American Sign Language (ASL). ASL STEM Wiki is the first continuous signing dataset focused on STEM, facilitating the development of AI resources for STEM education in ASL. We identify several use cases of ASL STEM Wiki with human-centered applications. For example, because this dataset highlights the frequent use of fingerspelling for technical concepts, which inhibits DHH students’ ability to learn, we develop models to identify fingerspelled words, which can later be used to query for appropriate ASL signs to suggest to interpreters.
Co-authors
- Danielle Bragg 2
- Hal Daumé III 2
- Aashaka Desai 1
- Richard Ladner 1
- Daniela Massiceti 1