Data-mining and Extraction: the gold rush of AI on Indigenous Languages

Marie-Odile Junker


Abstract
This paper aims to start a discussion on data mining and extraction of Indigenous language data by describing recent events that took place within the Algonquian Dictionaries and Language Resources common infrastructure. We raise questions about ethics, social context, vulnerability, responsibility, and societal benefits and concerns in the age of generative AI.
Anthology ID:
2024.computel-1.8
Volume:
Proceedings of the Seventh Workshop on the Use of Computational Methods in the Study of Endangered Languages
Month:
March
Year:
2024
Address:
St. Julians, Malta
Editors:
Sarah Moeller, Godfred Agyapong, Antti Arppe, Aditi Chaudhary, Shruti Rijhwani, Christopher Cox, Ryan Henke, Alexis Palmer, Daisy Rosenblum, Lane Schwartz
Venues:
ComputEL | WS
Publisher:
Association for Computational Linguistics
Pages:
52–57
URL:
https://aclanthology.org/2024.computel-1.8
Cite (ACL):
Marie-Odile Junker. 2024. Data-mining and Extraction: the gold rush of AI on Indigenous Languages. In Proceedings of the Seventh Workshop on the Use of Computational Methods in the Study of Endangered Languages, pages 52–57, St. Julians, Malta. Association for Computational Linguistics.
Cite (Informal):
Data-mining and Extraction: the gold rush of AI on Indigenous Languages (Junker, ComputEL-WS 2024)
PDF:
https://preview.aclanthology.org/emnlp-22-attachments/2024.computel-1.8.pdf