Gabriel Ayoubi

2026

Formal Machine Interpretation for the Semasiographic Mixtec Codices of Precolonial and Early Colonial Mesoamerica
Christopher Driggers-Ellis | Gabriel Ayoubi | Girish Salunke | Christan Grant
Proceedings of the 4th Workshop on Advances in Language and Vision Research (ALVR)

The precolonial and early colonial Mixtec codices describe the history and stories of the region in a semasiographic medium that is full of symbolic representations and meant to be narrated. Recently, the community has introduced datasets of XML representations of related media, including Aztec codices and Mayan hieroglyphic script, in a step towards symbolic machine interpretation of these historic Mesoamerican artifacts. In this work, we propose formal symbolic machine interpretation of XML encodings representing facsimile images from the Mixtec Codex Zouche-Nuttall. We demonstrate the efficacy of symbolic machine interpretation from XML step-by-step, showing how our parser and interpreter process text capturing a scene from the Mixtec Codex Zouche-Nuttall. We hope our contribution and the example we provide motivate collaboration among the archaeological, historical, linguistic, and natural language processing research communities to apply machine interpretation to Mixtec codices and similar manuscripts.

2024

Analyzing Finetuned Vision Models for Mixtec Codex Interpretation
Alexander Webber | Zachary Sayers | Amy Wu | Elizabeth Thorner | Justin Witter | Gabriel Ayoubi | Christan Grant
Proceedings of the 4th Workshop on Natural Language Processing for Indigenous Languages of the Americas (AmericasNLP 2024)

Throughout history, pictorial record-keeping has been used to document events, stories, and concepts. A popular example of this is the Tzolk’in Maya Calendar. The pre-Columbian Mixtec society also recorded many works through graphical media called codices that depict both stories and real events. Mixtec codices are unique because the depicted scenes are highly structured within and across documents. As a first effort toward translation, we created two binary classification tasks over Mixtec codices, namely, gender and pose. The composition of figures within a codex is essential for understanding the codex’s narrative. We labeled a dataset with around 1300 figures drawn from three codices of varying qualities. We finetuned the Visual Geometry Group 16 (VGG-16) and Vision Transformer 16 (ViT-16) models, measured their performance, and compared learned features with expert opinions found in literature. The results show that when finetuned, both VGG and ViT perform well, with the transformer-based architecture (ViT) outperforming the CNN-based architecture (VGG) at higher learning rates. We are releasing this work to allow collaboration with the Mixtec community and domain scientists.

Co-authors

Alexander Webber 1

Justin Witter 1

Amy Wu 1

Venues
