Azul Alpizar-Velazquez
2026
Schema-Constrained Image Captioning for Five Low-Resource Indigenous Languages
Diego Cuadros | Nicholas Leeds | Amanda Avalos | Azul Alpizar-Velazquez | Jared Coleman | Faezeh Dehghan Tarzjani | Bhaskar Krishnamachari
Proceedings of the Sixth Workshop on NLP for Indigenous Languages of the Americas (AmericasNLP)
We describe our submission to all five tracks of the AmericasNLP 2026 Shared Task on Cultural Image Captioning: Bribri, Guaraní, Yucatec Maya, Orizaba Nahuatl, and Wixárika. Our system is an LLM-assisted rule-based machine translation (LLM-RBMT) captioner. For each language, a coding agent reads the small development split and open-web linguistic references and writes a complete Pydantic grammar package with a closed vocabulary. At inference time, a vision–language model sees the image and the schema and emits a structured SentenceList under constrained decoding; a deterministic Python renderer then produces the surface string. The model never generates target-language tokens. The same architecture handles all five languages with no fine-tuning, no parallel corpora, and no human edits to the generated packages. On the official test set, the system placed first on human evaluation in Bribri and Orizaba Nahuatl and third in Yucatec Maya, and first on ChrF++ in Yucatec Maya. We suggest that a strength of the approach is that outputs are restricted to simple sentences that are grammatically correct by construction, modulo the correctness of the generated grammar itself.
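The schema-then-render split described above can be illustrated with a minimal sketch. It is not the submitted system: it uses standard-library dataclasses and enums as a dependency-free stand-in for the Pydantic models, the names `Noun`, `Verb`, `Sentence`, `parse`, and `render` are hypothetical, the subject-object-verb order is an assumed toy rule, and the surface forms are placeholder strings rather than words of any of the five languages. Only `SentenceList` is a name taken from the abstract.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Noun(Enum):
    # Closed vocabulary. Values are PLACEHOLDER surface forms, not real words.
    WOMAN = "NOUN_woman"
    MAIZE = "NOUN_maize"
    HOUSE = "NOUN_house"


class Verb(Enum):
    # Placeholder surface forms, as above.
    GRIND = "VERB_grind"
    STAND = "VERB_stand"


@dataclass(frozen=True)
class Sentence:
    subject: Noun
    verb: Verb
    obj: Optional[Noun] = None  # None for an intransitive clause


@dataclass(frozen=True)
class SentenceList:
    sentences: List[Sentence]


def parse(raw: dict) -> SentenceList:
    """Validate structured model output against the closed vocabulary.

    Looking up a name that is not in the vocabulary raises KeyError, so
    out-of-vocabulary items never reach the renderer.
    """
    parsed = []
    for item in raw["sentences"]:
        obj = Noun[item["obj"]] if item.get("obj") else None
        parsed.append(Sentence(Noun[item["subject"]], Verb[item["verb"]], obj))
    return SentenceList(parsed)


def render(sentence_list: SentenceList) -> str:
    """Deterministically map the structure to a surface string.

    Assumed toy rule: subject, then optional object, then verb.
    """
    rendered = []
    for s in sentence_list.sentences:
        parts = [s.subject.value]
        if s.obj is not None:
            parts.append(s.obj.value)
        parts.append(s.verb.value)
        rendered.append(" ".join(parts) + ".")
    return " ".join(rendered)


# Stand-in for what a vision-language model might emit under constrained decoding.
raw_output = {
    "sentences": [
        {"subject": "WOMAN", "verb": "GRIND", "obj": "MAIZE"},
        {"subject": "HOUSE", "verb": "STAND"},
    ]
}
caption = render(parse(raw_output))
```

In this sketch the model side only ever selects vocabulary keys and fills slots; every surface string comes from the renderer, which is the property the abstract refers to as correctness by construction, subject to the correctness of the grammar itself.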