Grammar as Control: Modular Language Generation for the Long Tail

Ndapa Nakashole


Abstract
Large language models (LLMs) can, in principle, bootstrap language technologies for long-tail languages due to their pattern recognition capabilities. Yet in practice, without structured guidance, they produce narrow, unrepresentative samples that fail to cover the morphosyntactic space of typologically underrepresented languages. We propose Modular Typology-Informed Generation (mTIG), a prompting framework that transforms descriptive grammars into explicit control mechanisms that guide LLMs to generate typologically balanced synthetic data for downstream training. mTIG decomposes grammars into modular grammar slices, each targeting a specific morphosyntactic phenomenon (e.g., passive voice, causative morphology). Across three low-resource languages, mTIG improves typological entropy by up to 19% and yields a "student-beats-teacher" effect, where distilled models outperform the source LLM by up to +20 chrF in machine translation. These findings show that grammar-as-control can construct training corpora wherever formal linguistic descriptions exist.
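The abstract reports gains in "typological entropy" without defining it here. As a rough illustration only, one plausible reading is the Shannon entropy of the distribution of morphosyntactic phenomenon labels across generated sentences. The sketch below assumes that reading; the function name `typological_entropy` and the toy label lists are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def typological_entropy(labels):
    """Shannon entropy (in bits) of a list of phenomenon labels.

    Assumption: each generated sentence carries one label naming the
    morphosyntactic phenomenon it exhibits (e.g. "passive").
    """
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in counts.values():
        p = count / total
        entropy -= p * math.log2(p)
    return entropy


# Hypothetical toy samples: one dominated by a single phenomenon,
# one spread more evenly across three phenomena.
skewed = ["active"] * 8 + ["passive"] + ["causative"]
balanced = ["active"] * 4 + ["passive"] * 3 + ["causative"] * 3

print(f"skewed:   {typological_entropy(skewed):.3f} bits")
print(f"balanced: {typological_entropy(balanced):.3f} bits")
```

Under this reading, a sample that spreads its sentences more evenly over phenomena scores higher, which is the kind of balance the grammar slices are meant to encourage. The paper's actual metric may be defined differently.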
Anthology ID:
2026.acl-long.1725
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
37192–37222
URL:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.1725/
Cite (ACL):
Ndapa Nakashole. 2026. Grammar as Control: Modular Language Generation for the Long Tail. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 37192–37222, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Grammar as Control: Modular Language Generation for the Long Tail (Nakashole, ACL 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.1725.pdf
Checklist:
 2026.acl-long.1725.checklist.pdf