Learning Stress in Arabic Low-Resource Settings
Abed Qaddoumi, Jordan Kodner, Owen Rambow, Salam Khalifa, Jeffrey Heinz
Abstract
We predict lexical stress in Arabic varieties using syllable structure (a sequence of CVs, with C for consonants and V for vowels). Our task is generation: given an unstressed input, the system outputs a stress-marked word. We compare four approaches: a grammar induction algorithm (BUFIA), a transformer-based neural network (NN), a rule-based method, and a frequency baseline. The models are evaluated across several low-resource settings by varying the training data size by words, structural type, and syllable count. BUFIA outperforms the neural network, especially when data are scarce. This supports grammar induction as an interpretable and sample-efficient alternative for learning stress.
- Anthology ID:
- 2026.scil-main.24
- Volume:
- Proceedings of the Society for Computation in Linguistics 2026
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, CA
- Editors:
- Rob Voigt, Alex Warstadt, Naomi Feldman, Tal Linzen
- Venues:
- SCiL | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 262–279
- URL:
- https://preview.aclanthology.org/ingest-acl-workshops/2026.scil-main.24/
- Cite (ACL):
- Abed Qaddoumi, Jordan Kodner, Owen Rambow, Salam Khalifa, and Jeffrey Heinz. 2026. Learning Stress in Arabic Low-Resource Settings. In Proceedings of the Society for Computation in Linguistics 2026, pages 262–279, San Diego, CA. Association for Computational Linguistics.
- Cite (Informal):
- Learning Stress in Arabic Low-Resource Settings (Qaddoumi et al., SCiL 2026)
- PDF:
- https://preview.aclanthology.org/ingest-acl-workshops/2026.scil-main.24.pdf