Mapping ‘when’-clauses in Latin American and Caribbean languages: an experiment in subtoken-based typology

Nilo Pedrazzini


Abstract
Languages can encode temporal subordination lexically, via subordinating conjunctions, and morphologically, by marking the relation on the predicate. Systematic cross-linguistic variation among the former can be studied using well-established token-based typological approaches to token-aligned parallel corpora. Variation among different morphological means is instead much harder to tackle and therefore more poorly understood, despite being predominant in several language groups. This paper explores variation in the expression of generic temporal subordination (‘when’-clauses) among the languages of Latin America and the Caribbean, where morphological marking is particularly common. It presents probabilistic semantic maps computed on the basis of the languages of the region, thus avoiding bias towards the many languages of the world that exclusively use lexified connectors, and incorporating associations between character n-grams and English ‘when’. The approach makes it possible to capture morphological clause-linkage devices in addition to lexified connectors, paving the way for larger-scale, strategy-agnostic analyses of typological variation in temporal subordination.
Anthology ID:
2024.americasnlp-1.4
Original:
2024.americasnlp-1.4v1
Version 2:
2024.americasnlp-1.4v2
Volume:
Proceedings of the 4th Workshop on Natural Language Processing for Indigenous Languages of the Americas (AmericasNLP 2024)
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Manuel Mager, Abteen Ebrahimi, Shruti Rijhwani, Arturo Oncevay, Luis Chiruzzo, Robert Pugh, Katharina von der Wense
Venues:
AmericasNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
24–33
URL:
https://preview.aclanthology.org/sigedu-bea-out-of-sync-correction/2024.americasnlp-1.4/
DOI:
10.18653/v1/2024.americasnlp-1.4
Cite (ACL):
Nilo Pedrazzini. 2024. Mapping ‘when’-clauses in Latin American and Caribbean languages: an experiment in subtoken-based typology. In Proceedings of the 4th Workshop on Natural Language Processing for Indigenous Languages of the Americas (AmericasNLP 2024), pages 24–33, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
Mapping ‘when’-clauses in Latin American and Caribbean languages: an experiment in subtoken-based typology (Pedrazzini, AmericasNLP 2024)
PDF:
https://preview.aclanthology.org/sigedu-bea-out-of-sync-correction/2024.americasnlp-1.4.pdf
Supplementary material:
 2024.americasnlp-1.4.SupplementaryMaterial.zip
Video:
 https://preview.aclanthology.org/sigedu-bea-out-of-sync-correction/2024.americasnlp-1.4.mp4