New Methods for Exploring Intonosyntax: Introducing an Intonosyntactic Treebank for Nigerian Pidgin

Emmett Strickland, Anne Lacheret-Dujour, Sylvain Kahane, Marc Evrard, Perrine Quennehen, Bernard Caron, Francis Egbokhare, Bruno Guillaume


Abstract
This paper presents a new phonetic resource for Nigerian Pidgin, a low-resource language of West Africa. Aiming to provide a new tool for research on intonosyntax, we have augmented an existing syntactic treebank of Nigerian Pidgin, associating each orthographically transcribed token with a series of syllable-level alignments and phonetizations. Syllables are further described using a set of continuous and discrete prosodic features. This new approach provides a simple tool for researchers to explore the prosodic characteristics of various syntactic phenomena. In this paper, we present the format of the corpus, the various features added, and several explorations that can be performed using an online interface. We also present a prosodically specified lexicon extracted using this resource. In it, each orthographic form is accompanied by the frequency of its phoneme-level variants, as well as the suprasegmental features that most frequently accompany each syllable. Finally, we present several additional case studies on how this corpus can used in the study of the language’s prosody.
Anthology ID:
2024.lrec-main.1068
Volume:
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Venues:
LREC | COLING
SIG:
Publisher:
ELRA and ICCL
Note:
Pages:
12207–12216
Language:
URL:
https://aclanthology.org/2024.lrec-main.1068
DOI:
Bibkey:
Cite (ACL):
Emmett Strickland, Anne Lacheret-Dujour, Sylvain Kahane, Marc Evrard, Perrine Quennehen, Bernard Caron, Francis Egbokhare, and Bruno Guillaume. 2024. New Methods for Exploring Intonosyntax: Introducing an Intonosyntactic Treebank for Nigerian Pidgin. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 12207–12216, Torino, Italia. ELRA and ICCL.
Cite (Informal):
New Methods for Exploring Intonosyntax: Introducing an Intonosyntactic Treebank for Nigerian Pidgin (Strickland et al., LREC-COLING 2024)
Copy Citation:
PDF:
https://preview.aclanthology.org/add_acl24_videos/2024.lrec-main.1068.pdf