Using LatInfLexi for an Entropy-Based Assessment of Predictability in Latin Inflection

Matteo Pellegrini


Abstract
This paper presents LatInfLexi, a large inflected lexicon of Latin providing information on all the inflected wordforms of 3,348 verbs and 1,038 nouns. After a description of the structure of the resource and some data on its size, the procedure followed to obtain the lexicon from the database of the Lemlat 3.0 morphological analyzer is detailed, as well as the choices made regarding overabundant and defective cells. The way in which the data of LatInfLexi can be exploited in order to perform a quantitative assessment of predictability in Latin verb inflection is then illustrated: results obtained by computing the conditional entropy of guessing the content of a paradigm cell assuming knowledge of one wordform or multiple wordforms are presented in turn, highlighting the descriptive and theoretical relevance of the analysis. Lastly, the paper envisages the advantages of an inclusion of LatInfLexi into the LiLa knowledge base, both for the presented resource and for the knowledge base itself.
Anthology ID:
2020.lt4hala-1.6
Volume:
Proceedings of LT4HALA 2020 - 1st Workshop on Language Technologies for Historical and Ancient Languages
Month:
May
Year:
2020
Address:
Marseille, France
Venue:
LT4HALA
Publisher:
European Language Resources Association (ELRA)
Pages:
37–46
Language:
English
URL:
https://aclanthology.org/2020.lt4hala-1.6
Cite (ACL):
Matteo Pellegrini. 2020. Using LatInfLexi for an Entropy-Based Assessment of Predictability in Latin Inflection. In Proceedings of LT4HALA 2020 - 1st Workshop on Language Technologies for Historical and Ancient Languages, pages 37–46, Marseille, France. European Language Resources Association (ELRA).
Cite (Informal):
Using LatInfLexi for an Entropy-Based Assessment of Predictability in Latin Inflection (Pellegrini, LT4HALA 2020)
PDF:
https://preview.aclanthology.org/author-url/2020.lt4hala-1.6.pdf