2025
Harmonizing Divergent Lemmatization and Part-of-Speech Tagging Practices for Latin Participles through the LiLa Knowledge Base
Marco Passarotti | Federica Iurescia | Paolo Ruffolo
Proceedings of the 19th Linguistic Annotation Workshop (LAW-XIX-2025)
This paper addresses the challenge of divergent lemmatization and part-of-speech (PoS) tagging practices for Latin participles in annotated corpora. We propose a solution through the LiLa Knowledge Base, a Linked Open Data framework designed to unify lexical and textual data for Latin. Using lemmas as the point of connection between distributed textual and lexical resources, LiLa introduces hypolemmas (secondary citation forms belonging to a word’s inflectional paradigm) as a means of reconciling divergent annotations for participles. Rather than advocating a single uniform annotation scheme, LiLa preserves each resource’s native guidelines while ensuring that users can retrieve and analyze participial data seamlessly. Through empirical assessments of multiple Latin corpora, we show how LiLa’s integration of lemmas and hypolemmas enables consistent retrieval of participle forms regardless of whether they are categorized as verbal or adjectival.
2024
Combining Universal Dependencies and FrameNet to Identify Constructions in a Poetic Corpus: Syntax and Semantics of Latin Felix and Infelix in Virgilian Poetics
Giulia Calvi | Riccardo Ginevra | Federica Iurescia
Proceedings of the 10th Italian Conference on Computational Linguistics (CLiC-it 2024)
This pilot study argues for a constructionist and computer-based approach to the syntactic and semantic analysis of a poetic corpus in Latin. We focus on the term felix and its opposite infelix, and manually annotate their occurrences in Virgil’s poems, using Universal Dependencies for the syntactic analysis and FrameNet for the semantic one. Integrating the approaches of Dependency Syntax and Construction Grammar, we analyze the linguistic contexts in which the two terms occur and identify the different “constructions” (pairings of form and function) that they instantiate. Our methodology is language-independent and has the potential to aid scholars in the comparative analysis of poetic texts, allowing for the detection of hidden parallels in the style and poetics of different texts and authors.
Overview of the EvaLatin 2024 Evaluation Campaign
Rachele Sprugnoli | Federica Iurescia | Marco Passarotti
Proceedings of the Third Workshop on Language Technologies for Historical and Ancient Languages (LT4HALA) @ LREC-COLING-2024
This paper describes the organization and the results of the third edition of EvaLatin, the campaign for the evaluation of Natural Language Processing tools for Latin. The two shared tasks proposed in EvaLatin 2024, i.e., Dependency Parsing and Emotion Polarity Detection, aim to foster research in the field of language technologies for Classical languages. The shared datasets are described, and the results obtained by the participants for each task are presented and discussed.
2023
Linking the Corpus CLaSSES to the LiLa Knowledge Base of Interoperable Linguistic Resources for Latin
Irene De Felice | Lucia Tamponi | Federica Iurescia | Marco Passarotti
Proceedings of the 9th Italian Conference on Computational Linguistics (CLiC-it 2023)
Linking the Neulateinische Wortliste to the LiLa Knowledge Base of Interoperable Resources for Latin
Federica Iurescia | Eleonora Litta | Marco Passarotti | Matteo Pellegrini | Giovanni Moretti | Paolo Ruffolo
Proceedings of the 7th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature
This paper describes the process of interlinking a lexical resource consisting of a list of more than 20,000 Neo-Latin words with other resources for Latin. The resources are made interoperable through their linking to the LiLa Knowledge Base, which applies Linguistic Linked Open Data practices and data categories to describe and publish on the Web both textual and lexical resources for the Latin language.