Costanza Marini


CroaTPAS: A Survey-based Evaluation
Costanza Marini
Proceedings of the 18th Joint ACL - ISO Workshop on Interoperable Semantic Annotation within LREC2022

The Croatian Typed Predicate Argument Structures resource (CroaTPAS) is a Croatian/English bilingual digital dictionary of corpus-derived verb valency structures whose argument slots have been annotated with Semantic Type labels following the CPA methodology. CroaTPAS is tailor-made to represent verb polysemy and currently contains 180 Croatian verbs for a total of 683 different verb senses. To evaluate the resource in terms of both the identified Croatian verb senses and the English descriptions explaining them, an online survey based on a multiple-choice sense disambiguation task was devised, pilot tested and distributed among respondents following a snowball sampling methodology. Answers from 30 respondents were collected and compared against a yardstick set of answers in line with CroaTPAS’s sense distinctions. The Jaccard similarity index was used as a measure of agreement. Since the multiple-choice items that respondents answered were based on a representative selection of CroaTPAS verbs, the results can be generalized to the resource as a whole.
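
As an illustration of the agreement measure named in the abstract, the Jaccard similarity index of two sets A and B is |A ∩ B| / |A ∪ B|. The sketch below shows how one respondent's answer to a single multiple-choice item could be scored against the yardstick answer. The option labels, the function name and the handling of two empty sets are assumptions made for illustration; the abstract does not describe the survey's actual scoring procedure.

```python
def jaccard(selected, yardstick):
    """Jaccard similarity index between two collections of chosen options."""
    a, b = set(selected), set(yardstick)
    if not a and not b:
        # Assumed convention: two empty answer sets count as full agreement.
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical item: the yardstick marks two options as correct,
# while the respondent ticked those two plus one extra option.
yardstick_answer = {"sense_1", "sense_3"}
respondent_answer = {"sense_1", "sense_2", "sense_3"}

score = jaccard(respondent_answer, yardstick_answer)  # 2 shared / 3 in total
```

A score of 1 means the respondent chose exactly the yardstick options, and 0 means there was no overlap at all.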


Annotating Croatian Semantic Type Coercions in CROATPAS
Costanza Marini | Elisabetta Jezek
16th Joint ACL - ISO Workshop on Interoperable Semantic Annotation PROCEEDINGS

This short research paper presents the results of a corpus-based metonymy annotation exercise on a sample of 101 Croatian verb entries – corresponding to 457 patterns and over 20,000 corpus lines – taken from CROATPAS (Marini & Ježek, 2019), a digital repository of verb argument structures manually annotated with Semantic Type labels on their argument slots following a methodology inspired by Corpus Pattern Analysis (Hanks, 2004 & 2013; Hanks & Pustejovsky, 2005). CROATPAS will be made available online in 2020. Semantic Type labelling is well-suited to annotating not only verbal polysemy, but also metonymic shifts in verb argument combinations, which in Generative Lexicon (Pustejovsky, 1995 & 1998; Pustejovsky & Ježek, 2008) are called Semantic Type coercions. From a sublexical point of view, Semantic Type coercions can be considered exploitations of one of the qualia roles of those Semantic Types which do not satisfy a verb’s selectional requirements, but do not trigger a different verb sense. Overall, we were able to identify 62 different Semantic Type coercions linked to 1,052 metonymic corpus lines. In the future, we plan to compare our results with those from an equivalent study on Italian verbs (Romani, 2020) for a crosslinguistic analysis of metonymic shifts.