Abigail Walsh


2020

pdf bib
Annotating MWEs in the Irish UD Treebank
Sarah McGuinness | Jason Phelan | Abigail Walsh | Teresa Lynn
Proceedings of the Fourth Workshop on Universal Dependencies (UDW 2020)

This paper reports on the analysis and annotation of Multiword Expressions in the Irish Universal Dependency Treebank. We provide a linguistic discussion around decisions on how to appropri- ately label Irish MWEs using the compound, flat and fixed dependency relation labels within the framework of the Universal Dependencies annotation guidelines. We discuss some nuances of the Irish language that pose challenges for assigning these UD labels and provide this report in support of the Irish UD annotation guidelines. With this we hope to ensure consistency in annotation across the dataset and provide a basis for future MWE annotation for Irish.

pdf bib
Annotating Verbal MWEs in Irish for the PARSEME Shared Task 1.2
Abigail Walsh | Teresa Lynn | Jennifer Foster
Proceedings of the Joint Workshop on Multiword Expressions and Electronic Lexicons

This paper describes the creation of two Irish corpora (labelled and unlabelled) for verbal MWEs for inclusion in the PARSEME Shared Task 1.2 on automatic identification of verbal MWEs, and the process of developing verbal MWE categories for Irish. A qualitative analysis on the two corpora is presented, along with discussion of Irish verbal MWEs.

pdf bib
Edition 1.2 of the PARSEME Shared Task on Semi-supervised Identification of Verbal Multiword Expressions
Carlos Ramisch | Agata Savary | Bruno Guillaume | Jakub Waszczuk | Marie Candito | Ashwini Vaidya | Verginica Barbu Mititelu | Archna Bhatia | Uxoa Iñurrieta | Voula Giouli | Tunga Güngör | Menghan Jiang | Timm Lichte | Chaya Liebeskind | Johanna Monti | Renata Ramisch | Sara Stymne | Abigail Walsh | Hongzhi Xu
Proceedings of the Joint Workshop on Multiword Expressions and Electronic Lexicons

We present edition 1.2 of the PARSEME shared task on identification of verbal multiword expressions (VMWEs). Lessons learned from previous editions indicate that VMWEs have low ambiguity, and that the major challenge lies in identifying test instances never seen in the training data. Therefore, this edition focuses on unseen VMWEs. We have split annotated corpora so that the test corpora contain around 300 unseen VMWEs, and we provide non-annotated raw corpora to be used by complementary discovery methods. We released annotated and raw corpora in 14 languages, and this semi-supervised challenge attracted 7 teams who submitted 9 system results. This paper describes the effort of corpus creation, the task design, and the results obtained by the participating systems, especially their performance on unseen expressions.

2019

pdf bib
Ilfhocail: A Lexicon of Irish MWEs
Abigail Walsh | Teresa Lynn | Jennifer Foster
Proceedings of the Joint Workshop on Multiword Expressions and WordNet (MWE-WN 2019)

This paper describes the categorisation of Irish MWEs, and the construction of the first version of a lexicon of Irish MWEs for NLP purposes (Ilfhocail, meaning ‘Multiwords’), collected from a number of resources. For the purposes of quality assurance, 530 entries of this lexicon were examined and manually annotated for POS information and MWE category.

2018

pdf bib
Constructing an Annotated Corpus of Verbal MWEs for English
Abigail Walsh | Claire Bonial | Kristina Geeraert | John P. McCrae | Nathan Schneider | Clarissa Somers
Proceedings of the Joint Workshop on Linguistic Annotation, Multiword Expressions and Constructions (LAW-MWE-CxG-2018)

This paper describes the construction and annotation of a corpus of verbal MWEs for English, as part of the PARSEME Shared Task 1.1 on automatic identification of verbal MWEs. The criteria for corpus selection, the categories of MWEs used, and the training process are discussed, along with the particular issues that led to revisions in edition 1.1 of the annotation guidelines. Finally, an overview of the characteristics of the final annotated corpus is presented, as well as some discussion on inter-annotator agreement.

pdf bib
Edition 1.1 of the PARSEME Shared Task on Automatic Identification of Verbal Multiword Expressions
Carlos Ramisch | Silvio Ricardo Cordeiro | Agata Savary | Veronika Vincze | Verginica Barbu Mititelu | Archna Bhatia | Maja Buljan | Marie Candito | Polona Gantar | Voula Giouli | Tunga Güngör | Abdelati Hawwari | Uxoa Iñurrieta | Jolanta Kovalevskaitė | Simon Krek | Timm Lichte | Chaya Liebeskind | Johanna Monti | Carla Parra Escartín | Behrang QasemiZadeh | Renata Ramisch | Nathan Schneider | Ivelina Stoyanova | Ashwini Vaidya | Abigail Walsh
Proceedings of the Joint Workshop on Linguistic Annotation, Multiword Expressions and Constructions (LAW-MWE-CxG-2018)

This paper describes the PARSEME Shared Task 1.1 on automatic identification of verbal multiword expressions. We present the annotation methodology, focusing on changes from last year’s shared task. Novel aspects include enhanced annotation guidelines, additional annotated data for most languages, corpora for some new languages, and new evaluation settings. Corpora were created for 20 languages, which are also briefly discussed. We report organizational principles behind the shared task and the evaluation metrics employed for ranking. The 17 participating systems, their methods and obtained results are also presented and analysed.