Elaine Uí Dhonnchadha

Also published as: E. Uí Dhonnchadha


2012

Active Learning and the Irish Treebank
Teresa Lynn | Jennifer Foster | Mark Dras | Elaine Uí Dhonnchadha
Proceedings of the Australasian Language Technology Association Workshop 2012

Irish Treebanking and Parsing: A Preliminary Evaluation
Teresa Lynn | Özlem Çetinoğlu | Jennifer Foster | Elaine Uí Dhonnchadha | Mark Dras | Josef van Genabith
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

Language resources are essential for linguistic research and the development of NLP applications. Low-density languages such as Irish lack such resources, and research in this area has consequently been limited. This paper describes the early stages in the development of new language resources for Irish, namely the first Irish dependency treebank and the first Irish statistical dependency parser. We present the methodology behind building our new treebank and the steps we take to leverage the few existing resources. We discuss language-specific choices made when defining our dependency labelling scheme, and describe interesting Irish language characteristics such as prepositional attachment, copula, and clefting. We manually develop a small treebank of 300 sentences based on an existing POS-tagged corpus and report an inter-annotator agreement of 0.7902. We train MaltParser to achieve preliminary parsing results for Irish and describe a bootstrapping approach for further stages of development.

2010

Partial Dependency Parsing for Irish
Elaine Uí Dhonnchadha | Josef Van Genabith
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

We present a partial dependency parser for Irish. Constraint Grammar (CG) based rules are used to annotate dependency relations and grammatical functions. Chunking is performed using a regular-expression grammar which operates on the dependency-tagged sentences. As this is the first implementation of a parser for unrestricted Irish text (to our knowledge), there were no guidelines or precedents available. Therefore, deciding what constitutes a syntactic unit, and how it should be annotated, accounts for a major part of the early development effort. Currently, all tokens in a sentence are tagged for grammatical function and local dependency. Long-distance dependencies, prepositional attachments, and coordination are not handled, resulting in a partial dependency analysis. Evaluations show that the partial dependency analysis achieves an f-score of 93.60% on development data and 94.28% on unseen test data, while the chunker achieves an f-score of 97.20% on development data and 93.50% on unseen test data.

2006

A Part-of-speech tagger for Irish using Finite-State Morphology and Constraint Grammar Disambiguation
E. Uí Dhonnchadha | J. Van Genabith
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

This paper describes the methodology used to develop a part-of-speech tagger for Irish, which is used to annotate a corpus of 30 million words of text with part-of-speech tags and lemmas. The tagger is evaluated using a manually disambiguated test corpus and it currently achieves 95% accuracy on unrestricted text. To our knowledge, this is the first part-of-speech tagger for Irish.

2004

CL for CALL in the Primary School
Katrina Keogh | Thomas Koller | Monica Ward | Elaine Uí Dhonnchadha | Josef van Genabith
Proceedings of the Workshop on eLearning for Computational Linguistics and Computational Linguistics for eLearning

2002

A Two-level Morphological Analyser and Generator for Irish using Finite-State Transducers
Elaine Uí Dhonnchadha
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)