2022
Use of Transformer-Based Models for Word-Level Transliteration of the Book of the Dean of Lismore
Edward Gow-Smith | Mark McConville | William Gillies | Jade Scott | Roibeard Ó Maolalaigh
Proceedings of the 4th Celtic Language Technology Workshop within LREC2022
The Book of the Dean of Lismore (BDL) is a 16th-century Scottish Gaelic manuscript written in a non-standard orthography. In this work, we outline the problem of transliterating the text of the BDL into a standardised orthography, and perform exploratory experiments using Transformer-based models for this task. In particular, we focus on the task of word-level transliteration, and achieve a character-level BLEU score of 54.15 with our best model, a BART architecture pre-trained on the text of Scottish Gaelic Wikipedia and then fine-tuned on around 2,000 word-level parallel examples. Our initial experiments give promising results, but we highlight the shortcomings of our model, and discuss directions for future work.
2008
‘Deep’ Grammatical Relations for Semantic Interpretation
Mark McConville | Myroslava O. Dzikovska
Coling 2008: Proceedings of the workshop on Cross-Framework and Cross-Domain Parser Evaluation
Evaluating Complement-Modifier Distinctions in a Semantically Annotated Corpus
Mark McConville | Myroslava O. Dzikovska
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
We evaluate the extent to which the distinction between semantically core and non-core dependents as used in the FrameNet corpus corresponds to the traditional distinction between syntactic complements and modifiers of a verb, for the purposes of harvesting a wide-coverage verb lexicon from FrameNet for use in deep linguistic processing applications. We use the VerbNet verb database as our gold standard for making judgements about complement-hood, in conjunction with our own intuitions in cases where VerbNet is incomplete. We conclude that there is enough agreement between the two notions (0.85) to make practical the simple expedient of equating core PP dependents in FrameNet with PP complements in our lexicon. Doing so means that we lose around 13% of PP complements, whilst around 9% of the PP dependents left in the lexicon are not complements.
2007
Extracting a Verb Lexicon for Deep Parsing from FrameNet
Mark McConville | Myroslava O. Dzikovska
ACL 2007 Workshop on Deep Linguistic Processing
2006
Inheritance and the CCG Lexicon
Mark McConville
11th Conference of the European Chapter of the Association for Computational Linguistics