Marta Coll-Florit
The main goal of this project is to explore techniques for training NMT systems for Spanish, Portuguese, Catalan, Galician, Asturian, Aragonese and Aranese. These languages belong to the same Romance family, but they differ greatly in the linguistic resources available. Asturian, Aragonese and Aranese can be considered low-resource languages. These characteristics make this setting well suited to exploring training techniques for low-resource languages, such as transfer learning and multilingual systems. The first months of the project have been dedicated to compiling monolingual and parallel corpora for Asturian, Aragonese and Aranese.
In this paper we present a novel resource-inexpensive architecture for metaphor detection based on a residual bidirectional long short-term memory network and conditional random fields. Current approaches to this task rely on deep neural networks to identify metaphorical words, using additional linguistic features or word embeddings. We evaluate our proposed approach using different model configurations that combine embeddings, part-of-speech tags, and semantically disambiguated synonym sets. The evaluation was performed using the training and testing partitions of the VU Amsterdam Metaphor Corpus. We use this evaluation method as a reference to compare our results with other current neural approaches to this task that implement similar architectures and features and that were evaluated on the same corpus. Results show that our system achieves competitive results with a simpler architecture than previous approaches.
We present the results of an agreement task carried out in the framework of the KNOW Project, consisting of the manual annotation of an agreement sample of 50 sentences extracted from the SenSem corpus. Disambiguation was carried out for all nouns, proper nouns and adjectives in the sample, all of which were assigned EuroWordNet (EWN) synsets. As a result of the task, Spanish WN has been shown to exhibit 1) lack of explanatory clarity (it does not define word meanings, but glosses and exemplifies them instead; it does not systematically encode metaphoric meanings, either); 2) structural inadequacy (some words appear as hyponyms of another sense of the same word; sometimes a general sense and a specific one related to the same concept even coexist in Spanish WN with no structural link between them; hyperonymy relationships have been detected that are likely to raise doubts for human annotators; cases of auto-hyponymy can even be found); 3) cross-linguistic inconsistency (there are concepts in English EWN whose lexical equivalent is missing in Spanish WN; glosses in one language more often than not contradict or diverge from glosses in another language).