2020
Representing Multiword Term Variation in a Terminological Knowledge Base: a Corpus-Based Study
Pilar León-Araúz | Arianne Reimerink | Melania Cabezas-García
Proceedings of the Twelfth Language Resources and Evaluation Conference
In scientific and technical communication, multiword terms (MWTs) are the most frequent type of lexical unit. Rendering them in another language is not an easy task due to their cognitive complexity, the proliferation of different forms, and their unsystematic representation in terminographic resources. This often results in a broad spectrum of translations for multiword terms, which also foster term variation because they consist of two or more constituents. In this study, we carried out a quantitative and qualitative analysis of Spanish translation variants of a set of environment-related concepts by evaluating equivalents in three parallel corpora, two comparable corpora and two terminological resources. Our results showed that MWTs exhibit a significant degree of term variation of different kinds. These findings were used to establish a set of criteria according to which term variants should be selected, organized and described in terminological knowledge bases.
Extraction of Hyponymic Relations in French with Knowledge-Pattern-Based Word Sketches
Antonio San Martín | Catherine Trekker | Pilar León-Araúz
Proceedings of the Twelfth Language Resources and Evaluation Conference
Hyponymy is the cornerstone of taxonomies and concept hierarchies. However, the extraction of hypernym-hyponym pairs from a corpus can be time-consuming, and reconstructing the hierarchical network of a domain is often an extremely complex process. This paper presents the development and evaluation of the French EcoLexicon Semantic Sketch Grammar (ESSG-fr), a French hyponymic sketch grammar for Sketch Engine based on knowledge patterns. It offers a user-friendly way of extracting hyponymic pairs in the form of word sketches in any user-owned corpus. The ESSG-fr contains three times more hyponymic patterns than its English counterpart and has been tested in a multidisciplinary corpus. It is thus expected to be domain-independent. Moreover, the following methodological innovations have been included in its development: (1) use of English hyponymic patterns in a parallel corpus to find new French patterns; (2) automatic inclusion of the results of the Sketch Engine thesaurus to find new variants of the patterns. As for its evaluation, the ESSG-fr returns 70% valid hypernyms and hyponyms, measured on 180 extracted pairs of terms in three different domains.
2018
Manzanilla: An Image Annotation Tool for TKB Building
Arianne Reimerink | Pilar León-Araúz
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)
Evaluating EcoLexiCAT: a Terminology-Enhanced CAT Tool
Pilar León-Araúz | Arianne Reimerink
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)
Towards the Inference of Semantic Relations in Complex Nominals: a Pilot Study
Melania Cabezas-García | Pilar León-Araúz
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)
2016
Pattern-based Word Sketches for the Extraction of Semantic Relations
Pilar León-Araúz | Antonio San Martín | Pamela Faber
Proceedings of the 5th International Workshop on Computational Terminology (Computerm2016)
Despite advances in computer technology, terminologists still tend to rely on manual work to extract all the semantic information that they need for the description of specialized concepts. In this paper we propose the creation of new word sketches in Sketch Engine for the extraction of semantic relations. Following a pattern-based approach, new sketch grammars are developed in order to extract some of the most common semantic relations used in the field of terminology: generic-specific, part-whole, location, cause and function.
2010
EcoLexicon: An Environmental TKB
Arianne Reimerink | Pilar León Araúz | Pedro J. Magaña Redondo
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
EcoLexicon, a multilingual knowledge resource on the environment, provides an internally coherent information system covering a wide range of specialized linguistic and conceptual needs. Data in our terminological knowledge base (TKB) are primarily hosted in a relational database, which is now linked to an ontology in order to apply reasoning techniques and enhance user queries. The advantages of ontological reasoning can only be obtained if conceptual description is based on systematic criteria and a wide inventory of non-hierarchical relations, which confer dynamism on knowledge representation. Thus, our research has mainly focused on conceptual modelling and providing a user-friendly multimodal interface. The dynamic interface, which combines conceptual (networks and definitions), linguistic (contexts, concordances) and graphical information, offers users the freedom to navigate it according to their needs. Furthermore, dynamism is also present at the representational level. Contextual constraints have been applied to reconceptualise versatile concepts that cause a great deal of information overload.