Antonio San Martín


2020

Extraction of Hyponymic Relations in French with Knowledge-Pattern-Based Word Sketches
Antonio San Martín | Catherine Trekker | Pilar León-Araúz
Proceedings of the Twelfth Language Resources and Evaluation Conference

Hyponymy is the cornerstone of taxonomies and concept hierarchies. However, the extraction of hypernym-hyponym pairs from a corpus can be time-consuming, and reconstructing the hierarchical network of a domain is often an extremely complex process. This paper presents the development and evaluation of the French EcoLexicon Semantic Sketch Grammar (ESSG-fr), a French hyponymic sketch grammar for Sketch Engine based on knowledge patterns. It offers a user-friendly way of extracting hyponymic pairs in the form of word sketches from any user-owned corpus. The ESSG-fr contains three times as many hyponymic patterns as its English counterpart and has been tested on a multidisciplinary corpus. It is thus expected to be domain-independent. Moreover, the following methodological innovations have been included in its development: (1) use of English hyponymic patterns in a parallel corpus to find new French patterns; and (2) automatic inclusion of the results of the Sketch Engine thesaurus to find new variants of the patterns. As for its evaluation, the ESSG-fr returns 70% valid hypernyms and hyponyms, measured on 180 extracted pairs of terms in three different domains.

2017

Semantic annotation to characterize contextual variation in terminological noun compounds: a pilot study
Melania Cabezas-García | Antonio San Martín
Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)

Noun compounds (NCs) are semantically complex and, contrary to what is often assumed, not fully compositional. This paper presents a pilot study on the semantic annotation of environmental NCs with a view to accessing their semantics and exploring their domain-based contextual variation. Our results showed that the semantic annotation of NCs afforded important insights into how context impacts their conceptualization.

2016

Pattern-based Word Sketches for the Extraction of Semantic Relations
Pilar León-Araúz | Antonio San Martín | Pamela Faber
Proceedings of the 5th International Workshop on Computational Terminology (Computerm2016)

Despite advances in computer technology, terminologists still tend to rely on manual work to extract all the semantic information that they need for the description of specialized concepts. In this paper we propose the creation of new word sketches in Sketch Engine for the extraction of semantic relations. Following a pattern-based approach, new sketch grammars are developed in order to extract some of the most common semantic relations used in the field of terminology: generic-specific, part-whole, location, cause and function.

2014

Definition patterns for predicative terms in specialized lexical resources
Antonio San Martín | Marie-Claude L’Homme
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

The research presented in this paper is part of a larger project on the semi-automatic generation of definitions of semantically-related terms in specialized resources. The work reported here involves the formulation of instructions to generate the definitions of sets of morphologically-related predicative terms, based on the definition of one of the members of the set. In many cases, it is assumed that the definition of a predicative term can be inferred by combining the definition of a related lexical unit with the information provided by the semantic relation (i.e. lexical function) that links them. In other words, terminographers only need to know the definition of “pollute” and the semantic relation that links it to other morphologically-related terms (“polluter”, “polluting”, “pollutant”, etc.) in order to create the definitions of the set. The results show that rules can be used to generate a preliminary set of definitions (based on specific lexical functions). They also show that more complex rules would need to be devised for other morphological pairs.