Terminology extraction using co-occurrence patterns as predictors of semantic relevance

Rogelio Nazar; David Lindemann

Abstract

We propose a method for automatic term extraction based on a statistical measure that ranks term candidates according to their semantic relevance to a specialised domain. As a measure of relevance we use term co-occurrence, defined as the repeated occurrence of two terms in the same sentence, in any order and at variable distances. Term candidates are thus ranked higher if they tend to co-occur with a selected group of other units, and lower if their co-occurrence is more uniformly distributed. The method requires no external resources, but performance improves when a pre-existing term list is provided. We present results of applying the method to a Spanish-English Linguistics corpus; in the evaluation, it compares favourably with a standard method based on reference corpora.
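
The ranking idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' statistical measure, which the abstract does not specify. The `concentration` score (the share of a unit's co-occurrences accounted for by its `top_k` most frequent partners), the function names, and the `top_k` parameter are all assumptions made here for illustration. Sentences are assumed to be already tokenised into lists of units.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence_counts(sentences):
    """For each unit, count how often it shares a sentence with each other unit."""
    partners = defaultdict(Counter)
    for tokens in sentences:
        # Using a set ignores order, distance and repetition within the sentence.
        for a, b in combinations(sorted(set(tokens)), 2):
            partners[a][b] += 1
            partners[b][a] += 1
    return partners


def concentration(counter, top_k=3):
    """Hypothetical score: share of co-occurrences that go to the top_k partners.

    Values near 1.0 mean the unit co-occurs with a small, selected group;
    lower values mean its co-occurrences are spread more uniformly.
    """
    total = sum(counter.values())
    if total == 0:
        return 0.0
    top = sum(count for _, count in counter.most_common(top_k))
    return top / total


def rank_candidates(sentences, candidates, top_k=3):
    """Rank candidates from most to least concentrated co-occurrence profile."""
    partners = cooccurrence_counts(sentences)
    scores = {c: concentration(partners[c], top_k) for c in candidates}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

On a toy corpus, a domain unit that keeps appearing with the same few partners would score higher than a function word that appears with almost everything. A score this simple favours rare units, whose few co-occurrences always look concentrated, so a real measure would need some form of frequency control.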

Anthology ID: 2022.term-1.5
Volume: Proceedings of the Workshop on Terminology in the 21st century: many faces, many places
Month: June
Year: 2022
Address: Marseille, France
Editors: Rute Costa, Sara Carvalho, Ana Ostroški Anić, Anas Fahad Khan
Venue: TERM
Publisher: European Language Resources Association
Pages: 26–29
URL: https://preview.aclanthology.org/nschneid-patch-2/2022.term-1.5/
Cite (ACL): Rogelio Nazar and David Lindemann. 2022. Terminology extraction using co-occurrence patterns as predictors of semantic relevance. In Proceedings of the Workshop on Terminology in the 21st century: many faces, many places, pages 26–29, Marseille, France. European Language Resources Association.
Cite (Informal): Terminology extraction using co-occurrence patterns as predictors of semantic relevance (Nazar & Lindemann, TERM 2022)
PDF: https://preview.aclanthology.org/nschneid-patch-2/2022.term-1.5.pdf
