Bilingual Terminology Extraction Using Neural Word Embeddings on Comparable Corpora

Darya Filippova, Burcu Can, Gloria Corpas Pastor


Abstract
Term and glossary management is a vital part of every language specialist's preparation and plays an important role in the education of translation professionals. Growing pressure for efficient time management, together with the constant time constraints seen in every job sector, increases the need for automatic glossary compilation. Many well-performing bilingual AET systems rely on parallel data; however, such parallel corpora are not always available for a specific domain or language pair. Domain-specific bilingual information access and retrieval based on comparable corpora is a promising area of research that requires a detailed analysis of both the available data sources and the possible extraction techniques. This work focuses on domain-specific automatic terminology extraction from comparable corpora for the English–Russian language pair using neural word embeddings.
Anthology ID:
2021.ranlp-srw.9
Volume:
Proceedings of the Student Research Workshop Associated with RANLP 2021
Month:
September
Year:
2021
Address:
Online
Venue:
RANLP
Publisher:
INCOMA Ltd.
Pages:
58–64
URL:
https://aclanthology.org/2021.ranlp-srw.9
Cite (ACL):
Darya Filippova, Burcu Can, and Gloria Corpas Pastor. 2021. Bilingual Terminology Extraction Using Neural Word Embeddings on Comparable Corpora. In Proceedings of the Student Research Workshop Associated with RANLP 2021, pages 58–64, Online. INCOMA Ltd.
Cite (Informal):
Bilingual Terminology Extraction Using Neural Word Embeddings on Comparable Corpora (Filippova et al., RANLP 2021)
PDF:
https://preview.aclanthology.org/update-css-js/2021.ranlp-srw.9.pdf