TERMinator: A System for Scientific Texts Processing

Elena Bruches, Olga Tikhobaeva, Yana Dementyeva, Tatiana Batura


Abstract
This paper addresses the extraction of entities and the semantic relations between them from scientific texts, where scientific terms are treated as entities. We present a dataset annotated for two tasks and develop a system called TERMinator to study the influence of language models on term recognition and to compare different approaches to relation extraction. Experiments show that language models pre-trained on the target language do not always show the best performance. Adding some heuristic approaches may also improve the overall quality on a particular task. The developed tool and the annotated corpus are publicly available at https://github.com/iis-research-team/terminator and may be useful for other researchers.
Anthology ID:
2022.coling-1.302
Volume:
Proceedings of the 29th International Conference on Computational Linguistics
Month:
October
Year:
2022
Address:
Gyeongju, Republic of Korea
Editors:
Nicoletta Calzolari, Chu-Ren Huang, Hansaem Kim, James Pustejovsky, Leo Wanner, Key-Sun Choi, Pum-Mo Ryu, Hsin-Hsi Chen, Lucia Donatelli, Heng Ji, Sadao Kurohashi, Patrizia Paggio, Nianwen Xue, Seokhwan Kim, Younggyun Hahm, Zhong He, Tony Kyungil Lee, Enrico Santus, Francis Bond, Seung-Hoon Na
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
3420–3426
URL:
https://aclanthology.org/2022.coling-1.302
Cite (ACL):
Elena Bruches, Olga Tikhobaeva, Yana Dementyeva, and Tatiana Batura. 2022. TERMinator: A System for Scientific Texts Processing. In Proceedings of the 29th International Conference on Computational Linguistics, pages 3420–3426, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
Cite (Informal):
TERMinator: A System for Scientific Texts Processing (Bruches et al., COLING 2022)
PDF:
https://preview.aclanthology.org/ingest-bitext-workshop/2022.coling-1.302.pdf
Data
SciERC