Abstract
This paper presents TermoUD, a language-independent terminology extraction tool. Its previous version, TermoPL (Marciniak et al., 2016; Rychlik et al., 2022), uses a language-dependent shallow grammar to select candidate terms. The goal behind the development of TermoUD is to make the procedure as universal as possible while preserving the linguistic correctness of the selected phrases. The tool is suitable for any language for which a Universal Dependencies (UD) parser exists. We describe a method of candidate term extraction based on UD POS tags and UD relations. Candidates are ranked with the C-value metric (with context counting adapted to the UD formalism), which does not require any additional language resources. The performance of the tool has been tested on texts in English, French, Dutch, and Slovenian. The results are evaluated on manually annotated datasets (ACTER, RD-TEC 2.0, GENIA, and RSDO5) and compared with those obtained by other tools.
- Anthology ID:
- 2023.eacl-demo.21
- Volume:
- Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations
- Month:
- May
- Year:
- 2023
- Address:
- Dubrovnik, Croatia
- Editors:
- Danilo Croce, Luca Soldaini
- Venue:
- EACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 178–186
- URL:
- https://aclanthology.org/2023.eacl-demo.21
- DOI:
- 10.18653/v1/2023.eacl-demo.21
- Cite (ACL):
- Malgorzata Marciniak, Piotr Rychlik, and Agnieszka Mykowiecka. 2023. TermoUD - a language-independent terminology extraction tool. In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations, pages 178–186, Dubrovnik, Croatia. Association for Computational Linguistics.
- Cite (Informal):
- TermoUD - a language-independent terminology extraction tool (Marciniak et al., EACL 2023)
- PDF:
- https://preview.aclanthology.org/revert-3132-ingestion-checklist/2023.eacl-demo.21.pdf
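
The abstract states that candidate terms are ranked with the C-value metric. As a rough illustration only, the following sketch computes the textbook form of C-value over a frequency table of candidate terms. It is not TermoUD's implementation: the paper adapts context counting to the UD formalism, whereas this sketch assumes that nesting simply means "appears as a contiguous token sequence inside a longer candidate". The function names `is_nested` and `c_value` and the input format are hypothetical.

```python
import math


def is_nested(short, long):
    """Return True if `short` occurs as a contiguous token sequence in `long`."""
    n = len(short)
    return any(long[i:i + n] == short for i in range(len(long) - n + 1))


def c_value(freq):
    """Score candidate terms with the textbook C-value formula.

    `freq` maps each candidate term (a tuple of tokens) to its corpus frequency.
    Assumption: nesting is plain contiguous containment, not the UD-based
    context counting described in the paper.
    """
    scores = {}
    for term, f in freq.items():
        # Frequencies of longer candidates that contain this term.
        containers = [
            f_other
            for other, f_other in freq.items()
            if len(other) > len(term) and is_nested(term, other)
        ]
        # log2 of the term length; this is 0 for single-word terms.
        length_weight = math.log2(len(term))
        if containers:
            # Discount by the mean frequency of the containing candidates.
            scores[term] = length_weight * (f - sum(containers) / len(containers))
        else:
            scores[term] = length_weight * f
    return scores
```

In this plain form, single-word candidates always score 0 because of the `log2` length weight, so a practical tool would need some adjustment for them; how TermoUD handles that case is not stated in the abstract.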