Analysing the Impact of Supervised Machine Learning on Automatic Term Extraction: HAMLET vs TermoStat
Ayla Rigouts Terryn, Patrick Drouin, Veronique Hoste, Els Lefever
Abstract
Traditional approaches to automatic term extraction do not rely on machine learning (ML) and select the top n ranked candidate terms or candidate terms above a certain predefined cut-off point, based on a limited number of linguistic and statistical clues. However, supervised ML approaches are gaining interest. Relatively little is known about the impact of these supervised methodologies; evaluations are often limited to precision, and sometimes recall and F1-scores, without information about the nature of the extracted candidate terms. Therefore, the current paper presents a detailed and elaborate analysis and comparison of a traditional, state-of-the-art system (TermoStat) and a new, supervised ML approach (HAMLET), using the results obtained for the same, manually annotated, Dutch corpus about dressage.
- Anthology ID:
- R19-1117
- Volume:
- Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)
- Month:
- September
- Year:
- 2019
- Address:
- Varna, Bulgaria
- Venue:
- RANLP
- Publisher:
- INCOMA Ltd.
- Pages:
- 1012–1021
- URL:
- https://aclanthology.org/R19-1117
- DOI:
- 10.26615/978-954-452-056-4_117
- Cite (ACL):
- Ayla Rigouts Terryn, Patrick Drouin, Veronique Hoste, and Els Lefever. 2019. Analysing the Impact of Supervised Machine Learning on Automatic Term Extraction: HAMLET vs TermoStat. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019), pages 1012–1021, Varna, Bulgaria. INCOMA Ltd.
- Cite (Informal):
- Analysing the Impact of Supervised Machine Learning on Automatic Term Extraction: HAMLET vs TermoStat (Rigouts Terryn et al., RANLP 2019)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/R19-1117.pdf