Transfer-Based Learning-to-Rank Assessment of Medical Term Technicality

Dhouha Bouamor, Leonardo Campillos Llanos, Anne-Laure Ligozat, Sophie Rosset, Pierre Zweigenbaum


Abstract
While measuring the readability of texts has been a long-standing research topic, assessing the technicality of terms has only been addressed more recently, and mostly for English. In this paper, we train a learning-to-rank model to determine a degree of specialization for each term in a given list. Since no training data for this task exist for French, we train our system with non-lexical features on English data, namely the Consumer Health Vocabulary, and then apply it to French. The features include the likelihood ratio of the term under specialized and lay language models, and tests for whether the term contains morphologically complex words. The approach is evaluated on 134 terms from the UMLS Metathesaurus and 868 terms from the Eugloss thesaurus. Our system obtains a Normalized Discounted Cumulative Gain (NDCG) above 0.8 on both test sets. Moreover, thanks to the learning-to-rank approach, adding morphological features to the language model features improves the results on the Eugloss thesaurus.
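For readers unfamiliar with the evaluation metric, the following is a minimal sketch of how a Normalized Discounted Cumulative Gain score can be computed for a ranked list of terms. It is not the authors' evaluation code. The function names `dcg` and `ndcg`, the linear gain (some formulations use 2^rel - 1 instead), and the example technicality grades are all assumptions made for illustration.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of graded relevances listed in ranked order.

    Assumption: linear gain with a log2(rank + 1) discount.
    """
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances, start=1)
    )


def ndcg(relevances_in_predicted_order):
    """DCG of the predicted ranking divided by the DCG of the ideal ranking."""
    ideal = dcg(sorted(relevances_in_predicted_order, reverse=True))
    if ideal == 0:
        return 0.0  # no relevant items at all: define the score as 0
    return dcg(relevances_in_predicted_order) / ideal


# Hypothetical gold technicality grades, listed in the order the system
# ranked the terms (most technical first).
perfect_ranking = [3, 2, 1]
reversed_ranking = [1, 2, 3]

print(ndcg(perfect_ranking))
print(ndcg(reversed_ranking))
```

A ranking that places terms in exactly the gold order scores 1, and any other ordering of the same grades scores lower, so a value above 0.8 indicates that the predicted ordering is close to the reference one.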
Anthology ID:
L16-1366
Volume:
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Month:
May
Year:
2016
Address:
Portorož, Slovenia
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
2312–2316
URL:
https://preview.aclanthology.org/build-pipeline-with-new-library/L16-1366/
Cite (ACL):
Dhouha Bouamor, Leonardo Campillos Llanos, Anne-Laure Ligozat, Sophie Rosset, and Pierre Zweigenbaum. 2016. Transfer-Based Learning-to-Rank Assessment of Medical Term Technicality. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 2312–2316, Portorož, Slovenia. European Language Resources Association (ELRA).
Cite (Informal):
Transfer-Based Learning-to-Rank Assessment of Medical Term Technicality (Bouamor et al., LREC 2016)
PDF:
https://preview.aclanthology.org/build-pipeline-with-new-library/L16-1366.pdf