Experiments with ad hoc ambiguous abbreviation expansion

Agnieszka Mykowiecka, Malgorzata Marciniak


Abstract
This paper describes experiments on expanding ad hoc ambiguous abbreviations in medical notes on the basis of morphologically annotated texts, without using additional domain resources. We work on Polish data, but the described approaches can be used for other languages too. We test two methods of selecting candidate expansions for word abbreviations. The first automatically selects all words in the text that might be an expansion of an abbreviation according to the language rules. The second uses clustering of abbreviation occurrences to select representative elements, which are manually annotated to determine lists of potential expansions. We then train a classifier to assign expansions to abbreviations based on three training sets: one obtained automatically, one manually annotated, and the concatenation of the two. The results obtained with the manually annotated training data significantly outperform those obtained with the automatically obtained training data. Adding the automatically obtained training data to the manually annotated data improves the results, in particular for less frequent abbreviations. In this context, the proposed a priori data-driven selection of possible expansions turned out to be crucial.
Anthology ID:
D19-6207
Volume:
Proceedings of the Tenth International Workshop on Health Text Mining and Information Analysis (LOUHI 2019)
Month:
November
Year:
2019
Address:
Hong Kong
Editors:
Eben Holderness, Antonio Jimeno Yepes, Alberto Lavelli, Anne-Lyse Minard, James Pustejovsky, Fabio Rinaldi
Venue:
Louhi
Publisher:
Association for Computational Linguistics
Pages:
44–53
URL:
https://aclanthology.org/D19-6207
DOI:
10.18653/v1/D19-6207
Cite (ACL):
Agnieszka Mykowiecka and Malgorzata Marciniak. 2019. Experiments with ad hoc ambiguous abbreviation expansion. In Proceedings of the Tenth International Workshop on Health Text Mining and Information Analysis (LOUHI 2019), pages 44–53, Hong Kong. Association for Computational Linguistics.
Cite (Informal):
Experiments with ad hoc ambiguous abbreviation expansion (Mykowiecka & Marciniak, Louhi 2019)
PDF:
https://preview.aclanthology.org/naacl24-info/D19-6207.pdf