Abstract
Many terms used in physics have a different meaning or usage pattern in general language, constituting a learning barrier in physics teaching. The systematic identification of such terms is considered to be useful for science education as well as for terminology extraction. This article compares three methods based on vector semantics and a simple frequency-based baseline for automatically identifying terms used in general language with domain-specific use in physics. For evaluation, we use ambiguity scores from a survey among physicists and data about the number of term senses from Wiktionary. We show that the so-called Vector Initialization method obtains the best results.
- Anthology ID:
- 2023.iwcs-1.26
- Volume:
- Proceedings of the 15th International Conference on Computational Semantics
- Month:
- June
- Year:
- 2023
- Address:
- Nancy, France
- Editors:
- Maxime Amblard, Ellen Breitholtz
- Venue:
- IWCS
- SIG:
- SIGSEM
- Publisher:
- Association for Computational Linguistics
- Pages:
- 252–257
- URL:
- https://aclanthology.org/2023.iwcs-1.26
- Cite (ACL):
- Vitor Fontanella, Christian Wartena, and Gunnar Friege. 2023. Unsupervised Methods for Domain Specific Ambiguity Detection. The Case of German Physics Language. In Proceedings of the 15th International Conference on Computational Semantics, pages 252–257, Nancy, France. Association for Computational Linguistics.
- Cite (Informal):
- Unsupervised Methods for Domain Specific Ambiguity Detection. The Case of German Physics Language (Fontanella et al., IWCS 2023)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-2/2023.iwcs-1.26.pdf