Abstract
We report on work in progress on the automated generation of pronunciation information for English multiword terms (MWTs) in Wiktionary, combining the information available for their individual components. We describe the issues we encountered, the construction of an evaluation dataset, and our collaboration with the maintainer of the DBnary resource. Our approach shows potential for automatically adding morphosyntactic and semantic information to the components of such MWTs.
- Anthology ID:
- 2023.mwe-1.10
- Volume:
- Proceedings of the 19th Workshop on Multiword Expressions (MWE 2023)
- Month:
- May
- Year:
- 2023
- Address:
- Dubrovnik, Croatia
- Venue:
- MWE
- SIG:
- SIGLEX
- Publisher:
- Association for Computational Linguistics
- Pages:
- 65–72
- URL:
- https://aclanthology.org/2023.mwe-1.10
- Cite (ACL):
- Lenka Bajcetic, Thierry Declerck, and Gilles Sérasset. 2023. Enriching Multiword Terms in Wiktionary with Pronunciation Information. In Proceedings of the 19th Workshop on Multiword Expressions (MWE 2023), pages 65–72, Dubrovnik, Croatia. Association for Computational Linguistics.
- Cite (Informal):
- Enriching Multiword Terms in Wiktionary with Pronunciation Information (Bajcetic et al., MWE 2023)
- PDF:
- https://preview.aclanthology.org/remove-xml-comments/2023.mwe-1.10.pdf