Abstract
This paper introduces the Open Cantonese Sense-Tagged Corpus, a new and ongoing project that serves as a companion to the development of the Cantonese Wordnet. The corpus is built on top of the Cantonese Wordnet Corpus, which currently provides example sentences for most verbs in this wordnet. This paper motivates the choice of starting a sense-tagged corpus from both linguistic and educational perspectives, and discusses the current solutions to issues that arose during the sense-tagging exercise. In total, we have tagged over 5,000 concepts, with more than 3,700 direct links to the Cantonese Wordnet.
- Anthology ID:
- 2023.gwc-1.32
- Volume:
- Proceedings of the 12th Global Wordnet Conference
- Month:
- January
- Year:
- 2023
- Address:
- University of the Basque Country, Donostia - San Sebastian, Basque Country
- Editors:
- German Rigau, Francis Bond, Alexandre Rademaker
- Venue:
- GWC
- Publisher:
- Global Wordnet Association
- Pages:
- 263–268
- URL:
- https://aclanthology.org/2023.gwc-1.32
- Cite (ACL):
- Joanna Sio and Luis Morgado Da Costa. 2023. The Open Cantonese Sense-Tagged Corpus. In Proceedings of the 12th Global Wordnet Conference, pages 263–268, University of the Basque Country, Donostia - San Sebastian, Basque Country. Global Wordnet Association.
- Cite (Informal):
- The Open Cantonese Sense-Tagged Corpus (Sio & Costa, GWC 2023)
- PDF:
- https://preview.aclanthology.org/fix-dup-bibkey/2023.gwc-1.32.pdf