Abstract
We present the parallel creation of a WordNet resource for Swedish and Bulgarian that is tightly aligned with the Princeton WordNet. The alignment holds not only at the synset level but also at the word level, by matching words with their closest translations in each language. We argue that this tighter alignment is essential for machine translation and natural language generation. About one-fifth of the lexical entries are also linked to the corresponding Wikipedia articles. In addition to the traditional semantic relations in WordNet, we also integrate morphological and morpho-syntactic information. The resource comes with a corpus in which examples from the Princeton WordNet are translated into Swedish and Bulgarian. The examples are aligned at the word and phrase level. The new resource is open-source, and in its development we used only existing open-source resources.
- Anthology ID:
- 2020.lrec-1.368
- Volume:
- Proceedings of the Twelfth Language Resources and Evaluation Conference
- Month:
- May
- Year:
- 2020
- Address:
- Marseille, France
- Editors:
- Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
- Venue:
- LREC
- Publisher:
- European Language Resources Association
- Pages:
- 3008–3015
- Language:
- English
- URL:
- https://aclanthology.org/2020.lrec-1.368
- Cite (ACL):
- Krasimir Angelov. 2020. A Parallel WordNet for English, Swedish and Bulgarian. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 3008–3015, Marseille, France. European Language Resources Association.
- Cite (Informal):
- A Parallel WordNet for English, Swedish and Bulgarian (Angelov, LREC 2020)
- PDF:
- https://preview.aclanthology.org/ingest-acl-2023-videos/2020.lrec-1.368.pdf