Synonymy in Bilingual Context: The CzEngClass Lexicon

Zdeňka Urešová, Eva Fučíková, Eva Hajičová, Jan Hajič


Abstract
This paper describes CzEngClass, a bilingual lexical resource being built to investigate verbal synonymy in bilingual context and to relate semantic roles common to one synonym class to verb arguments (verb valency). In addition, the resource is linked to existing resources with the same or a similar aim: English and Czech WordNet, FrameNet, PropBank, VerbNet (SemLink), and valency lexicons for Czech and English (PDT-Vallex, Vallex, and EngVallex). This work and the resource have several goals: (a) to provide gold standard data for future automatic experiments (such as automatic discovery of synonym classes, word sense disambiguation, assignment of classes to occurrences of verbs in text, and coreferential linking of verb and event arguments in text), (b) to build a core (bilingual) lexicon linked to existing resources, for comparative studies and possibly for training automatic tools, and (c) to enrich the annotation of a parallel treebank, the Prague Czech-English Dependency Treebank, which has so far contained valency annotation but has not linked synonymous senses of verbs together. The synonym classes are extracted by a semi-automatic process with a substantial amount of manual work during filtering, role assignment to classes and to individual class members’ arguments, and linking to the external lexical resources. We present the first version with 200 classes (about 1800 verbs) and evaluate interannotator agreement using several metrics.
Anthology ID:
C18-1208
Volume:
Proceedings of the 27th International Conference on Computational Linguistics
Month:
August
Year:
2018
Address:
Santa Fe, New Mexico, USA
Editors:
Emily M. Bender, Leon Derczynski, Pierre Isabelle
Venue:
COLING
Publisher:
Association for Computational Linguistics
Pages:
2456–2469
URL:
https://aclanthology.org/C18-1208
Cite (ACL):
Zdeňka Urešová, Eva Fučíková, Eva Hajičová, and Jan Hajič. 2018. Synonymy in Bilingual Context: The CzEngClass Lexicon. In Proceedings of the 27th International Conference on Computational Linguistics, pages 2456–2469, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
Cite (Informal):
Synonymy in Bilingual Context: The CzEngClass Lexicon (Urešová et al., COLING 2018)
PDF:
https://preview.aclanthology.org/naacl-24-ws-corrections/C18-1208.pdf
Data
FrameNet, Penn Treebank