Viktor Bielický


2008

pdf
Building the Valency Lexicon of Arabic Verbs
Viktor Bielický | Otakar Smrž
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

This paper describes the building of a valency lexicon of Arabic verbs using a morphologically and syntactically annotated corpus, the Prague Arabic Dependency Treebank (PADT), as its primary source. We present the theoretical account on valency developed within the Functional Generative Description (FGD) theory. We apply the framework to Modern Standard Arabic and discuss various valency-related phenomena with respect to examples from the corpus. We then outline the methodology and the linguistic and technical resources used in the building of the lexicon. The key concept in our scenario is that of PDT-VALLEX of Czech. Our lexicon will be developed by linking the conceivable entries with their instances in the treebank. Conversely, the treebank’s annotations will be linked to the lexicon. While a comparable scheme has been developed for Czech, our own contribution is to design and implement this model thoroughly for Arabic and the PADT data. The Arabic valency lexicon is intended for applications in computational parsing or language generation, and for use by human researchers. The proposed valency lexicon will be exploited in particular during further tectogrammatical annotations of PADT and might serve for enriching the expected second edition of the corpus-based Arabic-Czech Dictionary.
Search
Co-authors
Venues