Christopher Gledhill


2008

A Hybrid Approach to Extracting and Classifying Verb+Noun Constructions
Amalia Todiraşcu | Dan Tufiş | Ulrich Heid | Christopher Gledhill | Dan Ştefanescu | Marion Weller | François Rousselot
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

We present the main findings and preliminary results of an ongoing project aimed at developing a system for collocation extraction based on contextual morpho-syntactic properties. We explored two hybrid extraction methods: the first applies language-independent statistical techniques followed by linguistic filtering, while the second, available only for German, uses a set of lexico-syntactic patterns to extract collocation candidates. To define the extraction and filtering patterns, we studied a specific collocation category, Verb-Noun (VN) constructions, using a model inspired by systemic functional grammar that proposes a three-level analysis based on lexical, functional and semantic criteria. From a tagged and lemmatized corpus, we identify contextual morpho-syntactic properties that help to filter the output of the statistical methods and to extract potentially interesting VN constructions (complex predicates vs complex predicators). The extracted candidates are validated and classified manually.