PersPred is a manually elaborated multilingual syntactic and semantic Lexicon for Persian Complex Predicates (CPs), referred to also as Light Verb Constructions (LVCs) or Compound Verbs. CPs constitutes the regular and the most common way of expressing verbal concepts in Persian, which has only around 200 simplex verbs. CPs can be defined as multi-word sequences formed by a verb and a non-verbal element and functioning in many respects as a simplex verb. Bonami & Samvelain (2010) and Samvelian & Faghiri (to appear) extendedly argue that Persian CPs are MWEs and consequently must be listed. The first delivery of PersPred, contains more than 600 combinations of the verb zadan hit with a noun, presented in a spreadsheet. In this paper we present a semi-automatic method used to extend the coverage of PersPred 1.0, which relies on the syntactic information on valency alternations already encoded in the database. Given the importance of CPs in the verbal lexicon of Persian and the fact that lexical resources cruelly lack for Persian, this method can be further used to achieve our goal of making PersPred an appropriate resource for NLP applications.
Question answering systems are complex systems using natural language processing. Some evaluation campaigns are organized to evaluate such systems in order to propose a classification of systems based on final results (number of correct answers). Nevertheless, teams need to evaluate more precisely the results obtained by their systems if they want to do a diagnostic evaluation. There are no tools or methods to do these evaluations systematically. We present REVISE, a tool for glass box evaluation based on diagnostic of question answering system results.
Les campagnes d’évaluation ne tiennent compte que des résultats finaux obtenus par les systèmes de recherche d’informations (RI). Nous nous situons dans une perspective d’évaluation transparente d’un système de questions-réponses, où le traitement d’une question se fait grâce à plusieurs composants séquentiels. Dans cet article, nous nous intéressons à l’étude de l’élément de la question qui porte l’information qui se trouvera dans la phrase réponse à proximité de la réponse elle-même : le focus. Nous définissons ce concept, l’appliquons au système de questions-réponses QALC, et démontrons l’utilité d’évaluations des composants afin d’augmenter la performance globale du système.