Chaya Liebeskind

2021

pdf bib
Multiword expressions as discourse markers in Hebrew and Lithuanian
Giedre Valunaite Oleskeviciene | Chaya Liebeskind
Proceedings for the First Workshop on Modelling Translation: Translatology in the Digital Age

pdf bib abs
JCT at SemEval-2021 Task 1: Context-aware Representation for Lexical Complexity Prediction
Chaya Liebeskind | Otniel Elkayam | Shmuel Liebeskind
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)

In this paper, we present our contribution in SemEval-2021 Task 1: Lexical Complexity Prediction, where we integrate linguistic, statistical, and semantic properties of the target word and its context as features within a Machine Learning (ML) framework for predicting lexical complexity. In particular, we use BERT contextualized word embeddings to represent the semantic meaning of the target word and its context. We participated in the sub-task of predicting the complexity score of single words

2020

pdf bib abs
JCT at SemEval-2020 Task 1: Combined Semantic Vector Spaces Models for Unsupervised Lexical Semantic Change Detection
Efrat Amar | Chaya Liebeskind
Proceedings of the Fourteenth Workshop on Semantic Evaluation

In this paper, we present our contribution in SemEval-2020 Task 1: Unsupervised Lexical Semantic Change Detection, where we systematically combine existing models for unsupervised capturing of lexical semantic change across time in text corpora of German, English, Latin and Swedish. In particular, we analyze the score distribution of existing models. Then we define a general threshold, adjust it independently to each of the models and measure the models’ score reliability. Finally, using both the threshold and score reliability, we aggregate the models for the two sub- tasks: binary classification and ranking.

pdf bib abs
Automatic Construction of Aramaic-Hebrew Translation Lexicon
Chaya Liebeskind | Shmuel Liebeskind
Proceedings of LT4HALA 2020 - 1st Workshop on Language Technologies for Historical and Ancient Languages

Aramaic is an ancient Semitic language with a 3,000 year history. However, since the number of Aramaic speakers in the world hasdeclined, Aramaic is in danger of extinction. In this paper, we suggest a methodology for automatic construction of Aramaic-Hebrew translation Lexicon. First, we generate an initial translation lexicon by a state-of-the-art word alignment translation model. Then,we filter the initial lexicon using string similarity measures of three types: similarity between terms in the target language, similarity between a source and a target term, and similarity between terms in the source language. In our experiments, we use a parallel corporaof Biblical Aramaic-Hebrew sentence pairs and evaluate various string similarity measures for each type of similarity. We illustratethe empirical benefit of our methodology and its effect on precision and F1. In particular, we demonstrate that our filtering methodsignificantly exceeds a filtering approach based on the probability scores given by a state-of-the-art word alignment translation model.

We present edition 1.2 of the PARSEME shared task on identification of verbal multiword expressions (VMWEs). Lessons learned from previous editions indicate that VMWEs have low ambiguity, and that the major challenge lies in identifying test instances never seen in the training data. Therefore, this edition focuses on unseen VMWEs. We have split annotated corpora so that the test corpora contain around 300 unseen VMWEs, and we provide non-annotated raw corpora to be used by complementary discovery methods. We released annotated and raw corpora in 14 languages, and this semi-supervised challenge attracted 7 teams who submitted 9 system results. This paper describes the effort of corpus creation, the task design, and the results obtained by the participating systems, especially their performance on unseen expressions.

2018

This paper describes the PARSEME Shared Task 1.1 on automatic identification of verbal multiword expressions. We present the annotation methodology, focusing on changes from last year’s shared task. Novel aspects include enhanced annotation guidelines, additional annotated data for most languages, corpora for some new languages, and new evaluation settings. Corpora were created for 20 languages, which are also briefly discussed. We report organizational principles behind the shared task and the evaluation metrics employed for ranking. The 17 participating systems, their methods and obtained results are also presented and analysed.

pdf bib
Automatic Thesaurus Construction for Modern Hebrew
Chaya Liebeskind | Ido Dagan | Jonathan Schler
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2016

pdf bib abs
Semantically Motivated Hebrew Verb-Noun Multi-Word Expressions Identification
Chaya Liebeskind | Yaakov HaCohen-Kerner
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Identification of Multi-Word Expressions (MWEs) lies at the heart of many natural language processing applications. In this research, we deal with a particular type of Hebrew MWEs, Verb-Noun MWEs (VN-MWEs), which combine a verb and a noun with or without other words. Most prior work on MWEs classification focused on linguistic and statistical information. In this paper, we claim that it is essential to utilize semantic information. To this end, we propose a semantically motivated indicator for classifying VN-MWE and define features that are related to various semantic spaces and combine them as features in a supervised classification framework. We empirically demonstrate that our semantic feature set yields better performance than the common linguistic and statistical feature sets and that combining semantic features contributes to the VN-MWEs identification task.

pdf bib abs
A Lexical Resource of Hebrew Verb-Noun Multi-Word Expressions
Chaya Liebeskind | Yaakov HaCohen-Kerner
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

A verb-noun Multi-Word Expression (MWE) is a combination of a verb and a noun with or without other words, in which the combination has a meaning different from the meaning of the words considered separately. In this paper, we present a new lexical resource of Hebrew Verb-Noun MWEs (VN-MWEs). The VN-MWEs of this resource were manually collected and annotated from five different web resources. In addition, we analyze the lexical properties of Hebrew VN-MWEs by classifying them to three types: morphological, syntactic, and semantic. These two contributions are essential for designing algorithms for automatic VN-MWEs extraction. The analysis suggests some interesting features of VN-MWEs for exploration. The lexical resource enables to sample a set of positive examples for Hebrew VN-MWEs. This set of examples can either be used for training supervised algorithms or as seeds in unsupervised bootstrapping algorithms. Thus, this resource is a first step towards automatic identification of Hebrew VN-MWEs, which is important for natural language understanding, generation and translation systems.