Automatic Construction of Aramaic-Hebrew Translation Lexicon

Chaya Liebeskind, Shmuel Liebeskind


Abstract
Aramaic is an ancient Semitic language with a 3,000-year history. However, since the number of Aramaic speakers in the world has declined, Aramaic is in danger of extinction. In this paper, we suggest a methodology for the automatic construction of an Aramaic-Hebrew translation lexicon. First, we generate an initial translation lexicon with a state-of-the-art word alignment translation model. Then, we filter the initial lexicon using string similarity measures of three types: similarity between terms in the target language, similarity between a source and a target term, and similarity between terms in the source language. In our experiments, we use a parallel corpus of Biblical Aramaic-Hebrew sentence pairs and evaluate various string similarity measures for each type of similarity. We illustrate the empirical benefit of our methodology and its effect on precision and F1. In particular, we demonstrate that our filtering method significantly outperforms a filtering approach based on the probability scores given by a state-of-the-art word alignment translation model.
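The source-target filtering step described in the abstract can be illustrated with a minimal sketch. It assumes normalized Levenshtein similarity as a stand-in for the string similarity measures the paper evaluates, and an arbitrary threshold of 0.5. The function names (`edit_distance`, `similarity`, `filter_lexicon`) and the transliterated word pairs are hypothetical and are not taken from the paper or its corpus.

```python
from typing import List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Normalized similarity in [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def filter_lexicon(candidates: List[Tuple[str, str]],
                   threshold: float = 0.5) -> List[Tuple[str, str]]:
    """Keep (source, target) pairs whose similarity reaches the threshold.

    The threshold is an assumption made for this sketch only.
    """
    return [(s, t) for s, t in candidates if similarity(s, t) >= threshold]


# Toy candidate pairs (invented transliterations, for illustration only).
candidates = [("shmaya", "shamayim"), ("malka", "melek"), ("ara", "shulchan")]
kept = filter_lexicon(candidates, threshold=0.5)
print(kept)
```

In this sketch the threshold trades precision against recall: raising it discards more dissimilar pairs, which may also drop correct translations that are not cognates.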
Anthology ID:
2020.lt4hala-1.2
Volume:
Proceedings of LT4HALA 2020 - 1st Workshop on Language Technologies for Historical and Ancient Languages
Month:
May
Year:
2020
Address:
Marseille, France
Venue:
LT4HALA
Publisher:
European Language Resources Association (ELRA)
Pages:
10–16
Language:
English
URL:
https://aclanthology.org/2020.lt4hala-1.2
Cite (ACL):
Chaya Liebeskind and Shmuel Liebeskind. 2020. Automatic Construction of Aramaic-Hebrew Translation Lexicon. In Proceedings of LT4HALA 2020 - 1st Workshop on Language Technologies for Historical and Ancient Languages, pages 10–16, Marseille, France. European Language Resources Association (ELRA).
Cite (Informal):
Automatic Construction of Aramaic-Hebrew Translation Lexicon (Liebeskind & Liebeskind, LT4HALA 2020)
PDF:
https://preview.aclanthology.org/author-url/2020.lt4hala-1.2.pdf