Farnoosh Shamsian


2023

pdf
Classical Philology in the Time of AI: Exploring the Potential of Parallel Corpora in Ancient Language
Tariq Yousef | Chiara Palladino | Farnoosh Shamsian
Proceedings of the Ancient Language Processing Workshop

This paper provides an overview of diverse applications of parallel corpora in ancient languages, particularly Ancient Greek. In the first part, we provide the fundamental principles of parallel corpora and a short overview of their applications in the study of ancient texts. In the second part, we illustrate how to leverage on parallel corpora to perform various NLP tasks, including automatic translation alignment, dynamic lexica induction, and Named Entity Recognition. In the conclusions, we emphasize current limitations and future work.

2022

pdf
An automatic model and Gold Standard for translation alignment of Ancient Greek
Tariq Yousef | Chiara Palladino | Farnoosh Shamsian | Anise d’Orange Ferreira | Michel Ferreira dos Reis
Proceedings of the Thirteenth Language Resources and Evaluation Conference

This paper illustrates a workflow for developing and evaluating automatic translation alignment models for Ancient Greek. We designed an annotation Style Guide and a gold standard for the alignment of Ancient Greek-English and Ancient Greek-Portuguese, measured inter-annotator agreement and used the resulting dataset to evaluate the performance of various translation alignment models. We proposed a fine-tuning strategy that employs unsupervised training with mono- and bilingual texts and supervised training using manually aligned sentences. The results indicate that the fine-tuned model based on XLM-Roberta is superior in performance, and it achieved good results on language pairs that were not part of the training data.