Sibel Ozer


TED-MDB Lexicons: Tr-EnConnLex, Pt-EnConnLex
Murathan Kurfalı | Sibel Ozer | Deniz Zeyrek | Amália Mendes
Proceedings of the First Workshop on Computational Approaches to Discourse

In this work, we present two new bilingual discourse connective lexicons, one for Turkish-English and one for European Portuguese-English, created automatically from the existing discourse relation-aligned TED-MDB corpus. In their current form, the Pt-En lexicon includes 95 entries, whereas the Tr-En lexicon contains 133 entries. The lexicons constitute the first step of a larger project to develop a multilingual discourse connective lexicon.


An automatic discourse relation alignment experiment on TED-MDB
Sibel Ozer | Deniz Zeyrek
Proceedings of the 2019 Workshop on Widening NLP

This paper describes an automatic discourse relation alignment experiment as an empirical justification of the planned annotation projection approach to enlarge the 3600-word multilingual corpus of TED Multilingual Discourse Bank (TED-MDB). The experiment is carried out on a single language pair (English-Turkish) included in TED-MDB. The paper first describes the creation of a large corpus of English-Turkish bi-sentences and then presents a sense-based experiment that automatically aligns the relations in the English sentences of TED-MDB with the Turkish sentences. The results are very close to those obtained from an earlier semi-automatic post-annotation alignment experiment validated by human annotators, and they are encouraging for future annotation projection tasks.