PENELOPIE: Enabling Open Information Extraction for the Greek Language through Machine Translation
Dimitris Papadopoulos, Nikolaos Papadakis, Nikolaos Matsatsinis
Abstract
In this work, we present a methodology that aims at bridging the gap between high and low-resource languages in the context of Open Information Extraction, showcasing it on the Greek language. The goals of this paper are twofold: First, we build Neural Machine Translation (NMT) models for English-to-Greek and Greek-to-English based on the Transformer architecture. Second, we leverage these NMT models to produce English translations of Greek text as input for our NLP pipeline, to which we apply a series of pre-processing and triple extraction tasks. Finally, we back-translate the extracted triples to Greek. We conduct an evaluation of both our NMT and OIE methods on benchmark datasets and demonstrate that our approach outperforms the current state-of-the-art for the Greek natural language.
- Anthology ID:
- 2021.eacl-srw.4
- Volume:
- Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop
- Month:
- April
- Year:
- 2021
- Address:
- Online
- Editors:
- Ionut-Teodor Sorodoc, Madhumita Sushil, Ece Takmaz, Eneko Agirre
- Venue:
- EACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 23–29
- URL:
- https://aclanthology.org/2021.eacl-srw.4
- DOI:
- 10.18653/v1/2021.eacl-srw.4
- Cite (ACL):
- Dimitris Papadopoulos, Nikolaos Papadakis, and Nikolaos Matsatsinis. 2021. PENELOPIE: Enabling Open Information Extraction for the Greek Language through Machine Translation. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop, pages 23–29, Online. Association for Computational Linguistics.
- Cite (Informal):
- PENELOPIE: Enabling Open Information Extraction for the Greek Language through Machine Translation (Papadopoulos et al., EACL 2021)
- PDF:
- https://preview.aclanthology.org/proper-vol2-ingestion/2021.eacl-srw.4.pdf
- Code:
- lighteternal/PENELOPIE
- Data:
- CaRB, Tatoeba
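
The flow described in the abstract (translate Greek to English, extract triples from the English text, back-translate the triples to Greek) can be sketched roughly as follows. This is only an illustrative outline under stated assumptions, not the authors' implementation: the function `extract_triples_via_translation`, the dictionary-based translators, and the copula-splitting `toy_extract` are all hypothetical stand-ins for the Transformer NMT models and the triple extraction tasks, and the pre-processing steps are omitted.

```python
from typing import Callable, Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


def extract_triples_via_translation(
    greek_text: str,
    el_to_en: Callable[[str], str],
    en_to_el: Callable[[str], str],
    extract: Callable[[str], List[Triple]],
) -> List[Triple]:
    """Translate to English, extract triples, back-translate each triple part."""
    english_text = el_to_en(greek_text)
    english_triples = extract(english_text)
    return [
        (en_to_el(subj), en_to_el(rel), en_to_el(obj))
        for subj, rel, obj in english_triples
    ]


# --- Hypothetical stand-ins (assumptions, not real NMT or OIE components) ---

# Lookup tables playing the role of the Greek-to-English and
# English-to-Greek translation models.
EL_EN: Dict[str, str] = {
    "Η Αθήνα είναι η πρωτεύουσα της Ελλάδας.": "Athens is the capital of Greece.",
}
EN_EL: Dict[str, str] = {
    "Athens": "Η Αθήνα",
    "is": "είναι",
    "the capital of Greece": "η πρωτεύουσα της Ελλάδας",
}


def toy_el_to_en(text: str) -> str:
    return EL_EN[text]


def toy_en_to_el(text: str) -> str:
    return EN_EL[text]


def toy_extract(sentence: str) -> List[Triple]:
    """Naive extractor: split a sentence on its first ' is ' copula."""
    subject, sep, obj = sentence.rstrip(".").partition(" is ")
    return [(subject, "is", obj)] if sep else []


if __name__ == "__main__":
    triples = extract_triples_via_translation(
        "Η Αθήνα είναι η πρωτεύουσα της Ελλάδας.",
        toy_el_to_en,
        toy_en_to_el,
        toy_extract,
    )
    for triple in triples:
        print(triple)
```

Passing the translators and the extractor in as callables keeps the outline independent of any particular NMT or OIE toolkit; real components would replace the three toy functions.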