Introducing RezoJDM16k: a French KnowledgeGraph DataSet for Link Prediction

Mehdi Mirzapour, Waleed Ragheb, Mohammad Javad Saeedizade, Kevin Cousot, Helene Jacquenet, Lawrence Carbon, Mathieu Lafourcade


Abstract
Applications of knowledge graphs in industry and academia motivate substantial research on large-scale information extraction from various types of resources. Most available knowledge graphs are either in English or multilingual. In this paper, we introduce RezoJDM16k, a French knowledge graph dataset based on RezoJDM. With 16k nodes, 832k triplets, and 53 relation types, RezoJDM16k can be employed in many downstream NLP tasks for French, such as machine translation, question answering, and recommendation systems. We also provide strong knowledge graph embedding baselines for link prediction to support future benchmarking. Compared with state-of-the-art English knowledge graph datasets used for link prediction, RezoJDM16k shows similarly promising predictive behavior.
Anthology ID:
2022.lrec-1.553
Volume:
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
5163–5169
URL:
https://aclanthology.org/2022.lrec-1.553
Cite (ACL):
Mehdi Mirzapour, Waleed Ragheb, Mohammad Javad Saeedizade, Kevin Cousot, Helene Jacquenet, Lawrence Carbon, and Mathieu Lafourcade. 2022. Introducing RezoJDM16k: a French KnowledgeGraph DataSet for Link Prediction. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 5163–5169, Marseille, France. European Language Resources Association.
Cite (Informal):
Introducing RezoJDM16k: a French KnowledgeGraph DataSet for Link Prediction (Mirzapour et al., LREC 2022)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2022.lrec-1.553.pdf
Data
ConceptNet, FB15k, WN18, WN18RR