----------------------------------
-- Freebase FB15k data --  2013 --
----------------------------------

------------------
OUTLINE:
1. Introduction
2. Content
3. Data Format
4. Data Statistics
5. How to Cite
6. License
7. Contact
-------------------


1. INTRODUCTION:

This FREEBASE FB15k DATA consists of a collection of triplets (entity, relation_type, 
entity) extracted from Freebase (http://www.freebase.com). This data set can 
be seen as a 3-mode tensor depicting ternary relationships between entities. 
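
As an illustration of the tensor view, here is a minimal sketch (not part of the
original release). It stores only the nonzero coordinates of a binary tensor of shape
(entities, relation types, entities). The names build_index and to_coordinates and the
toy identifiers are hypothetical. A sparse form is assumed because, with the sizes
given in section 4, a dense array would hold about 3 x 10^11 cells.

```python
from typing import Dict, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]  # (head, relation_type, tail)


def build_index(items: Iterable[str]) -> Dict[str, int]:
    """Map each distinct item to an integer, in sorted order for reproducibility."""
    return {item: i for i, item in enumerate(sorted(set(items)))}


def to_coordinates(triples: List[Triple]):
    """Return the nonzero coordinates of the binary 3-mode tensor and its shape."""
    entity_index = build_index(e for h, _, t in triples for e in (h, t))
    relation_index = build_index(r for _, r, _ in triples)
    # Each triplet sets exactly one cell (head, relation, tail) of the tensor to 1.
    coords: Set[Tuple[int, int, int]] = {
        (entity_index[h], relation_index[r], entity_index[t]) for h, r, t in triples
    }
    shape = (len(entity_index), len(relation_index), len(entity_index))
    return coords, shape, entity_index, relation_index


if __name__ == "__main__":
    # Toy identifiers, not real Freebase mids.
    toy = [("a", "r1", "b"), ("b", "r2", "c")]
    coords, shape, _, _ = to_coordinates(toy)
    print(shape, sorted(coords))
```

Modes 1 and 3 share the same entity index, so an entity gets the same integer whether
it appears as head or as tail.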

2. CONTENT:

The data archive contains 4 files:
  - README 3K
  - freebase_mtr100_mte100-train.txt 36M
  - freebase_mtr100_mte100-valid.txt 3.7M
  - freebase_mtr100_mte100-test.txt  4.4M

The 3 files freebase_mtr100_mte100-*.txt contain the triplets (training, validation
and test sets).

3. DATA FORMAT

All freebase_mtr100_mte100-*.txt files contain one triplet per line, with 2 mids
(unique Freebase entity identifiers) and a relation type identifier in a tab-separated
format. The first element is the mid of the left-hand side (or head) of the triplet,
the second element is the name of the relationship between the two entities, and the
third element is the mid of the right-hand side (or tail).
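
The format above can be read with a few lines of Python. This is a minimal sketch, not
a loader shipped with the data. The names Triplet and read_triplets are hypothetical,
and the identifiers in the demo are placeholders, not real Freebase mids.

```python
from typing import Iterable, Iterator, NamedTuple


class Triplet(NamedTuple):
    head: str      # mid of the left-hand side entity
    relation: str  # name of the relation type
    tail: str      # mid of the right-hand side entity


def read_triplets(lines: Iterable[str]) -> Iterator[Triplet]:
    """Parse tab-separated 'head<TAB>relation<TAB>tail' lines, skipping blank ones."""
    for line_number, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if not line:
            continue
        fields = line.split("\t")
        if len(fields) != 3:
            raise ValueError(
                f"line {line_number}: expected 3 tab-separated fields, got {len(fields)}"
            )
        yield Triplet(*fields)


if __name__ == "__main__":
    # Placeholder identifiers for illustration only.
    sample = ["/m/aaa\t/some/relation_type\t/m/bbb\n"]
    for triplet in read_triplets(sample):
        print(triplet.head, triplet.relation, triplet.tail)
```

A file object can be passed directly, for instance
read_triplets(open("freebase_mtr100_mte100-train.txt", encoding="utf-8")); the UTF-8
encoding is an assumption.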

4. DATA STATISTICS

There are 14,951 mids and 1,345 relation types among them. The training set contains 
483,142 triplets, the validation set 50,000 and the test set 59,071.

All triplets are unique and we made sure that all entities appearing in
the validation or test sets also occur in the training set.
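
The two properties above can be checked on already parsed triplets. This is a hedged
sketch, not the procedure used to build the data. The helper names (entities_of,
unseen_entities, count_duplicates) are hypothetical and the demo uses toy identifiers.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]  # (head, relation_type, tail)


def entities_of(triples: Iterable[Triple]) -> Set[str]:
    """Collect every mid that occurs as a head or as a tail."""
    found: Set[str] = set()
    for head, _, tail in triples:
        found.add(head)
        found.add(tail)
    return found


def unseen_entities(train: Iterable[Triple], heldout: Iterable[Triple]) -> Set[str]:
    """Entities of a held-out split that never occur in the training split."""
    return entities_of(heldout) - entities_of(train)


def count_duplicates(triples: Iterable[Triple]) -> int:
    """Number of surplus occurrences of triplets that appear more than once."""
    counts = Counter(triples)
    return sum(n - 1 for n in counts.values() if n > 1)


if __name__ == "__main__":
    # Toy splits for illustration only.
    train: List[Triple] = [("a", "r", "b"), ("b", "r", "c")]
    valid: List[Triple] = [("a", "r", "c")]
    print(unseen_entities(train, valid), count_duplicates(train + valid))
```

On the released files, one would expect unseen_entities to return an empty set for the
validation and test splits and count_duplicates to return 0 over the three splits
taken together.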

5. HOW TO CITE

When using this data, one should cite the original paper:
  @incollection{bordes-nips13,
    title = {Translating Embeddings for Modeling Multi-relational Data},
    author = {Antoine Bordes and Nicolas Usunier and Alberto Garcia-Dur\'an and Jason Weston and Oksana Yakhnenko},
    booktitle={Advances in Neural Information Processing Systems (NIPS 26)},
    year={2013}
  }

One should also point to the project page with either the long URL:
https://www.hds.utc.fr/everest/doku.php?id=en:transe , or the short
one: http://goo.gl/0PpKQe .

6. LICENSE:

FB15k data follows the Freebase license, which is Creative Commons Attribution (aka CC-BY) 
(http://creativecommons.org/licenses/by/2.5/).

7. CONTACT

For all remarks or questions please contact Antoine Bordes: antoine
(dot) bordes (at) utc (dot) fr .



