-- Description

- table-2-combsum-thesaurus-50ngh.txt
  Thesaurus resulting from the fusion of the [initial], [symmetry] and [compound] thesauri
  with the late fusion method CombSum (our best method).
  Its global evaluation corresponds to the "CombSum" line of Table 2 of the article.

  format: each line corresponds to one entry followed by the list of its semantic neighbors,
  ordered by decreasing similarity with the entry. Each entry has at most 50 neighbors(*), but
  not all entries necessarily have 50 neighbors.
  
  Each line has the following format:
       <entry>(<space><neighbor><space><similarity>)+

  entry, neighbor: noun lemmas

  This thesaurus contains 14,670 entries.

(*) In the article, this number is equal to 100, but only the first 50 neighbors are provided here
    due to size restrictions.
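
The line format described above can be read with a few lines of Python. The following is only a
minimal sketch, not part of the original distribution. It assumes that entries and neighbors
contain no internal whitespace, that similarities parse as floating-point numbers, and that the
file is encoded in UTF-8; none of these points is stated in this description. The names
parse_line and read_thesaurus are hypothetical.

```python
from typing import Iterator, List, Tuple

Neighbor = Tuple[str, float]


def parse_line(line: str) -> Tuple[str, List[Neighbor]]:
    """Parse one line of the form: <entry>( <neighbor> <similarity>)+

    Assumption: tokens are separated by whitespace and lemmas contain none.
    """
    tokens = line.split()
    # A valid line has one entry plus at least one (neighbor, similarity) pair,
    # so the token count must be odd and at least 3.
    if len(tokens) < 3 or len(tokens) % 2 == 0:
        raise ValueError(f"malformed thesaurus line: {line!r}")
    entry = tokens[0]
    neighbors = [
        (tokens[i], float(tokens[i + 1])) for i in range(1, len(tokens), 2)
    ]
    return entry, neighbors


def read_thesaurus(
    path: str, encoding: str = "utf-8"
) -> Iterator[Tuple[str, List[Neighbor]]]:
    """Yield (entry, neighbors) pairs from a thesaurus file, skipping blank lines.

    Assumption: the UTF-8 default is a guess; adjust it if decoding fails.
    """
    with open(path, encoding=encoding) as handle:
        for line in handle:
            if line.strip():
                yield parse_line(line)
```

For example, parse_line("car vehicle 0.52 truck 0.47") returns the entry "car" with the neighbors
("vehicle", 0.52) and ("truck", 0.47). This sample line is invented for illustration and is not
taken from the dataset. Neighbors are returned in file order, that is, by decreasing similarity.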


-- License

This dataset is licensed by CEA LIST under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (http://creativecommons.org/licenses/by-nc-sa/4.0)

Please cite the following publication if you use this dataset:

Olivier Ferret (2015) Early and Late Combinations of Criteria for Reranking Distributional Thesauri. 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, Beijing, China.

