-------------------------------------------------------------------------
-- RADICAL RESOURCES USED FOR MULTI-GRANULARITY CHINESE WORD EMBEDDING --
-------------------------------------------------------------------------

------------------
OUTLINE:
1. Introduction
2. Content
3. Data Format
4. Data Statistics
5. How to Cite
6. Contact
------------------


------------------
1. INTRODUCTION:
------------------

This archive provides the radical resources used for Multi-Granularity Chinese Word
Embedding. All resources were extracted from the website http://zd.diyifanwen.com/zidian/bs/


------------------
2. CONTENT:
------------------

The archive contains one README file and two resource files:
  - README: the specification document
  - radicalIndex.dict: Chinese characters indexed by radical
  - characterRadicalMap.dict: Chinese character-to-radical map


------------------
3. DATA FORMAT
------------------

Each line of radicalIndex.dict lists one radical followed by the characters indexed
by it, stored in a blank (' ') separated format. The first element is the radical; the
remaining elements are the characters indexed by it.
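
The line format above can be read with a few lines of Python. This is a minimal
sketch, not part of the original release: the function name parse_radical_index and
the sample lines are illustrative assumptions, and the real file contents may differ.

```python
from typing import Dict, Iterable, List


def parse_radical_index(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Map each radical to the list of characters indexed by it.

    Hypothetical helper: each line is expected to hold a radical followed
    by blank-separated characters.
    """
    index: Dict[str, List[str]] = {}
    for line in lines:
        fields = line.split()  # split on blanks, dropping the newline
        if not fields:
            continue  # skip empty lines
        radical, characters = fields[0], fields[1:]
        index.setdefault(radical, []).extend(characters)
    return index


# Illustrative lines in the described format (not copied from the file).
sample_lines = ["氵 江 河 湖", "木 林 森"]
index = parse_radical_index(sample_lines)
print(index["木"])
```

To read the actual file, pass an open file object instead of the sample list, for
example open("radicalIndex.dict", encoding="utf-8"). The UTF-8 encoding is an
assumption; this README does not state the file encoding.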

Each line of characterRadicalMap.dict contains one character and its radical, stored
in a blank (' ') separated format. The first element is the character, and the second
is the radical of that character.
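
The character-radical map can be loaded in the same way. Again a minimal sketch under
assumptions: parse_character_radical_map is a hypothetical name, and the sample lines
only illustrate the described format.

```python
from typing import Dict, Iterable


def parse_character_radical_map(lines: Iterable[str]) -> Dict[str, str]:
    """Map each character to its radical.

    Hypothetical helper: each line is expected to hold a character and
    its radical, separated by a blank.
    """
    mapping: Dict[str, str] = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip empty or incomplete lines
        character, radical = fields[0], fields[1]
        mapping[character] = radical
    return mapping


# Illustrative lines in the described format (not copied from the file).
sample_lines = ["江 氵", "河 氵", "林 木"]
char_to_radical = parse_character_radical_map(sample_lines)
print(char_to_radical["林"])
```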

------------------
4. DATA STATISTICS
------------------

radicalIndex.dict and characterRadicalMap.dict each cover the same 20850 characters
and 268 radicals.
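
Once both files are loaded, the counts above can be checked with a short script. The
sketch below uses toy data and a hypothetical helper, count_entries; it assumes the
index is a dict from radical to a list of characters and the map is a dict from
character to radical.

```python
from typing import Dict, List


def count_entries(index: Dict[str, List[str]],
                  char_map: Dict[str, str]) -> Dict[str, int]:
    """Count distinct radicals and characters in both structures."""
    chars_in_index = {c for chars in index.values() for c in chars}
    return {
        "index_radicals": len(index),
        "index_characters": len(chars_in_index),
        "map_radicals": len(set(char_map.values())),
        "map_characters": len(char_map),
    }


# Toy data for illustration only; the real files are far larger.
toy_index = {"氵": ["江", "河"], "木": ["林"]}
toy_map = {"江": "氵", "河": "氵", "林": "木"}
stats = count_entries(toy_index, toy_map)
print(stats)
```

On the real data, both radical counts would be expected to equal 268 and both
character counts 20850, if the files match the figures stated in this section.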


------------------
5. HOW TO CITE
------------------

When using this resource, please cite the original paper:
  @inproceedings{rongchao2016:MGE,
    title     = {Multi-Granularity Chinese Word Embedding},
    author    = {Rongchao Yin and Quan Wang and Rui Li and Peng Li and Bin Wang},
    booktitle = {Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing},
    year      = {2016}
  }


------------------  
6. CONTACT
------------------

For remarks or questions, please contact Quan Wang:
wangquan (at) iie (dot) ac (dot) cn .


