transliteration pairs 0.003634484
chinese word 0.00357624
chinese data 0.00349409
transliteration variants 0.0031970049999999997
chinese corpus 0.003151837
new transliteration 0.003146313
extraction transliteration 0.0031356409999999998
bootstrapping transliteration 0.0030923559999999997
transliteration strategies 0.0030890419999999997
transliteration variations 0.0030633039999999998
ent transliteration 0.003040659
transliteration forms 0.003039535
chinese characters 0.002845704
transliteration 0.0027668
chinese script 0.0027269290000000003
mandarin chinese 0.002682817
chinese ner 0.002635346
chinese gigaword 0.002623968
chinese texts 0.0026152190000000002
chinese speaking 0.002612255
chinese giga 0.002590339
chinese 0.00232129
language pairs 0.001969664
other language 0.00191667
language pair 0.001674773
other names 0.0016432819999999998
word second 0.001618462
same name 0.001560831
word lists 0.001550094
language variants 0.0015321850000000001
word forms 0.001527685
word sketches 0.001523432
word ending 0.001523432
data mining 0.001517296
original language 0.001517079
ing data 0.001483174
data collection 0.0014755850000000002
data sets 0.0014559680000000002
other set 0.001438173
candidate pairs 0.0013851620000000001
phonological features 0.001378316
name entity 0.00131459
good pairs 0.001300096
large corpus 0.001280008
new pairs 0.001247197
personal name 0.00123693
pairs extraction 0.001236525
variant pairs 0.001221488
seed pairs 0.001207877
foreign name 0.001197081
feature set 0.001195644
other translit 0.001186176
related names 0.001183017
corpus example 0.001179509
different words 0.00117088
whole corpus 0.001167338
different characters 0.0011638199999999999
contrast pairs 0.001160447
other approaches 0.0011580829999999999
phonological representation 0.00115634
phonological rules 0.001155308
pairs notice 0.001135548
gigaword corpus 0.0011332249999999999
phonological representations 0.0011321220000000002
grammatical information 0.001124422
phonological analysis 0.0011178070000000002
phonological value 0.001108444
names entities 0.001107449
edition corpus 0.00110728
language 0.00110198
suitable corpus 0.001100783
occidental names 0.001097448
peoples names 0.001097448
sonal names 0.001097448
phonological similarity 0.001096734
other hand 0.001091285
phonological interpretation 0.001066197
phonological comparison 0.00106342
small set 0.001061517
information aug 0.001045541
extraction algorithm 0.001043729
pair candidates 0.001043357
phonological corrections 0.001039189
phonological reality 0.001039189
phonological checking 0.001039189
text type 0.001033899
levenshtein algorithm 9.89924E-4
preliminary results 9.6203E-4
same period 9.588489999999999E-4
default set 9.4235E-4
different communities 9.31596E-4
different domain 9.26671E-4
seed pair 9.12986E-4
different domains 9.097700000000001E-4
xinhua news 9.043E-4
original term 9.03999E-4
ferent set 8.97426E-4
promising results 8.94215E-4
distinctive feature 8.937699999999999E-4
tagging system 8.82533E-4
