chinese word 0.00370206
word information 0.00347175
word segmentation 0.00344569
chinese character 0.00289665
english word 0.002811676
word segmenter 0.002734676
word seg 0.0027264
standard word 0.002724555
word level 0.00271904
nese word 0.002679928
word bigrams 0.002670756
word infor 0.002655316
pku word 0.002643454
compound word 0.00263913
chinese pos 0.00236924
chinese characters 0.00236864
training data 0.00236129
chinese words 0.00235035
character sequence 0.002335958
segmentation data 0.00228689
chinese abbreviation 0.002258303
character tagging 0.002238653
first character 0.002195974
full character 0.00219043
previous character 0.002156802
same character 0.002144285
pos information 0.00213893
current character 0.001992549
training corpus 0.00195928
model accuracy 0.00191384
chinese language 0.001908177
last character 0.00188995
duplicated character 0.001881856
pos tag 0.001869782
character duplication 0.001848428
ith character 0.001834898
ous character 0.001833999
character dupli 0.001826315
many features 0.001817984
reasonable features 0.001775544
chinese abbreviations 0.001773916
abbreviation corpus 0.001719353
unlabeled data 0.001648117
chinese abbrevia 0.001640279
same model 0.001633455
other words 0.001633001
stanford chinese 0.001616772
chinese abbre 0.001613671
chinese academy 0.001609512
results table 0.001606721
segmented data 0.001606257
feature templates 0.001566293
character 0.00155655
sequence label 0.00154761
testing data 0.001540378
local information 0.001520251
labeling results 0.001515117
learning approach 0.001503245
tagging results 0.001495195
training examples 0.00148793
log data 0.001487821
semantic information 0.001486791
different corpus 0.001484452
search results 0.001483189
sequence labeling 0.001481433
own data 0.001476514
segmentation result 0.001469562
level information 0.00146687
labeling method 0.001441931
abbreviation candidates 0.001438031
features 0.00143682
abbreviation generation 0.001435553
engine information 0.00143007
global information 0.001427326
web information 0.001424315
tagging method 0.001422009
useful information 0.001421305
msu information 0.00141941
ing model 0.001416459
form abbreviation 0.001408064
other sequence 0.001402159
msu segmentation 0.00139335
svm model 0.001385981
dplvm model 0.001383382
information glob 0.001382321
information retrieval 0.001380496
corresponding abbreviation 0.001371721
segmentation errors 0.001365937
different tagging 0.001365405
crfs model 0.001361985
segmentation tools 0.00135667
duplicated characters 0.001353846
specific characters 0.001349519
labeling problem 0.001347988
identical characters 0.001346879
chinese 0.0013401
representative characters 0.001337495
abbreviation prediction 0.001320762
memm model 0.001316041
third characters 0.001312
