unknown word 0.003452421
chinese word 0.003057768
unknown words 0.003020341
new word 0.002998224
word tagging 0.002978282
word distribution 0.002956049
statistical model 0.002903167
model figure 0.002886558
tags words 0.002848431
entropy model 0.002846138
word tokens 0.002830528
hybrid model 0.002820869
word delimiters 0.00281096
word hav 0.00281096
word identification 0.00281096
word resolution 0.00281096
word abc 0.00281096
trigram model 0.002810929
model the 0.002801103
individual model 0.002793638
other words 0.002767855
combined model 0.002766244
model guesses 0.002761271
tistical model 0.00274726
model comple 0.002737284
component words 0.002507844
model 0.00246251
monosyllabic words 0.002419108
known words 0.002409574
loan words 0.002396392
disyllabic words 0.002394775
syllabic words 0.002388232
words chars 0.00238342
trisyllabic words 0.002380131
reduplicated words 0.002378724
words 0.00210463
unknown pos 0.001876238
training corpus 0.001711202
pos tags 0.001704328
training data 0.001654566
such tags 0.00164776
pos information 0.001641596
other models 0.001599361
pos tag 0.001595289
training test 0.001472815
different models 0.001434727
data test 0.001418035
test data 0.001418035
pos context 0.001380419
chinese corpus 0.001377587
statistical models 0.001376793
pos categories 0.001370644
previous pos 0.001369469
character strings 0.001354643
particular pos 0.001340794
pos probabilities 0.001327249
pos category 0.001319377
hybrid models 0.001294495
ing data 0.00128032
data types 0.001273845
individual models 0.001267264
pos guess 0.001252949
unknown nouns 0.001250278
pos cate 0.001248048
pos cat 0.001239355
tistical models 0.001220886
rule tags 0.001219449
dividual models 0.001212019
training lexicon 0.001198871
total training 0.001182003
separate set 0.001171214
nese corpus 0.001144958
other types 0.001137177
chars training 0.001133463
noun morphemes 0.001117828
noun morpheme 0.001083897
chars data 0.001078683
complete set 0.001075078
noun figure 0.001033704
internal structure 0.001031692
tags ranks 0.001019461
different segmentation 9.88679E-4
useful information 9.84747E-4
contextual information 9.82189E-4
character 9.74701E-4
semantic information 9.64151E-4
same time 9.62971E-4
separate task 9.45753E-4
first reduplication 9.40568E-4
other logo 9.39141E-4
models 9.36136E-4
tag description 9.26389E-4
first check 9.22769E-4
component morphemes 9.11386E-4
component characters 9.08128E-4
disyllabic noun 8.99801E-4
several reasons 8.97558E-4
reduplication rules 8.96337E-4
rule distribution 8.94987E-4
joint probabilities 8.94977E-4
