model word 0.00564006
different word 0.005260191
chinese word 0.005126871
segmentation word 0.005087687
word segmentation 0.005087687
new word 0.004974258
other word 0.004857805
word probability 0.0048563109999999994
unknown word 0.004706432
morphological word 0.00470492
lexical word 0.0046956129999999995
word candidate 0.004695489
word lexicon 0.004675716999999999
word class 0.004666102
word list 0.004654506
word detection 0.004634129
possible word 0.004612511
single word 0.0046024709999999995
word candidates 0.004588999
adaptive word 0.004537965
word formation 0.004534601
oov word 0.004531853
allowable word 0.004530139
word identification 0.004529864
nese word 0.004529487
word segmenter 0.004528763
word seg 0.0045222909999999995
word segmentations 0.004519317
generic word 0.004517847
word segmen 0.0045148079999999995
overall word 0.00450666
word classes 0.0045056079999999995
word boundaries 0.004501694
independent word 0.00449839
word break 0.004493926
extraneous word 0.004493926
pendent word 0.004493926
new words 0.002878828
morphological words 0.00260949
lexical words 0.002600183
common words 0.002459631
oov words 0.002436423
segmented words 0.002434576
independent words 0.00240296
model training 0.002274637
words 0.00217584
model feature 0.0020777779999999997
context model 0.002057074
name model 0.001833941
different segmentation 0.001805338
class model 0.0017636219999999998
source model 0.0017199399999999999
training data 0.0017150120000000001
such character 0.001707082
model parameter 0.0016888229999999999
model score 0.0016867219999999999
trigram model 0.001685319
model the 0.001628329
model parameters 0.001607683
parameters model 0.001607683
channel model 0.0016051099999999999
chinese language 0.001586864
different standard 0.0015721469999999999
character string 0.001474388
training corpus 0.0014724579999999998
training set 0.00146104
segmentation standard 0.001399643
different class 0.0013837530000000002
model 0.00136879
chinese sentence 0.0013665040000000002
approach models 0.001365956
character pairs 0.001342754
different applications 0.0013426850000000001
different sampling 0.001330346
character pair 0.00132972
training method 0.001322338
single character 0.001319666
different sets 0.001293585
segmentation errors 0.0012914150000000002
character strings 0.001288417
training algorithm 0.001287719
distribution models 0.001286456
different standards 0.001264359
statistical features 0.001262301
test corpus 0.001251391
test set 0.0012399730000000001
different seg 0.001239942
segmentation system 0.00123444
different ways 0.001231277
different domains 0.001230753
parameter training 0.00122588
different sam 0.001216782
different granularities 0.001214638
different requirements 0.001213886
chinese characters 0.001212887
different vocabularies 0.001212696
bakeoff training 0.001180351
msr training 0.001172947
class models 0.0011710940000000001
rate training 0.001165859
