word model 0.00485016
chinese word 0.00416748
word length 0.003648943
single word 0.003643174
word boundaries 0.003557763
last word 0.003546224
unseen word 0.003484893
real word 0.003468577
preceding word 0.00345503
segmentation algorithm 0.002839624
segmentation problem 0.002698117
chinese words 0.00260639
segmentation system 0.002556175
segmentation errors 0.002528953
character model 0.002504045
segmentation accuracy 0.002465486
segmentation example 0.002449651
language model 0.002368271
segmentation criterion 0.002367382
original segmentation 0.002367211
optimal segmentation 0.002364394
correct segmentation 0.002354322
segmentation points 0.00235044
segmentation algo 0.002340085
inal segmentation 0.002325297
rect segmentation 0.002325297
other words 0.002120798
segmentation 0.00208186
trigram model 0.002048089
good words 0.002037709
common words 0.002027447
interpolation model 0.002007862
history words 0.001974815
long words 0.001942805
unigram model 0.001935184
separate words 0.001929867
unseen words 0.001923803
based model 0.001901191
pound words 0.001893577
problematic words 0.001893577
chinese corpus 0.001833312
chinese character 0.001821365
chinese information 0.00169725
chinese language 0.001685591
words 0.00164988
training data 0.001646895
model 0.00163919
chinese sentence 0.001590344
chinese text 0.00151247
first corpus 0.001433903
chinese characters 0.001433251
text data 0.001403252
sentence test 0.001342104
chinese lms 0.001341787
segmented corpus 0.001307355
test set 0.001292884
segmented data 0.001277845
last character 0.001200109
other language 0.001199999
same set 0.001179049
unsegmented corpus 0.001160946
unsegmented data 0.001131436
gram character 0.001121801
iterative algorithm 0.001106636
whole data 0.001098473
large set 0.001094432
presegmented data 0.001091466
language processing 0.001058225
greedy algorithm 0.001057194
same time 0.00105631
statistical speech 0.001035678
tation algorithm 0.001031942
segmented set 0.001015167
speech recognition 9.92973E-4
segmented text 9.86513E-4
nese language 9.7635E-4
ion accuracy 9.73739E-4
morphosyllabic language 9.72997E-4
independent test 9.5657E-4
chinese 9.5651E-4
second set 9.41513E-4
statistical approach 9.41485E-4
average length 9.23879E-4
fundamental problem 9.19231E-4
initial segmenta 9.03414E-4
initial vocabu 8.98201E-4
segmentat ion 8.86539E-4
same level 8.86058E-4
partial sentence 8.78292E-4
lus ion 8.77835E-4
corpus 8.76802E-4
luat ion 8.76364E-4
character 8.64855E-4
serious problem 8.634980000000001E-4
ing procedure 8.45694E-4
good source 8.39266E-4
long sentences 8.37369E-4
first cor 8.28907E-4
human segmen 8.23697E-4
text segmenta 8.127E-4
