word segmentation 0.00347291
word length 0.002923631
morphological word 0.002805523
segmentation system 0.0026078300000000002
word boundaries 0.002499495
unsupervized word 0.002450528
word structures 0.002378919
nese word 0.0023759469999999998
segmentation systems 0.0019659789999999996
sentence segmentation 0.0018529759999999999
high segmentation 0.001852785
segmentation problem 0.001808482
training corpus 0.0017991320000000002
various segmentation 0.0017566869999999998
segmentation scores 0.001707105
different segmentation 0.0016765859999999999
mentation system 0.001658256
unsupervized segmentation 0.001655838
supervized segmentation 0.001655015
model 0.00164196
segmentation decisions 0.0016205059999999999
chinese data 0.0016185919999999999
segmentation sys 0.001612068
segmentation guidelines 0.0016117339999999999
segmentation bakeoff 0.001597699
unsupervized system 0.0015854480000000002
supervized system 0.001584625
segmentation guide 0.0015790679999999999
system deal 0.0015203600000000001
human language 0.001392824
segmented corpus 0.001386609
cityu corpus 0.001375011
est corpus 0.0013612000000000001
segmentation 0.00133911
input data 0.0013258319999999999
empirical data 0.001319488
system 0.00126872
grammatical words 0.001245613
bakeoff data 0.001236753
other systems 0.001233269
same length 0.001222335
data visualization 0.0012187069999999999
length level 0.001132302
corpus 0.00111508
content words 0.0010897950000000002
other errors 0.001013933
chinese text 0.0010116419999999999
chinese characters 0.001007195
previous systems 0.001002312
overall results 9.58651E-4
segmented training 9.55581E-4
decoding algorithm 9.51232E-4
chinese numbers 9.49777E-4
good performance 9.44523E-4
unsupervized systems 9.43597E-4
supervized systems 9.42774E-4
processing systems 9.20328E-4
chinese speaking 9.19642E-4
high frequency 9.063249999999999E-4
mandarin chinese 9.05795E-4
same corpora 8.9589E-4
questionable results 8.85396E-4
report results 8.85396E-4
chinese script 8.82625E-4
standard baseline 8.730789999999999E-4
performance levels 8.632819999999999E-4
other end 8.48379E-4
language 8.40774E-4
nvbe algorithm 8.38546E-4
words 8.29982E-4
many errors 8.25242E-4
challenging task 8.22743E-4
various seg 8.185919999999999E-4
human labor 7.974740000000001E-4
human effort 7.974740000000001E-4
length 7.89831E-4
many text 7.88923E-4
different number 7.87428E-4
branching entropy 7.837510000000001E-4
balanced cor 7.7194E-4
linguistic hypothesis 7.680250000000001E-4
possible segmentations 7.67564E-4
unsupervized learning 7.64248E-4
linguistic boundary 7.60794E-4
default value 7.5915E-4
criminative value 7.5915E-4
boundary situation 7.447350000000001E-4
many studies 7.407E-4
msr corpora 7.38767E-4
different seg 7.38491E-4
such characters 7.384399999999999E-4
different types 7.369900000000001E-4
segmented corpora 7.34915E-4
large vbe 7.29515E-4
iterative process 7.19042E-4
unsupervized seg 7.177430000000001E-4
differents values 7.13402E-4
right context 7.06646E-4
large part 7.02951E-4
clear boundaries 7.0287E-4
