word segmentation 0.00396449
chinese word 0.003806786
word sequence 0.00350575
first word 0.003422315
word length 0.003301772
current word 0.003292369
word level 0.003192776
word segmenta 0.003190196
word boundaries 0.003187662
word list 0.003186163
nese word 0.003181964
word disambiguation 0.003178587
word sequences 0.003154062
word segmen 0.003145885
independent word 0.003137427
word composition 0.003134909
character words 0.00262474
chinese character 0.002389326
character sequence 0.00208829
other features 0.002016645
first character 0.002004855
training corpus 0.001953101
tagging segmentation 0.001885177
training data 0.001843981
segmentation results 0.001832332
chinese characters 0.001826606
character level 0.001775316
character encodings 0.001762366
character sequences 0.001736602
large training 0.001657929
language model 0.00164382
top words 0.001618571
last words 0.001563342
new words 0.001555052
iob tag 0.001540198
training size 0.001522924
tagging approach 0.001517771
common words 0.001504443
segmentation process 0.0014988
rate training 0.001486575
frequent words 0.001481124
training corpora 0.001480389
character 0.00144884
entropy model 0.001434953
segmentation framework 0.001434381
features 0.00143137
segmented training 0.001409063
encodings training 0.001396296
tag sequences 0.001385702
training stage 0.001375828
mental segmentation 0.001372222
search method 0.001371434
segmentation signifi 0.001369762
binary feature 0.001368682
segmentation ambi 0.00136694
maxent model 0.001308451
guage model 0.001299037
language models 0.001237937
tagging approaches 0.001231547
same approach 0.001230397
iob tagging 0.001229245
ing data 0.001224187
chinese char 0.00122112
pku corpus 0.001211598
test data 0.001209101
tagging methods 0.001194506
measure approach 0.001193411
top results 0.001176813
words 0.0011759
iob tags 0.001164977
error rate 0.001163835
alphabetical characters 0.001157717
beam search 0.001156853
corpus statistics 0.001144317
corpus abbrev 0.001140668
entropy approach 0.001137787
good results 0.001132092
different lexicon 0.001103408
segmentation 0.00109819
standard sighan 0.001087853
training 0.00108277
optimal performance 0.001078111
experimental results 0.001068553
tation performance 0.001058854
subword tagging 0.001057265
feature 0.00105365
delta function 0.001039288
ging approach 0.001034339
minimum error 0.001031871
iob tagger 0.001031116
mentation results 0.001030629
ter results 0.001030625
other approaches 0.001029835
imental results 0.001028158
model 0.00102795
comparative results 0.001023363
ging method 0.001018092
crf approach 0.001009709
viterbi search 0.001009686
contrary results 0.001004444
