word segmentation 0.00406174
chinese word 0.00374719
training data 0.00342962
word segmenter 0.003154756
segmentation model 0.00292756
word seg 0.00286347
domain data 0.002801359
unsupervised word 0.002782204
data set 0.002766408
word formation 0.002763561
tag data 0.002754836
word segmenta 0.00272842
unlabeled data 0.002565725
labeled data 0.002546841
data size 0.002488804
pku data 0.002484291
development data 0.002442496
data changes 0.002421364
data tions 0.002411894
character sequence 0.002262807
same segmentation 0.002187486
segmentation performance 0.002182042
first character 0.002159163
training corpus 0.002152869
segmentation results 0.00212469
segmentation problem 0.002088885
segmentation standard 0.002082921
chinese characters 0.002064321
character labeling 0.002016764
last character 0.001939877
character instances 0.001936475
training set 0.001910368
segmentation problems 0.001909257
segmentation tools 0.001895269
poor segmentation 0.001888892
segmentation peng 0.001888892
chinese english 0.001758959
chinese seg 0.00172682
new words 0.001680011
single chinese 0.001661729
training process 0.001646868
segmentation 0.00161982
different methods 0.001608301
international chinese 0.001606784
character 0.00159233
different size 0.001491284
baseline feature 0.001440519
punctuation information 0.001407317
vocabulary words 0.001345213
news corpus 0.001339303
supervised learning 0.001322363
model 0.00130774
chinese 0.00130527
training 0.00128679
segmented corpus 0.001286625
tag sequence 0.001282483
new domain 0.00127175
useful information 0.001258747
feature templates 0.001254454
discrete features 0.001245026
tag set 0.001235584
pku corpus 0.00120754
puctuation information 0.001185189
simple approach 0.00118202
discrete feature 0.001176463
labeling approach 0.001168533
anced corpus 0.00116655
other tag 0.001158614
results method 0.001149579
enlarged corpus 0.001136482
sequence labeling 0.001094911
algorithm input 0.001091634
target domain 0.001069316
words 0.00106679
algorithm our 0.001061441
our algorithm 0.001061441
confident characters 0.001059578
opposite approach 0.001058329
baseline methods 0.001002989
source domain 9.94263E-4
whole work 9.86355E-4
tag sets 9.70877E-4
features 9.69084E-4
punctuation news 9.66162E-4
related work 9.64859E-4
number punctuation 9.63345E-4
supervised methods 9.57878E-4
new fea 9.48678E-4
pure sequence 9.44844E-4
label instances 9.44024E-4
english number 9.24096E-4
contrasting method 9.17813E-4
first line 9.15578E-4
information 9.14379E-4
new kind 9.05725E-4
tag balance 9.02172E-4
feature 9.00521E-4
main results 8.9941E-4
label ’n’ 8.97134E-4
labeling problem 8.93499E-4
