character word 0.00444573
chinese word 0.00442878
word segmentation 0.004145533
english word 0.0038850969999999997
word problem 0.003867424
word frequency 0.0038217389999999998
informal word 0.003815017
word list 0.003760125
word detection 0.003754384
word example 0.003745646
word seg 0.003737088
formal word 0.0037203009999999996
nese word 0.0037135429999999997
word boundaries 0.003710551
word recognition 0.0036801499999999997
online word 0.0036765879999999997
unknown word 0.0036753769999999996
mal word 0.003660061
word annotations 0.003659645
stop word 0.003658547
word delimiters 0.0036529429999999996
word orthography 0.00365156
lexical features 0.002330777
new features 0.002086099
chinese character 0.00202997
statistical features 0.00202651
informal words 0.0019236569999999998
same feature 0.001898912
tion words 0.001854929
component words 0.0018308999999999999
formal words 0.001828941
novel features 0.001825971
mal words 0.001768701
noisy words 0.0017609779999999998
feature set 0.0017553680000000002
statistical feature 0.00172604
chinese language 0.0016897409999999998
feature functions 0.0016691140000000002
training data 0.0016394600000000001
features 0.00157799
this feature 0.001548916
character sequence 0.0015482199999999999
input feature 0.001546665
broad feature 0.00154457
feature classes 0.0015418710000000002
words 0.00153091
whole feature 0.0015304670000000002
differential feature 0.001507736
chinese characters 0.0015025759999999998
crf model 0.001468949
character lexicon 0.001451501
chinese pinyin 0.001445431
classification model 0.001435644
lcrf model 0.00139657
fcrf model 0.0013502380000000001
current character 0.001342881
character bigrams 0.001333497
chinese form 0.001321231
input character 0.0012926049999999998
feature 0.00127752
chinese microblogs 0.00127303
ing data 0.001272706
character bigram 0.00127024
same corpus 0.001267633
inconsistent character 0.0012659309999999999
chinese lan 0.001261
chinese microblog 0.001258145
chinese microtext 0.001254762
character doubling 0.0012539449999999998
english translation 0.0012500060000000001
chinese infor 0.00124476
training set 0.001238212
learning models 0.001224586
first work 0.001193965
unlabeled data 0.0011878309999999999
other task 0.001159104
data preparation 0.001108953
sequence label 0.00110513
model 0.00110377
other learning 0.001101047
same set 0.00109924
language use 0.001077322
same pinyin 0.001060313
previous work 0.001050475
standard pinyin 0.001039091
such informality 0.001023607
character 0.00102346
chinese 0.00100651
different pmi 0.001000456
language change 0.000998903
output label 0.000998288
training instances 0.00099088
fcrf models 0.000984815
other tasks 0.000984282
recent work 0.000982671
different variables 0.000982243
field models 0.000981033
mutual information 0.0009726050000000001
language processing 0.000970865
same way 0.000968379
