clustering word 0.002824152
word clustering 0.002824152
analysis word 0.0027435669999999997
word classes 0.0027090129999999997
distributional word 0.002662576
unknown word 0.00264375
word frequency 0.0026162439999999998
function word 0.002606536
word classification 0.0026058879999999998
word sense 0.002593306
word cluster 0.002586143
oov word 0.002579782
features data 0.00257123
word types 0.002560681
word forms 0.002556441
unsupervised word 0.002554911
word segmentation 0.002518353
word clusters 0.002497978
word recognition 0.002472756
other words 0.002470873
word fre 0.002463263
ferent word 0.0024616209999999998
word partition 0.002454227
word unigrams 0.002453019
word bigrams 0.002453019
word formation 0.0024514009999999998
explicit word 0.0024497629999999998
agglomerative word 0.002449181
vest word 0.002449181
upervised word 0.002449181
word clus 0.002449181
tomatic word 0.002449181
such words 0.0024450879999999998
tagging model 0.002366385
new words 0.002335683
language model 0.0023015320000000002
parsing model 0.002223271
different pos 0.002212513
class words 0.002176003
different features 0.002156993
supervised model 0.0021547520000000002
words table 0.002119064
unknown words 0.00211764
similar words 0.0021057899999999997
training data 0.002090955
new data 0.002090353
pos tagger 0.002087131
function words 0.002080426
classification model 0.002076358
hmm model 0.002070974
same data 0.00206863
linguistic data 0.002057607
oov words 0.002053672
english words 0.002048794
crf model 0.002008358
bagging model 0.00200385
discriminative model 0.001989355
bigram model 0.001983686
related words 0.001979426
combination model 0.0019773810000000003
data chinese 0.00197468
infrequent words 0.001965514
final model 0.001960543
sequential model 0.001956539
text data 0.001937372
pos tag 0.001934971
pos tagging 0.001929965
ging model 0.0019276850000000002
content words 0.0019239259999999998
quent words 0.0019235889999999999
quency words 0.001923554
test data 0.001923123
data set 0.001888082
different feature 0.001874328
labeled data 0.00187299
new features 0.001840323
clustering features 0.001802682
data problem 0.001787126
chinese pos 0.00178017
data column 0.001775824
unlabeled data 0.00177143
popular data 0.001753541
classes pos 0.001743063
different time 0.001736589
data sets 0.0017192219999999999
data brown 0.001701206
pos tags 0.001699914
data setting 0.0016975789999999998
sampling data 0.001693647
development data 0.00169199
tagger parser 0.00168607
parser tagger 0.00168607
pos label 0.001684635
gigaword data 0.001682835
sparse data 0.001679474
data consortium 0.0016781489999999999
velopment data 0.0016781489999999999
baseline features 0.0016744449999999999
morphological features 0.001660028
many features 0.0016587809999999998
