labeled data 0.004168288
unlabeled data 0.004158714
data points 0.004021169
new data 0.003985317
data set 0.003984181
standard data 0.003838677
tagging data 0.003759975
data point 0.003729261
journal data 0.0037173889999999998
original data 0.003667384
available data 0.003661528
representative data 0.003649829
lar data 0.00363794
development data 0.003636167
representative data 0.003636167
learning algorithm 0.002162934
new learning 0.001991277
word sense 0.00198117
learning algorithms 0.001744007
learning curve 0.00165007
learning technique 0.001646111
cluster label 0.001626829
other words 0.001617121
classification accuracy 0.001600715
classification time 0.001532477
feature vector 0.001515635
classification algorithms 0.001421928
new training 0.001403716
training set 0.00140258
neighbor classification 0.001390977
accurate classification 0.001339553
several points 0.0013297
nonparametric classification 0.001322363
learning 0.00131905
ing algorithm 0.001318281
neighbor algorithm 0.00123789
cnn algorithm 0.001228603
vocabulary words 0.001206305
class labels 0.001188168
euclidean distance 0.001172831
algorithm store 0.00116923
class prediction 0.001169209
correct label 0.001160918
training sets 0.001158171
first column 0.001153514
similar points 0.001146403
condensed training 0.001145323
positive results 0.001133916
input tagger 0.001123715
condensed models 0.001118445
bayes probability 0.001114862
overall training 0.001101856
language processing 0.00109066
sense disambiguation 0.001089926
set condensation 0.001087869
condensed set 0.001084925
natural language 0.001078543
national corpus 0.001074334
neighbor classifier 0.001037869
second experiment 0.001033209
brown corpus 0.00102875
intractable problem 0.001026669
unsupervised tag 0.001019216
speech tagger 0.001014341
model 0.00101292
good representatives 0.001010946
many algorithms 0.001004562
new wcnn 0.001002798
good representative 0.000997566
set condensation 0.000997318
classification 0.000996971
good decision 0.000996429
set condensation 0.000994831
second column 0.000989225
such algorithms 0.000978103
large number 0.000973875
pervised tagger 0.000956291
several strategies 0.000954774
feature 0.000945676
accuracy increases 0.000931836
selection criterion 0.000918656
simple selection 0.000902753
words 0.000878926
dependency parsing 0.000876864
see figure 0.000873384
simple technique 0.00086792
cluster 0.000863655
possible representatives 0.000862782
error reduction 0.000862124
third column 0.00085228
small subsets 0.000850318
red scatter 0.000848612
algorithm 0.000843884
condensation algorithms 0.000841735
substantial improvements 0.000833599
condensation techniques 0.00082999
distance 0.000826722
scatter plot 0.00082649
condensation technique 0.000825724
case unsupervised 0.000824591
