unlabeled data 0.0028821199999999997
training data 0.002802966
data set 0.0027387849999999997
labeled data 0.002724122
data sets 0.002532511
online data 0.0025213139999999998
noisy data 0.0024777939999999997
second data 0.002470577
newsgroups data 0.0024408959999999997
webkb data 0.0024217089999999998
data sources 0.002414627
data the 0.002406673
uci data 0.0024045119999999997
word probability 0.002018686
first word 0.0020106309999999997
word similarity 0.001976564
word keywords 0.0019485919999999999
title word 0.001831764
word weights 0.001805061
times word 0.00175355
word tmax 0.00173249
supervised method 0.001529375
following method 0.0014717089999999999
many words 0.00143843
text classifier 0.00142389
other categories 0.001421638
title words 0.0014170440000000001
learning classifier 0.001401809
similar words 0.001399963
new method 0.001368247
supervised learning 0.0013479009999999999
content words 0.001338863
clustering method 0.001338191
candidate words 0.0013298939999999999
feature selection 0.0013265759999999999
selection method 0.001305672
categorization method 0.0012894080000000001
statistical feature 0.001286727
category set 0.001279423
similarity value 0.001243741
extraction method 0.001237191
feature projection 0.001235921
feature dimension 0.00123185
feature projections 0.001200549
text classifiers 0.001200202
method ourmethod 0.001195101
hybrid method 0.001186149
labeled training 0.001182068
important information 0.001181245
enhancing method 0.001179956
frequency value 0.001160504
learning approach 0.001158699
empirical results 0.0011477599999999998
text categorization 0.0011300149999999998
unlabeled examples 0.001128794
other classifiers 0.001122053
learning approaches 0.001116567
category frequency 0.001106895
other approach 0.001102631
other clustering 0.001100649
standard information 0.001096138
words 0.00108486
automatic text 0.00108194
learning algorithms 0.001077563
first problem 0.00107367
output value 0.0010650149999999999
context similarity 0.001059155
new training 0.001052826
training examples 0.0010496400000000001
category title 0.001045332
other title 0.0010405190000000002
other study 0.001029036
threshold value 0.00102765
information bottleneck 0.001024809
accurate learning 0.001023714
test document 0.001022749
binary value 0.001018644
information student 0.001017436
high performance 0.001007542
statistics value 0.001004226
experiment results 0.001004083
different number 0.001003707
good features 0.001002571
numerical value 0.001001222
category names 0.000995378
first task 0.000994305
seed information 0.000983048
unlabeled documents 0.000976647
information retrieval 0.0009749870000000001
contextual information 0.000970175
sequential information 0.000969329
same way 0.000967589
feature 0.000966781
frequent categories 0.0009620519999999999
keywords first 0.0009600629999999999
miscellaneous categories 0.000958962
semantic similarity 0.000954447
experimental results 0.000953495
performance measure 0.000952841
appropriate categories 0.000951866
