feature feature 0.0023021
text categories 0.00218265
automatic text 0.002120332
text classification 0.00208444
text categorization 0.002066596
text summarization 0.001941455
text representation 0.001928194
standard text 0.001918638
training data 0.001878617
length text 0.00183174
particular text 0.001814824
matic text 0.001814737
text cate 0.001784702
text categoriza 0.001770612
tomatic text 0.001770612
data set 0.001723482
other data 0.001687898
test data 0.001660886
feature value 0.001645655
feature selection 0.001563644
training set 0.001525399
input features 0.001501767
feature values 0.001489551
input feature 0.001446117
feature input 0.001446117
possible feature 0.001443295
ing data 0.001432078
data representation 0.001429654
boolean feature 0.001421775
information retrieval 0.001407183
feature vectors 0.001391007
feature inputs 0.001384839
test documents 0.001368698
newsgroups data 0.00131045
word class 0.001309016
test set 0.001307668
learning method 0.001285131
categorization documents 0.001275868
annotated data 0.001275606
new training 0.001248827
automatic keyword 0.001241352
efficient information 0.001236618
frequency value 0.001213562
features 0.0012067
manual keywords 0.001179858
term frequency 0.00117746
different types 0.001167469
first set 0.001166459
keyword extraction 0.001165685
word unigram 0.001160898
different systems 0.001160447
different ways 0.001154332
feature 0.00115105
evaluation method 0.00115069
sentence extraction 0.001142745
extraction system 0.001137005
different weights 0.001128853
other experiments 0.001128254
words representation 0.001121271
training phase 0.001116201
important words 0.001098844
whole training 0.001090095
learning algorithm 0.001087281
different repre 0.001081417
automatic summary 0.001080703
impact keywords 0.001073924
keywords this 0.001068463
small set 0.001067641
last set 0.001066391
only keywords 0.001060619
few documents 0.001053567
second set 0.001047656
assigned keywords 0.001047581
tracted keywords 0.001046156
signed keywords 0.001042601
length documents 0.001041012
categorization task 0.001029406
categorization experiments 0.001008412
automatic summaries 0.001001378
extraction algorithm 9.90927E-4
automatic summarization 9.88007E-4
reuters corpus 9.86579E-4
unseen documents 9.85005E-4
empty documents 9.84253E-4
map documents 9.81176E-4
information 9.80514E-4
machine learning 9.72433E-4
test sets 9.69405E-4
keyword unigram 9.62221E-4
experimental set 9.57586E-4
keyword representations 9.51319E-4
third set 9.49258E-4
known categories 9.4599E-4
specific document 9.43426E-4
intact keyword 9.4061E-4
fourth set 9.39209E-4
other approaches 9.30252E-4
old set 9.20916E-4
selection approach 9.13787E-4
keyword indexing 9.13185E-4
