word frequency 0.002572942
word frequencies 0.002266064
specific word 0.0022627
vvt word 0.0022518110000000003
word vilt 0.002237582
vilt word 0.002237582
same data 0.00188411
training data 0.001776041
test data 0.001629167
same words 0.001599441
other words 0.001479019
textual data 0.001449928
data recognition 0.001424078
put data 0.001419581
weighting method 0.001275716
different words 0.001272539
new text 0.0012659
text dependency 0.0012621289999999999
average method 0.001244925
same category 0.001215556
method table 0.001202581
weighted words 0.00119513
square method 0.001193431
same frequency 0.001182332
text categorization 0.001180869
current method 0.001175641
training texts 0.0011708180000000001
method compar 0.001164328
words arc 0.001159783
other texts 0.001158465
method the 0.001153018
method air 0.001150954
vector model 0.001147267
tain words 0.001144466
automated method 0.0011384960000000001
uarc method 0.001137122
automatic text 0.0011356229999999999
similarity value 0.001113072
category training 0.0011074869999999999
conventional text 0.001089879
ferent text 0.0010898309999999999
text clas 0.001087693
ticular text 0.0010864009999999999
text chtegoriza 0.0010864009999999999
text catego 0.0010864009999999999
text vwuld 0.0010864009999999999
text categoriza 0.0010864009999999999
probabilistic model 0.001075407
test texts 0.001023944
new texts 9.89593E-4
recall value 9.601049999999999E-4
different texts 9.51985E-4
method 9.46009E-4
same domain 9.43448E-4
words 9.41491E-4
clustering texts 9.3867E-4
ing algorithm 9.34533E-4
document frequency 9.11484E-4
weighted value 9.1138E-4
same unit 8.89737E-4
constant value 8.80551E-4
mean value 8.775829999999999E-4
same sense 8.70028E-4
experiment category 8.67727E-4
deviation value 8.621869999999999E-4
tic information 8.53811E-4
ation value 8.49756E-4
decrease value 8.46516E-4
increment value 8.46516E-4
viation value 8.46516E-4
clustering algorithm 8.36624E-4
many categories 8.3625E-4
category name 8.32132E-4
probability theory 8.283909999999999E-4
textual information 8.25679E-4
other methods 8.229120000000001E-4
erent texts 8.09721E-4
similarity values 8.09378E-4
category assignment 8.030159999999999E-4
surface information 7.97182E-4
model 7.89021E-4
context dependency 7.873800000000001E-4
total frequency 7.78936E-4
first experiment 7.70555E-4
possible pairs 7.70393E-4
assigned categories 7.639610000000001E-4
suitable category 7.63171E-4
training documents 7.61932E-4
similarity measure 7.59324E-4
other approach 7.57965E-4
large number 7.51262E-4
journal corpus 7.4638E-4
term weighting 7.45736E-4
term weight 7.413610000000001E-4
training phase 7.37462E-4
other hand 7.33264E-4
wsj corpus 7.2882E-4
experiment term 7.261500000000001E-4
other mcthods 7.26024E-4
probabilistic models 7.2186E-4
