word counts 0.001786789
word dis 0.0017202209999999999
multinomial model 0.001494994
stochastic model 0.001453803
event model 0.0014426040000000001
nomial model 0.00143224
training documents 0.001227634
probability distribution 0.001215021
model 0.0012044
language process 0.0011994100000000001
other class 0.001183
text documents 0.0011768669999999998
natural language 0.001149206
same class 0.001143992
same distribution 0.001133462
different classes 0.001125066
frequent words 0.00112059
standard data 0.0011150230000000001
test documents 0.0010881979999999999
same length 0.001073027
average probability 0.001060648
different distributions 0.001058568
ing documents 0.001040273
machine learning 0.001034775
new feature 0.0010058340000000002
positive documents 9.935249999999999E-4
tive documents 9.68605E-4
data sets 9.63513E-4
feature selection 9.61406E-4
text classification 9.56554E-4
short documents 9.5501E-4
small error 9.29088E-4
probability distribu 9.28989E-4
mutual information 9.26423E-4
national corpus 9.1986E-4
language 9.08954E-4
feature selec 8.96986E-4
test document 8.83176E-4
words 8.69788E-4
new document 8.41428E-4
bayes classifier 8.3896E-4
individual training 8.37628E-4
similar distribution 8.260979999999999E-4
learning technique 8.01799E-4
recall values 7.853770000000001E-4
true class 7.748189999999999E-4
training doc 7.64332E-4
classification accuracy 7.61828E-4
previous section 7.58903E-4
likely class 7.56026E-4
vocabulary size 7.55985E-4
naive bayes 7.5533E-4
time requirement 7.51359E-4
uneven distribution 7.44868E-4
cation error 7.442060000000001E-4
random variables 7.41383E-4
document generation 7.411109999999999E-4
error bars 7.37187E-4
following considerations 7.34921E-4
ual document 7.300249999999999E-4
new method 7.23659E-4
equal weight 7.23238E-4
log p˜d 7.223069999999999E-4
popular machine 7.220169999999999E-4
additional experiments 7.16498E-4
documents 7.05686E-4
previous studies 7.02417E-4
probability 7.00164E-4
selection score 6.993430000000001E-4
microaveraged recall 6.99182E-4
macroaveraged recall 6.96462E-4
eraged recall 6.929270000000001E-4
est recall 6.85857E-4
prior probabilities 6.745270000000001E-4
feature 6.6507E-4
many applications 6.6433E-4
ing tasks 6.59722E-4
future work 6.566580000000001E-4
known classes 6.56639E-4
corpus 6.53107E-4
average divergence 6.47346E-4
tive classes 6.47211E-4
random vari 6.46481E-4
maximum likelihood 6.408360000000001E-4
common practice 6.39219E-4
web content 6.390930000000001E-4
many domains 6.35731E-4
true classes 6.33724E-4
additional com 6.27056E-4
ment probabilities 6.25605E-4
total number 6.23057E-4
mutual infor 6.17816E-4
homogeneous classes 6.16046E-4
gle classifier 6.15848E-4
ing functions 6.08759E-4
reuters dataset 6.08213E-4
strong connection 6.002690000000001E-4
ditional probabilities 5.95458E-4
token num 5.93267E-4
alphabetic characters 5.9245E-4
