english wikipedia 0.002171347
article revision 0.002157263
wikipedia revision 0.002003933
wikipedia quality 0.001958343
article structure 0.001949727
article topics 0.001923345
article sets 0.001876772
article versions 0.001872001
good article 0.00185048
available article 0.001840151
lexical article 0.001839185
featured article 0.001836917
article scope 0.001830613
flawed article 0.001810461
whole article 0.001807384
arbitrary article 0.001788868
unseen article 0.001784784
article criteria 0.00178425
wikipedia xml 0.00175091
news articles 0.001738308
training data 0.001738033
wikipedia manual 0.001720697
java wikipedia 0.001699592
negative articles 0.001691104
wikipedia community 0.001650656
wikipedia revi 0.00163369
wikipedia library 0.001632954
other text 0.001582335
articles templates 0.001574282
article 0.0015567
labeled articles 0.001525094
description articles 0.001511523
featured articles 0.001500087
excellent articles 0.001481732
flawed articles 0.001473631
whole articles 0.001470554
untagged articles 0.001468785
unseen articles 0.001447954
articles avail 0.001445275
affected articles 0.001445275
wikipedia 0.00140337
many features 0.001397975
other template 0.001394575
category feature 0.001384142
template information 0.001376512
training quality 0.001350142
classification approach 0.001347703
classification performance 0.001344081
learning approach 0.001340532
quality information 0.001328354
training set 0.001319342
data selection 0.001267439
negative training 0.001266403
language detection 0.001248179
articles 0.00121987
annotated data 0.001219219
data min 0.001187854
first revision 0.001185053
training instances 0.001184887
feature type 0.001183938
weka data 0.001179508
topic set 0.001164723
work topic 0.001163735
other templates 0.001145856
human performance 0.001139717
same topic 0.001135477
reliable training 0.001130763
work template 0.001126316
positive training 0.001126134
similarity flaw 0.001123651
meta features 0.001123054
learner english 0.001121194
training sets 0.001115241
other sets 0.001111516
information page 0.001110693
quality flaw 0.001103697
stylometric features 0.001089984
cleanup template 0.001066491
text classifica 0.001061118
text classifi 0.001058451
native language 0.001057672
word length 0.001053283
performance evaluation 0.001052
same quality 0.0010499
same flaw 0.001043651
other hand 0.001031388
training instance 0.001029668
training phase 0.001026192
plain text 0.001022933
information management 0.001019529
topic bias 0.001019052
different dataset 0.001016971
many cleanup 0.001016964
topical similarity 0.001013422
topic distribution 0.001010866
unstructured information 0.001005902
miscellaneous information 0.001005247
machine learning 0.001001816
ifiable information 0.001000964
information effi 0.001000964
