data features 0.00250179
category features 0.00229307
classification features 0.002100164
different feature 0.001962284
feature set 0.001908593
data set 0.001885333
category classification 0.001860594
same data 0.0018467
training data 0.001831588
language features 0.001779511
feature value 0.001769647
edit category 0.001765809
data language 0.001748661
markup features 0.001746334
test data 0.001735307
other category 0.001697683
category set 0.001676613
new features 0.001656895
entity features 0.001654274
category classifier 0.001634706
feature selection 0.001616994
edit categories 0.001613162
single feature 0.001602346
feature space 0.001600765
standard data 0.001597614
edit classification 0.001572903
data mining 0.001564633
individual features 0.00156125
ratio feature 0.001559191
other categories 0.001545036
ing data 0.001542052
feature reduction 0.001540647
data source 0.00153206
textual features 0.001527137
novel feature 0.001511437
sification features 0.001509788
labeled data 0.001507892
history data 0.001504747
meta data 0.001499028
traditional feature 0.001489403
huge data 0.001489389
feature spaces 0.001487294
feature groups 0.001483953
feature reduc 0.001480611
java data 0.001480123
category corpus 0.001475721
wikipedia edit 0.001451598
different edit 0.001442613
classification algorithm 0.001429006
same text 0.001422299
different wikipedia 0.001416093
category distribution 0.001391562
text similarity 0.001388443
word list 0.001366769
single word 0.001356706
overall category 0.001355932
category system 0.001344495
category base 0.00132078
category clas 0.00131647
majority category 0.001313578
word count 0.001306671
binary category 0.001304554
other classifier 0.001278889
diff pos 0.001275845
factual category 0.001273449
word lists 0.001271167
category taxonomy 0.001270462
category classifica 0.001270359
features 0.00126632
new categories 0.001264678
vandalism word 0.001261526
feature 0.00125873
category mapping 0.001254636
category baselines 0.001248502
classification task 0.001242998
vulgarism word 0.001238126
classification accuracy 0.001233939
wikipedia articles 0.001233696
different articles 0.001224711
label classification 0.001221585
same classifier 0.001219186
single categories 0.001217719
pos tags 0.001208087
pos tag 0.001198548
classification scores 0.001194443
different diff 0.00117977
individual categories 0.001169033
basic classification 0.001166754
overall classification 0.001163026
classification process 0.001149843
test set 0.0011497
english wikipedia 0.001149079
false pos 0.001146819
semantic similarity 0.001143361
classification tasks 0.001143248
type pos 0.001137707
classification systems 0.001134895
frequent edit 0.001132975
pair classification 0.001129568
pos tagger 0.001129449
