training data 0.00280004
unlabeled data 0.002472284
labeled data 0.002286651
ing data 0.002168157
annotated data 0.002082278
word sense 0.00205845
development data 0.002025561
balanced data 0.002015236
ment data 0.002013556
data scarcity 0.002007136
other words 0.001812148
positive class 0.001803422
training set 0.001734898
training examples 0.001729458
classifier training 0.00171847
minority class 0.001651229
class distribution 0.00164087
first word 0.001633559
text classification 0.001633307
class prediction 0.001625703
class distributions 0.001620603
training instances 0.001588204
skewed class 0.001580216
ity class 0.001556461
class skewness 0.001538502
labeled training 0.001533011
classification task 0.001530087
many words 0.001523199
learning algorithm 0.001523179
classifier performance 0.001464023
possible sense 0.001438043
small training 0.001411294
unlabeled set 0.001407142
supervised learning 0.001406527
label text 0.001387613
classification problem 0.001375495
new text 0.001374562
poor word 0.00137416
multiple words 0.001373277
training process 0.001359595
candidate word 0.00135875
svm learning 0.00135314
training instance 0.001345491
other system 0.001333924
learning algorithms 0.001314543
available training 0.001314286
class 0.00130962
binary classification 0.001304566
word choice 0.001300791
seed words 0.001295976
large set 0.001268919
bootstrapping algorithm 0.001268224
word usage 0.001265335
classification tasks 0.001263628
sense disambigua 0.001260896
itive sense 0.001257523
training exam 0.001256848
ative training 0.001256521
word selections 0.001256475
expansion words 0.001252724
augmented training 0.001251927
tial training 0.00125145
synthetic training 0.00125008
sentiment classification 0.001246797
tify words 0.001242076
such methods 0.00123484
many text 0.001224275
labeled set 0.001221509
labeled examples 0.001216069
learning techniques 0.001209986
supervised method 0.001208018
classification threshold 0.001207508
bootstrapping methods 0.001205829
machine learning 0.001202424
positive examples 0.00120006
classes system 0.00119579
review classification 0.001183684
classification problems 0.001157928
learning liter 0.001140359
negative examples 0.001130084
new problem 0.00111675
poor performance 0.001114153
first baseline 0.001110547
ing set 0.001103015
small set 0.001099792
other phenomena 0.001081732
binary classifier 0.001081435
new knowledge 0.001074354
other anything 0.001063995
other comments 0.001062864
positive instances 0.001058806
large number 0.001056346
other hand 0.001040084
identification task 0.001033749
sense 0.00102969
unlabeled reports 0.001027407
other people 0.001027099
training 0.0010232
multiple classes 0.001019631
words 0.00101383
