term                    score
word features           0.00524149
word embeddings         0.004834771
word feature            0.004714895
different word          0.004614062
word representations    0.00461317
word representation     0.004503001
chinese word            0.004479481
word clusters           0.004441299
hmm word                0.00442386
word type               0.004411163
unsupervised word       0.004399609
many word               0.004399183
word types              0.004358501
distributional word     0.004357275
last word               0.004349781
certain word            0.004349113
particular word         0.004328689
word segmentation       0.004288387
rare word               0.004277965
word class              0.00427023
word contexts           0.00426824
tional word             0.004263045
word rep                0.004256055
word bigram             0.00425096
current word            0.004250055
word repre              0.004247444
word tokens             0.004247202
word representa         0.004243593
extra word              0.004233507
millions word           0.004231196
word usage              0.004228963
conclusions word        0.004228963
training data           0.002156469
vocabulary words        0.002017697
clusters words          0.002009309
first words             0.001999897
unlabeled words         0.00191701
related words           0.001880667
frequent words          0.001857762
neural model            0.001848861
rare words              0.001845975
lexical features        0.001845856
language model          0.001827254
tween words             0.001811231
millions words          0.001799206
representation features 0.001763611
baseline features       0.001760609
training corpus         0.001738937
model parameters        0.001700869
linear model            0.001690836
tag features            0.00168267
brown features          0.00166978
embedding features      0.001664055
model evaluation        0.001660616
supervised model        0.001657244
training set            0.001656102
sentence training       0.001626292
model com               0.001612137
new features            0.00160381
unsupervised training   0.001570219
many training           0.001569793
corresponding model     0.001567683
predictive model        0.001567124
large training          0.001565619
words                   0.00155845
final model             0.001546026
model updates           0.001545706
model output            0.001544344
training loss           0.001541462
features templates      0.001538612
perceptron model        0.001527719
binary features         0.001522005
training improvements   0.001516378
training sentences      0.001514554
state features          0.001512298
input features          0.00150974
extra features          0.001494117
compound features       0.001493101
different models        0.001492476
unigram features        0.001490803
pound features          0.001490803
training strategy       0.001439682
labeled training        0.001428571
training criterion      0.001420883
training updates        0.001417956
training examples       0.001417512
training corpora        0.001411633
training partition      0.001409024
curriculum training     0.001407805
language models         0.001407308
sufficient training     0.001403013
training regime         0.001401419
semantic knowledge      0.001370973
unlabeled data          0.001353979
semantic dependency     0.001299995
feature vector          0.001294759
model                   0.0012888
data sparsity           0.001275233
validation data         0.001265331
labeled data            0.00126294
