phrase score
language model 0.00260881
output word 0.002553636
other words 0.002505408
input word 0.0024443
training data 0.00243656
input words 0.00237269
words input 0.00237269
special word 0.002291558
word embeddings 0.002237374
word representations 0.002229263
word types 0.002202787
word alignments 0.002178144
group words 0.002089932
test data 0.002055319
new model 0.002022916
model parameters 0.002007205
language models 0.001845221
words 0.00184165
machine translation 0.001767465
translation system 0.001764965
neural language 0.001678412
development data 0.001666315
data sparsity 0.001664089
target language 0.001654201
different language 0.001636172
model 0.00154942
translation quality 0.001539152
training time 0.00153075
spanish translation 0.001512628
output layer 0.0014871020000000001
hidden layer 0.001414524
language pairs 0.00137939
input layer 0.001377766
nist bleu 0.001355457
training example 0.001332518
training examples 0.00132591
probabilistic language 0.00132134
nce training 0.001312684
mle training 0.00128578
test baseline 0.0012823
probability distribution 0.00127017
translation 0.00126594
neural network 0.0012561
such models 0.001233341
feature space 0.001216269
second layer 0.00119589
other tasks 0.001173435
large corpus 0.001163165
softmax layer 0.0011433300000000001
discriminative models 0.001115737
previous work 0.001072398
work parameters 0.001067421
language 0.00105939
training 0.00103882
other scores 0.001033804
additional set 0.001025807
large vocabulary 0.000984143
network architecture 0.000983926
neural lms 0.000972696
linear units 0.000949252
other nplms 0.000947639
context matrices 0.000946831
standard maximum 0.000925889
neural net 0.00092334
test dev 0.000916341
dev test 0.000916341
other strategies 0.000915093
vocabulary size 0.000914531
commentary test 0.000912232
related work 0.000910212
output biases 0.000899927
learning rate 0.000892546
unnormalized output 0.000892011
output activations 0.00088925
large vocabularies 0.000861738
various values 0.000861611
average score 0.000856824
input embeddings 0.000855154
hidden layers 0.000852774
layer 0.000846726
gram probabilities 0.000827154
fbis corpus 0.000823977
sparse matrix 0.000812771
new variant 0.000806854
noise distribution 0.000798702
models 0.000785831
same datasets 0.000784372
alternative estimation 0.000781698
matrix multiplications 0.000777877
probability 0.000774799
unigram distribution 0.000769917
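The scoring method behind the list above is not stated; the values resemble normalized phrase weights over a tokenized corpus. As a minimal sketch only, assuming the scores are relative bigram frequencies, a ranked list of this shape could be produced as follows (`ranked_bigrams` is a hypothetical helper, not a function from the original work):

```python
from collections import Counter

def ranked_bigrams(tokens):
    """Rank adjacent word pairs by relative frequency.

    This is an illustrative stand-in for whatever scoring scheme
    actually produced the table above, which is not specified.
    """
    pairs = Counter(zip(tokens, tokens[1:]))  # count adjacent word pairs
    total = sum(pairs.values())               # normalizer over all bigrams
    return sorted(
        ((" ".join(pair), count / total) for pair, count in pairs.items()),
        key=lambda item: item[1],
        reverse=True,                         # highest score first
    )

tokens = "language model scores and language model training".split()
top_phrase, top_score = ranked_bigrams(tokens)[0]
print(f"{top_phrase} {top_score:.6f}")  # language model 0.333333
```

Real keyphrase scores would typically also weight by document frequency or association strength (e.g. TF-IDF or pointwise mutual information) rather than raw counts alone.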
