environment model 0.002563431
model performance 0.002428359
transition model 0.002405323
previous model 0.002395212
relevant model 0.002324214
ment model 0.002256658
model con 0.002246701
ronment model 0.002235913
true model 0.002227567
model outper 0.002217503
vironment model 0.002217503
state information 0.002149662
environment state 0.002111591
model 0.00198482
current state 0.001982833
new state 0.001978143
state space 0.001958569
state transition 0.001953483
mapping state 0.001887789
decision state 0.001887333
state transitions 0.001835708
state parts 0.001809508
state samples 0.001808528
ment state 0.001804818
ronment state 0.001784073
state changes 0.001783723
feature function 0.001777431
complete state 0.001771323
state observed 0.001770398
known state 0.001767796
state transi 0.001767341
state tran 0.001765546
state 0.00153298
language learning 0.001497747
language document 0.001485005
document action 0.00146655
action document 0.00146655
feature representation 0.001358975
local features 0.001354846
feature space 0.001341486
card game 0.001338018
learning algorithm 0.001331615
reward function 0.001328753
policy function 0.001327118
key feature 0.001319199
document words 0.001298958
transition function 0.001282037
other document 0.001274685
action sequence 0.001228279
arate features 0.001227822
action performance 0.001208496
tion function 0.001193597
environment states 0.001188188
feature functions 0.001184889
start word 0.001182008
policy learning 0.001179919
such states 0.001177151
document set 0.001173682
language analysis 0.001162142
same time 0.001160351
action accuracy 0.001146048
correct action 0.00114197
learning rate 0.001138435
language documents 0.001136253
high action 0.001134269
learning process 0.001133697
natural language 0.001124179
mapping action 0.001119766
instruction text 0.001110321
word span 0.001098193
word spans 0.001093578
struction word 0.001089419
linguistic information 0.001087829
initial training 0.001086418
overall action 0.001078519
environment command 0.001070185
text instructions 0.001070146
dataset action 0.001067158
text analysis 0.00106676
training documents 0.001065867
language interpretation 0.001065491
other work 0.001063617
new algorithm 0.001062443
gradient learning 0.00105933
such command 0.001059148
action selection 0.001058368
first approach 0.001056077
learning parameters 0.001050525
environment reward 0.00104583
mapping text 0.001042839
transition information 0.001037185
learning our 0.001034511
environment knowledge 0.001032784
reinforcement learning 0.001026375
incorrect action 0.001024743
annotated data 0.001023763
learning techniques 0.001021177
above action 0.001020795
automatic document 0.001018454
ural language 0.001016512
