translation model 0.00416741
language model 0.00386041
model training 0.00342915
training data 0.00337426
model text 0.003251605
model feature 0.003091862
different data 0.003009005
test data 0.002940736
other data 0.002938425
data selection 0.002856053
domain data 0.002816422
data set 0.002798253
model probabilities 0.002752359
data tokens 0.002688277
language models 0.00266687
ing data 0.002655903
random data 0.002643318
gigaword data 0.002639244
gram model 0.002609413
experimental data 0.002606345
data source 0.002596343
much data 0.002569202
available data 0.002554005
unigram model 0.002549961
guage model 0.002539647
data sets 0.002526242
machine translation 0.002518726
translation system 0.002486471
data sources 0.002473996
translation task 0.002425168
language test 0.002330236
model 0.0022629
translation application 0.002235546
domain language 0.002205922
overall translation 0.002191631
cal translation 0.002181504
translation objective 0.00216695
ing language 0.002045403
target language 0.002034794
gigaword language 0.002028744
training corpus 0.001990941
word corpus 0.001944111
separate language 0.001906696
translation 0.00190451
latter language 0.001884937
unigram language 0.001884571
language mod 0.001866934
auxiliary language 0.001856138
modified language 0.001855761
training set 0.001756493
corpus sentence 0.001628063
language 0.00159751
candidate text 0.001505783
training sets 0.001484482
parallel text 0.001477137
europarl training 0.001468452
feature function 0.001453184
different selection 0.001449038
sentence probability 0.001412554
text source 0.001377038
corpus size 0.001367847
unigram models 0.001356421
binary text 0.001353525
guage models 0.001346107
test set 0.001322969
text classifiers 0.001318566
feature weight 0.001312878
text segments 0.001311494
text segment 0.001282868
gigaword sentences 0.001273941
other methods 0.001265184
feature scores 0.001256493
gigaword corpus 0.001255925
text seg 0.00125339
sentence log 0.001244575
gigaword sentence 0.001234606
other tokens 0.001210682
different vocabulary 0.001199504
same vocabulary 0.001195531
size selection 0.001191199
selection method 0.001190657
selection methods 0.001182812
same distribution 0.001179026
training 0.00116625
specific corpus 0.001150926
task performance 0.001145196
sentence count 0.001138567
separate feature 0.001138148
set size 0.001133399
europarl corpus 0.001126893
short sentences 0.001102571
domain perplexity 0.001094931
feature func 0.001094831
sentence length 0.001086272
random selection 0.001083351
set perplexity 0.001076762
english side 0.001075853
set tokens 0.00107051
models 0.00106936
same curve 0.001058811
