language model 0.00333995
language models 0.00296795
backoff model 0.002733142
order model 0.002348873
new model 0.002321617
other models 0.002279025
model schema 0.00216085
gram language 0.002009804
distinct word 0.001985943
order models 0.001976873
interpolation models 0.001952181
word types 0.001897011
smoothing method 0.001858387
model 0.00185294
statistical language 0.001851224
general language 0.001844715
language mod 0.00178091
natural language 0.001777601
mle language 0.001771786
english data 0.001766246
mle models 0.001765716
language technology 0.001759377
guage models 0.001752121
other words 0.001746539
same count 0.001678332
training corpus 0.001653213
previous smoothing 0.001631488
new smoothing 0.001608567
probability distribution 0.001587024
smoothing methods 0.001572182
corpus counts 0.001532157
language 0.00148701
same context 0.001482714
models 0.00148094
data structure 0.001449327
ous smoothing 0.001416394
smoothing inappropri 0.00139952
possible probability 0.00139386
count parameters 0.001392467
probability values 0.001389303
discount parameters 0.001386207
same way 0.001366185
discount value 0.00135795
context counts 0.001343392
backoff methods 0.001312494
single discount 0.001290017
gram count 0.001282306
preceding words 0.001243945
unigram probability 0.001238993
probability estimates 0.001235671
total probability 0.001234798
distinct counts 0.001227311
katz backoff 0.001224122
same dis 0.001223141
lion words 0.001218751
probability mass 0.00120678
machine translation 0.001200825
conditional probability 0.001199088
actual training 0.001198046
new method 0.001187174
probability dis 0.001186603
probability estimate 0.001184872
ing corpus 0.001175332
pure backoff 0.001174981
unknown probability 0.001153866
probability val 0.001143143
single value 0.001141463
smoothing 0.00113989
different way 0.001116846
such methods 0.001116003
interpolation parameters 0.001104196
quantization error 0.001065558
discount parame 0.001060551
europarl corpus 0.001055816
actual corpus 0.001050151
corresponding discount 0.001049902
pus counts 0.001047149
ordinary counts 0.001044552
known method 0.001044166
discount ratio 0.001037687
additional constraints 0.001032676
conditional distribution 0.001021548
subtractive discount 0.001014384
timized discount 0.001014384
overestimation error 0.000994106
counting method 0.00098404
method our 0.000983381
fourth method 0.000981108
text string 0.000975995
large corpora 0.000974549
words 0.000948454
unique value 0.00093055
previous methods 0.00092389
mle probabilities 0.000923721
interpolation methods 0.000903533
polation parameters 0.000902069
formula discounts 0.000901001
training 0.000900554
probability 0.000882282
backoff 0.000880202
