language model 0.0051342
model parameters 0.004324447
initial model 0.004191008
probabilistic model 0.004175733
typesetting model 0.004164201
model order 0.004160195
generative model 0.004142615
inking model 0.004102363
noise model 0.004060121
model noise 0.004060121
logistic model 0.004040634
guage model 0.004040201
fine model 0.004038388
model param 0.004022158
model most 0.00402157
model 0.00379334
word error 0.002180339
language models 0.002018359
word errors 0.001954313
first language 0.001934122
gold word 0.00181531
different language 0.00179818
word regions 0.001794817
duce word 0.001757736
domain language 0.00162746
unigram language 0.001596511
language mod 0.00158604
nyt language 0.001572513
strong language 0.001571256
test set 0.001415686
character error 0.001379416
same character 0.001378021
language 0.00134086
large set 0.001300164
same probability 0.001271073
translation models 0.001219418
first test 0.001207497
character segmentation 0.001186222
same time 0.00117475
historical data 0.001121212
width distribution 0.00110986
character box 0.001080838
development set 0.001065906
corresponding character 0.00106499
set rep 0.001039962
global set 0.001029399
character length 0.001027901
generative models 0.001026774
first baseline 0.001024969
relative error 0.001024627
special character 0.001014845
same way 0.001014638
character sequences 0.001009406
bailey corpus 0.001009094
error features 9.98283E-4
null character 9.95456E-4
character type 9.90433E-4
stop character 9.89443E-4
ith character 9.86733E-4
each character 9.80832E-4
random variables 9.80524E-4
such parameters 9.74934E-4
error rate 9.72746E-4
character boxes 9.69283E-4
error con 9.65026E-4
bailey test 9.64393E-4
character tokens 9.51946E-4
parameter values 9.49672E-4
error analysis 9.36697E-4
whole words 9.1902E-4
error reduction 9.16746E-4
machine translation 9.16283E-4
first step 9.15377E-4
initial parameter 9.13479E-4
same amount 9.13185E-4
parameter matrix 9.10961E-4
text baseline 9.0798E-4
noise distribution 9.05966E-4
test documents 9.05737E-4
glyph width 9.01425E-4
possible width 8.97957E-4
ing typesetting 8.9743E-4
first line 8.9485E-4
error rates 8.94033E-4
first dataset 8.93297E-4
font figure 8.92126E-4
nyt corpus 8.90589E-4
font parameters 8.86364E-4
uniform distribution 8.72832E-4
parameters pixel 8.67195E-4
background distribution 8.66899E-4
first document 8.66529E-4
first sampling 8.66264E-4
ocr system 8.64688E-4
finereader system 8.57413E-4
corresponding parameter 8.55834E-4
several documents 8.55797E-4
test sets 8.53533E-4
test doc 8.50055E-4
multinomial parameters 8.48977E-4
