bigram word 0.001615809
data set 0.001554806
language model 0.001551376
word pairs 0.001535105
bigram model 0.001516859
word distributions 0.001490897
word order 0.001423308
times word 0.001417329
word history 0.001409813
next word 0.001399065
word types 0.001382903
word type 0.001379805
previous model 0.001340866
word orders 0.001340778
unused word 0.001340778
word histogram 0.001340778
word permuta 0.001340778
generative model 0.001315734
model estimate 0.001282552
training set 0.00126067
guage model 0.001250972
language models 0.001234078
training corpus 0.001233447
prior models 0.001228045
other models 0.001216819
other results 0.001151776
sampling distribution 0.001149675
document accuracy 0.00114649
partial document 0.001129882
log probability 0.001125554
state probability 0.00112117
bigram information 0.00111961
document end 0.001109599
document length 0.001102422
same corpus 0.001086955
document privacy 0.00108247
short document 0.001065956
uniform probability 0.001063121
document count 0.001058269
training documents 0.001058103
unigram models 0.001055494
initial probability 0.001049289
computational problem 0.001046684
test set 0.001044238
particular document 0.001040751
document repre 0.0010307
complete document 0.001028479
document recovery 0.00102722
empty document 0.00102663
basic document 0.001026577
model 0.00102418
document zij 0.001020391
bigram language 0.001019875
training bow 0.001015668
prior bigram 0.001013842
document membership 0.001013575
mulate document 0.001013575
bow corpus 0.001012667
gram probability 0.001004507
optimization problem 9.73719E-4
recovery algorithm 9.66831E-4
set perplexity 9.50934E-4
prior lms 9.36827E-4
guage models 9.33674E-4
order information 9.27109E-4
corpus length 9.21474E-4
validation results 9.20304E-4
next set 9.18381E-4
other values 9.17119E-4
problem formulation 9.11259E-4
bigram lms 9.08343E-4
bigram pairs 9.04654E-4
corpus frequency 8.94923E-4
correct set 8.92517E-4
reverse problem 8.74539E-4
same bow 8.69176E-4
ing words 8.68719E-4
sampling approximation 8.67547E-4
timization problem 8.6264E-4
sufficient information 8.60991E-4
set perplexities 8.58883E-4
objective function 8.53883E-4
bigram log 8.51971E-4
switchboard corpus 8.48636E-4
count vector 8.42407E-4
test documents 8.41671E-4
bow documents 8.37323E-4
target distribution 8.35172E-4
importance sampling 8.34152E-4
seventh corpus 8.32348E-4
sumtime corpus 8.32348E-4
multiple bow 8.1601E-4
average sentence 8.14195E-4
sampling approxi 8.12097E-4
history words 8.0066E-4
only words 7.99753E-4
current bigram 7.98271E-4
marginal distribution 7.96962E-4
document 7.96171E-4
natural language 7.9286E-4
