word distribution 0.00361842
language model 0.003512899
word segmentation 0.003331663
unigram model 0.003229421
bigram model 0.003148656
other word 0.00312319
new model 0.003118559
unigram word 0.003107571
previous model 0.003106694
bayesian word 0.003091857
probabilistic model 0.003058077
word boundary 0.003043746
word dependencies 0.00303572
bigram word 0.003026806
possible model 0.003015504
same word 0.003011583
generative model 0.002999006
word boundaries 0.002985137
own model 0.00297704
hdp model 0.002950784
ngs model 0.002940181
model our 0.002927723
mbdp model 0.002912847
word tokens 0.002911545
cache model 0.002879136
word type 0.002878853
model incorpo 0.002878691
igram model 0.002878691
ible model 0.002878691
special word 0.002841058
unsupervised word 0.002829785
ditional word 0.002814062
explicit word 0.002810195
ith word 0.002799246
unique word 0.002796011
successful word 0.002794185
particular word 0.002787664
novel word 0.002787215
accurate word 0.002779258
word types 0.002773446
word segmenta 0.002768731
word frequencies 0.002762974
model 0.00264434
different distribution 0.001833322
probability distribution 0.001830824
language models 0.001798003
other words 0.00177333
single words 0.001570944
dirichlet distribution 0.001519498
beta distribution 0.001452266
segment words 0.001438294
novel words 0.001437355
base distribution 0.001432582
sible words 0.001431975
adjacent words 0.001409663
fact words 0.001409288
uniform distribution 0.001392103
previous models 0.001391798
such models 0.001384427
posterior distribution 0.001373353
bigram language 0.001372875
same corpus 0.001372811
multinomial distribution 0.001356754
empirical distribution 0.001350937
probabilistic models 0.001343181
terior distribution 0.001334165
phoneme distribution 0.001332607
stationary distribution 0.001331127
human language 0.001268195
trigram language 0.001236384
prior probability 0.001228979
gram models 0.001227385
abilistic models 0.00121959
input corpus 0.001193512
words 0.00117263
gram language 0.0011665
extensible models 0.001164328
segmented corpus 0.001164154
true corpus 0.001160802
language learners 0.001154863
artificial corpus 0.001149276
mented corpus 0.001144271
local context 0.001134727
unigram process 0.001132967
segmentation accuracy 0.001128608
standard inference 0.00112732
permuted corpus 0.001121585
tificial corpus 0.001119328
different values 0.001109519
ural language 0.001106955
standard unigram 0.001106268
ngs segmentation 0.001105014
distribution 0.00109593
different modeling 0.001090073
segmentation systems 0.001079849
unigram probabilities 0.001068087
ability segmentation 0.00105444
segmentation hypothesis 0.00104877
segmentation accuracies 0.001047314
inference procedure 0.00103701
