word model 0.00568895
word segmentation 0.00408976
different word 0.003922608
word distribution 0.00382938
word models 0.00382101
language model 0.00370948
japanese word 0.003420832
single word 0.003409909
word length 0.003397099
unigram word 0.003367804
hidden word 0.003356417
last word 0.003353437
word type 0.003342909
word boundary 0.003335632
bayesian model 0.003332901
word types 0.003319947
unsupervised word 0.003313558
additional word 0.003303744
word hpylm 0.003288677
word seg 0.003284581
word dependencies 0.00328072
times word 0.003267637
explicit word 0.003259813
final word 0.003258477
accurate word 0.003233463
word boundaries 0.003229596
probable word 0.003223617
word unigrams 0.003223457
word spellings 0.003221685
model time 0.003040405
guage model 0.002997347
markov model 0.00299027
generative model 0.002989402
spelling model 0.002982366
ter model 0.002979644
model msr 0.00297244
gram model 0.002953085
accurate model 0.002939293
elaborate model 0.002927713
model struc 0.002927713
model 0.00269739
words context 0.002489219
unknown words 0.00235076
possible words 0.00232902
frequency words 0.00225151
long words 0.002221414
ble words 0.00218073
ous words 0.002177983
neighboring words 0.002165116
preceding words 0.002163769
virtual words 0.002163274
segmentation data 0.001941773
words 0.00193215
character models 0.001711718
new segmentation 0.001667876
bayesian language 0.001647601
test data 0.00158878
segmentation result 0.001440107
unsupervised segmentation 0.001420198
hierarchical language 0.001395419
training data 0.00138677
supervised segmentation 0.001359222
segmentation accuracies 0.001328462
same probability 0.001308026
character string 0.001301331
character strings 0.001261025
novel language 0.001250361
yor language 0.001246405
english character 0.001244236
arbitrary language 0.001243702
trigram distribution 0.001190588
prior distribution 0.001185143
trigram models 0.001182218
character hpylm 0.001179385
bigram distribution 0.001171926
unsupervised data 0.001165571
posterior distribution 0.001154786
poisson distribution 0.00114777
dimensional distribution 0.001134888
accurate character 0.001124171
raw character 0.001121311
childes data 0.001113096
uniform distribution 0.001104531
hierarchical chinese 0.001103506
segmentation 0.0010982
chinese restaurant 0.001082522
series data 0.001078373
true distribution 0.001073203
gamma distribution 0.001073197
terior distribution 0.001070161
language 0.00101209
probability distri 9.89732E-4
same algorithm 9.7807E-4
enclosed test 9.76308E-4
marginal probability 9.67396E-4
discrete probability 9.55996E-4
traditional chinese 9.50701E-4
same time 9.29256E-4
standard datasets 9.1689E-4
sentence boundary 9.14885E-4
