new word 0.00463094
other word 0.00453645
text word 0.004498237
chinese word 0.004495643
english word 0.00442184
use word 0.004300446
word recall 0.00427512
word berry 0.00425482
word sequences 0.004249785
word mean 0.004243858
true word 0.004243334
word counts 0.004240954
word frequencies 0.00422602
word watermelon 0.004218014
word red 0.004214156
component word 0.004211872
word lengths 0.004204671
root word 0.004200635
word wan 0.004198422
kank word 0.004198422
word moon 0.004198422
new words 0.00246942
different words 0.002465326
many words 0.002437557
other words 0.00237493
ing words 0.002270545
few words 0.00213074
extra words 0.00209722
true words 0.002081814
component words 0.002050352
words duck 0.002040025
unfound words 0.002037208
words blackbird 0.002037208
words 0.00183087
learning algorithm 0.001235632
text segmentation 0.001159664
character sequence 0.001122456
parsing algorithm 0.001068559
search algorithm 0.001046909
ing language 0.001035793
same structure 0.001022366
induction algorithm 0.001020437
lexical structure 9.95358E-4
input text 9.7467E-4
ing context 9.74443E-4
brown corpus 9.61475E-4
segmentation program 9.60102E-4
underlying model 9.52342E-4
sound model 9.51387E-4
maximization algorithm 9.45778E-4
same results 9.45031E-4
same time 9.44757E-4
such patterns 9.39968E-4
many times 9.24989E-4
segmentation performance 9.20936E-4
phoneme model 9.19676E-4
same way 9.16471E-4
segmentation tree 9.0619E-4
computer segmentation 9.05182E-4
input sentence 9.04442E-4
mutual information 8.97584E-4
such classes 8.97331E-4
many patterns 8.97244E-4
description length 8.88458E-4
minimum length 8.79869E-4
syntactic structure 8.76855E-4
code length 8.72794E-4
unsupervised language 8.7007E-4
semantic properties 8.58798E-4
programming language 8.58119E-4
lexical representation 8.5646E-4
length encoding 8.52171E-4
other units 8.43641E-4
other experiments 8.34282E-4
many mistakes 8.33003E-4
length components 8.29662E-4
length standpoint 8.28736E-4
second test 8.19753E-4
other processes 7.93568E-4
linguistic units 7.93202E-4
learning algorithms 7.91506E-4
text utterances 7.79486E-4
text compression 7.77485E-4
character 7.74933E-4
unsupervised learning 7.73584E-4
learning framework 7.72978E-4
first experiment 7.71982E-4
information retrieval 7.70458E-4
speech compression 7.68929E-4
meaning symbols 7.63559E-4
frequency perturbation 7.63519E-4
text sequences 7.63242E-4
same thing 7.6113E-4
considerable information 7.61078E-4
idiosyncratic information 7.61078E-4
other app 7.53706E-4
learning mean 7.511E-4
other perturbations 7.50928E-4
true meaning 7.50702E-4
original input 7.48226E-4
