word segmentation 0.003289218
word sequence 0.003114234
word boundaries 0.003107538
word sequences 0.003064811
word length 0.00301155
average word 0.003003088
word type 0.002998165
word types 0.002997304
word boundary 0.00299717
real word 0.002963747
merged word 0.002962665
word ngram 0.002961038
reliable word 0.002957572
word segmenter 0.002954824
word segmentations 0.002947346
word tokens 0.002946868
word fragments 0.002931742
adjacent word 0.002928756
preliminary word 0.002926927
age word 0.00292413
batable word 0.00292413
language model 0.002890803
statistical model 0.002563106
simple model 0.0024837
other words 0.002357005
bigram model 0.002350753
ngram model 0.002291318
binary model 0.002277387
segmental model 0.002276819
feeble model 0.002254788
discriminative model 0.002254788
viable model 0.002254788
ods model 0.002254788
function words 0.00224441
average words 0.002122638
component words 0.002121535
segment words 0.002094994
whole words 0.002084388
reliable words 0.002077122
short words 0.002065679
words rec 0.002064018
complex words 0.002063697
familiar words 0.002062467
foreign words 0.002056361
long words 0.00205593
infrequent words 0.002051287
adjacent words 0.002048306
partial words 0.002044442
unfamiliar words 0.002044442
inside words 0.002044442
ring words 0.002044442
model 0.00203292
words 0.00182219
training data 0.001721414
dictionary data 0.001619662
corpus language 0.001614961
first language 0.001441229
such speech 0.001422701
arabic data 0.001379384
japanese data 0.001365754
greek data 0.001356705
test results 0.001329469
input data 0.001325336
large corpus 0.001280403
real data 0.001274997
test set 0.001271431
data averages 0.00123621
character phone 0.001228815
single character 0.001196222
current speech 0.001175658
english dictionary 0.001175501
phonetic features 0.001174608
adult speech 0.001165869
phonetic version 0.001162636
small test 0.001154598
test dataset 0.001148827
training set 0.001147352
standard models 0.001147346
language transcription 0.00114576
spanish corpus 0.00114253
test corpora 0.001140616
separate test 0.001138282
speech recognition 0.001134502
segment speech 0.001130891
corpus size 0.00112792
standard dictionary 0.001124829
arabic corpus 0.001122572
conversational speech 0.001120351
real speech 0.001119194
character sequences 0.001118398
particular language 0.001115782
child language 0.001109552
actual speech 0.001100006
speech understanding 0.001098942
language modelling 0.001098929
ment speech 0.001097495
unigram language 0.001096677
fast speech 0.001095015
tional speech 0.001094634
foreign language 0.001092054
