word segmentation 0.004710231
different word 0.00469399
chinese word 0.004678858
word lexicon 0.004311835
lexicon word 0.004311835
word class 0.00427727
unknown word 0.004246306
word sequence 0.004209332
trigram word 0.004179956
possible word 0.004153163
word types 0.004128729
word type 0.004120581
overall word 0.004101807
word classes 0.004084216
word forms 0.004079364
word segmen 0.004057858
word seg 0.004057675
greedy word 0.004051612
word segmentations 0.004050757
plausible word 0.004050728
word boundaries 0.004049199
word segmenta 0.00404049
word segmentor 0.004038402
probable word 0.00403224
word definition 0.004031708
word breaking 0.004026925
word identi 0.004026008
character words 0.00307549
chinese words 0.002691358
model probability 0.002673362
language model 0.002599554
other words 0.002451166
context model 0.002338953
class model 0.00232794
lexicon words 0.002324335
new words 0.002321395
unknown words 0.002258806
source model 0.002242839
single words 0.002242436
model weight 0.002201597
model probabilities 0.002186616
bigram model 0.002183396
chinese character 0.002150628
model estimation 0.002142498
generative model 0.00213768
stochastic model 0.002125056
channel model 0.002123453
seed model 0.002121701
model weights 0.002119475
model prob 0.002081532
cache model 0.00207672
separate words 0.002055262
character string 0.001971532
model 0.00184628
words 0.00180811
name character 0.001733975
chinese language 0.001636522
character list 0.001613069
character bigram 0.001604496
character strings 0.001587948
chinese sentence 0.00147358
segmentation system 0.001471105
chinese characters 0.001471084
training data 0.001451974
probability distribution 0.001421627
class segmentation 0.001396281
different context 0.001391053
training corpus 0.00137651
segmentation problem 0.001324694
different function 0.001309769
chinese text 0.001303536
segmentation factoid 0.001290985
different systems 0.001285387
character 0.00126738
correct segmentation 0.001250814
different types 0.001231499
segmentation performance 0.00122367
such names 0.001216783
large corpus 0.001204
language input 0.001189725
test set 0.00118369
chinese person 0.001182981
data problem 0.001178271
different annotation 0.001177985
such problems 0.001171313
statistical models 0.001170438
different ways 0.001159199
segmentation ambiguities 0.001155653
chinese nouns 0.001143189
input string 0.001140603
different lexicons 0.001138625
same system 0.001138263
trigram language 0.00113762
statistical features 0.001134879
chinese due 0.001134249
chinese lan 0.001131898
different mechanism 0.001130087
class models 0.001128017
such approaches 0.001120821
chinese mor 0.001120058
