word model 0.0055003299999999995
word models 0.004215119999999999
word clustering 0.003936052
other word 0.0037450509999999997
unknown word 0.003740945
word classes 0.003693783
word rate 0.0035286389999999996
word length 0.003518305
word type 0.003475018
word transition 0.0034356369999999996
oov word 0.003398941
word types 0.00338266
language model 0.0033643230000000002
group word 0.0033475719999999996
frequent word 0.0033335699999999997
word emission 0.00330412
tory word 0.003295433
quent word 0.003294466
cluster model 0.0031931530000000002
class model 0.0029940210000000004
morphological model 0.002978446
unknown words 0.0028906550000000002
trigram model 0.002879931
cal model 0.002837
model groups 0.002754578
several words 0.002730788
model perplexities 0.002715691
ter model 0.002713989
model outper 0.002708798
model uses 0.002705794
oov words 0.0025486510000000003
rare words 0.002537657
regular words 0.002505507
groups words 0.002491678
model 0.00245647
tween words 0.00244707
quency words 0.0024442250000000004
short words 0.002442699
pos models 0.00235483
words 0.00219357
language models 0.002079113
pos clustering 0.002075762
cluster models 0.001907943
pos classes 0.001833493
pos tags 0.0018191330000000001
pos class 0.0017211210000000002
class models 0.001708811
pos distribution 0.0015437550000000002
unsupervised pos 0.0015406830000000002
training corpus 0.001482646
right pos 0.001455244
dominant pos 0.0014327090000000001
morphological language 0.001429829
different probability 0.001426619
statistical language 0.001424192
factored models 0.001422762
character features 0.0014217169999999999
morphological clustering 0.001414168
training set 0.0014067110000000002
test set 0.00132353
possible clustering 0.001321716
other languages 0.001316329
morphological features 0.001313331
context clustering 0.00128949
cal language 0.0012883830000000001
cal clustering 0.001272722
length features 0.0012658
same cluster 0.001258281
first character 0.001237437
insufficient data 0.0012171600000000001
language modeling 0.001211719
complete clustering 0.001202948
tive language 0.001196759
language mod 0.001186579
morphological information 0.001181307
clustering solutions 0.001181277
morphological classes 0.001171899
models 0.00117126
inferior clustering 0.001160443
factored language 0.001159355
clustering process 0.00115531
binary features 0.001138304
transition probability 0.001118656
speech information 0.0011094339999999999
learning algorithm 0.001067782
different frequency 0.0010637910000000001
context information 0.001056629
good perplexity 0.001040638
morphological learning 0.0010354029999999998
cluster count 0.001034473
unknown nouns 0.001011157
perplexity improvement 0.0010068310000000001
unknown histories 0.001004722
probability dis 0.001000302
overall perplexity 0.000990837
emission probability 0.0009871390000000002
other suffixes 0.000985705
special character 0.00098132
ing set 0.0009804100000000001
suffix feature 0.000971554
