language model 0.002989003
probability model 0.002930103
bayesian model 0.002691907
morphology model 0.002625655
word corpus 0.002617829
generative model 0.002602162
model our 0.002583769
trigram model 0.002582421
probabilistic model 0.002577271
mixture model 0.002534765
graphical model 0.002531897
present model 0.002525258
markov model 0.002522456
model clusters 0.00249097
model variant 0.002485812
channel model 0.002480602
guage model 0.002473946
exchangeable model 0.002457574
model irreg 0.0024509749999999998
different word 0.002399836
model 0.0022299
many word 0.002160099
word form 0.002125455
word tokens 0.002118031
next word 0.002030783
word types 0.002003907
inflected word 0.0019597729999999997
likely word 0.001959171
pos tag 0.001573255
tag sequence 0.001489635
dirichlet distribution 0.001463508
prior distribution 0.001447586
lexicon corpus 0.001424906
training corpus 0.001410657
segments words 0.001391078
posterior distribution 0.0013825460000000001
text corpus 0.001378073
tag bigram 0.0013691889999999998
uniform distribution 0.001354152
paradigms corpus 0.001349061
clusters words 0.00132246
novel words 0.00131052
marginal distribution 0.0013011350000000001
base distribution 0.001296894
related words 0.001291067
corpus tokens 0.0012875999999999999
unobserved words 0.001286242
group words 0.0012833389999999999
collapsed distribution 0.001280689
large corpus 0.001255957
corpus our 0.001247568
ity distribution 0.0012472910000000002
corpus con 0.001244144
separate feature 0.001239923
token data 0.0012306040000000002
full corpus 0.001216432
morphological lexicon 0.0012093260000000002
other forms 0.001189437
corpus size 0.0011885939999999998
morphological forms 0.0011880390000000001
other variables 0.0011470550000000001
morphological paradigms 0.001133481
unannotated corpus 0.00111553
wacky corpus 0.00111553
other lexeme 0.001106563
lexical features 0.001089295
morphological form 0.0010794440000000001
other tokens 0.001073418
words 0.00106139
different parameters 0.0010605599999999999
type data 0.0010580470000000001
different frequency 0.001057805
modeling morphological 0.001032993
human language 0.001029962
distribution 0.00102219
morphological paradigm 0.001019028
natural language 0.001014588
seed data 0.001012307
development data 0.001011413
trigram features 0.00100371
language technology 0.001000498
celex data 9.94296E-4
bayesian approach 9.88361E-4
scientific data 9.858970000000001E-4
data interpretation 9.81256E-4
morphological knowledge 9.775E-4
other customers 9.663650000000001E-4
other statistics 9.617580000000001E-4
same set 9.58794E-4
specific features 9.48943E-4
probability mass 9.425410000000001E-4
other hand 9.42097E-4
positive probability 9.40498E-4
other relationships 9.364660000000001E-4
lexeme set 9.32703E-4
low probability 9.290510000000001E-4
different train 9.27758E-4
other analyses 9.23706E-4
morphological transformations 9.22705E-4
other inflections 9.221090000000001E-4
