corpus similarity 0.002948025
domain corpus 0.002889353
test corpus 0.0028855
general corpus 0.0028134149999999997
high corpus 0.0027473479999999997
small corpus 0.002722652
corpus size 0.0026964759999999997
corpus linguistics 0.002683384
homogeneous corpus 0.002678973
corpus homogeneity 0.002628652
sublanguage corpus 0.0025989309999999996
corpus similarity 0.0025989309999999996
lob corpus 0.0025989309999999996
corpus 0.00235274
different language 0.0022113049999999998
different corpora 0.002134055
language model 0.002053229
same language 0.0018661149999999998
word frequency 0.001828661
large corpora 0.001665444
single language 0.0016549079999999999
general corpora 0.001640965
statistical language 0.0016112029999999999
language researchers 0.0015921339999999998
specific language 0.001588199
different text 0.0015697929999999999
language varieties 0.0015602959999999999
small corpora 0.001550202
language variety 0.001548014
common word 0.001539994
language modelling 0.001539967
language modelling 0.001517222
specific corpora 0.0015109490000000001
language modelling 0.001504032
perfect language 0.001504032
corpora are 0.001476107
tween corpora 0.001465891
such words 0.001462035
ksc corpora 0.001461798
word frequencies 0.001447943
tvw corpora 0.001427162
different measures 0.001417968
next word 0.001400122
word senses 0.001384214
different purposes 0.001319919
other work 0.0013179989999999998
human similarity 0.001293371
different characteristics 0.001281809
language 0.00125754
common words 0.0012527999999999999
different tenth 0.00124277
different domains 0.001240307
different ways 0.00123595
test data 0.001219583
different styles 0.001209852
different theories 0.001207213
different perplexities 0.001200999
different sizes 0.001200999
such work 0.001199551
same measure 0.001198106
similarity measure 0.001184816
corpora 0.00118029
frequency measures 0.001154954
same method 0.001152613
unknown words 0.001138098
training data 0.001127673
target text 0.001127041
candidate words 0.001115228
frequent words 0.001114553
various text 0.001054284
latter model 0.0010448620000000001
nlp system 0.0010285
similar text 0.001018383
other end 0.001017547
data all 0.0010126850000000001
frequency lists 9.965249999999998E-4
enough data 9.93932E-4
data points 9.92983E-4
single measure 9.86899E-4
human intuitions 9.864729999999999E-4
other columns 9.7745E-4
human judgements 9.639480000000001E-4
distinct text 9.54814E-4
human response 9.46578E-4
subtree frequency 9.386209999999999E-4
frequency list 9.386209999999999E-4
text types 9.38598E-4
recognition system 9.35892E-4
related work 9.32039E-4
same varieties 9.11331E-4
speech recognition 9.08097E-4
further work 9.079049999999999E-4
large distance 9.03608E-4
new domain 8.974300000000001E-4
possible measures 8.9232E-4
text genres 8.85392E-4
speech signal 8.810129999999999E-4
speech recognition 8.810129999999999E-4
text type 8.63072E-4
test material 8.61452E-4
