web data 0.002175154
corpus frequency 0.0018537949999999999
web search 0.001792681
data set 0.001767129
corpus counts 0.001714029
web counts 0.00170675
small data 0.001613007
noun bigrams 0.001589577
data sets 0.001525508
word sense 0.0015154020000000001
data sparseness 0.001485183
unseen bigrams 0.0014816999999999999
acute data 0.001457741
corpus frequencies 0.0014369979999999999
web frequencies 0.001429719
corpus evidence 0.0013727129999999998
search term 0.0013617149999999999
high corpus 0.001351132
web queries 0.001349095
google counts 0.0013277660000000002
national corpus 0.0013189249999999999
bigram counts 0.001316519
bnc frequency 0.001312071
web searches 0.00126849
conventional corpus 0.0012652829999999999
sampling bigrams 0.001263707
actual corpus 0.001262329
balanced corpus 0.001257858
first noun 0.001240638
frequency estimates 0.001195223
useful frequency 0.0011812609999999999
search terms 0.001180945
bnc counts 0.001172305
argument bigrams 0.001161435
human plausibility 0.0011523879999999998
other stimuli 0.001151205
training set 0.001149659
search engine 0.00114881
frequency bands 0.001147433
bigrams altavista 0.0011431549999999999
adjacent words 0.00113962
object bigrams 0.001109645
machine translation 0.0011065760000000002
correct translation 0.001098293
search engines 0.0010949
matching words 0.001084618
altavista counts 0.0010801880000000002
search heuristics 0.001076282
second noun 0.00106262
zero bigrams 0.001057214
bigram frequencies 0.001039488
work group 0.001022682
zero counts 9.94247E-4
missing counts 9.8958E-4
tavista counts 9.88205E-4
corpus 9.79783E-4
unseen predicate 9.66929E-4
same order 9.40629E-4
linguistic stimuli 9.389299999999999E-4
proper nouns 9.38775E-4
nlp tasks 9.328E-4
google coefficient 9.32602E-4
previous work 9.31915E-4
language processing 9.285159999999999E-4
plausibility judgments 9.27393E-4
google query 9.24226E-4
google coefficients 9.16224E-4
syntactic patterns 9.12907E-4
mean plausibility 9.092060000000001E-4
training sets 9.08038E-4
current nlp 9.05386E-4
noisy text 9.041820000000001E-4
different types 8.953520000000001E-4
bnc frequencies 8.952739999999999E-4
different ways 8.92374E-4
particular task 8.90509E-4
human judgments 8.90022E-4
plausibility judg 8.86948E-4
large amount 8.84876E-4
man plausibility 8.76843E-4
linguistic judgments 8.7551E-4
frequency 8.74012E-4
small number 8.69161E-4
nlp algorithms 8.58925E-4
computational linguistics 8.58822E-4
google ranges 8.5088E-4
ticular task 8.486100000000001E-4
future work 8.42037E-4
intuitive plausibility 8.39379E-4
empirical methods 8.38619E-4
plausibility ratings 8.36948E-4
lected plausibility 8.298400000000001E-4
linguistic phenomenon 8.2649E-4
same criteria 8.26051E-4
smoothing algorithms 8.24504E-4
set increases 8.24015E-4
minimal sentence 8.23418E-4
listener table 8.22816E-4
search 8.20177E-4
high correlation 8.198529999999999E-4
