word boundaries 0.003744036
general word 0.003733988
word dictionaries 0.003689004
word bigram 0.003669145
word delimiters 0.003622621
word subset 0.00361677
word collocations 0.003604719
word combination 0.003601838
word combinations 0.003595057
appropriate word 0.003594535
other words 0.002221807
separate words 0.001966542
frequent words 0.001891896
neighboring words 0.001804833
constituent words 0.001799302
words 0.00157818
different algorithms 0.001183382
probabilistic algorithm 0.001125041
same time 0.00109683
data source 0.001053844
linguistic structure 0.001040756
data compression 0.001039509
linguistic level 0.001023058
information results 0.001021804
such patterns 0.001016059
such languages 0.001003537
same content 9.85583E-4
such cases 9.63557E-4
same del 9.50479E-4
noun noun 9.46194E-4
linguistic resources 9.38213E-4
noun phrases 9.31477E-4
text compression 9.27132E-4
linguistic filter 9.2668E-4
text corpora 9.21588E-4
such efforts 9.20868E-4
corpus frequencies 9.17197E-4
regard text 9.12052E-4
unconstrained corpus 9.00593E-4
performance rules 8.95577E-4
other hand 8.93759E-4
mutual information 8.93453E-4
second categories 8.56495E-4
markov models 8.54836E-4
frequency approach 8.5324E-4
gold standard 8.37603E-4
segmentation process 8.28866E-4
angeles new 8.27524E-4
general noun 8.19465E-4
dictionary headwords 8.1023E-4
york new 8.05706E-4
new york 8.05706E-4
estate new 8.04021E-4
new jersey 8.00989E-4
mwu dictionary 7.95899E-4
algorithm 7.92883E-4
semantic analysis 7.89123E-4
latent semantic 7.78676E-4
semantic relationships 7.76271E-4
patterns noun 7.75736E-4
proper noun 7.73636E-4
several algorithms 7.68079E-4
formula frequency 7.66886E-4
many others 7.63683E-4
approaches method 7.59661E-4
separate evaluation 7.53391E-4
separate gold 7.49148E-4
specialized dictionary 7.45393E-4
performance gains 7.37396E-4
expected frequency 7.33437E-4
algorithmic performance 7.32262E-4
section algorithms 7.31759E-4
evaluation gold 7.25815E-4
induction algorithms 7.24239E-4
noun mwus 7.23192E-4
language 7.18311E-4
modest performance 7.05319E-4
produce results 6.98126E-4
corpus 6.93835E-4
method formula 6.87351E-4
standard selection 6.83335E-4
adj noun 6.78683E-4
description length 6.7429E-4
previous approaches 6.64742E-4
street term 6.60266E-4
time wall 6.60243E-4
unambiguous meaning 6.5421E-4
second columns 6.52438E-4
above rules 6.47199E-4
probabilistic approaches 6.45182E-4
significant improvement 6.45125E-4
school high 6.39535E-4
high school 6.39535E-4
significant sources 6.3493E-4
second mode 6.3484E-4
human input 6.34795E-4
evaluation wordnet 6.32232E-4
similar standards 6.28696E-4
induction methods 6.27463E-4
space character 6.19975E-4
