system sequence 0.001428817
probabilistic model 0.001319089
sequence extraction 0.001307427
correct sequence 0.001298675
candidate sequence 0.001280492
sequence frame 0.001243813
sequence extractions 0.001241889
sequence frames 0.001204259
likely sequence 0.001203536
sequence items 0.001202568
valid sequence 0.001197662
sequence name 0.001194934
density features 0.001181308
sequence regularities 0.00117399
sequence names 0.001168339
generating sequence 0.00116177
training data 0.001159517
sequence scope 0.001158113
functionality features 0.001109085
html features 0.001108329
tactic features 0.001087064
test set 0.001026233
test data 0.00101519
information extraction 0.001011897
first step 9.95284E-4
information entropy 9.53051E-4
many values 9.51172E-4
sequence 9.42125E-4
other types 9.41569E-4
model 9.07067E-4
other examples 8.87753E-4
input corpus 8.87728E-4
final corpus 8.83119E-4
previous work 8.73951E-4
classical information 8.71811E-4
features 8.55899E-4
web text 7.91785E-4
many countries 7.87004E-4
general sequences 7.86798E-4
coherent set 7.8635E-4
extraction task 7.81904E-4
set expan 7.8091E-4
page corpus 7.75766E-4
seq system 7.72178E-4
experimental results 7.52251E-4
significant number 7.52112E-4
structured text 7.46654E-4
pos tags 7.41368E-4
feature 7.36466E-4
high values 7.25964E-4
candidate values 7.2151E-4
high scores 7.20381E-4
maximum value 7.1509E-4
candidate sequences 7.13998E-4
poor performance 7.07781E-4
candidate extraction 7.03669E-4
average value 7.02816E-4
patterns seq 7.02251E-4
web search 6.91317E-4
related work 6.83942E-4
confidence scores 6.80596E-4
distinct values 6.73147E-4
unstructured text 6.71385E-4
total number 6.71183E-4
rect value 6.68534E-4
extensive work 6.62935E-4
scores localconf 6.62862E-4
random sample 6.61635E-4
pattern example 6.58393E-4
correct extractions 6.56314E-4
possible extractions 6.55486E-4
significant amount 6.54789E-4
value ofh 6.50957E-4
multiple sentences 6.50062E-4
effective approach 6.47186E-4
information 6.46595E-4
ordinal number 6.46106E-4
common cause 6.40756E-4
confidence measure 6.38771E-4
candidate extractions 6.38131E-4
distinct sentences 6.36766E-4
next section 6.31966E-4
high coverage 6.30029E-4
extraction errors 6.27949E-4
good measure 6.27885E-4
common causes 6.27866E-4
significant margin 6.26483E-4
additional sentences 6.24136E-4
following section 6.21338E-4
measure localconf 6.21037E-4
localconf figure 6.09832E-4
baseline systems 6.08846E-4
training 6.0618E-4
consistent way 6.05046E-4
mous values 6.01755E-4
web page 6.0149E-4
candidate items 5.9881E-4
search engine 5.98249E-4
coherent sequences 5.97601E-4
tinct sequences 5.94869E-4
