wikipedia entity 0.00360897
wikipedia features 0.00314878
wikipedia data 0.0031469000000000002
wikipedia category 0.00311099
wikipedia feature 0.00310939
wikipedia entities 0.0029651599999999997
wikipedia article 0.002876625
wikipedia articles 0.002813592
entity category 0.0026171
wikipedia fea 0.002430693
wikipedia tag 0.00238239
wikipedia entry 0.0023595779999999998
wikipedia matches 0.0023352159999999998
wikipedia match 0.002335205
wikipedia syntax 0.002313296
entity name 0.002067804
wikipedia 0.00205143
other features 0.002039231
category label 0.0019494949999999999
entity names 0.0019383030000000002
ambiguous entity 0.0018640150000000001
entity recognition 0.001845247
entity cate 0.00184454
entity recog 0.0018340890000000001
entity regions 0.001812772
such category 0.001809555
training data 0.0017510300000000002
surface features 0.001719901
surface feature 0.001680511
form features 0.00164108
feature set 0.001633032
baseline features 0.0015812629999999999
baseline model 0.001572903
ner model 0.001561789
entity 0.00155754
category labels 0.001544801
baseline feature 0.001541873
word form 0.00149599
candidate word 0.001490778
following features 0.00145367
crf model 0.001439103
category section 0.001432593
article pages 0.001429313
gazetteer model 0.0014276319999999999
node features 0.0014252969999999998
gazetteer category 0.0013982019999999999
gazetteer feature 0.0013966019999999998
edge features 0.001394049
node feature 0.001385907
text search 0.001378715
word sequence 0.00137122
possible word 0.001364733
many entities 0.0013609199999999998
combination features 0.001360226
feature comparison 0.001359767
language process 0.0013571070000000002
last word 0.001355185
edge feature 0.0013546589999999998
good category 0.001354617
markup language 0.0013527460000000002
data snapshots 0.001351881
appropriate category 0.001350568
the wikipedia features 0.001350521
final model 0.001342202
distinct category 0.001333917
such approach 0.00131849
category frequency 0.001318072
finding category 0.001311847
word sequences 0.001242235
training set 0.0012306320000000002
ambiguous entities 0.001220205
other studies 0.0012113739999999999
other hand 0.001208799
word comb 0.001205739
current label 0.001203648
difference entities 0.001189204
distinct entities 0.001188087
possible articles 0.001174635
gory label 0.00116645
label finding 0.001143878
egory label 0.001143878
natural language 0.001143206
disambiguation pages 0.00114208
article title 0.001135848
language version 0.001128796
english dataset 0.001120907
training corpus 0.0011087129999999999
features 0.00109735
model 0.00108899
stantive article 0.001078398
simple sentence 0.0010723389999999998
sentence segmentation 0.001062791
category 0.00105956
feature 0.00105796
first sentence 0.001043965
usual articles 0.00103847
such head 0.0010382870000000002
english version 0.001026447
newspaper articles 0.001017233
english newspaper 0.001012505
