domain information 0.004497249
domain dictionary 0.004317703
domain tags 0.004284011
domain table 0.004223978
previous domain 0.00420595
domain estimation 0.004156985
domain resources 0.004106653
available domain 0.004093361
particular domain 0.004091003
use domain 0.004090738
components domain 0.00407992
dominant domain 0.004068694
domain resource 0.004060504
domain esti 0.004036284
namic domain 0.004034039
domain 0.00380215
unknown words 0.002543687
fundamental words 0.0021934
content words 0.002105307
extract words 0.002092669
representative words 0.00207927
damental words 0.002078656
words cult 0.002078656
tal words 0.002078656
unknown word 0.002008407
words 0.00184654
compound word 0.00154542
training data 0.001261061
evaluation method 0.001233711
main tag 0.001122824
simple method 0.001109703
previous text 0.00109989
unlabeled data 0.00107151
condition data 0.001068749
categorization method 0.00106081
seed information 0.00104228
text categorization 9.85846E-4
correct domains 9.70907E-4
particular domains 9.69086E-4
appropriate domains 9.57036E-4
frequency accuracy 9.51865E-4
text col 9.42976E-4
text cate 9.32795E-4
own domains 9.21742E-4
assign domains 9.14358E-4
other proposals 8.78401E-4
module frequency 8.77297E-4
good accuracy 8.75726E-4
many nlp 8.63733E-4
estimation results 8.53076E-4
distinctive features 8.45332E-4
learning techniques 8.33293E-4
machine learning 8.15888E-4
wikipedia articles 8.10706E-4
corpus 8.09871E-4
initial training 7.97729E-4
method 7.71054E-4
web search 7.6874E-4
use web 7.62236E-4
result web 7.61776E-4
own dictionary 7.57062E-4
small amount 7.54463E-4
corporate web 7.37873E-4
snippets module 7.36639E-4
web pages 7.29982E-4
same construction 7.27133E-4
large amount 7.2324E-4
feature 7.21344E-4
japanese web 7.1795E-4
categorization table 7.11584E-4
web sites 7.09884E-4
information 6.95099E-4
blog articles 6.89674E-4
calculation figure 6.87497E-4
wikipedia article 6.86292E-4
categorization methods 6.81014E-4
domains 6.80233E-4
total number 6.73048E-4
mation module 6.71837E-4
components module 6.69874E-4
wikipedia modules 6.63205E-4
related work 6.62433E-4
strict wikipedia 6.59704E-4
nlp tasks 6.59088E-4
wikipedia arti 6.5567E-4
failure wikipedia 6.5567E-4
media health 6.51296E-4
approach 6.48767E-4
good clues 6.44758E-4
previous ones 6.43179E-4
score 6.40912E-4
health education 6.40694E-4
module this 6.37277E-4
articles increases 6.28793E-4
snippets components 6.22305E-4
features 6.11598E-4
corporate snippets 6.0876E-4
only neces 6.07832E-4
fundamental content 6.05627E-4
daily basis 6.04092E-4
