Query
expansion
is
an
effective
technique
to
improve
the
performance
of
information
retrieval
systems
.
Although
hand-crafted
lexical
resources
,
such
as
WordNet
,
could
provide
more
reliable
related
terms
,
previous
studies
showed
that
query
expansion
using
only
WordNet
leads
to
very
limited
performance
improvement
.
One
of
the
main
challenges
is
how
to
assign
appropriate
weights
to
expanded
terms
.
In
this
paper
,
we
re-examine
this
problem
using
recently
proposed
axiomatic
approaches
and
find
that
,
with an appropriate term weighting strategy
,
we
are
able
to
exploit
the
information
from
lexical
resources
to
significantly
improve
the
retrieval
performance
.
Our
empirical
results
on
six
TREC
collections
show
that
query
expansion
using
only
hand-crafted
lexical
resources
leads
to
significant
performance
improvement
.
The
performance
can
be
further
improved
if
the
proposed
method
is
combined
with
query
expansion
using
co-occurrence-based
resources
.
1
Introduction
Query expansion (Fang and Zhai, 2006; Qiu and Frei, 1993; Bai et al., 2005; Cao et al., 2005)
is
a
commonly
used
strategy
to
bridge
the
vocabulary
gaps
by
expanding
original
queries
with
related
terms
.
Expanded
terms
are
often
selected
from
either
co-occurrence-based
thesauri
(
Qiu
and
Frei
,
1993
;
Bai
et
al.
,
2005
;
Jing
and
Croft
,
1994
;
Peat
and
Willett
,
1991
;
Smeaton
and
van
Rijsbergen
,
1983
;
Fang
and
Zhai
,
2006
)
or
handcrafted
thesauri
(
Voorhees
,
1994
;
Liu
et
al.
,
2004
)
or
both
(
Cao
et
al.
,
2005
;
Mandala
et
al.
,
1999b
)
.
Intuitively
,
compared
with
co-occurrence-based
thesauri
,
hand-crafted
thesauri
,
such
as
WordNet
,
could
provide
more
reliable
terms
for
query
expansion
.
However
,
previous
studies
failed
to
show
any
significant
gain
in
retrieval
performance
when
queries
are
expanded
with
terms
selected
from
WordNet
(
Voorhees
,
1994
;
Stairmand
,
1997
)
.
Although
some
researchers
have
shown
that
combining
terms
from
both
types
of
resources
is
effective
,
the
benefit
of
query
expansion
using
only
manually
created
lexical
resources
remains
unclear
.
The
main
challenge
is
how
to
assign
appropriate
weights
to
the
expanded
terms
.
In
this
paper
,
we
re-examine
the
problem
of
query
expansion
using
lexical
resources
with
the
recently
proposed
axiomatic
approaches
(
Fang
and
Zhai
,
2006
)
.
The
major
advantage
of
axiomatic
approaches
in
query
expansion
is
to
provide
guidance
on
how
to
weight
related
terms
based
on
a
given
term
similarity
function
.
In
our
previous
study
,
a
co-occurrence-based
term
similarity
function
was
proposed
and
studied
.
In
this
paper
,
we
study
several
term
similarity
functions
that
exploit
various
information
from
two
lexical
resources
,
i.e.
,
WordNet
and
the dependency-based thesaurus
constructed
by
Lin
(
Lin
,
1998
)
,
and
then
incorporate
these
similarity
functions
into
the
axiomatic
retrieval
framework
.
We
conduct
empirical
experiments
over
several
TREC
standard
collections
to
systematically
evaluate
the
effectiveness
of
query
expansion
based
on
these
similarity
functions
.
Experiment
results
show
that
all
the
similarity
functions
improve
the
retrieval
performance
,
although
the
performance
improvement
varies
for
different
functions
.
We
find
that
the
most
effective
way
to
utilize
the
information
from
WordNet
is
to
compute
the
term
similarity
based
on
the
overlap
of
synset
definitions
.
Using
this
similarity
function
in
query
expansion
can
significantly
improve
the
retrieval
performance
.
According
to
the
retrieval
performance
,
the
proposed
similarity
function
is
significantly
better
than
a simple mutual-information-based similarity function
,
while
it
is
comparable
to
the
function
proposed
in
(
Fang
and
Zhai
,
2006
)
.
Furthermore
,
we
show
that
the
retrieval
performance
can
be
further
improved
if
the
proposed
similarity
function
is
combined
with
the
similarity
function
derived
from
co-occurrence-based
resources
.
The
main
contribution
of
this
paper
is
to
reexamine
the
problem
of
query
expansion
using
lexical
resources
with
a
new
approach
.
Unlike
previous
studies
,
we
are
able
to
show
that
query
expansion
using
only
manually
created
lexical
resources
can
significantly
improve
the
retrieval
performance
.
The
rest
of
the
paper
is
organized
as
follows
.
We
discuss
the
related
work
in
Section
2
,
and
briefly
review
the
studies
of
query
expansion
using
axiomatic
approaches
in
Section
3
.
We
then
present
our
study
of
using
lexical
resources
,
such
as
WordNet
,
for
query
expansion
in
Section
4
,
and
discuss
experiment
results
in
Section
5
.
Finally
,
we
conclude
in
Section
6
.
2
Related
Work
Although
the
use
of
WordNet
in
query
expansion
has
been
studied
by
various
researchers
,
the
improvement
of
retrieval
performance
is
often
limited
.
Voorhees
(
Voorhees
,
1994
)
expanded
queries
using
a
combination
of
synonyms
,
hypernyms
and
hyponyms
manually
selected
from
WordNet
,
and
achieved
limited
improvement
(i.e., around -2% to +2%)
on
short
verbose
queries
.
Stairmand (Stairmand, 1997) used WordNet for query expansion, but concluded that the improvement was restricted by the coverage of WordNet; no empirical results were reported.
2005
)
focused
on
extending
language
models
.
Although
they
were
able
to
improve
the
performance
,
it
remains
unclear
whether
using
only
information
from
hand-crafted
thesauri
would
help
to
improve
the
retrieval
performance
.
Another
way
to
improve
retrieval
performance
using
WordNet
is
to
disambiguate
word
senses
.
Voorhees (Voorhees, 1993) showed that using WordNet for word sense disambiguation degraded the retrieval performance. Liu et al. (Liu et al., 2004) used WordNet for both sense disambiguation and query expansion and achieved reasonable performance improvement.
However
,
the
computational
cost
is
high
and
the
benefit
of
query
expansion
using
only
WordNet
is
unclear
.
Ruch et al. (Ruch et al., 2006)
studied
the
problem
in
the
domain
of
biology
literature
and
proposed
an
argumentative
feedback
approach
,
where
expanded
terms
are
selected
from
only
sentences
classified
into
one
of
four
disjunct
argumentative
categories
.
The
goal
of
this
paper
is
to
study
whether
query
expansion
using
only
manually
created
lexical
resources
could
lead
to
the
performance
improvement
.
The
main
contribution
of
our
work
is
to
show that
query
expansion
using
only
hand-crafted
lexical
resources
is
effective
in
the
recently
proposed
axiomatic
framework
,
which
has
not
been
shown
in
the
previous
studies
.
3
Query
Expansion
in
Axiomatic
Retrieval
Axiomatic
approaches
have
recently
been
proposed
and
studied
to
develop
retrieval
functions
(
Fang
and
Zhai
,
2005
;
Fang
and
Zhai
,
2006
)
.
The
main
idea
is
to
search
for
a
retrieval
function
that
satisfies
all
the
desirable
retrieval
constraints
,
i.e.
,
axioms
.
The
underlying
assumption
is
that
a
retrieval
function
satisfying
all
the
constraints
would
perform
well
empirically
.
Unlike
other
retrieval
models
,
axiomatic
retrieval
models
directly
model
the
relevance
with
term
level
retrieval
constraints
.
In
(
Fang
and
Zhai
,
2005
)
,
several
axiomatic
retrieval
functions
have
been
derived
based
on
a
set
of
basic
formalized
retrieval
constraints
and
an
inductive
definition
of
the
retrieval
function
space
.
The
derived
retrieval
functions
are
shown
to
perform
as
well
as
the
existing
retrieval
functions
with
less
parameter
sensitivity
.
One of the components in the inductive definition is the primitive weighting function, which assigns the retrieval score to a single-term document {d} for a single-term query {q} based on

$$S(\{q\},\{d\}) = \begin{cases} \omega(q) & q = d \\ 0 & q \neq d \end{cases} \qquad (1)$$

where $\omega(q)$ is a term weighting function of q.
A
limitation
of
the
primitive
weighting
function
described
in
Equation
1
is
that
it
cannot
bridge
vocabulary
gaps
between
documents
and
queries
.
To
overcome
this
limitation
,
in
(
Fang
and
Zhai
,
2006
)
,
we
proposed
a
set
of
semantic
term
matching
constraints
and
modified
the
previously
derived
axiomatic
functions
to
make
them
satisfy
these
additional
constraints
.
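In particular, the primitive weighting function is generalized as

$$S(\{q\},\{d\}) = \omega(q) \times f(s(q,d)),$$

where s(q, d) is a semantic similarity function between two terms q and d, and f is a monotonically increasing function defined as

$$f(s(q,d)) = \begin{cases} 1 & q = d \\ \dfrac{s(q,d)}{s(q,q)} \times \beta & q \neq d \end{cases} \qquad (2)$$

where $\beta$ is a parameter that regulates the weighting of the original query terms and the semantically similar terms.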
We
have
shown
that
the
proposed
generalization
can
be
implemented
as
a
query
expansion
method
.
Specifically, the expanded terms are selected based on a term similarity function s, and the weight of an expanded term t is determined by its term similarity with a query term q, i.e., s(q, t), as well as the weight of the query term, i.e., $\omega(q)$. Note that the weight of an expanded term t is $\omega(t)$ in traditional query expansion methods.
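The weighting scheme described in this section can be sketched in a few lines of Python. This is only a minimal illustration, not the implementation used in the experiments: the names `generalized_weight`, `expand_query`, `toy_s` and `TOY_SIM` are hypothetical, the similarity values are made up, and summing the contributions of all query terms to score a candidate is an assumption of the sketch.

```python
from typing import Callable, Dict, Iterable

Similarity = Callable[[str, str], float]


def generalized_weight(q: str, t: str, omega: Dict[str, float],
                       s: Similarity, beta: float) -> float:
    """Score contributed by term t for query term q: omega(q) * f(s(q, t))."""
    if q == t:
        return omega[q]  # f = 1 for an exact match
    self_sim = s(q, q)
    if self_sim == 0.0:
        return 0.0  # guard: without a self-similarity the ratio is undefined
    return omega[q] * beta * s(q, t) / self_sim


def expand_query(omega: Dict[str, float], s: Similarity,
                 candidates: Iterable[str], beta: float,
                 top_k: int) -> Dict[str, float]:
    """Add the top_k most similar candidates to the weighted query.

    Summing over query terms is an assumption of this sketch.
    """
    scores: Dict[str, float] = {}
    for t in candidates:
        if t in omega:
            continue  # original query terms keep their own weight
        scores[t] = sum(generalized_weight(q, t, omega, s, beta) for q in omega)
    best = sorted(scores, key=scores.get, reverse=True)[:top_k]
    expanded = dict(omega)
    expanded.update({t: scores[t] for t in best if scores[t] > 0.0})
    return expanded


# Toy similarity table (hypothetical values, not taken from any resource).
TOY_SIM = {("car", "car"): 1.0, ("car", "auto"): 0.8, ("car", "road"): 0.2}


def toy_s(a: str, b: str) -> float:
    return TOY_SIM.get((a, b), 0.0)


print(expand_query({"car": 2.0}, toy_s, ["auto", "road", "car"], beta=0.5, top_k=1))
```

Note that the weight of the expanded term depends on the weight of the query term it is related to, rather than on its own collection statistics, which is the difference from traditional expansion methods pointed out above.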
In
our
previous
study
(
Fang
and
Zhai
,
2006
)
,
term
similarity
function
s
is
derived
based
on
the
mutual
information
of
terms
over
collections
that
are
constructed
under
the
guidance
of
a
set
of
term
semantic
similarity
constraints
.
The
focus
of
this
paper
is
to
study
and
compare
several
term
similarity
functions
exploiting
the
information
from
lexical
resources
,
and
evaluate
their
effectiveness
in
the
axiomatic
retrieval
models
.
4
Term
Similarity
based
on
Lexical
Resources
In
this
section
,
we
discuss
a
set
of
term
similarity
functions
that
exploit
the
information
stored
in
two
lexical
resources
:
WordNet
(
Miller
,
1990
)
and
dependency-based
thesaurus
(
Lin
,
1998
)
.
The
most
commonly
used
lexical
resource
is
WordNet
(
Miller
,
1990
)
,
which
is
a
hand-crafted
lexical
system
developed
at
Princeton
University
.
Words
are
organized
into
four
taxonomies
based
on
different
parts
of
speech
.
Every node in WordNet is a synset, i.e., a set of synonyms. The definition of a synset, which is referred to as its gloss, is also provided.
For
a
query
term
,
all
the
synsets
in
which
the
term
appears
can
be
returned
,
along
with
the
definition
of
the
synsets
.
We
now
discuss
six
possible
term
similarity
functions
based
on
the
information
provided
by
WordNet
.
Since
the
definition
provides
valuable
information
about
the
semantic
meaning
of
a
term
,
we
can
use
the
definitions
of
the
terms
to
measure
their
semantic
similarity
.
The
more
common
words
the
definitions
of
two
terms
have
,
the
more
similar
these
terms
are
(
Banerjee
and
Pedersen
,
2005
)
.
Thus, we can compute the term semantic similarity based on synset definitions in the following way:

$$s_{def}(t_1,t_2) = \frac{|D(t_1) \cap D(t_2)|}{|D(t_1) \cup D(t_2)|}$$

where D(t) is the concatenation of the definitions for all the synsets containing term t and |D| is the number of words of the set D.
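As a rough illustration of a definition-overlap similarity, the following Python sketch treats D(t) as the set of lower-cased, whitespace-separated words in the glosses of a term and measures overlap as the ratio of shared words to all words. The normalization by the union, the tokenization, and the toy glosses in `TOY_GLOSSES` are assumptions of the sketch; in practice the glosses would come from WordNet.

```python
from typing import Dict, List, Set


def gloss_words(term: str, glosses: Dict[str, List[str]]) -> Set[str]:
    """D(t): words in the concatenated glosses of all synsets containing t."""
    words: Set[str] = set()
    for gloss in glosses.get(term, []):
        words.update(gloss.lower().split())
    return words


def definition_similarity(t1: str, t2: str,
                          glosses: Dict[str, List[str]]) -> float:
    """Overlap of the two definition word sets, normalized by their union."""
    d1, d2 = gloss_words(t1, glosses), gloss_words(t2, glosses)
    union = d1 | d2
    if not union:
        return 0.0  # neither term has a definition
    return len(d1 & d2) / len(union)


# Hypothetical glosses, for illustration only.
TOY_GLOSSES = {
    "car": ["a motor vehicle with four wheels"],
    "truck": ["a motor vehicle for hauling"],
}

print(definition_similarity("car", "truck", TOY_GLOSSES))
```

Because the similarity depends only on the glosses, it can be computed once for all term pairs and stored, independently of any query.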
Within
a
taxonomy
,
synsets
are
organized
by
their
lexical
relations
.
Thus
,
given
a
term
,
related
terms
can
be
found
in
the
synsets
related
to
the
synsets
containing
the
term
.
In
this
paper
,
we
consider
the
following
five
word
relations
.
• Synonym (Syn): X and Y are synonyms if they are interchangeable in some context.
• Hypernym (Hyper): Y is a hypernym of X if X is a (kind of) Y.
• Hyponym (Hypo): X is a hyponym of Y if X is a (kind of) Y.
• Holonym (Holo): Y is a holonym of X if X is a part of Y.
• Meronym (Mero): X is a meronym of Y if X is a part of Y.
Since these relations are binary, we define the term similarity functions based on these relations in the following way:

$$s_R(t_1,t_2) = \begin{cases} \alpha_R & t_1 \in T_R(t_2) \\ 0 & t_1 \notin T_R(t_2) \end{cases}$$

where R ∈ {syn, hyper, hypo, holo, mero}, $T_R(t)$ is a set of words that are related to term t based on the relation R, and the $\alpha_R$ are non-zero parameters to control the similarity between terms based on different relations. However, since, for a given relation, the similarity values for all related term pairs are the same, the values of these parameters can be ignored when we use Equation 2 in query expansion.
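A relation-based similarity of this binary kind amounts to a set-membership test. The short Python sketch below assumes the related words for each term have already been collected into a dictionary; `relation_similarity`, `TOY_HYPERNYMS` and the default `alpha = 1.0` are hypothetical choices made for illustration.

```python
from typing import Dict, Set


def relation_similarity(t1: str, t2: str, related: Dict[str, Set[str]],
                        alpha: float = 1.0) -> float:
    """Return alpha if t1 is among the words related to t2, else 0."""
    return alpha if t1 in related.get(t2, set()) else 0.0


# Hypothetical hypernym sets, for illustration only.
TOY_HYPERNYMS = {"dog": {"canine", "animal"}}

print(relation_similarity("animal", "dog", TOY_HYPERNYMS))
print(relation_similarity("dog", "animal", TOY_HYPERNYMS))
```

The function only separates related from unrelated pairs and cannot rank the related terms among themselves.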
Another lexical resource we study in the paper is the dependency-based thesaurus provided by Lin (Lin, 1998), available at http://www.cs.ualberta.ca/lindek/downloads.htm. The thesaurus provides term similarities that are automatically computed based on dependency relationships extracted from a parsed corpus. We define a similarity function that can utilize this thesaurus as follows:

$$s_{Lin}(t_1,t_2) = \begin{cases} L(t_1,t_2) & (t_1,t_2) \in TP_{Lin} \\ 0 & (t_1,t_2) \notin TP_{Lin} \end{cases}$$

where $L(t_1,t_2)$ is the similarity of terms stored in the dependency-based thesaurus and $TP_{Lin}$ is a set of all the term pairs stored in the thesaurus. The similarity of two terms is set to zero if we cannot find the term pair in the thesaurus.

Since all the similarity functions discussed above capture different perspectives of term relations, we propose a simple strategy to combine these similarity functions so that the similarity of a term pair is the highest similarity value of these two terms among all the above similarity functions, which is shown as follows:

$$s_{Combined}(t_1,t_2) = \max_{R \in Rset} s_R(t_1,t_2),$$

where Rset = {def, syn, hyper, hypo, holo, mero, Lin}.
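The thesaurus lookup and the max-based combination can be sketched as follows. This is a minimal Python illustration under stated assumptions: the thesaurus is represented as an in-memory dictionary keyed by ordered term pairs, and the names `lin_similarity`, `combined_similarity`, `TOY_LIN` and `TOY_FUNCTIONS`, as well as all similarity values, are hypothetical.

```python
from typing import Callable, Dict, Tuple

Similarity = Callable[[str, str], float]


def lin_similarity(t1: str, t2: str,
                   lin_pairs: Dict[Tuple[str, str], float]) -> float:
    """Stored thesaurus similarity, or 0 when the pair is not in the thesaurus."""
    return lin_pairs.get((t1, t2), 0.0)


def combined_similarity(t1: str, t2: str,
                        functions: Dict[str, Similarity]) -> float:
    """Highest similarity assigned to the pair by any individual function."""
    return max((fn(t1, t2) for fn in functions.values()), default=0.0)


# Hypothetical thesaurus entries and a toy synonym test, for illustration only.
TOY_LIN = {("car", "automobile"): 0.3}
TOY_FUNCTIONS: Dict[str, Similarity] = {
    "Lin": lambda a, b: lin_similarity(a, b, TOY_LIN),
    "syn": lambda a, b: 1.0 if (a, b) == ("car", "auto") else 0.0,
}

print(combined_similarity("car", "automobile", TOY_FUNCTIONS))
print(combined_similarity("car", "auto", TOY_FUNCTIONS))
```

Taking the maximum keeps a term pair whenever at least one resource relates the two terms, so the combination can only enlarge the set of candidate expansion terms.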
In
summary
,
we
have
discussed
eight
possible
similarity
functions
that
exploit
the
information
from
the
lexical
resources
.
We
then
incorporate
these
similarity
functions
into
the
axiomatic
retrieval
models
based
on
Equation
2
,
and
perform
query
expansion
based
on
the
procedure
described
in
Section
3
.
The
empirical
results
are
reported
in
Section
5
.
5
Experiments
In
this
section
,
we
experimentally
evaluate
the
effectiveness
of
query
expansion
with
the
term
similarity
functions
discussed
in
Section
4
in
the
axiomatic
framework
.
Experiment
results
show
that
the
similarity
function
based
on
synset
definitions
is
most
effective
.
By
incorporating
this
similarity
function
into
the
axiomatic
retrieval
models
,
we
show
that
query
expansion
using
the
information
from
only
WordNet
can
lead
to
significant
improvement
of
retrieval
performance
,
which
has
not
been
shown
in
the
previous
studies
(
Voorhees
,
1994
;
Stairmand
,
1997
)
.
5.1
Experiment
Design
We
conduct
three
sets
of
experiments
.
First
,
we
compare
the
effectiveness
of
term
similarity
functions
discussed
in
Section
4
in
the
context
of
query
expansion
.
Second
,
we
compare
the
best
one
with
the
term
similarity
functions
derived
from
co-occurrence-based
resources
.
Finally
,
we
study
whether
the
combination
of
term
similarity
functions
from
different
resources
can
further
improve
the
performance
.
Table 1: Statistics of Test Collections (columns: Collection, Description, #Voc., #Doc.; the collection descriptions are news articles, technical reports, government documents, ad hoc data and web collections).

The reported statistics cover the vocabulary size, the number of documents and the number of queries.
The
preprocessing
only
involves
stemming
with
Porter
's
stemmer
.
We use WordNet 3.0, Lemur Toolkit and TrecWN library in experiments. The results are evaluated with both MAP (mean average precision) and gMAP (geometric mean average precision) (Voorhees, 2005), the latter of which emphasizes the performance of difficult queries.
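To make the difference between the two measures concrete, the Python sketch below computes average precision for one ranked list and then aggregates per-query values by arithmetic and geometric mean. It is an illustrative sketch rather than the evaluation code used here; in particular, the small floor `eps` applied before taking logarithms is an assumption that keeps a query with zero average precision from making the geometric mean undefined.

```python
import math
from typing import Iterable, List, Set


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Mean of the precision values at the ranks of the relevant documents."""
    if not relevant:
        return 0.0
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def mean_ap(aps: Iterable[float]) -> float:
    """MAP: arithmetic mean of per-query average precision."""
    values = list(aps)
    return sum(values) / len(values)


def geometric_mean_ap(aps: Iterable[float], eps: float = 1e-5) -> float:
    """gMAP: geometric mean of per-query average precision, floored at eps."""
    values = list(aps)
    return math.exp(sum(math.log(max(ap, eps)) for ap in values) / len(values))


per_query = [0.5, 0.125]
print(mean_ap(per_query), geometric_mean_ap(per_query))
```

Because the geometric mean multiplies the per-query values, a single poorly served query pulls gMAP down much more than it pulls MAP down, which is why gMAP is sensitive to difficult queries.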
There is one parameter $\beta$ in the query expansion method presented in Section 3. We tune the value of $\beta$ and report the best performance.
The
parameter
sensitivity
is
similar
to
the
observations
described
in
(
Fang
and
Zhai
,
2006
)
and
will
not
be
discussed
in
this
paper
.
In all the result tables, ‡ and † indicate that the performance difference is statistically significant according to the Wilcoxon signed rank test at the level of 0.05 and 0.1, respectively.
We
now
explain
the
notations
of
different
methods
.
BL
is
the
baseline
method
without
query
expansion
.
In this paper, we use the best performing function derived in axiomatic retrieval models, i.e., F2-EXP in (Fang and Zhai, 2005) with a fixed parameter value (b = 0.5). QE_X is the query expansion method with term similarity function s_X, where X could be Def, Syn., Hyper., Hypo., Mero., Holo., Lin or Combined.
Furthermore
,
we
examine
the
query
expansion
method
using
co-occurrence-based
resources
.
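In particular, we evaluate the retrieval performance using the following two similarity functions: s_MIBL and s_MIImp. Both functions are based on the mutual information of terms in a set of documents. s_MIBL uses the collection itself to compute the mutual information, while s_MIImp uses the working sets constructed under the guidance of a set of term semantic similarity constraints (Fang and Zhai, 2006). The mutual information of two terms $t_1$ and $t_2$ is computed over the random variables $X_{t_1}$ and $X_{t_2}$, where $X_{t_i}$ is a binary random variable corresponding to the presence/absence of term $t_i$ in each document of collection C.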
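For reference, the mutual information of two binary presence/absence variables can be estimated from document counts as in the Python sketch below. It uses the standard definition of mutual information with maximum-likelihood probability estimates and natural logarithms; whether the original functions use exactly this estimator is an assumption, and `mutual_information` and the toy documents are hypothetical.

```python
import math
from typing import Dict, List, Set, Tuple


def mutual_information(docs: List[Set[str]], t1: str, t2: str) -> float:
    """I(X_t1; X_t2) for binary presence/absence variables over the documents."""
    n = len(docs)
    if n == 0:
        return 0.0
    # Joint counts over the four presence/absence combinations.
    counts: Dict[Tuple[int, int], int] = {(a, b): 0 for a in (0, 1) for b in (0, 1)}
    for doc in docs:
        counts[(int(t1 in doc), int(t2 in doc))] += 1
    mi = 0.0
    for (a, b), c in counts.items():
        if c == 0:
            continue  # a zero-probability cell contributes nothing
        p_ab = c / n
        p_a = (counts[(a, 0)] + counts[(a, 1)]) / n
        p_b = (counts[(0, b)] + counts[(1, b)]) / n
        mi += p_ab * math.log(p_ab / (p_a * p_b))
    return mi


# Two terms that always co-occur, and two that occur independently.
correlated = [{"a", "b"}, {"a", "b"}, {"c"}, {"c"}]
independent = [{"a", "b"}, {"a"}, {"b"}, set()]
print(mutual_information(correlated, "a", "b"))
print(mutual_information(independent, "a", "b"))
```

The same function applies to either document set: passing the whole collection corresponds to the collection-based variant, and passing a query-dependent working set corresponds to the working-set variant.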
5.2
Effectiveness
of
Lexical
Resources
We
first
compare
the
retrieval
performance
of
query
expansion
with
different
similarity
functions
using
short
keyword
(
i.e.
,
title-only
)
queries
,
because
query
expansion
techniques
are
often
more
effective
for
shorter
queries
(
Voorhees
,
1994
;
Fang
and
Zhai
,
2006
)
.
The
results
are
presented
in
Table
2
.
It
is
clear
that
query
expansion
with
these
functions
can
improve
the
retrieval
performance
,
although
the
performance
gains
achieved
by
different
functions
vary
a
lot
.
In
particular
,
we
make
the
following
observations
.
First
,
the
similarity
function
based
on
synset
definitions
is
the
most
effective
one
.
QE_def
significantly
improves
the
retrieval
performance
for
all
the
data
sets
.
For
example
,
in
trec7
,
it
improves
the
performance
from
0.186
to
0.216
.
As
far
as
we
know
,
none
of
the
previous
studies
showed
such
significant
performance
improvement
by
using
only
WordNet
as
query
expansion
resource
.
Second
,
the
similarity
functions
based
on
term
relations
are
less effective than the definition-based similarity function
.
We
think
that
the
worse
performance
is
related
to
the
following
two
reasons
:
(
1
)
The
similarity
functions
based
on
relations
are
binary
,
which
is
not
a
good
way
to
model
term
similarities
.
(2) The relations are limited by the part of speech of the terms, because two terms in WordNet are related only when they have the same part-of-speech tag. However, the definition-based similarity function does not have such a limitation. Third, the similarity function based on Lin's thesaurus is more effective than those based on term relations from WordNet, while it is less effective than the definition-based similarity function, which might be caused by its smaller coverage. Finally, combining different WordNet-based similarity functions does not help, which may indicate that the expanded terms selected by different functions overlap.

Table 2: Performance of query expansion using lexical resources (short keyword queries)

Table 3: Performance comparison of hand-crafted and co-occurrence-based thesauri (short keyword queries)
5.3
Comparison
with
Co-occurrence-based
Resources
As
shown
in
Table
2
,
the
similarity
function
based
on
synset
definitions
,
i.e.
,
sdef
,
is
most
effective
.
We
now
compare
the
retrieval
performance
of
using
this
similarity
function
with
that
of
using
the
mutual
information
based
functions
,
i.e.
,
sMIBL
and
sMIImp
.
The
experiments
are
conducted
over
two
types
of
queries
,
i.e.,
short
keyword
(
keyword
title
)
and
short
verbose
(
one
sentence
description
)
queries
.
The
results
for
short
keyword
queries
are
shown
in
Table
3
.
The
retrieval
performance
of
query
expansion
based
on
sdef
is
significantly
better
than
that
based
on
sMIBL
on
almost
all
the
data
sets
,
while
it
is
slightly
worse
than
that
based
on
sMIImp
on
some
data
sets
.
We
can
make
a similar observation
from
the
results
for
short
verbose
queries
as
shown
in
Table
4
.
One
advantage
of
sdef
over
sMIImp
is
the
computational
cost
,
because
sdef
can
be
computed
offline
in
advance
while
sMIImp
has
to
be
computed
online
from
query-dependent
working
sets
which
takes
much
more
time
.
The low computational cost and high retrieval performance make s_def more attractive in real-world applications.
Since
both
types
of
similarity
functions
are
able
to
improve
retrieval
performance
,
we
now
study
whether
combining
them
could
lead
to
better
performance
.
Table
5
shows
the
retrieval
performance
of
combining
both
types
of
similarity
functions
for
short
keyword
queries
.
The
results
for
short
verbose
queries
are
similar
.
Clearly
,
combining
the
similarity
functions
from
different
resources
could
further
improve
the
performance
.
6
Conclusions
Query
expansion
is
an
effective
technique
in
information
retrieval
to
improve
the
retrieval
performance
,
because
it
often
can
bridge
the
vocabulary
gaps
between
queries
and
documents
.
Intuitively
,
hand-crafted
thesaurus
could
provide
reliable
related
terms
,
which
would
help
improve
the
performance
.
However
,
none
of
the
previous
studies
is
able
to
show
significant
performance
improvement
through
query
expansion
using
information
only
from
manually
created
lexical
resources
.
In
this
paper
,
we
re-examine
the
problem of query expansion using lexical resources in the recently proposed axiomatic framework
and
find
that
we
are
able
to
significantly
improve
retrieval
performance
through
query
expansion
using
only
hand-crafted
lexical
resources
.
In
particular
,
we
first
study
a
few
term
similarity
functions
exploiting
the
information
from
two
lexical
resources
:
WordNet
and
dependency-based
thesaurus
created
by
Lin
.
We then incorporate the similarity functions into the query expansion method in the axiomatic retrieval models. Systematic experiments have been conducted over six standard TREC collections and show promising results. All the proposed similarity functions improve the retrieval performance, although the degree of improvement varies for different similarity functions. Among all the functions, the one based on synset definitions is most effective and is able to significantly and consistently improve retrieval performance for all the data sets. This similarity function is also compared with some similarity functions using mutual information. Furthermore, experiment results show that combining similarity functions from different resources could further improve the performance. Unlike previous studies, we are able to show that query expansion using only manually created thesauri can lead to significant performance improvement. The main reason is that the axiomatic approach provides guidance on how to appropriately assign weights to expanded terms.

Table 4: Performance Comparison (MAP, short verbose queries)

Table 5: Additive Effect (MAP, short keyword queries)
There
are
many
interesting
future
research
directions
based
on
this
work
.
First
,
we
will
study
the
same
problem
in
some
specialized
domain
,
such
as
biology
literature
,
to
see
whether
the
proposed
approach
could
be
generalized
to
the
new
domain
.
Second
,
the
fact
that
using
axiomatic
approaches
to
incorporate
linguistic
information
can
improve
retrieval
performance
is
encouraging
.
We
plan
to
extend
the
axiomatic
approach
to
incorporate
more
linguistic
information
,
such
as
phrases
and
word
senses
,
into
retrieval
models
to
further
improve
the
performance
.
Acknowledgments
We
thank
ChengXiang
Zhai
,
Dan
Roth
, and
Rodrigo
de
Salvo
Braz
for
valuable
discussions
.
We
also
thank
three
anonymous
reviewers
for
their
useful
comments
.
