We
present
results
that
show
that
incorporating
lexical
and
structural
semantic
information
is
effective
for
word
sense
disambiguation
.
We
evaluated
the
method
by
using
precise
information
from
a
large
treebank
and
an
ontology
automatically
created
from
dictionary
sentences
.
Exploiting
rich
semantic
and
structural
information
improves
precision
2-3
%
.
The largest gains
are
seen
with
verbs
,
with
an
improvement
of
5.7
%
over
a
model
using
only
bag
of
words
and
n-gram
features
.
1
Introduction
Recently
,
significant
improvements
have
been
made
in
combining
symbolic
and
statistical
approaches
to
various
natural
language
processing
tasks
.
In
parsing
,
for
example
,
symbolic
grammars
are
being
combined
with
stochastic
models
(
Riezler
et
al.
,
2002
;
Oepen
et
al.
,
2002
;
Malouf
and
van
Noord
,
2004
)
.
Statistical
techniques
have
also
been
shown
to
be
useful
for
word
sense
disambiguation
(
Stevenson
,
2003
)
.
However
,
to
date
,
there
have
been
few
combinations
of
sense
information
together
with
symbolic
grammars
and
statistical
models
.
Klein
and
Manning
(
2003
)
show
that
much
of
the
gain
in
statistical
parsing
using
lexicalized
models
comes
from
the
use
of
a
small
set
of
function
words
.
Features
based
on
general
relations
provide
little
improvement
,
presumably
because
the
data
is
too
sparse
:
in
the
Penn
treebank
normally
used
to
train
and
test
statistical
parsers
stocks
and
skyrocket
never
appear
together
.
They
note
that
this
should
motivate
the
use
of
similarity
and
/
or
class
based
approaches
:
the
superordinate
concepts
capital
(⊃ stocks) and move upward (⊃ skyrocket)
frequently
appear
together
.
However
,
there
has
been
little
success
in
this
area
to
date
.
For
example
,
Xiong
et
al.
(
2005
)
use
semantic
knowledge
to
parse
Chinese
,
but
gain
only
a
marginal
improvement
.
Focusing
on
WSD
,
Stevenson
(
2003
)
and
others
have
shown
that
the
use
of
syntactic
information
(
predicate-argument
relations
)
improves the quality of word sense disambiguation; the effectiveness of selectional preference information for WSD has also been shown.
However
,
there
is
still
little
work
on
combining
WSD
and
parse
selection
.
We
hypothesize
that
one
of
the
reasons
for
the
lack
of
success
is
that
there
has
been
no
resource
annotated
with
both
syntactic (or structural semantic) information
and
lexical
semantic
information
.
For
English
,
there
is
the
SemCor
corpus
(
Fellbaum
,
1998
)
which
is
annotated
with
parse
trees
and
WordNet
senses
,
but
it
is
fairly
small
,
and
does
not
explicitly
include
any
structural
semantic
information
.
Therefore
,
we
decided
to
construct
and
use
a
treebank
with
both
syntactic
information
(
e.g.
HPSG
parses
)
and
lexical
semantic
information
(
e.g.
sense
tags
)
:
the
Hinoki
treebank
(
Bond
et
al.
,
2004
)
.
This
can
be
used
to
train
word
sense
disambiguation
and
parse
ranking
models
using
both
syntactic
and
lexical
semantic
features
.
In
this
paper
,
we
discuss
only
word
sense
disambiguation
.
Parse
ranking
is
discussed
in
Fujita
et
al.
(
2007
)
.
2
The
Hinoki
Corpus
The
Hinoki
corpus
consists
of
the
Lexeed
Semantic
Database
of
Japanese
(
Kasahara
et
al.
,
2004
)
and
corpora
annotated
with
syntactic
and
semantic
information.
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pp. 477-485, Prague, June 2007. © 2007 Association for Computational Linguistics
Lexeed
is
a
database
built from a dictionary
,
which
defines
word
senses
used
in
the
Hinoki
corpus
and
has
around
49,000
dictionary
definition
sentences
and
46,000
example
sentences
which
are
syntactically
and
semantically
annotated
.
Lexeed
consists
of
all
words
with
a
familiarity
greater
than
or
equal
to
five
on
a
scale
of
one
to
seven
.
This
gives
a
fundamental
vocabulary
of
28,000
words
,
divided
into
46,347
different
senses
.
Each
sense
has
a
definition
sentence
and
example
sentence
written
using
only
these
28,000
familiar
words
(
and
some
function
words
)
.
Many
senses
have
more
than
one
sentence
in
the
definition
:
there
are
75,000
defining
sentences
in
all
.
A
(
simplified
)
example
of
the
entry
for
untenshu "chauffeur"
is
given
in
Figure
1
.
Each
word
contains
the
word
itself
,
its
part
of
speech
(
POS
)
and
lexical
type
(
s
)
in
the
grammar
,
and
the
familiarity
score
.
Each
sense
then
contains
definition
and
example
sentences
,
links
to
other
senses
in
the
lexicon
(
such
as
hypernym
)
,
and
links
to
other
resources
,
such
as
the
Goi-Taikei
(
Ikehara
et
al.
,
1997
)
and
WordNet
(
Fellbaum
,
1998
)
.
Each
content
word
in
the
definition
and
example
sentences
is
annotated
with
sense
tags
from
the
same
lexicon
.
2.2
Lexical
Semantics
Annotation
The
lexical
semantic
annotation
uses
the
sense
inventory
from
Lexeed
.
All
words
in
the
fundamental
vocabulary
are
tagged
with
their
sense
.
For
example
,
the
word
ookii
"
big
"
(
in
ookiku
naru
"
grow
up
"
)
is
tagged
as
sense
5
in
the
example
sentence
(
Figure
1
)
,
with
the
meaning
"
elder
,
older
"
.
Each
word
was
annotated
by
five
annotators
.
We
use
the
majority
choice
in
case
of
disagreements
(
Tanaka
et
al.
,
2006
)
.
Inter-annotator
agreements
among
the
five
annotators
range
from
78.7
%
to
83.3
%
:
the
lowest
agreement
is
for
the
Lexeed
definition
sentences
and
the
highest
is for the Kyoto corpus
(
newspaper
text
)
.
These
agreements
reflect
the
difficulties
in
disambiguating
word
sense
over
each
corpus
and
can
be
considered
as
the
upper
bound
of
precision
for
WSD
.
Table
1
shows
the
distribution
of
word
senses
according
to
the
word
familiarity
in
Lexeed
.
Table 1: Word Senses in Lexeed
The
Hinoki
corpus
comes
with
an
ontology
semi-automatically
constructed
from
the
parse
results
of
definitions
in
Lexeed
(
Nichols
and
Bond
,
2005
)
.
The
ontology
includes
more
than
80
thousand
relationships
between
word
senses
,
e.g.
synonym
,
hypernym
,
abbreviation
,
etc.
The
hypernym
relation
for
untenshu
"
chauffeur
"
is
shown
in
Figure
1
.
Hypernym
or
synonym
relations
exist
for
almost
all
content
words
.
As
part
of
the
ontology
verification
,
all
nominal
and
most
verbal
word
senses
in
Lexeed
were
linked
to
semantic
classes
in
the
Japanese
thesaurus
,
Nihongo
Goi-Taikei
(
Ikehara
et
al.
,
1997
)
.
These
were
then
hand
verified
.
Goi-Taikei
has
about
400,000
words
including proper nouns; most nouns
are
classified
into
about
2,700
semantic
classes
.
These
semantic
classes
are
arranged
in
a
hierarchical
structure
(
11
levels
)
.
The
Goi-Taikei
Semantic
Class
for
untenshu
"
chauffeur
"
is
shown
in
Figure
1
:
(
C292
:
driver
)
at
level
9
which
is
subordinate
to
(
C4
:
person
)
.
2.5
Syntactic
and
Structural
Semantics
Annotation
Syntactic
annotation
is
done
by
selecting
the
best
parse
(
or
parses
)
from
the
full
analyses
derived
by
a
broad-coverage
precision
grammar
.
The
grammar
is
an
HPSG
implementation
(
JACY
:
Siegel
and
Bender
,
2002
)
,
which
provides
a
high
level
of
detail
,
marking
not
only
dependency
and
constituent
structure
but
also
detailed
semantic
relations
.
As
the
grammar
is
based
on
a
monostratal
theory
of
grammar
(
HPSG
:
Pollard
and
Sag
,
1994
)
it
is
possible
to
simultaneously
annotate
syntactic
and
semantic
structure
without
overburdening
the
annotator
.
Using
a
grammar
enforces
treebank
consistency
—
all
sentences
annotated
are
guaranteed
to
have
well-formed parses.
[Figure 1: Dictionary Entry for untenshu "chauffeur". Fields: Index, POS, Lex-Type, Familiarity, Definition, Hypernym, Sem. Class, WordNet. Example sentence: "I dream of growing up and becoming a train driver".]
The
flip
side
to
this
is
that
any
sentences
which
the
parser
cannot
parse
remain
unannotated
,
at
least
unless
we
were
to
fall
back
on
full
manual
mark-up
of
their
analyses
.
The
actual
annotation
process
uses
the
same
tools
as
the
Redwoods
treebank
of
English
(
Oepen
et
al.
,
2002
)
.
There
were
4
parses
for
the
definition
sentence
shown
in
Figure
1
.
The correct parse is shown as a phrase structure tree in Figure 2.
The
two
sources
of
ambiguity
are
the
conjunction
and
the
relative
clause
.
The
parser
also
allows
the
conjunction
to
join
to
densha
and
hito
.
In
Japanese
,
relative
clauses
can
have
gapped
and
non-gapped
readings
.
In
the
gapped
reading
(
selected
here
)
,
hito
is
the
subject
of
unten
"
drive
"
.
In
the
non-gapped
reading
there
is
some
underspecified
relation
between
the
thing
and
the
verb
phrase
.
This
is
similar
to
the
difference
in
the
two
readings
of
the
day
he
knew
in
English
:
"
the
day
that
he
knew
about
"
(
gapped
)
vs
"
the
day
on
which
he
knew
(
something
)
"
(
non-gapped
)
.
Such
semantic
ambiguity
is
resolved
by
selecting
the
correct
derivation
tree
that
includes
the
applied
rules
in
building
the
tree
.
The parses are produced automatically by the Japanese grammar JACY.
The
current
parse
ranking
model
has
an
accuracy
of
70
%
:
the
correct
tree
is
ranked
first
70
%
of
the
time
(
for
Lexeed
definition
sentences
)
(
Fujita
et
al.
,
2007
)
.
The
full
parse
is
an
HPSG
sign
,
containing
both
syntactic
and
semantic
information
.
A
view
of
the
semantic
information
is
given
in
Figure
3.
[Figure 2: Syntactic View of the Definition of untenshu "chauffeur". Tree over the utterance: densha ya jidousha o unten suru hito (gloss: train or car ACC drive do person).]
The
semantic
view
shows
some
ambiguity
has
been
resolved
that
is
not
visible
in
the
purely
syntactic
view
.
The
semantic
view
can
be
further
simplified
into
a
dependency
representation
,
further
abstracting
away
from
quantification
,
as
shown
in
Figure
4
.
One
of
the
advantages
of
the
HPSG
sign
is
that
it
contains
all
this
information
,
making
it
possible
to
extract
the
particular
view
needed
.
In
order
to
make
linking
to
other
resources
(
such
as
the
sense
annotation
)
easier
,
predicates
are
labeled
with
pointers
back
to
their
position
in
the
original
surface
string
.
For
example
,
the
predicate
densha_n_1
links
to
the
surface
characters
between
positions
0
and
3
:
.
Footnote 1: The specific meaning representation language used in JACY is Minimal Recursion Semantics (Copestake et al., 2005).
[Figure 3: Semantic View of the Definition of untenshu "chauffeur"]
[Figure 4: Dependency View of the Definition of untenshu "chauffeur"]
We
define
the
task
in
this
paper
as
"
allocating
the
word
sense
tags
for
all
content
words
included
in
Lexeed
as
headwords
,
in
each
input
sentence
"
.
This is a kind of all-words task; its distinguishing feature is that we focus on the fundamental vocabulary (basic words) in Lexeed and ignore other words.
We
use
Lexeed
as
the
sense
inventory
.
There
are
two
problems
in
resolving
the
task
:
how
to
build
the
model
and
how
to
assign
the
word
sense
by
using
the
model
for
disambiguating
the
senses
.
We
describe
the
word
sense
selection
model
we
use
in
section
4
and
the
method
of
word
sense
assignment
in
section
5
.
4
Word
Sense
Selection
Model
All
content
words
(
i.e.
basic
words
)
in
Lexeed
are
classified
into
six
groups
by
part-of-speech
:
noun
,
verb
,
verbal
noun
,
adjective
,
adverb
,
others
.
We
treat
the
first
five
groups
as
targets
of
disambiguating
senses
.
We
build
five word sense models
corresponding
to
these
groups
.
A model contains senses for various words; however, the features for one word are kept distinct from those for other words, so that senses irrelevant to a target word are not selected.
For
example
,
an
n-gram
feature
following
a
target
word
"
has-a-tail
"
for
dog
is
distinct
from
that
for
cat
.
In
the
remainder
of
this
section
,
we
describe
the
features
used
in
the
word
sense
disambiguation
.
First
we
used
simple
n-gram
collocations
,
then
a
bag
of
words
of
all
words
occurring
in
the
sentence
.
This
was
then
enhanced
by
using
ontological
information
and
predicate
argument
relations
.
4.1
Word
Collocations
Word
collocations
(
WORD-Col
)
are
basic
and
effective
cues
for
WSD
.
They
can
be
modelled
by
n-gram
and
bag
of
words
features
,
which
are
easily
extracted
from
a
corpus
.
We
used
all
unigrams
,
bigrams
and
trigrams
which
precede
and
follow
the
target
words
(
N-gram
)
and
all
content
words
in
the
sentences
where
the
target
words
occur
(
BOW
)
.
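The two WORD-Col feature types can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `word_col_features`, the feature string format, and the choice to prefix every feature with the target word (so that features for one word stay distinct from those for another) are all assumptions made for this example.

```python
from typing import List


def word_col_features(tokens: List[str], i: int, is_content: List[bool],
                      max_n: int = 3) -> List[str]:
    """Collect n-gram and bag-of-words features for the target word tokens[i].

    Hypothetical helper: the feature naming scheme is an assumption.
    """
    target = tokens[i]
    feats = []
    for n in range(1, max_n + 1):
        # n-gram immediately preceding the target word
        if i - n >= 0:
            feats.append(f"{target}|prev{n}=" + "_".join(tokens[i - n:i]))
        # n-gram immediately following the target word
        if i + n <= len(tokens) - 1:
            feats.append(f"{target}|next{n}=" + "_".join(tokens[i + 1:i + 1 + n]))
    # bag of words: every other content word in the same sentence
    for j, tok in enumerate(tokens):
        if is_content[j] and j != i:
            feats.append(f"{target}|bow={tok}")
    return feats


# Usage on the romanized definition sentence, with "unten" as the target.
tokens = ["densha", "ya", "jidousha", "o", "unten", "suru", "hito"]
is_content = [True, False, True, False, True, False, True]
feats = word_col_features(tokens, 4, is_content)
```

Prefixing each feature with the target word is one simple way to keep a shared per-POS model from mixing evidence across words.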
[Table 2: Example semantic collocation features (SEM-Col) extracted from the word sense tagged corpus, the dictionary (Lexeed and Goi-Taikei) and the ontology, which have the word senses and the semantic classes linked to the semantic tags. Columns: #, sample features.]
4.2
Semantic
Features
We
use
the
semantic
information
(
sense
tags
and
ontologies
)
in
two
ways
.
One
is
to
enhance
the
collocations
and
the
other
is
to
enhance
dependency
relations
.
Word
surface
features
like
N-gram
and
BOW
inevitably
suffer
from
data
sparseness; therefore, we
generalize
them
to
more
abstract
words
or
concepts
and
also
consider
words
having
the
same
meanings
.
We
used
the
ontology
described
in
Section
2.3
to
get
hypernyms
and
synonyms
and
the
Goi-Taikei
thesaurus
to
abstract
the
words
to
the
semantic
classes
.
The
superordinate
classes
at
level
3
,
4
and
5
are
also
added
in
addition
to
the
original
semantic
class
.
For
example
,
densha
"
train
"
and
jidousha
"
automobile
"
are
both
generalized
to
the
semantic
class
(
C988
:
land
vehicle
)
(
level
7
)
.
The
superordinate
classes
are
also
used
:
(
C706
:
inanimate
)
(
level
3
)
,
(
C760
:
artifact
)
(
level
4
)
and
(
C986
:
vehicle
)
(
level
5
)
.
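The class generalization just described can be sketched as a lookup over a class hierarchy. The tables below are a toy fragment containing only the classes and levels named in the text; the dictionary layout, the function name `class_features` and the feature strings are assumptions for illustration, not the Goi-Taikei data format.

```python
from typing import Dict, List, Tuple

# Toy fragment (assumption): word -> semantic class, and
# class -> {level: superordinate class at that level}.
WORD_CLASS: Dict[str, str] = {"densha": "C988", "jidousha": "C988"}
ANCESTORS: Dict[str, Dict[int, str]] = {
    "C988": {3: "C706", 4: "C760", 5: "C986"},  # land vehicle (level 7)
}


def class_features(word: str, levels: Tuple[int, ...] = (3, 4, 5)) -> List[str]:
    """Return the word's own class plus its superordinate classes at the given levels."""
    cls = WORD_CLASS.get(word)
    if cls is None:
        return []  # word not covered by the toy tables
    feats = [f"class={cls}"]
    for level in levels:
        ancestor = ANCESTORS.get(cls, {}).get(level)
        if ancestor is not None and ancestor != cls:
            feats.append(f"class_l{level}={ancestor}")
    return feats


train_feats = class_features("densha")
car_feats = class_features("jidousha")
```

Because both words map to the same class, they produce identical feature lists, which is the smoothing effect the generalization is meant to provide.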
The
semantic
dependency
features
are
based
on
a
predicate
and
its
arguments
taken
from
the
elementary
dependencies
.
For
example
,
consider
the
semantic
dependency
representation
for
densha ya jidousha-wo unten suru hito "a person who drives a train or car" given in Figure 4.
[Table 3: Example semantic features extracted from the dependency tree in Figure 4. The first column numbers the feature template corresponding to each example.]
The
predicate
unten
"
drive
"
,
has
two
arguments
:
arg1
hito
"
person
"
and
arg2
ya
"
or
"
.
The
coordinate
conjunction
is
expanded
out
into
its
children
,
giving
arg2
densha
"
train
"
and
jidousha
"
automobile
"
.
From these, we produce several features, a sample of which is shown in Table 3.
One
has
all
arguments
and
their
labels
(
D11
)
.
We
also
produce
various
back-offs
,
for
example
the
predicate
with
only
one
argument
at
a
time
(
D1-D3
)
.
Each
combination
of
predicate
and
its
related
argument
(
s
)
becomes
a
feature
.
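A minimal sketch of this feature generation, assuming that an expanded coordination yields one predicate-argument frame per conjunct. The function names, the string format and that expansion policy are assumptions; the actual templates (D1 to D11) are listed in Table 3 and are not reproduced here.

```python
from itertools import product
from typing import Dict, List


def expand_coordination(args: Dict[str, List[str]]) -> List[Dict[str, str]]:
    """Turn {label: [conjunct, ...]} into one {label: filler} frame per combination."""
    labels = sorted(args)
    return [dict(zip(labels, combo))
            for combo in product(*(args[label] for label in labels))]


def dependency_features(pred: str, args: Dict[str, List[str]]) -> List[str]:
    """Full-frame features plus single-argument back-offs for one predicate."""
    feats = set()
    for frame in expand_coordination(args):
        # full frame: the predicate with all arguments and their labels
        full = ",".join(f"{label}:{word}" for label, word in sorted(frame.items()))
        feats.add(f"{pred}({full})")
        # back-offs: the predicate with one labelled argument at a time
        for label, word in frame.items():
            feats.add(f"{pred}({label}:{word})")
    return sorted(feats)


# Usage on the example from the text: unten "drive" with arg1 hito and
# arg2 expanded from the coordination into densha and jidousha.
dep_feats = dependency_features("unten", {"arg1": ["hito"],
                                          "arg2": ["densha", "jidousha"]})
```

Collecting the features in a set removes the duplicate back-offs that arise when two frames share an argument.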
For
the
next
class
of
features
,
we
used
the
sense
information
from
the
corpus
combined
with
the
semantic
classes
in
the
dictionary
to
replace
each
predicate
by
its
disambiguated
sense
,
its
hypernym
,
its
synonym
(
if
any
)
and
its
semantic
class
.
The semantic classes for densha and jidousha are both (988: land vehicle), while unten is (2003: motion) and hito is (4: human). We also expand jidousha into its synonym motaka "motor car".
The
semantic
class
features
provide
a
semantic
smoothing
,
as
words
are
binned
into
the
2,700
classes
.
The
hypernym
/
synonym
features
provide
even
more
smoothing
.
Both
have
the
effect
of
making
more
training
data
available
for
the
disambiguator
.
Domain
information
is
a
simple
and
sometimes
strong
cue
for
disambiguating
the
target
words
(
Gliozzo
et
al.
,
2005
)
.
For
instance
,
the
sense
of
the
word
"
record
"
is
likely
to
be
different
in
the
musical
context
,
which
is
recalled
by
domain-specific
words
like
"
orchestra
"
,
"
guitar
"
,
than
in
the
sporting
context
.
We
use
12
domain
categories
like
"
culture
/
art
"
,
"
sport
"
,
etc.
which
are
similar
to
ones
used
in
directory
search
web
sites
.
About
6,000
words
are
automatically
classified
into
one
of
12
domain
categories
by
distributions
in
web
sites
(
Hashimoto
and
Kurohashi
,
2007
)
and
10
%
of
them
are
manually
checked
.
Polysemous
words
which
belong
to
multiple
domains
and
neutral
words
are
not
classified
into
any
domain
.
5
Search
Algorithm
The
conditional
probability
of
the
word
sense
for
each
word
is
given
by
the
word
sense
selection
model
described
in
Section
4
.
In
the
initial
state
,
some
of
the
semantic
features
,
e.g.
semantic
collocations
(
SEM-Col
)
and
word
sense
extensions
for
semantic
dependencies
(
SEM-Dep
)
are
not
available
,
since
no
word
senses
for
polysemous
words
have
been
determined
.
It is not practical to enumerate all combinations of word senses for the target words; therefore, we first decide the sense of the word that is most plausible among all the ambiguous words, and then disambiguate the next word using that sense.
We
use
the
beam
search
algorithm
,
which
is
similar
to
that
used
for decoding in
statistical
machine
translation
(
Watanabe
,
2004
)
,
for
finding
the
plausible
combination
of
word
sense
tags
.
1. Create an initial node N0 = [T0, W0] (T0 = {}, W0 = {}) and insert the node into an initial queue Q0.
2. For each node N in the queue Q, do the following steps.
3. If W in the top node of the queue Q is empty, adopt its T as the combination of word senses and terminate.
4. Otherwise, pick out the top b nodes from Q and insert them into a new queue Q, then go back to step 2.
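The search above can be sketched as follows. The per-node expansion steps are not spelled out in the text, so this sketch assumes that T holds the senses fixed so far, that W holds the words still to be disambiguated (initially all ambiguous words), and that a node is expanded by fixing one more word to one of its candidate senses. `beam_search`, `toy_log_prob` and the toy probabilities are hypothetical stand-ins for the trained word sense selection model.

```python
import math
from typing import Callable, Dict, List, Sequence


def beam_search(words: Sequence[str],
                candidates: Dict[str, List[str]],
                log_prob: Callable[[str, str, Dict[str, str]], float],
                beam: int = 3) -> Dict[str, str]:
    """Return one plausible {word: sense} assignment.

    Assumes the words are distinct and each has at least one candidate sense.
    """
    # Node = (total log-probability, T = fixed (word, sense) pairs, W = remaining words)
    queue = [(0.0, (), tuple(words))]
    while queue[0][2]:  # stop once the top node has no words left
        best = {}
        for total, fixed, remaining in queue:
            for word in remaining:
                rest = tuple(w for w in remaining if w != word)
                for sense in candidates[word]:
                    new_fixed = tuple(sorted(fixed + ((word, sense),)))
                    new_total = total + log_prob(word, sense, dict(fixed))
                    # keep only the best-scoring route to each partial assignment
                    if new_total > best.get(new_fixed, (-math.inf,))[0]:
                        best[new_fixed] = (new_total, new_fixed, rest)
        # keep the top b nodes for the next round
        queue = sorted(best.values(), key=lambda node: node[0], reverse=True)[:beam]
    return dict(queue[0][1])


def toy_log_prob(word: str, sense: str, fixed: Dict[str, str]) -> float:
    """Toy model (assumption): sense b1 becomes likely once a is fixed to a1."""
    if word == "b":
        if fixed.get("a") == "a1":
            return math.log(0.9 if sense == "b1" else 0.1)
        return math.log(0.5)
    return math.log(0.6 if sense == "a1" else 0.4)


result = beam_search(["a", "b"], {"a": ["a1", "a2"], "b": ["b1", "b2"]}, toy_log_prob)
```

In the toy run the search fixes the more confident word first and then uses that sense when scoring the other word, which mirrors the motivation given above for not scoring all sense combinations.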
6
Evaluation
We
trained
and
tested
on
the
Lexeed
Dictionary
Definition
(
LXD-DEF
)
and
Example
sections
(
LXD-EX
)
of
the
Hinoki
corpus
(
Bond
et
al.
,
2007
)
.
These
have
about
75,000
definition
and
46,000
example
sentences
respectively
.
Of these, some 54,000 and 36,000 sentences respectively are treebanked, i.e., they have syntactic trees and structural semantic information.
We
used
these
sentences
with
the
complete
information
and
selected
1,000
sentences
out
of
each
sentence
class
as
test
sets
(
LXD-DEFtest
,
LXD-EXtest
)
,
and
the
remainder
is
combined
and
used
as
a
training
set
(
LXD-ALL
)
.
We
also
tested
1,000
sentences
from
the
Kyoto
Corpus
of
newspaper
text
(
KYOTOtest
)
.
These
sentences
have
between
3.4
(
LXD-EXtest
)
and
5.2
(
KYOTOtest
)
polysemous
words
per
sentence
on
average
.
We
use
a
maximum
entropy
/
minimum
divergence
(
MEMD
)
modeler
to
train
the
word
sense
selection
model
.
We
use
the
open-source
Maximum Entropy Modeling Toolkit
for
training
,
determining
best-performing
convergence
thresholds
and
prior
sizes
experimentally
.
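For orientation, a conditional maximum entropy / minimum divergence model is usually written in the following general form. The symbols are assumptions introduced here, not notation from this paper: s is a candidate sense, c the context of the target word, f_i the feature functions, lambda_i their learned weights, and q_0 a reference distribution.

```latex
p(s \mid c) =
  \frac{q_0(s \mid c)\,\exp\!\left(\sum_i \lambda_i f_i(s, c)\right)}
       {\sum_{s'} q_0(s' \mid c)\,\exp\!\left(\sum_i \lambda_i f_i(s', c)\right)}
```

With a uniform q_0 this reduces to the ordinary maximum entropy (log-linear) model.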
Models for the five different POSs were trained on each training set: the base model is the word collocation model (WORD-Col), and the semantic models are built by adding semantic collocation (SEM-Col), semantic dependency (SEM-Dep) or domain features to WORD-Col (+SEM-Col, +SEM-Dep and +DOMAIN).
[Figure 5: Learning Curve. X-axis: size of training corpus (partition).]
7
Results
and
Discussion
Table 4 shows the precision of word sense disambiguation on the combination of LXD-DEF and LXD-EX (LXD-ALL).
The
baseline
method
selects
the
senses
occurring
most
frequently
in
the
training
corpus
.
Each row gives the results using the baseline, word collocation (WORD-Col), the combination of WORD-Col and one of the semantic features (+SEM-Col, +SEM-Dep and +DOMAIN; e.g., +SEM-Col gives the results using WORD-Col and SEM-Col), and all features (FULL).
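The most-frequent-sense baseline mentioned above can be sketched in a few lines. The function name `train_baseline`, the (word, sense) pair input and the tie-breaking rule (first sense seen wins) are assumptions for this illustration, and the sense labels in the usage lines are made up.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def train_baseline(tagged: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Map each word to the sense it carries most often in the training data."""
    counts = defaultdict(Counter)
    for word, sense in tagged:
        counts[word][sense] += 1
    # most_common(1) returns the highest count; ties go to the sense seen first
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}


# Usage with made-up sense labels.
baseline = train_baseline([("ookii", "1"), ("ookii", "5"),
                           ("ookii", "5"), ("unten", "1")])
```

At test time the baseline ignores context entirely and returns the stored sense for each word.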
There
are
significant
improvements
over
the
baseline
and
the
other
results
on
all
corpora
.
Basic
word
collocation
features
(
WORD-Col
)
give
a
vast
improvement
.
Extending
this
by
using
the
ontological
information
(
+SEM-Col
)
gives
a
further
improvement
over
the
WORD-Col
.
Adding
the
predicate-argument
relationships
(
+SEM-Dep
)
improves
the
results
even
more
.
Table
6
shows
the
statistics
of
the
target
corpora
.
The
best
result
of
LXD-DEFtest
(
80.7
%
)
surpasses
the
inter-annotator
agreement
(
78.7
%
)
in
building
the
Hinoki
Sensebank
.
However
,
there
is
a
wide
gap
between
the
best
results
of
KYOTOtest
(
60.4
%
)
and
the
inter-annotator
agreement
(
83.3
%
)
. This suggests that other information, such as the semantic classes for named entities (including proper nouns and multiword expressions (MWE)) and broader contexts, is required.
However
,
a
model
built
on
dictionary
sentences
lacks
these
features
.
Even so, there is some improvement.
The
domain
features
(
+DOMAIN
)
give a small contribution
to
the
precision
,
since
only
intra-sentence
context
is
counted
in
this
experiment
.
Unfortunately, dictionary
definition
and
example
sentences
do
not
really
have
a
useful
context
.
We
expect
broader
context
should
make
the
domain
features
more
effective
for
the
newspaper
text
(
e.g.
as
in
Stevenson
(
2003
)
)
. Table 5 shows a comparison of the results for different POSs.
The
semantic
features
(
+SEM-Col
and
+SEM-Dep
)
are
particularly
effective
for
verbs
and
also
give
moderate
improvements
on
the
results
of
the
other
POSs
.
Figure 5 shows the precision on LXD-DEFtest as the size of the training corpus, which is divided into five partitions, is varied. The precision saturates when four partitions (264,000 tokens) are used.
These
results
of
the
dictionary
sentences
are
close
to
the
best
published
results
for
the
SENSEVAL-2
task
(
79.3
%
by
Murata
et
al.
(
2003
)
using
a
combination
of
simple
Bayes
learners
)
.
However
,
we
are
using
a
different
sense
inventory
(
Lexeed
not
Iwanami
(
Nishio
et
al.
,
1994
)
)
and
testing
over
a
different
corpus
,
so
the
results
are
not
directly
comparable
.
In
future
work
,
we
will
test
over
SENSEVAL-2
data
so
that
we
can
compare
directly
.
None
of
the
SENSEVAL-2
systems
used
ontological
information
,
despite
the
fact
that
the
dictionary
definition
sentences
were
made
available
,
and
there
are
several
algorithms
describing
how
to
extract
such
information
from
MRDs
(Tsurumaru et al., 1991; Wilkes et al., 1996; Nichols et al., 2005).
[Table 4: The Precision of WSD. Columns: LXD-DEFtest, LXD-EXtest, KYOTOtest. Rows include Baseline and WORD-Col.]
We
hypothesize
that
this
is
partly
due
to
the
way
the
task
is
presented
:
there
was
not
enough
time
to
extract
and
debug
an
ontology
as
well
as
build
a
disambiguation
system
,
and
there
was
no
ontology
distributed
.
The
CRL
system
(
Murata
et
al.
,
2003
)
used
a
syntactic
dependency
parser
as
one
source
of
features
(
KNP
:
Kurohashi
and
Nagao
(
2003
)
); removing
it
decreased
performance
by
around
0.6
%
.
8
Conclusions
We
used
the
Hinoki
corpus
to
test
the
importance
of
lexical
and
structural
information
in
word
sense
disambiguation
.
We
found
that
basic
n-gram
features
and
collocations
provided
a
great
deal
of
useful
information
,
but
that
better
results
could
be
gained
by
using
ontological
information
and
semantic
dependencies
.
Acknowledgements
We
would
like
to
thank
the
other
members
of
the
NTT Natural Language Research Group, NTT Communication Science Laboratories,
for
their
support
.
We
would
also
like
to
express
gratitude
to
the
reviewers
for
their
valuable
comments
and
Professor
Zeng
Guangping
,
Wang
Daliang
and
Shen
Bin
of
the
University
of
Science
and
Technology
Beijing
(
USTB
)
for
building
the
demo
system
.
