This
paper
describes
a
probabilistic
model
for
coordination
disambiguation
integrated
into
syntactic
and
case
structure
analysis
.
Our
model
probabilistically
assesses
the
parallelism
of
a
candidate
coordinate
structure
using
syntactic
/
semantic
similarities
and
cooccurrence
statistics
.
We
integrate
these
probabilities
into
the
framework
of
fully-lexicalized
parsing
based
on
large-scale
case
frames
.
This
approach
simultaneously
addresses
two
tasks
of
coordination
disambiguation
:
the
detection
of
coordinate
conjunctions
and
the
scope
disambiguation
of
coordinate
structures
.
Experimental
results
on
web
sentences
indicate
the
effectiveness
of
our
approach
.
1
Introduction
Coordinate
structures
are
a
potential
source
of
syntactic
ambiguity
in
natural
language
.
Since
their
interpretation
directly
affects
the
meaning
of
the
text
,
their
disambiguation
is
important
for
natural
language
understanding
.
Coordination
disambiguation
consists
of
the
following
two
tasks
:
• detecting coordinate conjunctions, and
• finding the scope of coordinate structures.
In
English
,
for
example
,
coordinate
structures
are
triggered
by
coordinate
conjunctions
,
such
as
and
and
or
.
In
a
coordinate
structure
that
consists
of
more
than
two
conjuncts
,
commas
,
which
have
various
usages
,
also
function
like
coordinate
conjunctions
.
Recognizing
true
coordinate
conjunctions
from
such
possible
coordinate
conjunctions
is
a
task
of
coordination
disambiguation
(
Kurohashi
,
1995
)
.
The
other
is
the
task
of
identifying
the
range
of
coordinate
phrases
or
clauses
.
Previous
work
on
coordination
disambiguation
has
focused
on
the
task
of
addressing
the
scope
ambiguity
(
e.g.
,
(
Agarwal
and
Boggess
,
1992
;
Goldberg
,
1999
;
Resnik
,
1999
;
Chantree
et
al.
,
2005
)
)
.
Kurohashi
and
Nagao
proposed
a
similarity-based
method
to
resolve
both
of
the
two
tasks
for
Japanese
(
Kurohashi
and
Nagao
,
1994
)
.
Their
method
,
however
,
heuristically
detects
coordinate
conjunctions
by
considering
only
similarities
between
possible
conjuncts
,
and
thus
cannot
disambiguate
the
following
cases¹
:
b. kanojo-to watashi-ga goukaku-shita
   she-cnj I-nom passed an exam
In
sentence
(
1a
)
,
postposition
"
to
"
is
used
as
a
comitative
case
marker
,
but
in
sentence
(
1b
)
,
postposition
"
to
"
is
used
as
a
coordinate
conjunction
.
To
resolve
this
ambiguity
,
predicative
case
frames
are
required
.
Case frames describe what kinds of nouns are related to each predicate.

¹ In this paper, we use the following abbreviations: nom (nominative), acc (accusative), abl (ablative), cmi (comitative), cnj (conjunction) and TM (topic marker).

Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pp. 306-314, Prague, June 2007. © 2007 Association for Computational Linguistics

Table 1: Case frame examples. (Examples are written in English. Numbers following each example represent its frequency.)
For
example
,
a
case
frame
of
"
iku
"
(
go
)
has
a
"
to
"
case
slot
filled
with
the
examples
such
as
"
kanojo
"
(
she
)
or
human
.
On
the
other
hand
,
"
goukaku-suru
"
(
pass
an
exam
)
does
not
have
a
"
to
"
case
slot
but
does
have
a
"
ga
"
case
slot
filled
with
"
kanojo
"
(
she
)
and
"
watashi
"
(
I
)
.
These
case
frames
provide
the
information
for
disambiguating
the
postpositions
"
to
"
in
sentences
(
1a
)
and
(
1b
)
:
(
1a
)
is
not
coordinate
and
(
1b
)
is
coordinate
.
This
paper
proposes
a
method
for
integrating
coordination
disambiguation
into
probabilistic
syntactic
and
case
structure
analysis
.
This
method
simultaneously
addresses
the
two
tasks
of
coordination
disambiguation
by
utilizing
syntactic
/
semantic
parallelism
in
possible
coordinate
structures
and
lexical
preferences
in
large-scale
case
frames
.
We
use
the
case
frames
that
were
automatically
constructed
from
the
web
(
Table
1
)
.
In
addition
,
cooccurrence
statistics
of
coordinate
conjuncts
are
incorporated
into
this
model
.
2
Related
Work
Previous
work
on
coordination
disambiguation
has
focused
mainly
on
finding
the
scope
of
coordinate
structures
.
Agarwal
and
Boggess
proposed
a
method
for
identifying
coordinate
conjuncts
(
Agarwal
and
Boggess
,
1992
)
.
Their
method
simply
matches
parts
of
speech
and
hand-crafted
semantic
tags
of
the
head
words
of
the
coordinate
conjuncts
.
They
tested
their
method
using
the
Merck
Veterinary
Manual
and
found
their
method
had
an
accuracy
of
81.6
%
.
Resnik
described
a
similarity-based
approach
for
coordination
disambiguation
of
nominal
compounds
(
Resnik
,
1999
)
.
He
proposed
a
similarity
measure
based
on
the
notion
of
shared
information
content
.
He
conducted
several
experiments
using
the
Penn
Treebank
and
reported
an
F-measure
of
approximately
70
%
.
Goldberg
applied
a
cooccurrence-based
probabilistic
model
to
determine
the
attachments
of
ambiguous
coordinate
phrases
with
the
form
"
n1
p
n2
cc
n3
"
(
Goldberg
,
1999
)
.
She
collected
approximately
120K
unambiguous
pairs
of
two
coordinate
words
from
a
raw
newspaper
corpus
for
a
one-year
period
and
estimated
parameters
from
these
statistics
.
Her
method
achieved
an
accuracy
of
72
%
using
the
Penn
Treebank
.
Chantree
et
al.
presented
a
binary
classifier
for
coordination
ambiguity
(
Chantree
et
al.
,
2005
)
.
Their
model
is
based
on
word
distribution
information
obtained
from
the
British
National
Corpus
.
They
achieved
an
F-measure
(
β
=
0.25
)
of
47.4
%
using
their
own
test
set
.
The
previously
described
methods
focused
on
coordination
disambiguation
.
Some
research
has
been
undertaken
that
integrated
coordination
disambiguation
into
parsing
.
Kurohashi
and
Nagao
proposed
a
Japanese
parsing
method
that
included
coordinate
structure
detection
(
Kurohashi
and
Nagao
,
1994
)
.
Their
method
first
detects
coordinate
structures
in
a
sentence
,
and
then
heuristically
determines
the
dependency
structure
of
the
sentence
under
the
constraints
of
the
detected
coordinate
structures
.
Their
method
correctly
analyzed
97
Japanese
sentences
out
of
150
.
Charniak
and
Johnson
used
some
features
of
syntactic
parallelism
in
coordinate
structures
for
their
MaxEnt
reranking
parser
(
Charniak
and
Johnson
,
2005
)
.
The
reranker
achieved
an
F-measure
of
91.0
%
,
which
is
higher
than
that
of
their
generative
parser
(
89.7
%
)
.
However, they used a large number of features, and the contribution of the parallelism features is unknown.

Table 2: Expressions that indicate coordinate structures.
(a) coordinate noun phrase: , (comma), to, ya, toka, katsu, oyobi, ka, aruiwa, ...
(b) coordinate predicative clause: -shi, ga, oyobi, ka, aruiwa, matawa, ...
(c) incomplete coordinate structure: , (comma), oyobi, narabini, aruiwa, ...
Dubey
et
al.
proposed
an
unlexicalized
PCFG
parser
that
modified
PCFG
probabilities
to
condition
the
existence
of
syntactic
parallelism
(
Dubey
et
al.
,
2006
)
.
They
obtained
an
F-measure
increase
of
0.4
%
over
their
baseline
parser
(
73.0
%
)
.
Experiments
with
a
lexicalized
parser
were
not
conducted
in
their
work
.
A
number
of
machine
learning-based
approaches
to
Japanese
parsing
have
been
developed
.
Among
them
,
the
best
parsers
are
the
SVM-based
dependency
analyzers
(
Kudo
and
Matsumoto
,
2002
;
Sassano
,
2004
)
.
In
particular
,
Sassano
added
some
features
to
improve
his
parser
by
enabling
it
to
detect
coordinate
structures
(
Sassano
,
2004
)
.
However
,
the
added
features
did
not
contribute
to
improving
the
parsing
accuracy
.
This
failure
can
be
attributed
to
the
inability
to
consider
global
parallelism
.
3
Coordination
Ambiguity
in
Japanese
In
Japanese
,
the
bunsetsu
is
a
basic
unit
of
dependency
that
consists
of
one
or
more
content
words
and
the
following
zero
or
more
function
words
.
A
bunsetsu
corresponds
to
a
base
phrase
in
English
and
"
eojeol
"
in
Korean
.
Coordinate
structures
in
Japanese
are
classified
into
three
types
.
The
first
type
is
the
coordinate
noun
phrase
.
(2) nagai enpitsu-to keshigomu-wo katta
    long pencil-cnj eraser-acc bought
    (bought a long pencil and an eraser)
We
can
find
these
phrases
by
referring
to
the
words
listed
in
Table
2-a
.
The
second
type
is
the
coordinate
predicative
clause
,
in
which
two
or
more
predicates
form
a
coordinate
structure
.
Figure 1: Method using triangular matrix.
(3) kanojo-to kekkon-shi ie-wo katta
    she-cmi married house-acc bought
    (married her and bought a house)
We
can
find
these
clauses
by
referring
to
the
words
and
ending
forms
listed
in
Table
2-b
.
The
third
type
is
the
incomplete
coordinate
structure
,
in
which
some
parts
of
coordinate
predicative
clauses
are
present
.
We
can
find
these
structures
by
referring
to
the
words
listed
in
Table
2-c
and
also
the
correspondence
of
case-marking
postpositions
.
For
all
of
these
types
,
we
can
detect
the
possibility
of
a
coordinate
structure
by
looking
for
a
coordination
key
bunsetsu
that
accompanies
one
of
the
words
listed
in
Table
2
(
in
total
,
we
have
52
coordination
expressions
)
.
That
is
to
say
,
the
left
and
right
sides
of
a
coordination
key
bunsetsu
constitute
possible
pre- and
post-conjuncts
,
and
the
key
bunsetsu
is
located
at
the
end
of
the
pre-conjunct
.
The
size
of
the
conjuncts
corresponds
to
the
scope
of
the
coordination
.
4
Calculating
Similarity
between
Possible
Coordinate
Conjuncts
We
assess
the
parallelism
of
potential
coordinate
structures
in
a
probabilistic
parsing
model
.
In this section, we describe a method for calculating similarities between potential coordinate conjuncts.

Figure 2: Example of calculating path scores. (Translation of the example sentence: "Programming language requires descriptive power to express an algorithm for solving problems and a framework to sufficiently drive functions of a computer.")
Much previous work on coordination disambiguation measured the similarity between potential pre- and post-conjuncts using only the similarity between the conjoined heads. However, other components of the conjuncts also exhibit similarity and, furthermore, structural parallelism.
Therefore
,
we
use
a
method
to
calculate
the
similarity
between
two
whole
coordinate
conjuncts
(
Kurohashi
and
Nagao
,
1994
)
.
The
remainder
of
this
section
contains
a
brief
description
of
this
method
.
To calculate the similarity between two series of bunsetsus, a triangular matrix, $A$, is used (illustrated in Figure 1):

$$A = (a(i,j)) \quad (1 \le i \le l;\ i \le j \le l) \tag{1}$$

where $l$ is the number of bunsetsus in a sentence, diagonal element $a(i,i)$ is the $i$-th bunsetsu, and element $a(i,j)$ $(i < j)$ is the similarity value between bunsetsus $b_i$ and $b_j$.
A
similarity
value
between
two
bunsetsus
is
calculated
on
the
basis
of
POS
matching
,
exact
word
matching
,
and
their
semantic
closeness
in
a
thesaurus
tree
(
Kurohashi
and
Nagao
,
1994
)
.
We
use
the
Bunruigoihyo
thesaurus
,
which
contains
96,000
Japanese
words
(
The
National
Institute
for
Japanese
Language
,
2004
)
.
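The matrix construction can be sketched as follows. This is only an illustration under stated assumptions: the `Bunsetsu` record, the point values in `similarity`, and the helper names are hypothetical, indices are 0-based, and the thesaurus-based closeness term of the actual similarity measure is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bunsetsu:
    word: str  # head content word (hypothetical representation)
    pos: str   # part of speech of the head word


def similarity(x: Bunsetsu, y: Bunsetsu) -> int:
    """Toy similarity: assumed points for POS and exact word matches only."""
    score = 0
    if x.pos == y.pos:
        score += 2      # assumed POS-match points
        if x.word == y.word:
            score += 3  # assumed exact-word-match points
    return score


def triangular_matrix(bs):
    """Upper triangle {(i, j): similarity} for 0 <= i < j < len(bs)."""
    return {(i, j): similarity(bs[i], bs[j])
            for i in range(len(bs)) for j in range(i + 1, len(bs))}


def partial_matrix(a, n, l):
    """Upper-right part for key bunsetsu n: rows 0..n, columns n+1..l-1."""
    return {(i, j): a[(i, j)] for i in range(n + 1) for j in range(n + 1, l)}


# Sentence (2): nagai enpitsu-to keshigomu-wo katta
bs = [Bunsetsu("nagai", "adj"), Bunsetsu("enpitsu", "noun"),
      Bunsetsu("keshigomu", "noun"), Bunsetsu("katta", "verb")]
a = triangular_matrix(bs)
a_n = partial_matrix(a, 1, len(bs))  # key bunsetsu: enpitsu-to
```

With these toy points, the only non-zero element of the partial matrix is the one pairing "enpitsu" with "keshigomu".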
To detect a coordinate structure involving a key bunsetsu, $b_n$, we consider only a partial matrix (denoted $A_n$), that is, the upper right part of $b_n$ (Figure 1).
For potential pre- and post-conjuncts, a path is defined as follows:

$$\mathrm{path} ::= (a(p_1, m),\ a(p_2, m-1),\ \ldots,\ a(p_{m-n}, n+1)) \tag{2}$$

where $n+1 \le m \le l$, $a(p_1, m) \ne 0$, $p_1 = n$, and $p_i \ge p_{i+1}$ $(1 \le i \le m-n-1)$.
That is, a path represents a series of elements from a non-zero element in the lowest row of $A_n$ to an element in the leftmost column of $A_n$. The path has exactly one element in each column and extends toward the upper left. The series of bunsetsus on the left side of the path and the series under the path are potential conjuncts for key $b_n$. Figure 2 shows an example of a path.
A
path
score
is
defined
based
on
the
following
criteria
:
•
the
sum
of
each
element
's
points
on
the
path
•
penalty
points
when
the
path
extends
non-diagonally
(
which
causes
conjuncts
of
unbalanced
lengths
)
•
bonus
points
on
expressions
signaling
the
beginning
or
ending
of
a
coordinate
structure
,
such
as
"
kaku
"
(
each
)
and
"
nado
"
(
and
so
on
)
•
the
total
score
of
the
above
criteria
is
divided
by
the
square
root
of
the
number
of
bunsetsus
covered
by
the
path
for
normalization
The
score
of
each
path
is
calculated
using
a
dynamic
programming
method
.
We
consider
each
path
as
a
candidate
of
pre
-
and
post-conjuncts
.
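A minimal sketch of the dynamic programming step, under several assumptions: the matrix is a 0-indexed list of lists in which only the upper triangle is used, the penalty weight is a hypothetical value, and the bonus points for expressions such as "kaku" and "nado" are left out. It finds the best-scoring path that starts at the lowest row of the partial matrix in column `m` and ends in the leftmost column.

```python
import math


def best_path(a, n, m, penalty=1.0):
    """Return (normalized score, first row of pre-conjunct), or None.

    a: similarity matrix (list of lists), n: index of the key bunsetsu,
    m: column where the path starts (last bunsetsu of the post-conjunct).
    """
    if a[n][m] == 0:
        return None  # a path must start from a non-zero element
    # best[p]: best raw score of a partial path whose latest element is in row p
    best = {n: float(a[n][m])}
    for col in range(m - 1, n, -1):        # move left, one element per column
        nxt = {}
        for p, score in best.items():
            for q in range(p, -1, -1):     # stay in the same row or move up
                # a diagonal step moves up exactly one row; others are penalized
                step = score + a[q][col] - penalty * abs((p - q) - 1)
                if step > nxt.get(q, -math.inf):
                    nxt[q] = step
        best = nxt
    results = []
    for p, score in best.items():
        covered = (n - p + 1) + (m - n)    # bunsetsus in pre- and post-conjunct
        results.append((score / math.sqrt(covered), p))
    return max(results)


toy = [[0, 0, 4, 0],
       [0, 0, 1, 3],
       [0, 0, 0, 0],
       [0, 0, 0, 0]]
print(best_path(toy, n=1, m=3))
```

In this toy matrix the diagonal path through `a[1][3]` and `a[0][2]` wins, pairing bunsetsus 0 and 1 with bunsetsus 2 and 3.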
5
Integrated
Probabilistic
Model
for
Syntactic
,
Coordinate
and
Case
Structure
Analysis
This
section
describes
a
method
of
integrating
coordination
disambiguation
into
a
probabilistic
parsing
model
.
The
integrated
model
is
based
on
a
fully-lexicalized
probabilistic
model
for
Japanese
syntactic
and
case
structure
analysis
(
Kawahara
and
Kurohashi
,
2006b
)
.
This
model
gives
a
probability
to
each
possible
dependency
structure
,
T
,
and
case
structure
,
L
,
of
the
input
sentence
,
S
,
and
outputs
the
syntactic
,
coordinate
and
case
structure
that
have
the
highest
probability
.
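That is to say, the model selects the syntactic structure, $T_{best}$, and the case structure, $L_{best}$, that maximize the probability $P(T, L \mid S)$:

$$\begin{aligned} (T_{best}, L_{best}) &= \operatorname*{argmax}_{(T,L)} P(T, L \mid S) \\ &= \operatorname*{argmax}_{(T,L)} \frac{P(T, L, S)}{P(S)} && (3) \\ &= \operatorname*{argmax}_{(T,L)} P(T, L, S) && (4) \end{aligned}$$

The last equation is derived because $P(S)$ is constant.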
The
model
considers
a
clause
as
a
generation
unit
and
generates
the
input
sentence
from
the
end
of
the
sentence
in
turn
.
The probability $P(T, L, S)$ is defined as the product of the probabilities of generating each clause $C_i$ as follows:

$$P(T, L, S) = \prod_{i=1}^{n} P(C_i, \mathit{rel}_{ih_i} \mid C_{h_i}) \tag{5}$$

where $n$ is the number of clauses in $S$, $C_{h_i}$ is the head clause that $C_i$ modifies, and $\mathit{rel}_{ih_i}$ is the dependency relation between $C_i$ and $C_{h_i}$. The main clause, $C_n$, at the end of a sentence does not have a modifying head, so a virtual clause $C_{h_n}$ = EOS (End Of Sentence) is inserted.
Dependency relation $\mathit{rel}_{ih_i}$ is first classified into two types, C (coordinate) and D (normal dependency), and C is further divided into five classes according to the binned similarity (path score) of the conjuncts. Therefore, $\mathit{rel}_{ih_i}$ can be one of the following six classes:

$$\mathit{rel}_{ih_i} \in \{D, C_0, C_1, C_2, C_3, C_4\} \tag{6}$$

For instance, $C_0$ represents a coordinate relation with a similarity of less than 1, and $C_4$ represents a coordinate relation with a similarity of 4 or more.
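The binning can be illustrated with a small helper. The text only fixes the two ends of the scale (below 1 and 4 or more); the unit-width bins in between are an assumption made for this sketch, and the function name is hypothetical.

```python
def coordinate_class(path_score: float) -> str:
    """Map a path score to one of C0..C4 (unit-width bins are assumed)."""
    bin_index = min(4, max(0, int(path_score)))
    return f"C{bin_index}"


for score in (0.7, 2.5, 4.0, 9.3):
    print(score, coordinate_class(score))
```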
Figure 3: Example of probability calculation.
For
example
,
consider
the
sentence
shown
in
Figure
3
.
There
are
four
possible
dependency
structures
in
this
figure
,
and
the
product
of
the
probabilities
for
each
structure
indicated
below
the
tree
is
calculated
.
Finally
,
the
model
chooses
the
structure
with
the
highest
probability
(
in
this
case
T1
is
chosen
)
.
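The selection step amounts to an argmax over candidate structures of a product of per-clause probabilities. The sketch below works in log space to avoid underflow; the candidate names and probability values are invented for illustration and are not taken from Figure 3.

```python
import math


def structure_log_prob(clause_probs):
    """Log of the product of per-clause generative probabilities."""
    return sum(math.log(p) for p in clause_probs)


def select_best(candidates):
    """candidates: {structure name: [per-clause probabilities]}."""
    return max(candidates, key=lambda name: structure_log_prob(candidates[name]))


# Hypothetical per-clause probabilities for four candidate structures.
candidates = {
    "T1": [0.5, 0.4, 0.9],
    "T2": [0.5, 0.1, 0.9],
    "T3": [0.2, 0.4, 0.9],
    "T4": [0.2, 0.1, 0.9],
}
print(select_best(candidates))
```

A zero probability would make `math.log` fail, which is one reason smoothed estimates are needed in practice.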
Clause $C_i$ is decomposed into its clause type, $f_i$ (including the predicate's inflection and function words), and its remaining content part, $C_i'$. Clause $C_{h_i}$ is likewise decomposed into its content part, $C_{h_i}'$, and its clause type, $f_{h_i}$:

$$\begin{aligned} P(C_i, \mathit{rel}_{ih_i} \mid C_{h_i}) &= P(C_i', f_i, \mathit{rel}_{ih_i} \mid C_{h_i}', f_{h_i}) \\ &= P(C_i', \mathit{rel}_{ih_i} \mid f_i, C_{h_i}', f_{h_i}) \times P(f_i \mid C_{h_i}', f_{h_i}) \\ &\approx P(C_i', \mathit{rel}_{ih_i} \mid f_i, C_{h_i}') \times P(f_i \mid f_{h_i}) \end{aligned} \tag{7}$$

Equation (7) is derived because the content part, $C_i'$, is usually independent of the type of its modifying head, $f_{h_i}$, and in most cases the type, $f_i$, is independent of the content part of its modifying head, $C_{h_i}'$.

We call $P(C_i', \mathit{rel}_{ih_i} \mid f_i, C_{h_i}')$ the generative probability of a case and coordinate structure, and $P(f_i \mid f_{h_i})$ the generative probability of a clause type. The latter is the probability of generating function words, including topic markers and punctuation marks, and is estimated from a syntactically annotated corpus in the same way as (Kawahara and Kurohashi, 2006b).
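The generative probability of a case and coordinate structure can be rewritten as follows:

$$\begin{aligned} P(C_i', \mathit{rel}_{ih_i} \mid f_i, C_{h_i}') &= P(C_i' \mid \mathit{rel}_{ih_i}, f_i, C_{h_i}') \times P(\mathit{rel}_{ih_i} \mid f_i, C_{h_i}') \\ &\approx P(C_i' \mid \mathit{rel}_{ih_i}, f_i, C_{h_i}') \times P(\mathit{rel}_{ih_i} \mid f_i) \end{aligned} \tag{8}$$

Equation (8) is derived because dependency relations (coordinate or not) depend heavily on the modifier's type, which includes coordination keys. We call $P(C_i' \mid \mathit{rel}_{ih_i}, f_i, C_{h_i}')$ the generative probability of a case structure, and $P(\mathit{rel}_{ih_i} \mid f_i)$ the generative probability of a coordinate structure. The following two subsections describe these probabilities.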
5.2
Generative
Probability
of
Coordinate
Structure
The most important feature for deciding whether two clauses are coordinate is the coordination key. Therefore, we consider a coordination key, $k_i$, as clause type $f_i$. The generative probability of a coordinate structure, $P(\mathit{rel}_{ih_i} \mid f_i)$, is defined as follows:

$$P(\mathit{rel}_{ih_i} \mid f_i) = P(\mathit{rel}_{ih_i} \mid k_i) \tag{9}$$

We classified coordination keys into 52 classes according to the classification proposed by (Kurohashi and Nagao, 1994). If type $f_i$ does not contain a coordination key, the relation is always D (normal dependency), that is, $P(\mathit{rel}_{ih_i} \mid f_i) = P(D \mid \phi) = 1$.
The
generative
probability
of
a
coordinate
structure
was
estimated
from
a
syntactically
annotated
corpus
using
maximum
likelihood
.
We
used
the
Kyoto
Text
Corpus
(
Kurohashi
and
Nagao
,
1998
)
,
which
consists
of
40K
Japanese
newspaper
sentences
.
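Maximum likelihood estimation here reduces to relative frequencies of relation classes per coordination key. The sketch below assumes the annotated corpus has already been reduced to (key class, relation class) pairs; the pairs shown are invented, not counts from the Kyoto Text Corpus.

```python
from collections import Counter, defaultdict


def estimate_rel_given_key(observations):
    """Relative-frequency estimate of P(rel | key) from (key, rel) pairs."""
    counts = defaultdict(Counter)
    for key, rel in observations:
        counts[key][rel] += 1
    return {key: {rel: c / sum(rels.values()) for rel, c in rels.items()}
            for key, rels in counts.items()}


# Hypothetical observations for the key "to".
observations = [("to", "C2"), ("to", "D"), ("to", "D"), ("to", "C4")]
probs = estimate_rel_given_key(observations)
print(probs["to"])
```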
5.3
Generative
Probability
of
Case
Structure
We
consider
that
a
case
structure
consists
of
a
predicate
,
vi
,
a
case
frame
,
CFi
,
and
a
case
assignment
,
CAk
.
Case
assignment
CAk
represents
correspondences
between
the
input
case
components
and
the
case
slots
shown
in
Figure
4
.
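Thus, the generative probability of a case structure is decomposed as follows:

$$\begin{aligned} P(C_i' \mid \mathit{rel}_{ih_i}, f_i, C_{h_i}') &= P(v_i, CF_i, CA_k \mid \mathit{rel}_{ih_i}, f_i, C_{h_i}') \\ &= P(v_i \mid \mathit{rel}_{ih_i}, f_i, C_{h_i}') \times P(CF_i \mid \mathit{rel}_{ih_i}, f_i, C_{h_i}', v_i) \times P(CA_k \mid \mathit{rel}_{ih_i}, f_i, C_{h_i}', v_i, CF_i) \\ &\approx P(v_i \mid \mathit{rel}_{ih_i}, f_i, w_{h_i}) \times P(CF_i \mid v_i) \times P(CA_k \mid CF_i, f_i) \end{aligned} \tag{10}$$

Figure 4: Example of case assignment.

The above approximation is given because it is natural to consider that the predicate $v_i$ depends on its modifying head $w_{h_i}$ instead of the whole modifying clause, that the case frame $CF_i$ depends only on the predicate $v_i$, and that the case assignment $CA_k$ depends on the case frame $CF_i$ and the clause type $f_i$.

The generative probabilities of case frames and case assignments are estimated from the case frames themselves in the same way as (Kawahara and Kurohashi, 2006b). The remainder of this section describes the generative probability of a predicate, $P(v_i \mid \mathit{rel}_{ih_i}, f_i, w_{h_i})$.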
The
generative
probability
of
a
predicate
captures
cooccurrences
of
coordinate
or
non-coordinate
phrases
.
This
kind
of
information
is
not
handled
in
case
frames
,
which
aggregate
only
predicate-argument
relations
.
The
generative
probability
of
a
predicate
mainly
depends
on
a
coordination
key
in
the
clause
type
,
fi
,
as
well
as
the
generative
probability
of
a
coordinate
structure
.
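We define this probability as follows:

$$P(v_i \mid \mathit{rel}_{ih_i}, f_i, w_{h_i}) \approx P(v_i \mid \mathit{rel}_{ih_i}, k_i, w_{h_i}) \tag{11}$$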
If $C_i'$ is a nominal clause and consists of a noun $n_i$, we consider the following probability instead of equation (10):

$$P(n_i \mid \mathit{rel}_{ih_i}, f_i, w_{h_i}) \approx P(n_i \mid \mathit{rel}_{ih_i}, k_i, w_{h_i}) \tag{12}$$

This is because a noun does not have a case frame or any case components in the current framework.
To
estimate
these
probabilities
,
we
first
applied
a
conventional
parsing
system
with
coordination
disambiguation
to
a
huge
corpus
,
and
collected
coordinate
bunsetsus
from
the
parses
.
We
used
KNP²
(
Kurohashi
and
Nagao
,
1994
)
as
the
parser
and
a
web
corpus
consisting
of
470M
Japanese
sentences
(
Kawahara
and
Kurohashi
,
2006a
)
.
The
generative
probability
of
a
predicate
was
estimated
from
the
collected
coordinate
bunsetsus
using
maximum
likelihood
.
The
proposed
model
considers
all
the
possible
dependency
structures
including
coordination
ambiguities
.
To
reduce
this
high
computational
cost
,
we
introduced
the
CKY
framework
to
the
search
.
Each
parameter
in
the
model
is
smoothed
by
using
several
back-off
levels
in
the
same
way
as
(
Collins
,
1999
)
.
Smoothing
parameters
are
optimized
using
a
development
corpus
.
6
Experiments
We
evaluated
the
coordinate
structures
and
dependency
structures
that
were
output
by
our
model
.
The
case
frames
used
in
this
paper
were
automatically
constructed
from
470M
Japanese
sentences
obtained
from
the
web
.
Some
examples
of
the
case
frames
are
listed
in
Table
1
(
Kawahara
and
Kurohashi
,
2006a
)
.
In
this
work
,
the
parameters
related
to
unlexical
types
are
calculated
from
a
small
tagged
corpus
of
newspaper
articles
,
and
lexical
parameters
are
obtained
from
a
huge
web
corpus
.
To
evaluate
the
effectiveness
of
our
fully-lexicalized
model
,
our
experiments
are
conducted
using
web
sentences
.
As
the
test
corpus
,
we
prepared
759
web
sentences³
.
The
web
sentences
were
manually
annotated
using
the
same
criteria
as
the
Kyoto
Text
Corpus
.
We
also
used
the
Kyoto
Text
Corpus
as
a
development
corpus
to
optimize
the
smoothing
parameters
.
The
system
input
was
automatically
tagged
using
the
JUMAN
morphological
analyzer⁴
.
We
used
two
baseline
systems
for
comparative
purposes
:
the
rule-based
dependency
parser
,
KNP
(
Kurohashi
and
Nagao
,
1994
)
,
and
the
probabilistic
model
of
syntactic
and
case
structure
analysis
(
Kawahara
and
Kurohashi
,
2006b
)
,
in
which
coordination
disambiguation
is
the
same
as
that
of
KNP
.
6.1
Evaluation
of
Detection
of
Coordinate
Structures
First, we evaluated the detection of coordinate structures, namely whether a coordination key bunsetsu triggers a coordinate structure.

³ The test set was not used to construct case frames or to estimate probabilities.

Table 3: Experimental results of detection of coordinate structures. (Columns: baseline, proposed; rows: precision, recall, F-measure.)
Table
3
lists
the
experimental
results
.
The
F-measure
of
our
method
is
slightly
higher
than
that
of
the
baseline
method
(
KNP
)
.
In
particular
,
our
method
achieved
good
precision
.
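For reference, the three measures in Table 3 can be computed as in the sketch below. It assumes that gold and predicted coordinate structures are represented as sets of key-bunsetsu identifiers; this representation and the example identifiers are hypothetical.

```python
def precision_recall_f(gold: set, predicted: set):
    """Precision, recall and balanced F-measure over detected key bunsetsus."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical identifiers of key bunsetsus that trigger coordination.
gold = {1, 2, 3, 4}
predicted = {2, 3, 5}
p, r, f = precision_recall_f(gold, predicted)
```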
6.2
Evaluation
of
Dependency
Parsing
Secondly
,
we
evaluated
the
dependency
structures
analyzed
by
the
proposed
model
.
Evaluating
the
scope
ambiguity
of
coordinate
structures
is
subsumed
within
this
dependency
evaluation
.
The
dependency
structures
obtained
were
evaluated
with
regard
to
dependency
accuracy
—
the
proportion
of
correct
dependencies
out
of
all
dependencies
except
for
the
last
dependency
in
the
sentence
end⁵
.
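The metric can be sketched as follows, assuming each sentence is given as a list of head indices (one per bunsetsu, 0-based, with -1 for the final bunsetsu). This encoding is an assumption made for the example. The last bunsetsu has no head and the second-to-last one always depends on the last, so both positions are left out of the count.

```python
def dependency_accuracy(gold_heads, pred_heads):
    """Return (correct, total) over all dependencies except the last one."""
    if len(gold_heads) != len(pred_heads):
        raise ValueError("gold and predicted analyses differ in length")
    scored = range(max(0, len(gold_heads) - 2))
    correct = sum(1 for i in scored if gold_heads[i] == pred_heads[i])
    return correct, len(scored)


# Hypothetical four-bunsetsu sentence; the last bunsetsu has head -1.
gold = [1, 3, 3, -1]
pred = [2, 3, 3, -1]
correct, total = dependency_accuracy(gold, pred)
print(f"{correct}/{total}")
```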
Table
4
lists
the
dependency
accuracy
.
In
this
table
,
"
syn
"
represents
the
rule-based
dependency
parser
,
KNP
,
"
syn+case
"
represents
the
probabilistic
parser
of
syntactic
and
case
structure
(
Kawahara
and
Kurohashi
,
2006b
)
,
and
"
syn+case+coord
"
represents
our
proposed
model
.
The
proposed
model
significantly
outperformed
both
of
the
baseline
systems
(
McNemar's
test
;
p
<
0.01
)
.
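A significance check of this kind can be sketched with an exact (binomial) McNemar test. The text does not say which variant of the test was used, so the exact form is an assumption, and the discordant counts in the example are invented: `b` counts dependencies that only one system gets right, `c` those that only the other gets right.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the two discordant counts."""
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    # Probability of a split at least this uneven under Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts between two parsers.
p_value = mcnemar_exact(10, 40)
print(p_value < 0.01)
```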
In
the
table
,
the
dependency
accuracies
are
classified
into
four
types
on
the
basis
of
the
bunsetsu
classes
(
PB
:
predicate
bunsetsu
and
NB
:
noun
bunsetsu
)
of
a
dependent
and
its
head
.
"
syn+case
"
outperformed
"
syn
"
.
In particular, the accuracy of predicate-argument relations ("NB→PB") was improved, but the accuracies of "NB→NB" and "PB→PB" decreased.
"
syn+case+coord
"
outperformed
the
two
baselines
for
all
of
the
types
.
Not only the accuracy of predicate-argument relations ("NB→PB") but also the accuracies of coordinate noun/predicate bunsetsus (related to "NB→NB" and "PB→PB") were improved. These improvements result from the integration of coordination disambiguation with syntactic/case structure analysis.

⁵ Since Japanese is head-final, the second-to-last bunsetsu unambiguously depends on the last bunsetsu, and the last bunsetsu has no dependency.

Table 4: Experimental results of dependency parsing.
To
compare
our
results
with
a
state-of-the-art
discriminative
dependency
parser
,
we
input
the
same
test
corpus
into
an
SVM-based
Japanese
dependency
parser
,
CaboCha⁶
(
Kudo
and
Matsumoto
,
2002
)
.
Its
dependency
accuracy
was
86.3
%
(
3,829
/
4,436
)
,
which
is
equivalent
to
that
of
"
syn
"
(
KNP
)
.
This
low
accuracy
is
attributed
to
the
out-of-domain
training
corpus
.
That
is
,
the
parser
is
trained
on
a
newspaper
corpus
,
whereas
the
test
corpus
is
obtained
from
the
web
,
because
of
the
non-availability
of
a
tagged
web
corpus
that
is
large
enough
to
train
a
supervised
parser
.
Figure
5
shows
some
analysis
results
,
where
the
dotted
lines
represent
the
analysis
by
the
baseline
,
"
syn+case
"
,
and
the
solid
lines
represent
the
analysis
by
the
proposed
method
,
"
syn+case+coord
"
.
These
sentences
are
incorrectly
analyzed
by
the
baseline
but
correctly
analyzed
by
the
proposed
method
.
For
instance
,
in
sentence
(
1
)
,
the
noun
phrase
coordination
of
"
apurikeesyon
"
(
application
)
and
"
doraiba
"
(
driver
)
can
be
correctly
analyzed
.
This
is
because
the
case
frame
of
"
insutooru-sareru
"
(
installed
)
is
likely
to
generate
"
doraiba
"
,
and
"
apurikeesyon
"
and
"
doraiba
"
are
likely
to
be
coordinated
.
One
of
the
causes
of
errors
in
dependency
parsing
is
the
mismatch
between
analysis
results
and
annotation
criteria
.
As
per
the
annotation
criteria
,
each
bunsetsu
has
only
one
modifying
head
.
Therefore
,
in
some
cases
,
even
if
analysis
results
are
semantically
correct
,
they
are
judged
as
incorrect
from
the
viewpoint
of
the
annotation
.
For
example
,
in
sentence
(
4
)
in
Figure
6
,
the
baseline
method
,
"
syn
"
,
correctly
recognized
the
head
of
"
iin-wa
"
(
commissioner-TM
)
as
"
hirakimasu
"
(
open
)
.
However
,
the
proposed
method
incorrectly
judged
it
as
"
oujite-imasuga
"
(
offer
)
.
Both analysis results can be considered to be semantically correct, but from the viewpoint of our annotation criteria, the latter is not a syntactic relation (i.e., it is incorrect), but an ellipsis relation.

⁶ http://chasen.org/~taku/software/cabocha/
This
kind
of
error
is
caused
by
the
strong
lexical
preference
considered
in
our
method
.
To
address
this
problem
,
it
is
necessary
to
simultaneously
evaluate
not
only
syntactic
relations
but
also
indirect
relations
,
such
as
ellipses
and
anaphora
.
This
kind
of
mismatch
also
occurred
for
the
detection
of
coordinate
structures
.
Other
errors
were
caused
by
an
inherent
characteristic
of
generative
models
.
Generative
models
have
some
advantages
,
such
as
their
application
to
language
models
.
However
,
it
is
difficult
to
incorporate
various
features
that
seem
to
be
useful
for
addressing
syntactic
and
coordinate
ambiguity
.
We
plan
to
apply
discriminative
reranking
to
the
n-best
parses
produced
by
our
generative
model
in
the
same
way
as
(
Charniak
and
Johnson
,
2005
)
.
7
Conclusion
This
paper
has
described
an
integrated
probabilistic
model
for
coordination
disambiguation
and
syntactic
/
case
structure
analysis
.
This
model
takes
advantage
of
lexical
preference
of
a
huge
raw
corpus
and
large-scale
case
frames
and
performs
coordination
disambiguation
and
syntactic
/
case
analysis
simultaneously
.
The
experiments
indicated
the
effectiveness
of
our
model
.
Our
future
work
involves
incorporating
ellipsis
resolution
to
develop
an
integrated
model
for
syntactic
,
case
,
and
ellipsis
analysis
.
Acknowledgment
This
research
is
partially
supported
by
special
coordination
funds
for
promoting
science
and
technology
.
