Statistical parsing of noun phrase (NP) structure has been hampered by a lack of gold-standard data. This is a significant problem for CCGbank, where binary branching NP derivations are often incorrect, a result of the automatic conversion from the Penn Treebank. We correct these errors in CCGbank using a gold-standard corpus of NP structure, resulting in a much more accurate corpus. We also implement novel NER features that generalise the lexical information needed to parse NPs and provide important semantic information. Finally, evaluating against DepBank demonstrates the effectiveness of our modified corpus and novel features, with an increase in parser performance of 1.51%.
1 Introduction

CCGbank (Hockenmaier and Steedman, 2007) is the primary English corpus for Combinatory Categorial Grammar (CCG) (Steedman, 2000) and was created by a semi-automatic conversion from the Penn Treebank. However, CCG is a binary branching grammar, and as such, cannot leave NP structure underspecified. Instead, all NPs were made right-branching, as shown in this example:

This structure is correct for most English NPs and is the best solution that doesn't require manual reannotation. However, the resulting derivations often contain errors. This can be seen in the previous example, where lung cancer should form a constituent, but does not.
The first contribution of this paper is to correct these CCGbank errors. We apply an automatic conversion process using the gold-standard NP data annotated by Vadas and Curran (2007a). Over a quarter of the sentences in CCGbank need to be altered, demonstrating the magnitude of the NP problem and how important it is that these errors are fixed.

We then run a number of parsing experiments using our new version of the CCGbank corpus. In particular, we implement new features using NER tags from the BBN Entity Type Corpus (Weischedel and Brunstein, 2005). These features are targeted at improving the recovery of NP structure, increasing parser performance by 0.64% F-score.

Finally, we evaluate against DepBank (King et al., 2003). This corpus annotates internal NP structure, and so is particularly relevant for the changes we have made to CCGbank. The CCG parser now recovers additional structure learnt from our NP corrected corpus, increasing performance by 0.92%. Applying the NER features results in a total increase of 1.51%.

This work allows parsers trained on CCGbank to model NP structure accurately, and then pass this crucial information on to downstream systems.
2 Background

Parsing of NPs is typically framed as NP bracketing, where the task is limited to discriminating between left and right-branching NPs of three nouns only:

Lauer (1995) presents two models to solve this problem: the adjacency model, which compares the association strength between words 1-2 to words 2-3; and the dependency model, which compares words 1-2 to words 1-3. Lauer (1995) experiments with a data set of 244 NPs, and finds that the dependency model is superior, achieving 80.7% accuracy.
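The two models can be sketched as follows; `assoc` and the toy counts are illustrative stand-ins, not Lauer's actual association measure.

```python
# Sketch of Lauer's (1995) adjacency and dependency models for bracketing
# a three-noun NP (w1 w2 w3). `assoc` is any association-strength estimate,
# e.g. bigram counts from a large corpus.

def bracket_np(w1, w2, w3, assoc, model="dependency"):
    """Return 'left' for ((w1 w2) w3) or 'right' for (w1 (w2 w3))."""
    if model == "adjacency":
        # compare words 1-2 against words 2-3
        left_score, right_score = assoc(w1, w2), assoc(w2, w3)
    else:
        # dependency model: compare words 1-2 against words 1-3
        left_score, right_score = assoc(w1, w2), assoc(w1, w3)
    return "left" if left_score >= right_score else "right"

# Toy counts: "lung cancer" is a strong collocation, so both models
# bracket "lung cancer deaths" as left-branching.
counts = {("lung", "cancer"): 20, ("cancer", "deaths"): 5, ("lung", "deaths"): 2}
assoc = lambda a, b: counts.get((a, b), 0)
print(bracket_np("lung", "cancer", "deaths", assoc, "adjacency"))   # left
print(bracket_np("lung", "cancer", "deaths", assoc, "dependency"))  # left
```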
Most NP bracketing research has used Lauer's data set. Because it is a very small corpus, most approaches have been unsupervised, measuring association strength with counts from a separate large corpus. Nakov and Hearst (2005) use search engine hit counts and extend the query set with typographical markers. This results in 89.3% accuracy.
Recently, Vadas and Curran (2007a) annotated internal NP structure for the entire Penn Treebank, providing a large gold-standard corpus for NP bracketing. Vadas and Curran (2007b) carry out supervised experiments using this data set of 36,584 NPs, outperforming the Collins (2003) parser.

The Vadas and Curran (2007a) annotation scheme inserts NML and JJP brackets to describe the correct NP structure, as shown below:

We use these brackets to determine new gold-standard CCG derivations in Section 3.
2.1 Combinatory Categorial Grammar

Combinatory Categorial Grammar (CCG) (Steedman, 2000) is a type-driven, lexicalised theory of grammar. Lexical categories (also called supertags) are made up of basic atoms such as S (Sentence) and NP (Noun Phrase), which can be combined to form complex categories. For example, a transitive verb such as bought (as in IBM bought the company) would have the category: (S\NP)/NP. The slashes indicate the directionality of arguments; here two arguments are expected: an NP subject on the left, and an NP object on the right. Once these arguments are filled, a sentence is produced.

Categories are combined using combinatory rules such as forward and backward application:

Other rules such as composition and type-raising are used to analyse some linguistic constructions, while retaining the canonical categories for each word. This is an advantage of CCG, allowing it to recover long-range dependencies without the need for post-processing, as is the case for many other parsers.
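Forward and backward application can be illustrated with a minimal sketch; the tuple encoding of categories is our illustration, not the parser's representation.

```python
# Sketch: forward and backward application on simplified CCG categories,
# represented as nested (result, slash, argument) tuples or atomic strings.

def forward(fn, arg):
    """X/Y  Y  =>  X  (forward application)."""
    result, slash, wanted = fn
    return result if slash == "/" and wanted == arg else None

def backward(arg, fn):
    """Y  X\\Y  =>  X  (backward application)."""
    result, slash, wanted = fn
    return result if slash == "\\" and wanted == arg else None

# (S\NP)/NP applied to an object NP, then to a subject NP, yields S,
# mirroring "IBM bought the company".
bought = (("S", "\\", "NP"), "/", "NP")
vp = forward(bought, "NP")     # S\NP
sentence = backward("NP", vp)  # S
print(sentence)  # S
```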
In Section 1, we described the incorrect NP structures in CCGbank, but a further problem that highlights the need to improve NP derivations is shown in Figure 1. When a conjunction occurs in an NP, a non-CCG rule is required in order to reach a parse:

conj N ⇒ N   (3)

This rule treats the conjunction in the same manner as a modifier, and results in the incorrect derivation shown in Figure 1(a). Our work creates the correct CCG derivation, shown in Figure 1(b), and removes the need for the grammar rule in (3).

Honnibal and Curran (2007) have also made changes to CCGbank, aimed at better differentiating between complements and adjuncts. PropBank (Palmer et al., 2005) is used as a gold-standard to inform these decisions, similar to the way that we use the Vadas and Curran (2007a) data.
Figure 2: (a) Original right-branching CCGbank (b) Left-branching (c) Left-branching with new supertags
The C&C CCG parser (Clark and Curran, 2007b) is used to perform our experiments, and to evaluate the effect of the changes to CCGbank. The parser uses a two-stage system, first employing a supertagger (Bangalore and Joshi, 1999) to propose lexical categories for each word, and then applying the CKY chart parsing algorithm. A log-linear model is used to identify the most probable derivation, which makes it possible to add the novel features we describe in Section 4, unlike a PCFG.

The C&C parser is evaluated on predicate-argument dependencies derived from CCGbank.
These dependencies are represented as 5-tuples: (hf, f, s, ha, l), where hf is the head of the predicate; f is the supertag of hf; s describes which argument of f is being filled; ha is the head of the argument; and l encodes whether the dependency is local or long-range. For example, the dependency encoding company as the object of bought (as in IBM bought the company) is represented by:

This is a local dependency, where company is filling the second argument slot, the object.
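A minimal sketch of this 5-tuple encoding; the Python structure and the concrete field values for the bought/company example are our reconstruction, not parser output.

```python
# Sketch: the (hf, f, s, ha, l) dependency 5-tuple as a small structure.
from collections import namedtuple

Dep = namedtuple("Dep", ["head", "supertag", "arg_slot", "arg_head", "range"])

# "company" fills the second argument (the object) of "bought",
# whose supertag is (S\NP)/NP; the dependency is local.
dep = Dep(head="bought", supertag="(S\\NP)/NP", arg_slot=2,
          arg_head="company", range="local")
print(dep.arg_slot, dep.arg_head)  # 2 company
```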
3 Conversion Process

This section describes the process of converting the Vadas and Curran (2007a) data to CCG derivations. The tokens dominated by NML and JJP brackets in the source data are formed into constituents in the corresponding CCGbank sentence. We generate the two forms of output that CCGbank contains: AUTO files, which represent the tree structure of each sentence; and PARG files, which list the word-word dependencies (Hockenmaier and Steedman, 2005).
We apply one preprocessing step on the Penn Treebank data: if multiple tokens are enclosed by brackets, then an NML node is placed around those tokens. For example, we would insert the NML bracket shown below:

This simple heuristic captures NP structure not explicitly annotated by Vadas and Curran (2007a).
The conversion algorithm applies the following steps for each NML or JJP bracket:

1. identify the CCGbank lowest spanning node, the lowest constituent that covers all of the words in the NML or JJP bracket;
2. flatten the lowest spanning node, to remove the right-branching structure;
3. insert new left-branching structure;
4. identify the heads of the new constituents;
5. assign supertags to the new structure;
6. generate new dependencies.
As an example, we will follow the conversion process for the NML bracket below:

The corresponding lowest spanning node, which incorrectly has cancer deaths as a constituent, is shown in Figure 2(a). To flatten the node, we recursively remove brackets that partially overlap the NML bracket. Nodes that don't overlap at all are left intact. This process results in a list of nodes (which may or may not be leaves), which in our example is [lung, cancer, deaths]. We then insert the correct left-branching structure, shown in Figure 2(b). At this stage, the supertags are still incomplete.
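The flatten-and-rebuild steps can be sketched as follows. This is a simplification of what the text describes: the real conversion only flattens brackets that partially overlap the NML bracket and keeps non-overlapping subtrees intact, whereas this sketch flattens the whole span.

```python
# Sketch: flattening a right-branching span into a node list, then
# rebuilding it left-branching. Trees are (label, children) tuples;
# leaves are strings.

def flatten(node):
    """Collapse a subtree into its list of dominated nodes."""
    if isinstance(node, str):
        return [node]
    label, children = node
    out = []
    for child in children:
        out.extend(flatten(child))
    return out

def left_branch(nodes):
    """Rebuild the node list as a strictly left-branching tree."""
    tree = nodes[0]
    for node in nodes[1:]:
        tree = ("NP", [tree, node])
    return tree

# (lung (cancer deaths)) -> [lung, cancer, deaths] -> ((lung cancer) deaths)
right = ("NP", ["lung", ("NP", ["cancer", "deaths"])])
print(left_branch(flatten(right)))
# ('NP', [('NP', ['lung', 'cancer']), 'deaths'])
```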
Heads are then assigned using heuristics adapted from Hockenmaier and Steedman (2007). Since we are applying these to CCGbank NP structures rather than the Penn Treebank, the POS tag based heuristics are sufficient to determine heads accurately.
Finally, we assign supertags to the new structure. We want to make the minimal number of changes to the entire sentence derivation, and so the supertag of the dominating node is fixed. Categories are then propagated recursively down the tree. For a node with category X, its head child is also given the category X. The non-head child is always treated as an adjunct, and given the category X/X or X\X as appropriate. Figure 2(c) shows the final result of this step for our example.
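This propagation rule can be sketched as follows; the tree representation and helper names are ours, and the adjunct-bracketing helper is an assumption about how complex categories are written.

```python
# Sketch: propagate a fixed category down the rebuilt tree. The head
# child inherits category X; the non-head child becomes an adjunct,
# X/X if it precedes the head and X\X if it follows.

def adjunct(cat, slash):
    """Build the adjunct category, parenthesising complex categories."""
    c = f"({cat})" if "/" in cat or "\\" in cat else cat
    return c + slash + c

def propagate(node, cat):
    node["cat"] = cat
    if "left" not in node:  # leaf
        return
    if node["head"] == "left":
        propagate(node["left"], cat)
        propagate(node["right"], adjunct(cat, "\\"))
    else:
        propagate(node["left"], adjunct(cat, "/"))
        propagate(node["right"], cat)

# (lung cancer) deaths: the left-branching NP from Figure 2(b)/(c).
lung, cancer, deaths = {}, {}, {}
inner = {"left": lung, "right": cancer, "head": "right"}
tree = {"left": inner, "right": deaths, "head": "right"}
propagate(tree, "N")
print(lung["cat"], cancer["cat"], deaths["cat"])  # (N/N)/(N/N) N/N N
```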
3.1 Dependency generation

The changes described so far have generated the new tree structure, but the last step is to generate new dependencies. We recursively traverse the tree, at each level creating a dependency between the heads of the left and right children. These dependencies are never long-range, and therefore easy to deal with.
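The traversal can be sketched as follows; the tree representation is illustrative, and the dependencies are emitted as bare word pairs rather than the full 5-tuples.

```python
# Sketch: one head-head dependency per internal node of the rebuilt NP
# tree. Leaves are words; internal nodes mark which child is the head.

def gen_deps(node, deps):
    """Return the head word of `node`, appending a dependency between
    the heads of the two children at every internal node."""
    if isinstance(node, str):
        return node
    lh = gen_deps(node["left"], deps)
    rh = gen_deps(node["right"], deps)
    deps.append((lh, rh))
    return lh if node["head"] == "left" else rh

deps = []
tree = {"left": {"left": "lung", "right": "cancer", "head": "right"},
        "right": "deaths", "head": "right"}
top = gen_deps(tree, deps)
print(top, deps)  # deaths [('lung', 'cancer'), ('cancer', 'deaths')]
```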
We may also need to change dependencies reaching from inside to outside the NP, if the head(s) of the NP have changed. In these cases we simply replace the old head(s) with the new one(s) in the relevant dependencies. The number of heads may change because we now analyse conjunctions correctly. In our example, the original dependencies were:

while after the conversion process, (5) becomes:
To determine that the conversion process worked correctly, we manually inspected its output for unique tree structures in Sections 00-07. This identified problem cases to correct, such as those described in the following section.
3.2 Exceptional cases

Firstly, when the lowest spanning node covers the NML or JJP bracket exactly, no changes need to be made to CCGbank. These cases occur when CCGbank already received the correct structure during the original conversion process. For example, brackets separating a possessive from its possessor were detected automatically.
A more complex case is conjunctions, which do not follow the simple head/adjunct method of assigning supertags. Instead, conjuncts are identified during the head-finding stage, and then assigned the supertag dominating the entire coordination. Intervening non-conjunct nodes are given the same category with the conj feature, resulting in a derivation that can be parsed with the standard CCGbank binary coordination rules:

The derivation in Figure 1(b) is produced by these corrections to coordination derivations. As a result, applications of the non-CCG rule shown in (3) have been reduced from 1378 to 145 cases.
Some POS tags require special behaviour. Determiners and possessive pronouns are both usually given the supertag NP[nb]/N, and this should not be changed by the conversion process. Accordingly, we do not alter tokens with POS tags of DT and PRP$.
Instead, their sibling node is given the category N and their parent node is made the head. The parent's sibling is then assigned the appropriate adjunct category (usually NP\NP). Tokens with punctuation POS tags¹ do not have their supertag changed either.
Finally, there are cases where the lowest spanning node covers a constituent that should not be changed. For example, in the following NP:

(NP

with the original CCGbank lowest spanning node:

the final ruling node should not be altered. It may seem trivial to process in this case, but consider a similarly structured NP: lower court ruling that the U.S. can bar the use of ...
Our minimalist approach avoids reanalysing the many linguistic constructions that can be dominated by NPs, as this would reinvent the creation of CCGbank. As a result, we only flatten those constituents that partially overlap the NML or JJP bracket. The existing structure and dependencies of other constituents are retained. Note that we are still converting every NML and JJP bracket, as even in the subordinate clause example, only the structure around lower court needs to be altered.
¹ period, comma, colon, and left and right bracket.
Figure 3: CCGbank derivations for possessives

Table 1: Manual analysis. (Categories include: Possessive; Left child contains DT/PRP$; Couldn't assign to non-leaf; Conjunction; Automatic conversion was correct; Entity with internal brackets; NML/JJP bracket is an error.)
3.3 Manual annotation

A handful of problems that occurred during the conversion process were corrected manually. The first indicator of a problem was the presence of a possessive. This is unexpected, because possessives were already bracketed properly when CCGbank was originally created (Hockenmaier, 2003, §3.6.4).
Secondly, a non-flattened node should not be assigned a supertag that it did not already have. This is because, as described previously, a non-leaf node could dominate any kind of structure. Finally, we expect the lowest spanning node to cover only the NML or JJP bracket and one more constituent to the right. If it doesn't, because of unusual punctuation or an incorrect bracket, then it may be an error.
In all these cases, which occur throughout the corpus, we manually analysed the derivation and fixed any errors that were observed. The conversion process highlighted a number of instances where the original CCGbank analysis was incorrect.
An example of this error can be seen in Figure 3(a), where the possessive doesn't take any arguments. Instead, largest aid donor incorrectly modifies the NP one word at a time. The correct derivation after manual analysis is in (b).
The second-most common cause occurs when there is apposition inside the NP. This can be seen in Figure 4. As there is no punctuation on which to coordinate (which is how CCGbank treats most appositions), the best derivation we can obtain is to have Victor Borge modify the preceding NP.
The final step in the conversion process was to validate the corpus against the CCG grammar, first by those productions used in the existing CCGbank, and then against those actually licensed by CCG (with pre-existing ungrammaticalities removed). Sixteen errors were identified by this process and subsequently corrected by manual analysis.
In total, we have altered 12,475 CCGbank sentences (25.5%) and 20,409 dependencies (1.95%).
4 NER features

Named entity recognition (NER) provides information that is particularly relevant for NP parsing, simply because entities are nouns. For example, knowing that Air Force is an entity tells us that Air Force contract is a left-branching NP. Vadas and Curran (2007a) describe using NE tags during the annotation process, suggesting that NER-based features will be helpful in a statistical model.
There has also been recent work combining NER and parsing in the biomedical field. Lewin (2007) experiments with detecting base-NPs using NER information, while Buyko et al. (2007) use a CRF to identify coordinate structure in biological named entities.

Figure 4: CCGbank derivations for apposition
We draw NE tags from the BBN Entity Type Corpus (Weischedel and Brunstein, 2005), which describes 28 different entity types. These include the standard person, location and organization classes, as well as person descriptions (generally occupations), NORP (National, Other, Religious or Political groups), and works of art. Some classes also have finer-grained subtypes, although we use only the coarse tags in our experiments.
Clark and Curran (2007b) has a full description of the C&C parser's pre-existing features, to which we have added a number of novel NER-based features. Many of these features generalise the head words and/or POS tags that are already part of the feature set. The results of applying these features are described in Sections 5.3 and 6.
The first feature is a simple lexical feature, describing the NE tag of each token in the sentence. This feature, and all others that we describe here, are not active when the NE tag(s) are O, as there is no NER information from tokens that are not entities.
The next group of features is based on the local tree (a parent and two child nodes) formed by every grammar rule application. We add a feature where the rule being applied is combined with the parent's NE tag. For example, when joining two constituents²: (five, CD, CARD, N/N) and (Europeans, NNPS, NORP, N), the feature is:

as the head of the constituent is Europeans.
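A sketch of this feature, under an assumed string encoding; the parser's actual feature representation may differ, and the function name is ours.

```python
# Sketch: combine a grammar rule with the parent's NE tag, as in the
# "five Europeans" example. Constituents are (head, POS, NE, supertag)
# 4-tuples; the parent inherits the NE tag of its head child.

def parent_ne_feature(rule, left, right, head_is_left=False):
    head = left if head_is_left else right
    ne = head[2]
    if ne == "O":  # features are inactive on non-entities
        return None
    return f"{rule} + {ne}"

five = ("five", "CD", "CARD", "N/N")
europeans = ("Europeans", "NNPS", "NORP", "N")
print(parent_ne_feature("N -> N/N N", five, europeans))  # N -> N/N N + NORP
```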
In the same way, we implement features that combine the grammar rule with the child nodes. There are already features in the model describing each combination of the children's head words and POS tags, which we extend to include combinations with the NE tags. Using the same example as above, one of the new features would be:

² These 4-tuples are the node's head, POS, NE, and supertag.
The last group of features is based on the NE category spanned by each constituent. We identify constituents that dominate tokens that all have the same NE tag, as these nodes will not cause a "crossing bracket" with the named entity. For example, the constituent Force contract, in the NP Air Force contract, spans two different NE tags, and should be penalised by the model. Air Force, on the other hand, only spans ORG tags, and should be preferred accordingly.
We also take into account whether the constituent spans the entire named entity. Combining these nodes with others of different NE tags should not be penalised by the model, as the NE must combine with the rest of the sentence at some point.
These NE spanning features are implemented as the grammar rule in combination with the parent node or the child nodes. For the former, one feature is active when the node spans the entire entity, and another is active in other cases. Similarly, there are four features for the child nodes, depending on whether neither, the left, the right or both nodes span the entire NE. As an example, if the Air Force constituent were being joined with contract, then the child feature would be:

N → N/N N + LEFT + ORG + O

assuming that there are more O tags to the right.
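A sketch of the child spanning feature under the same assumed string encoding as above; the helper names and representation are ours.

```python
# Sketch: the NE "spanning" child feature. A constituent's NE span is
# uniform when every dominated token carries the same tag; NEITHER/LEFT/
# RIGHT/BOTH records which children span a complete entity.

def uniform_ne(tags):
    """NE tag spanned by a constituent, or None if its tags are mixed."""
    return tags[0] if len(set(tags)) == 1 else None

def child_span_feature(rule, left_tags, right_tags, left_complete, right_complete):
    which = {(False, False): "NEITHER", (True, False): "LEFT",
             (False, True): "RIGHT", (True, True): "BOTH"}[(left_complete, right_complete)]
    return f"{rule} + {which} + {uniform_ne(left_tags)} + {uniform_ne(right_tags)}"

# "Air Force" (all ORG, a complete entity) joined with "contract" (O).
print(child_span_feature("N -> N/N N", ["ORG", "ORG"], ["O"], True, False))
# N -> N/N N + LEFT + ORG + O
```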
5 Experiments

Our experiments are run with the C&C CCG parser (Clark and Curran, 2007b), and will evaluate the changes made to CCGbank, as well as the effectiveness of the NER features. We train on Sections 02-21, and test on Section 00.
Table 2: Supertagging results

Table 3: Parsing results with gold-standard POS tags
Before we begin full parsing experiments, we evaluate on the supertagger alone. The supertagger is an important stage of the CCG parsing process, and its results will affect performance in later experiments. Table 2 shows that F-score has dropped by 0.61%.
This is not surprising, as the conversion process has increased the ambiguity of supertags in NPs. Previously, a bare NP could only have a sequence of N/N tags followed by a final N. There are now more complex possibilities, equal to the Catalan number of the length of the NP.
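A quick check of this growth: the number of binary bracketings of an n-word NP is the Catalan number C(n−1), which is what the supertagger must now leave open, versus a single possibility before.

```python
# Sketch: Catalan numbers count the binary branchings of an n-leaf tree.
from math import comb

def catalan(n):
    return comb(2 * n, n) // (n + 1)

# Bracketings for NPs of 2..6 words: a 3-word NP has C(2) = 2
# (left- or right-branching), and the count grows rapidly.
print([catalan(n - 1) for n in range(2, 7)])  # [1, 2, 5, 14, 42]
```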
5.2 Initial parsing results

We now compare parser performance on our NP corrected version of the corpus to that on the original CCGbank. We are using the normal-form parser model and report labelled precision, recall and F-score for all dependencies. The results are shown in Table 3.
The F-score drops by 0.31% in our new version of the corpus. However, this comparison is not entirely fair, as the original CCGbank test data does not include the NP structure that the NP corrected model is being evaluated on.
Vadas and Curran (2007a) experienced a similar drop in performance on Penn Treebank data, and noted that the F-score for NML and JJP brackets was about 20% lower than the overall figure. We suspect that a similar effect is causing the drop in performance here.
Unfortunately, there are no explicit NML and JJP brackets to evaluate on in the CCG corpus, and so an NP structure-only figure is difficult to compute.
Recall can be calculated by marking those dependencies altered in the conversion process, and evaluating only on them. Precision cannot be measured in this way, as NP dependencies remain undifferentiated in parser output. The result is a recall of 77.03%, which is noticeably lower than the overall figure.

Table 4: Parsing results with automatic POS tags

Table 5: Parsing results with NER features
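This marked-dependency recall can be sketched as follows; the dependency pairs are toy data, not the evaluation dependencies.

```python
# Sketch: recall over only the dependencies marked as altered by the
# conversion process. Precision is not measurable this way, since the
# parser's output does not distinguish NP dependencies.

def marked_recall(gold_marked, parsed):
    found = sum(1 for dep in gold_marked if dep in parsed)
    return found / len(gold_marked)

gold_marked = {("lung", "cancer"), ("cancer", "deaths"),
               ("Air", "Force"), ("Force", "contract")}
parsed = {("lung", "cancer"), ("cancer", "deaths"), ("Air", "Force")}
print(marked_recall(gold_marked, parsed))  # 0.75
```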
We have also experimented with using automatically assigned POS tags. These tags are accurate, with an F-score of 96.34% (precision 96.20%, recall 96.49%).
Table 4 shows that, unsurprisingly, performance is lower without the gold-standard data. The NP corrected model drops an additional 0.1% F-score over the original model, suggesting that POS tags are particularly important for recovering internal NP structure.
Evaluating NP dependencies only, in the same manner as before, results in a recall figure of 75.21%.
Table 5 shows the results of adding the NER features we described in Section 4. Performance has increased by 0.64% on both versions of the corpus.
It is surprising that the NP corrected increase is not larger, as we would expect the features to be less effective on the original CCGbank. This is because incorrect right-branching NPs such as Air Force contract would introduce noise to the NER features.
Table 6 presents the results of using automatically assigned POS and NE tags, i.e. parsing raw text. The NER tagger achieves 84.45% F-score on all non-O classes, with precision being 78.35% and recall 91.57%. We can see that parsing F-score has dropped by about 2% compared to using gold-standard POS and NER data; however, the NER features still improve performance by about 0.3%.
Table 6: Parsing results with automatic POS and NE tags
6 DepBank evaluation

One problem with the evaluation in the previous section is that the original CCGbank is not expected to recover internal NP structure, making its task easier and inflating its performance.
To remove this variable, we carry out a second evaluation against the Briscoe and Carroll (2006) reannotation of DepBank (King et al., 2003), as described in Clark and Curran (2007a).
Parser output is made similar to the grammatical relations (GRs) of the Briscoe and Carroll (2006) data; however, the conversion remains complex. Clark and Curran (2007a) report an upper bound on performance, using gold-standard CCGbank dependencies, of 84.76% F-score.
This evaluation is particularly relevant for NPs, as the Briscoe and Carroll (2006) corpus has been annotated for internal NP structure. With our new version of CCGbank, the parser will be able to recover these GRs correctly, where before this was unlikely.
Firstly, we show the figures achieved using gold-standard CCGbank derivations in Table 7. In the NP corrected version of the corpus, performance has increased by 1.02% F-score. This is a reversal of the results in Section 5, and demonstrates that correct NP structure improves parsing performance, rather than reducing it.
Because of this increase to the upper bound of performance, we are now even closer to a true formalism-independent evaluation.
We now move to evaluating the C&C parser itself and the improvement gained by the NER features. Table 8 shows our results, with the NP corrected version outperforming the original CCGbank by 0.92%. Using the NER features has also caused an increase in F-score, giving a total improvement of 1.51%. These results demonstrate how successful the correcting of NPs in CCGbank has been.
Furthermore, the performance increase of 0.59% on the NP corrected corpus is more than the 0.25% increase on the original. This demonstrates that NER features are particularly helpful for NP structure.
Table 7: DepBank gold-standard evaluation

Table 8: DepBank evaluation results (rows: Original; NP corrected; Original, NER; NP corrected, NER)
7 Conclusion

The first contribution of this paper is the application of the Vadas and Curran (2007a) data to Combinatory Categorial Grammar. Our experimental results have shown that this more accurate representation of CCGbank's NP structure increases parser performance.
Our second major contribution is the introduction of novel NER features, a source of semantic information previously unused in parsing.
As a result of this work, internal NP structure is now recoverable by the C&C parser, a result demonstrated by our total performance increase of 1.51% F-score. Even when parsing raw text, without gold-standard POS and NER tags, our approach has resulted in performance gains.
In addition, we have made possible further increases to NP structure accuracy. New features can now be implemented and evaluated in a CCG parsing context. For example, bigram counts from a very large corpus have already been used in NP bracketing, and could easily be applied to parsing. Similarly, additional supertagging features can now be created to deal with the increased ambiguity in NPs. Downstream NLP components can now exploit the crucial information in NP structure.
Acknowledgements

We would like to thank Mark Steedman and Matthew Honnibal for help with converting the NP data to CCG, and the anonymous reviewers for their helpful feedback. This work has been supported by the Australian Research Council under Discovery Project DP0665973.
