We
present
Pro3Gres
,
a
deep-syntactic
,
fast
dependency
parser
that
combines
a
handwritten
competence
grammar
with
probabilistic
performance
disambiguation
and
that
has
been
used
in
the
biomedical
domain
.
We
discuss
its
performance
in
the
domain
adaptation
open
submission
.
We
achieve
average
results
,
which
is
partly
due
to
difficulties
in
mapping
to
the
dependency
representation
used
for
the
shared
task
.
1
Introduction
The
Pro3Gres
parser
is
a
dependency
parser
that
combines
a
hand-written
grammar
with
probabilistic
disambiguation
.
It
is
described
in
detail
in
(
Schneider
,
2007
)
.
It
uses
tagger
and
chunker
pre-processors
-
parsing
proper
happens
only
between
heads
of
chunks
-
and
a
post-processor
graph
converter
to
capture
long-distance
dependencies
.
Pro3Gres
is
embedded
in
a
flexible
XML
pipeline
.
It
has
been
applied
to
many
tasks
,
such
as
parsing
biomedical
literature
(
Rinaldi
et
al.
,
2006
;
Rinaldi
et
al.
,
2007
)
and
the
whole
British
National
Corpus
,
and
has
been
evaluated
in
several
ways
.
We
have
achieved
average
results
in
the
CoNLL
domain
adaptation
track
open
submission
(
Marcus
et
al.
,
1993
;
Johansson
and
Nugues
,
2007
;
Kulick
et
al.
,
2004
;
MacWhinney
,
2000
;
Brown
,
1973
)
.
The
performance
of
the
parser
is
seriously
affected
by
mapping
problems
to
the
particular
dependency
representation
used
in
the
shared
task
.
The
paper
is
structured
as
follows
.
We
give
a
brief
overview
of
the
parser
and
its
design
policy
in
section
2
,
we
describe
the
domain
adaptations
that
we
have
used
in
section
3
,
comment
on
the
results
obtained
in
section
4
and
conclude
in
section
5
.
2
Pro3Gres
and
its
Design
Policy
There
has
been
growing
interest
in
exploring
the
space
between
Treebank-trained
probabilistic
grammars
(
e.g.
(
Collins
,
1999
;
Nivre
,
2006
)
)
and
formal
grammar-based
parsers
integrating
statistics
.
We
have
developed
a
parsing
system
that
explores
this
space
,
in
the
vein
of
systems
like
(
Kaplan
et
al.
,
2004
)
,
using
a
linguistic
competence
grammar
and
a
probabilistic
performance
disambiguation
allowing
us
to
explore
interactions
between
lexicon
and
grammar
(
Sinclair
,
1996
)
.
The
parser
has
been
explicitly
designed
to
be
deep-syntactic
like
a
formal
grammar-based
parser
,
by
using
a
dependency
representation
that
is
close
to
LFG
f-structure
,
but
at
the
same
time
mostly
context-free
and
integrating
shallow
approaches
and
aggressive
pruning
in
order
to
keep
search-spaces
small
,
without
permitting
compromise
on
performance
or
linguistic
adequacy
.
(
Abney
,
1995
)
establishes
the
chunks
and
dependencies
model
as
a
well-motivated
linguistic
theory
.
The
non-local
linguistic
constraints
that
a
hand-written
grammar
allows
us
to
formulate
,
e.g.
expressing
X-bar
principles
or
barring
very
marked
constructions
,
further
reduce
parsing
time
by
at
least
an
order
of
magnitude
.
Since
the
grammar
is
on
Penn
tags
(
except
for
a
few
closed-class
words
,
e.g.
allowing
including
to
function
as
preposition
)
the
effort
for
writing
it
manually
is
manageable
.
It
has
been
developed
from
scratch
in
about
a
person
month
,
using
traditional
grammar
engineering
development
cycles
.
Figure
1
:
Pro3Gres
parser
flowchart
It
contains
about
1000
rules
;
the
number
is
this
high
largely
due
to
tag
combinatorics
:
for
example
,
the
various
subject
attachment
rules
combining
a
subject
(
_NN
,
_NNS
,
_NNP
,
_NNPS
)
and
a
verb
(
_VBZ
,
_VBP
,
_VBG
,
_VBN
,
_VBD
)
are
all
very
similar
.
The
parser
is
fast
enough
for
large-scale
application
to
unrestricted
texts
,
and
it
delivers
dependency
relations
which
are
a
suitable
base
for
a
range
of
applications
.
We
have
used
it
to
parse
the
entire
100-million-word
British
National
Corpus
(
http://www.natcorp.ox.ac.uk
)
and
similar
amounts
of
biomedical
texts
.
Its
parsing
speed
is
about
500,000
words
per
hour
.
The
flowchart
of
the
parser
can
be
seen
in
figure
1
.
Pro3Gres
(
PRObabilistic
PROlog-implemented
RObust
Grammatical
Role
Extraction
System
)
uses
a
dependency
representation
that
is
close
to
LFG
f-structure
,
in
order
to
give
it
an
established
linguistic
background
.
It
uses
post-processing
graph
structure
conversions
and
mild
context-sensitivity
to
capture
long-distance
dependencies
.
We
have
argued
in
(
Schneider
,
2005
)
that
LFG
f-structures
can
be
parsed
for
in
a
completely
context-free
fashion
,
except
for
embedded
WH-questions
,
where
a
device
such
as
functional
uncertainty
(
Kaplan
and
Zaenen
,
1989
)
or
the
equivalent
Tree-Adjoining
Grammar
Adjoining
operation
(
Joshi
and
Vijay-Shanker
,
1989
)
is
used
.
In
Dependency
Grammar
,
this
device
is
also
known
as
lifting
(
Kahane
et
al.
,
1998
;
Nivre
and
Nilsson
,
2005
)
.
We
use
a
hand-written
competence
grammar
,
combined
with
performance-driven
disambiguation
obtained
from
the
Penn
Treebank
(
Marcus
et
al.
,
1993
)
.
The
Maximum-Likelihood
Estimation
(
MLE
)
probability
of
generating
a
dependency
relation
R
given
lexical
heads
(
a
and
b
)
at
distance
(
in
chunks
)
δ
is
calculated
as
follows
.
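As a sketch of what such an estimate can look like, assuming that the relation and the chunk distance are estimated as independent relative frequencies: δ below stands for the distance in chunks, #(·) for a treebank count, and R_1 ... R_n for the relations competing for the head pair (a, b). This notation is introduced here for illustration and may differ from the exact formulation used in the parser.

```latex
p(R, \delta \mid a, b) \;\approx\; p(R \mid a, b)\; p(\delta \mid R)
  \;=\; \frac{\#(R, a, b)}{\sum_{i=1}^{n} \#(R_i, a, b)} \cdot \frac{\#(R, \delta)}{\#(R)}
```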
The
counts
are
backed
off
(
Collins
,
1999
;
Merlo
and
Esteve
Ferrer
,
2006
)
.
The
backoff
levels
include
semantic
classes
from
WordNet
(
Fellbaum
,
1998
)
:
we
back
off
to
the
lexicographer
file
ID
of
the
most
frequent
word
sense
.
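The back-off idea can be illustrated with a minimal Python sketch. It assumes a single back-off level from the dependent word to a semantic class; the function name, the `min_count` threshold, the toy counts and the class label `noun.food` are hypothetical choices made for this example and are not taken from the parser.

```python
from collections import Counter


def mle_with_backoff(rel, head, dep, lex_counts, class_counts, word_class, min_count=1):
    """Relative-frequency estimate of p(rel | head, dep) with one back-off level.

    All arguments are illustrative assumptions:
    lex_counts   maps (relation, head word, dependent word) -> count
    class_counts maps (relation, head word, semantic class) -> count
    word_class   maps a word to its semantic class label
    """
    # Lexical level: sum over all relations observed for this head pair.
    total = sum(c for (_, a, b), c in lex_counts.items() if (a, b) == (head, dep))
    if total >= min_count:
        return lex_counts.get((rel, head, dep), 0) / total

    # Back-off level: replace the dependent word by its semantic class.
    cls = word_class.get(dep)
    total = sum(c for (_, a, k), c in class_counts.items() if (a, k) == (head, cls))
    if total >= min_count:
        return class_counts.get((rel, head, cls), 0) / total

    # Nothing observed at either level.
    return 0.0


# Toy counts (invented for the example).
lex_counts = Counter({("obj", "eat", "apple"): 3, ("subj", "eat", "apple"): 1})
class_counts = Counter({("obj", "eat", "noun.food"): 8, ("subj", "eat", "noun.food"): 2})
word_class = {"apple": "noun.food", "pear": "noun.food"}

seen = mle_with_backoff("obj", "eat", "apple", lex_counts, class_counts, word_class)
unseen = mle_with_backoff("obj", "eat", "pear", lex_counts, class_counts, word_class)
print(seen)    # 0.75, from lexical counts
print(unseen)  # 0.8, from class-level counts
```

The unseen pair (eat, pear) still receives a sensible estimate because "pear" shares its class with words that were observed with "eat".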
An
example
output
of
the
parser
is
shown
in
figure
2
.
3
Domain
Adaptation
Based
on
our
experience
with
parsing
texts
from
the
biomedical
domain
,
we
have
used
the
following
two
adaptations
to
the
domain
of
chemistry
.
(
Hindle
and
Rooth
,
1993
)
exploit
the
fact
that
in
sentence-initial
NP
PP
sequences
the
PP
unambiguously
attaches
to
the
noun
.
We
have
observed
that
in
sentence-initial
NP
PP
PP
sequences
,
the
second
PP
also
frequently
attaches
to
the
noun
,
the
noun
itself
often
being
a
relational
noun
.
We
have
thus
used
such
sequences
to
learn
relational
nouns
from
the
unlabelled
domain
texts
.
Relational
nouns
are
allowed
to
attach
several
argument
PPs
in
the
grammar
;
all
other
nouns
are
not
.
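The harvesting of relational noun candidates can be sketched as follows. This is a minimal illustration that assumes chunked input in the form of (chunk type, head word) pairs and a simple frequency cut-off; the function name, the input format and the threshold are hypothetical and not taken from the actual system.

```python
from collections import Counter


def relational_noun_candidates(sentences, min_count=2):
    """Collect head nouns of sentence-initial NP PP PP sequences.

    sentences: list of sentences, each a list of (chunk_type, head_word)
               pairs (an assumed input format).
    min_count: hypothetical frequency threshold for keeping a noun.
    """
    counts = Counter()
    for chunks in sentences:
        # Only the first three chunks decide whether the pattern matches.
        types = [chunk_type for chunk_type, _ in chunks[:3]]
        if types == ["NP", "PP", "PP"]:
            counts[chunks[0][1]] += 1
    return {noun for noun, c in counts.items() if c >= min_count}


# Toy chunked sentences (invented for the example).
sentences = [
    [("NP", "concentration"), ("PP", "of"), ("PP", "in"), ("VP", "increased")],
    [("NP", "concentration"), ("PP", "of"), ("PP", "at"), ("VP", "was")],
    [("NP", "enzyme"), ("PP", "in"), ("VP", "binds")],
    [("NP", "rate"), ("PP", "of"), ("PP", "by"), ("VP", "fell")],
]

candidates = relational_noun_candidates(sentences)
print(candidates)  # {'concentration'}
```

Nouns returned by such a procedure would then be the ones permitted to attach several argument PPs.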
Multi-word
terms
,
adjective-preposition
constructions
and
frequent
PP-arguments
have
strong
collocational
force
.
We
have
thus
used
the
collocation
extraction
tool
XTRACT
(
Smadja
,
2003
)
to
discover
collocations
from
large
domain
corpora
.
The
probability
of
generating
a
dependency
relation
is
augmented
for
collocations
above
a
certain
threshold
.
Since
the
tagging
quality
of
the
Chemistry
testset
is
high
,
the
impact
of
multi-word
term
recognition
was
lower
than
in
the
biomedical
domain
when
using
a
standard
tagger
,
as
we
have
shown
in
(
Rinaldi
et
al.
,
2007
)
.
For
the
CHILDES
domain
,
we
have
not
used
any
adaptation
.
The
hand-written
grammar
fares
quite
well
on
most
types
of
questions
,
which
are
very
frequent
in
this
domain
.
In
the
spirit
of
the
shared
task
,
we
have
not
attempted
to
correct
tagging
errors
,
which
were
frequent
in
the
CHILDES
domain
.
We
have
restricted
the
use
of
external
resources
to
the
hand-written
,
domain-independent
grammar
,
and
to
WordNet
.
Due
to
serious
problems
in
mapping
our
LFG
f-structure
based
dependencies
to
the
CoNLL
representation
,
much
less
time
than
expected
was
available
for
the
domain
adaptation
.
Figure
2
:
Example
of
original
parser
output
4
Our
Results
We
have
achieved
average
results
:
Labeled
attachment
score
:
3151
/
5001
*
100
=
63.01
,
unlabeled
attachment
score
:
3327
/
5001
*
100
=
66.53
,
label
accuracy
score
:
3832
/
5001
*
100
=
76.62
.
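The three scores are plain percentages over the 5001 scored tokens. A short Python check of the arithmetic, using the counts given above (the function name is introduced here for illustration):

```python
def attachment_score(correct, total):
    """Percentage of correctly analysed tokens, rounded to two decimals."""
    return round(100.0 * correct / total, 2)


scores = {
    "labeled attachment": attachment_score(3151, 5001),
    "unlabeled attachment": attachment_score(3327, 5001),
    "label accuracy": attachment_score(3832, 5001),
}

for name, value in scores.items():
    print(f"{name}: {value}")
# labeled attachment: 63.01
# unlabeled attachment: 66.53
# label accuracy: 76.62
```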
These
results
are
about
10
%
below
what
we
typically
obtain
when
using
our
own
dependency
representation
or
GREVAL
(
Carroll
et
al.
,
2003
)
,
a
deep-syntactic
annotation
scheme
that
is
close
to
ours
.
Detailed
evaluations
are
reported
in
(
Schneider
,
2007
)
.
Our
mapping
was
quite
poor
,
especially
when
conjunctions
are
involved
.
Punctuation
is
also
attached
poorly
.
5.7
%
of
all
dependencies
remained
unmapped
(
unknown
in
the
figure
)
.
We
give
an
overview
of
the
relation-dependent
results
in
figures
1
and
2
.
Mapping
problems
include
the
following
examples
.
First
,
headedness
is
handled
very
differently
:
while
we
assume
auxiliaries
,
prepositions
and
coordinations
to
be
dependents
,
the
CoNLL
representation
assumes
the
opposite
,
which
leads
to
incorrect
mapping
under
complex
interactions
.
Second
,
the
semantics
of
parentheticals
(
PRN
)
partly
remains
unclear
.
In
the
sentence
"
Quinidine
elimination
was
capacity
limited
with
apparent
Michaelis
constant
(
appKM
)
of
2.6
microM
(
about
1.2
mg
/
L
)
"
,
the
gold
standard
annotates
the
second
parenthesis
as
parenthetical
,
but
the
first
as
nominal
modification
,
although
both
may
be
said
to
have
appositional
character
.
Third
,
we
seem
to
have
misinterpreted
the
roles
of
ADV
and
AMOD
,
as
they
are
often
mutually
exchanged
.
Fourth
,
the
logical
subject
(
LGS
)
is
sometimes
marked
on
the
by-PP
(
.
.
.
are
strongly
inhibited
by-LGS
carbon
monoxide
)
and
sometimes
on
the
participle
(
.
.
.
are
increased-LGS
by
pre-treatment
)
in
the
gold
standard
.
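The headedness mismatch named in the first point can be made concrete with a small Python sketch. It assumes a dependency analysis stored as a map from token index to head index and handles only prepositions in isolation; the function name and data layout are hypothetical and do not reproduce the mapping actually used for the submission.

```python
def flip_preposition_heads(heads, prepositions):
    """Convert 'preposition depends on noun' into 'noun depends on preposition'.

    heads:        dict token index -> head index (None marks the root);
                  an assumed representation.
    prepositions: indices of preposition tokens to re-head.
    """
    new_heads = dict(heads)
    for prep in prepositions:
        noun = heads[prep]             # source scheme: preposition hangs off its noun
        new_heads[prep] = heads[noun]  # preposition takes over the noun's former head
        new_heads[noun] = prep         # noun becomes a dependent of the preposition
    return new_heads


# "inhibited by monoxide": 0 = inhibited, 1 = by, 2 = monoxide
source = {0: None, 1: 2, 2: 0}
target = flip_preposition_heads(source, prepositions=[1])
print(target)  # {0: None, 1: 0, 2: 1}
```

A single flip like this is straightforward. The difficulty reported above arises when several such conversions (auxiliaries, prepositions, coordinations) touch the same tokens, because each flip changes the heads that the others rely on.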
Relations
between
heads
of
chunks
,
which
are
central
for
predicate-argument
structures
which
Pro3Gres
aims
to
recover
,
such
as
SBJ
,
NMOD
,
ROOT
,
perform
better
than
those
for
which
Pro3Gres
was
not
originally
designed
,
particularly
ADV
,
AMOD
,
PRN
,
P
.
Performance
on
COORD
was
particularly
disappointing
.
Generally
,
mapping
problems
between
different
representations
would
be
smaller
if
one
used
a
dependency
representation
that
maximally
abstracts
away
from
form
to
function
,
for
example
(
Carroll
et
al.
,
2003
)
.
We
have
obtained
results
slightly
above
average
on
the
CHILDES
domain
,
although
we
did
not
adapt
the
parser
to
this
domain
in
any
way
(
unlabeled
attachment
score
:
3013
/
4999
*
100
=
60.27
%
)
.
The
hand-written
grammar
,
which
includes
rules
for
most
types
of
questions
,
fares
relatively
well
on
this
domain
since
questions
are
rare
in
the
Penn
Treebank
(
see
(
Hermjakob
,
2001
)
)
.
Pro3Gres
has
been
employed
for
question
parsing
at
a
TREC
conference
(
Burger
and
Bayer
,
2005
)
.
Table
2
:
Precision
and
recall
of
DEPREL+ATTACHMENT
5
Conclusion
We
have
described
the
Pro3Gres
parser
.
We
have
achieved
average
results
in
the
shared
task
with
relatively
little
adaptation
.
Mapping
to
different
representations
is
an
often
underestimated
task
.
Our
performance
on
the
CHILDES
task
,
where
we
did
not
adapt
the
parser
,
indicates
that
hand-written
,
carefully
engineered
competence
grammars
may
be
relatively
domain-independent
while
performance
disambiguation
is
more
domain-dependent
.
We
will
adapt
the
parser
to
further
domains
and
include
more
unsupervised
learning
methods
.
