The objective of this work is to disambiguate transducers of the form $T = R \circ D$ so that the determinization algorithm described in (Mohri, 1997) can be applied.
Our approach to disambiguating $T = R \circ D$ consists of first computing the composition $T$ and then disambiguating the resulting transducer $T$.
An important consequence of this result is that any number of transducers $R$ can be composed with the transducer $D$. This contrasts with the previous approach, which first disambiguated the transducers $D$ and $R$ separately and then composed the two disambiguated transducers to obtain an unambiguous $T$.
We
will
present
results
in
the
case
of
a
transducer
D
representing
a
dictionary
and
R
representing
phonological
rules
.
Keywords
:
ambiguity
,
deterministic
,
dictionary
,
transducer
.
1
Introduction
The
task
of
speech
recognition
can
be
decomposed
into
several
steps
,
where
each
step
is
represented
by
a
finite-state
transducer
(
Mohri
et
al.
,
1998
)
.
The
search
space
of
the
recognizer
is
defined
by
the
composition
of
transducers
T
=
A
o
C
o
R
o
D
o
M.
Transducer
A
converts
a
sequence
of
observations
O
to
a
sequence
of
context-dependent
phones
.
Transducer
C
converts
a
sequence
of
context-dependent
phones
to
a
sequence
of
context-independent
phones
.
Transducer
R
is
a
mapping
from
phones
to
phones
which
implements
phonological
rules
.
Transducer
D
is
the
pronunciation
dictionary
.
It
converts
a
sequence
of
context-independent
phones
to
a
sequence
of
words
.
Transducer
M
represents
a
language
model
:
it
converts
sequences
of
words
into
sequences
of
words
,
while
restricting
the
possible
sequences
or
assigning
a
score
to
the
sequences
.
The
speech
recognition
problem
consists
of
finding
the
path
of
least
cost
in
transducer
O
o
T
,
where
O
is
a
sequence
of
acoustic
observations
.
The
pronunciation
dictionary
representing
the
mapping
from
pronunciations
to
words
can
show
an
inherent
ambiguity
:
a
sequence
of
phones
can
correspond
to
more
than
one
word
,
so
we
cannot
apply
the
transducer
determinization
algorithm
(
an
operation
which
reduces
the
redundancy
,
search
time
and
possibly
space
)
.
This
problem
is
usually
handled
by
adding
special
symbols
to
the
dictionary
to
remove
the
ambiguity
in
order
to
be
able
to
apply
the
determinization
algorithm
(
Koskenniemi
,
1990
)
.
Nevertheless
,
when
we
compose
the
dictionary
with
the
phonological
rules
,
we
must
take
into
account
special
symbols
.
This
complicates
the
construction
of
transducers
representing
these
rules
and
leads
to
size
explosion
.
It
would
be
simpler
to
compose
the
rules
with
the
dictionary
,
then
remove
the
ambiguity
in
the
result
and
then
apply
the
determinization
algorithm
.
2
Notations
and
definitions
Formally, a weighted transducer over a semiring $\mathbb{K} = (K, \oplus, \otimes, \bar{0}, \bar{1})$ is defined as a 6-tuple $T = (Q, I, \Sigma_1, \Sigma_2, E, F)$, where $Q$ is a finite set of states, $I \subseteq Q$ is a finite set of initial states, $\Sigma_1$ is the input alphabet, $\Sigma_2$ is the output alphabet, $E$ is a finite set of transitions, and $F \subseteq Q$ is a finite set of final states.
A transition is an element of $Q \times \Sigma_1 \times \Sigma_2 \times Q \times K$. Transitions are of the form $t = (p(t), i(t), o(t), n(t), w(t))$, where $p(t)$ denotes the transition's origin state, $i(t)$ its input label, $o(t)$ its output label, $n(t)$ the transition's destination state, and $w(t) \in K$ the weight of $t$.
The tropical semiring, defined as $(\mathbb{R}_+ \cup \{\infty\}, \min, +, \infty, 0)$, is commonly used in speech recognition, but our results apply to general semirings as well.
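As an illustration of the tropical semiring just defined, the following minimal sketch shows how weights would combine along a path (with $\otimes = +$) and across alternative paths (with $\oplus = \min$). The function names `path_weight` and `combine_paths` are hypothetical and are not taken from this work.

```python
import math

# Tropical semiring (R+ U {inf}, min, +, inf, 0).
ZERO = math.inf  # identity of the "plus" operation (min)
ONE = 0.0        # identity of the "times" operation (+)


def oplus(a, b):
    """Semiring sum: keep the cheaper alternative."""
    return min(a, b)


def otimes(a, b):
    """Semiring product: accumulate cost along a path."""
    return a + b


def path_weight(weights):
    """Weight of one path, given the weights of its transitions."""
    total = ONE
    for w in weights:
        total = otimes(total, w)
    return total


def combine_paths(paths):
    """Combined weight of several alternative paths (least cost)."""
    best = ZERO
    for weights in paths:
        best = oplus(best, path_weight(weights))
    return best
```

With these definitions, an empty set of paths has weight `ZERO` (infinite cost), which matches the role of $\infty$ as the neutral element of $\min$.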
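A path $\pi = t_1 \cdots t_n$ is a sequence of transitions such that $n(t_{i-1}) = p(t_i)$ for $2 \le i \le n$. We can easily extend the functions $p$ and $n$ to paths: $p(\pi) = p(t_1)$ and $n(\pi) = n(t_n)$.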
We
denote
by
P
(
r
,
s
)
the
set
of
paths
whose
origin
is
state
r
and
whose
destination
is
state
s.
We can also extend the functions $i$ and $o$ to paths by concatenating the input and output symbols: $i(\pi) = i(t_1) \cdots i(t_n)$ and $o(\pi) = o(t_1) \cdots o(t_n)$.
Definition 1 (unambiguous transducer, (Berstel, 1979)) A transducer $T$ is said to be unambiguous if for each $w \in \Sigma_1^*$ there exists at most one path $\pi$ in $T$ such that $i(\pi) = w$.
Definition 2 (ambiguous paths) Two paths $\pi$ and $\sigma$ are ambiguous if $\pi \neq \sigma$ and $i(\pi) = i(\sigma)$.
Remark 1: To remove the ambiguity between two paths $\pi$ and $\sigma$, it suffices to modify $i(\pi)$ by changing the first input label of the path $\pi$. This is done by introducing an auxiliary symbol such that $i(\pi) \neq i(\sigma)$.
Figure
1a
shows
an
ambiguous
transducer
.
It
is
ambiguous
since
for
the
input
string
"
s
e
[
z
]
"
,
there
are
two
paths
representing
the
output
strings
{
ces
,
ses
}
.
In
this
figure
,
"
eps
"
stands
for
epsilon
or
null
symbol
.
To
disambiguate
a
transducer
,
we
first
group
the
ambiguous
paths
;
we
then
remove
the
ambiguity
in
each
group
by
adding
auxiliary
labels
as
shown
in
Figure
1b
.
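The grouping step described above can be sketched as follows for an explicit, finite list of paths. This is only an illustration: the representation of a path as a pair of label tuples, the function name `disambiguate_paths`, and the naming scheme for auxiliary symbols (first label, separator, rank) are assumptions, and the separator is assumed not to occur in the original alphabet.

```python
from collections import defaultdict


def disambiguate_paths(paths, aux_sep="#"):
    """paths: list of (input_labels, output_labels), each a tuple of strings.

    Within each group of paths sharing the same input string, the first
    path is left untouched and every other path has its first input label
    replaced by a fresh auxiliary symbol (as in Remark 1).
    Returns (new_paths, aux_symbols).
    """
    groups = defaultdict(list)
    for index, (inputs, _outputs) in enumerate(paths):
        groups[tuple(inputs)].append(index)

    new_paths = list(paths)
    aux_symbols = []
    for inputs, members in groups.items():
        if len(members) != 1 and not inputs:
            raise ValueError("cannot relabel a path with no input label")
        for rank, index in enumerate(members[1:], start=1):
            aux = f"{inputs[0]}{aux_sep}{rank}"  # hypothetical naming scheme
            aux_symbols.append(aux)
            new_paths[index] = ((aux,) + tuple(inputs[1:]), paths[index][1])
    return new_paths, aux_symbols


# The two ambiguous paths of the "s e [z]" example.
example = [
    (("s", "e", "[z]"), ("ces",)),
    (("s", "e", "[z]"), ("ses",)),
]
relabelled, aux = disambiguate_paths(example)
```

After the call, the two paths no longer share an input string, so the pair is unambiguous in the sense of Definition 2. The exact auxiliary labels used in Figure 1b may differ from this naming scheme.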
Unfortunately
,
it
is
infeasible
to
enumerate
all
the
paths
in
a
cyclic
transducer
.
However, (Smaili, 2001) shows that cyclic transducers of the type studied in this work can be disambiguated by transforming $T$ into a corresponding acyclic sub-transducer of $T$. This fundamental property is described in detail in Section 2.1. Accordingly, we apply the appropriate transformation to the input transducer.

Figure 1: (a) Ambiguous transducer (b) Disambiguated transducer
2.1
Fundamental
Property
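Any cycle in $T$ contains at least one transition $t$ whose input label $i(t)$ belongs to a distinguished subset of $\Sigma_1$. We write $E_1$ for the set of such transitions and $E_0$ for the remaining ones, so that $E = E_0 \uplus E_1$.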
We
can
give
a
characterization
of
the
ambiguous
paths
verifying
the
fundamental
property
.
Before
,
let
's
make
the
following
remark
:
with
n
g
E+
,
f
g
E+
for
1
&lt;
i
&lt;
n
,
/
0
g
Ej
*
and
n0
g
Eq
if
n
&gt;
1
.
If
n
=
0
then
n
=
/
0
n0
.
Proposition
1
(
characterization
of
ambiguous
paths
)
ai
and
ni
are
ambiguous
(
0
&lt;
i
&lt;
n
)
.
fj
and
gi
are
ambiguous
(
0
&lt;
i
&lt;
n
)
.
We
will
assume
that
the
first
transition
's
path
belongs
to
E0
,
i.e.
f0
=
e.
Recall
that
if
we
want
to
avoid
cycles
,
we
just
have
to
remove
from
T
all
transitions
$t \in E_1$.
According
to
Proposition
1
,
ambiguity
needs
to
be
removed
only
in
paths
that
use
transitions
$t \in E_0$
,
namely
the
path
ni
that
performs
the
decomposition
given
in
Remark
2
.
Disambiguation
consists
only
of
introducing
auxiliary
labels
in
the
ambiguous
paths
.
We denote by $A_{src}$ the set of origin states of transitions belonging to $E_1$ and by $A_{dst}$ the set of destination states of transitions belonging to $E_1$.
According to Proposition 1 and the preceding discussion, it is equivalent and simpler to disambiguate the acyclic transducer obtained from $T$ by removing all $E_1$ transitions. Therefore, we introduce the operator $\Psi : \{T_{in}\} \to \{T_{out}\}$, which accomplishes this construction.
$I_1 = I \cup A_{dst} \cup \{i\}$, with $i \notin Q$.
$F_1 = F \cup A_{src} \cup \{f\}$, with $f \notin Q$.
$E_T = (E \setminus E_1) \cup \{(i, q, \varepsilon, \varepsilon, 0) : q \in I_1\} \cup \{(q, f, \varepsilon, \varepsilon, 0) : q \in F_1\}$.
The third condition ensures the connectivity of $\Psi(T)$ if $T$ is itself connected. It suffices to disambiguate the acyclic transducer $\Psi(T)$, then reinsert the transitions of $E_1$ in $\Psi(T)$. The set of paths in $\Psi(T)$ is then $P(I_1, F_1)$.
$T = (Q, i, X, Y, E, F)$ is an ambiguous transducer verifying the fundamental property.
$T_1 = (Q, i, X \cup X_1, Y, E_T, F)$ is an unambiguous transducer, where $X_1$ is the set of auxiliary symbols.
1. $T_{acyclic} \leftarrow \Psi(T)$.
2. $Path \leftarrow$ set of paths of $T_{acyclic}$.
3. Disambiguate the set $Path$ (creating the set $X_1$).
4. $T_0 \leftarrow$ the unambiguous transducer built from the disambiguated paths.
5. $T_1 \leftarrow \Psi^{-1}(T_0)$ (this consists of reinserting in $T_0$ the transitions of $T$ which were removed).
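The first two steps of this procedure can be sketched as follows. This is one possible reading, not the authors' implementation: transitions are tuples `(origin, input, output, destination, weight)`, the cut set `e1` is passed in explicitly, and two fresh states named `"_start_"` and `"_end_"` (assumed not to be in $Q$) are linked by epsilon transitions to the old initial states plus $A_{dst}$ and from the old final states plus $A_{src}$. All names are hypothetical.

```python
EPS = "eps"


def psi(initials, finals, transitions, e1):
    """Cut the E1 transitions and add a fresh start and end state.

    Returns (start, end, kept_transitions).
    """
    e1 = set(e1)
    a_src = {t[0] for t in e1}  # origin states of cut transitions
    a_dst = {t[3] for t in e1}  # destination states of cut transitions
    start, end = "_start_", "_end_"  # assumed fresh, i.e. not in Q
    kept = [t for t in transitions if t not in e1]
    for q in sorted(set(initials) | a_dst, key=str):
        kept.append((start, EPS, EPS, q, 0.0))
    for q in sorted(set(finals) | a_src, key=str):
        kept.append((q, EPS, EPS, end, 0.0))
    return start, end, kept


def enumerate_paths(start, end, transitions):
    """List all paths from start to end; only terminates if acyclic."""
    outgoing = {}
    for t in transitions:
        outgoing.setdefault(t[0], []).append(t)
    paths = []

    def walk(state, prefix):
        if state == end:
            paths.append(list(prefix))
            return
        for t in outgoing.get(state, []):
            prefix.append(t)
            walk(t[3], prefix)
            prefix.pop()

    walk(start, [])
    return paths


# Toy dictionary: state 0 is both initial and final; "#" ends a word.
toy = [
    (0, "s", EPS, 1, 0.0),
    (1, "e", EPS, 2, 0.0),
    (2, "#", "ces", 0, 0.0),
    (2, "#", "ses", 0, 0.0),
]
cut = [t for t in toy if t[1] == "#"]
start, end, acyclic = psi({0}, {0}, toy, cut)
all_paths = enumerate_paths(start, end, acyclic)
```

In the toy example, cutting the two `#` transitions breaks the only cycle, so path enumeration terminates. Reinserting the cut transitions afterwards corresponds to the last step of the procedure and is not shown here.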
Now
,
we
will
study
an
important
class
of
transducers
verifying
the
fundamental
property
.
This class is obtained by composing a transducer $D$ verifying the fundamental property with a transducer $R$.
The
composition
of
two
transducers
is
an
efficient
algebraic
operation
for
building
more
complex
transducers
.
We
give
a
brief
definition
of
composition
and
the
fundamental
theorem
that
ensures
the
invariance
of
the
fundamental
property
by
composition
.
3
Composition
The
transducer
T
created
by
the
composition
of
two
transducers
R
and
D
,
denoted
T
=
R
o
D
,
performs
the
mapping
of
word
x
to
word
z
if
and
only
if
R
maps
x
to
y
and
D
maps
y
to
z.
The
weight
of
the
resulting
word
is
the
$\otimes$-product
of
the
weights
of
y
and
z
(
Pereira
and
Riley
,
1997
)
.
Note
that
,
in
order
to
make
the
composition
possible
,
we
must
have
o
(
t
)
=
i
(
e
)
.
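To make the matching condition concrete, here is a minimal sketch of composition for epsilon-free transducers over the tropical semiring, where the $\otimes$-product of weights is ordinary addition. The dictionary layout (`initial`, `final`, `transitions`) and the function name `compose` are assumptions made for the example. Epsilon handling, which a real implementation needs, is deliberately left out.

```python
def compose(r, d):
    """Compose two epsilon-free transducers (tropical weights).

    Each transducer is a dict with keys "initial" (set of states),
    "final" (set of states) and "transitions", a list of tuples
    (origin, input, output, destination, weight).
    """
    initial = {(p, q) for p in r["initial"] for q in d["initial"]}
    seen = set(initial)
    stack = list(initial)
    transitions = []
    while stack:
        p, q = stack.pop()
        for (r_src, r_in, r_out, r_dst, r_w) in r["transitions"]:
            if r_src != p:
                continue
            for (d_src, d_in, d_out, d_dst, d_w) in d["transitions"]:
                # A pair of transitions matches when the output label of
                # the first equals the input label of the second.
                if d_src != q or d_in != r_out:
                    continue
                target = (r_dst, d_dst)
                transitions.append(((p, q), r_in, d_out, target, r_w + d_w))
                if target not in seen:
                    seen.add(target)
                    stack.append(target)
    final = {s for s in seen if s[0] in r["final"] and s[1] in d["final"]}
    return {"initial": initial, "final": final, "transitions": transitions}


# R maps "a" to "b" with weight 1.0; D maps "b" to "c" with weight 2.0.
R = {"initial": {0}, "final": {1}, "transitions": [(0, "a", "b", 1, 1.0)]}
D = {"initial": {0}, "final": {1}, "transitions": [(0, "b", "c", 1, 2.0)]}
T = compose(R, D)
```

Only state pairs reachable from the initial pairs are explored, which keeps the result free of unreachable states.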
Definition 4 (Composition) $E = \{e_R \circ e_S : e_R \in E_R, e_S \in E_S\}$.
Let $D = (Q_D, I_D, Y, Z, E_D, F_D)$ be a transducer verifying the fundamental property. We can write $Y = Y_0 \uplus Y_1$, where $Y_0 = \{i(t) : t \in E_0\}$ and $Y_1 = \{i(t) : t \in E_1\}$.
Theorem 1 (Fundamental) Let $R$ be a transducer satisfying condition (C): $\forall t \in E_R,\ o(t) \in Y_1 \Rightarrow i(t) \in Y_1$. Then the transducer $T = R \circ D$ verifies the fundamental property.
$\pi = \pi_R \circ \pi_D = (f_1 \circ g_1) \cdots (f_n \circ g_n)$.
$S \circ (R \circ D) = (S \circ R) \circ D$.
$T_m = R_m \circ R_{m-1} \circ \cdots \circ R_1 \circ D$.
To this end, we proceed as follows: we add the auxiliary symbols to disambiguate the transducer, then we apply determinization, and finally we remove the auxiliary labels. These three operations are denoted by $\Theta$:
$$T_i = \begin{cases} \Theta(D) & \text{if } i = 0, \\ \Theta(R_i \circ \Theta(T_{i-1})) & \text{if } i \ge 1. \end{cases}$$
The size of transducer $T_m$ can also be reduced by computing $T_m = \Theta(R_m \circ R_{m-1} \circ \cdots \circ R_1 \circ D)$.
The old approach, $T_m = R_m \circ R_{m-1} \circ \cdots \circ R_1 \circ D$ with every transducer disambiguated before composition, has several disadvantages.
The
size
of
$R_i$ for $1 \le i \le m$
increases
considerably
since
the
auxiliary
labels
introduced
in
each
transducer
have
to
be
taken
into
account
in
all
others
.
This
fact
limits
the
number
of
transducers
that
can
be
composed
with
D.
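The incremental construction described in this section, in which the combined operator (disambiguate, determinize, remove auxiliary labels) is applied after each composition, can be sketched as a simple loop. The function `build_chain` and its parameters `compose` and `reduce_op` are hypothetical: the two operations are passed in as callables, so the sketch says nothing about how either is implemented.

```python
def build_chain(d, rules, compose, reduce_op):
    """Compose rules R_1..R_m onto d, reducing after every step.

    compose(r, t) stands for the composition of r with t; reduce_op(t)
    stands for the combined disambiguate, determinize and clean-up
    operator. Both are supplied by the caller.
    """
    t = reduce_op(d)  # base case: reduce the dictionary alone
    for r in rules:   # rules = [R_1, ..., R_m], applied in order
        t = reduce_op(compose(r, t))
    return t


# Symbolic stand-ins that only record the order of operations.
trace = build_chain(
    "D",
    ["R1", "R2"],
    compose=lambda r, t: f"({r} o {t})",
    reduce_op=lambda t: f"Th[{t}]",
)
```

Because each intermediate result has its auxiliary labels removed before the next rule is composed, a rule transducer in this scheme never has to account for the auxiliary labels of the others.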
4
Application
and
Results
We
will
now
apply
our
algorithm
to
transducers
involved
in
speech
recognition
.
Transducer
D
represents
the
pronunciation
dictionary
and
possesses
the
fundamental
property
.
The
set
of
transitions
of
D
is
defined
as
where
$f$
is
the
unique
final
state
of
D
,
0
is
the
unique
initial
state
of
D
,
x
is
any
symbol
and
#
is
a
symbol
representing
the
end
of
a
word
.
All
transitions
t
G
E0
are
such
that
i
(
t
)
=
#
.
Any
path
n
in
is
acyclic
.
The
transducer
R
representing
a
phonological
rule
is
constructed
to
fulfill
condition
(
C
)
of
the
fundamental
theorem
.
The
transducer
D
represents
a
French
dictionary
with
20000
words
and
their
pronunciations
.
The
transducer
R
represents
the
phonological
rule
that
handles
liaison
in
the
French
language
.
This
liaison
,
which
is
represented
by
a
phoneme
appearing
at
the
end
of
some
words
,
must
be
removed
when
the
next
word
begins
with
a
consonant
since
the
liaison
phoneme
is
never
pronounced
in
that
case
.
However
,
if
the
next
word
begins
with
a
vowel
,
the
liaison
phoneme
may
or
may
not
be
pronounced
and
thus
becomes
optional
.
Figure
2
:
Transducer
used
to
handle
the
optional
liaison
rule
.
Figure
2
shows
the
transducer
that
handles
this
rule
.
In
the
figure
,
p
denotes
all
phonemes
,
v
the
vowels
and
[
x
]
the
liaison
phonemes
.
Table
1
shows
the
results
of
our
algorithm
using
the
dictionary
and
the
phonological
rule
previously
described
.
Table 1: Size reduction on a French dictionary (columns: Transducer, Transitions)
As
we
can
see
in
Table
1
,
the
operator
$\Theta$
produces
a
smaller
transducer
in
all
the
cases
considered
here
.
5
Conclusion
and
future
work
We
have
been
able
to
disambiguate
an
important
class
of
cyclic
and
ambiguous
transducers
,
which
allows
us
to
apply
the
determinization
algorithm
(
Mohri
,
1997
)
;
and
then
to
reduce
the
size
of
those
transducers
.
With
our
new
approach
,
we
do
not
have
to
take
into
account
the
number
of
transducers
Ri
and
their
auxiliary
labels
as
was
the
case
with
the
approach
used
before
.
Thus
,
new
transducers
Ri
such
as
phonological
rules
can
be
easily
inserted
in
the
chain
.
The
major
disadvantage
of
our
approach
is
that
disambiguating
a
transducer
increases
its
size
systematically
.
Our
future
work
will
consist
of
developing
a
more
effective
algorithm
for
disambiguating
an
acyclic
transducer
.
