Textual records of business-oriented conversations between customers and agents need to be analyzed properly to acquire useful business insights that improve productivity. For such an analysis, it is critical to identify appropriate textual segments and expressions to focus on, especially when the textual data consists of complete transcripts, which are often lengthy and redundant. In this paper, we propose a method to identify important segments from the conversations by looking for changes in the accuracy of a categorizer designed to separate different business outcomes. We extract effective expressions from the important segments to define various viewpoints. In text mining, a viewpoint defines the important associations between key entities, and it is crucial that the correct viewpoints are identified. We show the effectiveness of the method by using real datasets from a car rental service center.
1 Introduction

"Contact center" is a general term for customer service centers, help desks, and information phone lines. Many companies operate contact centers to sell their products, handle customer issues, and address product-related and services-related issues. In contact centers, analysts try to get insights for improving business processes from stored customer contact data. Gigabytes of customer contact records are produced every day in the form of audio recordings of speech, transcripts, call summaries, email, etc. Though analysis by experts results in insights that are very deep and useful, such analysis usually covers only a very small fraction (1-2%) of the total call volume and yet requires a significant workload. The demand for extracting trends and knowledge from the whole text data collection by using text mining technology is therefore increasing rapidly.

In order to acquire valuable knowledge through text mining, it is generally critical to identify important expressions to be monitored and compared within the textual data. For example, given a large collection of contact records at the contact center of a manufacturer, the analysis of expressions for products and expressions for problems often leads to business value by identifying specific problems in a specific product. If 30% of the contact records with expressions for a specific product such as "ABC" contain expressions about a specific trouble such as "cracked", while the expressions about the same trouble appear in only 5% of the contact records for similar products, then this should be a clue that the product "ABC" may actually have a crack-related problem. An effective way to facilitate this type of analysis is to register important expressions such as "ABC" and "cracked" in a lexicon, associated respectively with categories such as "product" and "problem", so that the behavior of terms in the same category can be compared easily.
Identifying such relevant expressions and their categories, which can potentially lead to valuable insights, is actually one of the most important steps of text mining. A failure in this step often leads to a failure in the text mining. Also, it has been considered an artistic task that requires highly experienced consultants to define such categories, which are often described as the viewpoint for doing the analysis, and their corresponding expressions through trial and error.

Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pp. 458-467, Prague, June 2007. © 2007 Association for Computational Linguistics
In this paper, we propose a method to identify important segments of textual data for analysis from full transcripts of conversations. Compared to the written summary of a conversation, a transcription of an entire conversation tends to be quite lengthy and contains various forms of redundancy. Many of the terms appearing in the conversation are not relevant for a specific analysis. For example, the terms for greetings such as "Hello" and "Welcome to (Company A)" are unlikely to be associated with specific business results such as purchased-or-not and satisfied-or-not, especially because the conversation is transcribed without preserving nonverbal moods such as tone of voice, emotion, etc. Thus it is crucial to identify key segments and notable expressions within conversations for analysis to acquire valuable insights.

We exploit the fact that business conversations follow set patterns, such as an opening followed by a request, and the confirmation of details followed by a closing. By taking advantage of this feature of business conversations, we have developed a method to identify key segments and the notable expressions within conversations that tend to discriminate between the business results. Such key segments, the trigger segments, and the notable expressions associated with certain business results lead us to easily understand appropriate viewpoints for analysis. Application of our method to nearly one thousand conversations from a rental car reservation office enabled us to acquire novel insights for improving agent productivity and resulted in an actual increase in revenues.

Organization of the Paper: We start by describing the properties of the conversation data used in this paper. Section 3 describes the method for identifying useful viewpoints and expressions that meet the specified purpose. Section 4 provides the results using conversational data. After the discussion in Section 5, we conclude the paper in Section 6.
2 Business-Oriented Conversation Data

We consider business-oriented conversation data collected at contact centers handling inbound telephone sales and reservations. Such business-oriented conversations have the following properties.

• Each conversation is a one-to-one interaction between a customer and an agent.
• For many contact center processes the conversation flow is well defined in advance.
• There is a fixed number of outcomes and each conversation has one of these outcomes.

For example, in car rentals, the following conversation flow is pre-defined for the agent. In practice most calls to a car rental center follow this call flow.

• Opening - contains greeting, brand name, name of agent.
• Pick-up and return details - agent asks location, dates and times of pick up and return, etc.
• Offering car and rate - agent offers a car specifying the rate and mentions applicable special offers.
• Personal details - agent asks for the customer's information such as name, address, etc.
• Confirm specifications - agent recaps reservation information such as name, location, etc.
• Mandatory enquiries - agent verifies clean driving record, valid license, etc.
• Closing - agent gives the confirmation number and thanks the customer for calling.
In these conversations the participants speak in turns and the segments can be clearly identified. Figure 1 shows part of a transcribed call. Each call has a specific outcome. For example, each car rental transaction has one of two call types, reservation or unbooked, as an outcome. Because the call process is pre-defined, the conversations look similar in spite of having different results. In such a situation, finding the differences in the conversations that have effects on the outcomes is very important, but it is very expensive and difficult to find such unknown differences by human analysis. We show that it is possible to define proper viewpoints and corresponding expressions leading to insights on how to change the outcomes of the calls.
AGENT: Welcome to CarCompanyA. My name is Albert. How may I help you?
AGENT: Allright may i know the location you want to pick the car from.
CUSTOMER: Aah ok I need it from SFO.
AGENT: For what date and time.
AGENT: Wonderful so let me see ok mam so we have a 12 or 15 passenger van avilable on this location on those dates and for that your estimated total for those three dates just 300.58$ this is with Taxes with surcharges and with free unlimited free milleage.
AGENT: alright mam let me recap the dates you want to pick it up from SFO on 3rd August and drop it off on august 6th in LA alright
CUSTOMER: and one more questions Is it just in states or could you travel out of states
AGENT: The confirmation number for your booking is 221 384.
CUSTOMER: ok ok Thank you
AGENT: Thank you for calling CarCompanyA and you have a great day good bye

Figure 1: Transcript of a car rental dialog (partial)
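Because each turn in a transcript like Figure 1 begins with a speaker label, the turn structure can be recovered with a simple pattern match. A minimal sketch; the `AGENT:`/`CUSTOMER:` label format and the function name are illustrative assumptions, not part of the paper's system:

```python
import re

def parse_turns(transcript):
    """Split a transcript into (speaker, utterance) turns.

    Assumes each turn starts with an upper-case speaker label
    followed by a colon, e.g. "AGENT:" or "CUSTOMER:".
    """
    parts = re.split(r'\b(AGENT|CUSTOMER)\s*:', transcript)
    turns = []
    # re.split keeps the captured speaker labels at odd indices
    for label, text in zip(parts[1::2], parts[2::2]):
        utterance = text.strip()
        if utterance:
            turns.append((label, utterance))
    return turns

text = "AGENT: Welcome to CarCompanyA. CUSTOMER: I need a van."
print(parse_turns(text))
```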
3 Trigger Segment Detection and Effective Expression Extraction

In this section, we describe a method for automatically identifying valuable segments and concepts from the data for the user-specified difference analysis. First, we present a model to represent the conversational data. After that we introduce a method to detect the segments where the useful concepts for the analysis appear. Finally, we select useful expressions in each detected trigger segment.
3.1 Conversation Data Model

Each conversational data record in the collection D is defined as d_i. Each d_i can be seen as a sequence of conversational turns, and so d_i can be divided as

d_i = d_i^1 + d_i^2 + ... + d_i^{M_i},

where d_i^k is the k-th turn in d_i and M_i is the total number of turns in d_i. The + operator in the above equation can be seen as an equivalent of the string concatenation operator.
We define d~_i^j as the portion of d_i from the beginning to turn j:

d~_i^j = d_i^1 + d_i^2 + ... + d_i^j.

Using the same notation, the collection of the d~_i^{m_k} constitutes the Chronologically Cumulative Data up to turn m_k, denoted D^k:

D^k = { d~_1^{m_k}, d~_2^{m_k}, ... }.

Figure 2 shows an image of the data model. We set some cut-off turns m_k and prepare the chronologically cumulative data sets as shown in Figure 3. We represent binary mutually exclusive business outcomes, such as success and failure, resulting from the conversations as "A" and "not A".

Figure 2: Conversation data model
Figure 3: Chronologically cumulative conversational data
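The construction of the chronologically cumulative data sets D^k can be sketched in a few lines. Here each conversation is simply a list of turn strings, and `cumulative_data` is a hypothetical helper name:

```python
def cumulative_data(conversations, turn_counts):
    """Build the chronologically cumulative data sets D^k.

    conversations: list of conversations, each a list of turn strings.
    turn_counts: the increasing cut-off turns m_1 < m_2 < ...
    Returns a dict mapping each m_k to the list of prefix documents,
    i.e. each conversation concatenated from turn 1 up to turn m_k.
    """
    return {mk: [" ".join(conv[:mk]) for conv in conversations]
            for mk in turn_counts}

convs = [["hello", "i need a van", "the rate is 300", "bye"],
         ["hi", "want to check a rate", "ok", "thanks"]]
D = cumulative_data(convs, [1, 2, 4])
print(D[2])  # each conversation cut after turn 2
```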
3.2 Trigger Segment Detection

Trigger segments can be viewed as portions of the data which have important features that distinguish data of class "A" from data of class "not A". To detect such segments, we divide each chronologically cumulative data set D^k into two data sets, training data D^k_training and test data D^k_test. Starting from D^1, for each D^k we trained a classifier on D^k_training and evaluated it on D^k_test. Using accuracy, the fraction of correctly classified documents, as a metric of performance (Yang and Liu, 1999), we denote the evaluation result of the categorization as acc(categorizer(D^k)) for each D^k and plot it against its turn.
Figure 4 shows the effect of gradually increasing the training data for the classification. The distribution of expressions in a business-oriented conversation will change almost synchronously across calls because the call flow is predefined. Therefore acc(categorizer(D^k)) will increase if features that contribute to the categorization appear in D^k. In contrast, acc(categorizer(D^k)) will decrease if no features that contribute to the categorization are in D^k. Therefore, from the transitions of acc(categorizer(D^k)), we can identify the segments with increases as triggers where the features that have an effect on the outcome appear. We denote a trigger segment as seg(start position, end position). Because the total numbers of turns can be different, we do not detect the last section as a trigger. In Figure 4, seg(m_1, m_2) and seg(m_4, m_5) are triggers.
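Given the accuracies at each cut-off, trigger detection reduces to scanning consecutive cut-offs for increases while skipping the final segment. A minimal sketch; the function name and example numbers are illustrative:

```python
def trigger_segments(turn_points, accuracies):
    """Return segments seg(m_i, m_{i+1}) where acc(categorizer(D^k))
    increases from one cut-off to the next.  The final segment is
    skipped, since conversations have different total turn counts."""
    triggers = []
    for i in range(len(turn_points) - 2):  # exclude the last segment
        if accuracies[i + 1] > accuracies[i]:
            triggers.append((turn_points[i], turn_points[i + 1]))
    return triggers

print(trigger_segments([1, 2, 5, 10, 15, 20],
                       [0.60, 0.72, 0.65, 0.64, 0.80, 0.81]))
# → [(1, 2), (10, 15)]
```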
It is important to note that using the cumulative data is key to the detection of trigger segments. Using non-cumulative segment data would give us the categorization accuracy for the features within that segment, but would not tell us whether the features of this segment are improving the accuracy or decreasing it. It is this gradient information between segments that is key to identifying trigger segments.
Many approaches have been proposed for document classification (Yang and Liu, 1999). In this research, however, we are not interested in the classification accuracy itself but in the increase and decrease of the accuracy within particular segments. For example, the greeting, or the particular method of payment, may not affect the outcome, but the mention of a specific feature of the product may have an effect on the outcome. Therefore in our research we are interested in identifying the particular portion of the call where this product feature is mentioned, along with its mention, which has an effect on the outcome of the call. In our experiments we used the SVM (Support Vector Machine) classifier (Joachims, 1998), but almost any classifier should work because our approach does not depend on the classification method.
3.3 Effective Expression Extraction

In this section, we describe our method to extract effective expressions from the detected trigger segments. The effective expressions in D^k are those which are representative in the selected documents and appear for the first time in the trigger segments seg(m_i, m_j). Numerous methods to select features exist (Hisamitsu and Niwa, 2002) (Yang and Pedersen, 1997). We use the χ² statistic for each expression in D^k as a representativeness metric.
For the two-by-two contingency table of an expression w and the outcome classes (Table 1), with cells a, b, c, d, the statistic is

χ²(w) = N(ad - bc)² / ((a + b)(c + d)(a + c)(b + d)),

where N is the number of documents.

Table 1: Contingency table for calculating the χ² statistic
            # of documents including w    # of documents not including w
"A"         a                             b
"not A"     c                             d

This statistic can be compared to the χ² distribution with one degree of freedom to judge representativeness.
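The χ² statistic over the 2×2 counts can be computed with the standard closed form. A small sketch; the cell names a, b, c, d follow the usual convention for such tables and the example numbers are illustrative:

```python
def chi_square(a, b, c, d):
    """Chi-square statistic for a 2x2 contingency table:
    a/b = docs in class "A" with/without w, c/d = the same for "not A".
    Standard closed form: N(ad - bc)^2 / ((a+b)(c+d)(a+c)(b+d))."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0

# e.g. a word much more frequent in one class than the other
print(chi_square(30, 70, 5, 95))
```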
We also want to extract the expressions that have not had an effect on the outcome before D^k. To detect the new expressions in D^k, we define the metric

new(w) = sign(w(D^k_A) - w(D^k_notA)) · w(D^k) / max(w(D^{k-1}), (m_{k-1}/m_k) · w(D^k)),

where w(D^k) is the frequency of expression w in the chronologically cumulative data D^k, max(a, b) selects the larger value of its arguments, m_k is the number of turns in D^k, w(D^k_A) is the frequency of w in D^k with the outcome of the corresponding data being "A", and sign() is the signum function. When w in class "A" appears in D^k much more frequently than in D^{k-1}, compared with the ratio of their turns, this metric will be more than 1.
We detect significant expressions by considering the combined score χ²(w) · new(w). Using this combined score, we can filter out the representative expressions that have already appeared before D^k and distinguish significant expressions that first appear in D^k for each class "A" and "not A".
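The combined filtering step can be sketched as follows. The `novelty` function follows the verbal description of new(w) above (a turn-ratio-scaled frequency comparison with a sign from the favored class) rather than reproducing the paper's exact formula, and all names are illustrative:

```python
def novelty(freq_k, freq_prev, freq_k_A, freq_k_notA, m_k, m_prev):
    """new(w): exceeds 1 when w occurs in D^k much more often than in
    D^{k-1} relative to the turn ratio m_{k-1}/m_k; the sign marks
    whether w favors class "A" (+) or "not A" (-)."""
    sign = 1 if freq_k_A >= freq_k_notA else -1
    baseline = max(freq_prev, (m_prev / m_k) * freq_k)
    return sign * freq_k / baseline if baseline else 0.0

def rank_by_combined_score(scores):
    """scores: {expression: (chi2, new)} -> expressions sorted by
    the combined score chi2 * new, highest first."""
    return sorted(scores, key=lambda w: scores[w][0] * scores[w][1],
                  reverse=True)

# an expression first appearing in a trigger segment gets a high score
scores = {"discount": (9.0, 1.5), "hello": (0.4, 1.0)}
print(rank_by_combined_score(scores))  # → ['discount', 'hello']
```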
3.4
Appropriate
Viewpoint
Selection
In
a
text
mining
system
,
to
get
an
association
that
leads
to
a
useful
insight
,
we
have
to
deine
appropriate
viewpoints
.
Viewpoints
refer
to
objects
in
relation
to
other
objects
.
In
analysis
using
a
conventional
text
mining
system
(
Nasukawa
and
Nagano
,
2001
)
,
the
viewpoints
are
selected
based
on
expressions
in
user
dictionaries
prepared
by
domain
experts
.
We
have
identiied
important
segments
of
the
conversations
by
seeing
changes
in
the
accuracy
ofa
categorizer
designed
to
segregate
different
business
outcomes
.
We
have
also
been
able
to
extract
effective
expressions
from
these
trigger
segments
to
deine
various
viewpoints
.
Hence
,
viewpoint
selection
is
now
based
on
the
trigger
segments
and
effective
expressions
identiied
automatically
based
on
speci-ied
business
outcomes
.
In
the
next
section
we
apply
our
technique
to
a
real
life
dataset
and
show
that
we
can
successfully
select
useful
viewpoints
.
4 Experiments and Results

4.1 Experiment Data and System

We collected 914 recorded calls from the car rental help desk and manually transcribed them. Figure 1 shows part of a call that has been transcribed. There are three types of calls:

Reservation Calls: Calls which got converted. Here, "converted" means the customer made a reservation for a car. Reserved cars can get picked up or not picked up, so some reserved cars do not eventually get picked up by customers (no-shows and cancellations).

Unbooked Calls: Calls which did not get converted.

Service Calls: Customers changing or enquiring about a previous booking.

The distribution of the calls is given in Table 2.

Table 2: Distribution of calls (Unbooked Calls; Reservation Calls (Picked-Up); Reservation Calls (Not Picked-Up); Service Calls; Total Calls)
The reservation calls are the most important in this context, so we focus on those 137 calls. In the reservation calls, there are two types of outcomes, car picked up and car not picked up. All reservation calls look similar in spite of having different outcomes (in terms of pick up). The reservation happens during the call but the pick up happens at a later date. If we can find differences in the conversation that affect the outcome, it is expected that we could improve the agent productivity. Reservation calls follow the pre-defined reservation call flow that we mentioned in Section 2 and it is very difficult to find differences between them manually. In this experiment, by using the proposed method, we try to extract trigger segments and expressions to find viewpoints that affect the outcome of the reservation calls.
For the analysis, we constructed a text mining system for the difference analysis of "picked-up" vs. "not picked-up". The experimental system consists of two parts, an information extraction part and a text mining part. In the information extraction part we define dictionaries and templates to identify useful expressions. In the text mining part we define appropriate viewpoints based on the identified expressions to get useful associations leading to useful insights.
4.2 Results of Trigger Segment Detection and Effective Expression Extraction

Figure 5 shows the transitions of acc(categorizer(D^k)) over the turns; seg(1, 2) and seg(10, 15) are detected as trigger segments. We now know that these segments are highly correlated to the outcome of the call. For each detected trigger segment, we extract effective expressions in each class using the metric described in Section 3.3. Table 3 shows some expressions with high values for the metric for each trigger. In this table, "just NUMERIC dollars" is a canonical expression, and an expression such as "just 160 dollars" is mapped to this canonical expression in the information extraction process.
From this result, in seg(1, 2), "make" and "reservation" are correlated with "pick up", and "rate" and "check" are correlated with "not-picked up".

Table 3: Selected expressions in trigger segments
seg(1, 2): make, return, tomorrow, assist, reservation, tonight
seg(10, 15): number, corporate program, contract, card, have, tax surcharge, just NUMERIC dollars, discount, customer club, good rate, economy go, impala
By looking at some documents containing these expressions, we found customer intention phrases such as "would like to make a reservation", "want to check a rate", etc. Therefore, it can be induced that the way a customer starts the call may have an impact on the outcome. From the expressions in seg(10, 15), it can be said that discount-related phrases and mentions of good rates by the agent can have an effect on the outcome.
We can directly apply the conventional methods for representative feature selection to D. The following expressions were selected as the top 20 expressions from the whole conversational data by using the χ² metric defined in (3): corporate program, contract, counter, September, mile, rate, economy, last name, valid driving license, BRAND NAME, driving, telephone, midsize, tonight, use, credit, moment, airline, afternoon.

From these results, we see that looking at the call as a whole does not point us to the fact that discount-related phrases, or the first customer utterance, affect the outcome. Detecting trigger segments and extracting important expressions from each trigger segment are key to identifying subtle differences between very similar looking calls that have entirely opposite outcomes.
4.3 Results of Text Mining Analysis using Selected Viewpoints and Expressions

From the detected segments and expressions we determined that the customer's first utterance, along with discount phrases and value selling phrases, affected the call outcomes. Under these hypotheses, we prepared the following semantic categories.

• Customer intention at start of call: From the customer's first utterance, we extract the following intentions based on the patterns.
- strong start: would like to make a booking, need to pick up a car, ...
- weak start: want to know the rate for vans, ...
Under our hypotheses, the customer with a strong start has the intention of booking a car and we classify such a customer as a booking_customer. The customer with a weak start usually just wants to know the rates and is classified as a rates_customer.

• Discount-related phrases: discount, corporate program, motor club, buying club, ... are registered into the domain dictionary as discount-related phrases.

• Value selling phrases: we extract phrases mentioning good rates and good vehicles by matching patterns related to such utterances.
- mentions of good rates: good rate, wonderful price, save money, just need to pay this low amount, ...
- mentions of good vehicles: good car, fantastic car, latest model, ...
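Pattern-based extraction of the start-of-call intention can be sketched with a few regular expressions. The patterns and names below are illustrative stand-ins for the dictionaries and templates in the experimental system, not the actual resources used:

```python
import re

# illustrative patterns only
STRONG_START = [r"would like to make a (booking|reservation)",
                r"need to pick up a car"]
WEAK_START = [r"want to (know|check) (the |a )?rates?"]

def classify_customer(first_utterance):
    """Map a customer's first utterance to a customer type."""
    text = first_utterance.lower()
    if any(re.search(p, text) for p in STRONG_START):
        return "booking_customer"   # strong start
    if any(re.search(p, text) for p in WEAK_START):
        return "rates_customer"     # weak start
    return "unknown"

print(classify_customer("Hi, I would like to make a reservation"))
# → booking_customer
```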
Using these three categories, we tried to find insights to improve agent productivity. Table 4 shows the result of a two-dimensional association analysis for the 137 reservation calls. This table shows the association between customer types, based on customer intention at the start of a call, and pick up information.

Table 4: Association between customer types (extracted from texts based on customer intent at start of call) and pick up information

From these results, 67% (47 out of 70) of the booking_customers picked up the reserved car and only 35% (13 out of 37) of the rates_customers picked it up.
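The association in Table 4 is simply a conditional pick-up rate per customer type. A sketch that reproduces the two percentages quoted in the text (the call list is synthesized from those counts; the function name is illustrative):

```python
from collections import Counter

def pickup_rate(calls):
    """calls: iterable of (customer_type, picked_up) pairs.
    Returns {customer_type: fraction of calls that were picked up}."""
    totals, picked = Counter(), Counter()
    for ctype, up in calls:
        totals[ctype] += 1
        picked[ctype] += up  # bool counts as 0/1
    return {t: picked[t] / totals[t] for t in totals}

calls = ([("booking_customer", True)] * 47 + [("booking_customer", False)] * 23
         + [("rates_customer", True)] * 13 + [("rates_customer", False)] * 24)
rates = pickup_rate(calls)
print(round(rates["booking_customer"], 2), round(rates["rates_customer"], 2))
# → 0.67 0.35
```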
This supports our hypothesis and means that pick up is predictable from the customer's first or second utterance. It was found that cars booked by rates_customers tend to be "not picked up," so if we can find any actions by agents that convert such customers into "pick up," then the revenue will improve. In the booking_customer case, to keep the "pick up" high, we need to determine specific agent actions that concretize the customer's intent.
Table 5 shows how mentioning discount-related phrases affects the pick up ratios for rates_customers and booking_customers.

Table 5: Association between mention of discount phrases by agents and pick up information (rates_customers and booking_customers; picked up vs. not picked up)

From this table, it can be seen that mentioning discount phrases affects the final status of both types of customers. In the rates_customer case, the probability that the booked car will be picked up, P(pick-up), is improved to 0.476 by mentioning discount phrases. This means customers are attracted by the offered discounts, and this changes their intention from "just checking the rate" to "make a reservation here". We found similar trends for the association between mention of value selling phrases and pick up information.
4.4 Improving Agent Productivity

From the results of the text mining analysis experiment, we derived the following actionable insights:

• There are two types of customers in reservation calls.
- A booking_customer (with a strong start) tends to pick up the reserved car.
- A rates_customer (with a weak start) tends not to pick up the reserved car.
• In the rates_customer case, "pick up" is improved by mentioning discount phrases.

By implementing the actionable insights derived from the analysis in an actual car rental process, we verified improvements in pick up. We divided the 83 agents in the car rental reservation center into two groups. One of them, consisting of 22 agents, was trained based on the insights from the text mining analysis. The remaining 61 agents were not told about these findings. By comparing these two groups over a period of one month we hoped to see how the actionable insights contributed to improving agent performance.
As the evaluation metric, we used the pick up ratio, that is, the ratio of the number of "pick-ups" to the number of reservations. Following the training, the pick up ratio of the trained agents increased by 4.75%. The average pick up ratio for the remaining agents increased by 2.08%. Before the training the ratios of both groups were comparable. The seasonal trends in this industry mean that, depending on the month, the bookings and pickups may go up or down. We believe this is why the average pick up ratio for the remaining agents also increased. Considering this, it can be estimated that by implementing the actionable insights the pick up ratio for the pilot group was improved by about 2.67%. We confirmed that this difference is meaningful because the p-value of the t-test statistic is 0.0675, and this probability is close to the standard significance level (α = 0.05). Seeing this, the contact center trained all of its agents based on the insights from the text mining analysis.
5 Discussion

There has been a lot of work on specific tools for analyzing the conversational data collected at contact centers. These include call type classification for the purpose of categorizing calls (Tang et al., 2003) (Zweig et al., 2006), call routing (Kuo and Lee, 2003) (Haffner et al., 2003), obtaining call log summaries (Douglas et al., 2005), agent assisting and monitoring (Mishne et al., 2005), and building of domain models (Roy and Subramaniam, 2006). Filtering problematic dialogs automatically from an automatic speech recognizer has also been studied (Hastie et al., 2002) (Walker et al., 2002). In contrast to these technologies, in this paper we consider the task of trying to find insights from a collection of complete conversations. In (Nasukawa and Nagano, 2001), such an analysis was attempted for agent-entered call summaries of customer contacts by extracting phrases based on domain-expert-specified viewpoints. In our work we have shown that even for conversational data, which is more complex, we could identify proper viewpoints and prepare expressions for each viewpoint.
Call summaries by agents tend to mask the customers' intentions at the start of the call. We get more valuable insights from the text mining analysis of conversational data. For such an analysis of conversational data, our proposed method has an important role. With our method, we find the important segments in the data for doing the analyses. Also, our analyses are closely linked to the desired outcomes.

In trigger detection, we created a chronologically cumulative data set based on turns. We can also use segment information such as the "opening" and "enquiries" described in Section 2. We prepared data with segment information manually assigned, made the chronologically cumulative data, and applied our trigger detection method.
Figure 6 shows the results of acc(categorizer(D^k)) using the segment information.

Figure 6: Result of acc(categorizer(D^k)) using segment information

The trend in Figure 6 is similar to that in Figure 5. From this result, it is observed that the "opening" and "offering" segments are trigger segments. Usually, segmentation is not done in advance, and to assign such information automatically we need data with labeled segmentation information. The results show that even in the absence of labeled data our trigger detection method identifies the trigger segments.
In the experiments in Section 4, we set the turns for each chronologically cumulative data set by taking into account the pre-defined call flow. In Figure 5 we observe that the accuracy of the categorizer is decreasing even when using increasing parts of the call. Even the accuracy using the complete call is less than using only the first turn. This indicates that the first turn is very informative, but it also indicates that the features are not being used judiciously. In a conventional classification task, the number of features is sometimes restricted when constructing a categorizer. It is known that selecting only significant features improves the classification accuracy (Yang and Pedersen, 1997). We used Information Gain for selecting features from the document collection. This method selects the most discriminative features between the two classes. As expected, the classification accuracy improved significantly as we reduced the total number of features from over 2,000 to the range of 100 to 300. Figure 7 shows the changes in accuracy.
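Information Gain over the same 2×2 per-term counts used for χ² can be computed directly. A minimal sketch; the function names are illustrative:

```python
import math

def entropy(pos, neg):
    """Binary class entropy in bits."""
    total = pos + neg
    h = 0.0
    for c in (pos, neg):
        if 0 < c < total:
            p = c / total
            h -= p * math.log2(p)
    return h

def information_gain(a, b, c, d):
    """IG of a term from 2x2 counts: a/c = class "A"/"not A" docs
    containing the term, b/d = the same among docs without it."""
    n = a + b + c + d
    return (entropy(a + b, c + d)          # class entropy before the split
            - (a + c) / n * entropy(a, c)  # docs containing the term
            - (b + d) / n * entropy(b, d)) # docs without the term
```

Ranking all candidate terms by this score and keeping the top 100 to 300 mirrors the feature reduction described above.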
In the proposed method, we detect trigger segments using the increases and decreases of the classification accuracy. With feature selection, the noisy features are not added in the segments, so decreasing portions are not observed. In this situation, as a trigger segment, we can detect the portions where the gradient of the accuracy curve increases. Also, using feature selection, we find that the classification accuracy is highest when using the entire document, which is expected. However, we notice that the trigger segments obtained with and without feature selection are almost the same.

In the experiments, we used manually transcribed data. As future work we would like to use the noisy output of an automatic speech recognition system to obtain viewpoints and expressions.
6 Conclusion

In this paper, we have proposed methods for identifying appropriate segments and expressions automatically from the data for user-specified difference analysis. We detected the trigger segments using the property that a business-oriented conversation follows a pre-defined flow. After that, we identified the appropriate expressions from each trigger segment. It was found that in a long business-oriented conversation there are important segments affecting the outcomes that cannot be easily detected by just looking through the conversation, but such segments can be detected by monitoring the changes in the categorization accuracy. For the trigger segment detection, we do not use semantic segment information but only the positional segment information based on the conversational turns. Because our method does not rely on semantic information in the data, it can be seen as robust. Through experiments with real conversational data, using the identified segments and expressions we were able to define appropriate viewpoints and concepts leading to insights for improving the car rental business process.
Acknowledgment

The authors would like to thank Sreeram Balakrishnan, Raghuram Krishnapuram, Hideo Watanabe, and Koichi Takeda at IBM Research for their support. The authors also appreciate the efforts of Jatin Joy Giri at IBM India in providing domain knowledge about the car rental process and thank him for his help in constructing the dictionaries.
