Approaches to plural reference generation emphasise descriptive brevity, but often lack empirical backing.
This paper describes a corpus-based study of plural descriptions, and proposes a psycholinguistically-motivated algorithm for plural reference generation.
The descriptive strategy is based on partitioning and incorporates corpus-derived heuristics.
An exhaustive evaluation shows that the output closely matches human data.
1 Introduction
Generation of Referring Expressions (GRE) is a well-studied sub-task of microplanning in Natural Language Generation.
Most algorithms in this area view GRE as a content determination problem, that is, their emphasis is on the construction of a semantic representation which is eventually mapped to a linguistic realisation (i.e. a noun phrase).
Content Determination for GRE starts from a Knowledge Base (KB) consisting of a set of entities U and a set of properties P represented as attribute-value pairs, and searches for a description D ⊆ P which distinguishes a referent r ∈ U from its distractors.
Under this view, reference is mainly about identification of an entity in a given context (represented by the KB), a well-studied pragmatic function of definite noun phrases in both the psycholinguistic and the computational literature (Olson, 1970).
For example, the KB in Table 1 represents 8 entities in a 2D visual domain, each with 6 attributes, including their location, represented as a combination of horizontal (X) and vertical (Y) numerical coordinates.
To refer to an entity an algorithm searches through values of the different attributes.
GRE has been dominated by Dale and Reiter's (1995) Incremental Algorithm (IA), one version of which, generalised to deal with non-disjunctive plural references, is shown in Algorithm 1 (van Deemter, 2002).
A non-disjunctive reference to a set R is possible just in case all the elements of R can be distinguished using the same attribute-value pairs.
Such a description is equivalent to the logical conjunction of the properties in question.
This algorithm, IA_plur, initialises a description D and a set of distractors C [1.1-1.2], and traverses an ordered list of properties, called the preference order (PO) [1.3], which reflects general or domain-specific preferences for attributes.

Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pp. 102-111, Prague, June 2007.
©2007 Association for Computational Linguistics
For instance, with the PO in the top row of the Table, the algorithm first considers values of type, then colour, and so on, adding a property to D if it is true of the intended referents R, and has some contrastive value, that is, excludes some distractors [1.4].
The description and the distractor set C are updated accordingly [1.5-1.6], and the description returned if it is distinguishing [1.7].
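The incremental strategy described above can be sketched as follows. This is a minimal sketch, not the paper's implementation: the knowledge base, preference order, and entity names are illustrative.

```python
# Minimal sketch of incremental content determination for non-disjunctive
# plural reference (after Dale & Reiter 1995 / van Deemter 2002). Entity
# and property names are illustrative, not the paper's actual KB.

def ia_plural(referents, distractors, preference_order, kb):
    """Return a conjunctive description D distinguishing `referents`,
    or None if no non-disjunctive description exists."""
    description = []                       # D := {}          [1.1]
    C = set(distractors)                   # C := U \ R       [1.2]
    for attr, value in preference_order:   # traverse the PO  [1.3]
        prop = (attr, value)
        true_of_all = all(kb[r].get(attr) == value for r in referents)
        excluded = {d for d in C if kb[d].get(attr) != value}
        if true_of_all and excluded:       # contrastive?     [1.4]
            description.append(prop)       # update D         [1.5]
            C -= excluded                  # update C         [1.6]
        if not C:
            return description             # distinguishing   [1.7]
    return None

kb = {
    "e1": {"type": "desk", "colour": "red", "size": "small"},
    "e2": {"type": "desk", "colour": "blue", "size": "small"},
    "e3": {"type": "desk", "colour": "red", "size": "large"},
}
po = [("colour", "red"), ("colour", "blue"),
      ("size", "small"), ("size", "large")]
print(ia_plural({"e1", "e2"}, {"e3"}, po, kb))  # [('size', 'small')]
```

Note that the sketch only adds a property when it is true of all intended referents, which is precisely the non-disjunctive restriction discussed above.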
Given R = {e1, e2}, this algorithm would return the following description:
This description is overspecified, because ORIENTATION is not strictly necessary to distinguish the referents ((size : small) suffices).
Moreover, the description does not include TYPE, though it has been argued that this is always required, as it maps to the head noun of an NP (Dale and Reiter, 1995).
We will adopt this assumption here, for reasons explained below.
Due to its hillclimbing nature, the IA avoids combinatorial search, unlike some predecessors which searched exhaustively for the briefest possible description of a referent (Dale, 1989), based on a strict interpretation of the Gricean Maxim of Quantity (Grice, 1975).
Given that, under the view proposed by Olson (1970) among others, the function of a referential NP is to identify, a strict Gricean interpretation holds that it should contain no more information than necessary to achieve this goal.
The Incremental Algorithm constitutes a departure from this view given that it can overspecify through its use of a PO.
This has been justified on psycholinguistic grounds.
Speakers overspecify their descriptions because they begin their formulation of a reference without exhaustively scanning a domain (Pechmann, 1989; Belke and Meyer, 2002).
They prioritise the basic-level category (TYPE) of an object, and salient, absolute properties like COLOUR (Pechmann, 1989; Eikmeyer and Ahlsen, 1996), as well as locative properties in the vertical dimension (Arts, 2004).
Relative attributes like SIZE are avoided unless absolutely required for identification (Belke and Meyer, 2002).
This evidence suggests that speakers conceptualise referents as gestalts (Pechmann, 1989) whose core is their basic-level TYPE (Murphy, 2002) and some other salient attributes like COLOUR.
For instance, according to
Schriefers and Pechmann (1988), an NP such as the large black triangle reflects a conceptualisation of the referent as a black triangle, of which the SIZE property is predicated.
Thus, the type+colour combination is not mentally represented as two separable dimensions.
In what follows, we will sometimes refer to this principle as the Conceptual Gestalts Principle.
Note that the IA does not fully mirror these human tendencies, since it only includes preferred attributes in a description if they remove some distractors given the current state of the algorithm, whereas psycholinguistic research suggests that people include them irrespective of contrastiveness (but cf. van der Sluis and Krahmer (2005)).
More recent research on plural GRE has de-emphasised these issues, especially in the case of disjunctive plural reference.
Disjunction is required whenever elements of a set of referents R do not have identical distinguishing properties.
For example, {e1, e3} can be distinguished by the following Conjunctive Normal Form (CNF) description1:
Such a description would be returned by a generalised version of Algorithm 1 proposed by van Deemter (2002).
This generalisation, IA_bool (so called because it handles all Boolean operators, such as negation and disjunction), first tries to find a non-disjunctive description using Algorithm 1.
Failing this, it searches through disjunctions of properties of increasing length, conjoining them to the description.
This procedure has three consequences:
Efficiency: Searching through disjunctive combinations results in a combinatorial explosion (van Deemter, 2002).
Gestalts and content: The notion of a 'preferred attribute' is obscured, since it is difficult to apply the same reasoning that motivated the PO in the IA to combinations like (COLOUR ∨ SIZE).
1Note that logical disjunction is usually rendered as linguistic coordination using and.
Thus, the table and the desk is the union of things which are desks or tables.
Form: Descriptions can become logically very complex (Gardent, 2002; Horacek, 2004).
Proposals to deal with (3) include Gardent's (2002) non-incremental, constraint-based algorithm to generate the briefest available description of a set, an approach extended in Gardent et al. (2004).
An alternative, by Horacek (2004), combines best-first search with optimisation to reduce logical complexity.
Neither approach benefits from empirical grounding, and both leave open the question of whether previous psycholinguistic research on singular reference is applicable to plurals.
This paper reports a corpus-based analysis of plural descriptions elicited in well-defined domains, of which Table 1 is an example.
This study falls within a recent trend in which empirical issues in GRE have begun to be tackled (Gupta and Stent, 2005; Jordan and Walker, 2005; Viethen and Dale, 2006).
We then propose an efficient algorithm for the generation of references to arbitrary sets, which combines corpus-derived heuristics and a partitioning-based procedure, comparing this to IA_bool.
Unlike van Deemter (2002), we only focus on disjunction, leaving negation aside.
Our starting point is the assumption that plurals, like singulars, evince preferences for certain attributes as predicted by the Conceptual Gestalts Principle.
Based on previous work in Gestalt perception (Wertheimer, 1938; Rock, 1983), we propose an extension of this to sets, whereby plural descriptions are preferred if (a) they maximise the similarity of their referents, using the same attributes to describe them as far as possible; (b) prioritise salient ('preferred') attributes which are central to the conceptual representation of an object.
We address (3) above by investigating the logical form of plurals in the corpus.
One determinant of logical form is the basic-level category of objects.
For example, to refer to {e1, e2} in the Table, an author has at least the following options:
(b) the small red desk and the small blue sofa
(c) the small desk and the small blue sofa
(d) the small objects
These descriptions exemplify three possible sources of variation:
Disjunctive/Non-disjunctive: The last description, (3d), is non-disjunctive (i.e. it is logically a conjunction of properties).
This, however, is only achievable through the use of a non-basic level value for the type of the entities (objects).
Using the basic-level would require the disjunction ((type : desk) ∨ (type : sofa)), which is the case in (3a-c).
Given that basic-level categories are preferred on independent grounds (Rosch et al., 1976), we would expect examples like (3d) to be relatively infrequent.
Aggregation: If a description is disjunctive, it may be aggregated, with properties common to all objects realised as wide-scope modifiers.
For instance, in (3a), small modifies desk and sofa.
By contrast, (3b) is non-aggregated: small occurs twice (modifying each coordinate in the NP).
Non-aggregated, disjunctive descriptions are logically equivalent to a partition of a set.
For instance, (3c) partitions the set R = {e1, e2} into {{e1}, {e2}}, describing each element separately.
Descriptions like (3b) are more overspecified than their aggregated counterparts due to the repetition of information.
Parallelism/Similarity: Non-aggregated, disjunctive descriptions (partitions) may exhibit semantic parallelism: in (3b), elements of the partition are described using exactly the same attributes (that is, type, colour, and size).
This is not the case in (3c), which does represent a partition but is nonparallel.
Parallel structures maximise the similarity of elements of a partition, using the same attributes to describe both.
The likelihood of propagation of an attribute across disjuncts is probably dependent on its degree of salience or preference (e.g. colour is expected to be more likely to be found in a parallel structure than size).
The data for our study is a subset of the TUNA Corpus (Gatt et al., 2007), consisting of 900 references to furniture and household items, collected via a controlled experiment involving 45 participants.
In addition to their type, objects in the domains have colour, orientation and size (see Table 1).
For each subset of these three attributes, there was an equal number of domains in which the minimally distinguishing description (MD) consisted of values of that subset.
For example, Table 1 represents a domain in which the intended referents, {e1, e2}, can be minimally distinguished using only size2.

<DESCRIPTION num="pl">
  <DESCRIPTION num="sg">
    <ATTRIBUTE name="size" value="small">small</ATTRIBUTE>
    <ATTRIBUTE name="colour" value="red">red</ATTRIBUTE>
    <ATTRIBUTE name="type" value="desk">desk</ATTRIBUTE>
  </DESCRIPTION>
  and
  <DESCRIPTION num="sg">
    <ATTRIBUTE name="size" value="small">small</ATTRIBUTE>
    <ATTRIBUTE name="colour" value="blue">blue</ATTRIBUTE>
    <ATTRIBUTE name="type" value="sofa">sofa</ATTRIBUTE>
  </DESCRIPTION>
</DESCRIPTION>

Figure 1: Corpus annotation examples
Thus, overspecified usage of attributes can be identified in authors' descriptions.
Domain objects were randomly placed in a 3 (row) × 5 (column) grid, represented by X and Y in Table 1.
These are relevant for a subset of descriptions which contain locative expressions.
Corpus descriptions are paired with an explicit XML domain representation, and annotated with semantic markup which makes clear which attributes a description contains.
This markup abstracts away from differences in lexicalisation, making it an ideal resource to evaluate content determination algorithms, because it is semantically transparent, in the sense of this term used by van Deemter et al. (2006).
This markup scheme also enables the compositional derivation of a logical form from a natural language description.
For example, the XML representation of (3b) is shown in Figure 1, which also displays the LF derived from it. Each constituent NP in (3b) is annotated as a set of attributes enclosed by a description tag, which is marked up as singular (sg). The two coordinates are further enclosed in a plural description; correspondingly, the LF is a disjunction of (the LFs of) the two internal descriptions.
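The compositional derivation just described can be sketched as follows. The tag and attribute names follow Figure 1; the `num` attribute used to mark number, and the operator symbols in the output, are assumptions made for illustration.

```python
# Sketch of compositionally deriving a logical form from the corpus
# markup of Figure 1: a plural DESCRIPTION becomes the disjunction of
# the LFs of its singular daughters, each of which is the conjunction
# of its attribute-value pairs. The `num` attribute is an assumption
# about the annotation scheme, not a documented part of it.
import xml.etree.ElementTree as ET

def lf(node):
    if node.tag == "ATTRIBUTE":
        return f"({node.get('name')} : {node.get('value')})"
    parts = [lf(child) for child in node]          # recurse into daughters
    op = " v " if node.get("num") == "pl" else " & "
    return "(" + op.join(parts) + ")"

xml_desc = """
<DESCRIPTION num="pl">
  <DESCRIPTION num="sg">
    <ATTRIBUTE name="size" value="small"/>
    <ATTRIBUTE name="colour" value="red"/>
    <ATTRIBUTE name="type" value="desk"/>
  </DESCRIPTION>
  <DESCRIPTION num="sg">
    <ATTRIBUTE name="size" value="small"/>
    <ATTRIBUTE name="colour" value="blue"/>
    <ATTRIBUTE name="type" value="sofa"/>
  </DESCRIPTION>
</DESCRIPTION>
"""
print(lf(ET.fromstring(xml_desc)))
```

The derived formula is the disjunction of two conjunctions, matching the description of the markup above.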
Descriptions in the corpus were elicited in 7 domains with one referent, and 13 domains with 2 referents.
Plural domains represented levels of a Value Similarity factor.
In 7 Value-Similar (VS) domains, referents were identifiable using identical values of the minimally distinguishing attributes. In the remaining 6 Value-Dissimilar (VDS) domains, the minimally distinguishing values were different.
Table 1 represents a VS domain, where {e1, e2} can be minimally distinguished using the same value of size (small).

2 TYPE was not included in the calculation of MD.

Table 2: % disjunctive and non-disjunctive plurals
In terms of our introductory discussion, referents in Value-Similar conditions could be minimally distinguished using a conjunction of properties, while Value-Dissimilar referents required a disjunction since, if two referents could be minimally distinguished by different values v and v' of an attribute a, then MD had the form (a : v) ∨ (a : v').
However, even in the VS condition, referents had different basic-level types.
Thus, an author faced with a domain like Table 1 had at least the descriptive options in (3a-d).
If they chose to refer to entities using basic-level values of type, their description would be disjunctive (e.g. 3a).
A non-disjunctive description would require the use of a superordinate value, as in (3d).
Our analysis will focus on a stratified random sample of 180 plural descriptions, referred to as PL1, generated by taking 4 descriptions from each author (2 each from VS and VDS conditions).
We also use the singular data (SG; N = 315). The remaining plural descriptions (PL2; N = 405) are used for evaluation.
3 The logical form of plurals
Descriptions in PL1 were first classified according to whether they were non-disjunctive (cf. 3d) or disjunctive (3a-c).
The latter were further classified into aggregated (3a) and non-aggregated (3b).
Table 2 displays the percentage of descriptions in each of the four categories, within each level of Value Similarity.
Disjunctive descriptions were a majority in either condition, and most of these were non-aggregated.
As noted in § 1, these descriptions correspond to partitions of the set of referents.
Since referents in VS had identical properties except for type values, the most likely reason for the majority of disjunctives in VS is that people's descriptions represented a partition of a set of referents induced by the basic-level category of the objects.

Table 3: Parallelism: % per description type
This is strengthened by the finding that the likelihood of a description being disjunctive or non-disjunctive did not differ as a function of Value Similarity (χ2 = 2.56, p > .1).
A χ2 test on overall frequencies of aggregated versus non-aggregated disjunctives showed that the non-aggregated descriptions ('true' partitions) were a significant majority (χ2 = 83.63, p < .001).
However, the greater frequency of aggregation in VS compared to VDS turned out to be significant (χ2 = 15.498, p < .001).
Note that the predominance of non-aggregated descriptions in VS implies that properties are repeated in two disjuncts (resp. coordinate NPs), suggesting that authors are likely to redundantly propagate properties across disjuncts.
This evidence goes against some recent proposals for plural reference generation which emphasise brevity (Gardent, 2002).
3.1 Conceptual gestalts and similarity
Allowing for the independent motivation for set partitioning based on type values, we suggested in §1 that parallel descriptions such as (3b) may be more likely than non-parallel ones (3c), since the latter does not use the same properties to describe the two referents.
Similarity, however, should also interact with attribute preferences.
For this part of the analysis, we focus exclusively on the disjunctive descriptions in PL1 (N = 150) in both VS and VDS. The descriptions were categorised according to whether they had parallel or non-parallel semantic structure.
Evidence for Similarity interacting with attribute preferences is strongest if it is found in those cases where an attribute is overspecified (i.e. used when not required for a distinguishing description).
In those cases where corpus descriptions do not contain locative expressions (the X and/or Y attributes), such an overspecified usage is straightforwardly identified based on the MD of a domain. This is less straightforward in the case of locatives, since the position of objects was randomly determined in each domain.
Table 4: Actual and predicted usage probabilities

Therefore, we divided descriptions into three classes, whereby a description is considered to be:
1. underspecified if it does not include a locative expression and omits some md attributes;
2. overspecified if either (a) it does not omit any md attributes, but includes locatives and/or non-required visual attributes; or (b) it omits some md attributes, but includes both a locative expression and other, non-required attributes;
3. well-specified otherwise.
Proportions of Parallel and Non-Parallel descriptions for each of the three classes are shown in Table 3.
In all three description types, there is an overwhelming majority of Parallel descriptions, confirmed by a χ2 analysis.
The distribution of description types did not differ between VS and VDS (χ2 < 1, p > .8), suggesting that the tendency to redundantly repeat attributes, avoiding aggregation, is independent of whether elements of a set can be minimally distinguished using identical values.
Our second prediction was that the likelihood with which an attribute is used in a parallel structure is a function of its overall 'preference'.
Thus, we expect attributes such as colour to feature more than once (perhaps redundantly) in a parallel description to a greater extent than size.
To test this, we used the SG sample, estimating the overall probability of occurrence of a given attribute in a singular description (denoted p(a, SG)), and using this in a non-linear regression model to predict the likelihood of usage of an attribute in a plural partitioned description with parallel semantic structure (denoted p(a, PPS)).
The data was fitted to a regression equation of the form p(a, PPS) = k × p(a, SG)^s. The resulting equation, shown in (4), had a near-perfect fit to the data (R2 = .910)3.
This is confirmed by comparing actual probability of occurrence in the second column of Table 4, to the predicted probabilities in the third column, which are estimated from singular probabilities using (4).
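Fitting such a power law reduces to linear regression in log-log space, since log p(a, PPS) = log k + s · log p(a, SG). The sketch below uses made-up probabilities generated from a known curve, not the corpus values behind (4).

```python
# Sketch of fitting the power law p(a, PPS) = k * p(a, SG)**s by least
# squares on the log-transformed data. The input probabilities are
# illustrative stand-ins, not the corpus estimates.
import math

def fit_power_law(p_sg, p_pps):
    """Return (k, s) minimising squared error in log-log space."""
    xs = [math.log(p) for p in p_sg]
    ys = [math.log(p) for p in p_pps]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    s = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))   # regression slope = exponent
    k = math.exp(my - s * mx)                # back-transform the intercept
    return k, s

# hypothetical SG probabilities; PPS values lie exactly on a known curve,
# so the fit should recover k = 1.05 and s = 0.9
p_sg = [0.9, 0.6, 0.3, 0.1]
p_pps = [1.05 * p ** 0.9 for p in p_sg]
k, s = fit_power_law(p_sg, p_pps)
```

With noisy corpus data the fit is of course not exact; the R2 reported above measures how close it comes.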
Note that the probabilities in the Table confirm previous psycholinguistic findings.
To the extent that probability of occurrence reflects salience and/or conceptual importance, an order over the three attributes colour, size and orientation can be deduced (c>>o>>s), which is compatible with the findings of Pechmann (1989), Belke and Meyer (2002) and others.
The locative attributes are also ordered (Y >> X), confirming the findings of Arts (2004) that vertical location is preferred. Orderings deducible from the SG data in turn are excellent predictors of the likelihood of 'propagating' an attribute across disjuncts in a plural description, something which is likely even if an attribute is redundant, modulo the centrality or salience of the attribute in the mental gestalt corresponding to the set.
Together with the earlier findings on logical form, the data evinces a dual strategy whereby (a) sets are partitioned based on basic-level conceptual category; (b) elements of the partitions are described using the same attributes if they are easily perceived and conceptualised.
Thus, of the descriptions in (3) above, it is (3b) that is the norm among authors.
4 Content determination by partitioning
In this section we describe IA_part, a partitioning-based content determination algorithm.
Though presented as a version of the IA, the basic strategy is generalisable beyond it.
For our purposes, the assumption of a preference order will be maintained. IA_part is distinguished from the original IA and IA_bool (cf. §1) in two respects.
First, it induces partitions opportunistically based on KB information, and this is reflected in the way descriptions are represented.
Second, the criteria whereby a property is added to a description include a consideration of the overall salience or preference of an attribute, and its contribution to the conceptual cohesiveness of the description.

3A similar analysis using linear regression gave essentially the same results.
Throughout the following discussion, we maintain a running example from Table 1, in which R = {e1, e2, e5}.
4.1 Partitioned descriptions
IA_part generates a partitioned description (D_part) of a set R, corresponding to a formula in Disjunctive Normal Form. D_part is a set of Description Fragments (DFs). A DF is a triple (R_DF, T_DF, M_DF), where R_DF ⊆ R, T_DF is a value of type, and M_DF is a possibly empty set of other properties. DFs refer to disjoint subsets of R. As the representation suggests, type is given a special status. IA_part starts by selecting the basic-level values of type, partitioning R and creating a DF for each element of the partition on this basis.
In our example, the selection of type results in two DFs, with M_DF initialised to empty:
Although neither DF is distinguishing, RDF indicates which referents a fragment is intended to identify.
In this way, the algorithm incorporates a 'divide-and-conquer' strategy, splitting up the referential intention into 'sub-intentions' to refer to elements of a partition.
Following the initial step of selecting TYPE, the algorithm considers other properties in PO.
Suppose (colour : blue) is considered first.
This property is true of e2 and e5.
Since DF2 refers to e2, the new property can be added to M_DF2.
Since e5 is not the sole referent of DF1, the property induces a further partitioning of this fragment, resulting in a new DF.
This is identical to DF1 except that it refers only to e5 and contains (COLOUR : blue).
DF1 itself now refers only to e1.
Once (COLOUR : red) is considered, it is added to the latter, yielding (6).
Suppose R_DF ⊆ R' [2.4].
This corresponds to our example involving (COLOUR : blue) and DF2.
The property is simply added to M_DF [2.5] and R' is updated by removing the elements thus accounted for [2.6].
Suppose R_DF ⊈ R'. If R_DF ∩ R' is empty, then (a : v) is not useful. Suppose on the other hand that R_DF ∩ R' ≠ ∅ [2.7].
This occurred with (COLOUR : red) in relation to DF1.
The procedure initialises R_new, a set holding those referents in R_DF which are also in R' [2.8]. A new DF (DF_new) is created, which is a copy of the old DF, except that (a) it contains the new property; and (b) its intended referents are R_new [2.9]. The new DF is included in the description [2.10], while the old DF is altered by removing R_new from R_DF [2.11]. This ensures that DFs denote disjoint subsets of R.
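The fragment update just described can be sketched as follows, using the running example R = {e1, e2, e5}. The dict-based DF representation is a sketch for illustration, not the paper's implementation.

```python
# Sketch of the description-fragment update of Section 4.1: adding a
# property either extends a fragment (when the property covers all of
# its referents) or splits it in two (when it covers only some).
# DFs are dicts {R: referents, T: type value, M: other properties}.

def add_property(dfs, prop, extension):
    """Add attribute-value pair `prop`, true of the entities in
    `extension`, to a list of description fragments."""
    new_dfs = []
    for df in dfs:
        inside = df["R"] & extension
        if not inside:                      # property not useful here
            new_dfs.append(df)
        elif inside == df["R"]:             # covers the whole fragment
            new_dfs.append({**df, "M": df["M"] | {prop}})
        else:                               # split: the copy gets the property
            new_dfs.append({"R": inside, "T": df["T"], "M": df["M"] | {prop}})
            new_dfs.append({**df, "R": df["R"] - inside})
    return new_dfs

# after selecting TYPE for R = {e1, e2, e5} (cf. (5) in the text)
dfs = [
    {"R": {"e1", "e5"}, "T": ("type", "desk"), "M": set()},
    {"R": {"e2"}, "T": ("type", "sofa"), "M": set()},
]
# (colour : blue) is true of e2 and e5: DF2 is extended, DF1 is split
dfs = add_property(dfs, ("colour", "blue"), {"e2", "e5"})
```

After this step the fragments are exactly those described in the text: a blue-desk fragment for e5, a plain desk fragment for e1, and a blue-sofa fragment for e2.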
(⊥), and the property is included in M_DF [2.18]4.
Note that this procedure easily generalises to the singular case, where D_part would only contain one DF.
4.2 Property selection criteria
IA_part's content determination strategy maximises the similarity of a set by generating semantically parallel structures.
Though contrastiveness plays a role in property selection, the 'preference' or conceptual salience of an attribute is also considered in the decision to propagate it across DFs.
Candidate properties for addition need only be true of at least one element of R. Because of the partitioning strategy, properties are not equally contrastive for all referents.
For instance, in (5), e2 needs to be distinguished from the other sofas in Table 1, while {e1, e5} need to be distinguished from the other desks.
Therefore, distractors are held in an associative array C, such that for all r ∈ R, C[r] is the set of distractors for that referent at a given stage in the procedure.
Contrastiveness is defined via the following Boolean function:
We turn next to salience and similarity.
Let A(D_part) be the set of attributes included in D_part.
A property is salient with respect to D_part if it satisfies the following:
that is, the attribute is already included in the description, and the predicted probability of its being propagated in more than one fragment of a description is greater than chance.
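A minimal sketch of this salience test follows, assuming illustrative p(a, PPS) values that are consistent with the orderings in Table 4 (C >> O >> S, Y >> X) but are not the corpus estimates.

```python
# Sketch of the salience test in (8): an attribute may be propagated
# across fragments only if it is already in the description and its
# predicted probability of occurring in a parallel plural description,
# p(a, PPS), exceeds chance. These probabilities are illustrative
# stand-ins respecting the orderings reported in the text.

P_PPS = {"colour": 0.93, "y": 0.68, "orientation": 0.61,
         "x": 0.51, "size": 0.39}

def salient(attr, attrs_in_description):
    """True iff `attr` is already used in the description and its
    predicted propagation probability is above chance (0.5)."""
    return attr in attrs_in_description and P_PPS.get(attr, 0.0) > 0.5
```

On these illustrative figures, colour passes the test once it appears anywhere in the description, whereas size never does, mirroring the behaviour discussed below.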
A potential problem arises here.
Consider the description in (5) once more.
At this stage, IA_part begins to consider colour.
The value red is true of e1, but non-contrastive (all the desks which are not in R are red).
If this is the first value of colour considered, (8) returns false because the attribute has not been used in any part of the description.
On later considering (COLOUR : blue), the algorithm adds it to D_part, since it is contrastive for {e2, e5}, but will have failed to propagate colour across fragments.

4This only occurs if the KB is incomplete, that is, if some entities have no type, so that R is not fully covered by the intended referents of the DFs when type is initially added.
As a result, IA_part considers values of an attribute in order of discriminatory power (Dale, 1989), defined in the present context as follows:
Discriminatory power depends on the number of referents a property includes in its extension, and the number of distractors (U − R) it removes.
By prioritising discriminatory values, the algorithm first considers and adds (colour : blue), and subsequently will include red because (8) returns true.
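Value ordering by discriminatory power can be sketched as follows. The additive score is an illustrative stand-in for (9), whose exact definition is not reproduced here, and the domain is hypothetical.

```python
# Sketch of ordering an attribute's values by discriminatory power:
# a value is scored by how many intended referents it covers and how
# many distractors in U \ R it removes. The additive score is an
# illustrative stand-in for the paper's definition (9).

def discriminatory_power(value_extension, R, U):
    covered = len(R & value_extension)              # referents included
    removed = len((U - R) - value_extension)        # distractors excluded
    return covered + removed

def order_values(extensions, R, U):
    """Sort a {value: extension} map by descending discriminatory power."""
    return sorted(extensions,
                  key=lambda v: -discriminatory_power(extensions[v], R, U))

# hypothetical domain: e6, e7 stand in for red distractor desks
R = {"e1", "e2", "e5"}
U = {"e1", "e2", "e5", "e6", "e7"}
colour_values = {"blue": {"e2", "e5"}, "red": {"e1", "e6", "e7"}}
print(order_values(colour_values, R, U))  # ['blue', 'red']
```

On this domain blue outscores red, so it is considered first, as in the running example.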
To continue with the example, at the stage represented by (6), only e5 has been distinguished. ORIENTATION, the next attribute considered, is not contrastive for any referent.
On considering SIZE, small is found to be contrastive for e1 and e2, and added to DF1 and DF2.
However, size is not added to DF3, in spite of being present in two other fragments.
This is because the probability function p(SIZE, PPS) returns a value below 0.5 (see Table 4), reflecting the relatively low conceptual salience of this attribute.
The final description is the blue desk, the small red desk and the small blue sofa.
This example illustrates the limits set on semantic parallelism and similarity: only attributes which are salient enough are redundantly propagated across DFs.
An estimate of the complexity of IA_part must account for the way properties are selected (§4.2) and the way descriptions are updated (Algorithm 2).
Property selection involves checking properties for contrastive value and salience, and updating the ordering of values of each attribute based on discriminatory power (9).
Clearly, the number of times this is carried out is bounded by the number of properties in the KB, which we denote n_p. Every time a property is selected, the discriminatory power of values changes (since the number of remaining distractors changes). Now, in the worst case, all n_p properties are selected by the algorithm5.
Each time, the algorithm must compare the remaining properties pairwise for discriminatory power, a quadratic operation with complexity O(n_p^2).

5Only unique properties need to be considered, as each property is selected at most once, though it can be included in more than one DF.

Table 5: Edit distance scores
With respect to the procedure update-Description, we need to consider the number of iterations in the for loop starting at line [2.1]. This is bounded by n_r = |R| (there can be no more DFs than there are referents). Once again, if at most n_p properties are selected, then the algorithm makes at most n_r iterations n_p times, yielding complexity O(n_p n_r). Overall, then, IA_part has a worst-case runtime complexity of O(n_p n_r).
5 Evaluation
IA_part was compared to van Deemter's IA_bool (§1) against human output in the evaluation sub-corpus PL2 (N = 405).
This was considered an adequate comparison, since IA_bool shares with the current framework a genetic relationship with the IA.
Other approaches, such as Gardent's (2002) brevity-oriented algorithm, would perform poorly on our data.
As shown in §3, overspecification is extremely common in plural descriptions, suggesting that such a strategy is on the wrong track (but see §6).
IA_part and IA_bool were each run over the domain representation paired with each corpus description.
The output logical form was compared to the LF compiled from the XML representation of an author's description (cf. Figure 1).
LFs were represented as and-or trees, and compared using the tree edit distance algorithm of Shasha and Zhang (1990).
On this measure, a value of 0 indicates identity, that is, perfect agreement with an author.
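The Shasha and Zhang algorithm itself is involved; as a much cruder illustration of the "0 = perfect agreement" end of the scale, the sketch below scores two LFs (represented as nested tuples) by the symmetric difference of their node multisets, ignoring tree structure. It is a simplified stand-in, not the evaluation metric actually used.

```python
# Crude stand-in for tree edit distance between and-or trees: count the
# node insertions/deletions needed to equate the two trees' node
# multisets. Unlike Shasha & Zhang (1990), this ignores structure, but
# identical LFs still score 0, matching the scale described in the text.
from collections import Counter

def nodes(tree):
    """Multiset of node labels in a nested-tuple LF like
    ("or", ("and", "desk", "red"), ...)."""
    if isinstance(tree, tuple) and tree and tree[0] in ("and", "or"):
        counts = Counter([tree[0]])
        for child in tree[1:]:
            counts += nodes(child)
        return counts
    return Counter([tree])                 # a leaf property

def crude_distance(lf1, lf2):
    a, b = nodes(lf1), nodes(lf2)
    return sum(((a - b) + (b - a)).values())

human = ("or", ("and", "desk", "red"), ("and", "sofa", "blue"))
system = ("or", ("and", "desk", "red"), ("and", "sofa", "blue"))
d = crude_distance(human, system)          # identical LFs score 0
```

A CNF output with extra operators, as discussed for IA_bool below, would accumulate a larger count under any such node-based measure.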
As the means and modes indicate, IA_part outperformed IA_bool on both datasets, with a consistently higher PRP (this coincides with the modal score in the case of -LOC).
Pairwise t-tests showed that the trends were significant in both +LOC (t(147) = 9.28, p < .001) and -LOC (t(256) = 10.039, p < .001).
IA_bool has a higher (worse) mean on -LOC, but a better PRP than on +LOC.
This apparent discrepancy is partly due to variance in the edit distance scores.
For instance, because the Y attribute was highest in the preference order for +LOC, there were occasions when both referents could be identified using the same value of Y, which was therefore included by IA_bool at first pass, before considering disjunctions.
Since Y was highly preferred by authors (see Table 4), there was higher agreement on these cases, compared to those where the values of Y were different for the two referents.
In the latter case, Y was only included when disjunctions were considered, if at all.
The worse performance of IA_part on +LOC is due to a larger choice of attributes, also resulting in greater variance, and occasionally incurring a higher Edit cost when the algorithm overspecified more than a human author.
This is a potential shortcoming of the partitioning strategy outlined here, when it is applied to more complex domains.
Some example outputs are given below, in a domain where COLOUR sufficed to distinguish the referents, which had different values of this attribute (i.e. an instance of the VDS condition).
The formula returned by IA_part (10a) is identical to (the LF of) the human-authored description, with an Edit score of 0.
The output of IA_bool is shown in (10b).
As a result of IA_bool's requiring a property or disjunction to be true of the entire set of referents, COLOUR is not included until disjunctions are considered, while values of size and orientation are included at first pass.
By contrast, IA_part includes COLOUR before any other attribute apart from TYPE.
Though overspecification is common in our data, IA_bool overspecifies with the 'wrong' attributes
(those which are relatively dispreferred).
The rationale in IA_part is to overspecify only if a property will enhance referent similarity, and is sufficiently salient.
As for logical form, the Conjunctive Normal Form output of IA_bool increases the Edit score, given the larger number of logical operators in (10b) compared to (10a).
6 Summary and conclusions
This paper presented a study of plural reference, showing that people (a) partition sets based on the basic level type or category of their elements and (b) redundantly propagate attributes across disjuncts in a description, modulo their salience.
Our algorithm partitions a set opportunistically, and incorporates a corpus-derived heuristic to estimate the salience of a property.
Evaluation results showed that these principles are on the right track, with significantly better performance over a previous model (van Deemter, 2002).
The partitioning strategy is related to a proposal by van Deemter and Krahmer (2007), which performs exhaustive search for a partition of a set whose elements can be described non-disjunctively.
Unlike the present approach, this algorithm is non-incremental and computationally costly.
IA_part initially performs partitioning based on the basic-level type of objects, in line with the evidence.
However, later partitions can be induced by other properties, possibly yielding partitions even with same-TYPE referents (e.g. the blue chair and the red chair).
Aggregation (the blue and red chairs) may be desirable in such cases, but limits on the syntactic complexity of NPs are bound to play a role (Horacek, 2004).
Another possible limitation of IA_part is that, despite strong evidence for overspecification, complex domains could yield very lengthy outputs.
Strategies to avoid these include the utilisation of other Boolean operators, such as negation (the desks which are not red) (Horacek, 2004).
These issues are open to future empirical research.
7 Acknowledgements
Thanks to Ehud Reiter and Ielka van der Sluis for useful comments.
This work forms part of the TUNA project, supported by EPSRC grant GR/S13330/01.
