Fahime Same


2022

pdf
Constructing Distributions of Variation in Referring Expression Type from Corpora for Model Evaluation
T. Mark Ellison | Fahime Same
Proceedings of the Thirteenth Language Resources and Evaluation Conference

The generation of referring expressions (REs) is a non-deterministic task. However, the algorithms for the generation of REs are standardly evaluated against corpora of written texts which include only one RE per each reference. Our goal in this work is firstly to reproduce one of the few studies taking the distributional nature of the RE generation into account. We add to this work, by introducing a method for exploring variation in human RE choice on the basis of longitudinal corpora - substantial corpora with a single human judgement (in the process of composition) per RE. We focus on the prediction of RE types, proper name, description and pronoun. We compare evaluations made against distributions over these types with evaluations made against parallel human judgements. Our results show agreement in the evaluation of learning algorithms against distributions constructed from parallel human evaluations and from longitudinal data.

pdf
Non-neural Models Matter: a Re-evaluation of Neural Referring Expression Generation Systems
Fahime Same | Guanyi Chen | Kees Van Deemter
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

In recent years, neural models have often outperformed rule-based and classic Machine Learning approaches in NLG. These classic approaches are now often disregarded, for example when new neural models are evaluated. We argue that they should not be overlooked, since, for some tasks, well-designed non-neural approaches achieve better performance than neural ones. In this paper, the task of generating referring expressions in linguistic context is used as an example. We examined two very different English datasets (WEBNLG and WSJ), and evaluated each algorithm using both automatic and human evaluations.Overall, the results of these evaluations suggest that rule-based systems with simple rule sets achieve on-par or better performance on both datasets compared to state-of-the-art neural REG systems. In the case of the more realistic dataset, WSJ, a machine learning-based system with well-designed linguistic features performed best. We hope that our work can encourage researchers to consider non-neural models in future.

pdf
Assessing Neural Referential Form Selectors on a Realistic Multilingual Dataset
Guanyi Chen | Fahime Same | Kees Van Deemter
Proceedings of the 3rd Workshop on Evaluation and Comparison of NLP Systems

2021

pdf
What can Neural Referential Form Selectors Learn?
Guanyi Chen | Fahime Same | Kees van Deemter
Proceedings of the 14th International Conference on Natural Language Generation

Despite achieving encouraging results, neural Referring Expression Generation models are often thought to lack transparency. We probed neural Referential Form Selection (RFS) models to find out to what extent the linguistic features influencing the RE form are learned and captured by state-of-the-art RFS models. The results of 8 probing tasks show that all the defined features were learned to some extent. The probing tasks pertaining to referential status and syntactic position exhibited the highest performance. The lowest performance was achieved by the probing models designed to predict discourse structure properties beyond the sentence level.

2020

pdf
A Linguistic Perspective on Reference: Choosing a Feature Set for Generating Referring Expressions in Context
Fahime Same | Kees van Deemter
Proceedings of the 28th International Conference on Computational Linguistics

This paper reports on a structured evaluation of feature-based Machine Learning algorithms for selecting the form of a referring expression in discourse context. Based on this evaluation, we selected seven feature sets from the literature, amounting to 65 distinct linguistic features. The features were then grouped into 9 broad classes. After building Random Forest models, we used Feature Importance Ranking and Sequential Forward Search methods to assess the “importance” of the features. Combining the results of the two methods, we propose a consensus feature set. The 6 features in our consensus set come from 4 different classes, namely grammatical role, inherent features of the referent, antecedent form and recency.

pdf
Computational Interpretations of Recency for the Choice of Referring Expressions in Discourse
Fahime Same | Kees van Deemter
Proceedings of the First Workshop on Computational Approaches to Discourse

First, we discuss the most common linguistic perspectives on the concept of recency and propose a taxonomy of recency metrics employed in Machine Learning studies for choosing the form of referring expressions in discourse context. We then report on a Multi-Layer Perceptron study and a Sequential Forward Search experiment, followed by Bayes Factor analysis of the outcomes. The results suggest that recency metrics counting paragraphs and sentences contribute to referential choice prediction more than other recency-related metrics. Based on the results of our analysis, we argue that, sensitivity to discourse structure is important for recency metrics used in determining referring expression forms.