Proceedings of the First Workshop on Multilingual Surface Realisation

Simon Mille, Anja Belz, Bernd Bohnet, Emily Pitler, Leo Wanner (Editors)


Anthology ID: W18-36
Month: July
Year: 2018
Address: Melbourne, Australia
Venue: ACL
SIG: SIGGEN
Publisher: Association for Computational Linguistics
URL: https://aclanthology.org/W18-36
PDF: https://preview.aclanthology.org/add_acl24_videos/W18-36.pdf

Proceedings of the First Workshop on Multilingual Surface Realisation
Simon Mille | Anja Belz | Bernd Bohnet | Emily Pitler | Leo Wanner

The First Multilingual Surface Realisation Shared Task (SR’18): Overview and Evaluation Results
Simon Mille | Anja Belz | Bernd Bohnet | Yvette Graham | Emily Pitler | Leo Wanner

We report results from the SR’18 Shared Task, a new multilingual surface realisation task organised as part of the ACL’18 Workshop on Multilingual Surface Realisation. As in its English-only predecessor task SR’11, the shared task comprised two tracks with different levels of complexity: (a) a shallow track where the inputs were full UD structures with word order information removed and tokens lemmatised; and (b) a deep track where additionally, functional words and morphological information were removed. The shallow track was offered in ten languages, and the deep track in three. Systems were evaluated (a) automatically, using a range of intrinsic metrics, and (b) by human judges in terms of readability and meaning similarity. This report presents the evaluation results, along with descriptions of the SR’18 tracks, data and evaluation methods. For full descriptions of the participating systems, please see the separate system reports elsewhere in this volume.

BinLin: A Simple Method of Dependency Tree Linearization
Yevgeniy Puzikov | Iryna Gurevych

The Surface Realization Shared Task 2018 is a shared task on generating sentences from lemmatized sets of dependency triples. This paper describes the results of our participation in the challenge. We develop a data-driven pipeline system which first orders the lemmas and then inflects the words to finish the surface realization process. Our contribution is a novel sequential method of ordering lemmas, which, despite its simplicity, achieves promising results. We demonstrate the effectiveness of the proposed approach, describe its limitations and outline ways to improve it.

IIT (BHU) Varanasi at MSR-SRST 2018: A Language Model Based Approach for Natural Language Generation
Shreyansh Singh | Ayush Sharma | Avi Chawla | A.K. Singh

This paper describes our submission system for the Shallow Track of the Surface Realization Shared Task 2018 (SRST’18). The task was to convert genuine UD structures, from which word order information had been removed and whose tokens had been lemmatized, into their correct sentential form. We divide the problem statement into two parts: word reinflection and correct word order prediction. For the first sub-problem, we use a Long Short-Term Memory (LSTM) based encoder-decoder approach. For the second sub-problem, we present a language model (LM) based approach, in which we apply two different sub-approaches and take their combined result as the final output of the system.
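
As a purely illustrative sketch of a language-model-based ordering step (not the authors' implementation; the toy bigram model and beam search below are assumptions made for the example), one can score candidate word orders with an LM and keep only the best-scoring prefixes:

from collections import defaultdict
import math

class BigramLM:
    """Toy add-one-smoothed bigram LM over whitespace-tokenised sentences."""
    def __init__(self, sentences):
        self.unigrams = defaultdict(int)
        self.bigrams = defaultdict(int)
        self.vocab = set()
        for sent in sentences:
            toks = ["<s>"] + sent.split() + ["</s>"]
            for a, b in zip(toks, toks[1:]):
                self.unigrams[a] += 1
                self.bigrams[(a, b)] += 1
                self.vocab.update((a, b))

    def logprob(self, prev, word):
        # Add-one smoothing keeps unseen bigrams from zeroing out a hypothesis.
        return math.log((self.bigrams[(prev, word)] + 1) /
                        (self.unigrams[prev] + len(self.vocab)))

def order_words(words, lm, beam_size=5):
    """Beam search over word orders, scoring each growing prefix with the LM."""
    beams = [(0.0, ["<s>"], list(words))]  # (score, prefix, words still to place)
    while beams and beams[0][2]:
        candidates = []
        for score, prefix, remaining in beams:
            for i, w in enumerate(remaining):
                candidates.append((score + lm.logprob(prefix[-1], w),
                                   prefix + [w],
                                   remaining[:i] + remaining[i + 1:]))
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:beam_size]
    best = max(beams, key=lambda c: c[0] + lm.logprob(c[1][-1], "</s>"))
    return " ".join(best[1][1:])

lm = BigramLM(["the dog barks", "the cat sleeps", "a dog sleeps"])
print(order_words(["barks", "dog", "the"], lm))  # -> "the dog barks"

A real system would of course use a much stronger language model trained on large corpora.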

Surface Realization Shared Task 2018 (SR18): The Tilburg University Approach
Thiago Castro Ferreira | Sander Wubben | Emiel Krahmer

This study describes the approach developed by the Tilburg University team for the shallow track of the Multilingual Surface Realization Shared Task 2018 (SR18). Based on Castro Ferreira et al. (2017), the approach works by first preprocessing an input dependency tree into an ordered linearized string, which is then realized using a statistical machine translation model. Our approach shows promising results, with BLEU scores above 50 for five languages (English, French, Italian, Portuguese and Spanish) and above 35 for Dutch.
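
As a minimal, hedged sketch of what preprocessing a dependency tree into an ordered linearized string can look like in general (not the Tilburg system; the bracketing scheme and field names are illustrative assumptions):

def linearize(tree, node_id=0):
    """Depth-first linearisation: emit the node's lemma, then each child in a bracketed span."""
    node = tree[node_id]
    parts = [node["lemma"]]
    for child_id in node["children"]:
        parts.append("(")
        parts.extend(linearize(tree, child_id).split())
        parts.append(")")
    return " ".join(parts)

# Toy UD-style input: {id: {"lemma", "children"}}, with the root at id 0.
toy_tree = {
    0: {"lemma": "see", "children": [1, 2]},
    1: {"lemma": "I",   "children": []},
    2: {"lemma": "dog", "children": [3]},
    3: {"lemma": "the", "children": []},
}
print(linearize(toy_tree))  # see ( I ) ( dog ( the ) )

The resulting string can then be treated as the "source language" of a machine translation model whose "target language" is the realized sentence.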

The OSU Realizer for SRST ’18: Neural Sequence-to-Sequence Inflection and Incremental Locality-Based Linearization
David King | Michael White

Surface realization is a nontrivial task, as it involves taking structured data and producing grammatically and semantically correct utterances. Many competing grammar-based and statistical models for realization still struggle with relatively simple sentences. For our submission to the 2018 Surface Realization Shared Task, we tackle the shallow task by first generating inflected wordforms with a neural sequence-to-sequence model and then incrementally linearizing them. For linearization, we use a global linear model trained with early update that makes use of features capturing the dependency structure and dependency locality. This pipeline sufficed to produce surprisingly strong results in the shared task. In future work, we intend to pursue joint approaches to linearization and morphological inflection, and to incorporate a neural language model into the linearization choices.
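
The notion of dependency locality used for linearization can be made concrete with a small, hedged example (not the OSU model, which uses a global linear model with richer features): one standard operationalisation is the total distance between each dependent and its head in a candidate order, which a linearizer can use as a scoring feature.

from itertools import permutations

def total_dependency_length(order, heads):
    """Sum of |position(dependent) - position(head)| for a candidate word order.
    `heads` maps each token to its head token; the root maps to itself."""
    pos = {tok: i for i, tok in enumerate(order)}
    return sum(abs(pos[tok] - pos[head]) for tok, head in heads.items() if head != tok)

# Toy example: "barks" is the root, "dog" its subject, "the" and "loudly" dependents.
heads = {"the": "dog", "dog": "barks", "barks": "barks", "loudly": "barks"}
best = min(permutations(heads), key=lambda o: total_dependency_length(o, heads))
print(best, total_dependency_length(best, heads))  # e.g. ('the', 'dog', 'barks', 'loudly') 3

Exhaustive search over permutations is only feasible for toy inputs; an incremental, beam-style search of the kind described in the paper is the realistic alternative.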

Generating High-Quality Surface Realizations Using Data Augmentation and Factored Sequence Models
Henry Elder | Chris Hokamp

This work presents state-of-the-art results in the reconstruction of surface realizations from obfuscated text. We identify the lack of sufficient training data as the major obstacle to training high-performing models, and address this issue by generating large amounts of synthetic training data. We also propose preprocessing techniques which make the structure contained in the input features more accessible to sequence models. Our models were ranked first on all evaluation metrics in the English portion of the 2018 Surface Realization Shared Task.
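
A brief sketch of what a "factored" input to a sequence model can look like (illustrative only, not the authors' exact feature set; the field names are assumptions): each input token's features are joined into a single factored symbol before being fed to the model.

def factorize(ud_tokens):
    """Turn a list of UD-style token dicts into factored input symbols."""
    return ["|".join((t["lemma"], t["upos"], t["deprel"])) for t in ud_tokens]

tokens = [
    {"lemma": "dog",  "upos": "NOUN", "deprel": "nsubj"},
    {"lemma": "bark", "upos": "VERB", "deprel": "root"},
]
print(factorize(tokens))  # ['dog|NOUN|nsubj', 'bark|VERB|root']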

AX Semantics’ Submission to the Surface Realization Shared Task 2018
Andreas Madsack | Johanna Heininger | Nyamsuren Davaasambuu | Vitaliia Voronik | Michael Käufl | Robert Weißgraeber

In this paper we describe our system and experimental results on the development set of the Surface Realisation Shared Task. Our system is an entry for the shallow track; it combines two different deep-learning models for building the sentence with a rule-based morphology component.

NILC-SWORNEMO at the Surface Realization Shared Task: Exploring Syntax-Based Word Ordering using Neural Models
Marco Antonio Sobrevilla Cabezudo | Thiago Pardo

This paper describes the submission by the NILC Computational Linguistics research group of the University of São Paulo, Brazil, to Track 1 of the Surface Realization Shared Task (SRST Track 1). We present a neural-based method that works at the syntactic level to order the words (which we refer to as NILC-SWORNEMO, standing for “Syntax-based Word ORdering using NEural MOdels”). Additionally, we apply a bottom-up approach to build the sentence and, using language-specific lexicons, produce the proper word form of each lemma in the sentence. The results obtained by our method outperformed the average results for English, Portuguese and Spanish in the track.

The DipInfo-UniTo system for SRST 2018
Valerio Basile | Alessandro Mazzei

This paper describes the system developed by the DipInfo-UniTo team to participate in the shallow track of the Surface Realization Shared Task 2018. The system employs two separate neural networks with different architectures to predict word ordering and morphological inflection independently of each other. The UniTO realizer is language-independent, and its simple architecture placed it in the middle of the final ranking of the shared task.