Abstract
Although there has been much work in recent years on data-driven natural language generation, little attention has been paid to the fine-grained interactions that arise during microplanning between aggregation, surface realization, and sentence segmentation. In this article, we propose a hybrid symbolic/statistical approach to jointly model the constraints regulating these interactions. Our approach integrates a small handwritten grammar, a statistical hypertagger, and a surface realization algorithm. It is applied to the verbalization of knowledge base queries and tested on 13 knowledge bases to demonstrate domain independence. We evaluate our approach in several ways. A quantitative analysis shows that the hybrid approach outperforms a purely symbolic approach in terms of both speed and coverage. Results from a human study indicate that users find the output of this hybrid statistical/symbolic system more fluent than both a template-based and a purely symbolic grammar-based approach. Finally, we illustrate by means of examples that our approach can account for various factors impacting aggregation, sentence segmentation, and surface realization.
- Anthology ID:
- J17-1001
- Volume:
- Computational Linguistics, Volume 43, Issue 1 - April 2017
- Month:
- April
- Year:
- 2017
- Address:
- Cambridge, MA
- Venue:
- CL
- Publisher:
- MIT Press
- Pages:
- 1–30
- URL:
- https://aclanthology.org/J17-1001
- DOI:
- 10.1162/COLI_a_00273
- Cite (ACL):
- Claire Gardent and Laura Perez-Beltrachini. 2017. A Statistical, Grammar-Based Approach to Microplanning. Computational Linguistics, 43(1):1–30.
- Cite (Informal):
- A Statistical, Grammar-Based Approach to Microplanning (Gardent & Perez-Beltrachini, CL 2017)
- PDF:
- https://preview.aclanthology.org/auto-file-uploads/J17-1001.pdf