Logic2Text: High-Fidelity Natural Language Generation from Logical Forms
Zhiyu Chen, Wenhu Chen, Hanwen Zha, Xiyou Zhou, Yunkai Zhang, Sairam Sundaresan, William Yang Wang
Abstract
Previous studies on Natural Language Generation (NLG) from structured data have primarily focused on surface-level descriptions of record sequences. However, for complex structured data, e.g., multi-row tables, it is often desirable for an NLG system to describe interesting facts from logical inferences across records. If only provided with the table, it is hard for existing models to produce controllable and high-fidelity logical generations. In this work, we formulate high-fidelity NLG as generation from logical forms in order to obtain controllable and faithful generations. We present a new large-scale dataset, Logic2Text, with 10,753 descriptions involving common logic types paired with the underlying logical forms. The logical forms show diversified graph structures of free schema, which pose great challenges to the model’s ability to understand the semantics. We experiment on (1) fully-supervised training with the full dataset, and (2) a few-shot setting, provided with hundreds of paired examples; we compare several popular generation models and analyze their performance. We hope our dataset can encourage research towards building an advanced NLG system capable of natural, faithful, and human-like generation. The dataset and code are available at https://github.com/czyssrs/Logic2Text.
- Anthology ID:
- 2020.findings-emnlp.190
- Volume:
- Findings of the Association for Computational Linguistics: EMNLP 2020
- Month:
- November
- Year:
- 2020
- Address:
- Online
- Editors:
- Trevor Cohn, Yulan He, Yang Liu
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 2096–2111
- URL:
- https://aclanthology.org/2020.findings-emnlp.190
- DOI:
- 10.18653/v1/2020.findings-emnlp.190
- Cite (ACL):
- Zhiyu Chen, Wenhu Chen, Hanwen Zha, Xiyou Zhou, Yunkai Zhang, Sairam Sundaresan, and William Yang Wang. 2020. Logic2Text: High-Fidelity Natural Language Generation from Logical Forms. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 2096–2111, Online. Association for Computational Linguistics.
- Cite (Informal):
- Logic2Text: High-Fidelity Natural Language Generation from Logical Forms (Chen et al., Findings 2020)
- PDF:
- https://preview.aclanthology.org/naacl-24-ws-corrections/2020.findings-emnlp.190.pdf
- Code:
- czyssrs/Logic2Text
- Data:
- Logic2Text, WikiBio