Mingming Sun


2020

A Predicate-Function-Argument Annotation of Natural Language for Open-Domain Information eXpression
Mingming Sun | Wenyue Hua | Zoey Liu | Xin Wang | Kangjie Zheng | Ping Li
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Existing OIE (Open Information Extraction) algorithms are developed independently of each other, which leads to much redundant work; their featured strategies are neither reusable nor adaptable to new tasks. This paper proposes a new pipeline for building OIE systems, in which an Open-domain Information eXpression (OIX) task provides a platform for all OIE strategies. The OIX is an OIE-friendly expression of a sentence without information loss. The generation procedure of OIX covers the work shared by OIE algorithms, so that OIE strategies can be developed on the OIX platform as inference operations focusing on more critical problems. Built on the same OIX platform, OIE strategies are reusable, and people can select a set of strategies to assemble an algorithm for a specific task, so that adaptability may be significantly increased. This paper focuses on the OIX task and proposes a solution: Open Information Annotation (OIA). OIA is a predicate-function-argument annotation for sentences. We label a data set of sentence-OIA pairs and propose a dependency-based rule system to generate OIA annotations from sentences. The evaluation results reveal that learning the OIA from a sentence is challenging owing to the complexity of natural language sentences, and that the task deserves more attention from the research community.

Learning Interpretable Relationships between Entities, Relations and Concepts via Bayesian Structure Learning on Open Domain Facts
Jingyuan Zhang | Mingming Sun | Yue Feng | Ping Li
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Concept graphs are created as universal taxonomies for text understanding over open-domain knowledge. The nodes in concept graphs include both entities and concepts. The edges run from entities to concepts, showing that an entity is an instance of a concept. In this paper, we propose the task of learning interpretable relationships from open-domain facts to enrich and refine concept graphs. Bayesian network structures are learned from open-domain facts as the interpretable relationships between the relations of facts and the concepts of entities. We conduct extensive experiments on public English and Chinese datasets. Compared to state-of-the-art methods, the learned network structures help improve the identification of concepts for entities based on the relations of entities on both datasets.

2019

Reinforced Product Metadata Selection for Helpfulness Assessment of Customer Reviews
Miao Fan | Chao Feng | Mingming Sun | Ping Li
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

To automatically assess the helpfulness of an online customer review, conventional approaches generally acquire various linguistic and neural embedding features solely from the textual content of the review itself as evidence. We find, however, that a helpful review is largely concerned with the metadata (such as the name, the brand, and the category) of its target product. This leaves us with the challenge of choosing the correct key-value product metadata to help appraise the helpfulness of free-text reviews more precisely. To address this problem, we propose a novel framework composed of two mutually beneficial modules. Given a product, a selector (agent) learns from both the keys in the product metadata and one of its reviews to take an action that selects the correct value, and a subsequent predictor (network) makes the free-text review attend to this value to obtain better neural representations for helpfulness assessment. The predictor is directly optimized by SGD with the helpfulness prediction loss, and the selector is updated via policy gradient, rewarded with the performance of the predictor. We use two real-world datasets, from Amazon.com and Yelp.com respectively, to compare the performance of our framework with other mainstream methods under two application scenarios: helpfulness identification and regression of customer reviews. Extensive experimental results demonstrate that our framework can achieve state-of-the-art performance with substantial improvements.

2018

Logician and Orator: Learning from the Duality between Language and Knowledge in Open Domain
Mingming Sun | Xu Li | Ping Li
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

We propose the task of Open-Domain Information Narration (OIN) as the reverse task of Open Information Extraction (OIE), to implement the dual structure between language and knowledge in the open domain. We then develop an agent, called Orator, to accomplish the OIN task, and assemble the Orator and the recently proposed OIE agent, Logician, into a dual system that exploits this duality with a reinforcement learning paradigm. Experimental results reveal that the dual structure between the OIE and OIN tasks helps to build better OIE agents and better OIN agents.