Xiangliang Zhang


ArMATH: a Dataset for Solving Arabic Math Word Problems
Reem Alghamdi | Zhenwen Liang | Xiangliang Zhang
Proceedings of the Thirteenth Language Resources and Evaluation Conference

This paper studies solving Arabic Math Word Problems by deep learning. A Math Word Problem (MWP) is a text description of a mathematical problem that can be solved by deriving a math equation to reach the answer. Effective models have been developed for solving MWPs in English and Chinese. However, Arabic MWPs are rarely studied. This paper contributes the first large-scale dataset for Arabic MWPs, which contains 6,000 samples of primary-school math problems, written in Modern Standard Arabic (MSA). Arabic MWP solvers are then built with deep learning models and evaluated on this dataset. In addition, a transfer learning model is built to let the high-resource Chinese MWP solver promote the performance of the low-resource Arabic MWP solver. This work is the first to use deep learning methods to solve Arabic MWP and the first to use transfer learning to solve MWP across different languages. The transfer learning enhanced solver has an accuracy of 74.15%, which is 3% higher than the solver without using transfer learning. We make the dataset and solvers available in public for encouraging more research of Arabic MWPs: https://github.com/reem-codes/ArMATH

Unsupervised Mitigating Gender Bias by Character Components: A Case Study of Chinese Word Embedding
Xiuying Chen | Mingzhe Li | Rui Yan | Xin Gao | Xiangliang Zhang
Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP)

Word embeddings learned from massive text collections have demonstrated significant levels of discriminative biases.However, debias on the Chinese language, one of the most spoken languages, has been less explored.Meanwhile, existing literature relies on manually created supplementary data, which is time- and energy-consuming.In this work, we propose the first Chinese Gender-neutral word Embedding model (CGE) based on Word2vec, which learns gender-neutral word embeddings without any labeled data.Concretely, CGE utilizes and emphasizes the rich feminine and masculine information contained in radicals, i.e., a kind of component in Chinese characters, during the training procedure.This consequently alleviates discriminative gender biases.Experimental results on public benchmark datasets show that our unsupervised method outperforms the state-of-the-art supervised debiased word embedding models without sacrificing the functionality of the embedding model.

MWP-BERT: Numeracy-Augmented Pre-training for Math Word Problem Solving
Zhenwen Liang | Jipeng Zhang | Lei Wang | Wei Qin | Yunshi Lan | Jie Shao | Xiangliang Zhang
Findings of the Association for Computational Linguistics: NAACL 2022

Math word problem (MWP) solving faces a dilemma in number representation learning. In order to avoid the number representation issue and reduce the search space of feasible solutions, existing works striving for MWP solving usually replace real numbers with symbolic placeholders to focus on logic reasoning. However, different from common symbolic reasoning tasks like program synthesis and knowledge graph reasoning, MWP solving has extra requirements in numerical reasoning. In other words, instead of the number value itself, it is the reusable numerical property that matters more in numerical reasoning. Therefore, we argue that injecting numerical properties into symbolic placeholders with contextualized representation learning schema can provide a way out of the dilemma in the number representation issue here. In this work, we introduce this idea to the popular pre-training language model (PLM) techniques and build MWP-BERT, an effective contextual number representation PLM. We demonstrate the effectiveness of our MWP-BERT on MWP solving and several MWP-specific understanding tasks on both English and Chinese benchmarks.

Scientific Paper Extractive Summarization Enhanced by Citation Graphs
Xiuying Chen | Mingzhe Li | Shen Gao | Rui Yan | Xin Gao | Xiangliang Zhang
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

In a citation graph, adjacent paper nodes share related scientific terms and topics. The graph thus conveys unique structure information of document-level relatedness that can be utilized in the paper summarization task, for exploring beyond the intra-document information.In this work, we focus on leveraging citation graphs to improve scientific paper extractive summarization under different settings.We first propose a Multi-granularity Unsupervised Summarization model (MUS) as a simple and low-cost solution to the task.MUS finetunes a pre-trained encoder model on the citation graph by link prediction tasks.Then, the abstract sentences are extracted from the corresponding paper considering multi-granularity information.Preliminary results demonstrate that citation graph is helpful even in a simple unsupervised framework.Motivated by this, we next propose a Graph-based Supervised Summarizationmodel (GSS) to achieve more accurate results on the task when large-scale labeled data are available.Apart from employing the link prediction as an auxiliary task, GSS introduces a gated sentence encoder and a graph information fusion module to take advantage of the graph information to polish the sentence representation.Experiments on a public benchmark dataset show that MUS and GSS bring substantial improvements over the prior state-of-the-art model.

ArtELingo: A Million Emotion Annotations of WikiArt with Emphasis on Diversity over Language and Culture
Youssef Mohamed | Mohamed Abdelfattah | Shyma Alhuwaider | Feifan Li | Xiangliang Zhang | Kenneth Church | Mohamed Elhoseiny
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

This paper introduces ArtELingo, a new benchmark and dataset, designed to encourage work on diversity across languages and cultures. Following ArtEmis, a collection of 80k artworks from WikiArt with 0.45M emotion labels and English-only captions, ArtELingo adds another 0.79M annotations in Arabic and Chinese, plus 4.8K in Spanish to evaluate “cultural-transfer” performance. 51K artworks have 5 annotations or more in 3 languages. This diversity makes it possible to study similarities and differences across languages and cultures. Further, we investigate captioning tasks, and find diversity improves the performance of baseline models. ArtELingo is publicly available at ‘www.artelingo.org‘ with standard splits and baseline models. We hope our work will help ease future research on multilinguality and culturally-aware AI.

Analogical Math Word Problems Solving with Enhanced Problem-Solution Association
Zhenwen Liang | Jipeng Zhang | Xiangliang Zhang
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Math word problem (MWP) solving is an important task in question answering which requires human-like reasoning ability. Analogical reasoning has long been used in mathematical education, as it enables students to apply common relational structures of mathematical situations to solve new problems. In this paper, we propose to build a novel MWP solver by leveraging analogical MWPs, which advance the solver’s generalization ability across different kinds of MWPs. The key idea, named analogy identification, is to associate the analogical MWP pairs in a latent space, i.e., encoding an MWP close to another analogical MWP, while leaving away from the non-analogical ones. Moreover, a solution discriminator is integrated into the MWP solver to enhance the association between an MWP and its true solution. The evaluation results verify that our proposed analogical learning strategy promotes the performance of MWP-BERT on Math23k over the state-of-the-art model Generate2Rank, with 5 times fewer parameters in the encoder. We also find that our model has a stronger generalization ability in solving difficult MWPs due to the analogical learning from easy MWPs.


Data-Efficient Language Shaped Few-shot Image Classification
Zhenwen Liang | Xiangliang Zhang
Findings of the Association for Computational Linguistics: EMNLP 2021

Many existing works have demonstrated that language is a helpful guider for image understanding by neural networks. We focus on a language-shaped learning problem in a few-shot setting, i.e., using language to improve few-shot image classification when language descriptions are only available during training. We propose a data-efficient method that can make the best usage of the few-shot images and the language available only in training. Experimental results on dataset ShapeWorld and Birds show that our method outperforms other state-of-the-art baselines in language-shaped few-shot learning area, especially when training data is more severely limited. Therefore, we call our approach data-efficient language-shaped learning (DF-LSL).

Capturing Relations between Scientific Papers: An Abstractive Model for Related Work Section Generation
Xiuying Chen | Hind Alamro | Mingzhe Li | Shen Gao | Xiangliang Zhang | Dongyan Zhao | Rui Yan
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Given a set of related publications, related work section generation aims to provide researchers with an overview of the specific research area by summarizing these works and introducing them in a logical order. Most of existing related work generation models follow the inflexible extractive style, which directly extract sentences from multiple original papers to form a related work discussion. Hence, in this paper, we propose a Relation-aware Related work Generator (RRG), which generates an abstractive related work from the given multiple scientific papers in the same research area. Concretely, we propose a relation-aware multi-document encoder that relates one document to another according to their content dependency in a relation graph. The relation graph and the document representation are interacted and polished iteratively, complementing each other in the training process. We also contribute two public datasets composed of related work sections and their corresponding papers. Extensive experiments on the two datasets show that the proposed model brings substantial improvements over several strong baselines. We hope that this work will promote advances in related work generation task.