Hao Zheng


2022

Disentangled Sequence to Sequence Learning for Compositional Generalization
Hao Zheng | Mirella Lapata
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

There is mounting evidence that existing neural network models, in particular the very popular sequence-to-sequence architecture, struggle to systematically generalize to unseen compositions of seen components. We demonstrate that one of the reasons hindering compositional generalization relates to representations being entangled. We propose an extension to sequence-to-sequence models which encourages disentanglement by adaptively re-encoding (at each time step) the source input. Specifically, we condition the source representations on the newly decoded target context, which makes it easier for the encoder to exploit specialized information for each prediction rather than capturing it all in a single forward pass. Experimental results on semantic parsing and machine translation empirically show that our proposal delivers more disentangled representations and better generalization.
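
The following is a rough, hand-written sketch of the adaptive re-encoding idea described in the abstract (re-encoding the source at every decoding step, conditioned on the target context decoded so far). It is not the authors' released code: the module choices (GRU, multi-head attention), names, and dimensions are illustrative assumptions only.

```python
import torch
import torch.nn as nn


class AdaptiveReEncoder(nn.Module):
    """Sketch: source states are re-computed at each target step."""

    def __init__(self, src_vocab, tgt_vocab, d_model=256):
        super().__init__()
        self.src_embed = nn.Embedding(src_vocab, d_model)
        self.tgt_embed = nn.Embedding(tgt_vocab, d_model)
        # base encoder producing context-independent source states
        self.base_encoder = nn.GRU(d_model, d_model, batch_first=True)
        # re-encoder that conditions source states on the current target context
        self.reencode = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        self.decoder_cell = nn.GRUCell(d_model * 2, d_model)
        self.out = nn.Linear(d_model, tgt_vocab)

    def forward(self, src_ids, tgt_ids):
        src, _ = self.base_encoder(self.src_embed(src_ids))      # (B, S, d)
        h = src.mean(dim=1)                                       # initial decoder state
        logits = []
        for t in range(tgt_ids.size(1)):
            tgt_ctx = h.unsqueeze(1)                              # newly decoded target context
            # re-encode the source conditioned on the target context
            # (query = source states, key/value = target context)
            src_t, _ = self.reencode(src, tgt_ctx, tgt_ctx)       # (B, S, d), step-specific
            attn = torch.softmax(src_t @ h.unsqueeze(-1), dim=1)  # (B, S, 1)
            ctx = (attn * src_t).sum(dim=1)                       # (B, d)
            h = self.decoder_cell(
                torch.cat([self.tgt_embed(tgt_ids[:, t]), ctx], dim=-1), h)
            logits.append(self.out(h))
        return torch.stack(logits, dim=1)                         # (B, T, tgt_vocab)


# usage sketch with random toy inputs
model = AdaptiveReEncoder(src_vocab=100, tgt_vocab=80)
logits = model(torch.randint(0, 100, (2, 6)), torch.randint(0, 80, (2, 9)))
```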

2021

LenAtten: An Effective Length Controlling Unit For Text Summarization
Zhongyi Yu | Zhenghao Wu | Hao Zheng | Zhe XuanYuan | Jefferson Fong | Weifeng Su
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

Compositional Generalization via Semantic Tagging
Hao Zheng | Mirella Lapata
Findings of the Association for Computational Linguistics: EMNLP 2021

Although neural sequence-to-sequence models have been successfully applied to semantic parsing, they fail at compositional generalization, i.e., they are unable to systematically generalize to unseen compositions of seen components. Motivated by traditional semantic parsing where compositionality is explicitly accounted for by symbolic grammars, we propose a new decoding framework that preserves the expressivity and generality of sequence-to-sequence models while featuring lexicon-style alignments and disentangled information processing. Specifically, we decompose decoding into two phases where an input utterance is first tagged with semantic symbols representing the meaning of individual words, and then a sequence-to-sequence model is used to predict the final meaning representation conditioned on the utterance and the predicted tag sequence. Experimental results on three semantic parsing datasets show that the proposed approach consistently improves compositional generalization across model architectures, domains, and semantic formalisms.
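
Below is a minimal, hedged sketch of the two-phase decoding scheme sketched in the abstract: a tagger first labels every input token with a semantic symbol, and a seq2seq model then predicts the meaning representation conditioned on both the utterance and the predicted tags. This is an illustration under my own assumptions (module names, LSTM components, and dimensions), not the paper's implementation.

```python
import torch
import torch.nn as nn


class SemanticTagger(nn.Module):
    """Phase 1: predict a per-token semantic symbol (lexicon-style alignment)."""

    def __init__(self, vocab, n_tags, d=128):
        super().__init__()
        self.embed = nn.Embedding(vocab, d)
        self.enc = nn.LSTM(d, d, batch_first=True, bidirectional=True)
        self.tag_out = nn.Linear(2 * d, n_tags)

    def forward(self, utt_ids):
        states, _ = self.enc(self.embed(utt_ids))
        return self.tag_out(states).argmax(-1)            # (B, S) predicted tags


class TagConditionedSeq2Seq(nn.Module):
    """Phase 2: encode utterance + tags, decode the meaning representation."""

    def __init__(self, vocab, n_tags, mr_vocab, d=128):
        super().__init__()
        self.tok_embed = nn.Embedding(vocab, d)
        self.tag_embed = nn.Embedding(n_tags, d)
        self.encoder = nn.LSTM(2 * d, d, batch_first=True)
        self.mr_embed = nn.Embedding(mr_vocab, d)
        self.decoder = nn.LSTM(d, d, batch_first=True)
        self.out = nn.Linear(d, mr_vocab)

    def forward(self, utt_ids, tag_ids, mr_ids):
        # concatenate token and tag embeddings so each encoder state carries
        # the word-level semantic symbol predicted in phase 1
        enc_in = torch.cat([self.tok_embed(utt_ids), self.tag_embed(tag_ids)], dim=-1)
        _, (h, c) = self.encoder(enc_in)
        dec_states, _ = self.decoder(self.mr_embed(mr_ids), (h, c))
        return self.out(dec_states)                        # (B, T, mr_vocab)


# usage sketch with toy sizes
tagger = SemanticTagger(vocab=1000, n_tags=50)
parser = TagConditionedSeq2Seq(vocab=1000, n_tags=50, mr_vocab=200)
utt = torch.randint(0, 1000, (2, 7))
mr = torch.randint(0, 200, (2, 12))
logits = parser(utt, tagger(utt), mr)
```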

2019

BNU-HKBU UIC NLP Team 2 at SemEval-2019 Task 6: Detecting Offensive Language Using BERT model
Zhenghao Wu | Hao Zheng | Jianming Wang | Weifeng Su | Jefferson Fong
Proceedings of the 13th International Workshop on Semantic Evaluation

In this study we deal with the problem of identifying and categorizing offensive language in social media. Our group, BNU-HKBU UIC NLP Team 2, uses supervised classification along with multiple versions of the data generated by different pre-processing methods. We then use the state-of-the-art model Bidirectional Encoder Representations from Transformers, or BERT (Devlin et al., 2018), to capture linguistic, syntactic and semantic features. Long-range dependencies between each part of a sentence can be captured by BERT’s bidirectional encoder representations. Our results show 85.12% accuracy and an 80.57% F1 score in Subtask A (offensive language identification), 87.92% accuracy and a 50% F1 score in Subtask B (categorization of offense types), and 69.95% accuracy and a 50.47% F1 score in Subtask C (offense target identification). Analysis of the results shows that distinguishing between targeted and untargeted offensive language is not a simple task. More work needs to be done on the unbalanced data problem in Subtasks B and C. Some future work is also discussed.
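
As a rough illustration of the general recipe described above (fine-tuning BERT for offensive-language classification), here is a minimal sketch using the Hugging Face transformers library. It is not the team's actual pipeline: the pre-processing, example texts, labels, and hyperparameters below are placeholders.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# BERT with a sequence-classification head; 2 labels for Subtask A
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

texts = ["example tweet one", "example tweet two"]   # placeholder pre-processed tweets
labels = torch.tensor([0, 1])                        # 0 = not offensive, 1 = offensive

batch = tokenizer(texts, padding=True, truncation=True, max_length=128, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# one illustrative training step
model.train()
outputs = model(**batch, labels=labels)              # outputs.loss is the cross-entropy loss
outputs.loss.backward()
optimizer.step()
```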

Sentence Centrality Revisited for Unsupervised Summarization
Hao Zheng | Mirella Lapata
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Single document summarization has enjoyed renewed interest in recent years thanks to the popularity of neural network models and the availability of large-scale datasets. In this paper we develop an unsupervised approach, arguing that it is unrealistic to expect large-scale and high-quality training data to be available or created for different types of summaries, domains, or languages. We revisit a popular graph-based ranking algorithm and modify how node (aka sentence) centrality is computed in two ways: (a) we employ BERT, a state-of-the-art neural representation learning model, to better capture sentential meaning and (b) we build graphs with directed edges, arguing that the contribution of any two nodes to their respective centrality is influenced by their relative position in a document. Experimental results on three news summarization datasets representative of different languages and writing styles show that our approach outperforms strong baselines by a wide margin.
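
The core of the directed-centrality idea can be illustrated in a few lines: a sentence's score weights its similarity to sentences that follow it differently from sentences that precede it. The sketch below is my own illustration rather than the paper's code; the lambda weights, the dot-product similarity, and the placeholder embeddings (which would come from a BERT-style encoder in practice) are assumptions.

```python
import numpy as np


def directed_centrality(emb, lambda_fwd=1.0, lambda_bwd=0.3):
    """emb: (n_sents, dim) sentence embeddings in document order."""
    sim = emb @ emb.T                    # pairwise dot-product similarity
    np.fill_diagonal(sim, 0.0)           # ignore self-similarity
    n = len(emb)
    scores = np.zeros(n)
    for i in range(n):
        forward = sim[i, i + 1:].sum()   # edges to sentences that come later
        backward = sim[i, :i].sum()      # edges to sentences that came earlier
        scores[i] = lambda_fwd * forward + lambda_bwd * backward
    return scores


# usage sketch: pick the top-k central sentences as the extractive summary
emb = np.random.randn(10, 768)           # stand-in for BERT sentence vectors
summary_idx = np.argsort(-directed_centrality(emb))[:3]
```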