Hu Zhang


2023

pdf
Dynamic Heterogeneous-Graph Reasoning with Language Models and Knowledge Representation Learning for Commonsense Question Answering
Yujie Wang | Hu Zhang | Jiye Liang | Ru Li
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Recently, knowledge graphs (KGs) have won noteworthy success in commonsense question answering. Existing methods retrieve relevant subgraphs in the KGs through key entities and reason about the answer with language models (LMs) and graph neural networks. However, they ignore (i) optimizing the knowledge representation and structure of subgraphs and (ii) deeply fusing heterogeneous QA context with subgraphs. In this paper, we propose a dynamic heterogeneous-graph reasoning method with LMs and knowledge representation learning (DHLK), which constructs a heterogeneous knowledge graph (HKG) based on multiple knowledge sources and optimizes the structure and knowledge representation of the HKG using a two-stage pruning strategy and knowledge representation learning (KRL). It then performs joint reasoning by LMs and Relation Mask Self-Attention (RMSA). Specifically, DHLK filters key entities based on the dictionary vocabulary to achieve the first-stage pruning while incorporating the paraphrases in the dictionary into the subgraph to construct the HKG. Then, DHLK encodes and fuses the QA context and HKG using LM, and dynamically removes irrelevant KG entities based on the attention weights of LM for the second-stage pruning. Finally, DHLK introduces KRL to optimize the knowledge representation and perform answer reasoning on the HKG by RMSA.We evaluate DHLK at CommonsenseQA and OpenBookQA, and show its improvement on existing LM and LM+KG methods.

2021

pdf
Integrating Semantic Scenario and Word Relations for Abstractive Sentence Summarization
Yong Guan | Shaoru Guo | Ru Li | Xiaoli Li | Hu Zhang
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Recently graph-based methods have been adopted for Abstractive Text Summarization. However, existing graph-based methods only consider either word relations or structure information, which neglect the correlation between them. To simultaneously capture the word relations and structure information from sentences, we propose a novel Dual Graph network for Abstractive Sentence Summarization. Specifically, we first construct semantic scenario graph and semantic word relation graph based on FrameNet, and subsequently learn their representations and design graph fusion method to enhance their correlation and obtain better semantic representation for summary generation. Experimental results show our model outperforms existing state-of-the-art methods on two popular benchmark datasets, i.e., Gigaword and DUC 2004.

pdf
A Knowledge-Guided Framework for Frame Identification
Xuefeng Su | Ru Li | Xiaoli Li | Jeff Z. Pan | Hu Zhang | Qinghua Chai | Xiaoqi Han
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Frame Identification (FI) is a fundamental and challenging task in frame semantic parsing. The task aims to find the exact frame evoked by a target word in a given sentence. It is generally regarded as a classification task in existing work, where frames are treated as discrete labels or represented using onehot embeddings. However, the valuable knowledge about frames is neglected. In this paper, we propose a Knowledge-Guided Frame Identification framework (KGFI) that integrates three types frame knowledge, including frame definitions, frame elements and frame-to-frame relations, to learn better frame representation, which guides the KGFI to jointly map target words and frames into the same embedding space and subsequently identify the best frame by calculating the dot-product similarity scores between the target word embedding and all of the frame embeddings. The extensive experimental results demonstrate KGFI significantly outperforms the state-of-the-art methods on two benchmark datasets.

2020

pdf bib
基于语料库的武侠与仙侠网络小说文体、词汇及主题对比分析(A Corpus-based Contrastive Analysis of Style, Vocabulary and Theme of Wuxia and Xianxia Internet Novels)
Sanle Zhang (张三乐) | Pengyuan Liu (刘鹏远) | Hu Zhang (张虎)
Proceedings of the 19th Chinese National Conference on Computational Linguistics

网络文学在我国发展迅猛,其数量和影响力呈现逐年上升的趋势,但目前尚无公开的较大规模网络文学作品语料库,鲜见基于语料库对网络文学具体类别作品的定量研究。本文初步建立了一个网络文学语料库,其中包括武侠和仙侠网络小说,使用文本计量、词频统计以及主题挖掘的方法对两类小说的文体风格、具体词汇使用和小说主题进行对比分析。通过比较,我们发现两类小说的文体风格大致相同,它们在词汇的使用和主题上既有共性又各具特色。从微观到宏观,从表面到内容,将定量统计和定性分析相结合,多角度、多层次的对武侠和仙侠网络小说进行比较。

2008

pdf
A Chinese Word Segmentation System Based on Cascade Model
Jianfeng Zhang | Jiaheng Zheng | Hu Zhang | Hongye Tan
Proceedings of the Sixth SIGHAN Workshop on Chinese Language Processing

pdf bib
A Study on Consistency Checking Method of Part-Of-Speech Tagging for Chinese Corpora
Hu Zhang | Jiaheng Zheng
International Journal of Computational Linguistics & Chinese Language Processing, Volume 13, Number 2, June 2008

2005

pdf bib
A Classification-based Algorithm for Consistency Check of Part-of-Speech Tagging for Chinese Corpora
Hu Zhang | Jia-heng Zheng | Ying Zhao
Companion Volume to the Proceedings of Conference including Posters/Demos and tutorial abstracts