Zeyu Zhang

Papers on this page may belong to the following people: Zeyu Zhang, Zeyu Zhang, Zeyu Zhang, Zeyu Zhang

2025

pdf bib abs
Retrieving Support to Rank Answers in Open-Domain Question Answering
Zeyu Zhang | Alessandro Moschitti | Thuy Vu
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

We introduce a novel Question Answering (QA) architecture that enhances answer selection by retrieving targeted supporting evidence. Unlike traditional methods, which retrieve documents or passages relevant only to a query q, our approach retrieves content relevant to the combined pair (q, a), explicitly emphasizing the supporting relation between the query and a candidate answer a. By prioritizing this relational context, our model effectively identifies paragraphs that directly substantiate the correctness of a with respect to q, leading to more accurate answer verification than standard retrieval systems. Our neural retrieval method also scales efficiently to collections containing hundreds of millions of paragraphs. Moreover, this approach can be used by large language models (LLMs) to retrieve explanatory paragraphs that ground their reasoning, enabling them to tackle more complex QA tasks with greater reliability and interpretability.

Trending topics have become a significant part of modern social media, attracting users to participate in discussions of breaking events. However, they also bring in a new channel for poisoning attacks, resulting in negative impacts on society. Therefore, it is urgent to study this critical problem and develop effective strategies for defense. In this paper, we propose TrendSim, an LLM-based multi-agent system to simulate trending topics in social media under poisoning attacks. Specifically, we create a simulation environment for trending topics that incorporates a time-aware interaction mechanism, centralized message dissemination, and an interactive system. Moreover, we develop LLM-based humanoid agents to simulate users in social media, and propose prototype-based attackers to replicate poisoning attacks. Besides, we evaluate TrendSim from multiple aspects to validate its effectiveness. Based on TrendSim, we conduct simulation experiments to study four critical problems about poisoning attacks on trending topics.

2024

pdf bib abs
Granular Analysis of Social Media Users’ Truthfulness Stances Toward Climate Change Factual Claims
Haiqi Zhang | Zhengyuan Zhu | Zeyu Zhang | Jacob Devasier | Chengkai Li
Proceedings of the 1st Workshop on Natural Language Processing Meets Climate Change (ClimateNLP 2024)

Climate change poses an urgent global problem that requires efficient data analysis mechanisms to provide insights into climate-related discussions on social media platforms. This paper presents a framework aimed at understanding social media users’ perceptions of various climate change topics and uncovering the insights behind these perceptions. Our framework employs large language model to develop a taxonomy of factual claims related to climate change and build a classification model that detects the truthfulness stance of tweets toward the factual claims. The findings reveal two key conclusions: (1) The public tends to believe the claims are true, regardless of the actual claim veracity; (2) The public shows a lack of discernment between facts and misinformation across different topics, particularly in areas related to politics, economy, and environment.

pdf bib abs
Improving Toponym Resolution by Predicting Attributes to Constrain Geographical Ontology Entries
Zeyu Zhang | Egoitz Laparra | Steven Bethard
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 2: Short Papers)

Geocoding is the task of converting location mentions in text into structured geospatial data.We propose a new prompt-based paradigm for geocoding, where the machine learning algorithm encodes only the location mention and its context.We design a transformer network for predicting the country, state, and feature class of a location mention, and a deterministic algorithm that leverages the country, state, and feature class predictions as constraints in a search for compatible entries in the ontology.Our architecture, GeoPLACE, achieves new state-of-the-art performance on multiple datasets.Code and models are available at https://github.com/clulab/geonorm.

2023

pdf bib abs
Hallucination Mitigation in Natural Language Generation from Large-Scale Open-Domain Knowledge Graphs
Xiao Shi | Zhengyuan Zhu | Zeyu Zhang | Chengkai Li
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

In generating natural language descriptions for knowledge graph triples, prior works used either small-scale, human-annotated datasets or datasets with limited variety of graph shapes, e.g., those having mostly star graphs. Graph-to-text models trained and evaluated on such datasets are largely not assessed for more realistic large-scale, open-domain settings. We introduce a new dataset, GraphNarrative, to fill this gap. Fine-tuning transformer-based pre-trained language models has achieved state-of-the-art performance among graph-to-text models. However, this method suffers from information hallucination—the generated text may contain fabricated facts not present in input graphs. We propose a novel approach that, given a graph-sentence pair in GraphNarrative, trims the sentence to eliminate portions that are not present in the corresponding graph, by utilizing the sentence’s dependency parse tree. Our experiment results verify this approach using models trained on GraphNarrative and existing datasets. The dataset, source code, and trained models are released at https://github.com/idirlab/graphnarrator.

pdf bib abs
Double Retrieval and Ranking for Accurate Question Answering
Zeyu Zhang | Thuy Vu | Alessandro Moschitti
Findings of the Association for Computational Linguistics: EACL 2023

Recent work has shown that an answer verification step introduced in Transformer-based answer selection models can significantly improve the state of the art in Question Answering. This step is performed by aggregating the embeddings of top k answer candidates to support the verification of a target answer. Although the approach is intuitive and sound, it still shows two limitations: (i) the supporting candidates are ranked only according to the relevancy with the question and not with the answer, and (ii) the support provided by the other answer candidates is suboptimal as these are retrieved independently of the target answer. In this paper, we address both drawbacks by proposing (i) a double reranking model, which, for each target answer, selects the best support; and (ii) a second neural retrieval stage designed to encode question and answer pair as the query, which finds more specific verification information. The results on well-known datasets for Answer Sentence Selection show significant improvement over the state of the art.

pdf bib abs
Improving Toponym Resolution with Better Candidate Generation, Transformer-based Reranking, and Two-Stage Resolution
Zeyu Zhang | Steven Bethard
Proceedings of the 12th Joint Conference on Lexical and Computational Semantics (*SEM 2023)

Geocoding is the task of converting location mentions in text into structured data that encodes the geospatial semantics. We propose a new architecture for geocoding, GeoNorm. GeoNorm first uses information retrieval techniques to generate a list of candidate entries from the geospatial ontology. Then it reranks the candidate entries using a transformer-based neural network that incorporates information from the ontology such as the entry’s population. This generate-and-rerank process is applied twice: first to resolve the less ambiguous countries, states, and counties, and second to resolve the remaining location mentions, using the identified countries, states, and counties as context. Our proposed toponym resolution framework achieves state-of-the-art performance on multiple datasets. Code and models are available at \url{https://github.com/clulab/geonorm}.

2022

An existing domain taxonomy for normalizing content is often assumed when discussing approaches to information extraction, yet often in real-world scenarios there is none. When one does exist, as the information needs shift, it must be continually extended. This is a slow and tedious task, and one which does not scale well. Here we propose an interactive tool that allows a taxonomy to be built or extended rapidly and with a human in the loop to control precision. We apply insights from text summarization and information extraction to reduce the search space dramatically, then leverage modern pretrained language models to perform contextualized clustering of the remaining concepts to yield candidate nodes for the user to review. We show this allows a user to consider as many as 200 taxonomy concept candidates an hour, to quickly build or extend a taxonomy to better fit information needs.

2021

pdf bib abs
Joint Models for Answer Verification in Question Answering Systems
Zeyu Zhang | Thuy Vu | Alessandro Moschitti
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

This paper studies joint models for selecting correct answer sentences among the top k provided by answer sentence selection (AS2) modules, which are core components of retrieval-based Question Answering (QA) systems. Our work shows that a critical step to effectively exploiting an answer set regards modeling the interrelated information between pair of answers. For this purpose, we build a three-way multi-classifier, which decides if an answer supports, refutes, or is neutral with respect to another one. More specifically, our neural architecture integrates a state-of-the-art AS2 module with the multi-classifier, and a joint layer connecting all components. We tested our models on WikiQA, TREC-QA, and a real-world dataset. The results show that our models obtain the new state of the art in AS2.

This paper describes the current milestones achieved in our ongoing project that aims to understand the surveillance of, impact of and intervention on COVID-19 misinfodemic on Twitter. Specifically, it introduces a public dashboard which, in addition to displaying case counts in an interactive map and a navigational panel, also provides some unique features not found in other places. Particularly, the dashboard uses a curated catalog of COVID-19 related facts and debunks of misinformation, and it displays the most prevalent information from the catalog among Twitter users in user-selected U.S. geographic regions. The paper explains how to use BERT models to match tweets with the facts and misinformation and to detect their stance towards such information. The paper also discusses the results of preliminary experiments on analyzing the spatio-temporal spread of misinformation.

2020

pdf bib abs
A Generate-and-Rank Framework with Semantic Type Regularization for Biomedical Concept Normalization
Dongfang Xu | Zeyu Zhang | Steven Bethard
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Concept normalization, the task of linking textual mentions of concepts to concepts in an ontology, is challenging because ontologies are large. In most cases, annotated datasets cover only a small sample of the concepts, yet concept normalizers are expected to predict all concepts in the ontology. In this paper, we propose an architecture consisting of a candidate generator and a list-wise ranker based on BERT. The ranker considers pairings of concept mentions and candidate concepts, allowing it to make predictions for any concept, not just those seen during training. We further enhance this list-wise approach with a semantic type regularizer that allows the model to incorporate semantic type information from the ontology during training. Our proposed concept normalization framework achieves state-of-the-art performance on multiple datasets.

pdf bib abs
ScienceExamCER: A High-Density Fine-Grained Science-Domain Corpus for Common Entity Recognition
Hannah Smith | Zeyu Zhang | John Culnan | Peter Jansen
Proceedings of the Twelfth Language Resources and Evaluation Conference

Named entity recognition identifies common classes of entities in text, but these entity labels are generally sparse, limiting utility to downstream tasks. In this work we present ScienceExamCER, a densely-labeled semantic classification corpus of 133k mentions in the science exam domain where nearly all (96%) of content words have been annotated with one or more fine-grained semantic class labels including taxonomic groups, meronym groups, verb/action groups, properties and values, and synonyms. Semantic class labels are drawn from a manually-constructed fine-grained typology of 601 classes generated through a data-driven analysis of 4,239 science exam questions. We show an off-the-shelf BERT-based named entity recognition model modified for multi-label classification achieves an accuracy of 0.85 F1 on this task, suggesting strong utility for downstream tasks in science domain question answering requiring densely-labeled semantic classification.

Co-authors

Venues

ws1