2024
Unveiling Opinion Evolution via Prompting and Diffusion for Short Video Fake News Detection
Linlin Zong | Jiahui Zhou | Wenmin Lin | Xinyue Liu | Xianchao Zhang | Bo Xu
Findings of the Association for Computational Linguistics: ACL 2024
Short video fake news detection is crucial for combating the spread of misinformation. Current detection methods tend to aggregate features from individual modalities into multimodal features, overlooking the implicit opinions and the evolving nature of opinions across modalities. In this paper, we mine implicit opinions within short video news and promote the evolution of both explicit and implicit opinions across all modalities. Specifically, we design a prompt template to mine implicit opinions regarding the credibility of news from the textual component of videos. Additionally, we employ a diffusion model that encourages the interplay among diverse modal opinions, including those extracted through our implicit opinion prompts. Experimental results on a publicly available dataset for short video fake news detection demonstrate the superiority of our model over state-of-the-art methods.
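The paper's prompt template is not reproduced in the abstract; purely as an illustration, a credibility-opinion prompt over the textual components of a short video might look like the following sketch (all field names and wording are hypothetical):

```python
# Hypothetical sketch of an implicit-opinion prompt; the paper's actual
# template and diffusion component are not reproduced here.
OPINION_PROMPT = (
    "News transcript: {transcript}\n"
    "On-screen text: {ocr_text}\n"
    "Question: Based on the text above, does this news seem credible? "
    "Answer with an opinion:"
)

def build_opinion_prompt(transcript: str, ocr_text: str) -> str:
    """Fill the template with the textual components of a short video."""
    return OPINION_PROMPT.format(transcript=transcript, ocr_text=ocr_text)

print(build_opinion_prompt("A celebrity was reportedly seen...", "BREAKING"))
```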
Customizing Large Language Model Generation Style using Parameter-Efficient Finetuning
Xinyue Liu | Harshita Diddee | Daphne Ippolito
Proceedings of the 17th International Natural Language Generation Conference
One-size-fits-all large language models (LLMs) are increasingly being used to help people with their writing. However, the style these models are trained to write in may not suit all users or use cases. LLMs would be more useful as writing assistants if their idiolect could be customized to match each user. In this paper, we explore whether parameter-efficient finetuning (PEFT) with Low-Rank Adaptation can effectively guide the style of LLM generations. We use this method to customize LLaMA-2 to ten different authors and show that the generated text has lexical, syntactic, and surface alignment with the target author but struggles with content memorization. Our findings highlight the potential of PEFT to support efficient, user-level customization of LLMs.
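As a concrete reference point, the sketch below shows a minimal LoRA setup with the Hugging Face PEFT library, assuming access to the gated LLaMA-2 checkpoint; the rank, target modules, and other hyperparameters are illustrative choices, not the paper's reported settings:

```python
# Minimal LoRA fine-tuning setup with Hugging Face PEFT; hyperparameters
# here are illustrative, not necessarily those used in the paper.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "meta-llama/Llama-2-7b-hf"  # gated checkpoint; requires access
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

lora_config = LoraConfig(
    r=8,                                  # low-rank dimension
    lora_alpha=16,                        # scaling factor
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA adapters are trainable
# One adapter per target author can then be trained on that author's text.
```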
RENN: A Rule Embedding Enhanced Neural Network Framework for Temporal Knowledge Graph Completion
Linlin Zong | Zhenrong Xie | Chi Ma | Xinyue Liu | Xianchao Zhang | Bo Xu
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Temporal knowledge graph completion is a critical task within the knowledge graph domain. Existing approaches encompass deep neural network-based methods for temporal knowledge graph embedding and rule-based logical symbolic reasoning. However, the former may not adequately account for structural dependencies between relations. Conversely, the latter methods rely heavily on strict logical rule reasoning and lack robustness in the face of fuzzy or noisy data. In response to these challenges, we present RENN, a groundbreaking framework that enhances temporal knowledge graph completion through rule embedding. RENN employs a three-step approach. First, it utilizes temporal random walks to extract temporal logic rules. Then, it pre-trains by learning embeddings for each logical rule and its associated relations, thereby enhancing the likelihood of existing quadruples and logical rules. Finally, it incorporates the embeddings of logical rules into the deep neural network. Our methodology has been validated through experiments conducted on various temporal knowledge graph models and datasets, consistently demonstrating its effectiveness and potential in improving temporal knowledge graph completion.
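A minimal sketch of the first step, a temporal random walk that collects relation paths as candidate rule bodies, assuming quadruples of the form (head, relation, tail, timestamp); RENN's actual rule extraction and embedding pre-training are more involved:

```python
# Illustrative temporal random walk over (head, relation, tail, timestamp)
# quadruples; edges are followed in non-decreasing timestamp order.
import random
from collections import defaultdict

def temporal_random_walk(quads, start, length=3):
    """Return the relation path of one walk, usable as a candidate
    temporal rule body."""
    out_edges = defaultdict(list)
    for h, r, t, ts in quads:
        out_edges[h].append((r, t, ts))
    path, node, last_ts = [], start, float("-inf")
    for _ in range(length):
        nexts = [e for e in out_edges[node] if e[2] >= last_ts]
        if not nexts:
            break
        r, node, last_ts = random.choice(nexts)
        path.append(r)
    return path

quads = [("A", "meets", "B", 1), ("B", "calls", "C", 2), ("C", "visits", "A", 3)]
print(temporal_random_walk(quads, "A"))  # e.g. ['meets', 'calls', 'visits']
```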
Temporal Knowledge Graph Reasoning with Dynamic Hypergraph Embedding
Xinyue Liu | Jianan Zhang | Chi Ma | Wenxin Liang | Bo Xu | Linlin Zong
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Reasoning over the Temporal Knowledge Graph (TKG), which predicts facts in the future, has received much attention. Most previous works attempt to model temporal dynamics with knowledge graphs and graph convolution networks. However, these methods lack consideration of high-order interactions between objects in a TKG, which are an important factor in predicting future facts. To address this problem, we introduce dynamic hypergraph embedding for temporal knowledge graph reasoning. Specifically, we obtain high-order interactions by constructing hypergraphs based on temporal knowledge graphs at different timestamps. In addition, we integrate the differences caused by time into the hypergraph representation in order to fit the TKG. Then, we adapt dynamic meta-embedding for temporal hypergraph representation, which allows our model to choose the appropriate high-order interactions for downstream reasoning. Experimental results on public TKG datasets show that our method outperforms the baselines. Furthermore, our analysis demonstrates that the proposed method provides good interpretability for the predicted results.
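For intuition, the sketch below builds a per-timestamp incidence matrix from TKG quadruples; the grouping rule used here (one hyperedge per relation) is an illustrative assumption, not necessarily the paper's construction:

```python
# Sketch of a per-timestamp hypergraph incidence matrix from TKG quadruples.
import numpy as np
from collections import defaultdict

def incidence_matrix(quads, timestamp):
    """Entities are nodes; all entities linked by the same relation at the
    given timestamp form one hyperedge (an illustrative grouping rule)."""
    edges = defaultdict(set)
    for h, r, t, ts in quads:
        if ts == timestamp:
            edges[r].update((h, t))
    nodes = sorted({n for members in edges.values() for n in members})
    H = np.zeros((len(nodes), len(edges)), dtype=np.float32)
    for j, members in enumerate(edges.values()):
        for n in members:
            H[nodes.index(n), j] = 1.0
    return nodes, list(edges), H

quads = [("A", "r1", "B", 0), ("B", "r1", "C", 0), ("A", "r2", "C", 0)]
print(incidence_matrix(quads, 0)[2])
```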
2023
T2IAT: Measuring Valence and Stereotypical Biases in Text-to-Image Generation
Jialu Wang | Xinyue Liu | Zonglin Di | Yang Liu | Xin Wang
Findings of the Association for Computational Linguistics: ACL 2023
*Warning: This paper contains content that may be toxic, harmful, or offensive.* In the last few years, text-to-image generative models have gained remarkable success in generating images of unprecedented quality, accompanied by breakthroughs in inference speed. Despite this rapid progress, human biases that manifest in the training examples, particularly common stereotypical biases such as gender and skin tone, are still found in these generative models. In this work, we seek to measure more complex human biases that exist in the task of text-to-image generation. Inspired by the well-known Implicit Association Test (IAT) from social psychology, we propose a novel Text-to-Image Association Test (T2IAT) framework that quantifies the implicit stereotypes between concepts and valence, and those in the images. We replicate the previously documented bias tests on generative models, including morally neutral tests on flowers and insects as well as demographic stereotypical tests on diverse social attributes. The results of these experiments demonstrate the presence of complex stereotypical behaviors in image generations.
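An IAT-style association score over generated images can be sketched with CLIP image-text similarities, as below; T2IAT's actual protocol (image generation procedure, differential statistics, significance testing) differs in detail:

```python
# Sketch of an IAT-style association score using CLIP similarities; not the
# paper's exact measurement procedure.
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def association(images, attr_texts):
    """Mean image-text cosine similarity between a set of generated images
    (e.g. for the concept 'flowers') and attribute words (e.g. 'pleasant')."""
    inputs = processor(text=attr_texts, images=images,
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        out = model(**inputs)
    img = out.image_embeds / out.image_embeds.norm(dim=-1, keepdim=True)
    txt = out.text_embeds / out.text_embeds.norm(dim=-1, keepdim=True)
    return (img @ txt.T).mean().item()

# Effect of interest, in WEAT-like form:
#   association(flower_images, pleasant_words)
# - association(insect_images, pleasant_words)
```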
2022
Multimodal Context Carryover
Prashan Wanigasekara | Nalin Gupta | Fan Yang | Emre Barut | Zeynab Raeesy | Kechen Qin | Stephen Rawls | Xinyue Liu | Chengwei Su | Spurthi Sandiri
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: Industry Track
Multi-modality support has become integral to creating a seamless user experience with modern voice assistants equipped with smart displays. Users refer to images, video thumbnails, or the accompanying text descriptions on the screen through voice communication with AI-powered devices. This raises the need to either augment existing commercial voice-only dialogue systems with state-of-the-art multimodal components, or to introduce entirely new architectures, where the latter can lead to costly system revamps. To support the emerging visual navigation and visual product selection use cases, we propose to augment commercially deployed voice-only dialogue systems with additional multimodal components. In this work, we present a novel yet pragmatic approach to expand an existing dialogue-based context carryover system (Chen et al., 2019a) in a voice assistant with state-of-the-art multimodal components to facilitate quick delivery of visual modality support with minimal changes. We demonstrate a 35% accuracy improvement over the existing system on an in-house multimodal visual navigation dataset.
2020
Multi-task Learning of Spoken Language Understanding by Integrating N-Best Hypotheses with Hierarchical Attention
Mingda Li | Xinyue Liu | Weitong Ruan | Luca Soldaini | Wael Hamza | Chengwei Su
Proceedings of the 28th International Conference on Computational Linguistics: Industry Track
Currently, in spoken language understanding (SLU) systems, the automatic speech recognition (ASR) module produces multiple interpretations (or hypotheses) for the input audio signal, and the natural language understanding (NLU) module takes the one with the highest confidence score for domain or intent classification. However, the interpretations can be noisy, and relying solely on one interpretation can cause information loss. To address this problem, many research works attempt to rerank the interpretations for a better choice, while some recent works achieve better performance by integrating all the hypotheses during prediction. In this paper, we follow the approach of integrating hypotheses but strengthen training by involving more tasks, some of which may not be existing NLU tasks but are relevant, via multi-task learning or transfer learning. Moreover, we propose the Hierarchical Attention Mechanism (HAM) to further improve performance with acoustic-model features such as confidence scores, which are ignored in current hypothesis-integration models. The experimental results show that, compared to standard estimation with one hypothesis, multi-task learning with HAM improves domain and intent classification by a relative 19% and 37% respectively, which is much higher than the improvements from current integration or reranking methods. To illustrate the cause of the improvements brought by our model, we decode the hidden representations of some utterance examples and compare the generated texts with the hypotheses and transcripts. The comparison shows that our model can recover the transcription by integrating the fragmented information among hypotheses, identify the frequent error patterns of the ASR module, and even rewrite the query for better understanding, which reveals how multi-task learning broadcasts knowledge across tasks.
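A single confidence-aware attention layer over n-best hypothesis encodings might look like the following sketch; the paper's HAM stacks word- and hypothesis-level attention, which this one layer only approximates:

```python
# Sketch of confidence-aware attention pooling over n-best hypothesis
# encodings; an approximation, not the paper's full hierarchical mechanism.
import torch
import torch.nn as nn

class HypothesisAttention(nn.Module):
    def __init__(self, hidden_dim):
        super().__init__()
        self.score = nn.Linear(hidden_dim + 1, 1)  # +1 for ASR confidence

    def forward(self, hyp_encodings, confidences):
        # hyp_encodings: (batch, n_best, hidden); confidences: (batch, n_best)
        feats = torch.cat([hyp_encodings, confidences.unsqueeze(-1)], dim=-1)
        weights = torch.softmax(self.score(feats).squeeze(-1), dim=-1)
        return (weights.unsqueeze(-1) * hyp_encodings).sum(dim=1)

attn = HypothesisAttention(hidden_dim=64)
pooled = attn(torch.randn(2, 5, 64), torch.rand(2, 5))
print(pooled.shape)  # torch.Size([2, 64])
```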
SeqVAT: Virtual Adversarial Training for Semi-Supervised Sequence Labeling
Luoxin Chen | Weitong Ruan | Xinyue Liu | Jianhua Lu
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Virtual adversarial training (VAT) is a powerful technique for improving model robustness in both supervised and semi-supervised settings. It is effective and can be easily adopted for many image and text classification tasks. However, its benefits for sequence labeling tasks such as named entity recognition (NER) have not been shown to be as significant, mostly because previous approaches could not combine VAT with the conditional random field (CRF). A CRF can significantly boost the accuracy of sequence models by putting constraints on label transitions, which makes it an essential component in most state-of-the-art sequence labeling architectures. In this paper, we propose SeqVAT, a method that naturally applies VAT to sequence labeling models with a CRF. Empirical studies show that SeqVAT not only significantly improves sequence labeling performance over baselines in supervised settings, but also outperforms state-of-the-art approaches in semi-supervised settings.
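For reference, a generic virtual adversarial loss on input embeddings is sketched below, assuming `model` is a callable that maps embeddings to logits; SeqVAT's contribution, combining this loss with a CRF for sequence labeling, is not shown:

```python
# Generic VAT loss sketch: KL divergence between predictions on clean and
# adversarially perturbed embeddings. `model` maps embeddings -> logits
# (an assumption of this sketch).
import torch
import torch.nn.functional as F

def vat_loss(model, embeddings, eps=1.0, xi=1e-6):
    with torch.no_grad():
        clean_logp = F.log_softmax(model(embeddings), dim=-1)
    # Random direction, scaled small, then refined via the KL gradient.
    d = torch.randn_like(embeddings)
    d = xi * d / d.norm(dim=-1, keepdim=True)
    d.requires_grad_()
    adv_logp = F.log_softmax(model(embeddings + d), dim=-1)
    kl = F.kl_div(adv_logp, clean_logp.exp(), reduction="batchmean")
    grad, = torch.autograd.grad(kl, d)
    r_adv = eps * grad / grad.norm(dim=-1, keepdim=True)
    adv_logp = F.log_softmax(model(embeddings + r_adv.detach()), dim=-1)
    return F.kl_div(adv_logp, clean_logp.exp(), reduction="batchmean")
```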
Enhance Robustness of Sequence Labelling with Masked Adversarial Training
Luoxin Chen | Xinyue Liu | Weitong Ruan | Jianhua Lu
Findings of the Association for Computational Linguistics: EMNLP 2020
Adversarial training (AT) has shown strong regularization effects on deep learning algorithms by introducing small input perturbations to improve model robustness. In language tasks, adversarial training brings word-level robustness by adding input noise, which is beneficial for text classification. However, it lacks sufficient contextual information enhancement and is thus less useful for sequence labelling tasks such as chunking and named entity recognition (NER). To address this limitation, we propose masked adversarial training (MAT) to improve robustness through contextual information in sequence labelling. MAT masks or replaces some words in the sentence when computing the adversarial loss from perturbed inputs and consequently enhances model robustness using more context-level information. In our experiments, our method shows significant improvements in the accuracy and robustness of sequence labelling. By further incorporating ELMo embeddings, our model achieves better or comparable results to the state of the art on the CoNLL 2000 and 2003 benchmarks with far fewer parameters.
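The masking step might be sketched as follows, with the mask id and rate as illustrative choices; the adversarial perturbation is then computed on the embeddings of the masked sequence, as in standard adversarial training:

```python
# Sketch of the masking step in masked adversarial training: some tokens are
# replaced by a mask id before the adversarial loss is computed, forcing the
# model to rely on context. mask_id and rate are illustrative choices.
import torch

def mask_tokens(token_ids, mask_id, rate=0.15):
    """Randomly replace a fraction of tokens with the mask token."""
    mask = torch.rand_like(token_ids, dtype=torch.float) < rate
    return torch.where(mask, torch.full_like(token_ids, mask_id), token_ids)

ids = torch.tensor([[12, 48, 7, 301, 9]])
print(mask_tokens(ids, mask_id=0))
# The adversarial perturbation is then applied to the embeddings of this
# masked sequence when computing the adversarial loss.
```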