Yangkun Wang

2024

Towards Few-shot Entity Recognition in Document Images: A Graph Neural Network Approach Robust to Image Manipulation
Prashant Krishnan | Zilong Wang | Yangkun Wang | Jingbo Shang
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Recent advances in incorporating layout information, typically bounding box coordinates, into pre-trained language models have achieved strong performance in entity recognition from document images. Coordinates make it easy to model the position of each token, but they are sensitive to manipulations of document images (e.g., shifting, rotation, or scaling), which are common in real scenarios. This limitation becomes even worse in few-shot settings, where training data is limited. In this paper, we propose a novel framework, LAGER, which leverages the topological adjacency relationships among tokens by learning their relative layout information with graph neural networks. Specifically, we treat the tokens in a document as nodes and construct the edges based on topological heuristics. Such adjacency graphs are invariant to affine transformations, making them robust to common image manipulations. We incorporate these graphs into the pre-trained language model by adding graph neural network layers on top of the language model embeddings. Extensive experiments on two benchmark datasets show that LAGER significantly outperforms strong baselines under different few-shot settings and also demonstrates better robustness to manipulations.
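The idea of a layout graph that survives image manipulation can be illustrated with a toy sketch. The abstract does not specify LAGER's edge heuristics, so the k-nearest-neighbor rule below is an assumption chosen for illustration, and the names `knn_edges` and `transform` are hypothetical. Note that this particular heuristic is only guaranteed to be stable under shifting, rotation, and uniform scaling, not under every affine transformation.

```python
import math

def knn_edges(centers, k=2):
    """Directed edges (i, j) where token j is one of token i's k nearest neighbors.

    Assumption: a nearest-neighbor heuristic stands in for the paper's
    topological heuristics, which the abstract does not describe.
    """
    edges = set()
    for i, (xi, yi) in enumerate(centers):
        dists = sorted(
            (math.hypot(xi - xj, yi - yj), j)
            for j, (xj, yj) in enumerate(centers)
            if j != i
        )
        edges.update((i, j) for _, j in dists[:k])
    return edges

def transform(centers, angle_deg=0.0, scale=1.0, shift=(0.0, 0.0)):
    """Rotate, uniformly scale, then shift every token center."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return [
        (scale * (c * x - s * y) + shift[0], scale * (s * x + c * y) + shift[1])
        for x, y in centers
    ]

# Hypothetical bounding-box centers for five tokens on a page.
centers = [(10, 10), (60, 12), (130, 9), (12, 50), (75, 48)]

original = knn_edges(centers)
manipulated = knn_edges(transform(centers, angle_deg=15, scale=1.3, shift=(40, -25)))

# The raw coordinates differ, but the adjacency graph is unchanged.
print(original == manipulated)
```

The raw coordinates of every token change under the manipulation, while the edge set stays the same, which is the property that makes a graph-based layout signal less brittle than absolute bounding boxes.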

2023

Despite the remarkable advances that large language models have achieved in chatbots, maintaining a non-toxic user-AI interactive environment has become increasingly critical. However, previous efforts in toxicity detection have mostly been based on benchmarks derived from social media content, leaving the unique challenges inherent to real-world user-AI interactions insufficiently explored. In this work, we introduce ToxicChat, a novel benchmark constructed from real user queries to an open-source chatbot. This benchmark contains rich, nuanced phenomena that can be tricky for current toxicity detection models to identify, revealing a significant domain difference compared to social media content. Our systematic evaluation of models trained on existing toxicity datasets shows their shortcomings when applied to the unique domain of ToxicChat. Our work illuminates the potentially overlooked challenges of toxicity detection in real-world user-AI conversations. In the future, ToxicChat can be a valuable resource to drive further advancements toward building a safe and healthy environment for user-AI interactions.

Co-authors

Yujia Wang 1

Prashant Krishnan 1

Zilong Wang 1
