Kuangrong Hao




2024

Federated Document-Level Biomedical Relation Extraction with Localized Context Contrast
Yan Xiao | Yaochu Jin | Kuangrong Hao
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Existing studies on relation extraction focus on the document level in a centralized training environment, requiring the collection of documents from various sources. However, this raises concerns about privacy protection, especially in sensitive domains such as finance and healthcare. For the first time, this work extends document-level relation extraction to a federated environment. The proposed federated framework, called FedLCC, is tailored for biomedical relation extraction and enables collaborative training without sharing raw medical texts. To fully exploit the models of all participating clients and improve local training on individual clients, we propose the novel concept of localized context contrast, built on contrastive learning. By comparing and rectifying the similarity of localized contexts in documents between clients and the central server, the global model can better represent the documents on individual clients. Because there is no widely accepted measure of non-IID text data, we introduce a novel non-IID scenario based on graph structural entropy. Experimental results on three document-level biomedical relation extraction datasets demonstrate the effectiveness of our method. Our code is available at https://github.com/xxxxyan/FedLCC.
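To make the core idea more concrete, the following is a minimal, hypothetical PyTorch sketch of an InfoNCE-style localized context contrast: on each client, the localized-context embedding of an entity pair under the local model is pulled toward the (frozen) global model's embedding of the same pair and pushed away from those of other pairs. All names, shapes, and the temperature value are illustrative assumptions, not the authors' released code (see the linked repository for that).

# Hypothetical sketch of the localized context contrast described above.
import torch
import torch.nn.functional as F

def localized_context_contrast(local_ctx: torch.Tensor,
                               global_ctx: torch.Tensor,
                               temperature: float = 0.1) -> torch.Tensor:
    # local_ctx, global_ctx: (num_pairs, dim) localized-context embeddings of
    # the same entity pairs under the client's local model and the frozen
    # global model received from the server.
    local_ctx = F.normalize(local_ctx, dim=-1)
    global_ctx = F.normalize(global_ctx, dim=-1)
    logits = local_ctx @ global_ctx.t() / temperature  # cosine similarities
    # Row i should match column i: the same entity pair seen by both models.
    targets = torch.arange(local_ctx.size(0), device=local_ctx.device)
    return F.cross_entropy(logits, targets)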

IDC: Boost Text-to-image Retrieval via Indirect and Direct Connections
Guowei Ge | Kuangrong Hao | Lingguang Hao
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

The Dual Encoders (DE) framework maps image and text inputs into a coordinated representation space and computes their similarity directly. The Cross Attention (CA) framework, by contrast, performs cross-modal interaction after the image and text features have been embedded, and then outputs a similarity score. For scenarios with bulk query requests or large query sets, the latter is more accurate, but the former is faster. This work therefore improves the retrieval accuracy of the DE framework by borrowing the strengths of the CA framework. Drawing inspiration from image captioning, we introduce a text decoder during the training stage to simulate the cross-modal interaction of the CA framework. The text decoder is eventually discarded, so our model remains a pure DE framework at inference. Finally, to ensure training stability and prevent overfitting, we modify Self-Distillation from Last Mini-Batch and apply it to the retrieval task. Extensive experiments on the MSCOCO and Flickr30K datasets validate the effectiveness of the proposed methods. Notably, our model achieves competitive results compared to state-of-the-art approaches on the Flickr30K dataset.
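As a rough illustration of the training scheme described above, the sketch below pairs a standard symmetric dual-encoder contrastive loss with an auxiliary text decoder that reconstructs the caption from image features during training and is dropped at inference, so retrieval remains a single matrix product. The module names, the temperature, and the equal loss weighting are assumptions for illustration; the paper's actual architecture and its modified Self-Distillation from Last Mini-Batch are not reproduced here.

# Hypothetical sketch: dual encoders plus a training-only text decoder.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualEncoderWithDecoder(nn.Module):
    # image_encoder/text_encoder map inputs to (B, d) embeddings; text_decoder
    # maps (image features, caption tokens) to per-token vocabulary logits.
    # All three are placeholder modules supplied by the caller.
    def __init__(self, image_encoder, text_encoder, text_decoder):
        super().__init__()
        self.image_encoder = image_encoder
        self.text_encoder = text_encoder
        self.text_decoder = text_decoder  # used during training only

    def forward(self, images, captions, caption_tokens):
        img = F.normalize(self.image_encoder(images), dim=-1)   # (B, d)
        txt = F.normalize(self.text_encoder(captions), dim=-1)  # (B, d)
        # Standard symmetric InfoNCE loss over the in-batch similarity matrix.
        logits = img @ txt.t() / 0.07
        targets = torch.arange(img.size(0), device=img.device)
        contrastive = 0.5 * (F.cross_entropy(logits, targets)
                             + F.cross_entropy(logits.t(), targets))
        # Auxiliary captioning loss: the decoder must reconstruct the caption
        # from image features, forcing cross-modal interaction at training time.
        cap_logits = self.text_decoder(img, caption_tokens)     # (B, T, V)
        captioning = F.cross_entropy(
            cap_logits.reshape(-1, cap_logits.size(-1)),
            caption_tokens.reshape(-1))
        return contrastive + captioning

    @torch.no_grad()
    def retrieve(self, images, captions):
        # Inference is pure DE: two encoders and one matrix product.
        img = F.normalize(self.image_encoder(images), dim=-1)
        txt = F.normalize(self.text_encoder(captions), dim=-1)
        return img @ txt.t()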