Kai fu

Also published as: Kai Fu

2026

Multimodal Named Entity Recognition (MNER) relies on visual context to resolve textual ambiguities. To mitigate data scarcity, Data Augmentation (DA) has become standard practice; however, existing methods predominantly adopt a one-size-fits-all, random-perturbation paradigm that ignores the internal state of the target model. In this paper, we first conduct a quantitative analysis, revealing that a significant portion of errors (over 30%) are model-specific, stemming from the unique biases of different architectures. To address this, we propose Memory-Guided Hard Data Augmentation, a framework designed to systematically repair these specific defects. First, we employ K-fold cross-validation to identify model-specific Hard Data. Second, we construct a Memory Tree and use Large Language Models (LLMs) with a clustering mechanism to induce macro-level error patterns from micro-level failures. This facilitates a paradigm shift from stateless, instance-driven augmentation to a logical, pattern-driven approach. Finally, we introduce an iterative augmentation mechanism that triggers recursive generation for stubborn instances that fail initial quality filters. Extensive experiments on the Twitter-2015 and Twitter-2017 benchmarks demonstrate that our framework consistently yields significant performance gains across various MNER backbones.

2020


This paper introduces our system for the NLPTEA2020 shared task on CGED. The system can detect, locate, identify, and correct grammatical errors in Chinese writing. It consists of three components: GED, GEC, and post-processing. GED is an ensemble of multiple BERT-based sequence labeling models for handling GED tasks. GEC performs error correction: we exploit a collection of heterogeneous models, including Seq2Seq, GECToR, and a candidate generation module, to obtain correction candidates. Finally, in the post-processing stage, results from GED and GEC are fused to form the final outputs. We tune our models to favor precision, which we believe is more crucial in practice. As a result, among the six tracks in the shared task, our system performs well in the correction tracks: measured in F1 score, we rank first in the TOP3 correction track and third in the TOP1 correction track, with the highest precision in both. We place among the top 4 to 6 in the other tracks, except for FPR, where we rank 12th. Our system also achieves the highest precision among the top 10 submissions in the IDENTIFICATION and POSITION tracks.

Co-authors

Bo Xu 1

Venues

Findings 1
NLP-TEA 1
