2023
SRCB at SemEval-2023 Task 2: A System of Complex Named Entity Recognition with External Knowledge
Yuming Zhang
|
Hongyu Li
|
Yongwei Zhang
|
Shanshan Jiang
|
Bin Dong
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
The MultiCoNER II shared task aims at detecting semantically ambiguous and complex named entities in short, low-context settings across multiple languages. The lack of context makes recognizing ambiguous named entities challenging. To alleviate this issue, our team SRCB proposes an external-knowledge-based system that uses three types of external knowledge retrieved in different ways. Given an original text, our system retrieves the possible labels and descriptions for each potential entity detected by a mention detection model, and also retrieves a related document from Wikipedia as extra context for each original text. We concatenate the original text with the external knowledge as the input to the NER models. The informative contextual representations with external knowledge significantly improve NER performance in both the Chinese and English tracks. Our system won 3rd place in the Chinese track and 6th place in the English track.
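The abstract above describes concatenating retrieved external knowledge with the original text before feeding it to the NER models. Below is a minimal sketch of that input-construction step; the retrieval helpers, separator token, and field ordering are illustrative assumptions, not the authors' exact implementation.

```python
# Minimal sketch: building an external-knowledge-augmented input for a
# transformer NER model. The retrieval helpers below are hypothetical
# placeholders; a real system would query a knowledge base and Wikipedia.

from typing import List


def retrieve_candidate_labels(mention: str) -> List[str]:
    """Hypothetical lookup of possible entity labels for a detected mention."""
    return ["Facility", "OtherLOC"]  # placeholder result


def retrieve_description(mention: str) -> str:
    """Hypothetical lookup of a short description for a detected mention."""
    return f"{mention}: a placeholder description from a knowledge base."


def retrieve_related_document(text: str) -> str:
    """Hypothetical retrieval of a related Wikipedia passage for the input text."""
    return "A placeholder Wikipedia passage related to the input sentence."


def build_augmented_input(text: str, mentions: List[str], sep: str = " [SEP] ") -> str:
    """Concatenate the original text with the retrieved external knowledge."""
    parts = [text]
    for m in mentions:
        labels = "/".join(retrieve_candidate_labels(m))
        parts.append(f"{m} ({labels}) - {retrieve_description(m)}")
    parts.append(retrieve_related_document(text))
    return sep.join(parts)


if __name__ == "__main__":
    sentence = "blue note is closing next month"
    detected = ["blue note"]  # output of a mention detection model
    print(build_augmented_input(sentence, detected))
```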
2019
Gazetteer-Enhanced Attentive Neural Networks for Named Entity Recognition
Hongyu Lin
|
Yaojie Lu
|
Xianpei Han
|
Le Sun
|
Bin Dong
|
Shanshan Jiang
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
Current region-based NER models rely only on fully-annotated training data to learn an effective region encoder, which often faces a training-data bottleneck. To alleviate this problem, this paper proposes Gazetteer-Enhanced Attentive Neural Networks, which enhance region-based NER by learning name knowledge of entity mentions from easily obtainable gazetteers rather than only from fully-annotated data. Specifically, we first propose an attentive neural network (ANN), which explicitly models the mention-context association and is therefore convenient for integrating externally learned knowledge. Then we design an auxiliary gazetteer network, which can effectively encode the name regularity of mentions using only gazetteers. Finally, the learned gazetteer network is incorporated into the ANN for better NER. Experiments show that our ANN achieves state-of-the-art performance on the ACE2005 named entity recognition benchmark. Moreover, incorporating the gazetteer network further improves performance and significantly reduces the amount of training data required.
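As a rough illustration of the architecture described above, the sketch below pairs a mention-context attention scorer with a gazetteer-only name encoder and fuses their evidence; the layer sizes, GRU encoder, and fusion by logit addition are assumptions for illustration, not the paper's exact design.

```python
# Rough sketch (assuming PyTorch) of combining a mention-context attentive
# scorer with a gazetteer name encoder. Dimensions and the fusion scheme are
# illustrative assumptions, not the paper's exact architecture.

import torch
import torch.nn as nn


class AttentiveRegionScorer(nn.Module):
    """Scores a candidate region by attending from the region to its context."""

    def __init__(self, hidden: int = 256, num_labels: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(hidden, num_heads=4, batch_first=True)
        self.classifier = nn.Linear(hidden, num_labels)

    def forward(self, region: torch.Tensor, context: torch.Tensor) -> torch.Tensor:
        # region: (batch, 1, hidden); context: (batch, seq_len, hidden)
        attended, _ = self.attn(region, context, context)
        return self.classifier(attended.squeeze(1))


class GazetteerNameEncoder(nn.Module):
    """Encodes only the mention tokens, so it can be trained from gazetteers alone."""

    def __init__(self, hidden: int = 256, num_labels: int = 8):
        super().__init__()
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)
        self.classifier = nn.Linear(hidden, num_labels)

    def forward(self, mention_tokens: torch.Tensor) -> torch.Tensor:
        # mention_tokens: (batch, mention_len, hidden)
        _, h = self.encoder(mention_tokens)
        return self.classifier(h.squeeze(0))


def combined_logits(region, context, mention_tokens, ann, gaz):
    """Fuse context-based and name-based evidence by adding their logits."""
    return ann(region, context) + gaz(mention_tokens)


if __name__ == "__main__":
    ann, gaz = AttentiveRegionScorer(), GazetteerNameEncoder()
    region = torch.randn(2, 1, 256)    # pooled representation of each candidate region
    context = torch.randn(2, 10, 256)  # contextual token representations
    mention = torch.randn(2, 3, 256)   # embeddings of the mention tokens only
    print(combined_logits(region, context, mention, ann, gaz).shape)  # torch.Size([2, 8])
```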
Supervised neural machine translation based on data augmentation and improved training & inference process
Yixuan Tong
|
Liang Liang
|
Boyan Liu
|
Shanshan Jiang
|
Bin Dong
Proceedings of the 6th Workshop on Asian Translation
This is the second time SRCB has participated in WAT. This paper describes our neural machine translation systems for the shared translation tasks of WAT 2019. We participated in the ASPEC tasks and submitted results for four language pairs: English-Japanese, Japanese-English, Chinese-Japanese, and Japanese-Chinese. We employed the Transformer model as the baseline and experimented with relative position representations, data augmentation, deeper models, and ensembling. Experiments show that all of these methods yield substantial improvements.
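Ensembling is one of the techniques mentioned above. A minimal, library-agnostic sketch of one common ensembling scheme, averaging the per-step output distributions of several models before picking the next token, is shown below; it illustrates the general technique under assumed toy inputs, not the authors' exact decoding setup.

```python
# Minimal sketch of ensembling translation models by averaging their per-step
# output probability distributions (pure NumPy, toy example).

from typing import List

import numpy as np


def ensemble_next_token(step_probs: List[np.ndarray]) -> int:
    """Pick the next token by averaging the models' probability distributions."""
    avg = np.mean(np.stack(step_probs, axis=0), axis=0)
    return int(np.argmax(avg))


if __name__ == "__main__":
    vocab_size = 5
    rng = np.random.default_rng(0)
    # Fake per-model distributions over the vocabulary for one decoding step.
    probs = [rng.dirichlet(np.ones(vocab_size)) for _ in range(4)]
    print(ensemble_next_token(probs))
```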
2018
SRCB Neural Machine Translation Systems in WAT 2018
Yihan Li
|
Boyan Liu
|
Yixuan Tong
|
Shanshan Jiang
|
Bin Dong
Proceedings of the 32nd Pacific Asia Conference on Language, Information and Computation: 5th Workshop on Asian Translation