Junjun Guo


2022

pdf
基于图文细粒度对齐语义引导的多模态神经机器翻译方法(Based on Semantic Guidance of Fine-grained Alignment of Image-Text for Multi-modal Neural Machine Translation)
Junjie Ye (叶俊杰) | Junjun Guo (郭军军) | Kaiwen Tan (谭凯文) | Yan Xiang (相艳) | Zhengtao Yu (余正涛)
Proceedings of the 21st Chinese National Conference on Computational Linguistics

“多模态神经机器翻译旨在利用视觉信息来提高文本翻译质量。传统多模态机器翻译将图像的全局语义信息融入到翻译模型,而忽略了图像的细粒度信息对翻译质量的影响。对此,该文提出一种基于图文细粒度对齐语义引导的多模态神经机器翻译方法,该方法首先跨模态交互图文信息,以提取图文细粒度对齐语义信息,然后以图文细粒度对齐语义信息为枢纽,采用门控机制将多模态细粒度信息对齐到文本信息上,实现图文多模态特征融合。在多模态机器翻译基准数据集Multi30K 英语→德语、英语→法语以及英语→捷克语翻译任务上的实验结果表明,论文提出方法的有效性,并且优于大多数最先进的多模态机器翻译方法。”

pdf
Adaptive Feature Discrimination and Denoising for Asymmetric Text Matching
Yan Li | Chenliang Li | Junjun Guo
Proceedings of the 29th International Conference on Computational Linguistics

Asymmetric text matching has becoming increasingly indispensable for many downstream tasks (e.g., IR and NLP). Here, asymmetry means that the documents involved for matching hold different amounts of information, e.g., a short query against a relatively longer document. The existing solutions mainly focus on modeling the feature interactions between asymmetric texts, but rarely go one step further to recognize discriminative features and perform feature denoising to enhance relevance learning. In this paper, we propose a novel adaptive feature discrimination and denoising model for asymmetric text matching, called ADDAX. For each asymmetric text pair, ADDAX is devised to explicitly distinguish discriminative features and filter out irrelevant features in a context-aware fashion. Concretely, a matching-adapted gating siamese cell (MAGS) is firstly devised to identify discriminative features and produce the corresponding hybrid representations for a text pair. Afterwards, we introduce a locality-constrained hashing denoiser to perform feature-level denoising by learning a discriminative low-dimensional binary codes for redundantly longer text. Extensive experiments on four real-world datasets from different downstream tasks demostrate that the proposed ADDAX obtains substantial performance gain over 36 up-to-date state-of-the-art alternatives.

pdf
Noise-robust Cross-modal Interactive Learning with Text2Image Mask for Multi-modal Neural Machine Translation
Junjie Ye | Junjun Guo | Yan Xiang | Kaiwen Tan | Zhengtao Yu
Proceedings of the 29th International Conference on Computational Linguistics

Multi-modal neural machine translation (MNMT) aims to improve textual level machine translation performance in the presence of text-related images. Most of the previous works on MNMT focus on multi-modal fusion methods with full visual features. However, text and its corresponding image may not match exactly, visual noise is generally inevitable. The irrelevant image regions may mislead or distract the textual attention and cause model performance degradation. This paper proposes a noise-robust multi-modal interactive fusion approach with cross-modal relation-aware mask mechanism for MNMT. A text-image relation-aware attention module is constructed through the cross-modal interaction mask mechanism, and visual features are extracted based on the text-image interaction mask knowledge. Then a noise-robust multi-modal adaptive fusion approach is presented by fusion the relevant visual and textual features for machine translation. We validate our method on the Multi30K dataset. The experimental results show the superiority of our proposed model, and achieve the state-of-the-art scores in all En-De, En-Fr and En-Cs translation tasks.

2021

pdf
基于中文信息与越南语句法指导的越南语事件检测(Vietnamese event detection based on Chinese information and Vietnamese syntax guidance)
Long Chen (陈龙) | Junjun Guo (郭军军) | Yafei Zhang (张亚飞) | Chengxiang Gao (高盛祥) | Zhengtao Yu (余正涛)
Proceedings of the 20th Chinese National Conference on Computational Linguistics

当前基于深度学习的事件检测模型都依赖足够数量的标注数据,而标注数据的稀缺及事件类型歧义为越南语事件检测带来了极大的挑战。根据“表达相同观点但语言不同的句子通常有相同或相似的语义成分”这一多语言一致性特征,本文提出了一种基于中文信息与越南语句法指导的越南语事件检测框架。首先通过共享编码器策略和交叉注意力网络将中文信息融入到越南语中,然后使用图卷积网络融入越南语依存句法信息,最后在中文事件类型指导下实现越南语事件检测。实验结果表明,在中文信息和越南语句法的指导下越南语事件检测取得了较好的效果。

pdf
基于阅读理解的汉越跨语言新闻事件要素抽取方法(News Events Element Extraction of Chinese-Vietnamese Cross-language Using Reading Comprehension)
Enchang Zhu (朱恩昌) | Zhengtao Yu (余正涛) | Chengxiang Gao (高盛祥) | Yuxin Huang (黄宇欣) | Junjun Guo (郭军军)
Proceedings of the 20th Chinese National Conference on Computational Linguistics

新闻事件要素抽取旨在抽取新闻文本中描述主题事件的事件要素,如时间、地点、人物和组织机构名等。传统的事件要素抽取方法在资源稀缺型语言上性能欠佳,且对长文本语义建模困难。对此,本文提出了基于阅读理解的汉越跨语言新闻事件要素抽取方法。该方法首先利用新闻长文本关键句检索模块过滤含噪声的句子。然后利用跨语言阅读理解模型将富资源语言知识迁移到越南语,提高越南语新闻事件要素抽取的性能。在自建的汉越双语新闻事件要素抽取数据集上的实验证明了本文方法的有效性。

2020

pdf
基于拼音约束联合学习的汉语语音识别(Chinese Speech Recognition Based on Pinyin Constraint Joint Learning)
Renfeng Liang (梁仁凤) | Zhengtao Yu (余正涛) | Shengxiang Gao (高盛祥) | Yuxin Huang (黄于欣) | Junjun Guo (郭军军) | Shuli Xu (许树理)
Proceedings of the 19th Chinese National Conference on Computational Linguistics

当前的语音识别模型在英语、法语等表音文字中已经取得很好的效果。然而,汉语是 一种典型的表意文字,汉字与语音没有直接的对应关系,但拼音作为汉字读音的标注 符号,与汉字存在相互转换的内在联系。因此,在汉语语音识别中利用拼音作为解码 约束,引入一种更接近语音的归纳偏置。基于多任务学习框架,提出一种基于拼音约 束联合学习的汉语语音识别方法,以端到端的汉字语音识别为主任务,以拼音语音识 别为辅助任务,通过共享编码器,同时利用汉字与拼音识别结果作为监督信号,增强 编码器对汉语语音的表达能力。实验结果表明,相比基线模型,提出方法取得更优的 识别效果,词错误率WER降低了2.24个百分点