Zhi Qu
2024
Disentangling Pretrained Representation to Leverage Low-Resource Languages in Multilingual Machine Translation
Frederikus Hudi | Zhi Qu | Hidetaka Kamigaito | Taro Watanabe
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Multilingual neural machine translation aims to encapsulate multiple languages into a single model. However, it requires enormous amounts of data, leaving low-resource languages (LRLs) underdeveloped. As LRLs may benefit from the shared knowledge of multilingual representation, we aspire to find effective ways to integrate unseen languages into a pre-trained model. Nevertheless, the intricacy of shared representation among languages hinders its full utilisation. To resolve this problem, we employed target language prediction and a central language-aware layer to improve representation when integrating LRLs. Focusing on improving LRLs in the linguistically diverse country of Indonesia, we evaluated five languages using a parallel corpus of 1,000 instances each, with experimental results measured by BLEU showing a zero-shot improvement of 7.4 over the baseline score of 7.1, reaching a score of 15.5 at best. Further analysis showed that the gains in performance are attributed to the disentanglement of the multilingual representation in the encoder, along with the shift of the target language-specific representation in the decoder.
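The abstract names two components, target language prediction and a central language-aware layer, without detailing them. As a purely illustrative aid, the following PyTorch sketch shows one way an auxiliary target-language classifier could be attached to mean-pooled encoder states; the class and hyperparameter names (TargetLanguagePredictor, aux_weight) are hypothetical and do not come from the paper.

```python
import torch
import torch.nn as nn

class TargetLanguagePredictor(nn.Module):
    """Hypothetical auxiliary head: predict the target-language ID from
    mean-pooled encoder states, encouraging the encoder representation to
    carry target-language information (illustrative sketch only)."""

    def __init__(self, d_model: int, num_languages: int):
        super().__init__()
        self.classifier = nn.Linear(d_model, num_languages)

    def forward(self, encoder_states: torch.Tensor, src_mask: torch.Tensor) -> torch.Tensor:
        # encoder_states: (batch, src_len, d_model); src_mask: (batch, src_len), 1 for real tokens
        mask = src_mask.unsqueeze(-1).float()
        pooled = (encoder_states * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1.0)
        return self.classifier(pooled)  # logits over target languages

# Usage sketch: add a cross-entropy term on these logits to the translation loss.
# logits = predictor(enc_out, src_mask)
# aux_loss = nn.functional.cross_entropy(logits, tgt_lang_ids)
# total_loss = nmt_loss + aux_weight * aux_loss  # aux_weight is a hypothetical hyperparameter
```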
2022
Sharing Parameter by Conjugation for Knowledge Graph Embeddings in Complex Space
Xincan Feng | Zhi Qu | Yuchang Cheng | Taro Watanabe | Nobuhiro Yugami
Proceedings of TextGraphs-16: Graph-based Methods for Natural Language Processing
A Knowledge Graph (KG) is a directed graphical representation of entities and relations in the real world. KGs can be applied to diverse Natural Language Processing (NLP) tasks where knowledge is required. The need to scale up and complete KGs automatically yields Knowledge Graph Embedding (KGE), a shallow machine learning model that suffers from memory and training-time consumption issues. To mitigate the computational load, we propose a parameter-sharing method, i.e., using conjugate parameters for the complex numbers employed in KGE models. Our method improves memory efficiency by 2x in relation embedding while achieving comparable performance to the state-of-the-art non-conjugate models, with faster, or at least comparable, training time. We demonstrate the generalizability of our method on two of the best-performing KGE models, 5★E (CITATION) and ComplEx (CITATION), on five benchmark datasets.
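For readers unfamiliar with conjugate parameter sharing, the following is a minimal, hypothetical PyTorch sketch of the general idea on a ComplEx-style scorer: relation parameters are stored for only half of the embedding dimensions and the other half is tied to their complex conjugate, halving relation-embedding memory. The construction and the name ConjugateSharedComplEx are assumptions for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class ConjugateSharedComplEx(nn.Module):
    """Illustrative ComplEx scorer in which the relation embedding for the
    second half of the dimensions is tied to the complex conjugate of the
    first half, so only half of the relation parameters are stored."""

    def __init__(self, num_entities: int, num_relations: int, dim: int):
        super().__init__()
        assert dim % 2 == 0, "dim must be even so it can be split in half"
        self.ent_re = nn.Embedding(num_entities, dim)
        self.ent_im = nn.Embedding(num_entities, dim)
        # Only dim // 2 relation parameters per part; the other half is derived.
        self.rel_re = nn.Embedding(num_relations, dim // 2)
        self.rel_im = nn.Embedding(num_relations, dim // 2)

    def score(self, h: torch.Tensor, r: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        h_re, h_im = self.ent_re(h), self.ent_im(h)
        t_re, t_im = self.ent_re(t), self.ent_im(t)
        # Full relation: first half stored, second half is its complex conjugate.
        r_re = torch.cat([self.rel_re(r), self.rel_re(r)], dim=-1)
        r_im = torch.cat([self.rel_im(r), -self.rel_im(r)], dim=-1)
        # ComplEx score: Re(<h, r, conj(t)>)
        return (h_re * r_re * t_re + h_im * r_re * t_im
                + h_re * r_im * t_im - h_im * r_im * t_re).sum(dim=-1)
```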
Adapting to Non-Centered Languages for Zero-shot Multilingual Translation
Zhi Qu | Taro Watanabe
Proceedings of the 29th International Conference on Computational Linguistics
Multilingual neural machine translation can translate language pairs that are unseen during training, i.e., zero-shot translation. However, zero-shot translation is always unstable. Although prior works attributed the instability to the domination of the central language, e.g., English, we supplement this viewpoint with the strict dependence of non-centered languages. In this work, we propose a simple, lightweight yet effective language-specific modeling method that adapts to non-centered languages and combines shared information with language-specific information to counteract the instability of zero-shot translation. Experiments with the Transformer on the IWSLT17, Europarl, TED Talks, and OPUS-100 datasets show that our method not only performs better than strong baselines under centered data conditions but can also easily fit non-centered data conditions. By further investigating layer attribution, we show that our proposed method can disentangle the coupled representation in the correct direction.
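The abstract describes combining shared and language-specific information but does not specify the mechanism. The sketch below shows one generic way such a combination could look in PyTorch: a per-language projection is mixed with the shared layer output through a learned gate. The module and its name (LanguageSpecificCombiner) are hypothetical and not taken from the paper.

```python
import torch
import torch.nn as nn

class LanguageSpecificCombiner(nn.Module):
    """Hypothetical sketch of combining shared and language-specific
    information: a small per-language projection is mixed with the shared
    layer output via a learned gate. Illustrative only; the paper's exact
    modeling may differ."""

    def __init__(self, d_model: int, num_languages: int):
        super().__init__()
        self.lang_proj = nn.ModuleList(
            [nn.Linear(d_model, d_model) for _ in range(num_languages)]
        )
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, shared: torch.Tensor, lang_id: int) -> torch.Tensor:
        # shared: (batch, seq_len, d_model) output of a shared Transformer layer
        specific = self.lang_proj[lang_id](shared)
        g = torch.sigmoid(self.gate(torch.cat([shared, specific], dim=-1)))
        return g * shared + (1.0 - g) * specific  # gated mix of shared and language-specific paths
```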
Co-authors
- Taro Watanabe 3
- Xincan Feng 1
- Yuchang Cheng 1
- Nobuhiro Yugami 1
- Frederikus Hudi 1