Kenji Imamura


2022

pdf
NAIST-NICT-TIT WMT22 General MT Task Submission
Hiroyuki Deguchi | Kenji Imamura | Masahiro Kaneko | Yuto Nishida | Yusuke Sakai | Justin Vasselli | Huy Hien Vu | Taro Watanabe
Proceedings of the Seventh Conference on Machine Translation (WMT)

In this paper, we describe our NAIST-NICT-TIT submission to the WMT22 general machine translation task. We participated in this task for the English ↔ Japanese language pair. Our system is characterized as an ensemble of Transformer big models, k-nearest-neighbor machine translation (kNN-MT) (Khandelwal et al., 2021), and reranking. In our translation system, we construct the datastore for kNN-MT from back-translated monolingual data and integrate kNN-MT into the ensemble model. We designed a reranking system to select a translation from the n-best translation candidates generated by the translation system. We also use a context-aware model to improve the document-level consistency of the translation.
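The core of kNN-MT is interpolating the NMT model's next-token distribution with a distribution induced from retrieved datastore neighbors. A minimal sketch of that interpolation step, assuming neighbors have already been retrieved (the datastore lookup itself, and the `temperature` and `lam` values, are simplifications, not the submission's actual settings):

```python
import math

def knn_mt_interpolate(model_probs, neighbors, temperature=10.0, lam=0.5):
    """Interpolate model token probabilities with a kNN distribution
    (in the style of Khandelwal et al., 2021). `neighbors` is a list
    of (target_token, distance) pairs retrieved from the datastore."""
    # Turn neighbor distances into a distribution via a softmax
    # over negative distances.
    weights = [math.exp(-d / temperature) for _, d in neighbors]
    z = sum(weights)
    knn_probs = {}
    for (tok, _), w in zip(neighbors, weights):
        knn_probs[tok] = knn_probs.get(tok, 0.0) + w / z
    # Linear interpolation of the two distributions.
    vocab = set(model_probs) | set(knn_probs)
    return {t: lam * knn_probs.get(t, 0.0) + (1 - lam) * model_probs.get(t, 0.0)
            for t in vocab}
```

Because the datastore here is built from back-translated monolingual data, the retrieved neighbors inject target-side knowledge the parallel data alone does not cover.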

2021

pdf
NICT-2 Translation System at WAT-2021: Applying a Pretrained Multilingual Encoder-Decoder Model to Low-resource Language Pairs
Kenji Imamura | Eiichiro Sumita
Proceedings of the 8th Workshop on Asian Translation (WAT2021)

In this paper, we present the NICT system (NICT-2) submitted to the NICT-SAP shared task at the 8th Workshop on Asian Translation (WAT-2021). A feature of our system is that we used a pretrained multilingual BART (Bidirectional and Auto-Regressive Transformer; mBART) model. Because publicly available models do not support some languages in the NICT-SAP task, we added these languages to the mBART model and then trained it using monolingual corpora extracted from Wikipedia. We fine-tuned the expanded mBART model using the parallel corpora specified by the NICT-SAP task. The BLEU scores improved greatly in comparison with those of systems without the pretrained model, including for the additional languages.

2020

pdf
Transformer-based Double-token Bidirectional Autoregressive Decoding in Neural Machine Translation
Kenji Imamura | Eiichiro Sumita
Proceedings of the 7th Workshop on Asian Translation

This paper presents a simple method that extends a standard Transformer-based autoregressive decoder to speed up decoding. The proposed method generates a token from the head and a token from the tail of a sentence (two tokens in total) in each step. By simultaneously generating multiple tokens that rarely depend on each other, the decoding speed is increased while the degradation in translation quality is minimized. In our experiments, the proposed method increased the translation speed by around 113%-155% in comparison with a standard autoregressive decoder, while degrading the BLEU scores by no more than 1.03 points. It was faster than an iterative non-autoregressive decoder under many conditions.
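The decoding loop described above can be sketched as a toy greedy procedure: each step emits one token for the growing head (left to right) and one for the growing tail (right to left), so the number of steps is roughly halved. `predict_step` is a hypothetical stand-in for the model call, not the paper's actual interface:

```python
def decode_double_token(predict_step, max_len=64, bos="<s>", eos="</s>"):
    """Toy greedy loop for double-token bidirectional decoding.
    `predict_step(head, tail)` is a hypothetical model call returning
    (next_head_token, next_tail_token); decoding stops when the two
    streams meet (the head emits EOS or the tail emits BOS)."""
    head, tail = [], []
    while len(head) + len(tail) < max_len:
        h, t = predict_step(head, tail)
        if h == eos:
            break
        head.append(h)
        if t == bos:
            break
        tail.insert(0, t)  # the tail grows right-to-left
    return head + tail
```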

2019

pdf
Long Warm-up and Self-Training: Training Strategies of NICT-2 NMT System at WAT-2019
Kenji Imamura | Eiichiro Sumita
Proceedings of the 6th Workshop on Asian Translation

This paper describes the NICT-2 neural machine translation system at the 6th Workshop on Asian Translation. This system employs the standard Transformer model but features the following two characteristics. One is the long warm-up strategy, which performs a longer warm-up of the learning rate at the start of training than conventional approaches. The other is that the system introduces self-training approaches based on multiple back-translations generated by sampling. We participated in three tasks—ASPEC.en-ja, ASPEC.ja-en, and TDDC.ja-en—using this system.
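For context, the standard Transformer learning-rate schedule rises linearly during warm-up and then decays as the inverse square root of the step count; a "long warm-up" simply stretches the rising phase. A sketch of that schedule (the 16k warm-up value is an illustrative assumption, not the paper's reported setting):

```python
def transformer_lr(step, d_model=512, warmup_steps=16000):
    """Inverse-square-root learning-rate schedule of the original
    Transformer. A long warm-up uses a larger `warmup_steps` so the
    rate rises more gently before the decay phase begins."""
    step = max(step, 1)
    return d_model ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)
```

The schedule peaks exactly at `warmup_steps`, so lengthening the warm-up also lowers how quickly the model reaches its maximum learning rate.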

pdf
Recycling a Pre-trained BERT Encoder for Neural Machine Translation
Kenji Imamura | Eiichiro Sumita
Proceedings of the 3rd Workshop on Neural Generation and Translation

In this paper, a pre-trained Bidirectional Encoder Representations from Transformers (BERT) model is applied to Transformer-based neural machine translation (NMT). In contrast to monolingual tasks, the number of unlearned model parameters in an NMT decoder is as huge as the number of learned parameters in the BERT model. To train all the models appropriately, we employ two-stage optimization, which first trains only the unlearned parameters by freezing the BERT model, and then fine-tunes all the sub-models. In our experiments, stable two-stage optimization was achieved, whereas the BLEU scores of direct fine-tuning were extremely low. Consequently, the BLEU scores of the proposed method were better than those of the Transformer base model and the same model without pre-training. Additionally, we confirmed that NMT with the BERT encoder is more effective in low-resource settings.
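The two-stage optimization can be sketched abstractly: freeze the pre-trained encoder while the randomly initialized decoder catches up, then unfreeze everything for joint fine-tuning. The `model` object and its `bert_parameters` attribute are hypothetical, standing in for whatever framework the training actually uses:

```python
def two_stage_optimization(model, train_fn):
    """Two-stage optimization for grafting a pre-trained BERT encoder
    onto an NMT decoder: (1) freeze the BERT parameters and train only
    the unlearned decoder parameters; (2) unfreeze everything and
    fine-tune all sub-models jointly."""
    # Stage 1: train only the unlearned (decoder) parameters.
    for p in model.bert_parameters:
        p.requires_grad = False
    train_fn(model)
    # Stage 2: fine-tune all sub-models together.
    for p in model.bert_parameters:
        p.requires_grad = True
    train_fn(model)
```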

pdf
Exploiting Out-of-Domain Parallel Data through Multilingual Transfer Learning for Low-Resource Neural Machine Translation
Aizhan Imankulova | Raj Dabre | Atsushi Fujita | Kenji Imamura
Proceedings of Machine Translation Summit XVII: Research Track

2018

pdf
Enhancement of Encoder and Attention Using Target Monolingual Corpora in Neural Machine Translation
Kenji Imamura | Atsushi Fujita | Eiichiro Sumita
Proceedings of the 2nd Workshop on Neural Machine Translation and Generation

A large-scale parallel corpus is required to train encoder-decoder neural machine translation. The method of using synthetic parallel texts, in which target monolingual corpora are automatically translated into source sentences, is effective in improving the decoder, but is unreliable for enhancing the encoder. In this paper, we propose a method that enhances the encoder and attention using target monolingual corpora by generating multiple source sentences via sampling. By using multiple source sentences, diversity close to that of humans is achieved. Our experimental results show that the translation quality is improved by increasing the number of synthetic source sentences for each given target sentence, and quality close to that using a manually created parallel corpus was achieved.
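The key difference from standard back-translation is sampling rather than beam search when generating synthetic sources, so each target sentence yields several diverse source renderings. A minimal sketch, where `reverse_model_step` is a hypothetical target-to-source model call returning a token distribution:

```python
import random

def sample_back_translations(reverse_model_step, target, n=4,
                             max_len=30, eos="</s>"):
    """Generate `n` diverse synthetic source sentences for one target
    sentence by sampling (rather than beam search) from a
    target-to-source model. `reverse_model_step(target, prefix)` is a
    hypothetical call returning a {token: probability} dict."""
    sources = []
    for _ in range(n):
        prefix = []
        while len(prefix) < max_len:
            dist = reverse_model_step(target, prefix)
            tokens, probs = zip(*dist.items())
            tok = random.choices(tokens, weights=probs, k=1)[0]
            if tok == eos:
                break
            prefix.append(tok)
        sources.append(prefix)
    return sources
```

Pairing each sampled source with the same human-written target is what enhances the encoder and attention, since the model sees many source-side variants of one reliable target.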

pdf
NICT Self-Training Approach to Neural Machine Translation at NMT-2018
Kenji Imamura | Eiichiro Sumita
Proceedings of the 2nd Workshop on Neural Machine Translation and Generation

This paper describes the NICT neural machine translation system submitted at the NMT-2018 shared task. A characteristic of our approach is the introduction of self-training. Since our self-training does not change the model structure, it does not influence the efficiency of translation, such as the translation speed. The experimental results showed that the translation quality improved not only in the sequence-to-sequence (seq-to-seq) models but also in the transformer models.

pdf
Multilingual Parallel Corpus for Global Communication Plan
Kenji Imamura | Eiichiro Sumita
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2017

pdf
Ensemble and Reranking: Using Multiple Models in the NICT-2 Neural Machine Translation System at WAT2017
Kenji Imamura | Eiichiro Sumita
Proceedings of the 4th Workshop on Asian Translation (WAT2017)

In this paper, we describe the NICT-2 neural machine translation system evaluated at WAT2017. This system uses multiple models as an ensemble and combines models with opposite decoding directions by reranking (called bi-directional reranking). In our experimental results on small data sets, the translation quality improved when the number of models was increased to 32 in total and did not saturate. In the experiments on large data sets, improvements of 1.59-3.32 BLEU points were achieved when six-model ensembles were combined by bi-directional reranking.
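Bi-directional reranking can be sketched as rescoring the n-best list with both left-to-right and right-to-left models and keeping the candidate with the best combined score. The scorer callables below are hypothetical placeholders for log-probability functions, not the system's actual interface:

```python
def bidirectional_rerank(candidates, l2r_scorers, r2l_scorers):
    """Bi-directional reranking: n-best candidates from the
    left-to-right ensemble are rescored by models decoding in both
    directions; the candidate with the highest combined score wins.
    Right-to-left scorers see the reversed token sequence."""
    def combined(cand):
        return (sum(score(cand) for score in l2r_scorers)
                + sum(score(cand[::-1]) for score in r2l_scorers))
    return max(candidates, key=combined)
```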

2016

pdf
Multi-domain Adaptation for Statistical Machine Translation Based on Feature Augmentation
Kenji Imamura | Eiichiro Sumita
Conferences of the Association for Machine Translation in the Americas: MT Researchers' Track

Domain adaptation is a major challenge when applying machine translation to practical tasks. In this paper, we present domain adaptation methods for machine translation that assume multiple domains. The proposed methods combine two model types: a corpus-concatenated model covering multiple domains and single-domain models that are accurate but sparse in specific domains. We combine the advantages of both models using feature augmentation for domain adaptation in machine learning. Our experimental results show that the BLEU scores of the proposed method clearly surpass those of single-domain models for low-resource domains. For high-resource domains, the scores of the proposed method were superior to those of both single-domain and corpus-concatenated models. Even in domains having a million bilingual sentences, the translation quality was at least preserved and even improved in some domains. These results demonstrate that state-of-the-art domain adaptation can be realized with appropriate settings, even when using standard log-linear models.
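Feature augmentation in this style duplicates every feature into a shared copy and a domain-specific copy, letting the learner share statistics across domains while still fitting per-domain behavior. A minimal sketch, with illustrative feature names (the actual features of the log-linear MT model are not shown here):

```python
def augment_features(features, domain):
    """Feature augmentation for multi-domain adaptation (in the spirit
    of Daume III's 'frustratingly easy' method): every feature is
    duplicated into a domain-independent copy and a domain-specific
    copy keyed by the current domain."""
    augmented = {}
    for name, value in features.items():
        augmented[f"shared:{name}"] = value      # active in all domains
        augmented[f"{domain}:{name}"] = value    # active only in `domain`
    return augmented
```

At training time, the shared copies absorb what all domains have in common, while the domain-keyed copies capture deviations, which is why low-resource domains benefit most.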

pdf
NICT-2 Translation System for WAT2016: Applying Domain Adaptation to Phrase-based Statistical Machine Translation
Kenji Imamura | Eiichiro Sumita
Proceedings of the 3rd Workshop on Asian Translation (WAT2016)

This paper describes the NICT-2 translation system for the 3rd Workshop on Asian Translation. The proposed system employs a domain adaptation method based on feature augmentation. We regarded the Japan Patent Office Corpus as a mixture of four domain corpora and improved the translation quality of each domain. In addition, we incorporated language models constructed from Google n-grams as external knowledge. Our domain adaptation method can naturally incorporate such external knowledge that contributes to translation quality.

2014

pdf
Predicate-Argument Structure Analysis with Zero-Anaphora Resolution for Dialogue Systems
Kenji Imamura | Ryuichiro Higashinaka | Tomoko Izumi
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

pdf
Towards an open-domain conversational system fully based on natural language processing
Ryuichiro Higashinaka | Kenji Imamura | Toyomi Meguro | Chiaki Miyazaki | Nozomi Kobayashi | Hiroaki Sugiyama | Toru Hirano | Toshiro Makino | Yoshihiro Matsuo
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

2013

pdf
Case Study of Model Adaptation: Transfer Learning and Online Learning
Kenji Imamura
Proceedings of the Sixth International Joint Conference on Natural Language Processing

2012

pdf
Constructing a Class-Based Lexical Dictionary using Interactive Topic Models
Kugatsu Sadamitsu | Kuniko Saito | Kenji Imamura | Yoshihiro Matsuo
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

This paper proposes a new method of constructing arbitrary class-based related word dictionaries on interactive topic models; we assume that each class is described by a topic. We propose a new semi-supervised method that uses the simplest topic model yielded by the standard EM algorithm; model calculation is very rapid. Furthermore our approach allows a dictionary to be modified interactively and the final dictionary has a hierarchical structure. This paper makes three contributions. First, it proposes a word-based semi-supervised topic model. Second, we apply the semi-supervised topic model to interactive learning; this approach is called the Interactive Topic Model. Third, we propose a score function; it extracts the related words that occupy the middle layer of the hierarchical structure. Experiments show that our method can appropriately retrieve the words belonging to an arbitrary class.

pdf
Entity Set Expansion using Interactive Topic Information
Kugatsu Sadamitsu | Kuniko Saito | Kenji Imamura | Yoshihiro Matsuo
Proceedings of the 26th Pacific Asia Conference on Language, Information, and Computation

pdf
Grammar Error Correction Using Pseudo-Error Sentences and Domain Adaptation
Kenji Imamura | Kuniko Saito | Kugatsu Sadamitsu | Hitoshi Nishikawa
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2011

pdf
Entity Set Expansion using Topic information
Kugatsu Sadamitsu | Kuniko Saito | Kenji Imamura | Genichiro Kikui
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

2010

pdf
Standardizing Complex Functional Expressions in Japanese Predicates: Applying Theoretically-Based Paraphrasing Rules
Tomoko Izumi | Kenji Imamura | Genichiro Kikui | Satoshi Sato
Proceedings of the 2010 Workshop on Multiword Expressions: from Theory to Applications

2009

pdf
Discriminative Approach to Predicate-Argument Structure Analysis with Zero-Anaphora Resolution
Kenji Imamura | Kuniko Saito | Tomoko Izumi
Proceedings of the ACL-IJCNLP 2009 Conference Short Papers

pdf
Tag Confidence Measure for Semi-Automatically Updating Named Entity Recognition
Kuniko Saito | Kenji Imamura
Proceedings of the 2009 Named Entities Workshop: Shared Task on Transliteration (NEWS 2009)

2007

pdf
Japanese Dependency Parsing Using Sequential Labeling for Semi-spoken Language
Kenji Imamura | Genichiro Kikui | Norihito Yasuda
Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics Companion Volume Proceedings of the Demo and Poster Sessions

2005

pdf
Practical Approach to Syntax-based Statistical Machine Translation
Kenji Imamura | Hideo Okuma | Eiichiro Sumita
Proceedings of Machine Translation Summit X: Papers

This paper presents a practical approach to statistical machine translation (SMT) based on syntactic transfer. Conventionally, phrase-based SMT generates an output sentence by combining phrase (multiword sequence) translation and phrase reordering without syntax. On the other hand, SMT based on tree-to-tree mapping, which involves syntactic information, is theoretical, so its features remain unclear from the viewpoint of a practical system. The SMT proposed in this paper translates phrases with hierarchical reordering based on the bilingual parse tree. In our experiments, the best translation was obtained when both phrases and syntactic information were used for the translation process.

pdf
Nobody is perfect: ATR’s hybrid approach to spoken language translation
Michael Paul | Takao Doi | Youngsook Hwang | Kenji Imamura | Hideo Okuma | Eiichiro Sumita
Proceedings of the Second International Workshop on Spoken Language Translation

2004

pdf
Example-based Machine Translation Based on Syntactic Transfer with Statistical Models
Kenji Imamura | Hideo Okuma | Taro Watanabe | Eiichiro Sumita
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics

pdf bib
EBMT, SMT, hybrid and more: ATR spoken language translation system
Eiichiro Sumita | Yasuhiro Akiba | Takao Doi | Andrew Finch | Kenji Imamura | Hideo Okuma | Michael Paul | Mitsuo Shimohata | Taro Watanabe
Proceedings of the First International Workshop on Spoken Language Translation: Evaluation Campaign

2003

pdf
Automatic Expansion of Equivalent Sentence Set Based on Syntactic Substitution
Kenji Imamura | Yasuhiro Akiba | Eiichiro Sumita
Companion Volume of the Proceedings of HLT-NAACL 2003 - Short Papers

pdf
Feedback Cleaning of Machine Translation Rules Using Automatic Evaluation
Kenji Imamura | Eiichiro Sumita | Yuji Matsumoto
Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics

pdf
Automatic Construction of Machine Translation Knowledge Using Translation Literalness
Kenji Imamura | Eiichiro Sumita | Yuji Matsumoto
10th Conference of the European Chapter of the Association for Computational Linguistics

pdf
A corpus-centered approach to spoken language translation
Eiichiro Sumita | Yasuhiro Akiba | Takao Doi | Andrew Finch | Kenji Imamura | Michael Paul | Mitsuo Shimohata | Taro Watanabe
10th Conference of the European Chapter of the Association for Computational Linguistics

2002

pdf
Application of translation knowledge acquired by hierarchical phrase alignment for pattern-based MT
Kenji Imamura
Proceedings of the 9th Conference on Theoretical and Methodological Issues in Machine Translation of Natural Languages: Papers

pdf
Statistical machine translation based on hierarchical phrase alignment
Taro Watanabe | Kenji Imamura | Eiichiro Sumita
Proceedings of the 9th Conference on Theoretical and Methodological Issues in Machine Translation of Natural Languages: Papers

pdf
Corpus-assisted expansion of manual MT knowledge:
Setsuo Yamada | Kenji Imamura | Kazuhide Yamamoto
Proceedings of the 9th Conference on Theoretical and Methodological Issues in Machine Translation of Natural Languages: Papers

bib
Example-based machine translation
Eiichiro Sumita | Kenji Imamura
Proceedings of the 9th Conference on Theoretical and Methodological Issues in Machine Translation of Natural Languages: Tutorials

pdf
Comparing and Extracting Paraphrasing Words with 2-Way Bilingual Dictionaries
Kazutaka Takao | Kenji Imamura | Hideki Kashioka
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)

2001

pdf
Using multiple edit distances to automatically rank machine translation output
Yasuhiro Akiba | Kenji Imamura | Eiichiro Sumita
Proceedings of Machine Translation Summit VIII

This paper addresses the challenging problem of automatically evaluating output from machine translation (MT) systems in order to support the developers of these systems. Conventional approaches to the problem include methods that automatically assign a rank such as A, B, C, or D to MT output according to a single edit distance between this output and a correct translation example. The single edit distance can be designed in different ways, but any one design makes the assignment of some ranks more accurate and others less so, which limits the overall accuracy of rank assignment. To overcome this obstacle, this paper proposes an automatic ranking method that, by using multiple edit distances, encodes machine-translated sentences with a rank assigned by humans into multi-dimensional vectors from which a classifier of ranks is learned in the form of a decision tree (DT). The proposed method assigns a rank to MT output through the learned DT. The proposed method is evaluated using transcribed texts of real conversations in the travel arrangement domain. Experimental results show that the proposed method is more accurate than the single-edit-distance-based ranking methods, in both closed and open tests. Moreover, the proposed method could estimate MT quality within 3% error in some cases.
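The encoding step turns each hypothesis/reference pair into a vector of several edit distances, which then becomes the input to the decision-tree rank classifier. A sketch using two illustrative distances (word-level and character-level Levenshtein, each normalized by reference length); the paper's actual set of distances may differ:

```python
def levenshtein(a, b):
    """Standard dynamic-programming edit distance between two sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + (x != y)))  # substitution
        prev = curr
    return prev[-1]

def edit_distance_features(hypothesis, reference):
    """Encode an MT output / reference pair as a vector of multiple
    edit distances; vectors like this are what the decision-tree rank
    classifier is trained on."""
    hyp_words, ref_words = hypothesis.split(), reference.split()
    return [
        levenshtein(hyp_words, ref_words) / max(len(ref_words), 1),
        levenshtein(hypothesis, reference) / max(len(reference), 1),
    ]
```

Because each distance is sensitive to different error types, the tree can carve the vector space into regions that correspond to human-assigned ranks better than any single distance threshold could.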

2000

pdf
Taking Account of the User’s View in 3D Multimodal Instruction Dialogue
Yukiko I. Nakano | Kenji Imamura | Hisashi Ohara
COLING 2000 Volume 1: The 18th International Conference on Computational Linguistics