Taro Watanabe

2021

Removing Word-Level Spurious Alignment between Images and Pseudo-Captions in Unsupervised Image Captioning
Ukyo Honda | Yoshitaka Ushiku | Atsushi Hashimoto | Taro Watanabe | Yuji Matsumoto
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

Unsupervised image captioning is a challenging task that aims at generating captions without the supervision of image-sentence pairs, but only with images and sentences drawn from different sources and object labels detected from the images. In previous work, pseudo-captions, i.e., sentences that contain the detected object labels, were assigned to a given image. The focus of the previous work was on the alignment of input images and pseudo-captions at the sentence level. However, pseudo-captions contain many words that are irrelevant to a given image. In this work, we investigate the effect of removing mismatched words from image-sentence alignment to determine how they make this task difficult. We propose a simple gating mechanism that is trained to align image features with only the most reliable words in pseudo-captions: the detected object labels. The experimental results show that our proposed method outperforms the previous methods without introducing complex sentence-level learning objectives. Combined with the sentence-level alignment method of previous work, our method further improves its performance. These results confirm the importance of careful alignment in word-level details.

Structured Refinement for Sequential Labeling
Yiran Wang | Hiroyuki Shindo | Yuji Matsumoto | Taro Watanabe
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

A Text Editing Approach to Joint Japanese Word Segmentation, POS Tagging, and Lexical Normalization
Shohei Higashiyama | Masao Utiyama | Taro Watanabe | Eiichiro Sumita
Proceedings of the Seventh Workshop on Noisy User-generated Text (W-NUT 2021)

Lexical normalization, in addition to word segmentation and part-of-speech tagging, is a fundamental task for Japanese user-generated text processing. In this paper, we propose a text editing model to solve the three tasks jointly and methods of pseudo-labeled data generation to overcome the problem of data deficiency. Our experiments showed that the proposed model achieved better normalization performance when trained on more diverse pseudo-labeled data.

User-Generated Text Corpus for Evaluating Japanese Morphological Analysis and Lexical Normalization
Shohei Higashiyama | Masao Utiyama | Taro Watanabe | Eiichiro Sumita
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Morphological analysis (MA) and lexical normalization (LN) are both important tasks for Japanese user-generated text (UGT). To evaluate and compare different MA/LN systems, we have constructed a publicly available Japanese UGT corpus. Our corpus comprises 929 sentences annotated with morphological and normalization information, along with category information we classified for frequent UGT-specific phenomena. Experiments on the corpus demonstrated the low performance of existing MA/LN methods for non-general words and non-standard forms, indicating that the corpus would be a challenging benchmark for further research on UGT.

Transliteration for Low-Resource Code-Switching Texts: Building an Automatic Cyrillic-to-Latin Converter for Tatar
Chihiro Taguchi | Yusuke Sakai | Taro Watanabe
Proceedings of the Fifth Workshop on Computational Approaches to Linguistic Code-Switching

We introduce a Cyrillic-to-Latin transliterator for the Tatar language based on subword-level language identification. Transliteration is challenging for two reasons. First, because modern Tatar texts often contain intra-word code-switching to Russian, a different set of transliteration rules needs to be applied to each morpheme depending on its language, which necessitates morpheme-level language identification. Second, Tatar is a low-resource language with most of its texts in Cyrillic, which makes it difficult to prepare a sufficient dataset. Given this situation, we propose a transliteration method based on subword-level language identification. We trained a language classifier on monolingual Tatar and Russian texts and applied different transliteration rules in accordance with the identified language. The results demonstrate that our proposed method outscores other Tatar transliteration tools and suggest that it correctly transcribes Russian loanwords to some extent.

Nested Named Entity Recognition via Explicitly Excluding the Influence of the Best Path
Yiran Wang | Hiroyuki Shindo | Yuji Matsumoto | Taro Watanabe
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

This paper presents a novel method for nested named entity recognition. As a layered method, it extends the prior second-best path recognition method by explicitly excluding the influence of the best path. Our method maintains a set of hidden states at each time step and selectively leverages them to build a different potential function for recognition at each level. In addition, we demonstrate that recognizing innermost entities first results in better performance than the conventional outermost-entities-first scheme. We provide extensive experimental results on the ACE2004, ACE2005, and GENIA datasets to show the effectiveness and efficiency of our proposed method.

Neural Machine Translation with Synchronous Latent Phrase Structure
Shintaro Harada | Taro Watanabe
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: Student Research Workshop

It has been reported that grammatical information is useful for the machine translation (MT) task. However, annotating grammatical information requires substantial human resources. Furthermore, it is not trivial to adapt grammatical information to MT, since grammatical annotation usually adopts tokenization standards that might not be suitable for capturing the relation between two languages, and the use of sub-word tokenization, e.g., Byte-Pair-Encoding, to alleviate the out-of-vocabulary problem might not be compatible with those annotations. In this work, we propose two methods to explicitly incorporate grammatical information without supervised annotation: first, latent phrase structure is induced in an unsupervised fashion from a multi-head attention mechanism; second, the induced phrase structures in the encoder and decoder are synchronized, using constraints during training, so that they are compatible with each other. We demonstrate that our approach produces better performance and explainability in two tasks, translation and alignment, without extra resources. Although we could not obtain high-quality phrase structures in constituency parsing when evaluated monolingually, we find that the induced phrase structures enhance the explainability of translation through the synchronization constraint.

Zero Pronouns Identification based on Span prediction
Sei Iwata | Taro Watanabe | Masaaki Nagata
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: Student Research Workshop

The presence of zero pronouns (ZPs) greatly affects downstream NLP tasks in pro-drop languages such as Japanese and Chinese. To tackle the problem, previous work treated ZP identification as sequence labeling over the word sequence or the linearized tree nodes of the input. We propose a novel approach to ZP identification by casting it as a query-based argument span prediction task. Given a predicate as a query, our model predicts the omission with ZP. In the experiments, our model surpassed the sequence labeling baseline.

Dependency Patterns of Complex Sentences and Semantic Disambiguation for Abstract Meaning Representation Parsing
Yuki Yamamoto | Yuji Matsumoto | Taro Watanabe
Proceedings of *SEM 2021: The Tenth Joint Conference on Lexical and Computational Semantics

Abstract Meaning Representation (AMR) is a sentence-level meaning representation based on predicate argument structure. One of the challenges we find in AMR parsing is capturing the structure of complex sentences, which express relations between predicates. Knowing the core part of the sentence structure in advance may be beneficial in such a task. In this paper, we present a list of dependency patterns for English complex sentence constructions designed for AMR parsing. With a dedicated pattern matcher, all occurrences of complex sentence constructions are retrieved from an input sentence. Because some of the subordinators are semantically ambiguous, we deal with this problem by training classification models on data derived from AMR and Wikipedia corpora, establishing a new baseline for future work. The developed complex sentence patterns and the corresponding AMR descriptions will be made public.

2020

We propose a simple method for nominal coordination boundary identification. The main strength of our method is that it can identify coordination boundaries without training on labeled data, so it can be applied even when coordination structure annotations are not available. Our system employs pre-trained word embeddings to measure the similarities of words and detects the span of coordination, assuming that conjuncts share syntactic and semantic similarities. We demonstrate that our method yields good results in identifying coordinated noun phrases in the GENIA corpus and is comparable to a recent supervised method when the coordinator conjoins simple noun phrases.

2018

Denoising Neural Machine Translation Training with Trusted Data and Online Data Selection
Wei Wang | Taro Watanabe | Macduff Hughes | Tetsuji Nakagawa | Ciprian Chelba
Proceedings of the Third Conference on Machine Translation: Research Papers

Measuring the domain relevance of data and identifying or selecting well-fit domain data for machine translation (MT) is a well-studied topic, but denoising is not. Denoising is concerned with a different type of data quality and tries to reduce the negative impact of data noise on MT training, in particular, neural MT (NMT) training. This paper generalizes methods for measuring and selecting data for domain MT and applies them to denoising NMT training. The proposed approach uses trusted data and a denoising curriculum realized by online data selection. Intrinsic and extrinsic evaluations of the approach show that it is significantly effective for training NMT on data with severe noise.

2017

Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Greg Kondrak | Taro Watanabe
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers)
Greg Kondrak | Taro Watanabe
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

2016

Phrase-based Machine Translation using Multiple Preordering Candidates
Yusuke Oda | Taku Kudo | Tetsuji Nakagawa | Taro Watanabe
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

In this paper, we propose a new decoding method for phrase-based statistical machine translation which directly uses multiple preordering candidates as a graph structure. Compared with previous phrase-based decoding methods, our method is based on simple left-to-right dynamic programming in which no decoding-time reordering is performed. As a result, it runs very fast and the algorithm is easy to implement. Our system does not depend on specific preordering methods as long as they output multiple preordering candidates, and it is trivial to integrate existing preordering methods into our system. In our experiments on translating 11 diverse languages into English, the proposed method outperforms a conventional phrase-based decoder in terms of translation quality with comparable or faster decoding time.

Optimization for Statistical Machine Translation: A Survey
Graham Neubig | Taro Watanabe
Computational Linguistics, Volume 42, Issue 1 - March 2016

2015

Transition-based Neural Constituent Parsing
Taro Watanabe | Eiichiro Sumita
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Hierarchical Back-off Modeling of Hiero Grammar based on Non-parametric Bayesian Model
Hidetaka Kamigaito | Taro Watanabe | Hiroya Takamura | Manabu Okumura | Eiichiro Sumita
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

Leave-one-out Word Alignment without Garbage Collector Effects
Xiaolin Wang | Masao Utiyama | Andrew Finch | Taro Watanabe | Eiichiro Sumita
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

2014

The NICT translation system for IWSLT 2014
Xiaolin Wang | Andrew Finch | Masao Utiyama | Taro Watanabe | Eiichiro Sumita
Proceedings of the 11th International Workshop on Spoken Language Translation: Evaluation Campaign

This paper describes NICT’s participation in the IWSLT 2014 evaluation campaign for the TED Chinese-English translation shared-task. Our approach used a combination of phrase-based and hierarchical statistical machine translation (SMT) systems. Our focus was on several areas, specifically system combination, word alignment, and various language modeling techniques including the use of neural network joint models. Our experiments on the test set from the 2013 shared task showed that all of these techniques improve the BLEU score, with the largest improvements coming from using large data sizes to train the language model.

Recurrent Neural Networks for Word Alignment Model
Akihiro Tamura | Taro Watanabe | Eiichiro Sumita
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Recurrent Neural Network-based Tuple Sequence Model for Machine Translation
Youzheng Wu | Taro Watanabe | Chiori Hori
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

Unsupervised Word Alignment Using Frequency Constraint in Posterior Regularized EM
Hidetaka Kamigaito | Taro Watanabe | Hiroya Takamura | Manabu Okumura
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Syntax-Augmented Machine Translation using Syntax-Label Clustering
Hideya Mino | Taro Watanabe | Eiichiro Sumita
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

This paper describes NICT’s participation in the IWSLT 2010 evaluation campaign for the DIALOG translation (Chinese-English) and the BTEC (French-English) translation shared-tasks. For the DIALOG translation, the main challenge is applying context information during translation. Context information can be used to decide on word choice and also to replace missing information during translation. We applied discriminative reranking using contextual information as additional features. In order to provide more choices for re-ranking, we generated n-best lists from multiple phrase-based statistical machine translation systems that varied in the Chinese word segmentation scheme used. We also built a model that merged the phrase tables generated by the different segmentation schemes. Furthermore, we used a lattice-based system combination model to combine the output from different systems. A combination of all of these systems was used to produce the n-best lists for re-ranking. For the BTEC task, we took a general approach that used lattice-based system combination of two systems, a standard phrase-based system and a hierarchical phrase-based system. We also tried to process some unknown words by replacing them with differently inflected forms of the same words that are known to the system.

2009

A Succinct N-gram Language Model
Taro Watanabe | Hajime Tsukada | Hideki Isozaki
Proceedings of the ACL-IJCNLP 2009 Conference Short Papers

Structural support vector machines for log-linear approach in statistical machine translation
Katsuhiko Hayashi | Taro Watanabe | Hajime Tsukada | Hideki Isozaki
Proceedings of the 6th International Workshop on Spoken Language Translation: Papers

Minimum error rate training (MERT) is a widely used learning method for statistical machine translation. In this paper, we present an SVM-based training method to enhance generalization ability. We extend MERT optimization by maximizing the margin between the reference and incorrect translations under the L2-norm prior to avoid overfitting. The translation accuracy obtained by our proposed methods is more stable across various conditions than that obtained by MERT. Our experimental results on the French-English WMT08 shared task show that the degradation of our proposed methods is smaller than that of MERT with small training data or out-of-domain test data.

2008

NTT statistical machine translation system for IWSLT 2008.
Katsuhito Sudoh | Taro Watanabe | Jun Suzuki | Hajime Tsukada | Hideki Isozaki
Proceedings of the 5th International Workshop on Spoken Language Translation: Evaluation Campaign

The NTT Statistical Machine Translation System consists of two primary components: a statistical machine translation decoder and a reranker. The decoder generates k-best translation candidates using hierarchical phrase-based translation based on synchronous context-free grammar. The decoder employs a linear feature combination among several real-valued scores on translation and language models. The reranker reorders the k-best translation candidates using Ranking SVMs with a large number of sparse features. This paper describes the two components and presents the results for the evaluation campaign of IWSLT 2008.

2007

Larger feature set approach for machine translation in IWSLT 2007
Taro Watanabe | Jun Suzuki | Katsuhito Sudoh | Hajime Tsukada | Hideki Isozaki
Proceedings of the Fourth International Workshop on Spoken Language Translation

The NTT Statistical Machine Translation System employs a large number of feature functions. First, k-best translation candidates are generated by an efficient decoding method of hierarchical phrase-based translation. Second, the k-best translations are reranked. In both steps, sparse binary features (on the order of millions) are integrated during the search. This paper gives the details of the two steps and shows the results for the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT) 2007.

Online Large-Margin Training for Statistical Machine Translation
Taro Watanabe | Jun Suzuki | Hajime Tsukada | Hideki Isozaki
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)

2006

NTT statistical machine translation for IWSLT 2006
Taro Watanabe | Jun Suzuki | Hajime Tsukada | Hideki Isozaki
Proceedings of the Third International Workshop on Spoken Language Translation: Evaluation Campaign

NTT System Description for the WMT2006 Shared Task
Taro Watanabe | Hajime Tsukada | Hideki Isozaki
Proceedings on the Workshop on Statistical Machine Translation

Left-to-Right Target Generation for Hierarchical Phrase-Based Translation
Taro Watanabe | Hajime Tsukada | Hideki Isozaki
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

2005

The NTT Statistical Machine Translation System for IWSLT2005
Hajime Tsukada | Taro Watanabe | Jun Suzuki | Hideto Kazawa | Hideki Isozaki
Proceedings of the Second International Workshop on Spoken Language Translation

Empirical Study of Utilizing Morph-Syntactic Information in SMT
Young-Sook Hwang | Taro Watanabe | Yutaka Sasaki
Second International Joint Conference on Natural Language Processing: Full Papers

2004

Example-based Machine Translation Based on Syntactic Transfer with Statistical Models
Kenji Imamura | Hideo Okuma | Taro Watanabe | Eiichiro Sumita
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics

Reordering Constraints for Phrase-Based Statistical Machine Translation
Richard Zens | Hermann Ney | Taro Watanabe | Eiichiro Sumita
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics

A Unified Approach in Speech-to-Speech Translation: Integrating Features of Speech recognition and Machine Translation
Ruiqiang Zhang | Genichiro Kikui | Hirofumi Yamamoto | Frank Soong | Taro Watanabe | Wai Kit Lo
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics

2003

Chunk-Based Statistical Translation
Taro Watanabe | Eiichiro Sumita | Hiroshi G. Okuno
Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics

Example-based decoding for statistical machine translation
Taro Watanabe | Eiichiro Sumita
Proceedings of Machine Translation Summit IX: Papers

This paper presents a decoder for statistical machine translation that can take advantage of the example-based machine translation framework. The decoder presented here is based on the greedy approach to the decoding problem, but the search is initiated from a similar translation extracted from a bilingual corpus. The experiments on multilingual translations showed that the proposed method was far superior to a word-by-word generation beam search algorithm.