Ryu Iida

2021

BERTAC: Enhancing Transformer-based Language Models with Adversarially Pretrained Convolutional Neural Networks
Jong-Hoon Oh | Ryu Iida | Julien Kloetzer | Kentaro Torisawa
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Transformer-based language models (TLMs), such as BERT, ALBERT and GPT-3, have shown strong performance in a wide range of NLP tasks and currently dominate the field of NLP. However, many researchers wonder whether these models can maintain their dominance forever. Of course, we do not have answers now, but, as an attempt to find better neural architectures and training schemes, we pretrain a simple CNN using a GAN-style learning scheme and Wikipedia data, and then integrate it with standard TLMs. We show that on the GLUE tasks, the combination of our pretrained CNN with ALBERT outperforms the original ALBERT and achieves a similar performance to that of SOTA. Furthermore, on open-domain QA (Quasar-T and SearchQA), the combination of the CNN with ALBERT or RoBERTa achieved stronger performance than SOTA and the original TLMs. We hope that this work provides a hint for developing a novel strong network architecture along with its training scheme. Our source code and models are available at https://github.com/nict-wisdom/bertac.

2019

Open-Domain Why-Question Answering with Adversarial Learning to Encode Answer Texts
Jong-Hoon Oh | Kazuma Kadowaki | Julien Kloetzer | Ryu Iida | Kentaro Torisawa
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

In this paper, we propose a method for why-question answering (why-QA) that uses an adversarial learning framework. Existing why-QA methods retrieve “answer passages” that usually consist of several sentences. These multi-sentence passages contain not only the reason sought by a why-question and its connection to the why-question, but also redundant and/or unrelated parts. We use our proposed “Adversarial networks for Generating compact-answer Representation” (AGR) to generate from a passage a vector representation of the non-redundant reason sought by a why-question and exploit the representation for judging whether the passage actually answers the why-question. Through a series of experiments using Japanese why-QA datasets, we show that these representations improve the performance of our why-QA neural model as well as that of a BERT-based why-QA model. We show that they also improve a state-of-the-art distantly supervised open-domain QA (DS-QA) method on publicly available English datasets, even though the target task is not why-QA.

Event Causality Recognition Exploiting Multiple Annotators’ Judgments and Background Knowledge
Kazuma Kadowaki | Ryu Iida | Kentaro Torisawa | Jong-Hoon Oh | Julien Kloetzer
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

We propose new BERT-based methods for recognizing event causality such as “smoke cigarettes” –> “die of lung cancer” written in web texts. In our methods, we grasp each annotator’s policy by training multiple classifiers, each of which predicts the labels given by a single annotator, and combine the resulting classifiers’ outputs to predict the final labels determined by majority vote. Furthermore, we investigate the effect of supplying background knowledge to our classifiers. Since BERT models are pre-trained with a large corpus, some sort of background knowledge for event causality may be learned during pre-training. Our experiments with a Japanese dataset suggest that this is actually the case: Performance improved when we pre-trained the BERT models with web texts containing a large number of event causalities instead of Wikipedia articles or randomly sampled web texts. However, this effect was limited. Therefore, we further improved performance by simply adding texts related to an input causality candidate as background knowledge to the input of the BERT models. We believe these findings indicate a promising future research direction.

This paper presents the construction of a corpus of manually revised texts that includes both pre-revision and post-revision information. To create such a corpus, we propose a procedure for revising a text from a discourse perspective, consisting of dividing a text into discourse units, organising and reordering groups of discourse units, and finally modifying referring and connective expressions, each of which imposes limits on freedom of revision. Following the procedure, six revisers with sufficient experience in either teaching Japanese or scoring Japanese essays revised 120 Japanese essays written by native speakers of Japanese. Comparing the original and revised texts, we found that certain manual revisions occurred frequently, e.g. thesis statements were frequently placed at the beginning of a text. We also evaluate text coherence using the original and revised texts on the task of pairwise information ordering, i.e. identifying the more coherent of two texts. The experimental results using two text coherence models demonstrated that the two models did not outperform the random baseline.

2013

Annotation for annotation - Toward eliciting implicit linguistic knowledge through annotation - (Project Note)
Takenobu Tokunaga | Ryu Iida | Koh Mitsuda
Proceedings of the 9th Joint ISO - ACL SIGSEM Workshop on Interoperable Semantic Annotation

Automatic Voice Selection in Japanese based on Various Linguistic Information
Ryu Iida | Takenobu Tokunaga
Proceedings of the 14th European Workshop on Natural Language Generation

Investigation of annotator’s behaviour using eye-tracking data
Ryu Iida | Koh Mitsuda | Takenobu Tokunaga
Proceedings of the 7th Linguistic Annotation Workshop and Interoperability with Discourse

Detecting Missing Annotation Disagreement using Eye Gaze Information
Koh Mitsuda | Ryu Iida | Takenobu Tokunaga
Proceedings of the 11th Workshop on Asian Language Resources

2012

The REX corpora: A collection of multimodal corpora of referring expressions in collaborative problem solving dialogues
Takenobu Tokunaga | Ryu Iida | Asuka Terai | Naoko Kuriyama
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

This paper describes a collection of multimodal corpora of referring expressions, the REX corpora. The corpora have two notable features, namely (1) they include time-aligned extra-linguistic information such as participant actions and eye-gaze on top of linguistic information, (2) dialogues were collected with various configurations in terms of the puzzle type, hinting and language. After describing how the corpora were constructed and sketching out each, we present an analysis of various statistics for the corpora with respect to the various configurations mentioned above. The analysis showed that the corpora have different characteristics in the number of utterances and referring expressions in a dialogue, the task completion time and the attributes used in the referring expressions. In this respect, we succeeded in constructing a collection of corpora that included a variety of characteristics by changing the configurations for each set of dialogues, as originally planned. The corpora are now under preparation for publication, to be used for research on human reference behaviour.

A Metric for Evaluating Discourse Coherence based on Coreference Resolution
Ryu Iida | Takenobu Tokunaga
Proceedings of COLING 2012: Posters

Identifying Temporal Relations by Sentence and Document Optimizations
Katsumasa Yoshikawa | Masayuki Asahara | Ryu Iida
Proceedings of COLING 2012: Posters

A Unified Probabilistic Approach to Referring Expressions
Kotaro Funakoshi | Mikio Nakano | Takenobu Tokunaga | Ryu Iida
Proceedings of the 13th Annual Meeting of the Special Interest Group on Discourse and Dialogue

Sentence Compression with Semantic Role Constraints
Katsumasa Yoshikawa | Ryu Iida | Tsutomu Hirao | Manabu Okumura
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2011

A Cross-Lingual ILP Solution to Zero Anaphora Resolution
Ryu Iida | Massimo Poesio
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

Multi-modal Reference Resolution in Situated Dialogue by Integrating Linguistic and Extra-Linguistic Clues
Ryu Iida | Masaaki Yasuhara | Takenobu Tokunaga
Proceedings of 5th International Joint Conference on Natural Language Processing

2010

Construction of bilingual multimodal corpora of referring expressions in collaborative problem solving
Takenobu Tokunaga | Ryu Iida | Masaaki Yasuhara | Asuka Terai | David Morris | Anja Belz
Proceedings of the Eighth Workshop on Asian Language Resources

Towards an Extrinsic Evaluation of Referring Expressions in Situated Dialogs
Philipp Spanger | Ryu Iida | Takenobu Tokunaga | Asuka Terai | Naoko Kuriyama
Proceedings of the 6th International Natural Language Generation Conference

Annotation Process Management Revisited
Dain Kaplan | Ryu Iida | Takenobu Tokunaga
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

Proper annotation process management is crucial to the construction of corpora, which are in turn indispensable to the data-driven techniques that have come to the forefront in NLP during the last two decades. It is still common to see ad-hoc tools created for a specific annotation project, but it is time this changed; creating such tools is costly in labor and time, and is secondary to corpus creation. In addition, such tools likely lack proper annotation process management, which becomes increasingly important as corpora grow in size and complexity. This paper first raises a list of ten needs that any general-purpose annotation system should address moving forward, such as user & role management, delegation & monitoring of work, diffing & merging annotators' work, versioning of corpora, multilingual support, import/export format flexibility, and so on. A framework to address these needs is then proposed, and the paper explains how proper annotation process management can benefit the creation and maintenance of corpora. The paper then introduces SLATE (Segment and Link-based Annotation Tool Enhanced), the second iteration of a web-based annotation tool, which is being rewritten to implement the proposed framework.

Incorporating Extra-Linguistic Information into Reference Resolution in Collaborative Task Dialogue
Ryu Iida | Syumpei Kobayashi | Takenobu Tokunaga
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

2009

A Japanese Corpus of Referring Expressions Used in a Situated Collaboration Task
Philipp Spanger | Masaaki Yasuhara | Ryu Iida | Takenobu Tokunaga
Proceedings of the 12th European Workshop on Natural Language Generation (ENLG 2009)

Automatic Extraction of Citation Contexts for Research Paper Summarization: A Coreference-chain based Approach
Dain Kaplan | Ryu Iida | Takenobu Tokunaga
Proceedings of the 2009 Workshop on Text and Citation Analysis for Scholarly Digital Libraries (NLPIR4DL)

Capturing Salience with a Trainable Cache Model for Zero-anaphora Resolution
Ryu Iida | Kentaro Inui | Yuji Matsumoto
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

2008

Gloss-Based Semantic Similarity Metrics for Predominant Sense Acquisition
Ryu Iida | Diana McCarthy | Rob Koeling
Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-I

2007

Annotating a Japanese Text Corpus with Predicate-Argument and Coreference Relations
Ryu Iida | Mamoru Komachi | Kentaro Inui | Yuji Matsumoto
Proceedings of the Linguistic Annotation Workshop

2006

Exploiting Syntactic Patterns as Clues in Zero-Anaphora Resolution
Ryu Iida | Kentaro Inui | Yuji Matsumoto
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

Augmenting a Semantic Verb Lexicon with a Large Scale Collection of Example Sentences
Kentaro Inui | Toru Hirano | Ryu Iida | Atsushi Fujita | Yuji Matsumoto
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

One of the crucial issues in semantic parsing is how to reduce the cost of collecting a sufficiently large amount of labeled data. This paper presents a new approach to cost-saving annotation of example sentences with predicate-argument structure information, taking Japanese as a target language. In this scheme, a large collection of unlabeled examples is first clustered and selectively sampled, and for each sampled cluster, only one representative example is given a label by a human annotator. The advantages of this approach are empirically supported by the results of our preliminary experiments, where we use an existing similarity function and a naive sampling strategy.