Tadashi Nomoto


2022

The Causal News Corpus: Annotating Causal Relations in Event Sentences from News
Fiona Anting Tan | Ali Hürriyetoğlu | Tommaso Caselli | Nelleke Oostdijk | Tadashi Nomoto | Hansi Hettiarachchi | Iqra Ameer | Onur Uca | Farhana Ferdousi Liza | Tiancheng Hu
Proceedings of the Thirteenth Language Resources and Evaluation Conference

Despite the importance of understanding causality, corpora addressing causal relations are limited. There is a discrepancy between existing annotation guidelines for event causality and conventional causality corpora, which focus more on linguistics. Many guidelines restrict themselves to including only explicit relations or clause-based arguments. Therefore, we propose an annotation schema for event causality that addresses these concerns. We annotated 3,559 event sentences from protest event news with labels indicating whether they contain causal relations. Our corpus is known as the Causal News Corpus (CNC). A neural network built upon a state-of-the-art pre-trained language model performed well, with an 81.20% F1 score on the test set and 83.46% in 5-fold cross-validation. CNC is transferable across two external corpora: CausalTimeBank (CTB) and Penn Discourse Treebank (PDTB). Leveraging each of these external datasets for training, we achieved up to approximately 64% F1 on the CNC test set without additional fine-tuning. CNC also served as an effective training and pre-training dataset for the two external corpora. Lastly, we demonstrate the difficulty of our task for laypeople in a crowd-sourced annotation exercise. Our annotated corpus is publicly available, providing a valuable resource for causal text mining researchers.

2021

Grounding NBA Matchup Summaries
Tadashi Nomoto
Proceedings of the 14th International Conference on Natural Language Generation

The present paper summarizes an attempt we made to meet a shared task challenge on grounding machine-generated summaries of NBA matchups (https://github.com/ehudreiter/accuracySharedTask.git). In the first half, we discuss methods, and in the second, we report results, together with a discussion of what features may have affected performance.

2020

Meeting the 2020 Duolingo Challenge on a Shoestring
Tadashi Nomoto
Proceedings of the Fourth Workshop on Neural Generation and Translation

What is given below is a brief description of the two systems, called gFCONV and c-VAE, which we built in response to the 2020 Duolingo Challenge. Both are neural models that aim at disrupting the sentence representation the encoder generates, with an eye on increasing the diversity of sentences that emerge from the process. Importantly, we decided not to turn to external sources for extra ammunition, curious to know how far we could go while confining ourselves to the data released by Duolingo. gFCONV works by taking over a pre-trained sequence model and intercepting the output its encoder produces on its way to the decoder. c-VAE is a conditional variational auto-encoder that seeks diversity by blurring the representation the encoder derives. Experiments on a corpus constructed from the public Duolingo dataset, containing some 4 million pairs of sentences, found that gFCONV was a consistent winner over c-VAE, though both suffered heavily from low recall.

2019

Generating Paraphrases with Lean Vocabulary
Tadashi Nomoto
Proceedings of the 12th International Conference on Natural Language Generation

In this work, we examine whether it is possible to achieve state-of-the-art performance in paraphrase generation with a reduced vocabulary. Our approach consists of building a convolution-to-sequence model (Conv2Seq) partially guided by reinforcement learning, and training it on the subword representation of the input. The experiment on the Quora dataset, which contains over 140,000 pairs of sentences and corresponding paraphrases, found that with fewer than 1,000 token types, we were able to achieve performance exceeding that of the current state of the art.

2016

NEAL: A Neurally Enhanced Approach to Linking Citation and Reference
Tadashi Nomoto
Proceedings of the Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL)

2015

MediaMeter: A Global Monitor for Online News Coverage
Tadashi Nomoto
Proceedings of the First Workshop on Computing News Storylines

2014

Lexico-syntactic text simplification and compression with typed dependencies
Mandya Angrosh | Tadashi Nomoto | Advaith Siddharthan
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

2009

A Comparison of Model Free versus Model Intensive Approaches to Sentence Compression
Tadashi Nomoto
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

2008

A Generic Sentence Trimmer with CRFs
Tadashi Nomoto
Proceedings of ACL-08: HLT

2005

Bayesian Learning in Text Summarization
Tadashi Nomoto
Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing

2004

Multi-Engine Machine Translation with Voted Language Model
Tadashi Nomoto
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04)

2003

Predictive models of performance in multi-engine machine translation
Tadashi Nomoto
Proceedings of Machine Translation Summit IX: Papers

The paper describes a novel approach to Multi-Engine Machine Translation. We build statistical models of translation performance and use them to guide us in combining and selecting from the outputs of multiple MT engines. We empirically demonstrate that the MEMT system based on these models outperforms any of its component engines.

2002

Supervised Ranking in Open-Domain Text Summarization
Tadashi Nomoto | Yuji Matsumoto
Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics

1999

Learning Discourse Relations with Active Data Selection
Tadashi Nomoto | Yuji Matsumoto
1999 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora

1998

Discourse Parsing: A Decision Tree Approach
Tadashi Nomoto | Yuji Matsumoto
Sixth Workshop on Very Large Corpora

1997

Data Reliability and Its Effects on Automatic Abstracting
Tadashi Nomoto | Yuji Matsumoto
Fifth Workshop on Very Large Corpora

1996

Exploiting Text Structure for Topic Identification
Tadashi Nomoto | Yuji Matsumoto
Fourth Workshop on Very Large Corpora

1994

A Grammatico-Statistical Approach to Discourse Partitioning
Tadashi Nomoto | Yoshihiko Nitta
COLING 1994 Volume 2: The 15th International Conference on Computational Linguistics

1993

Resolving Zero Anaphora in Japanese
Tadashi Nomoto | Yoshihiko Nitta
Sixth Conference of the European Chapter of the Association for Computational Linguistics