Noriko Kando

2020

In this study, we construct a corpus of Japanese local assembly minutes. Under the Local Autonomy Law, all speeches in an assembly are transcribed into local assembly minutes, so the minutes form an extremely large body of text data. Our ultimate objectives are to summarize and present the arguments made in the assemblies, and to use the minutes as primary information for arguments in local politics. To achieve this, we structure all statements in the assembly minutes, focusing on the structure of the discussion, i.e., the extraction of question and answer pairs. We organized the shared task “QA Lab-PoliInfo” at NTCIR 14 and, as one of its subtasks, conducted a “segmentation task” to identify the scope of a single question and answer in the minutes. For the segmentation task, five teams submitted 24 runs. In the obtained results, the best recall was 1.000, the best precision was 0.940, and the best F-measure was 0.895.

2019

Opinion Mining with Deep Contextualized Embeddings
Wen-Bin Han | Noriko Kando
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop

Detecting opinion expressions is an essential task in opinion mining that can be extended to more advanced tasks. In this paper, we treat opinion expression detection as a sequence labeling task and incorporate different deep contextualized embedders into a state-of-the-art architecture composed of a bidirectional long short-term memory (BiLSTM) network and a conditional random field (CRF). Our experimental results show that different word embeddings can lead to contrasting results, and that the model achieves remarkable scores with deep contextualized embeddings. In particular, the BERT embedder significantly outperforms the ELMo embedder.

2018

Measuring Beginner Friendliness of Japanese Web Pages explaining Academic Concepts by Integrating Neural Image Feature and Text Features
Hayato Shiokawa | Kota Kawaguchi | Bingcai Han | Takehito Utsuro | Yasuhide Kawada | Masaharu Yoshioka | Noriko Kando
Proceedings of the 5th Workshop on Natural Language Processing Techniques for Educational Applications

Search engines are an important tool in modern academic study, but their results carry no measure of beginner friendliness. To make search engines more efficient for academic study, it is necessary to develop a technique for measuring the beginner friendliness of a Web page explaining academic concepts and to build an automatic measurement system. This paper studies how to integrate heterogeneous features: a neural image feature generated from an image of the Web page by a variant of a CNN (convolutional neural network), and text features extracted from the body text of the Web page's HTML file. The features are integrated within the framework of SVM classifier learning. Evaluation results show that the combined heterogeneous features perform better than either type of feature alone.

2016

Proceedings of the Open Knowledge Base and Question Answering Workshop (OKBQA 2016)
Key-Sun Choi | Christina Unger | Piek Vossen | Jin-Dong Kim | Noriko Kando | Axel-Cyrille Ngonga Ngomo
Proceedings of the Open Knowledge Base and Question Answering Workshop (OKBQA 2016)

2013

Time Series Topic Modeling and Bursty Topic Detection of Correlated News and Twitter
Daichi Koike | Yusuke Takahashi | Takehito Utsuro | Masaharu Yoshioka | Noriko Kando
Proceedings of the Sixth International Joint Conference on Natural Language Processing

2012

2010

Towards an optimal weighting of context words based on distance
Bernard Brosseau-Villeneuve | Jian-Yun Nie | Noriko Kando
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

RALI: Automatic Weighting of Text Window Distances
Bernard Brosseau-Villeneuve | Noriko Kando | Jian-Yun Nie
Proceedings of the 5th International Workshop on Semantic Evaluation

2009

2008

A Japanese-English Technical Lexicon for Translation and Language Research
Fredric Gey | David Kirk Evans | Noriko Kando
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

In this paper, we present a Japanese-English bilingual lexicon of technical terms. The lexicon was derived from the first and second NTCIR evaluation collections for research into cross-language information retrieval for Asian languages. While it can be utilized for translation between Japanese and English, the lexicon is also suitable for language research and language engineering. Since it is collection-derived, it contains instances of word variants and misspellings, which make it eminently suitable for further research. For a subset of the lexicon, we make the collection statistics available. In addition, we make available a Katakana subset suitable for transliteration research.

2006

Test Collections for Patent Retrieval and Patent Classification in the Fifth NTCIR Workshop
Atsushi Fujii | Makoto Iwayama | Noriko Kando
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC'06)

This paper describes the test collections produced for the Patent Retrieval Task in the Fifth NTCIR Workshop. We performed the invalidity search task, in which each participant group searches a patent collection for the patents that can invalidate the demand in an existing claim. For this purpose, we performed both document and passage retrieval tasks. We also performed the automatic patent classification task using the F-term classification system. The test collections will be available to the public for research purposes.

WoZ Simulation of Interactive Question Answering
Tsuneaki Kato | Jun’ichi Fukumoto | Fumito Masui | Noriko Kando
Proceedings of the Interactive Question Answering Workshop at HLT-NAACL 2006