2020
pdf
abs
MRC Examples Answerable by BERT without a Question Are Less Effective in MRC Model Training
Hongyu Li
|
Tengyang Chen
|
Shuting Bai
|
Takehito Utsuro
|
Yasuhide Kawada
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing: Student Research Workshop
Models developed for Machine Reading Comprehension (MRC) are asked to predict an answer from a question and its related context. However, there exist cases that can be correctly answered by an MRC model using BERT, where only the context is provided without including the question. In this paper, these types of examples are referred to as “easy to answer”, while others are as “hard to answer”, i.e., unanswerable by an MRC model using BERT without being provided the question. Based on classifying examples as answerable or unanswerable by BERT without the given question, we propose a method based on BERT that splits the training examples from the MRC dataset SQuAD1.1 into those that are “easy to answer” or “hard to answer”. Experimental evaluation from a comparison of two models, one trained only with “easy to answer” examples and the other with “hard to answer” examples demonstrates that the latter outperforms the former.
pdf
abs
Developing a How-to Tip Machine Comprehension Dataset and its Evaluation in Machine Comprehension by BERT
Tengyang Chen
|
Hongyu Li
|
Miho Kasamatsu
|
Takehito Utsuro
|
Yasuhide Kawada
Proceedings of the Third Workshop on Fact Extraction and VERification (FEVER)
In the field of factoid question answering (QA), it is known that the state-of-the-art technology has achieved an accuracy comparable to that of humans in a certain benchmark challenge. On the other hand, in the area of non-factoid QA, there is still a limited number of datasets for training QA models, i.e., machine comprehension models. Considering such a situation within the field of the non-factoid QA, this paper aims to develop a dataset for training Japanese how-to tip QA models. This paper applies one of the state-of-the-art machine comprehension models to the Japanese how-to tip QA dataset. The trained how-to tip QA model is also compared with a factoid QA model trained with a Japanese factoid QA dataset. Evaluation results revealed that the how-to tip machine comprehension performance was almost comparative with that of the factoid machine comprehension even with the training data size reduced to around 4% of the factoid machine comprehension. Thus, the how-to tip machine comprehension task requires much less training data compared with the factoid machine comprehension task.
2018
pdf
abs
Measuring Beginner Friendliness of Japanese Web Pages explaining Academic Concepts by Integrating Neural Image Feature and Text Features
Hayato Shiokawa
|
Kota Kawaguchi
|
Bingcai Han
|
Takehito Utsuro
|
Yasuhide Kawada
|
Masaharu Yoshioka
|
Noriko Kando
Proceedings of the 5th Workshop on Natural Language Processing Techniques for Educational Applications
Search engine is an important tool of modern academic study, but the results are lack of measurement of beginner friendliness. In order to improve the efficiency of using search engine for academic study, it is necessary to invent a technique of measuring the beginner friendliness of a Web page explaining academic concepts and to build an automatic measurement system. This paper studies how to integrate heterogeneous features such as a neural image feature generated from the image of the Web page by a variant of CNN (convolutional neural network) as well as text features extracted from the body text of the HTML file of the Web page. Integration is performed through the framework of the SVM classifier learning. Evaluation results show that heterogeneous features perform better than each individual type of features.
2016
pdf
abs
Analyzing Time Series Changes of Correlation between Market Share and Concerns on Companies measured through Search Engine Suggests
Takakazu Imada
|
Yusuke Inoue
|
Lei Chen
|
Syunya Doi
|
Tian Nie
|
Chen Zhao
|
Takehito Utsuro
|
Yasuhide Kawada
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
This paper proposes how to utilize a search engine in order to predict market shares. We propose to compare rates of concerns of those who search for Web pages among several companies which supply products, given a specific products domain. We measure concerns of those who search for Web pages through search engine suggests. Then, we analyze whether rates of concerns of those who search for Web pages have certain correlation with actual market share. We show that those statistics have certain correlations. We finally propose how to predict the market share of a specific product genre based on the rates of concerns of those who search for Web pages.
2011
pdf
Example-based Translation of Japanese Functional Expressions utilizing Semantic Equivalence Classes
Yusuke Abe
|
Takafumi Suzuki
|
Bing Liang
|
Takehito Utsuro
|
Mikio Yamamoto
|
Suguru Matsuyoshi
|
Yasuhide Kawada
Proceedings of the 4th Workshop on Patent Translation