Masatoshi Yoshikawa


2021

Multi-TimeLine Summarization (MTLS): Improving Timeline Summarization by Generating Multiple Summaries
Yi Yu | Adam Jatowt | Antoine Doucet | Kazunari Sugiyama | Masatoshi Yoshikawa
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

In this paper, we address a novel task, Multiple TimeLine Summarization (MTLS), which extends the flexibility and versatility of TimeLine Summarization (TLS). Given any collection of time-stamped news articles, MTLS automatically discovers important yet different stories and generates a corresponding timeline for each story. To achieve this, we propose a novel unsupervised summarization framework based on two-stage affinity propagation. We also introduce a quantitative evaluation measure for MTLS based on previous TLS evaluation methods. Experimental results show that our MTLS framework is highly effective and that the MTLS task can give better results than TLS.

2020

Annotating and Analyzing Biased Sentences in News Articles using Crowdsourcing
Sora Lim | Adam Jatowt | Michael Färber | Masatoshi Yoshikawa
Proceedings of the Twelfth Language Resources and Evaluation Conference

The spread of biased news and its consumption by readers have become a considerable issue. Researchers from multiple domains, including social science and media studies, have made efforts to mitigate this media bias issue. Specifically, various techniques ranging from natural language processing to machine learning have been used to help determine news bias automatically. However, due to the lack of publicly available datasets in this field, especially ones containing labels concerning bias on a fine-grained level (e.g., on the sentence level), it is still challenging to develop methods for effectively identifying bias embedded in news articles. In this paper, we propose a novel news bias dataset which facilitates the development and evaluation of approaches for detecting subtle bias in news articles and for understanding the characteristics of biased sentences. Our dataset consists of 966 sentences from 46 English-language news articles covering 4 different events and contains labels concerning bias on the sentence level. For scalability reasons, the labels were obtained through crowdsourcing. Our dataset can be used for analyzing news bias, as well as for developing and evaluating methods for news bias detection. It can also serve as a resource for related research, including work focusing on fake news detection.

2003

Learning Bilingual Translations from Comparable Corpora to Cross-Language Information Retrieval: Hybrid Statistics-based and Linguistics-based Approach
Fatiha Sadat | Masatoshi Yoshikawa | Shunsuke Uemura
Proceedings of the Sixth International Workshop on Information Retrieval with Asian Languages

Cross-Language Information Retrieval Based on Category Matching Between Language Versions of a Web Directory
Fuminori Kimura | Akira Maeda | Masatoshi Yoshikawa | Shunsuke Uemura
Proceedings of the Sixth International Workshop on Information Retrieval with Asian Languages

Bilingual Terminology Acquisition from Comparable Corpora and Phrasal Translation to Cross-Language Information Retrieval
Fatiha Sadat | Masatoshi Yoshikawa | Shunsuke Uemura
The Companion Volume to the Proceedings of 41st Annual Meeting of the Association for Computational Linguistics