Construction of Text Summarization Corpus for the Credibility of Information on the Web
Masahiro Nakano, Hideyuki Shibuki, Rintaro Miyazaki, Madoka Ishioroshi, Koichi Kaneko, Tatsunori Mori
Abstract
Recently, the credibility of information on the Web has become an important issue. In addition to telling about content of source documents, indicating how to interpret the content, especially showing interpretation of the relation between statements appeared to contradict each other, is important for helping a user judge the credibility of information. In this paper, we will describe the purpose and the way in the construction of a text summarization corpus. Our purpose in the construction of the corpus includes the following three points; to collect Web documents relevant to several query sentences, to prepare gold standard data to evaluate smaller sub-processes in the extraction process and the summary generation process, to investigate the summaries made by human summarizers. The constructed corpus contains six query sentences, 24 manually-constructed summaries, and 24 collections of source Web documents. We also investigated how the descriptions of interpretation, which help a user judge the credibility of other descriptions in the summary, appear in the corpus. As a result, we confirmed that showing interpretation on conflicts is important for helping a user judge the credibility of information.- Anthology ID:
- L10-1083
- Volume:
- Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
- Month:
- May
- Year:
- 2010
- Address:
- Valletta, Malta
- Venue:
- LREC
- SIG:
- Publisher:
- European Language Resources Association (ELRA)
- Note:
- Pages:
- Language:
- URL:
- http://www.lrec-conf.org/proceedings/lrec2010/pdf/135_Paper.pdf
- DOI:
- Cite (ACL):
- Masahiro Nakano, Hideyuki Shibuki, Rintaro Miyazaki, Madoka Ishioroshi, Koichi Kaneko, and Tatsunori Mori. 2010. Construction of Text Summarization Corpus for the Credibility of Information on the Web. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10), Valletta, Malta. European Language Resources Association (ELRA).
- Cite (Informal):
- Construction of Text Summarization Corpus for the Credibility of Information on the Web (Nakano et al., LREC 2010)
- PDF:
- http://www.lrec-conf.org/proceedings/lrec2010/pdf/135_Paper.pdf