Shi Yuan


An Evaluation of Information Extraction Tools for Identifying Health Claims in News Headlines
Shi Yuan | Bei Yu
Proceedings of the Workshop Events and Stories in the News 2018

This study evaluates the performance of four information extraction tools (extractors) on identifying health claims in health news headlines. A health claim is defined as a triplet: IV (what is being manipulated), DV (what is being measured) and their relation. Tools that can identify health claims provide the foundation for evaluating the accuracy of these claims against authoritative resources. The evaluation result shows that 26% headlines do not in-clude health claims, and all extractors face difficulty separating them from the rest. For those with health claims, OPENIE-5.0 performed the best with F-measure at 0.6 level for ex-tracting “IV-relation-DV”. However, the characteristic linguistic structures in health news headlines, such as incomplete sentences and non-verb relations, pose particular challenge to existing tools.