The Weltmodell: A Data-Driven Commonsense Knowledge Base

Alan Akbik, Thilo Michael


Abstract
We present the Weltmodell, a commonsense knowledge base that was automatically generated from aggregated dependency parse fragments gathered from over 3.5 million English-language books. We leverage the magnitude and diversity of this dataset to arrive at close to ten million distinct N-ary commonsense facts using techniques from open-domain Information Extraction (IE). Furthermore, we compute a range of measures of association and distributional similarity on this data. We present the results of our efforts using a browsable web demonstrator and publicly release all generated data for use and discussion by the research community. In this paper, we give an overview of our knowledge acquisition method and representation model, and present our web demonstrator.
Anthology ID:
L14-1351
Volume:
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
Month:
May
Year:
2014
Address:
Reykjavik, Iceland
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
3272–3276
URL:
http://www.lrec-conf.org/proceedings/lrec2014/pdf/409_Paper.pdf
Cite (ACL):
Alan Akbik and Thilo Michael. 2014. The Weltmodell: A Data-Driven Commonsense Knowledge Base. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pages 3272–3276, Reykjavik, Iceland. European Language Resources Association (ELRA).
Cite (Informal):
The Weltmodell: A Data-Driven Commonsense Knowledge Base (Akbik & Michael, LREC 2014)