Abstract
We present the Weltmodell, a commonsense knowledge base that was automatically generated from aggregated dependency parse fragments gathered from over 3.5 million English language books. We leverage the magnitude and diversity of this dataset to arrive at close to ten million distinct N-ary commonsense facts using techniques from open-domain Information Extraction (IE). Furthermore, we compute a range of measures of association and distributional similarity on this data. We present the results of our efforts using a browsable web demonstrator and publicly release all generated data for use and discussion by the research community. In this paper, we give an overview of our knowledge acquisition method and representation model, and present our web demonstrator.
- Anthology ID:
- L14-1351
- Volume:
- Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
- Month:
- May
- Year:
- 2014
- Address:
- Reykjavik, Iceland
- Venue:
- LREC
- Publisher:
- European Language Resources Association (ELRA)
- Pages:
- 3272–3276
- URL:
- http://www.lrec-conf.org/proceedings/lrec2014/pdf/409_Paper.pdf
- Cite (ACL):
- Alan Akbik and Thilo Michael. 2014. The Weltmodell: A Data-Driven Commonsense Knowledge Base. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pages 3272–3276, Reykjavik, Iceland. European Language Resources Association (ELRA).
- Cite (Informal):
- The Weltmodell: A Data-Driven Commonsense Knowledge Base (Akbik & Michael, LREC 2014)