Supplementary Data for
'KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts'
---

We include a software-domain knowledge base (KB) extracted using the KB-LDA model.
The model that induced this KB was trained with a target of 50 topics.

Included files:
--

- topicTokens.txt - List of learned instance topics. For each topic k, we provide the top 10 tokens, as ranked by sigma_k.
- topicConcepts.txt - List of learned topic concepts. For each topic k, we provide the top 10 concepts recovered for that topic.
- top100Relations.txt - List of the top 100 relations learned by the model.
  Each entry describes a relation between an instance topic for the subject (kS),
                                          a relation topic for the verb (kV), and
                                          an instance topic for the object (kO).
  We additionally provide the relation probability according to pi_r and the top 5 tokens for each of the topics.
- subsumptionByMaxSpanningTree.txt - List of 49 subsumptions forming a maximum spanning tree over topic relations.
  These subsumptions make up the KB ontology.
  Each entry describes a subsumption between a topic for the concept/hypernym (kC) and a topic for the instance/hyponym (kI).
  We additionally provide the subsumption probability according to pi_o and the top 5 tokens for each of the topics.
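
A spanning tree over 50 topics has 49 edges, which matches the 49 subsumptions listed. As a rough illustration of how such a tree could be selected from pairwise subsumption scores, the sketch below runs Kruskal's algorithm with edges visited from highest to lowest weight. It treats edges as undirected and uses made-up scores. The function name, the (weight, kC, kI) tuple layout, and the toy numbers are assumptions for illustration only; they do not describe the authors' procedure or the format of the released file.

```python
def maximum_spanning_tree(num_topics, weighted_edges):
    """Kruskal's algorithm, taking the heaviest edges first.

    weighted_edges: iterable of (weight, k_c, k_i) tuples, where weight
    stands in for a subsumption score (hypothetical layout).
    Returns the kept edges in the order they were added.
    """
    parent = list(range(num_topics))  # union-find forest, one node per topic

    def find(x):
        # Follow parent links to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for weight, k_c, k_i in sorted(weighted_edges, reverse=True):
        root_c, root_i = find(k_c), find(k_i)
        if root_c != root_i:  # edge joins two components, so no cycle
            parent[root_c] = root_i
            tree.append((weight, k_c, k_i))
    return tree


# Hypothetical toy scores for 4 topics (not taken from the released KB).
toy_edges = [
    (0.9, 0, 1),
    (0.8, 1, 2),
    (0.7, 0, 2),  # would close the cycle 0-1-2, so it is skipped
    (0.4, 2, 3),
    (0.1, 0, 3),
]
tree = maximum_spanning_tree(4, toy_edges)
for weight, k_c, k_i in tree:
    print(f"kC={k_c} kI={k_i} score={weight}")
```

With 4 toy topics the tree keeps 3 edges, in the same way that 50 topics yield 49 subsumptions. If edge direction matters for the hierarchy, a directed variant (a maximum spanning arborescence) would be the analogous choice.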
