Modeling topic dependencies in semantically coherent text spans with copulas

Georgios Balikas, Hesam Amoualian, Marianne Clausel, Eric Gaussier, Massih R. Amini


Abstract
The exchangeability assumption in topic models like Latent Dirichlet Allocation (LDA) often results in inferring inconsistent topics for the words of text spans like noun-phrases, which are usually expected to be topically coherent. We propose copulaLDA, which extends LDA by integrating part of the text structure into the model and relaxing the conditional independence assumption between the word-specific latent topics given the per-document topic distributions. To this end, we assume that the words of text spans like noun-phrases are topically bound, and we model this dependence with copulas. We demonstrate empirically the effectiveness of copulaLDA on both intrinsic and extrinsic evaluation tasks on several publicly available corpora.
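To illustrate the idea of binding the topics of a span with a copula, here is a minimal sketch. It is not the authors' implementation. The abstract does not say which copula family copulaLDA uses, so a Gaussian copula with a single correlation parameter is swapped in as an assumption. The function name `sample_span_topics` and the parameters `theta`, `span_len` and `rho` are hypothetical. The sketch draws dependent uniforms from the copula and pushes each through the inverse CDF of the document's topic distribution, so each word's topic still follows `theta` marginally while the words of a span tend to agree.

```python
import math
import numpy as np

def sample_span_topics(theta, span_len, rho, rng):
    """Draw one topic per word of a span, coupled by a Gaussian copula.

    theta    : per-document topic distribution (1-D array summing to 1)
    span_len : number of words in the span (e.g. a noun-phrase)
    rho      : assumed pairwise correlation in [0, 1]; 0 gives independent
               draws as in plain LDA, 1 forces one topic for the whole span
    """
    # Equicorrelated standard normals: one shared factor plus per-word noise.
    shared = rng.standard_normal()
    noise = rng.standard_normal(span_len)
    z = math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * noise

    # Standard normal CDF turns each z into a uniform on [0, 1].
    u = np.array([0.5 * (1.0 + math.erf(v / math.sqrt(2.0))) for v in z])

    # Inverse CDF of the categorical distribution defined by theta.
    cdf = np.cumsum(theta)
    topics = np.searchsorted(cdf, u, side="right")

    # Guard against rounding when the cumulative sum falls just short of 1.
    return np.minimum(topics, len(theta) - 1)

rng = np.random.default_rng(0)
theta = np.array([0.5, 0.3, 0.2])
print(sample_span_topics(theta, span_len=3, rho=0.0, rng=rng))  # independent draws
print(sample_span_topics(theta, span_len=3, rho=1.0, rng=rng))  # fully bound span
```

Intermediate values of `rho` interpolate between the two extremes, which is the kind of relaxed, non-independent coupling the abstract describes for topically bound spans.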
Anthology ID:
C16-1166
Volume:
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers
Month:
December
Year:
2016
Address:
Osaka, Japan
Venue:
COLING
Publisher:
The COLING 2016 Organizing Committee
Pages:
1767–1776
URL:
https://aclanthology.org/C16-1166
Cite (ACL):
Georgios Balikas, Hesam Amoualian, Marianne Clausel, Eric Gaussier, and Massih R. Amini. 2016. Modeling topic dependencies in semantically coherent text spans with copulas. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pages 1767–1776, Osaka, Japan. The COLING 2016 Organizing Committee.
Cite (Informal):
Modeling topic dependencies in semantically coherent text spans with copulas (Balikas et al., COLING 2016)
PDF:
https://preview.aclanthology.org/ingestion-script-update/C16-1166.pdf
Code
 balikasg/topicModelling