Can Topic Modelling benefit from Word Sense Information?
Adriana Ferrugento, Hugo Gonçalo Oliveira, Ana Alves, Filipe Rodrigues
Abstract
This paper proposes a new topic model that exploits word sense information in order to discover less redundant and more informative topics. Word sense information is obtained from WordNet and the discovered topics are groups of synsets, instead of mere surface words. A key feature is that all the known senses of a word are considered, with their probabilities. Alternative configurations of the model are described and compared to each other and to LDA, the most popular topic model. However, the obtained results suggest that there are no benefits of enriching LDA with word sense information.
- Anthology ID:
- L16-1540
- Volume:
- Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
- Month:
- May
- Year:
- 2016
- Address:
- Portorož, Slovenia
- Editors:
- Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
- Venue:
- LREC
- Publisher:
- European Language Resources Association (ELRA)
- Pages:
- 3387–3393
- URL:
- https://preview.aclanthology.org/remove-affiliations/L16-1540/
- Cite (ACL):
- Adriana Ferrugento, Hugo Gonçalo Oliveira, Ana Alves, and Filipe Rodrigues. 2016. Can Topic Modelling benefit from Word Sense Information? In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 3387–3393, Portorož, Slovenia. European Language Resources Association (ELRA).
- Cite (Informal):
- Can Topic Modelling benefit from Word Sense Information? (Ferrugento et al., LREC 2016)
- PDF:
- https://preview.aclanthology.org/remove-affiliations/L16-1540.pdf
- Code:
- aferrugento/SemLDA