Diachronic Analysis of Entities by Exploiting Wikipedia Page revisions
Pierpaolo Basile, Annalina Caputo, Seamus Lawless, Giovanni Semeraro
Abstract
In the last few years, the increasing availability of large corpora spanning several time periods has opened new opportunities for the diachronic analysis of language. This type of analysis can not only bring to light linguistic phenomena related to the shift of word meanings over time, but can also be used to study the impact that societal and cultural trends have on language change. This paper introduces a new resource, built upon Wikipedia page revisions, for performing the diachronic analysis of named entities. By analysing the whole history of Wikipedia internal links, this resource enables the analysis over time of changes in the relations between entities (concepts), surface forms (words), and the contexts surrounding entities and surface forms. We provide some use cases that demonstrate the impact of this resource on diachronic studies and delineate some possible future uses.
- Anthology ID: R19-1011
- Volume: Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)
- Month: September
- Year: 2019
- Address: Varna, Bulgaria
- Editors: Ruslan Mitkov, Galia Angelova
- Venue: RANLP
- Publisher: INCOMA Ltd.
- Pages: 84–91
- URL: https://preview.aclanthology.org/build-pipeline-with-new-library/R19-1011/
- DOI: 10.26615/978-954-452-056-4_011
- Cite (ACL): Pierpaolo Basile, Annalina Caputo, Seamus Lawless, and Giovanni Semeraro. 2019. Diachronic Analysis of Entities by Exploiting Wikipedia Page revisions. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019), pages 84–91, Varna, Bulgaria. INCOMA Ltd.
- Cite (Informal): Diachronic Analysis of Entities by Exploiting Wikipedia Page revisions (Basile et al., RANLP 2019)
- PDF: https://preview.aclanthology.org/build-pipeline-with-new-library/R19-1011.pdf
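
The abstract describes tracking relations between surface forms and entities through the history of Wikipedia internal links. As a rough illustration of that idea only, and not the authors' actual pipeline, the sketch below pulls `[[Target|surface form]]` links out of revision wikitext and counts (surface form, entity) pairs per year. The function names `extract_links` and `links_by_year`, the toy revisions, and the rule of skipping any link target containing a colon are all assumptions made for this example.

```python
import re
from collections import Counter, defaultdict

# Matches [[Target]] and [[Target|surface form]]; a section anchor (#...) is dropped.
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|([^\[\]]*))?\]\]")


def extract_links(wikitext):
    """Yield (surface_form, entity) pairs for the internal links of one revision."""
    for match in WIKILINK.finditer(wikitext):
        entity = match.group(1).strip()
        # Simplifying assumption: treat any target with a colon as a namespaced
        # link (File:, Category:, ...) and skip it. This also drops ordinary
        # article titles that happen to contain a colon.
        if ":" in entity:
            continue
        # When no surface form is given, the link text is the target itself.
        surface = (match.group(2) or entity).strip()
        yield surface, entity


def links_by_year(revisions):
    """Count (surface_form, entity) pairs per year.

    `revisions` is an iterable of (year, wikitext) tuples (hypothetical input shape).
    """
    counts = defaultdict(Counter)
    for year, wikitext in revisions:
        counts[year].update(extract_links(wikitext))
    return counts


# Toy revisions, invented for illustration.
revisions = [
    (2005, "[[Apple]] is a fruit. [[Apple Computer|Apple]] makes computers."),
    (2015, "[[Apple Inc.|Apple]] released a phone. [[Apple Inc.|Apple]] grew."),
]

for year, pairs in sorted(links_by_year(revisions).items()):
    for (surface, entity), n in pairs.most_common():
        print(year, repr(surface), "->", repr(entity), n)
```

Comparing the per-year counters for one surface form (here, "Apple") gives a crude view of how the entities it links to shift over time. A real study would read revisions from a Wikipedia history dump and use a proper wikitext parser rather than a regular expression.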