Abstract
We present and discuss problems in creating a lemmatised index to transcriptions of Bulgarian speech, including the prerequisites for such an index, and why we consider an index preferable to a search engine for this particular kind of text.
- Anthology ID:
- 2018.clib-1.22
- Volume:
- Proceedings of the Third International Conference on Computational Linguistics in Bulgaria (CLIB 2018)
- Month:
- May
- Year:
- 2018
- Address:
- Sofia, Bulgaria
- Venue:
- CLIB
- Publisher:
- Department of Computational Linguistics, Institute for Bulgarian Language, Bulgarian Academy of Sciences
- Pages:
- 177–184
- URL:
- https://aclanthology.org/2018.clib-1.22
- Cite (ACL):
- Marina Dzhonova, Kjetil Røa Hauge, and Yovka Tisheva. 2018. Parallel Web Display of Transcribed Spoken Bulgarian with its Normalised Version and an Indexed List of Lemmas. In Proceedings of the Third International Conference on Computational Linguistics in Bulgaria (CLIB 2018), pages 177–184, Sofia, Bulgaria. Department of Computational Linguistics, Institute for Bulgarian Language, Bulgarian Academy of Sciences.
- Cite (Informal):
- Parallel Web Display of Transcribed Spoken Bulgarian with its Normalised Version and an Indexed List of Lemmas (Dzhonova et al., CLIB 2018)
- PDF:
- https://preview.aclanthology.org/dois-2013-emnlp/2018.clib-1.22.pdf