Abstract
In this paper we present the latest changes introduced to Inforex, a web-based system for qualitative and collaborative text corpora annotation and analysis. One of the most important changes is the release of the source code: the system is now available on GitHub (https://github.com/CLARIN-PL/Inforex) as an open-source project. The system can be easily set up and run in a Docker container, which simplifies the installation process. The major improvements include semi-automatic text annotation, multilingual text preprocessing using CLARIN-PL web services, morphological tagging of XML documents, an improved editor for annotation attributes, a batch annotation attribute editor, morphological disambiguation, and extended word sense annotation. This paper briefly describes these improvements. We also present two use cases in which various Inforex features were used and tested in real-life projects.
- Anthology ID:
- R19-1083
- Volume:
- Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)
- Month:
- September
- Year:
- 2019
- Address:
- Varna, Bulgaria
- Editors:
- Ruslan Mitkov, Galia Angelova
- Venue:
- RANLP
- Publisher:
- INCOMA Ltd.
- Pages:
- 711–719
- URL:
- https://aclanthology.org/R19-1083
- DOI:
- 10.26615/978-954-452-056-4_083
- Cite (ACL):
- Michał Marcińczuk and Marcin Oleksy. 2019. Inforex — a Collaborative System for Text Corpora Annotation and Analysis Goes Open. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019), pages 711–719, Varna, Bulgaria. INCOMA Ltd.
- Cite (Informal):
- Inforex — a Collaborative System for Text Corpora Annotation and Analysis Goes Open (Marcińczuk & Oleksy, RANLP 2019)
- PDF:
- https://preview.aclanthology.org/add_acl24_videos/R19-1083.pdf
- Code:
- CLARIN-PL/Inforex