Abstract
Recent developments in computer vision applications based on machine learning models allow real-time object detection, segmentation and captioning in image or video streams. The paper presents an extension of the 80 COCO categories into a novel ontology with more than 700 classes covering 130 thematic subdomains related to Sport, Transport, Arts and Security. The development of an image dataset for object segmentation was accelerated by using machine learning to generate objects’ boundaries and classes automatically. The Multilingual image dataset contains over 20,000 images and 200,000 annotations. It was used to pre-train 130 models for object detection and classification. We present the approach established for developing the new models and integrating them into an application and evaluation framework.
- Anthology ID:
- 2022.clib-1.22
- Volume:
- Proceedings of the 5th International Conference on Computational Linguistics in Bulgaria (CLIB 2022)
- Month:
- September
- Year:
- 2022
- Address:
- Sofia, Bulgaria
- Venue:
- CLIB
- Publisher:
- Department of Computational Linguistics, IBL -- BAS
- Pages:
- 190–201
- URL:
- https://aclanthology.org/2022.clib-1.22
- Cite (ACL):
- Jordan Kralev and Svetla Koeva. 2022. Image Models for large-scale Object Detection and Classification. In Proceedings of the 5th International Conference on Computational Linguistics in Bulgaria (CLIB 2022), pages 190–201, Sofia, Bulgaria. Department of Computational Linguistics, IBL -- BAS.
- Cite (Informal):
- Image Models for large-scale Object Detection and Classification (Kralev & Koeva, CLIB 2022)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/2022.clib-1.22.pdf
- Data
- MS COCO