Comparative Analysis of Portuguese Named Entities Recognition Tools
Daniela Amaral, Evandro Fonseca, Lucelene Lopes, Renata Vieira
Abstract
This paper describes an experiment comparing four tools for recognizing named entities in Portuguese texts. The experiment was carried out on the HAREM corpora, a gold standard for named entity recognition in Portuguese. The tools evaluated are based on natural language processing techniques and on machine learning. Specifically, one of the tools is based on Conditional Random Fields, a supervised machine learning model that has been used for named entity recognition in several languages, while the other tools follow more traditional natural language processing approaches. The comparison results indicate advantages for different tools depending on the class of named entity. Despite this balance among the tools, we conclude by pointing out foreseeable advantages of the machine learning based tool.
- Anthology ID:
- L14-1425
- Volume:
- Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
- Month:
- May
- Year:
- 2014
- Address:
- Reykjavik, Iceland
- Venue:
- LREC
- Publisher:
- European Language Resources Association (ELRA)
- Pages:
- 2554–2558
- URL:
- http://www.lrec-conf.org/proceedings/lrec2014/pdf/513_Paper.pdf
- Cite (ACL):
- Daniela Amaral, Evandro Fonseca, Lucelene Lopes, and Renata Vieira. 2014. Comparative Analysis of Portuguese Named Entities Recognition Tools. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pages 2554–2558, Reykjavik, Iceland. European Language Resources Association (ELRA).
- Cite (Informal):
- Comparative Analysis of Portuguese Named Entities Recognition Tools (Amaral et al., LREC 2014)