Abstract
We present a 3-step framework that learns categories and their instances from natural language text based on given training examples. Step 1 extracts the contexts of training examples from text as rules describing the category, using part of speech, capitalization and category membership as features. Step 2 selects high-quality rules using two consecutive filters. The first filter is based on the number of rule occurrences; the second takes two non-independent characteristics into account: a rule's precision and the number of instances it acquires. Our framework adapts the filters' threshold values to the respective category and textual genre by automatically evaluating the rule sets resulting from different filter settings and selecting the best-performing rule set. Step 3 then identifies new instances of a category by applying the filtered rules within a previously proposed algorithm. We inspect the rule filters' impact on rule set quality and evaluate our framework by learning first names, last names, professions and cities from a hitherto unexplored textual genre, search engine result snippets, and achieve high precision on average.
- Anthology ID: L12-1045
- Volume: Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
- Month: May
- Year: 2012
- Address: Istanbul, Turkey
- Editors: Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
- Venue: LREC
- Publisher: European Language Resources Association (ELRA)
- Pages: 1235–1239
- URL: http://www.lrec-conf.org/proceedings/lrec2012/pdf/181_Paper.pdf
- Cite (ACL): Antje Schlaf and Robert Remus. 2012. Learning Categories and their Instances by Contextual Features. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 1235–1239, Istanbul, Turkey. European Language Resources Association (ELRA).
- Cite (Informal): Learning Categories and their Instances by Contextual Features (Schlaf & Remus, LREC 2012)