Abstract
We present the problem of categorizing web services according to a shallow ontology for presentation on a specialist portal, using their WSDL and associated textual documents found by a crawler. We treat this as a text classification problem and apply first information extraction (IE) techniques (voting using keywords weight according to their context), then machine learning (ML), and finally a combined approach in which ML has priority over weighted keywords, but the latter can still make up categorizations for services for which ML does not produce enough. We evaluate the techniques (using data manually annotated through the portal, which we also use as the training data for ML) according to standard IE measures for flat categorization as well as the Balanced Distance Metric (more suitable for ontological classification) and compare them with related work in web service categorization. The ML and combined categorization results are good and the system is designed to take users' contributions through the portal's Web 2.0 features as additional training data.- Anthology ID:
- L10-1110
- Volume:
- Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
- Month:
- May
- Year:
- 2010
- Address:
- Valletta, Malta
- Editors:
- Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias
- Venue:
- LREC
- SIG:
- Publisher:
- European Language Resources Association (ELRA)
- Note:
- Pages:
- Language:
- URL:
- http://www.lrec-conf.org/proceedings/lrec2010/pdf/170_Paper.pdf
- DOI:
- Cite (ACL):
- Adam Funk and Kalina Bontcheva. 2010. Ontology-Based Categorization of Web Services with Machine Learning. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10), Valletta, Malta. European Language Resources Association (ELRA).
- Cite (Informal):
- Ontology-Based Categorization of Web Services with Machine Learning (Funk & Bontcheva, LREC 2010)
- PDF:
- http://www.lrec-conf.org/proceedings/lrec2010/pdf/170_Paper.pdf