A Study on Expert Sourcing Enterprise Question Collection and Classification
Yuan Luo, Thomas Boucher, Tolga Oral, David Osofsky, Sara Weber
Abstract
Large enterprises, such as IBM, accumulate petabytes of free-text data within their organizations. To mine this big data, a critical ability is to enable meaningful question answering beyond keyword search. In this paper, we present a study on the characteristics and classification of IBM sales questions. The characteristics are analyzed both semantically and syntactically, from which a question classification guideline evolves. We adopted an enterprise-level expert sourcing approach to gather questions, annotate questions based on the guideline, and manage the quality of annotations via enhanced inter-annotator agreement analysis. We developed a question feature extraction system and experimented with rule-based, statistical, and hybrid question classifiers. We share our annotated corpus of questions and report our experimental results. Statistical classifiers based separately on n-grams and hand-crafted rule features give reasonable macro-F1 scores of 61.7% and 63.1%, respectively. The rule-based classifier gives a macro-F1 of 77.1%. The hybrid classifier with n-gram and rule features using a second guess model further improves the macro-F1 to 83.9%.
- Anthology ID:
- L14-1233
- Volume:
- Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
- Month:
- May
- Year:
- 2014
- Address:
- Reykjavik, Iceland
- Editors:
- Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
- Venue:
- LREC
- Publisher:
- European Language Resources Association (ELRA)
- Pages:
- 181–188
- URL:
- http://www.lrec-conf.org/proceedings/lrec2014/pdf/25_Paper.pdf
- Cite (ACL):
- Yuan Luo, Thomas Boucher, Tolga Oral, David Osofsky, and Sara Weber. 2014. A Study on Expert Sourcing Enterprise Question Collection and Classification. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pages 181–188, Reykjavik, Iceland. European Language Resources Association (ELRA).
- Cite (Informal):
- A Study on Expert Sourcing Enterprise Question Collection and Classification (Luo et al., LREC 2014)