Abstract
Enabling code reuse is an important goal in software engineering, and it depends crucially on effective code search interfaces. We propose to ground word meanings in source code and to use such language-code mappings to enable a search engine for programming library code where users can pose queries in English. We exploit the fact that there are large programming language libraries which are documented both via formally specified function or method signatures and via descriptions written in natural language. Automatically learned associations between words in descriptions and items in signatures allow us to use queries formulated in English to retrieve methods which are not documented via natural language descriptions, based only on their signatures. We show that the rankings returned by our model substantially outperform a strong term-matching baseline.
- Anthology ID:
- L14-1042
- Volume:
- Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
- Month:
- May
- Year:
- 2014
- Address:
- Reykjavik, Iceland
- Editors:
- Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
- Venue:
- LREC
- Publisher:
- European Language Resources Association (ELRA)
- Pages:
- 3248–3252
- URL:
- http://www.lrec-conf.org/proceedings/lrec2014/pdf/106_Paper.pdf
- Cite (ACL):
- Huijing Deng and Grzegorz Chrupała. 2014. Semantic approaches to software component retrieval with English queries. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pages 3248–3252, Reykjavik, Iceland. European Language Resources Association (ELRA).
- Cite (Informal):
- Semantic approaches to software component retrieval with English queries (Deng & Chrupała, LREC 2014)
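The idea described in the abstract, learning associations between description words and signature items and then ranking undocumented methods by signature alone, can be illustrated with a minimal sketch. This is not the authors' model: it assumes a simple co-occurrence estimate of P(signature token | word), and the toy corpus, method names, and the functions `train_associations`, `score`, and `rank` are all hypothetical.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus of documented methods:
# (English description, tokens taken from the method signature).
DOCUMENTED = [
    ("read all lines from a file", ["Path", "List", "String"]),
    ("write text to a file", ["Path", "String", "void"]),
    ("parse a number from text", ["String", "int"]),
]

def train_associations(pairs):
    """Estimate P(signature token | word) from co-occurrence counts."""
    cooc = defaultdict(Counter)
    for description, signature in pairs:
        for word in set(description.lower().split()):
            for token in set(signature):
                cooc[word][token] += 1
    model = {}
    for word, counts in cooc.items():
        total = sum(counts.values())
        model[word] = {tok: c / total for tok, c in counts.items()}
    return model

def score(query, signature, model, smoothing=1e-3):
    """Log-score of a signature: each token is explained by the
    average association with the known query words (assumed scheme)."""
    words = [w for w in query.lower().split() if w in model]
    if not words:
        return float("-inf")
    total = 0.0
    for token in set(signature):
        p = sum(model[w].get(token, 0.0) for w in words) / len(words)
        total += math.log(p + smoothing)
    return total

def rank(query, candidates, model):
    """Order candidate method names by descending score."""
    return sorted(candidates,
                  key=lambda name: score(query, candidates[name], model),
                  reverse=True)

# Hypothetical methods that have signatures but no descriptions.
UNDOCUMENTED = {
    "loadLines": ["Path", "List", "String"],
    "toInt": ["String", "int"],
}

model = train_associations(DOCUMENTED)
ranking = rank("read lines file", UNDOCUMENTED, model)
print(ranking)
```

Here the query never matches any text attached to the candidates, since they have none; the ranking comes entirely from the learned word-to-signature associations. A term-matching baseline would have nothing to match against in this setting.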