Learning to Define Terms in the Software Domain

Vidhisha Balachandran; Dheeraj Rajagopal; Rose Catherine Kanjirathinkal; William Cohen

doi:10.18653/v1/W18-6122

Abstract

One way to test a person’s knowledge of a domain is to ask them to define domain-specific terms. Here, we investigate the task of automatically generating definitions of technical terms by reading text from the technical domain. Specifically, we learn definitions of software entities from a large corpus built from the user forum Stack Overflow. To model definitions, we train a language model and incorporate additional domain-specific information such as word co-occurrence and ontological category information. Our approach improves over previous baselines by 2 BLEU points on the definition generation task. Our experiments also show the additional challenges associated with the task and the shortcomings of language-model-based architectures for definition generation.

Anthology ID: W18-6122
Volume: Proceedings of the 2018 EMNLP Workshop W-NUT: The 4th Workshop on Noisy User-generated Text
Month: November
Year: 2018
Address: Brussels, Belgium
Editors: Wei Xu, Alan Ritter, Tim Baldwin, Afshin Rahimi
Venue: WNUT
Publisher: Association for Computational Linguistics
Pages: 164–172
URL: https://aclanthology.org/W18-6122
DOI: 10.18653/v1/W18-6122
Cite (ACL): Vidhisha Balachandran, Dheeraj Rajagopal, Rose Catherine Kanjirathinkal, and William Cohen. 2018. Learning to Define Terms in the Software Domain. In Proceedings of the 2018 EMNLP Workshop W-NUT: The 4th Workshop on Noisy User-generated Text, pages 164–172, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal): Learning to Define Terms in the Software Domain (Balachandran et al., WNUT 2018)
PDF: https://preview.aclanthology.org/ingest-acl-2023-videos/W18-6122.pdf
