@inproceedings{kim-etal-2019-data,
    title = "Data Augmentation by Data Noising for Open-vocabulary Slots in Spoken Language Understanding",
    author = "Kim, Hwa-Yeon  and
      Roh, Yoon-Hyung  and
      Kim, Young-Kil",
    editor = "Kar, Sudipta  and
      Nadeem, Farah  and
      Burdick, Laura  and
      Durrett, Greg  and
      Han, Na-Rae",
    booktitle = "Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Student Research Workshop",
    month = jun,
    year = "2019",
    address = "Minneapolis, Minnesota",
    publisher = "Association for Computational Linguistics",
    url = "https://preview.aclanthology.org/display_plenaries/N19-3014/",
    doi = "10.18653/v1/N19-3014",
    pages = "97--102",
    abstract = "One of the main challenges in Spoken Language Understanding (SLU) is dealing with `open-vocabulary' slots. Recently, SLU models based on neural network were proposed, but it is still difficult to recognize the slots of unknown words or `open-vocabulary' slots because of the high cost of creating a manually tagged SLU dataset. This paper proposes data noising, which reflects the characteristics of the `open-vocabulary' slots, for data augmentation. We applied it to an attention based bi-directional recurrent neural network (Liu and Lane, 2016) and experimented with three datasets: Airline Travel Information System (ATIS), Snips, and MIT-Restaurant. We achieved performance improvements of up to 0.57{\%} and 3.25 in intent prediction (accuracy) and slot filling (f1-score), respectively. Our method is advantageous because it does not require additional memory and it can be applied simultaneously with the training process of the model."
}