Building Korean Linguistic Resource for NLU Data Generation of Banking App CS Dialog System
Jeongwoo Yoon, Onyu Park, Changhoe Hwang, Gwanghoon Yoo, Eric Laporte, Jeesun Nam
Abstract
Natural language understanding (NLU) is integral to task-oriented dialog systems, but it demands a considerable amount of annotated training data to cover diverse utterances. In this study, we report the construction of a linguistic resource named FIAD (Financial Annotated Dataset) and its use to generate Korean annotated training data for NLU in the banking customer service (CS) domain. Through an empirical examination of a corpus of banking app reviews, we identified three linguistic patterns occurring in Korean request utterances: TOPIC (ENTITY, FEATURE), EVENT, and DISCOURSE MARKER. We represented them in LGGs (Local Grammar Graphs) to generate annotated data covering diverse intents and entities. To assess the practicality of the resource, we evaluated the performance of DIET-only (Intent: 0.91 / Topic [entity+feature]: 0.83), DIET+HANBERT (I: 0.94 / T: 0.85), DIET+KoBERT (I: 0.94 / T: 0.86), and DIET+KorBERT (I: 0.95 / T: 0.84) models trained on FIAD-generated data to extract various types of semantic items.
- Anthology ID:
- 2022.pandl-1.4
- Volume:
- Proceedings of the First Workshop on Pattern-based Approaches to NLP in the Age of Deep Learning
- Month:
- October
- Year:
- 2022
- Address:
- Gyeongju, Republic of Korea
- Editors:
- Laura Chiticariu, Yoav Goldberg, Gus Hahn-Powell, Clayton T. Morrison, Aakanksha Naik, Rebecca Sharp, Mihai Surdeanu, Marco Valenzuela-Escárcega, Enrique Noriega-Atala
- Venue:
- PANDL
- Publisher:
- International Conference on Computational Linguistics
- Pages:
- 29–37
- URL:
- https://aclanthology.org/2022.pandl-1.4
- Cite (ACL):
- Jeongwoo Yoon, Onyu Park, Changhoe Hwang, Gwanghoon Yoo, Eric Laporte, and Jeesun Nam. 2022. Building Korean Linguistic Resource for NLU Data Generation of Banking App CS Dialog System. In Proceedings of the First Workshop on Pattern-based Approaches to NLP in the Age of Deep Learning, pages 29–37, Gyeongju, Republic of Korea. International Conference on Computational Linguistics.
- Cite (Informal):
- Building Korean Linguistic Resource for NLU Data Generation of Banking App CS Dialog System (Yoon et al., PANDL 2022)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-1/2022.pandl-1.4.pdf