Jungyun Seo
Also published as: Jung Yun Seo
2024
Exploring the Use of Natural Language Descriptions of Intents for Large Language Models in Zero-shot Intent Classification
Taesuk Hong | Youbin Ahn | Dongkyu Lee | Joongbo Shin | Seungpil Won | Janghoon Han | Stanley Jungkyu Choi | Jungyun Seo
Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue
In task-oriented dialogue systems, intent classification is crucial for accurately understanding user queries and providing appropriate services. This study explores the use of intent descriptions with large language models for unseen domain intent classification. By examining the effects of description quality, quantity, and input length management, we identify practical guidelines for optimizing performance. Our experiments using FLAN-T5 3B demonstrate that 1) high-quality descriptions for both training and testing significantly improve accuracy, 2) diversity in training descriptions doesn’t greatly affect performance, and 3) off-the-shelf rankers selecting around ten intent options reduce input length without compromising performance. We emphasize that high-quality testing descriptions have a greater impact on accuracy than training descriptions. These findings provide practical guidelines for using intent descriptions with large language models to achieve effective and efficient intent classification in low-resource settings.
2021
Fine-grained Post-training for Improving Retrieval-based Dialogue Systems
Janghoon Han | Taesuk Hong | Byoungjae Kim | Youngjoong Ko | Jungyun Seo
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Retrieval-based dialogue systems display outstanding performance when pre-trained language models such as bidirectional encoder representations from transformers (BERT) are used. During multi-turn response selection, BERT focuses on training the relationship between the context, consisting of multiple utterances, and the response. However, this training method is insufficient for capturing the relations between the utterances within the context, leading to an incomplete understanding of the context flow that is required to select a response. To address this issue, we propose a new fine-grained post-training method that reflects the characteristics of multi-turn dialogue. Specifically, the model learns utterance-level interactions by training on every short context-response pair in a dialogue session. Furthermore, through a new training objective, utterance relevance classification, the model learns the semantic relevance and coherence between dialogue utterances. Experimental results show that our model achieves a new state of the art with significant margins on three benchmark datasets, suggesting that the fine-grained post-training method is highly effective for the response selection task.
2020
Multi-Task Learning for Knowledge Graph Completion with Pre-trained Language Models
Bosung Kim | Taesuk Hong | Youngjoong Ko | Jungyun Seo
Proceedings of the 28th International Conference on Computational Linguistics
As research on utilizing human knowledge in natural language processing has attracted considerable attention in recent years, knowledge graph (KG) completion has come into the spotlight. Recently, a new knowledge graph completion method using a pre-trained language model, such as KG-BERT, was presented and showed high performance. However, its scores on ranking metrics such as Hits@k are still behind those of state-of-the-art models. We claim that there are two main reasons: 1) failure to sufficiently learn relational information in knowledge graphs, and 2) difficulty in picking out the correct answer from lexically similar candidates. In this paper, we propose an effective multi-task learning method to overcome these limitations. By combining relation prediction and relevance ranking tasks with our target link prediction, the proposed model can learn more relational properties in KGs and perform properly even when lexical similarity occurs. Experimental results show that we not only largely improve the ranking performance compared to KG-BERT but also achieve state-of-the-art performance in Mean Rank and Hits@10 on the WN18RR dataset.
NLPDove at SemEval-2020 Task 12: Improving Offensive Language Detection with Cross-lingual Transfer
Hwijeen Ahn | Jimin Sun | Chan Young Park | Jungyun Seo
Proceedings of the Fourteenth Workshop on Semantic Evaluation
This paper describes our approach to the task of identifying offensive languages in a multilingual setting. We investigate two data augmentation strategies: using additional semi-supervised labels with different thresholds and cross-lingual transfer with data selection. Leveraging the semi-supervised dataset resulted in performance improvements compared to the baseline trained solely with the manually-annotated dataset. We propose a new metric, Translation Embedding Distance, to measure the transferability of instances for cross-lingual data selection. We also introduce various preprocessing steps tailored for social media text along with methods to fine-tune the pre-trained multilingual BERT (mBERT) for offensive language identification. Our multilingual systems achieved competitive results in Greek, Danish, and Turkish at OffensEval 2020.
2019
ThisIsCompetition at SemEval-2019 Task 9: BERT is unstable for out-of-domain samples
Cheoneum Park | Juae Kim | Hyeon-gu Lee | Reinald Kim Amplayo | Harksoo Kim | Jungyun Seo | Changki Lee
Proceedings of the 13th International Workshop on Semantic Evaluation
This paper describes our system, Joint Encoders for Stable Suggestion Inference (JESSI), for the SemEval 2019 Task 9: Suggestion Mining from Online Reviews and Forums. JESSI is a combination of two sentence encoders: (a) one using multiple pre-trained word embeddings learned from log-bilinear regression (GloVe) and translation (CoVe) models, and (b) one on top of word encodings from a pre-trained deep bidirectional transformer (BERT). We include a domain adversarial training module when training for out-of-domain samples. Our experiments show that while BERT performs exceptionally well for in-domain samples, several runs of the model show that it is unstable for out-of-domain samples. The problem is mitigated tremendously by (1) combining BERT with a non-BERT encoder, and (2) using an RNN-based classifier on top of BERT. Our final models obtained second place with 77.78% F-Score on Subtask A (i.e. in-domain) and achieved an F-Score of 79.59% on Subtask B (i.e. out-of-domain), even without using any additional external data.
2017
A Method to Generate a Machine-Labeled Data for Biomedical Named Entity Recognition with Various Sub-Domains
Juae Kim | Sunjae Kwon | Youngjoong Ko | Jungyun Seo
Proceedings of the International Workshop on Digital Disease Detection using Social Media 2017 (DDDSM-2017)
Biomedical Named Entity (NE) recognition is a core technique for various works in the biomedical domain. In previous studies, machine learning approaches have shown better performance than dictionary-based and rule-based approaches because there are too many terminological variations of biomedical NEs and new biomedical NEs are constantly generated. To achieve high performance with a machine learning algorithm, good-quality corpora are required. However, such corpora are difficult to obtain because annotating a biomedical corpus for machine learning is extremely time-consuming and costly. In addition, most previous corpora are insufficient for high-level tasks because they cannot cover various domains. Therefore, we propose a method for generating a large amount of machine-labeled data that covers various domains. First, we generate initial machine-labeled data using a chunker and MetaMap: the chunker is developed to extract only biomedical NEs from manually annotated data, and MetaMap is used to annotate the category of each biomedical NE. We then apply a self-training approach to bootstrap the performance of the initial machine-labeled data. In our experiments, the biomedical NE recognition system trained with our proposed machine-labeled data achieves much higher performance, outperforming a system that uses MetaMap alone by 26.03%p in F1-score.
2016
KSAnswer: Question-answering System of Kangwon National University and Sogang University in the 2016 BioASQ Challenge
Hyeon-gu Lee | Minkyoung Kim | Harksoo Kim | Juae Kim | Sunjae Kwon | Jungyun Seo | Yi-reun Kim | Jung-Kyu Choi
Proceedings of the Fourth BioASQ workshop
2015
A Simultaneous Recognition Framework for the Spoken Language Understanding Module of Intelligent Personal Assistant Software on Smart Phones
Changsu Lee | Youngjoong Ko | Jungyun Seo
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)
Improved Entity Linking with User History and News Articles
Soyun Jeong | Youngmin Park | Sangwoo Kang | Jungyun Seo
Proceedings of the 29th Pacific Asia Conference on Language, Information and Computation: Posters
2008
Speakers’ Intention Prediction Using Statistics of Multi-level Features in a Schedule Management Domain
Donghyun Kim | Hyunjung Lee | Choong-Nyoung Seon | Harksoo Kim | Jungyun Seo
Proceedings of ACL-08: HLT, Short Papers
Information extraction using finite state automata and syllable n-grams in a mobile environment
Choong-Nyoung Seon | Harksoo Kim | Jungyun Seo
Proceedings of the ACL-08: HLT Workshop on Mobile Language Processing
2005
Improving Korean Speech Acts Analysis by Using Shrinkage and Discourse Stack
Kyungsun Kim | Youngjoong Ko | Jungyun Seo
Second International Joint Conference on Natural Language Processing: Full Papers
2004
Learning with Unlabeled Data for Text Categorization Using a Bootstrapping and a Feature Projection Technique
Youngjoong Ko | Jungyun Seo
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04)
2002
The Grammatical Function Analysis between Korean Adnoun Clause and Noun Phrase by Using Support Vector Machines
Songwook Lee | Tae-Yeoub Jang | Jungyun Seo
COLING 2002: The 19th International Conference on Computational Linguistics
Text Categorization using Feature Projections
Youngjoong Ko | Jungyun Seo
COLING 2002: The 19th International Conference on Computational Linguistics
Automatic Text Categorization using the Importance of Sentences
Youngjoong Ko | Jinwoo Park | Jungyun Seo
COLING 2002: The 19th International Conference on Computational Linguistics
A Reliable Indexing Method for a Practical QA System
Harksoo Kim | Jungyun Seo
COLING-02: Multilingual Summarization and Question Answering
2001
MAYA: A Fast Question-answering System Based on a Predictive Answer Indexer
Harksoo Kim | Kyungsun Kim | Gary Geunbae Lee | Jungyun Seo
Proceedings of the ACL 2001 Workshop on Open-Domain Question Answering
2000
Automatic Text Categorization by Unsupervised Learning
Youngjoong Ko | Jungyun Seo
COLING 2000 Volume 1: The 18th International Conference on Computational Linguistics
Corpus-Based Learning of Compound Noun Indexing
Byung-Kwan Kwak | Jee-Hyub Kim | Geunbae Lee | Jung Yun Seo
ACL-2000 Workshop on Recent Advances in Natural Language Processing and Information Retrieval
1999
Analysis System of Speech Acts and Discourse Structures Using Maximum Entropy Model
Won Seug Choi | Jeong-Mi Cho | Jungyun Seo
Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics
Anaphora Resolution using Extended Centering Algorithm in a Multi-modal Dialogue System
Harksoo Kim | Jeong-Mi Cho | Jungyun Seo
The Relation of Discourse/Dialogue Structure and Reference
Dual Distributional Verb Sense Disambiguation with Small Corpora and Machine Readable Dictionaries
Jeong-Mi Cho | Jungyun Seo | Gil Chang Kim
Unsupervised Learning in Natural Language Processing
1995
A Robust Parser Based on Syntactic Information
Kong Joo Lee | Cheol Jung Kweon | Jungyun Seo | Gil Chang Kim
Seventh Conference of the European Chapter of the Association for Computational Linguistics
1990
Transforming Syntactic Graphs Into Semantic Graphs
Hae-Chang Rim | Robert F. Simmons | Jungyun Seo
28th Annual Meeting of the Association for Computational Linguistics
Co-authors
- Youngjoong Ko 9
- Harksoo Kim 7
- Jeong-Mi Cho 3
- Taesuk Hong 3
- Juae Kim 3
- Janghoon Han 2
- Gil Chang Kim 2
- Kyungsun Kim 2
- Sunjae Kwon 2
- Hyeon-gu Lee 2
- Gary Geunbae Lee 2
- Choong-Nyoung Seon 2
- Robert F. Simmons 2
- Hwijeen Ahn 1
- Youbin Ahn 1
- Reinald Kim Amplayo 1
- Stanley Jungkyu Choi 1
- Won Seug Choi 1
- Jung-Kyu Choi 1
- Tae-Yeoub Jang 1
- Soyun Jeong 1
- Sangwoo Kang 1
- Bosung Kim 1
- Byoungjae Kim 1
- Donghyun Kim 1
- Jee-Hyub Kim 1
- Minkyoung Kim 1
- Yi-Reun Kim 1
- Byung-Kwan Kwak 1
- Cheol Jung Kweon 1
- Dongkyu Lee 1
- Songwook Lee 1
- Kong Joo Lee 1
- Hyunjung Lee 1
- Changsu Lee 1
- Changki Lee 1
- Chan Young Park 1
- Jinwoo Park 1
- Cheoneum Park 1
- Youngmin Park 1
- Hae Chang Rim 1
- Joongbo Shin 1
- Jimin Sun 1
- Seungpil Won 1