Coding Open-Ended Responses using Pseudo Response Generation by Large Language Models
Yuki Zenimoto, Ryo Hasegawa, Takehito Utsuro, Masaharu Yoshioka, Noriko Kando
Abstract
Survey research using open-ended responses is an important method that contributes to the discovery of unknown issues and new needs. However, survey research generally requires time-consuming and costly manual data processing, which makes it difficult to analyze large datasets. To address this issue, we propose an LLM-based method to automate parts of the grounded theory approach (GTA), a representative approach to qualitative data analysis. We generated and annotated pseudo open-ended responses, and used them as the training data for the coding procedures of GTA. Through evaluations, we showed that the models trained with pseudo open-ended responses are quite effective compared with those trained with manually annotated open-ended responses. We also demonstrate that the LLM-based approach is highly efficient and cost-saving compared to a human-based approach.
- Anthology ID: 2024.naacl-srw.26
- Volume: Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 4: Student Research Workshop)
- Month: June
- Year: 2024
- Address: Mexico City, Mexico
- Editors: Yang (Trista) Cao, Isabel Papadimitriou, Anaelia Ovalle
- Venue: NAACL
- Publisher: Association for Computational Linguistics
- Pages: 242–254
- URL: https://aclanthology.org/2024.naacl-srw.26
- Cite (ACL): Yuki Zenimoto, Ryo Hasegawa, Takehito Utsuro, Masaharu Yoshioka, and Noriko Kando. 2024. Coding Open-Ended Responses using Pseudo Response Generation by Large Language Models. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 4: Student Research Workshop), pages 242–254, Mexico City, Mexico. Association for Computational Linguistics.
- Cite (Informal): Coding Open-Ended Responses using Pseudo Response Generation by Large Language Models (Zenimoto et al., NAACL 2024)
- PDF: https://preview.aclanthology.org/ingestion-checklist/2024.naacl-srw.26.pdf