Bridging the Socioeconomic Gap in Education: A Hybrid AI and Human Annotation Approach

Nahed Abdelgaber, Labiba Jahan, Arham Vinit Doshi, Rishi Suri, Hamza Reza Pavel, Jia Zhang


Abstract
Students’ academic performance is influenced by various demographic factors, among which socioeconomic class is one of the most widely researched and debated. Computer Science research traditionally prioritizes computationally definable problems, yet the scarcity of high-quality labeled data and ethical concerns surrounding the mining of personal information can pose barriers to exploring topics such as the impact of socioeconomic status (SES) on students’ education. Overcoming these barriers may involve automating the collection and annotation of high-quality language data from diverse social groups in collaboration with human annotators. We therefore focus on gathering unstructured narratives written by low-SES students on Internet forums, using machine learning (ML) models combined with human insight. We developed a hybrid data collection model that semi-automatically retrieved narratives from Reddit and produced a dataset five times larger than the seed dataset. Additionally, we compared the performance of traditional ML models with recent large language models (LLMs) in classifying narratives written by low-SES students, and analyzed the collected data to extract insights into the socioeconomic challenges these students encounter and the solutions they pursue.
Anthology ID:
2025.conll-1.23
Volume:
Proceedings of the 29th Conference on Computational Natural Language Learning
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Gemma Boleda, Michael Roth
Venues:
CoNLL | WS
Publisher:
Association for Computational Linguistics
Pages:
348–364
URL:
https://preview.aclanthology.org/acl25-workshop-ingestion/2025.conll-1.23/
Cite (ACL):
Nahed Abdelgaber, Labiba Jahan, Arham Vinit Doshi, Rishi Suri, Hamza Reza Pavel, and Jia Zhang. 2025. Bridging the Socioeconomic Gap in Education: A Hybrid AI and Human Annotation Approach. In Proceedings of the 29th Conference on Computational Natural Language Learning, pages 348–364, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Bridging the Socioeconomic Gap in Education: A Hybrid AI and Human Annotation Approach (Abdelgaber et al., CoNLL 2025)
PDF:
https://preview.aclanthology.org/acl25-workshop-ingestion/2025.conll-1.23.pdf
Software:
 2025.conll-1.23.software.zip