Annisa Maulida Ningtyas
2026
AnnoHID: LLM-Assisted Annotation Framework for Low-Resource Medical Texts
Annisa Maulida Ningtyas | Guntur Budi Herwanto | Yunita Sari | Rifki Afina Putri | Filip Kovacevic | Alaa El-Ebshihy | Varvara Arzt | Florina Piroi
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)
This paper introduces AnnoHID, a semi-automated annotation framework designed for medical texts in low-resource languages. The system combines pre-annotation by large language models (LLMs) with human validation to support efficient and consistent annotation. We demonstrate its application to Bahasa Indonesia medical social media texts from Alodokter, a medical Q&A platform, for Named Entity Recognition (NER) and Medical Concept Normalization (MCN). We conducted a user study with 21 participants to demonstrate the effectiveness of AnnoHID. The results show that LLM-assisted annotation yields higher inter-annotator agreement for both NER (κ = 0.76) and MCN (κ = 0.63), and that human review improves the raw LLM NER output, raising the F1 score from 0.39 to 0.45. However, LLM assistance did not reduce annotation time and may introduce normalization bias in MCN. The framework is multilingual, human-in-the-loop, and interoperable with standard medical terminologies such as SNOMED-CT. Future work focuses on mitigating pre-annotation bias, reducing annotation overhead, and scaling evaluations to larger datasets and additional low-resource languages.
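As an illustration of the agreement figures reported above, the following is a minimal sketch of Cohen's kappa for two annotators. The abstract does not state which kappa variant the authors used, so the choice of Cohen's kappa, the function name `cohen_kappa`, and the entity labels in the usage note are all assumptions made for illustration.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    Assumption: two annotators and one categorical label per item.
    The paper may use a different agreement statistic.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")

    n = len(labels_a)
    # Observed agreement: share of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement from each annotator's own label distribution.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)
```

For example, with hypothetical labels `["DISEASE", "DRUG", "O", "O"]` and `["DISEASE", "O", "O", "O"]`, the observed agreement is 0.75, the chance agreement is 0.4375, and kappa is 5/9.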
2020
ARTU / TU Wien and Artificial Researcher@ LongSumm 20
Alaa El-Ebshihy | Annisa Maulida Ningtyas | Linda Andersson | Florina Piroi | Andreas Rauber
Proceedings of the First Workshop on Scholarly Document Processing
In this paper, we present our approach to the LongSumm 2020 Shared Task at the 1st Workshop on Scholarly Document Processing. The objective of the task is to generate long summaries, both abstractive and extractive, that cover the salient information in a given scientific article. Our approach is inspired by the concept of Argumentative Zoning (AZ), which defines the main rhetorical structure of scientific articles. We define two aspects that a scientific paper summary should cover, namely the Claim/Method and Conclusion/Result aspects. We use a Solr index to expand the sentences of the paper abstract: each abstract sentence in a given publication is formulated as a query to retrieve similar sentences from the body text of the same document. We then apply a sentence selection algorithm described in previous literature to select sentences for the final summary so that it covers the two aforementioned aspects.
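The abstract-expansion step described above can be sketched as follows. This is only an illustration under stated assumptions: a small in-memory TF-IDF cosine ranking stands in for the Solr index, Solr's own ranking is not reproduced, and the names `tokenize`, `expand_abstract`, and `top_k` are hypothetical rather than taken from the paper.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def expand_abstract(abstract_sentences, body_sentences, top_k=2):
    """Use each abstract sentence as a query over the body sentences.

    Assumption: TF-IDF cosine similarity replaces the Solr index
    used in the paper. Returns {abstract sentence: [body sentences]}.
    """
    docs = [Counter(tokenize(s)) for s in body_sentences]
    n = len(docs)
    # Document frequency of each term across body sentences.
    df = Counter(term for doc in docs for term in doc)
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

    def vectorize(counts):
        # Terms unseen in the body carry no weight.
        return {t: tf * idf[t] for t, tf in counts.items() if t in idf}

    def cosine(u, v):
        dot = sum(w * v.get(t, 0.0) for t, w in u.items())
        norm_u = math.sqrt(sum(w * w for w in u.values()))
        norm_v = math.sqrt(sum(w * w for w in v.values()))
        return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

    body_vectors = [vectorize(doc) for doc in docs]
    expanded = {}
    for query in abstract_sentences:
        query_vector = vectorize(Counter(tokenize(query)))
        scored = sorted(
            ((cosine(query_vector, bv), i) for i, bv in enumerate(body_vectors)),
            key=lambda pair: (-pair[0], pair[1]),
        )
        # Keep the best matches, dropping sentences with no overlap.
        expanded[query] = [body_sentences[i] for score, i in scored[:top_k] if score > 0]
    return expanded
```

The retrieved sentences would then be passed to a sentence selection step that balances the Claim/Method and Conclusion/Result aspects. That step is not sketched here because the abstract does not specify the algorithm.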