Ege Erdogan


2024

Summarizing long pieces of text is a principal task in natural language processing with Machine Learning-based text generation models such as Large Language Models (LLM) being particularly suited to it. Yet these models are often used as black-boxes, making them hard to interpret and debug. This has led to calls by practitioners and regulatory bodies to improve the explainability of such models as they find ever more practical use. In this survey, we present a dual-perspective review of the intersection between explainability and summarization by reviewing the current state of explainable text summarization and also highlighting how summarization techniques are effectively employed to improve explanations.

2023

While recent advancements in the capabilities and widespread accessibility of generative language models, such as ChatGPT (OpenAI, 2022), have brought about various benefits by generating fluent human-like text, the task of distinguishing between human- and large language model (LLM) generated text has emerged as a crucial problem. These models can potentially deceive by generating artificial text that appears to be human-generated. This issue is particularly significant in domains such as law, education, and science, where ensuring the integrity of text is of the utmost importance. This survey provides an overview of the current approaches employed to differentiate between texts generated by humans and ChatGPT. We present an account of the different datasets constructed for detecting ChatGPT-generated text, the various methods utilized, what qualitative analyses into the characteristics of human versus ChatGPT-generated text have been performed, and finally, summarize our findings into general insights.