Proceedings of the First International Conference on Natural Language Processing and Artificial Intelligence for Cyber Security

Ruslan Mitkov, Saad Ezzini, Tharindu Ranasinghe, Ignatius Ezeani, Nouran Khallaf, Cengiz Acarturk, Matthew Bradbury, Mo El-Haj, Paul Rayson (Editors)

Anthology ID: 2024.nlpaics-1
Month: July
Year: 2024
Address: Lancaster, UK
Venue: NLPAICS
Publisher: International Conference on Natural Language Processing and Artificial Intelligence for Cyber Security
URL: https://preview.aclanthology.org/fix-sig-urls/2024.nlpaics-1/
PDF: https://preview.aclanthology.org/fix-sig-urls/2024.nlpaics-1.pdf


Predatory Publication of AI-Generated Research Papers
Lizzie Burgiss | Ben Tatum | Christopher Henshaw | Madison Boswell | Alan Michaels

In an academic ecosystem where faculty face a “publish or perish” mantra, there are distinct openings for predatory publishers. Defined loosely, these are journals that value profits over scholarly cultivation and prey upon unsuspecting authors. Prior research has built lists of suspected predatory publishers to inform colleagues of risks, yet few studies quantify common characteristics exhibited by these publishers. To test hypotheses around these journals, we probed the behavior of 256 suspected predatory journals drawn from Beall’s and Kscien’s lists. Using active open source intelligence techniques, we tested the existence and extent of review processes, publication fees, operating location, and communication patterns. We submitted five different ChatGPT4-authored papers to our targeted publishers; these papers were accepted and/or published by 55 journals. By characterizing the responses, we developed a journal assessment rubric to aid authors seeking to publish their work. In the process, we also identified a presumptive shadow network of publishing companies using these practices based on analysis of websites, addresses, and shared employees. All underlying data for our study is open sourced for other researchers to draw their own conclusions.

Explainability of machine learning approaches in forensic linguistics: a case study in geolinguistic authorship profiling
Dana Roemling | Yves Scherrer | Aleksandra Miletić

Forensic authorship profiling uses linguistic markers to infer characteristics about an author of a text. This task is paralleled in dialect classification, where a prediction is made about the linguistic variety of a text based on the text itself. While there have been significant advances in recent years in variety classification, forensic linguistics rarely relies on these approaches due to their lack of transparency, among other reasons. In this paper we therefore explore the explainability of machine learning approaches considering the forensic context. We focus on variety classification as a means of geolinguistic profiling of unknown texts based on social media data from the German-speaking area. For this, we identify the lexical items that are the most impactful for the variety classification. We find that the extracted lexical features are indeed representative of their respective varieties and note that the trained models also rely on place names for classifications.

Metric-Oriented Pretraining of Neural Source Code Summarisation Transformers to Enable more Secure Software Development
Jesse Phillips | Mo El-Haj | Tracy Hall

Source code summaries give developers and maintainers vital information about source code methods. These summaries aid with the security of software systems as they can be used to improve developer and maintainer understanding of code, with the aim of reducing the number of bugs and vulnerabilities. However, writing these summaries takes up the developers’ time, and the summaries are often missing, incomplete, or outdated. Neural source code summarisation addresses these issues by summarising source code automatically. Current solutions use Transformer neural networks to achieve this. We present CodeSumBART, a BART-base model for neural source code summarisation, pretrained on a dataset of Java source code methods and English method summaries. We present a new approach to training Transformers for neural source code summarisation by using epoch validation results to optimise the performance of the model. We found that in our approach, using larger n-gram precision BLEU metrics for epoch validation, such as BLEU-4, produces better performing models than other common NLG metrics.
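The idea of choosing a checkpoint by its epoch validation BLEU-4 score can be sketched as follows. This is a minimal illustration, not the CodeSumBART training code: the function names, the whitespace tokenisation, the single-reference setup, and the unsmoothed corpus-level BLEU are all assumptions made for the example.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu4(references, hypotheses):
    """Unsmoothed corpus-level BLEU-4 with one reference per hypothesis."""
    matches = [0] * 4
    totals = [0] * 4
    ref_len = hyp_len = 0
    for ref, hyp in zip(references, hypotheses):
        r, h = ref.split(), hyp.split()
        ref_len += len(r)
        hyp_len += len(h)
        for n in range(1, 5):
            ref_counts, hyp_counts = ngrams(r, n), ngrams(h, n)
            # Clipped matches: each hypothesis n-gram counts at most as
            # often as it appears in the reference.
            matches[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            totals[n - 1] += sum(hyp_counts.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # without smoothing, any empty precision gives zero
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / 4
    brevity_penalty = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision)


def best_epoch(references, outputs_by_epoch):
    """Return the epoch whose validation outputs score the highest BLEU-4."""
    return max(outputs_by_epoch, key=lambda e: corpus_bleu4(references, outputs_by_epoch[e]))


# Hypothetical validation data: one reference summary, outputs from two epochs.
refs = ["returns the sum of two integers"]
outputs = {
    1: ["returns sum"],
    2: ["returns the sum of two numbers"],
}
chosen = best_epoch(refs, outputs)
```

In a real training loop the outputs for each epoch would come from the model being validated, and the checkpoint for the chosen epoch would be kept.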

Comprehensive threat analysis and systematic mapping of CVEs to MITRE framework
Stefano Simonetto | Peter Bosch

This research addresses the significance of threat intelligence by presenting a practical approach to generate a labeled dataset for mapping CVEs to MITRE. By linking Common Vulnerabilities and Exposures (CVEs) with the MITRE ATT&CK framework, the paper outlines a scheme that integrates the extensive CVE database with the techniques and tactics of the ATT&CK knowledge base. The core contribution lies in a detailed methodology designed to map CVEs onto corresponding ATT&CK techniques and, in turn, to tactics through a data-driven perspective, centering specifically on the labeling provided by NIST. This procedure enhances our understanding of cybersecurity threats and yields a structured, labeled dataset essential for practical threat analysis. It facilitates and improves the recognition and categorization of cybersecurity threats. Furthermore, the paper analyses the dataset in the context of cyber-threat intelligence. It highlights how vulnerability understanding and awareness have improved over the years through the continuous effort to place vulnerabilities in the context of an attack by linking it to abstract techniques. The dataset allows for a comprehensive cyber attack stage and kill-chain analysis. It serves as a training resource for algorithm development in various use cases, such as threat detection and large language model fine-tuning.

Predicting Software Vulnerability Trends with Multi-Recurrent Neural Networks: A Time Series Forecasting Approach
Abanisenioluwa K. Orojo | Webster C. Elumelu | Oluwatamilore O. Orojo

Predicting software vulnerabilities effectively is crucial for enhancing cybersecurity measures in an increasingly digital world. Traditional forecasting models often struggle with the complexity and dynamics of software vulnerability data, necessitating more advanced methodologies. This paper introduces a novel approach using Multi-Recurrent Neural Networks (MRN), which integrates multiple memory mechanisms and offers a balanced complexity suitable for time-series data. We compare MRNs against traditional models like ARIMA, Feedforward Multilayer Perceptrons (FFMLP), Simple Recurrent Networks (SRN), and Long Short-Term Memory (LSTM) networks. Our results demonstrate that MRNs consistently outperform these models, especially in settings with limited data or shorter forecasting horizons. MRNs show a remarkable ability to handle complex patterns and long-term dependencies more efficiently than other models, highlighting their potential for broader applications beyond cybersecurity. The findings suggest that MRNs can significantly improve the accuracy and efficiency of predictive analytics in cybersecurity, paving the way for their adoption in practical applications and further exploration in other predictive tasks.

Measuring the Effect of Induced Persona on Agenda Creation in Language-based Agents for Cyber Deception
Lewis Newsham | Daniel Prince | Ryan Hyland

This paper presents the SANDMAN architecture for cyber deception, employing Language Agents to create convincing human simulacra. These “Deceptive Agents” serve as advanced cyber decoys, designed to engage attackers to extend the observation period of attack behaviours. This research demonstrates the viability of persona-driven Deceptive Agents to generate plausible human activity to enhance the effectiveness of cyber deception strategies. Through experimentation, measurement and analysis, we illustrate how a prompt schema induces specific “personalities”, defined by the five-factor model of personality, in Large Language Models to generate measurably diverse, and plausible, behaviours.

Comparative Analysis of Natural Language Processing Models for Malware Spam Email Identification
Francisco Jáñez-Martino | Eduardo Fidalgo | Rocío Alaiz-Rodríguez | Andrés Carofilis | Alicia Martínez-Mendoza

Spam email is one of the main vectors of cyberattacks containing scams and spreading malware. Spam emails can contain malicious and external links and attachments with hidden malicious code. Hence, cybersecurity experts seek to detect this type of email to provide earlier and more detailed warnings for organizations and users. This work is based on a binary classification system (with and without malware) and evaluates models that have achieved high performance in other natural language applications, such as fastText, BERT, RoBERTa, DistilBERT, XLM-RoBERTa, and Large Language Models such as LLaMA and Mistral. Using the Spam Email Malware Detection (SEMD-600) dataset, we compare these models regarding precision, recall, F1 score, accuracy, and runtime. DistilBERT emerges as the most suitable option, achieving a recall of 0.792 and a runtime of 1.612 ms per email.
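For readers less familiar with the reported metrics, the sketch below shows how precision, recall, F1 score, and accuracy follow from the confusion counts of a binary classifier. The function name and the toy labels are illustrative assumptions and are unrelated to the SEMD-600 data.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute precision, recall, F1 and accuracy for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Toy labels: 1 = email carries malware, 0 = it does not.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
scores = binary_metrics(y_true, y_pred)
```

Recall is the fraction of malware emails that the model catches, which is why it is a natural headline metric for a detection task where a missed malicious email is costly.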

Spam emails constitute a significant proportion of emails received by users, and can result in financial losses or in the download of malware on the victim’s device. Cyberattackers create spam campaigns to deliver spam messages on a large scale and benefit from the low economic investment and anonymity required to create the attacks. In addition to spam filters, raising awareness about active email scams is a relevant measure that helps mitigate the consequences of spam. Therefore, detecting campaigns becomes a relevant task in identifying and alerting the targets of spam. In this paper, we propose an unsupervised learning algorithm, SpamClus_1, an iterative algorithm that groups spam email campaigns using agglomerative clustering. The measures employed to determine the clusters are the minimum number of samples and minimum percentage of similarity within a cluster. Evaluating SpamClus_1 on a set of emails provided by the Spanish National Cybersecurity Institute (INCIBE), we found that the optimal values are 50 minimum samples and a minimum cosine similarity of 0.8. The clustering results show 19 spam datasets with 3048 spam samples out of 6702 emails from a range of three consecutive days and eight spam clusters with 870 spam samples out of 1469 emails from one day.
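The two cluster criteria named above, a minimum within-cluster similarity and a minimum number of samples, can be illustrated with a small sketch. It is not the SpamClus_1 implementation: the bag-of-words vectors, the average-linkage merge rule, and all function names are assumptions chosen to keep the example self-contained.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cluster_campaigns(texts, min_similarity=0.8, min_samples=50):
    """Average-linkage agglomerative clustering with a similarity cut-off.

    Merging stops once no pair of clusters reaches min_similarity; clusters
    with fewer than min_samples members are then discarded.
    """
    vectors = [Counter(t.lower().split()) for t in texts]
    clusters = [[i] for i in range(len(texts))]

    def linkage(c1, c2):
        total = sum(cosine(vectors[i], vectors[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best, pair = -1.0, None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                s = linkage(clusters[x], clusters[y])
                if s > best:
                    best, pair = s, (x, y)
        if best < min_similarity:
            break
        x, y = pair
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters if len(c) >= min_samples]


# Toy emails; min_samples is lowered from 50 to 2 only to suit the tiny input.
emails = [
    "win a free prize now click here",
    "win a free prize now click here today",
    "your invoice is attached please review",
    "win a free prize now click link",
    "meeting moved to friday afternoon",
]
campaigns = cluster_campaigns(emails, min_similarity=0.8, min_samples=2)
```

Here the three prize messages end up in one campaign, while the two unrelated emails remain singletons and are dropped by the minimum-samples filter.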

LSTM-PSO: NLP-based model for detecting Phishing Attacks
Abdulrahman A. Alshdadi

Detecting phishing attacks involves recognizing and stopping attempts to trick users into revealing information, such as passwords, credit card details, or personal data, without authorization. Most recent related work focuses on detecting phishing attacks by analyzing URLs, email headers and content, and web pages based on their content, without feeding the text sequentially into Deep Learning (DL) algorithms. This approach causes the intrinsic richness of the relationships between words and parts of speech to be lost. The main contribution of this study is an integrated model that detects phishing attacks by analyzing the text content of suspicious web pages rather than their URL addresses. The proposed model uses Natural Language Processing (NLP) to process webpage content, the Particle Swarm Optimization (PSO) algorithm to optimize the feature extraction process, and Deep Learning (DL) algorithms to classify web page content as phishing or legitimate. NLP techniques are used to preprocess webpage content, and word2vec embeddings are used for word representation to extract and select the best features for the DL algorithm. Two Long Short-Term Memory (LSTM) approaches are assessed: a traditional LSTM and an enhanced LSTM-PSO. The results are promising: LSTM and LSTM-PSO achieved accuracies of 97% and 98.3%, respectively.

The Influence of the Perplexity Score in the Detection of Machine-generated Texts
Alberto José Gutiérrez Megías | L. Alfonso Ureña-López | Eugenio Martínez Cámara

The high performance of large language models (LLMs) in generating natural language represents a real threat, since they can be leveraged to generate any kind of deceptive content. Since there are still disparities between the language generated by machines and human language, we claim that perplexity may be used as a classification signal to discern between machine and human text. We propose a classification model based on XLM-RoBERTa, and we evaluate it on the M4 dataset. The results show that the perplexity score is useful for the identification of machine-generated text, but it is constrained by the differences among the LLMs used in the training and test sets.
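As a reminder of what the perplexity score measures, the sketch below computes it from per-token log-probabilities and applies a simple threshold rule. It is only an illustration: the paper feeds perplexity into an XLM-RoBERTa-based classifier, whereas the threshold rule, its direction (lower perplexity treated as machine-generated), and the function names here are assumptions.

```python
import math


def perplexity(token_log_probs):
    """Perplexity is the exponential of the negative mean log-probability per token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def classify_by_perplexity(token_log_probs, threshold):
    """Hypothetical rule: text the scoring model finds very predictable is flagged as machine text."""
    return "machine" if perplexity(token_log_probs) < threshold else "human"


# If a scoring model assigns probability 0.25 to every token, perplexity is 4.
uniform = [math.log(0.25)] * 8
ppl = perplexity(uniform)
label = classify_by_perplexity(uniform, threshold=10.0)
```

In practice the log-probabilities would come from a language model scoring the text, and a fixed threshold tuned on one generator may not transfer to another, which matches the limitation the abstract reports.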

Variation between credible and non-credible news across topics
Emilie Francis

‘Fake News’ continues to undermine trust in modern journalism and politics. Despite continued efforts to study fake news, results have been conflicting. Previous attempts to analyse and combat fake news have largely focused on distinguishing fake news from truth, or differentiating between its various subtypes (such as propaganda, satire, misinformation, etc.). This paper conducts a linguistic and stylistic analysis of fake news, focusing on variation between various news topics. It builds on related work identifying features from discourse and linguistics in deception detection by analysing five distinct news topics: Economy, Entertainment, Health, Science, and Sports. The results emphasize that linguistic features vary between credible and deceptive news in each domain and highlight the importance of adapting classification tasks to accommodate variety-based stylistic and linguistic differences in order to achieve better real-world performance.

Can LLMs assist with Ambiguity? A Quantitative Evaluation of various Large Language Models on Word Sense Disambiguation
Deshan Koshala Sumanathilaka | Nicholas Micallef | Julian Hough

Ambiguous words are often found within modern digital communications. Lexical ambiguity challenges traditional Word Sense Disambiguation (WSD) methods, due to limited data. Consequently, the efficiency of translation, information retrieval, and question-answering systems is hindered by these limitations. This study investigates the use of Large Language Models (LLMs) to improve WSD using a novel approach combining a systematic prompt augmentation mechanism with a knowledge base (KB) consisting of different sense interpretations. The proposed method incorporates a human-in-the-loop approach for prompt augmentation, where the prompt is supported by Part-of-Speech (POS) tagging, synonyms of ambiguous words, aspect-based sense filtering and few-shot prompting to guide the LLM. By utilizing a few-shot Chain of Thought (COT) prompting-based approach, this work demonstrates a substantial improvement in performance. The evaluation was conducted using FEWS test data and sense tags. This research advances accurate word interpretation in social media and digital communication.

Privacy Preservation in Federated Market Basket Analysis using Homomorphic Encryption
Sameeka Saini | Durga Toshniwal

Our proposed work introduces a novel approach to privacy-preserving federated learning market basket analysis using Homomorphic encryption. By encrypting frequent mining operations using Homomorphic encryption, our method ensures data privacy without compromising analysis efficiency. Experiments on diverse datasets validate its effectiveness in maintaining data integrity while preserving privacy.
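The core property that makes this kind of analysis possible is that some encryption schemes let a server add numbers it cannot read. The sketch below illustrates that property with a toy Paillier cryptosystem, where each party encrypts its local support count for an itemset and the aggregator sums the ciphertexts. The abstract does not name the scheme or protocol used, so the choice of Paillier, the tiny primes, and all function names are assumptions for illustration only; the code requires Python 3.8 or later for the modular inverse via `pow`.

```python
import math
import random


def keygen(p=293, q=433):
    """Toy Paillier key generation; real use needs large random primes."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because the generator is fixed to g = n + 1
    return n, (lam, mu)


def encrypt(n, m, rng=random):
    """Encrypt integer m (0 <= m < n) under public key n."""
    n2 = n * n
    while True:
        r = rng.randrange(1, n)  # not cryptographically secure randomness
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % (n * n)


def decrypt(n, private_key, c):
    """Recover the plaintext with the private key (lam, mu)."""
    lam, mu = private_key
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


# Three hypothetical stores report how often {bread, milk} occurs locally.
n, private_key = keygen()
local_counts = [12, 7, 30]
encrypted_total = encrypt(n, 0)
for count in local_counts:
    encrypted_total = add_encrypted(n, encrypted_total, encrypt(n, count))
global_support = decrypt(n, private_key, encrypted_total)
```

The aggregator only ever handles ciphertexts, and only the key holder learns the global support count, never the individual contributions.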

WAVE-27K: Bringing together CTI sources to enhance threat intelligence models
Felipe Castaño | Amaia Gil-Lerchundi | Raul Orduna-Urrutia | Eduardo Fidalgo Fernandez | Rocío Alaiz-Rodríguez

Considering the growing flow of information on the internet, and the increased incident-related data from diverse sources, unstructured text processing gains importance. We present an automated approach to link several CTI sources through the mapping of external references. Our method facilitates the automatic construction of datasets, allowing for updates and the inclusion of new samples and labels. Following this method, we built a new dataset of unstructured CTI descriptions called Weakness, Attack, Vulnerabilities, and Events 27k (WAVE-27K). Our dataset includes information about 27 different MITRE techniques, containing 22539 samples related to one technique and 5262 related to two or more techniques simultaneously. We evaluated five BERT-based models on the WAVE-27K dataset, concluding that SecRoBERTa reaches the highest performance with a 77.52% F1 score. Additionally, we compare the performance of SecRoBERTa on the WAVE-27K dataset and other public datasets. The results show that the model using the WAVE-27K dataset outperforms the others. These results demonstrate that the data within WAVE-27K contains relevant information and that the proposed method effectively built a dataset with a level of quality sufficient to train a machine-learning model.

Human-in-the-loop Anomaly Detection and Contextual Intelligence for Enhancing Cybersecurity Management
Thomas Schaberreiter | Jerry Andriessen | Cinzia Cappiello | Alex Papanikolaou | Mirjam Pardijs

Cybersecurity management is a sociotechnical problem comprising organisational knowledge management of humans and technology. Focusing on risk and incident management, we present our approach for enhancing cybersecurity awareness in organisations and ecosystems. By augmenting our cybersecurity awareness platform with human-in-the-loop anomaly detection and machine learning, we are able to handle the dynamics of organisational human activity, as well as the continuous developments in the cybersecurity domain. We illustrate the potential impact of our approach with a realistic example in the healthcare context.

Is it Offensive or Abusive? An Empirical Study of Hateful Language Detection of Arabic Social Media Texts
Salim Al Mandhari | Mo El-Haj | Paul Rayson

Among many potential subjects studied in Sentiment Analysis, widespread offensive and abusive language on social media has triggered interest in reducing its risks to users, children in particular. This paper centres on distinguishing between offensive and abusive language detection within Arabic social media texts through the employment of various machine and deep learning techniques. The techniques include Naïve Bayes (NB), Support Vector Machine (SVM), fastText, keras, and RoBERTa XML multilingual embeddings, which have demonstrated superior performance compared to other statistical machine learning methods and different kinds of embeddings like fastText. The methods were implemented on two separate corpora from YouTube comments totalling 47K comments. The results demonstrated that all models, except NB, reached an accuracy of 82%. It was also shown that word tri-grams enhance classification performance, though other tuning techniques were applied such as TF-IDF and grid-search. The linguistic findings, aimed at distinguishing between offensive and abusive language, were consistent with machine learning (ML) performance, which effectively classified the two distinct classes of sentiment: offensive and abusive.

The Elsagate Corpus: Characterising Commentary on Alarming Video Content
Panagiotis Soustas | Matthew Edwards

Identifying disturbing online content being targeted at children is an important content moderation problem. However, previous approaches to this problem have focused on features of the content itself, and neglected potentially helpful insights from the reactions expressed by its online audience. To help remedy this, we present the Elsagate Corpus, a collection of over 22 million comments on more than 18,000 videos that have been associated with disturbing content. We describe how we collected this corpus and present some insights from our initial explorations, including the surprisingly positive reactions from audiences to this content, some unusual non-linguistic commenting behavior of uncertain purpose, and references to some concerning themes.

Abusive Speech Detection in Serbian using Machine Learning
Danka Jokić | Ranka Stanković | Branislava Šandrih Todorović

The increase in the use of abusive language on social media and virtual platforms has emphasized the importance of developing efficient hate speech detection systems. While there have been considerable advancements in creating such systems for the English language, resources are scarce for other languages, such as Serbian. This research paper explores the use of machine learning and deep learning techniques to identify abusive language in Serbian text. The authors used AbCoSER, a dataset of Serbian tweets that have been labeled as abusive or non-abusive. They evaluated various algorithms to classify tweets, and the best-performing model is based on the deep learning transformer architecture. The model attained an F1 macro score of 0.827, a figure that is commensurate with the benchmarks established for offensive speech datasets of a similar magnitude in other languages.

Fighting Cyber-malice: A Forensic Linguistics Approach to Detecting AI-generated Malicious Texts
Rui Sousa-Silva

Technology has long been used for criminal purposes, but the technological developments of the last decades have allowed users to remain anonymous online, which in turn increased the volume and heterogeneity of cybercrimes and made it more difficult for law enforcement agencies to detect and fight them. However, as they ignore the very nature of language, cybercriminals tend to overlook the potential of linguistic analysis to positively identify them by the language that they use. Forensic linguistics research and practice has therefore proven reliable in fighting cybercrime, either by analysing authorship to confirm or reject the law enforcement agents’ suspicions, or by sociolinguistically profiling the author of the cybercriminal communications to provide the investigators with sociodemographic information to help guide the investigation. However, large language models and generative AI have raised new challenges: not only has cybercrime increased as a result of AI-generated texts, but also generative AI makes it more difficult for forensic linguists to attribute the authorship of the texts to the perpetrators. This paper argues that, although a shift of focus is required, forensic linguistics plays a core role in detecting and fighting cybercrime. A focus on deep linguistic features, rather than low-level and purely stylistic elements, has the potential to discriminate between human- and AI-generated texts and provide the investigation with vital information. We conclude by discussing the foreseeable future limitations, especially resulting from the developments expected from language models.

Deciphering Cyber Threats: A Unifying Framework with GPT-3.5, BERTopic and Feature Importance
Chun Man Tsang | Tom Bell | Antonios Gouglidis | Mo El-Haj

This paper presents a methodology for the categorisation and attribute quantification of cyber threats. The data was sourced from Common Weakness Enumeration (CWE) entries, encompassing 503 hardware and software vulnerabilities. For each entry, GPT-3.5 generated detailed descriptions for 12 key threat attributes. Employing BERTopic for topic modelling, our research focuses on clustering cyber threats and evaluates the efficacy of various dimensionality reduction and clustering algorithms, notably finding that UMAP combined with HDBSCAN, optimised through parameterisation, outperforms other configurations. The study further explores feature importance analysis by converting topic modelling results into a classification paradigm, achieving classification accuracies between 60% and 80% with algorithms such as Random Forest, XGBoost, and Linear SVM. This feature importance analysis quantifies the significance of each threat attribute, with SHAP identified as the most effective method for this calculation.

CECILIA: Enhancing CSIRT Effectiveness with Transformer-Based Cyber Incident Classification
Juan Jose Delgado Sotes | Alicia Martinez Mendoza | Andres Carofilis Vasco | Eduardo Fidalgo Fernandez | Enrique Alegre Gutierrez

This paper introduces an approach to improving incident response times by applying various Artificial Intelligence (AI) classification algorithms based on transformers to analyze the efficacy of these models in categorizing cyber incidents. As a first contribution, we developed a cyber incident dataset, CECILIA-10C-900, collecting cyber incident reports from six qualified web sources. The contribution of creating a dataset on cyber incident detection is remarkable due to the scarcity of such datasets. Each incident has been tagged by hand according to the cyber incident taxonomy defined by the CERT (Computer Emergency Response Team) of the National Institute of Cybersecurity (INCIBE). This dataset is highly unbalanced, so we decided to unify the four least represented classes under the label “others”, leaving a dataset with six categories (CECILIA-6C-900). With these reliable datasets, we performed a comparison of the best algorithms specifically for the cyber incident classification problem, evaluating eight different metrics on two conventional classifiers and six other transformer-based classifiers. Our study highlights the importance of having a rapid classification mechanism for CSIRTs (Computer Security Incident Response Teams) and showcases the potential of machine learning algorithms to improve cyber defense mechanisms. The findings from our analysis provide valuable insights into the strengths and limitations of different classification techniques. They can be used in future work on cyber incident response strategies.

U-BERTopic: An Urgency-Aware BERT-Topic Modeling Approach for Detecting CyberSecurity Issues via Social Media
Majed Albarrak | Gabriele Pergola | Arshad Jhumka

For computer systems to remain secure, timely information about system vulnerabilities and security threats is vital. Such information can be garnered from various sources, most notably from social media platforms. However, such information may often lack context and structure and, more importantly, is often unlabelled. For such media to act as alert systems, it is important to be able to first distinguish among the topics being discussed. Subsequently, identifying the nature of the threat or vulnerability is of importance as this will influence the remedial actions to be taken, e.g., is the threat imminent? In this paper, we propose U-BERTopic, an urgency-aware BERT-topic modelling approach for detecting cybersecurity issues through social media, by integrating sentiment analysis with contextualized topic modelling like BERTopic. We compare U-BERTopic against three other topic modelling techniques using four different evaluation metrics for topic modelling and cybersecurity classification by running on a 2018 cyber security-related Twitter dataset. Our results show that (i) for topic modelling and under certain settings (e.g., number of topics), U-BERTopic often outperforms all other topic modelling techniques and (ii) for attack classification, U-BERTopic performs better for some attacks such as vulnerability identification in some settings.

A Proposal Framework Security Assessment for Large Language Models
Daniel Mendonça Colares | Raimir Holanda Filho | Luis Borges Gouveia

Large Language Models (LLMs), despite their numerous applications and the significant benefits they offer, have proven to be extremely susceptible to attacks of various natures. Because of their many vulnerabilities, which are often unknown and consequently become potential targets for attacks, investing in the implementation of this technology becomes a gamble. Ensuring the security of LLMs is of utmost importance, but unfortunately, providing effective security for so many different vulnerabilities is a costly task, especially for companies seeking rapid growth. Many studies focus on analyzing the security of LLMs for specific types of vulnerabilities, such as prompt injection or jailbreaking, but they rarely assess the security of the model as a whole. Therefore, this study aims to facilitate the evaluation of vulnerabilities across various models and identify their main weaknesses. To achieve this, our work sought to develop a comprehensive framework capable of utilizing various scanners to assess the security of LLMs, allowing for a detailed analysis of their vulnerabilities. Using the framework, we tested and evaluated multiple models and analyzed the results collected from these assessments of various vulnerabilities for each model tested. Our results not only demonstrated potential weaknesses in certain models but also revealed a possible relationship between model security and the number of parameters for similar models.

Not Everything Is Online Grooming: False Risk Finding in Large Language Model Assessments of Human Conversations
Ellie Prosser | Matthew Edwards

Large Language Models (LLMs) have rapidly been adopted by the general public, and as usage of these models becomes commonplace, they naturally will be used for increasingly human-centric tasks, including security advice and risk identification for personal situations. It is imperative that systems used in such a manner are well-calibrated. In this paper, 6 popular LLMs were evaluated for their propensity towards false or over-cautious risk finding in online interactions between real people, with a focus on the risk of online grooming, the advice generated for such contexts, and the impact of prompt specificity. Through an analysis of 3840 generated answers, it was found that models could find online grooming in even the most harmless of interactions, and that the generated advice could be harmful, judgemental, and controlling. We describe these shortcomings, and identify areas for improvement, including suggestions for future research directions.

Redacted Contextual Question Answering with Generative Large Language Models
Jacob Lichtefeld | Joe A. Cecil | Alex Hedges | Jeremy Abramson | Marjorie Freedman

Many contexts, such as medicine, finance, and cybersecurity, require controlled release of private or internal information. Traditionally, manually redacting sensitive information for release is an arduous and costly process, and while generative Large Language Models (gLLM) show promise at document-based question answering and summarization, their ability to do so while redacting sensitive information has not been widely explored. To address this, we introduce a new task, called redacted contextual question answering (RC-QA). This explores a gLLM’s ability to collaborate with a trusted user in a question-answer task as a proxy for drafting a public release informed by the redaction of potentially sensitive information, presented here in the form of constraints on the answers. We introduce a sample question-answer dataset for this task using publicly available data with four sample constraints. We present evaluation results for five language models and two refined models. Our results show that most models, especially open-source models, struggle to accurately answer questions under these constraints. We hope that these preliminary results help catalyze further exploration into this topic, and to that end, we make our code and data available at https://github.com/isi-vista/redacted-contextual-question-answering.

Unlocking LLMs: Addressing Scarce Data and Bias Challenges in Mental Health and Therapeutic Counselling
Vivek Kumar | Pushpraj Singh Rajwat | Giacomo Medda | Eirini Ntoutsi | Diego Reforgiato Recupero

Large language models (LLMs) have shown promising capabilities in healthcare analysis but face several challenges like hallucinations, parroting, and bias manifestation. These challenges are exacerbated in complex, sensitive, and low-resource domains. Therefore, in this work, we introduce IC-AnnoMI, an expert-annotated motivational interviewing (MI) dataset built upon AnnoMI, by generating in-context conversational dialogues leveraging LLMs, particularly ChatGPT. IC-AnnoMI employs targeted prompts accurately engineered through cues and tailored information, taking into account therapy style (empathy, reflection), contextual relevance, and false semantic change. Subsequently, the dialogues are annotated by experts, strictly adhering to the Motivational Interviewing Skills Code (MISC), focusing on both the psychological and linguistic dimensions of MI dialogues. We comprehensively evaluate the IC-AnnoMI dataset and ChatGPT’s emotional reasoning ability and understanding of domain intricacies by modeling novel classification tasks employing several classical machine learning and current state-of-the-art transformer approaches. Finally, we discuss the effects of progressive prompting strategies and the impact of augmented data in mitigating the biases manifested in IC-AnnoMI. Our contributions provide the MI community with not only a comprehensive dataset but also valuable insights for using LLMs in empathetic text generation for conversational therapy in supervised settings.