Thanet Markchom


2024

pdf
NCL-UoR at SemEval-2024 Task 8: Fine-tuning Large Language Models for Multigenerator, Multidomain, and Multilingual Machine-Generated Text Detection
Feng Xiong | Thanet Markchom | Ziwei Zheng | Subin Jung | Varun Ojha | Huizhi Liang
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)

SemEval-2024 Task 8 introduces the challenge of identifying machine-generated texts from diverse Large Language Models (LLMs) in various languages and domains. The task comprises three subtasks: binary classification in monolingual and multilingual (Subtask A), multi-class classification (Subtask B), and mixed text detection (Subtask C). This paper focuses on Subtask A & B. To tackle this task, this paper proposes two methods: 1) using traditional machine learning (ML) with natural language preprocessing (NLP) for feature extraction, and 2) fine-tuning LLMs for text classification. For fine-tuning, we use the train datasets provided by the task organizers. The results show that transformer models like LoRA-RoBERTa and XLM-RoBERTa outperform traditional ML models, particularly in multilingual subtasks. However, traditional ML models performed better than transformer models for the monolingual task, demonstrating the importance of considering the specific characteristics of each subtask when selecting an appropriate approach.

pdf
NU-RU at SemEval-2024 Task 6: Hallucination and Related Observable Overgeneration Mistake Detection Using Hypothesis-Target Similarity and SelfCheckGPT
Thanet Markchom | Subin Jung | Huizhi Liang
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)

One of the key challenges in Natural Language Generation (NLG) is “hallucination,” in which the generated output appears fluent and grammatically sound but may contain incorrect information. To address this challenge, “SemEval-2024 Task 6 - SHROOM, a Shared-task on Hallucinations and Related Observable Overgeneration Mistakes” is introduced. This task focuses on detecting overgeneration hallucinations in texts generated from Large Language Models for various NLG tasks. To tackle this task, this paper proposes two methods: (1) hypothesis-target similarity, which measures text similarity between a generated text (hypothesis) and an intended reference text (target), and (2) a SelfCheckGPT-based method to assess hallucinations via predefined prompts designed for different NLG tasks. Experiments were conducted on the dataset provided in this task. The results show that both of the proposed methods can effectively detect hallucinations in LLM-generated texts with a possibility for improvement.

2023

pdf
UoR-NCL at SemEval-2023 Task 1: Learning Word-Sense and Image Embeddings for Word Sense Disambiguation
Thanet Markchom | Huizhi Liang | Joyce Gitau | Zehao Liu | Varun Ojha | Lee Taylor | Jake Bonnici | Abdullah Alshadadi
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

In SemEval-2023 Task 1, a task of applying Word Sense Disambiguation in an image retrieval system was introduced. To resolve this task, this work proposes three approaches: (1) an unsupervised approach considering similarities between word senses and image captions, (2) a supervised approach using a Siamese neural network, and (3) a self-supervised approach using a Bayesian personalized ranking framework. According to the results, both supervised and self-supervised approaches outperformed the unsupervised approach. They can effectively identify correct images of ambiguous words in the dataset provided in this task.

pdf
nclu_team at SemEval-2023 Task 6: Attention-based Approaches for Large Court Judgement Prediction with Explanation
Nicolay Rusnachenko | Thanet Markchom | Huizhi Liang
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

Legal documents tend to be large in size. In this paper, we provide an experiment with attention-based approaches complemented by certain document processing techniques for judgment prediction. For the prediction of explanation, we consider this as an extractive text summarization problem based on an output of (1) CNN with attention mechanism and (2) self-attention of language models. Our extensive experiments show that treating document endings at first results in a 2.1% improvement in judgment prediction across all the models. Additional content peeling from non-informative sentences allows an improvement of explanation prediction performance by 4% in the case of attention-based CNN models. The best submissions achieved 8’th and 3’rd ranks on judgment prediction (C1) and prediction with explanation (C2) tasks respectively among 11 participating teams. The results of our experiments are published

2022

pdf
UoR-NCL at SemEval-2022 Task 3: Fine-Tuning the BERT-Based Models for Validating Taxonomic Relations
Thanet Markchom | Huizhi Liang | Jiaoyan Chen
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)

In human languages, there are many presuppositional constructions that impose a constrain on the taxonomic relations between two nouns depending on their order. These constructions create a challenge in validating taxonomic relations in real-world contexts. In SemEval2022-Task3 Presupposed Taxonomies: Evaluating Neural Network Semantics (PreTENS), the organizers introduced a task regarding validating the taxonomic relations within a variety of presuppositional constructions. This task is divided into two subtasks: classification and regression. Each subtask contains three datasets in multiple languages, i.e., English, Italian and French. To tackle this task, this work proposes to fine-tune different BERT-based models pre-trained on different languages. According to the experimental results, the fine-tuned BERT-based models are effective compared to the baselines in classification. For regression, the fine-tuned models show promising performance with the possibility of improvement.

2021

pdf
UoR at SemEval-2021 Task 4: Using Pre-trained BERT Token Embeddings for Question Answering of Abstract Meaning
Thanet Markchom | Huizhi Liang
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)

Most question answering tasks focuses on predicting concrete answers, e.g., named entities. These tasks can be normally achieved by understanding the contexts without additional information required. In Reading Comprehension of Abstract Meaning (ReCAM) task, the abstract answers are introduced. To understand abstract meanings in the context, additional knowledge is essential. In this paper, we propose an approach that leverages the pre-trained BERT Token embeddings as a prior knowledge resource. According to the results, our approach using the pre-trained BERT outperformed the baselines. It shows that the pre-trained BERT token embeddings can be used as additional knowledge for understanding abstract meanings in question answering.

pdf
UOR at SemEval-2021 Task 12: On Crowd Annotations; Learning with Disagreements to optimise crowd truth
Emmanuel Osei-Brefo | Thanet Markchom | Huizhi Liang
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)

Crowdsourcing has been ubiquitously used for annotating enormous collections of data. However, the major obstacles to using crowd-sourced labels are noise and errors from non-expert annotations. In this work, two approaches dealing with the noise and errors in crowd-sourced labels are proposed. The first approach uses Sharpness-Aware Minimization (SAM), an optimization technique robust to noisy labels. The other approach leverages a neural network layer called softmax-Crowdlayer specifically designed to learn from crowd-sourced annotations. According to the results, the proposed approaches can improve the performance of the Wide Residual Network model and Multi-layer Perception model applied on crowd-sourced datasets in the image processing domain. It also has similar and comparable results with the majority voting technique when applied to the sequential data domain whereby the Bidirectional Encoder Representations from Transformers (BERT) is used as the base model in both instances.

2020

pdf
UoR at SemEval-2020 Task 4: Pre-trained Sentence Transformer Models for Commonsense Validation and Explanation
Thanet Markchom | Bhuvana Dhruva | Chandresh Pravin | Huizhi Liang
Proceedings of the Fourteenth Workshop on Semantic Evaluation

SemEval Task 4 Commonsense Validation and Explanation Challenge is to validate whether a system can differentiate natural language statements that make sense from those that do not make sense. Two subtasks, A and B, are focused in this work, i.e., detecting against-common-sense statements and selecting explanations of why they are false from the given options. Intuitively, commonsense validation requires additional knowledge beyond the given statements. Therefore, we propose a system utilising pre-trained sentence transformer models based on BERT, RoBERTa and DistillBERT architectures to embed the statements before classification. According to the results, these embeddings can improve the performance of the typical MLP and LSTM classifiers as downstream models of both subtasks compared to regular tokenised statements. These embedded statements are shown to comprise additional information from external resources which help validate common sense in natural language.