Meng Li

2025

Optimal Transport-Based Token Weighting Scheme for Enhanced Preference Optimization
Meng Li | Guangda Huzhang | Haibo Zhang | Xiting Wang | Anxiang Zeng
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Direct Preference Optimization (DPO) has emerged as a promising framework for aligning Large Language Models (LLMs) with human preferences by directly optimizing the log-likelihood difference between chosen and rejected responses. However, existing methods assign equal importance to all tokens in the response, whereas humans focus on the more meaningful parts. This leads to suboptimal preference optimization, as irrelevant or noisy tokens disproportionately influence the DPO loss. To address this limitation, we propose an Optimal Transport-based token weighting scheme for enhancing direct Preference Optimization (OTPO). Our method introduces a context-aware token weighting scheme that emphasizes semantically meaningful token pairs and de-emphasizes less relevant ones, yielding a more contrastive reward difference estimate. This adaptive weighting enhances reward stability, improves interpretability, and ensures that preference optimization focuses on meaningful differences between responses. Extensive experiments validate OTPO’s effectiveness in improving instruction-following ability across various settings.
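
To make the weighting idea concrete, here is a minimal, hypothetical sketch (not the paper's implementation): an unbalanced entropic optimal-transport plan between the token hidden states of the two responses supplies per-token weights, which replace the uniform token sums in the DPO reward difference. All names, and the choice of unbalanced Sinkhorn iterations, are assumptions.

```python
import torch

def unbalanced_sinkhorn(cost, eps=0.1, rho=1.0, n_iters=100):
    # Entropic unbalanced OT with KL-relaxed marginals (scaling iterations);
    # relaxed marginals let per-token mass vary instead of staying uniform.
    n, m = cost.shape
    K = torch.exp(-cost / eps)
    a, b = torch.full((n,), 1.0 / n), torch.full((m,), 1.0 / m)
    u, v = torch.ones(n), torch.ones(m)
    f = rho / (rho + eps)  # exponent induced by the KL marginal penalty
    for _ in range(n_iters):
        u = (a / (K @ v)) ** f
        v = (b / (K.T @ u)) ** f
    return u[:, None] * K * v[None, :]  # soft token-pair alignment

def otpo_style_reward_diff(h_w, h_l, logp_w, logp_l):
    """Hypothetical OT-weighted DPO reward difference.

    h_w (n, d) / h_l (m, d): token hidden states of chosen / rejected responses.
    logp_w (n,) / logp_l (m,): per-token log-ratios log(pi_theta / pi_ref).
    """
    plan = unbalanced_sinkhorn(torch.cdist(h_w, h_l))
    w_w, w_l = plan.sum(dim=1), plan.sum(dim=0)  # per-token transported mass
    w_w, w_l = w_w / w_w.sum(), w_l / w_l.sum()
    # Weighted contrast instead of vanilla DPO's uniform token sums.
    return (w_w * logp_w).sum() - (w_l * logp_l).sum()
```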

Internal Value Alignment in Large Language Models through Controlled Value Vector Activation
Haoran Jin | Meng Li | Xiting Wang | Zhihao Xu | Minlie Huang | Yantao Jia | Defu Lian
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Aligning Large Language Models (LLMs) with human values has attracted increasing attention since it provides clarity, transparency, and the ability to adapt to evolving scenarios. In this paper, we introduce a Controlled Value Vector Activation (ConVA) method that directly aligns the internal values of LLMs by interpreting how a value is encoded in their latent representations and modifying the relevant activations to ensure consistent values in LLMs. To ensure an accurate and unbiased interpretation, we propose a context-controlled value vector identification method. To control values consistently without sacrificing model performance, we introduce a gated value vector activation method that achieves effective control with a minimal degree of intervention. Experiments show that our method achieves the highest control success rate across 10 basic values without hurting LLM performance or fluency, and maintains target values even under opposite and potentially malicious input prompts. Source code and data are available at https://github.com/hr-jin/ConVA.
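
A minimal sketch of what a gated activation edit could look like, assuming the value direction has already been identified (e.g., via the paper's context-controlled probing). The gating function and every name here are hypothetical, not ConVA's actual formulation.

```python
import torch

def gated_value_activation(h, value_vec, target=1.0):
    """Steer a hidden activation toward a value direction, minimally.

    h: (d,) hidden state at some layer; value_vec: (d,) direction assumed
    to encode the target value; target: desired projection onto it.
    """
    v = value_vec / value_vec.norm()
    proj = h @ v                                 # current value expression
    gate = torch.clamp(target - proj, min=0.0)   # zero if already sufficient
    return h + gate * v                          # smallest shift along v
```

Because the gate vanishes whenever the activation already expresses the value, the intervention stays minimal, which is the property the abstract describes as control without hurting performance.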

Representations of Fact, Fiction and Forecast in Large Language Models: Epistemics and Attitudes
Meng Li | Michael Vrazitulis | David Schlangen
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Rational speakers are supposed to know what they know and what they do not know, and to generate expressions matching the strength of their evidence. In contrast, it remains a challenge for current large language models to generate utterances that match their assessment of facts and their confidence in an uncertain real-world environment. While it has recently become popular to estimate and calibrate the confidence of LLMs with verbalized uncertainty, what is lacking is a careful examination of the linguistic knowledge of uncertainty encoded in the latent space of LLMs. In this paper, we draw on typological frameworks of epistemic expressions to evaluate LLMs’ knowledge of epistemic modality, using controlled stories. Our experiments show that LLMs’ performance in generating epistemic expressions is limited and not robust, and hence the expressions of uncertainty they generate are not always reliable. Building uncertainty-aware LLMs requires enriching their semantic knowledge of epistemic modality.

2024

基于意合图语义理论的结构标注体系与资源建设 (System and Resource Construction Based on the Semantic Theory of Chinese-Parataxis-Graph)
Mengxi Guo (郭梦溪) | Meng Li (李梦) | Endong Xun (荀恩东) | Gaoqi Rao (饶高琦) | Zhongyang Yu (于钟洋)
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference)

The Parataxis Graph is an event-centered, multi-level semantic representation method composed of event structures and entity structures; through its multi-level semantic design, it supports multi-level analysis of events. This paper refines and formulates an annotation specification for the Parataxis Graph and adopts a layered, tiered annotation strategy, carrying out Parataxis Graph QNP annotation of news corpora and international Chinese-education reading corpora in a self-developed online annotation system. The annotation effort validates the soundness and annotatability of the Parataxis Graph framework and yields a Parataxis Graph semantic resource library.

意合图:中文多层次语义表示方法 (Parataxis Graph: Multi-level Semantic Representation Method for Chinese)
Mengxi Guo (郭梦溪) | Endong Xun (荀恩东) | Meng Li (李梦) | Gaoqi Rao (饶高琦)
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference)

Although parameter-based semantic representation has achieved much, symbolic semantic representation retains significance that cannot be ignored. Building on semantic theory and fully considering the practical needs of symbolic semantic representation in NLP applications, we propose the Parataxis Graph, a multi-level semantic representation method that is both general and extensible. Centered on events and composed of event structures and entity structures, the Parataxis Graph uses a multi-level semantic design to improve its fit with application scenarios and strives to represent linguistic units at different levels in a uniform way. It has achieved good results in resource construction and related parsing experiments. This paper focuses on the design philosophy of the Parataxis Graph and its multi-level semantic system.
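
As a rough illustration of the representation described above (not the official annotation scheme), an event-centered graph can be encoded as nested event and entity structures; the class and role names below are invented for this sketch.

```python
from dataclasses import dataclass, field

@dataclass
class Entity:
    text: str              # surface form, e.g. "记者" (journalist)
    etype: str = "object"  # coarse entity category

@dataclass
class Event:
    predicate: str  # core predicate anchoring the event
    # Roles map semantic labels (agent, patient, time, ...) to entities or
    # to nested sub-events, giving the multi-level structure.
    roles: dict = field(default_factory=dict)

# "The journalist interviewed the athlete yesterday."
interview = Event(
    predicate="采访",
    roles={
        "agent": Entity("记者", "human"),
        "patient": Entity("运动员", "human"),
        "time": Entity("昨天", "time"),
    },
)
```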

中文意合图语义解析评测 (Evaluation of Chinese Parataxis Graph Semantic Parsing)
Mengxi Guo (郭梦溪) | Meng Li (李梦) | Zeying Jin (靳泽莹) | Xiaojing Wu (吴晓靖) | Gaoqi Rao (饶高琦) | Gongbo Tang (唐共波) | Endong Xun (荀恩东)
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 3: Evaluations)

The Chinese Parataxis Graph is a recently proposed semantic representation method for Chinese. This shared task is the first semantic parsing evaluation based on Parataxis Graph theory, aiming to explore semantic computing methods oriented toward the theory and to assess machines' semantic analysis ability. Fourteen teams registered, seven ultimately submitted results, and five of those submitted technical reports and models, all of which were successfully reproduced. Within the evaluation deadline, the best-performing team achieved an F1 score of 72.06% using LoRA fine-tuning of a large language model. Of the five teams that submitted technical reports, four used large-language-model fine-tuning, which to some extent reflects the current trend in technique development.
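
The reported F1 presumably scores predicted semantic structures against gold annotations; a generic sketch over relation triples, with exact match as an assumed (likely simplified) matching criterion:

```python
def triple_f1(pred: set, gold: set) -> float:
    """F1 over predicted vs. gold semantic triples (exact match assumed)."""
    tp = len(pred & gold)
    if tp == 0:
        return 0.0
    precision, recall = tp / len(pred), tp / len(gold)
    return 2 * precision * recall / (precision + recall)

pred = {("采访", "agent", "记者"), ("采访", "time", "今天")}
gold = {("采访", "agent", "记者"), ("采访", "time", "昨天")}
print(triple_f1(pred, gold))  # 0.5: one of two triples matches on each side
```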

Evaluating Readability and Faithfulness of Concept-based Explanations
Meng Li | Haoran Jin | Ruixuan Huang | Zhihao Xu | Defu Lian | Zijia Lin | Di Zhang | Xiting Wang
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

With the growing popularity of general-purpose Large Language Models (LLMs) comes a need for more global explanations of model behaviors. Concept-based explanations arise as a promising avenue for explaining high-level patterns learned by LLMs. Yet their evaluation poses unique challenges, especially due to their non-local nature and high-dimensional representation in a model’s hidden space. Current methods approach concepts from different perspectives and lack a unified formalization, which makes it challenging to evaluate the core measures of concepts, namely faithfulness and readability. To bridge the gap, we introduce a formal definition of concepts that generalizes to diverse concept-based explanation settings. Based on this, we quantify the faithfulness of a concept explanation via perturbation, ensuring adequate perturbation in the high-dimensional space for different concepts via an optimization problem. Readability is approximated via an automatic and deterministic measure that quantifies the coherence of the patterns maximally activating a concept, while aligning with human understanding. Finally, based on measurement theory, we apply a meta-evaluation method for evaluating these measures, which generalizes to other types of explanations or tasks as well. Extensive experimental analysis has been conducted to inform the selection of explanation evaluation measures.
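
A toy version of perturbation-based faithfulness (the paper chooses perturbations by solving an optimization problem; this sketch merely contrasts a shift along the concept direction with a matched orthogonal shift, and every name is an assumption):

```python
import torch

def concept_faithfulness(model_tail, h, concept_dir, eps=1.0):
    """How much more does the output move when hidden states are perturbed
    along the concept direction than orthogonally to it?

    model_tail: callable mapping hidden states to outputs (rest of the model).
    h: (n, d) hidden representations; concept_dir: (d,) concept vector.
    """
    v = concept_dir / concept_dir.norm()
    base = model_tail(h)
    along = model_tail(h + eps * v)            # perturb along the concept
    r = torch.randn_like(v)
    r = r - (r @ v) * v                        # orthogonal control direction
    ortho = model_tail(h + eps * r / r.norm())
    return (base - along).norm() / ((base - ortho).norm() + 1e-8)

W = torch.randn(5, 16)  # stand-in for the model's remaining layers
print(concept_faithfulness(lambda x: x @ W.T, torch.randn(3, 16), torch.randn(16)))
```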

Mixture-of-Supernets: Improving Weight-Sharing Supernet Training with Architecture-Routed Mixture-of-Experts
Ganesh Jawahar | Haichuan Yang | Yunyang Xiong | Zechun Liu | Dilin Wang | Fei Sun | Meng Li | Aasish Pappu | Barlas Oguz | Muhammad Abdul-Mageed | Laks Lakshmanan | Raghuraman Krishnamoorthi | Vikas Chandra
Findings of the Association for Computational Linguistics: ACL 2024

Weight-sharing supernets are crucial for performance estimation in cutting-edge neural architecture search (NAS) frameworks. Despite their ability to generate diverse subnetworks without retraining, the quality of these subnetworks is not guaranteed due to weight sharing. In NLP tasks such as machine translation and pre-trained language modeling, there is a significant performance gap between the supernet and training from scratch for the same model architecture, necessitating retraining once the optimal architecture is identified. This study introduces mixture-of-supernets, a generalized supernet formulation that leverages mixture-of-experts (MoE) to enhance supernet model expressiveness with minimal training overhead. Unlike conventional supernets, this method employs an architecture-based routing mechanism, enabling indirect sharing of model weights among subnetworks. This customization of weights for specific architectures, learned through gradient descent, minimizes retraining time, significantly enhancing training efficiency in NLP. The proposed method attains state-of-the-art (SoTA) performance in NAS for fast machine translation models, exhibiting a superior latency-BLEU tradeoff compared to HAT, the SoTA NAS framework for machine translation. Furthermore, it excels in NAS for building memory-efficient task-agnostic BERT models, surpassing NAS-BERT and AutoDistil across various model sizes. The code can be found at: https://github.com/UBC-NLP/MoS.
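
A sketch of the architecture-routed idea: mixture weights over expert weight matrices are computed from an encoding of the sampled architecture (not from the input), and a subnetwork then slices the mixed matrix. All names and the slicing convention are assumptions, not the paper's code.

```python
import torch
import torch.nn as nn

class ArchRoutedLinear(nn.Module):
    """Linear layer whose weights are a mixture of experts, routed by the
    architecture encoding rather than by the input token."""

    def __init__(self, d_in, d_out, n_experts=4, arch_dim=8):
        super().__init__()
        self.experts = nn.Parameter(torch.randn(n_experts, d_out, d_in) * 0.02)
        self.router = nn.Linear(arch_dim, n_experts)

    def forward(self, x, arch_code, width):
        alpha = torch.softmax(self.router(arch_code), dim=-1)  # (n_experts,)
        W = torch.einsum("e,eoi->oi", alpha, self.experts)     # mixed weights
        # Weight sharing: a narrower subnetwork takes a slice of the mix.
        return x @ W[:width, : x.size(-1)].T

layer = ArchRoutedLinear(64, 64)
y = layer(torch.randn(2, 32), torch.randn(8), width=48)  # a 32->48 subnetwork
```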

2023

Target-to-Source Augmentation for Aspect Sentiment Triplet Extraction
Yice Zhang | Yifan Yang | Meng Li | Bin Liang | Shiwei Chen | Ruifeng Xu
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Aspect Sentiment Triplet Extraction (ASTE) is an important task in sentiment analysis, aiming to extract aspect-level opinions and sentiments from user-generated reviews. The fine-grained nature of ASTE incurs a high annotation cost, and the scarcity of annotated data limits the performance of existing methods. This paper exploits data augmentation to address this issue. Traditional augmentation methods typically modify the input sentences of existing samples via heuristic rules or language models, which have shown success in text classification tasks. However, applying these methods to fine-grained tasks like ASTE poses challenges in generating diverse augmented samples while maintaining alignment between the modified sentences and the original labels. Therefore, this paper proposes a target-to-source augmentation approach for ASTE. Our approach focuses on learning a generator that can directly generate new sentences based on labels and syntactic templates. With this generator, we can generate a substantial number of diverse augmented samples by mixing labels and syntactic templates from different samples. In addition, to ensure the quality of the generated sentences, we introduce fluency and alignment discriminators that provide feedback on each generated sentence, and we use this feedback to optimize the generator via a reinforcement learning framework. Experiments demonstrate that our approach significantly enhances the performance of existing ASTE models.
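
The core data-mixing step can be illustrated with a runnable toy: labels from one sample are combined with syntactic templates from another. Here a naive string fill stands in for the learned generator, and the fluency/alignment discriminators are omitted; all names and examples are invented.

```python
import itertools

samples = [
    {"triplets": [("battery", "long-lasting", "POS")],
     "template": "The [ASPECT] is really [OPINION] ."},
    {"triplets": [("screen", "dim", "NEG")],
     "template": "I found the [ASPECT] rather [OPINION] ."},
]

def realize(triplets, template):
    # Naive single-triplet surface realization; the paper instead trains a
    # generator and filters its outputs with discriminator feedback.
    aspect, opinion, _ = triplets[0]
    return template.replace("[ASPECT]", aspect).replace("[OPINION]", opinion)

augmented = [
    (realize(s["triplets"], t["template"]), s["triplets"])
    for s, t in itertools.product(samples, samples)
]
for sentence, labels in augmented:
    print(sentence, labels)  # 2x2 label-template combinations
```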

2018

ISCLAB at SemEval-2018 Task 1: UIR-Miner for Affect in Tweets
Meng Li | Zhenyuan Dong | Zhihao Fan | Kongming Meng | Jinghua Cao | Guanqi Ding | Yuhan Liu | Jiawei Shan | Binyang Li
Proceedings of the 12th International Workshop on Semantic Evaluation

This paper presents the UIR-Miner system for the emotion and sentiment analysis evaluation on Twitter at SemEval-2018. Our system consists of four main modules: a preprocessing module, a stacking module for predicting the intensity of emotion and sentiment, an LSTM network module for multi-label classification, and a hierarchical attention network module for emotion and sentiment classification. According to the metrics of SemEval-2018, our system obtains final scores of 0.636, 0.531, 0.731, 0.708, and 0.408 on the 5 subtasks, respectively.
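
For the intensity-prediction subtasks, the stacking idea can be sketched with scikit-learn stand-ins (the actual system used its own feature pipeline plus LSTM and attention networks; the estimators and toy data below are illustrative only):

```python
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVR

# Toy tweets with emotion-intensity labels in [0, 1].
tweets = [
    "I am so happy today!", "This is terrible...", "What a great game!",
    "Feeling a bit sad.", "Absolutely furious right now.", "Meh, it was okay.",
]
intensity = [0.9, 0.2, 0.8, 0.4, 0.95, 0.5]

stack = StackingRegressor(
    estimators=[("svr", SVR()), ("rf", RandomForestRegressor(n_estimators=50))],
    final_estimator=Ridge(),
    cv=2,  # tiny toy dataset, so keep the cross-fitting folds small
)
model = make_pipeline(TfidfVectorizer(), stack)
model.fit(tweets, intensity)
print(model.predict(["so so happy"]))
```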