Liangying Shao
2026
M2PO: Multi-Perspective Multi-Pair Preference Optimization for Machine Translation
Hao Wang | Linlong Xu | Heng Liu | Yangyang Liu | Xiaohu Zhao | Bo Zeng | Liangying Shao | Yichen Dong | Xinwei Wu | Jiang Zhou | Tianyu Dong | Xiangxiang Zeng | Longyue Wang | Weihua Luo
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Aligning Large Language Models (LLMs) to human preferences is pivotal for Machine Translation (MT), yet current approaches are often hindered by misleading reward signals. Our analysis reveals that prevailing Quality Estimation (QE) models exhibit a systematic blind spot towards **partial errors**—specifically partial hallucinations and omissions—often favoring superficially fluent but unfaithful translations. To address this, we propose **M2PO** (**M**ulti-Perspective **M**ulti-Pair **P**reference **O**ptimization), a data-centric framework for preference optimization in machine translation. First, to correct the bias towards fluency, M2PO uses a multi-perspective alignment mechanism that decouples semantic fidelity from fluency, prioritizing faithfulness via a curriculum strategy. Second, with the bias corrected, partial errors fall between perfect and severely incorrect translations, making them inefficient to learn via standard best-versus-worst comparisons. We thus introduce a multi-pair objective that leverages the full candidate list to capture these fine-grained error signals. Experiments on WMT23, WMT24, and FLORES-200 show that M2PO enables a 9B model to outperform leading open-source baselines and achieve parity with proprietary models like GPT-4o and Gemini-2.0-Flash, demonstrating significant potential for efficient, high-fidelity LLM-based translation.
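The contrast between best-versus-worst comparisons and a multi-pair objective in the abstract above can be illustrated with a small sketch. The abstract does not give the form of the objective, so this uses a generic DPO-style pairwise loss averaged over every ordered pair in a scored candidate list. The function name `multi_pair_loss`, its parameters, and the value of `beta` are hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations


def multi_pair_loss(scores, logp_policy, logp_ref, beta=0.1):
    """Average a DPO-style loss over every (better, worse) candidate pair.

    scores      -- quality score per candidate (higher is better); assumed given
    logp_policy -- log-probability of each candidate under the model being trained
    logp_ref    -- log-probability of each candidate under a frozen reference model
    beta        -- scaling factor for the implicit reward (illustrative value)
    """
    # Sort candidate indices from best to worst score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    losses = []
    # combinations() preserves order, so `a` is always ranked at or above `b`.
    for a, b in combinations(order, 2):
        if scores[a] == scores[b]:
            continue  # a tie expresses no preference
        margin = beta * (
            (logp_policy[a] - logp_ref[a]) - (logp_policy[b] - logp_ref[b])
        )
        # -log(sigmoid(margin)) written as log(1 + exp(-margin))
        losses.append(math.log1p(math.exp(-margin)))
    return sum(losses) / len(losses) if losses else 0.0


# Three candidates: a faithful translation, one with a partial error, one severe error.
scores = [0.9, 0.6, 0.2]
reference = [-10.0, -10.0, -10.0]
untrained = multi_pair_loss(scores, reference, reference)
improved = multi_pair_loss(scores, [-8.0, -10.0, -12.0], reference)
```

With three candidates this averages over three pairs, so the middle candidate (the partial error) contributes to two comparisons, whereas a best-versus-worst scheme would ignore it entirely.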
2025
Marco-o1 v2: Towards Widening The Distillation Bottleneck for Reasoning Models
Huifeng Yin | Yu Zhao | Minghao Wu | Xuanfan Ni | Bo Zeng | Hao Wang | Tianqi Shi | Liangying Shao | Chenyang Lyu | Longyue Wang | Weihua Luo | Kaifu Zhang
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Large Reasoning Models (LRMs) such as OpenAI o1 and DeepSeek-R1 have shown remarkable reasoning capabilities by scaling test-time compute and generating long Chain-of-Thought (CoT). Distillation post-training on LRM-generated data is a straightforward yet effective method to enhance the reasoning abilities of smaller models, but it faces a critical bottleneck: we found that distilled long CoT data is difficult for small models to learn from and leads to the inheritance of biases (i.e., formalistic long-time thinking) when using Supervised Fine-tuning (SFT) and Reinforcement Learning (RL) methods. To alleviate this bottleneck, we propose constructing data from scratch using Monte Carlo Tree Search (MCTS). We then exploit a set of CoT-aware approaches, including Thoughts Length Balance, Fine-grained DPO, and Joint Post-training Objective, to enhance SFT and RL on the MCTS data. We evaluated on various benchmarks covering math (GSM8K, MATH, AIME), instruction-following (Multi-IF), and planning (Blocksworld). The results demonstrate that our CoT-aware approaches substantially improve the reasoning performance of distilled models compared to standard distilled models by reducing hallucinations in long-time thinking.
2024
Multi-Level Cross-Modal Alignment for Speech Relation Extraction
Liang Zhang | Zhen Yang | Biao Fu | Ziyao Lu | Liangying Shao | Shiyu Liu | Fandong Meng | Jie Zhou | Xiaoli Wang | Jinsong Su
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Speech Relation Extraction (SpeechRE) aims to extract relation triplets from speech data. However, existing studies usually use synthetic speech to train and evaluate SpeechRE models, hindering the further development of SpeechRE due to the disparity between synthetic and real speech. Meanwhile, the modality gap issue, unexplored in SpeechRE, limits the performance of existing models. In this paper, we construct two real SpeechRE datasets to facilitate subsequent research and propose a Multi-level Cross-modal Alignment Model (MCAM) for SpeechRE. Our model consists of three components: 1) a speech encoder, extracting speech features from the input speech; 2) an alignment adapter, mapping these speech features into a suitable semantic space for the text decoder; and 3) a text decoder, autoregressively generating relation triplets based on the speech features. During training, we additionally introduce a text encoder to serve as a semantic bridge between the speech encoder and the text decoder, and then train the alignment adapter to align the output features of the speech and text encoders at multiple levels. In this way, we can effectively train the alignment adapter to bridge the modality gap between the speech encoder and the text decoder. Experimental results and in-depth analysis on our datasets strongly demonstrate the efficacy of our method.
One2Set + Large Language Model: Best Partners for Keyphrase Generation
Liangying Shao | Liang Zhang | Minlong Peng | Guoqi Ma | Hao Yue | Mingming Sun | Jinsong Su
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Keyphrase generation (KPG) aims to automatically generate a collection of phrases representing the core concepts of a given document. The dominant paradigms in KPG include one2seq and one2set. Recently, there has been increasing interest in applying large language models (LLMs) to KPG. Our preliminary experiments reveal that it is challenging for a single model to excel in both recall and precision. Further analysis shows that: 1) the one2set paradigm has the advantage of high recall, but suffers from improper assignments of supervision signals during training; 2) LLMs are powerful in keyphrase selection, but existing selection methods often make redundant selections. Given these observations, we introduce a generate-then-select framework that decomposes KPG into two steps: we adopt a one2set-based model as the generator to produce candidates and then use an LLM as the selector to select keyphrases from these candidates. In particular, we make two important improvements to our generator and selector: 1) we design an Optimal Transport-based assignment strategy to address the above improper assignments; 2) we model keyphrase selection as a sequence labeling task to alleviate redundant selections. Experimental results on multiple benchmark datasets show that our framework significantly surpasses state-of-the-art models, especially in absent keyphrase prediction.
2023
A Sequence-to-Sequence&Set Model for Text-to-Table Generation
Tong Li | Zhihao Wang | Liangying Shao | Xuling Zheng | Xiaoli Wang | Jinsong Su
Findings of the Association for Computational Linguistics: ACL 2023
Recently, the text-to-table generation task has attracted increasing attention due to its wide applications. The dominant model formalizes this task as a sequence-to-sequence generation task and serializes each table into a token sequence during training by concatenating all rows in a top-down order. However, it suffers from two serious defects: 1) the predefined order introduces a wrong bias during training, which heavily penalizes shifts in the order between rows; 2) the error propagation problem becomes serious when the model outputs a long token sequence. In this paper, we first conduct a preliminary study to demonstrate that the generation of most rows is order-insensitive. Furthermore, we propose a novel sequence-to-sequence&set text-to-table generation model. Specifically, in addition to a text encoder encoding the input text, our model is equipped with a table header generator to first output a table header, i.e., the first row of the table, in the manner of sequence generation. Then we use a table body generator with learnable row embeddings and column embeddings to generate a set of table body rows in parallel. In particular, to deal with the issue that there is no correspondence between each generated table body row and its target during training, we propose a target assignment strategy based on bipartite matching between the first cells of generated table body rows and targets. Experimental results show that our model significantly surpasses the baselines, achieving state-of-the-art performance on commonly used datasets.
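The target assignment step in the abstract above pairs each generated table body row with one target row via bipartite matching. A minimal sketch of that idea follows, assuming a square cost matrix in which `cost[i][j]` is some mismatch measure between the first cell of generated row `i` and the first cell of target row `j`. The cost definition, the function name `assign_targets`, and the brute-force search are illustrative assumptions; the abstract does not say which solver the authors use, and a polynomial-time method such as the Hungarian algorithm would be the usual choice for larger tables.

```python
from itertools import permutations


def assign_targets(cost):
    """Return a list `match` where generated row i is assigned target row match[i].

    Brute force over all permutations, so only suitable for a handful of rows.
    """
    n = len(cost)
    best = min(
        permutations(range(n)),
        key=lambda perm: sum(cost[i][perm[i]] for i in range(n)),
    )
    return list(best)


# Hypothetical mismatch costs between 3 generated rows and 3 target rows.
cost = [
    [0.9, 0.1, 0.8],
    [0.2, 0.7, 0.9],
    [0.8, 0.9, 0.3],
]
match = assign_targets(cost)  # generated row 0 -> target 1, row 1 -> target 0, row 2 -> target 2
```

Because the assignment is recomputed from the costs rather than fixed by position, a generated row is not penalized for appearing in a different order than its target, which is the order-insensitivity the abstract argues for.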
Co-authors
- Jinsong Su 3
- Weihua Luo 2
- Longyue Wang 2
- Xiaoli Wang 2
- Bo Zeng 2
- Liang Zhang 2
- Tianyu Dong 1
- Yichen Dong 1
- Biao Fu (付彪) 1
- Tong Li 1
- Heng Liu 1
- Shiyu Liu 1
- Yangyang Liu 1
- Ziyao Lu 1
- Chenyang Lyu 1
- Guoqi Ma 1
- Fandong Meng 1
- Xuanfan Ni 1
- Minlong Peng 1
- Tianqi Shi 1
- Mingming Sun 1
- Hao Wang 1
- Hao Wang 1
- Zhihao Wang 1
- Minghao Wu 1
- Xinwei Wu 1
- Linlong Xu 1
- Zhen Yang 1
- Huifeng Yin 1
- Hao Yue 1
- Xiangxiang Zeng 1
- Kaifu Zhang 1
- Xiaohu Zhao 1
- Yu Zhao 1
- Xuling Zheng 1
- Jiang Zhou 1
- Jie Zhou 1