2025
Cool-Fusion: Fuse Large Language Models without Training
Cong Liu | Xiaojun Quan | Yan Pan | Weigang Wu | Xu Chen | Liang Lin
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
We focus on the problem of fusing two or more heterogeneous large language models (LLMs) to leverage their complementary strengths. One challenge of model fusion is its high computational cost, specifically for fine-tuning or aligning vocabularies. To address this, we propose Cool-Fusion, a simple yet effective approach that fuses the knowledge of source LLMs without requiring any training. Unlike ensemble methods, Cool-Fusion is applicable to any set of source LLMs that have different vocabularies. To overcome the vocabulary discrepancies among LLMs, we ensemble LLMs at the text level, allowing them to rerank each other's generated texts at different granularities. Extensive experiments have been conducted across a variety of benchmark datasets. On GSM8K, Cool-Fusion increases accuracy from three strong source LLMs by a significant margin of 17.4%.
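The text-level reranking idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `rerank_by_peers`, the `Scorer` type, the mean-score aggregation rule, and the toy scorers are all assumptions made for illustration. In practice each scorer would presumably wrap one source LLM judging a candidate text, which is why vocabularies never need to be aligned.

```python
from typing import Callable, Dict, List

# Hypothetical scorer type: maps (prompt, candidate_text) to a score,
# where a higher score means the model prefers that candidate.
Scorer = Callable[[str, str], float]


def rerank_by_peers(prompt: str, candidates: List[str],
                    scorers: Dict[str, Scorer]) -> str:
    """Return the candidate with the highest mean score across all scorers.

    Scoring happens on plain text, so the scorers may use entirely
    different tokenizers. Mean aggregation is an assumption of this sketch.
    """
    if not candidates or not scorers:
        raise ValueError("need at least one candidate and one scorer")

    def mean_score(text: str) -> float:
        return sum(score(prompt, text) for score in scorers.values()) / len(scorers)

    return max(candidates, key=mean_score)


# Toy stand-ins for real model scorers (assumptions, for demonstration only).
toy_scorers: Dict[str, Scorer] = {
    "prefers_short": lambda prompt, text: -float(len(text)),
    "prefers_digit_4": lambda prompt, text: float(text.count("4")),
}

best = rerank_by_peers(
    "What is 2 + 2?",
    ["2 + 2 = 4", "2 + 2 = 5", "The answer is probably four"],
    toy_scorers,
)
```

Applying the same selection to short text segments rather than whole responses would be one way to vary the granularity, again as an assumption rather than the paper's stated procedure.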
Chain of Methodologies: Scaling Test Time Computation without Training
Cong Liu | Jie Wu | Weigang Wu | Xu Chen | Liang Lin | Wei-Shi Zheng
Findings of the Association for Computational Linguistics: ACL 2025
Large Language Models (LLMs) often struggle with complex reasoning tasks due to insufficient in-depth insights in their training data, which are frequently absent in publicly available documents. This paper introduces the Chain of Methodologies (CoM), a simple and innovative iterative prompting framework designed to build structured reasoning processes by injecting human methodological insights, thereby enabling LLMs to perform long and effective reasoning for complex tasks. Assuming that LLMs possess certain metacognitive abilities, CoM leverages user-defined methodologies to stimulate the cognitive insights that LLMs have learned implicitly from training data. Experimental results indicate that CoM outperforms competitive baselines, highlighting the potential of training-free prompting methods as general solutions for complex reasoning tasks and the possibility of incorporating human-like methodological insights to bridge the gap to human-level reasoning.
2024
ChatMusician: Understanding and Generating Music Intrinsically with LLM
Ruibin Yuan | Hanfeng Lin | Yi Wang | Zeyue Tian | Shangda Wu | Tianhao Shen | Ge Zhang | Yuhang Wu | Cong Liu | Ziya Zhou | Liumeng Xue | Ziyang Ma | Qin Liu | Tianyu Zheng | Yizhi Li | Yinghao Ma | Yiming Liang | Xiaowei Chi | Ruibo Liu | Zili Wang | Chenghua Lin | Qifeng Liu | Tao Jiang | Wenhao Huang | Wenhu Chen | Jie Fu | Emmanouil Benetos | Gus Xia | Roger Dannenberg | Wei Xue | Shiyin Kang | Yike Guo
Findings of the Association for Computational Linguistics: ACL 2024
While LLMs demonstrate impressive capabilities in musical knowledge, we find that music reasoning is still an unsolved task. We introduce ChatMusician, an open-source large language model (LLM) that integrates intrinsic musical abilities. It is based on continual pre-training and fine-tuning of LLaMA2 on a text-compatible music representation, ABC notation, with music treated as a second language. ChatMusician can understand and generate music with a pure text tokenizer, without external multi-modal neural structures or tokenizers. Interestingly, endowing musical abilities does not harm language abilities, even achieving a slightly higher MMLU score. ChatMusician is capable of composing well-structured, full-length music, conditioned on texts, chords, melodies, motifs, musical forms, etc. On our meticulously curated college-level music understanding benchmark, MusicTheoryBench, ChatMusician surpasses LLaMA2 and GPT-3.5 by a noticeable margin. We show that ChatMusician preserves or even surpasses the original LLaMA2 7B's language abilities by evaluating on the MMLU benchmark. Our work reveals that LLMs can be an excellent compressor for music, which can be seen as humanity's creative language, but there remains significant territory to be conquered. We release our 5B-token music-language corpora MusicPiles, the collected MusicTheoryBench, code, model and demo.