Ho-Lam Chung

2026

LLM-Codec: Neural Audio Codec Meets Language Model Objectives
Ho-Lam Chung | Yiming Chen | Hung-yi Lee
Findings of the Association for Computational Linguistics: ACL 2026

Neural audio codecs are widely used as tokenizers for spoken language models, but they are optimized for waveform reconstruction rather than autoregressive prediction. This mismatch injects acoustically driven uncertainty into the discrete token space and increases language-model perplexity. We propose LLM-Codec, which augments codec training with language-model-facing objectives while keeping both codec and LLM architectures unchanged. LLM-Codec introduces (i) future token prediction with Medusa-style multi-step heads to encourage multi-step predictability, and (ii) semantic alignment that matches audio and text representations via a memory-bank contrastive loss. A differentiable Gumbel bridge enables end-to-end gradients from these objectives to the codec encoder. On SALMon speech coherence, token LMs trained on LLM-Codec reach 61.6% accuracy (+12.1 points over AUV) while reducing perplexity 35×. On Codec-SUPERB-tiny, LLM-Codec improves speech Mel distance by 5.0% over AUV while simultaneously achieving the learnability gains, demonstrating that reconstruction fidelity and token predictability can be improved together.

2024

The sound codec’s dual roles in minimizing data transmission latency and serving as tokenizers underscore its critical importance. Recent years have witnessed significant developments in codec models. The ideal sound codec should preserve content, paralinguistics, speakers, and audio information. However, the question of which codec achieves optimal sound information preservation remains unanswered, as in different papers, models are evaluated on their selected experimental settings. This study introduces Codec-SUPERB, an acronym for Codec sound processing Universal PERformance Benchmark. It is an ecosystem designed to assess codec models across representative sound applications and signal-level metrics rooted in sound domain knowledge. Codec-SUPERB simplifies result sharing through an online leaderboard, promoting collaboration within a community-driven benchmark database, thereby stimulating new development cycles for codecs. Furthermore, we undertake an in-depth analysis to offer insights into codec models from both application and signal perspectives, diverging from previous codec papers mainly concentrating on signal-level comparisons. Finally, we will release codes, the leaderboard, and data to accelerate progress within the community.

2022

Keyword Provision Question Generation for Facilitating Educational Reading Comprehension Preparation
Ying-Hong Chan | Ho-Lam Chung | Yao-Chung Fan
Proceedings of the 15th International Conference on Natural Language Generation

2020

A BERT-based Distractor Generation Scheme with Multi-tasking and Negative Answer Training Strategies.
Ho-Lam Chung | Ying-Hong Chan | Yao-Chung Fan
Findings of the Association for Computational Linguistics: EMNLP 2020

In this paper, we investigate two limitations of existing distractor generation (DG) methods. First, the quality of existing DG methods is still far from practical use, and there is still room for improvement. Second, existing DG designs mainly target single distractor generation, whereas practical MCQ preparation requires multiple distractors. To address these limitations, we present a new distractor generation scheme with multi-tasking and negative answer training strategies for effectively generating multiple distractors. The experimental results show that (1) our model advances the state-of-the-art result from 28.65 to 39.81 (BLEU 1 score) and (2) the generated multiple distractors are diverse and show strong distracting power for multiple-choice questions.

Venues

Findings (3)
INLG (1)