Lan Yang


2026

Existing state-of-the-art symbolic music generation models represent symbolic music as a sequence of attribute tokens with fixed unidirectional dependencies. However, from the perspective of music theory, the attributes of a musical note are inherently a set rather than a sequence. Building on this insight, we propose Amadeus, a novel symbolic music generation framework that adopts a two-level architecture: an autoregressive model for note sequences and a bidirectional discrete diffusion model for note attributes. This design enables flexible attribute control and adjustable decoding speed during inference. To further enhance sequential modeling, we introduce the Conditional Information Enhancement Module (CIEM). We also constructed AMD (Amadeus MIDI Dataset)—the largest open-source symbolic music dataset to date—supporting both pre-training and fine-tuning. We trained two models of different scales, Amadeus and Amadeus-M, and conducted extensive experiments, demonstrating substantial improvements over state-of-the-art methods across both objective and subjective metrics.

2025

Oracle Bone Script (OBS) is a vital treasure of human civilization, rich in insights from ancient societies. However, the evolution of written language over millennia complicates its decipherment. In this paper, we propose V-Oracle, an innovative framework that utilizes Large Multi-modal Models (LMMs) for interpreting OBS. V-Oracle applies principles of pictographic character formation and frames the task as a visual question-answering (VQA) problem, establishing a multi-step reasoning chain. It proposes a multi-dimensional data augmentation strategy for synthesizing high-quality OBS samples and implements multi-phase oracle alignment tuning to improve LMMs’ visual reasoning capabilities. Moreover, to bridge the evaluation gap in the OBS field, we introduce Oracle-Bench, a comprehensive benchmark that emphasizes process-oriented assessment and incorporates both standard and out-of-distribution setups for realistic evaluation. Extensive experimental results demonstrate the effectiveness of our method, which provides quantitative analyses and superior deciphering capability.