Xinyuan Song


2026

Automated assessment of patent quality is increasingly important given the growth of patent filings and the adoption of AI-assisted drafting. Existing methods often rely on modular pipelines or generic detectors, resulting in fragmented decisions and limited integration across quality dimensions. We propose P-QuASAR (Patent Quality Assurance via Structured Assessment and Refinement), a unified probabilistic framework that represents patent specifications as Quality Graphs. Multiple interdependent quality dimensions—such as regulatory compliance, technical coherence, and figure–text consistency—are jointly modeled using uncertainty-aware Quality Assessment Functions with learned edge potentials. Cross-dimensional evidence propagation via loopy belief propagation enables calibrated defect detection, while Optimal Intervention Paths translate inferred quality states into prioritized and actionable refinement recommendations. Evaluated on 500 patents across eight IPC domains against seven state-of-the-art baselines, P-QuASAR achieves substantial improvements: 99.86% balanced accuracy on regulatory compliance, 88.91% on technical coherence, and 94.70% on figure consistency, outperforming the strongest baselines by 3.0%, 9.0%, and 7.1%, respectively. Ablation studies confirm that joint graph reasoning contributes 3.66 points to average performance. When applied for refinement, P-QuASAR reduces average defects in AI-generated patents from 9.04–12.15 to 3.21 per document, surpassing human-authored patents.
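
The cross-dimensional evidence propagation mentioned in this abstract can be illustrated with a minimal sum-product loopy belief propagation sketch. It is not the P-QuASAR implementation: the function `loopy_bp`, the three binary nodes (state 1 meaning "defective"), and all potential values are hypothetical, chosen only to show how evidence of a defect in one quality dimension shifts the beliefs about neighbouring dimensions.

```python
from math import prod


def loopy_bp(unary, pairwise, iters=50, tol=1e-9):
    """Sum-product loopy belief propagation on a pairwise binary model.

    unary:    {node: [phi(0), phi(1)]}
    pairwise: {(i, j): 2x2 table indexed as table[x_i][x_j]}
    Returns approximate marginals {node: [b(0), b(1)]}.
    """
    neighbors = {n: set() for n in unary}
    psi = {}
    for (i, j), table in pairwise.items():
        neighbors[i].add(j)
        neighbors[j].add(i)
        psi[(i, j)] = table
        # Transposed copy so psi[(j, i)][x_j][x_i] == table[x_i][x_j].
        psi[(j, i)] = [[table[a][b] for a in range(2)] for b in range(2)]

    # One message per directed edge, initialised to uniform.
    msgs = {(i, j): [0.5, 0.5] for i in neighbors for j in neighbors[i]}

    for _ in range(iters):
        new = {}
        for (i, j) in msgs:
            out = []
            for xj in range(2):
                total = 0.0
                for xi in range(2):
                    # Product of messages into i from every neighbour except j.
                    incoming = prod(
                        msgs[(k, i)][xi] for k in neighbors[i] if k != j
                    )
                    total += unary[i][xi] * psi[(i, j)][xi][xj] * incoming
                out.append(total)
            z = sum(out)
            new[(i, j)] = [v / z for v in out]
        delta = max(abs(new[e][s] - msgs[e][s]) for e in msgs for s in range(2))
        msgs = new
        if delta < tol:
            break

    beliefs = {}
    for i in unary:
        b = [
            unary[i][x] * prod(msgs[(k, i)][x] for k in neighbors[i])
            for x in range(2)
        ]
        z = sum(b)
        beliefs[i] = [v / z for v in b]
    return beliefs


# Hypothetical toy graph: a fully connected triangle of quality dimensions.
attract = [[2.0, 1.0], [1.0, 2.0]]  # neighbouring nodes tend to agree
unary = {
    "regulatory_compliance": [0.5, 0.5],  # no local evidence
    "technical_coherence": [0.5, 0.5],    # no local evidence
    "figure_consistency": [0.2, 0.8],     # local evidence of a defect
}
pairwise = {
    ("regulatory_compliance", "technical_coherence"): attract,
    ("technical_coherence", "figure_consistency"): attract,
    ("regulatory_compliance", "figure_consistency"): attract,
}
beliefs = loopy_bp(unary, pairwise)
```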

Neural speech codecs provide discrete representations for speech language models, but emotional cues are often degraded during quantization. Existing codecs mainly optimize acoustic reconstruction, leaving emotional expressiveness insufficiently modeled at the representation level. We propose an emotion-guided neural speech codec that explicitly preserves emotional information while maintaining semantic fidelity and prosodic naturalness. Our framework combines emotion–semantic guided latent modulation, relation-preserving emotional–semantic distillation, and emotion-weighted semantic alignment to retain emotionally salient cues under compression. Extensive evaluations across speech reconstruction, emotion recognition, and downstream text-to-speech generation demonstrate improved emotion consistency and perceptual quality without sacrificing content accuracy.

2025

Large language models (LLMs) have recently shown strong potential in recommendation tasks thanks to their broad world knowledge and reasoning capabilities. However, applying them to serendipity-oriented recommendation remains challenging for two reasons: LLMs face a domain gap when modeling personalized user behavior, and labeled serendipitous interactions are scarce. In this paper, we introduce **SOLAR** (**S**erendipity-**O**ptimized **L**anguage model **A**ligned for **R**ecommendation), a two-stage framework that addresses both challenges. To alleviate label scarcity, we adopt a weak supervision strategy: a sequential ID-based recommender generates candidate items, which are then reranked by an LLM acting as a preference judge to produce serendipity-aware pseudo-labels. To bridge the domain gap, we propose a domain-adaptive instruction tuning method (SUN) that aligns LLMs with recommendation tasks. Experiments on three real-world datasets show that **SOLAR** consistently improves both accuracy and serendipity over strong baselines, enabling more diverse, user-centric recommendations. Code and dataset are released at [https://github.com/SOLAR2025ARR/SOLAR](https://github.com/SOLAR2025ARR/SOLAR).
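
The weak supervision step described in this abstract (candidate generation followed by judge-based reranking) can be sketched as follows. This is an illustration under stated assumptions, not the released SOLAR code: `build_pseudo_labels`, `toy_recommender`, `toy_judge`, the toy catalog, and the rule that the top-ranked candidates become positive labels are all hypothetical. A real pipeline would plug in a trained sequential recommender and an LLM call in place of the two stubs.

```python
from typing import Callable, List, Sequence, Tuple


def build_pseudo_labels(
    history: Sequence[str],
    generate_candidates: Callable[[Sequence[str], int], List[str]],
    judge_score: Callable[[Sequence[str], str], float],
    num_candidates: int = 20,
    num_positive: int = 5,
) -> List[Tuple[str, int]]:
    """Rerank recommender candidates with a judge and emit (item, label) pairs.

    Assumption: the top `num_positive` items after reranking are labeled 1
    and the rest 0. The abstract does not specify the labeling rule.
    """
    candidates = generate_candidates(history, num_candidates)
    # sorted() is stable, so ties keep the recommender's original order.
    ranked = sorted(
        candidates, key=lambda item: judge_score(history, item), reverse=True
    )
    return [
        (item, 1 if rank < num_positive else 0)
        for rank, item in enumerate(ranked)
    ]


# Hypothetical stand-ins for the two models.
CATALOG = {"a": "rock", "b": "rock", "c": "jazz", "d": "folk", "e": "rock"}


def toy_recommender(history: Sequence[str], k: int) -> List[str]:
    """Stub for a sequential ID-based recommender: unseen items, catalog order."""
    return [item for item in CATALOG if item not in history][:k]


def toy_judge(history: Sequence[str], item: str) -> float:
    """Stub for the LLM judge: rewards genres the user has not seen yet."""
    seen_genres = {CATALOG[h] for h in history}
    return 0.0 if CATALOG[item] in seen_genres else 1.0


labels = build_pseudo_labels(
    ["a"], toy_recommender, toy_judge, num_candidates=4, num_positive=2
)
```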