Yu Zou

Also published as:


2026

Understanding financial documents is critical for high-stakes decision-making yet hindered by systemic semantic implicitness: key facts are rarely explicit in surface text and often determined by global structural cues. Missing these cues invites semantic misinterpretations, such as misreading what a number refers to, an outcome unacceptable in high-stakes environments. However, existing Retrieval-Augmented Generation (RAG) systems typically treat structure as a physical navigational skeleton rather than intrinsic semantic knowledge. To address this, we introduce Fin-STAR (Financial STructure-As-Semantics Retrieval), a framework redefining hierarchy as intrinsic semantics. Fin-STAR incorporates a novel Structure-Enriched Semantic Indexing mechanism that augments the hierarchical lineage with snippet-derived virtual nodes, and injects this enriched context via a semantic cross-attention paradigm, rendering implicit cues explicit. By grounding evidence within its structural scope, we preserve factual invariance and ensure contextual integrity. Addressing the lack of granular public datasets, we conduct experiments on FinTierQA Gold, a curated expert benchmark. Results show that Fin-STAR outperforms state-of-the-art hierarchical and graph-based baselines across diverse query complexities, document types, and markets. Notably, ablations confirm that our semantic injection consistently outperforms alternative strategies. Finally, we release FinTierQA, comprising 3.9M pairs automatically constructed from 78k documents via our framework .

2020

Seq2seq神经网络模型在中英文文本摘要的研究中取得了良好的效果,但在低资源语言的文本摘要研究还处于探索阶段,尤其是在藏语中。此外,目前还没有大规模的标注语料库进行摘要提取。本文提出了一种生成藏文新闻摘要的统一模型。利用TextRank算法解决了藏语标注训练数据不足的问题。然后,采用两层双GRU神经网络提取代表原始新闻的句子,减少冗余信息。最后,使用基于注意力机制的Seq2Seq来生成理解式摘要。同时,我们加入了指针网络来处理未登录词的问题。实验结果表明,ROUGE-1评分比传统模型提高了2%。 关键词:文本摘要;藏文;TextRank; 指针网络;Bi-GRU

2010

2006