Che Lin
2026
NASH: Numerically Aware Scoring Heuristic for Robust Semantic Similarity
Yu-Shiang Huang | Yun-Yu Lee | Tzu-Hsin Chou | Che Lin | Chuan-Ju Wang
Findings of the Association for Computational Linguistics: ACL 2026
Numerical precision is critical in financial NLP, yet embedding-based semantic similarity metrics exhibit numerical blindness: they fail to distinguish contradictory values within similar contexts. We introduce NASH (Numerically Aware Scoring Heuristic), a model-agnostic metric that decouples numerical verification from textual semantic evaluation through a three-stage pipeline: (1) modal separation via numeric masking, (2) dual-channel similarity estimation through masked-text similarity and context-aware numeric alignment, and (3) IDF-weighted aggregation. NASH functions as a drop-in enhancement to existing embedding-based metrics. Validated on our proposed NumFinE financial numerical evaluation benchmark and established semantic similarity datasets (STS-B, Financial-STS), NASH achieves substantial improvements in numerical sensitivity (up to +159.6% on listwise ranking) while preserving general semantic performance, establishing a reliable standard for numeracy-aware evaluation.
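The general idea of separating numbers from text before scoring can be illustrated with a minimal sketch. This is not the NASH implementation: the function names, the `[NUM]` placeholder, the bag-of-words cosine (a stand-in for an embedding-based metric), the position-wise numeric comparison, and the fixed `numeric_weight` (a stand-in for the IDF-weighted aggregation, whose details the abstract does not give) are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical pattern: unsigned integers and decimals only (no thousands separators).
NUM_RE = re.compile(r"\d+(?:\.\d+)?")


def split_modalities(text):
    """Replace each number with a [NUM] placeholder and return (masked_text, numbers)."""
    numbers = [float(m) for m in NUM_RE.findall(text)]
    masked = NUM_RE.sub("[NUM]", text)
    return masked, numbers


def text_similarity(a, b):
    """Bag-of-words cosine similarity; an assumed stand-in for an embedding metric."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def numeric_similarity(xs, ys):
    """Position-wise agreement: 1.0 for identical values, lower as they diverge."""
    if not xs and not ys:
        return 1.0
    if len(xs) != len(ys):
        return 0.0
    scores = []
    for x, y in zip(xs, ys):
        denom = max(abs(x), abs(y))
        scores.append(1.0 if denom == 0 else 1.0 - abs(x - y) / denom)
    return sum(scores) / len(scores)


def combined_score(a, b, numeric_weight=0.5):
    """Mix the two channels with a fixed weight (an assumption, not IDF weighting)."""
    masked_a, nums_a = split_modalities(a)
    masked_b, nums_b = split_modalities(b)
    text = text_similarity(masked_a, masked_b)
    if not nums_a and not nums_b:
        return text  # nothing numeric to verify
    numeric = numeric_similarity(nums_a, nums_b)
    return (1 - numeric_weight) * text + numeric_weight * numeric


same = combined_score("Revenue rose 5% in 2023", "Revenue rose 5% in 2023")
diff = combined_score("Revenue rose 5% in 2023", "Revenue rose 50% in 2023")
print(same > diff)
```

In this toy setup the two sentences are identical once masked, so the text channel alone cannot tell them apart; the drop in the combined score comes entirely from the numeric channel.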
2023
A Compare-and-contrast Multistage Pipeline for Uncovering Financial Signals in Financial Reports
Jia-Huei Ju | Yu-Shiang Huang | Cheng-Wei Lin | Che Lin | Chuan-Ju Wang
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
In this paper, we address the challenge of discovering financial signals in narrative financial reports. As these documents are often lengthy and tend to blend routine information with new information, it is challenging for professionals to discern critical financial signals. To this end, we leverage the inherent nature of the year-to-year structure of reports to define a novel signal-highlighting task; more importantly, we propose a compare-and-contrast multistage pipeline that recognizes different relationships between the reports and locates relevant rationales for these relationships. We also create and publicly release a human-annotated dataset for our task. Our experiments on the dataset validate the effectiveness of our pipeline, and we provide detailed analyses and ablation studies to support our findings.