Jia-Huei Ju

Also published as: Jia-huei Ju

2025

SERVAL: Surprisingly Effective Zero-Shot Visual Document Retrieval Powered by Large Vision and Language Models
Thong Nguyen | Yibin Lei | Jia-Huei Ju | Andrew Yates
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

Visual Document Retrieval (VDR) typically operates as text-to-image retrieval using specialized bi-encoders trained to directly embed document images. We revisit a zero-shot generate-and-encode pipeline: a vision–language model first produces a detailed textual description of each document image, which is then embedded by a standard text encoder. On the ViDoRe-v2 benchmark, the method reaches 63.4% nDCG@5, surpassing the strongest specialized multi-vector visual document encoder, and it scales similarly on MIRACL-VISION with broader multilingual coverage. Analysis shows that modern vision–language models capture complex textual and visual cues with sufficient granularity to act as a reusable semantic proxy. By off-loading modality alignment to pretrained vision–language models, our approach removes the need for computationally intensive text-image contrastive training and establishes a strong zero-shot baseline for future VDR systems.
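
For readers unfamiliar with the metric quoted in the abstract above, nDCG@5 can be sketched roughly as follows. This is a minimal illustration and not the paper's evaluation code: the function names, the linear gain, and the sample relevance labels are assumptions, and the ideal ranking is computed from the labels of the ranked list itself rather than from a full set of judgments.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top-k items (linear gain, log2 discount)."""
    # rank is 0-based, so the discount for the first item is log2(2) = 1.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances: Sequence[float], k: int) -> float:
    """DCG@k divided by the DCG@k of the same labels sorted in ideal order."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal == 0:
        return 0.0  # no relevant item in the list (assumed convention)
    return dcg_at_k(relevances, k) / ideal


# Hypothetical relevance labels for one query, in the order the system ranked them.
ranked_labels = [0, 1, 0, 0, 1]
score = ndcg_at_k(ranked_labels, k=5)
```

A benchmark score such as the one reported would be the mean of this per-query value over all queries, under whatever gain and judgment conventions the benchmark defines.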

Controlled Retrieval-augmented Context Evaluation for Long-form RAG
Jia-Huei Ju | Suzan Verberne | Maarten de Rijke | Andrew Yates
Findings of the Association for Computational Linguistics: EMNLP 2025

Retrieval-augmented generation (RAG) enhances large language models by incorporating context retrieved from external knowledge sources. While the effectiveness of the retrieval module is typically evaluated with relevance-based ranking metrics, such metrics may be insufficient to reflect the retrieval’s impact on the final RAG result, especially in long-form generation scenarios. We argue that providing a comprehensive retrieval-augmented context is important for long-form RAG tasks like report generation and propose metrics for assessing the context independent of generation. We introduce CRUX, a Controlled Retrieval-aUgmented conteXt evaluation framework designed to directly assess retrieval-augmented contexts. This framework uses human-written summaries to control the information scope of knowledge, enabling us to measure how well the context covers information essential for long-form generation. CRUX uses question-based evaluation to assess RAG’s retrieval in a fine-grained manner. Empirical results show that CRUX offers more reflective and diagnostic evaluation. Our findings also reveal substantial room for improvement in current retrieval methods, pointing to promising directions for advancing RAG’s retrieval. Our data and code are publicly available to support and advance future research on retrieval for RAG. GitHub: https://github.com/DylanJoo/crux

2024

Relevance-aware Diverse Query Generation for Out-of-domain Text Ranking
Jia-Huei Ju | Huck Chao-Han Yang | Szu-Wei Fu | Ming-Feng Tsai | Chuan-Ju Wang
Proceedings of the 9th Workshop on Representation Learning for NLP (RepL4NLP-2024)

Domain adaptation presents significant challenges for out-of-domain text ranking, especially when supervised data is limited. In this paper, we present ReadQG (Relevance-Aware Diverse Query Generation), a method to generate informative synthetic queries to facilitate the adaptation process of text ranking models. Unlike previous approaches focusing solely on relevant query generation, our ReadQG generates diverse queries with continuous relevance scores. Specifically, we propose leveraging soft-prompt tuning and diverse generation objectives to control query generation according to the given relevance. Our experiments show that integrating negative queries into the learning process enhances the effectiveness of text ranking models in out-of-domain information retrieval (IR) benchmarks. Furthermore, we measure the quality of query generation, highlighting the underlying beneficial characteristics of negative queries. Our empirical results and analysis also shed light on potential directions for more advanced data augmentation in IR. The data and code have been released.

2023

A Compare-and-contrast Multistage Pipeline for Uncovering Financial Signals in Financial Reports
Jia-Huei Ju | Yu-Shiang Huang | Cheng-Wei Lin | Che Lin | Chuan-Ju Wang
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

In this paper, we address the challenge of discovering financial signals in narrative financial reports. As these documents are often lengthy and tend to blend routine information with new information, it is challenging for professionals to discern critical financial signals. To this end, we leverage the inherent nature of the year-to-year structure of reports to define a novel signal-highlighting task; more importantly, we propose a compare-and-contrast multistage pipeline that recognizes different relationships between the reports and locates relevant rationales for these relationships. We also create and publicly release a human-annotated dataset for our task. Our experiments on the dataset validate the effectiveness of our pipeline, and we provide detailed analyses and ablation studies to support our findings.

FISH: A Financial Interactive System for Signal Highlighting
Ta-wei Huang | Jia-huei Ju | Yu-shiang Huang | Cheng-wei Lin | Yi-shyuan Chiang | Chuan-ju Wang
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations

In this system demonstration, we seek to streamline the process of reviewing financial statements and provide insightful information for practitioners. We develop FISH, an interactive system that extracts and highlights crucial textual signals from financial statements efficiently and precisely. To achieve our goal, we integrate pre-trained BERT representations and a fine-tuned BERT highlighting model with a newly proposed two-stage classify-then-highlight pipeline. We also conduct a human evaluation, showing that FISH can provide accurate financial signals. FISH overcomes the limitations of existing research and, more importantly, benefits both academics and practitioners in finance, as they can leverage state-of-the-art contextualized language models with their newly gained insights. The system is available online at https://fish-web-fish.de.r.appspot.com/, and a short introductory video is at https://youtu.be/ZbvZQ09i6aw.