Heinz Koeppl
2026
CoQuIR: A Comprehensive Benchmark for Code Quality-Aware Information Retrieval
Jiahui Geng | Fengyu Cai | Shaobo Cui | Qing Li | Liangwei Chen | Chenyang Lyu | Haonan Li | Derui Zhu | Alexander Pretschner | Heinz Koeppl | Fakhri Karray
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Code retrieval is vital to modern software engineering as it boosts reuse and speeds up debugging. However, current benchmarks primarily emphasize functional relevance while neglecting code quality. To address this gap, we introduce CoQuIR, the first large-scale, multilingual benchmark specifically designed to evaluate quality-aware code retrieval across four critical dimensions: correctness, efficiency, security, and maintainability. CoQuIR includes fine-grained quality annotations over 42,725 queries and 134,907 code snippets in 11 programming languages. Evaluating 23 retrievers (both open-source and proprietary) shows that even state-of-the-art models often fail to separate buggy or insecure code from robust counterparts. We further investigate methods for explicitly training retrievers to recognize code quality, demonstrating that quality-aware metrics can be improved without loss of semantic relevance; downstream code generation benefits from these gains. CoQuIR underscores the importance of embedding quality signals into retrieval systems as a crucial component for more trustworthy developer tools.
2024
A Survey of Confidence Estimation and Calibration in Large Language Models
Jiahui Geng | Fengyu Cai | Yuxia Wang | Heinz Koeppl | Preslav Nakov | Iryna Gurevych
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Large language models (LLMs) have demonstrated remarkable capabilities across a wide range of tasks in various domains. Despite their impressive performance, they can be unreliable due to factual errors in their generations. Assessing their confidence and calibrating them across different tasks can help mitigate risks and enable LLMs to produce better generations. There has been a lot of recent research aiming to address this, but there has been no comprehensive overview to organize it and to outline the main lessons learned. The present survey aims to bridge this gap. In particular, we outline the challenges and we summarize recent technical advancements for LLM confidence estimation and calibration. We further discuss their applications and suggest promising directions for future work.
GeoHard: Towards Measuring Class-wise Hardness through Modelling Class Semantics
Fengyu Cai | Xinran Zhao | Hongming Zhang | Iryna Gurevych | Heinz Koeppl
Findings of the Association for Computational Linguistics: ACL 2024
MixGR: Enhancing Retriever Generalization for Scientific Domain through Complementary Granularity
Fengyu Cai | Xinran Zhao | Tong Chen | Sihao Chen | Hongming Zhang | Iryna Gurevych | Heinz Koeppl
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
2023
ECOLA: Enhancing Temporal Knowledge Embeddings with Contextualized Language Representations
Zhen Han | Ruotong Liao | Jindong Gu | Yao Zhang | Zifeng Ding | Yujia Gu | Heinz Koeppl | Hinrich Schütze | Volker Tresp
Findings of the Association for Computational Linguistics: ACL 2023
Since conventional knowledge embedding models cannot take full advantage of the abundant textual information, there have been extensive research efforts in enhancing knowledge embedding using texts. However, existing enhancement approaches cannot apply to temporal knowledge graphs (tKGs), which contain time-dependent event knowledge with complex temporal dynamics. Specifically, existing enhancement approaches often assume knowledge embedding is time-independent. In contrast, the entity embedding in tKG models usually evolves, which poses the challenge of aligning temporally relevant texts with entities. To this end, we propose to study enhancing temporal knowledge embedding with textual data in this paper. As an approach to this task, we propose Enhanced Temporal Knowledge Embeddings with Contextualized Language Representations (ECOLA), which takes the temporal aspect into account and injects textual information into temporal knowledge embedding. To evaluate ECOLA, we introduce three new datasets for training and evaluating ECOLA. Extensive experiments show that ECOLA significantly enhances temporal KG embedding models with up to 287% relative improvements regarding Hits@1 on the link prediction task. The code and models are publicly available on https://github.com/mayhugotong/ECOLA.