2025
When retrieval outperforms generation: Dense evidence retrieval for scalable fake news detection
Alamgir Munir Qazi | John P. McCrae | Jamal Nasir
Proceedings of the 5th Conference on Language, Data and Knowledge
The proliferation of misinformation necessitates robust yet computationally efficient fact verification systems. While current state-of-the-art approaches leverage Large Language Models (LLMs) for generating explanatory rationales, these methods face significant computational barriers and hallucination risks in real-world deployments. We present DeReC (Dense Retrieval Classification), a lightweight framework that demonstrates how general-purpose text embeddings can effectively replace autoregressive LLM-based approaches in fact verification tasks. By combining dense retrieval with specialized classification, our system achieves better accuracy while being significantly more efficient. DeReC outperforms explanation-generating LLMs in efficiency, reducing runtime by 95% on RAWFC (23 minutes 36 seconds compared to 454 minutes 12 seconds) and by 92% on LIAR-RAW (134 minutes 14 seconds compared to 1692 minutes 23 seconds), showcasing its effectiveness across varying dataset sizes. On the RAWFC dataset, DeReC achieves an F1 score of 65.58%, surpassing the state-of-the-art method L-Defense (61.20%). Our results demonstrate that carefully engineered retrieval-based systems can match or exceed LLM performance in specialized tasks while being significantly more practical for real-world deployment.
2020
Recent Developments for the Linguistic Linked Open Data Infrastructure
Thierry Declerck | John McCrae | Matthias Hartung | Jorge Gracia | Christian Chiarcos | Elena Montiel | Philipp Cimiano | Artem Revenko | Roser Saurí | Deirdre Lee | Stefania Racioppa | Jamal Nasir | Matthias Orlikowski | Marta Lanau-Coronas | Christian Fäth | Mariano Rico | Mohammad Fazleh Elahi | Maria Khvalchik | Meritxell Gonzalez | Katharine Cooney
Proceedings of the Twelfth Language Resources and Evaluation Conference
In this paper we describe the contributions made by the European H2020 project “Prêt-à-LLOD” (‘Ready-to-use Multilingual Linked Language Data for Knowledge Services across Sectors’) to the further development of the Linguistic Linked Open Data (LLOD) infrastructure. Prêt-à-LLOD aims to develop a new methodology for building data value chains applicable to a wide range of sectors and applications and based around language resources and language technologies that can be integrated by means of semantic technologies. We describe the methods implemented for increasing the number of language data sets in the LLOD. We also present the approach for ensuring interoperability and for porting LLOD data sets and services to other infrastructures, as well as the project's contribution to existing standards.
2014
An Off-the-shelf Approach to Authorship Attribution
Jamal A. Nasir | Nico Görnitz | Ulf Brefeld
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers
Learning to Summarise Related Sentences
Emmanouil Tzouridis | Jamal Nasir | Ulf Brefeld
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers