Tijl De Bie
Also published as: Tijl de Bie, Tijl De Bie
2025
Building Data-Driven Occupation Taxonomies: A Bottom-Up Multi-Stage Approach via Semantic Clustering and Multi-Agent Collaboration
Nan Li | Bo Kang | Tijl De Bie
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track
Creating robust occupation taxonomies, vital for applications ranging from job recommendation to labor market intelligence, is challenging. Manual curation is slow, while existing automated methods are either not adaptive to dynamic regional markets (top-down) or struggle to build coherent hierarchies from noisy data (bottom-up). We introduce CLIMB (CLusterIng-based Multi-agent taxonomy Builder), a framework that fully automates the creation of high-quality, data-driven taxonomies from raw job postings. CLIMB uses global semantic clustering to distill core occupations, then employs a reflection-based multi-agent system to iteratively build a coherent hierarchy. On three diverse, real-world datasets, we show that CLIMB produces taxonomies that are more coherent and scalable than existing methods and successfully capture unique regional characteristics. We release our code and datasets at https://github.com/aida-ugent/CLIMB.
Human-AI Moral Judgment Congruence on Real-World Scenarios: A Cross-Lingual Analysis
Nan Li | Bo Kang | Tijl De Bie
Proceedings of the 9th Widening NLP Workshop
As Large Language Models (LLMs) are deployed in every aspect of our lives, understanding how they reason about moral issues becomes critical for AI safety. We investigate this using a dataset we curated from Reddit’s r/AmItheAsshole, comprising real-world moral dilemmas with crowd-sourced verdicts. Through experiments on five state-of-the-art LLMs across 847 posts, we find a significant and systematic divergence where LLMs are more lenient than humans. Moreover, we find that translating the posts into another language changes LLMs’ verdicts, indicating their judgments lack cross-lingual stability.
2024
TopoLedgerBERT: Topological Learning of Ledger Description Embeddings using Siamese BERT-Networks.
Sander Noels | Sébastien Viaene | Tijl De Bie
Proceedings of the Eighth Financial Technology and Natural Language Processing and the 1st Agent AI for Scenario Planning
2009
Learning to translate: a statistical and computational analysis
Marco Turchi | Tijl de Bie | Nello Cristianini
Proceedings of the 13th Annual conference of the European Association for Machine Translation