2022
pdf
abs
How Can a Teacher Make Learning From Sparse Data Softer? Application to Business Relation Extraction
Hadjer Khaldi
|
Farah Benamara
|
Camille Pradel
|
Nathalie Aussenac-Gilles
Proceedings of the Fourth Workshop on Financial Technology and Natural Language Processing (FinNLP)
Business Relation Extraction between market entities is a challenging information extraction task that suffers from data imbalance due to the over-representation of negative relations (also known as No-relation or Others) compared to positive relations that corresponds to the taxonomy of relations of interest. This paper proposes a novel solution to tackle this problem, relying on binary soft labels supervision generated by an approach based on knowledge distillation. When evaluated on a business relation extraction dataset, the results suggest that the proposed approach improves the overall performance, beating state-of-the art solutions for data imbalance. In particular, it improves the extraction of under-represented relations as well as the detection of false negatives.
pdf
abs
How’s Business Going Worldwide ? A Multilingual Annotated Corpus for Business Relation Extraction
Hadjer Khaldi
|
Farah Benamara
|
Camille Pradel
|
Grégoire Sigel
|
Nathalie Aussenac-Gilles
Proceedings of the Thirteenth Language Resources and Evaluation Conference
The business world has changed due to the 21st century economy, where borders have melted and trades became free. Nowadays,competition is no longer only at the local market level but also at the global level. In this context, the World Wide Web has become a major source of information for companies and professionals to keep track of their complex, rapidly changing, and competitive business environment. A lot of effort is nonetheless needed to collect and analyze this information due to information overload problem and the huge number of web pages to process and analyze. In this paper, we propose the BizRel resource, the first multilingual (French,English, Spanish, and Chinese) dataset for automatic extraction of binary business relations involving organizations from the web.This dataset is used to train several monolingual and cross-lingual deep learning models to detect these relations in texts. Our results are encouraging, demonstrating the effectiveness of such a resource for both research and business communities. In particular, we believe multilingual business relation extraction systems are crucial tools for decision makers to identify links between specific market stakeholders and build business networks which enable to anticipate changes and discover new threats or opportunities. Our work is therefore an important direction toward such tools.
2020
pdf
abs
Classification de relations pour l’intelligence économique et concurrentielle (Relation Classification for Competitive and Economic Intelligence )
Hadjer Khaldi
|
Amine Abdaoui
|
Farah Benamara
|
Grégoire Sigel
|
Nathalie Aussenac-Gilles
Actes de la 6e conférence conjointe Journées d'Études sur la Parole (JEP, 33e édition), Traitement Automatique des Langues Naturelles (TALN, 27e édition), Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RÉCITAL, 22e édition). Volume 2 : Traitement Automatique des Langues Naturelles
L’extraction de relations reliant des entités par des liens sémantiques à partir de texte a fait l’objet de nombreux travaux visant à extraire des relations génériques comme l’hyperonymie ou spécifiques comme des relations entre gènes et protéines. Dans cet article, nous nous intéressons aux relations économiques entre deux entités nommées de type organisation à partir de textes issus du web. Ce type de relation, encore peu étudié dans la littérature, a pour but l’identification des liens entre les acteurs d’un secteur d’activité afin d’analyser leurs écosystèmes économiques. Nous présentons B IZ R EL, le premier corpus français annoté en relations économiques, ainsi qu’une approche supervisée à base de différentes architectures neuronales pour la classification de ces relations. L’évaluation de ces modèles montre des résultats très encourageants, ce qui est un premier pas vers l’intelligence économique et concurrentielle à partir de textes pour le français.