Knowledge extraction from aeronautical messages (NOTAMs) with self-supervised language models for aircraft pilots

Alexandre Arnold, Fares Ernez, Catherine Kobus, Marion-Cécile Martin


Abstract
During their pre-flight briefings, aircraft pilots must analyse a long list of NOTAMs (NOtice To AirMen) indicating potential hazards along the flight route, sometimes running to pages for long-haul flights. NOTAM free-text fields typically use a highly specialised phrasing, with many acronyms and much domain-specific vocabulary, which differs significantly from standard English. In this paper, we pretrain language models derived from BERT on about 1 million unlabeled NOTAMs and reuse the learnt representations on three downstream tasks valuable for pilots: criticality prediction, named entity recognition and translation into a structured language called Airlang. This self-supervised approach, in which smaller amounts of labeled data are enough for task-specific fine-tuning, is well suited to the aeronautical context, since expert annotations are expensive and time-consuming to obtain. We present evaluation scores across the tasks showing high potential for operational use of such models (by pilots, airlines or service providers), which, to the best of our knowledge, is a first.
Anthology ID:
2022.naacl-industry.22
Volume:
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Track
Month:
July
Year:
2022
Address:
Hybrid: Seattle, Washington + Online
Editors:
Anastassia Loukina, Rashmi Gangadharaiah, Bonan Min
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
188–196
URL:
https://aclanthology.org/2022.naacl-industry.22
DOI:
10.18653/v1/2022.naacl-industry.22
Cite (ACL):
Alexandre Arnold, Fares Ernez, Catherine Kobus, and Marion-Cécile Martin. 2022. Knowledge extraction from aeronautical messages (NOTAMs) with self-supervised language models for aircraft pilots. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Track, pages 188–196, Hybrid: Seattle, Washington + Online. Association for Computational Linguistics.
Cite (Informal):
Knowledge extraction from aeronautical messages (NOTAMs) with self-supervised language models for aircraft pilots (Arnold et al., NAACL 2022)
PDF:
https://preview.aclanthology.org/nschneid-patch-4/2022.naacl-industry.22.pdf
Video:
https://preview.aclanthology.org/nschneid-patch-4/2022.naacl-industry.22.mp4