Knowledge extraction from aeronautical messages (NOTAMs) with self-supervised language models for aircraft pilots
Alexandre Arnold, Fares Ernez, Catherine Kobus, Marion-Cécile Martin
Abstract
During their pre-flight briefings, aircraft pilots must analyse a long list of NOTAMs (NOtice To AirMen) indicating potential hazards along the flight route, a list that sometimes runs to pages for long-haul flights. NOTAM free-text fields typically have a very special phrasing, with many acronyms and much domain-specific vocabulary, which makes them differ significantly from standard English. In this paper, we pretrain language models derived from BERT on circa 1 million unlabeled NOTAMs and reuse the learnt representations on three downstream tasks valuable for pilots: criticality prediction, named entity recognition and translation into a structured language called Airlang. This self-supervised approach, where smaller amounts of labeled data are enough for task-specific fine-tuning, is well suited to the aeronautical context, since expert annotations are expensive and time-consuming. We present evaluation scores across the tasks showing high potential for operational use of such models (by pilots, airlines or service providers), which is a first to the best of our knowledge.
- Anthology ID:
- 2022.naacl-industry.22
- Volume:
- Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Track
- Month:
- July
- Year:
- 2022
- Address:
- Hybrid: Seattle, Washington + Online
- Editors:
- Anastassia Loukina, Rashmi Gangadharaiah, Bonan Min
- Venue:
- NAACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 188–196
- URL:
- https://aclanthology.org/2022.naacl-industry.22
- DOI:
- 10.18653/v1/2022.naacl-industry.22
- Cite (ACL):
- Alexandre Arnold, Fares Ernez, Catherine Kobus, and Marion-Cécile Martin. 2022. Knowledge extraction from aeronautical messages (NOTAMs) with self-supervised language models for aircraft pilots. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Track, pages 188–196, Hybrid: Seattle, Washington + Online. Association for Computational Linguistics.
- Cite (Informal):
- Knowledge extraction from aeronautical messages (NOTAMs) with self-supervised language models for aircraft pilots (Arnold et al., NAACL 2022)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/2022.naacl-industry.22.pdf