A Data-Driven Semi-Automatic Framenet Development Methodology
Shafqat Mumtaz Virk, Dana Dannélls, Lars Borin, Markus Forsberg
Abstract
FrameNet is a lexical semantic resource based on the linguistic theory of frame semantics. A number of framenet development strategies have been reported previously and all of them involve exploration of corpora and a fair amount of manual work. Despite previous efforts, there does not exist a well-thought-out automatic/semi-automatic methodology for frame construction. In this paper we propose a data-driven methodology for identification and semi-automatic construction of frames. As a proof of concept, we report on our initial attempts to build a wider-scale framenet for the legal domain (LawFN) using the proposed methodology. The constructed frames are stored in a lexical database and together with the annotated example sentences they have been made available through a web interface.
- Anthology ID:
- 2021.ranlp-1.165
- Volume:
- Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)
- Month:
- September
- Year:
- 2021
- Address:
- Held Online
- Editors:
- Ruslan Mitkov, Galia Angelova
- Venue:
- RANLP
- Publisher:
- INCOMA Ltd.
- Pages:
- 1471–1479
- URL:
- https://aclanthology.org/2021.ranlp-1.165
- Cite (ACL):
- Shafqat Mumtaz Virk, Dana Dannélls, Lars Borin, and Markus Forsberg. 2021. A Data-Driven Semi-Automatic Framenet Development Methodology. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021), pages 1471–1479, Held Online. INCOMA Ltd.
- Cite (Informal):
- A Data-Driven Semi-Automatic Framenet Development Methodology (Virk et al., RANLP 2021)
- PDF:
- https://preview.aclanthology.org/emnlp22-frontmatter/2021.ranlp-1.165.pdf
- Data
- FrameNet