A Data-Driven Semi-Automatic Framenet Development Methodology

Shafqat Mumtaz Virk, Dana Dannélls, Lars Borin, Markus Forsberg


Abstract
FrameNet is a lexical semantic resource based on the linguistic theory of frame semantics. A number of framenet development strategies have been reported previously and all of them involve exploration of corpora and a fair amount of manual work. Despite previous efforts, there does not exist a well-thought-out automatic/semi-automatic methodology for frame construction. In this paper we propose a data-driven methodology for identification and semi-automatic construction of frames. As a proof of concept, we report on our initial attempts to build a wider-scale framenet for the legal domain (LawFN) using the proposed methodology. The constructed frames are stored in a lexical database and together with the annotated example sentences they have been made available through a web interface.
Anthology ID:
2021.ranlp-1.165
Volume:
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)
Month:
September
Year:
2021
Address:
Held Online
Venue:
RANLP
Publisher:
INCOMA Ltd.
Pages:
1471–1479
URL:
https://aclanthology.org/2021.ranlp-1.165
Cite (ACL):
Shafqat Mumtaz Virk, Dana Dannélls, Lars Borin, and Markus Forsberg. 2021. A Data-Driven Semi-Automatic Framenet Development Methodology. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021), pages 1471–1479, Held Online. INCOMA Ltd.
Cite (Informal):
A Data-Driven Semi-Automatic Framenet Development Methodology (Virk et al., RANLP 2021)
PDF:
https://preview.aclanthology.org/update-css-js/2021.ranlp-1.165.pdf
Data
FrameNet