GRAIL: Generalized Representation and Aggregation of Information Layers

Sameer Pradhan, Mark Liberman


Abstract
This paper identifies novel characteristics necessary to successfully represent multiple streams of natural language information from speech and text simultaneously, and proposes a multi-tiered system that implements these characteristics centered around a declarative configuration. The system facilitates easy incremental extension by allowing the creation of composable workflows of loosely coupled extensions, or plugins, so that simple initial systems can be extended to accommodate rich representations while maintaining high data integrity. Key to this is leveraging established tools and technologies. We demonstrate the system using a small example.
Anthology ID:
2022.law-1.20
Volume:
Proceedings of the 16th Linguistic Annotation Workshop (LAW-XVI) within LREC2022
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Sameer Pradhan, Sandra Kuebler
Venue:
LAW
SIG:
SIGANN
Publisher:
European Language Resources Association
Pages:
170–181
URL:
https://preview.aclanthology.org/build-pipeline-with-new-library/2022.law-1.20/
Cite (ACL):
Sameer Pradhan and Mark Liberman. 2022. GRAIL—Generalized Representation and Aggregation of Information Layers. In Proceedings of the 16th Linguistic Annotation Workshop (LAW-XVI) within LREC2022, pages 170–181, Marseille, France. European Language Resources Association.
Cite (Informal):
GRAIL—Generalized Representation and Aggregation of Information Layers (Pradhan & Liberman, LAW 2022)
PDF:
https://preview.aclanthology.org/build-pipeline-with-new-library/2022.law-1.20.pdf
Data
Penn Treebank