@inproceedings{gandhi-etal-2024-challenges,
    title = "Challenges in End-to-End Policy Extraction from Climate Action Plans",
    author = "Gandhi, Nupoor  and
      Corringham, Tom  and
      Strubell, Emma",
    editor = "Stammbach, Dominik  and
      Ni, Jingwei  and
      Schimanski, Tobias  and
      Dutia, Kalyan  and
      Singh, Alok  and
      Bingler, Julia  and
      Christiaen, Christophe  and
      Kushwaha, Neetu  and
      Muccione, Veruska  and
      A. Vaghefi, Saeid  and
      Leippold, Markus",
    booktitle = "Proceedings of the 1st Workshop on Natural Language Processing Meets Climate Change (ClimateNLP 2024)",
    month = aug,
    year = "2024",
    address = "Bangkok, Thailand",
    publisher = "Association for Computational Linguistics",
    url = "https://preview.aclanthology.org/ingest-emnlp/2024.climatenlp-1.12/",
    doi = "10.18653/v1/2024.climatenlp-1.12",
    pages = "156--167",
    abstract = "Gray policy literature such as climate action plans (CAPs) provides an information-rich resource with potential to inform analysis and decision-making. However, these corpora are currently underutilized due to the substantial manual effort and expertise required to sift through long and detailed documents. Automatically structuring relevant information using information extraction (IE) would be useful for assisting policy scientists in synthesizing vast gray policy corpora to identify relevant entities, concepts and themes. LLMs have demonstrated strong performance on IE tasks in the few-shot setting, but it is unclear whether these gains transfer to gray policy literature, which differs significantly from traditional benchmark datasets in several aspects, such as format of information content, length of documents, and inconsistency of document structure. We perform a case study on end-to-end IE with California CAPs, inspecting the performance of state-of-the-art tools for: (1) extracting content from CAPs into structured markup segments; (2) few-shot IE with LLMs; and (3) the utility of extracted entities for downstream analyses. We identify challenges at several points of the end-to-end IE pipeline for CAPs, and we provide recommendations for open problems centered around representing rich non-textual elements, document structure, flexible annotation schemes, and global information. Tackling these challenges would make it possible to realize the potential of LLMs for IE with gray policy literature."
}