Ella Hofmann-Coyle
2026
PExA: Parallel Exploration Agent for Complex Text-to-SQL
Tanmay Parekh | Ella Hofmann-Coyle | Shuyi Wang | Sachith Sri Ram Kothur | Srivas Prasad | Yunmo Chen
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Tanmay Parekh | Ella Hofmann-Coyle | Shuyi Wang | Sachith Sri Ram Kothur | Srivas Prasad | Yunmo Chen
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
LLM-based agents for text-to-SQL often struggle with latency-performance trade-off, where performance improvements come at the cost of latency or vice versa. We reformulate text-to-SQL generation within the lens of software test coverage where the original query is prepared with a suite of test cases with simpler, atomic SQLs that are executed in parallel and together ensure semantic coverage of the original query. After iterating on test case coverage, the final SQL is generated only when enough information is gathered, leveraging the explored test case SQLs to ground the final generation. We validated our framework on a state-of-the-art benchmark for text-to-SQL, Spider 2.0, achieving a new state-of-the-art with 70.2% execution accuracy.
2023
EntSUMv2: Dataset, Models and Evaluation for More Abstractive Entity-Centric Summarization
Dhruv Mehra | Lingjue Xie | Ella Hofmann-Coyle | Mayank Kulkarni | Daniel Preotiuc-Pietro
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Dhruv Mehra | Lingjue Xie | Ella Hofmann-Coyle | Mayank Kulkarni | Daniel Preotiuc-Pietro
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Entity-centric summarization is a form of controllable summarization that aims to generate a summary for a specific entity given a document. Concise summaries are valuable in various real-life applications, as they enable users to quickly grasp the main points of the document focusing on an entity of interest. This paper presents ENTSUMV2, a more abstractive version of the original entity-centric ENTSUM summarization dataset. In ENTSUMV2 the annotated summaries are intentionally made shorter to benefit more specific and useful entity-centric summaries for downstream users. We conduct extensive experiments on this dataset using multiple abstractive summarization approaches that employ supervised fine-tuning or large-scale instruction tuning. Additionally, we perform comprehensive human evaluation that incorporates metrics for measuring crucial facets. These metrics provide a more fine-grained interpretation of the current state-of-the-art systems and highlight areas for future improvement.
2022
Extractive Entity-Centric Summarization as Sentence Selection using Bi-Encoders
Ella Hofmann-Coyle | Mayank Kulkarni | Lingjue Xie | Mounica Maddela | Daniel Preotiuc-Pietro
Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)
Ella Hofmann-Coyle | Mayank Kulkarni | Lingjue Xie | Mounica Maddela | Daniel Preotiuc-Pietro
Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)
Entity-centric summarization is a type of controllable summarization that aims to produce a summary of a document that is specific to a given target entity. Extractive summaries possess multiple advantages over abstractive ones such as preserving factuality and can be directly used in downstream tasks like target-based sentiment analysis or incorporated into search applications. In this paper, we explore methods to solve this task by recasting it as a sentence selection task, as supported by the EntSUM data set. We use methods inspired by information retrieval, where the input to the model is a pair representing a sentence from the original document and the target entity, in place of the query. We explore different architecture variants and loss functions in this framework with results showing an up to 5.8 F1 improvement over past state-of-the-art and outperforming the competitive entity-centric Lead 3 heuristic by 1.1 F1. In addition, we also demonstrate similarly strong results on the related task of salient sentence selection for an entity.