Comparing LLM-generated and human-authored news text using formal syntactic theory

Olga Zamaraeva, Dan Flickinger, Francis Bond, Carlos Gómez-Rodríguez


Abstract
This study provides the first comprehensive comparison of New York Times-style text generated by six large language models against real, human-authored NYT writing. The comparison is based on a formal syntactic theory. We use Head-driven Phrase Structure Grammar (HPSG) to analyze the grammatical structure of the texts. We then investigate and illustrate the differences in the distributions of HPSG grammar types, revealing systematic distinctions between human and LLM-generated writing. These findings contribute to a deeper understanding of the syntactic behavior of both LLMs and humans within the NYT genre.
Anthology ID:
2025.acl-long.443
Volume:
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
9041–9060
URL:
https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.443/
Cite (ACL):
Olga Zamaraeva, Dan Flickinger, Francis Bond, and Carlos Gómez-Rodríguez. 2025. Comparing LLM-generated and human-authored news text using formal syntactic theory. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 9041–9060, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Comparing LLM-generated and human-authored news text using formal syntactic theory (Zamaraeva et al., ACL 2025)
PDF:
https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.443.pdf