ChatGPT as a Java Decompiler

Bradley Mcdanel; Zhanhao Liu

Abstract

We propose a novel approach using instruction-tuned large language models (LLMs), such as ChatGPT, to automatically decompile entire Java classes. Our method relies only on a textual representation of the Java bytecode and corresponding unit tests generated from the bytecode. While no additional domain knowledge or fine-tuning is performed, we provide a single training example of this decompilation process in the model’s prompt. To overcome both compilation errors and test failures, we use an iterative prompting approach. We find that ChatGPT-4 is able to generate more human-readable output than existing software-based decompilers while achieving slightly lower pass rates on unit tests. Source code and datasets are available at https://github.com/BradMcDanel/gpt-java-decompiler.
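The iterative prompting loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the class name `IterativeDecompileSketch`, the `decompile` method, the `model` and `checker` parameters, and the round limit are all hypothetical. The model is abstracted as a function from bytecode text and prior feedback to candidate source, and the checker stands in for compiling the candidate and running the unit tests. The single in-prompt example mentioned in the abstract is not modeled here.

```java
import java.util.Optional;
import java.util.function.BiFunction;
import java.util.function.Function;

public class IterativeDecompileSketch {

    /**
     * Repeatedly asks a model for Java source until the checker accepts it.
     *
     * @param bytecodeText textual form of the bytecode (assumed input format)
     * @param model        hypothetical stand-in for the LLM: (bytecode, feedback) -> candidate source
     * @param checker      hypothetical stand-in for "compile and run unit tests":
     *                     returns an error message, or empty if everything passes
     * @param maxRounds    assumed cap on the number of prompts
     * @return the first accepted candidate, or empty if none passed within maxRounds
     */
    static Optional<String> decompile(String bytecodeText,
                                      BiFunction<String, String, String> model,
                                      Function<String, Optional<String>> checker,
                                      int maxRounds) {
        String feedback = ""; // no feedback on the first prompt
        for (int round = 0; round < maxRounds; round++) {
            String candidate = model.apply(bytecodeText, feedback);
            Optional<String> error = checker.apply(candidate);
            if (!error.isPresent()) {
                return Optional.of(candidate); // compiled and passed the tests
            }
            feedback = error.get(); // feed the failure back into the next prompt
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Stub model: returns broken source first, then a fixed version once it sees feedback.
        BiFunction<String, String, String> stubModel = (bytecode, feedback) ->
                feedback.isEmpty()
                        ? "class A { int f() { return 42 } }"
                        : "class A { int f() { return 42; } }";

        // Stub checker: pretends that a missing semicolon is a compilation error.
        Function<String, Optional<String>> stubChecker = source ->
                source.contains("return 42;")
                        ? Optional.<String>empty()
                        : Optional.of("error: ';' expected");

        Optional<String> result = decompile("<bytecode text>", stubModel, stubChecker, 3);
        System.out.println(result.orElse("no passing candidate found"));
    }
}
```

In a real pipeline the checker would invoke a Java compiler and a test runner, and the feedback string would carry the compiler or test output back into the next prompt.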

Anthology ID: 2023.gem-1.19
Volume: Proceedings of the Third Workshop on Natural Language Generation, Evaluation, and Metrics (GEM)
Month: December
Year: 2023
Address: Singapore
Editors: Sebastian Gehrmann, Alex Wang, João Sedoc, Elizabeth Clark, Kaustubh Dhole, Khyathi Raghavi Chandu, Enrico Santus, Hooman Sedghamiz
Venues: GEM | WS
Publisher: Association for Computational Linguistics
Pages: 224–232
URL: https://aclanthology.org/2023.gem-1.19
Cite (ACL): Bradley Mcdanel and Zhanhao Liu. 2023. ChatGPT as a Java Decompiler. In Proceedings of the Third Workshop on Natural Language Generation, Evaluation, and Metrics (GEM), pages 224–232, Singapore. Association for Computational Linguistics.
Cite (Informal): ChatGPT as a Java Decompiler (Mcdanel & Liu, GEM-WS 2023)
PDF: https://preview.aclanthology.org/nschneid-patch-5/2023.gem-1.19.pdf
