SoftRegex: Generating Regex from Natural Language Descriptions using Softened Regex Equivalence

Jun-U Park, Sang-Ki Ko, Marco Cognetta, Yo-Sub Han


Abstract
We continue the study of generating semantically correct regular expressions from natural language descriptions (NL). The current state-of-the-art model SemRegex produces regular expressions from NLs by rewarding the reinforced learning based on the semantic (rather than syntactic) equivalence between two regular expressions. Since the regular expression equivalence problem is PSPACE-complete, we introduce the EQ_Reg model for computing the similarity of two regular expressions using deep neural networks. Our EQ_Reg model essentially softens the equivalence of two regular expressions when used as a reward function. We then propose a new regex generation model, SoftRegex, using the EQ_Reg model, and empirically demonstrate that SoftRegex substantially reduces the training time (by a factor of at least 3.6) and produces state-of-the-art results on three benchmark datasets.
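To make the idea of a "softened" equivalence concrete, the following is a minimal Python sketch, not the EQ_Reg model itself. The abstract states that EQ_Reg uses deep neural networks; this sketch instead substitutes a simple brute-force stand-in that scores two regular expressions by the fraction of short strings on which they agree. The function names `strings_up_to` and `soft_equivalence`, the alphabet, and the length bound are all hypothetical choices made for illustration.

```python
import re
from itertools import product


def strings_up_to(alphabet, max_len):
    """Yield every string over `alphabet` with length 0..max_len."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            yield "".join(chars)


def soft_equivalence(regex_a, regex_b, alphabet="ab", max_len=6):
    """Return the fraction of short strings that both regexes treat alike.

    A score of 1.0 means the two regexes agree on every tested string
    (which does not prove equivalence); lower scores mean they disagree
    on some strings. This is an illustrative stand-in, not a neural model.
    """
    pattern_a = re.compile(regex_a)
    pattern_b = re.compile(regex_b)
    total = 0
    agree = 0
    for s in strings_up_to(alphabet, max_len):
        total += 1
        accepts_a = pattern_a.fullmatch(s) is not None
        accepts_b = pattern_b.fullmatch(s) is not None
        if accepts_a == accepts_b:
            agree += 1
    return agree / total


# Syntactically different, but they agree on every tested string.
same = soft_equivalence("(a|b)*", "(a*b*)*")

# Nearly equivalent: they differ only on the empty string.
close = soft_equivalence("a*", "a+")
```

Under these assumptions, a graded score like `close` gives a nearly correct regex partial credit, whereas an exact equivalence check would return only true or false. That graded signal is the general motivation for a softened reward, although the paper's own scoring function may behave differently.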
Anthology ID:
D19-1677
Volume:
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
Month:
November
Year:
2019
Address:
Hong Kong, China
Venues:
EMNLP | IJCNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
6425–6431
URL:
https://aclanthology.org/D19-1677
DOI:
10.18653/v1/D19-1677
Cite (ACL):
Jun-U Park, Sang-Ki Ko, Marco Cognetta, and Yo-Sub Han. 2019. SoftRegex: Generating Regex from Natural Language Descriptions using Softened Regex Equivalence. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 6425–6431, Hong Kong, China. Association for Computational Linguistics.
Cite (Informal):
SoftRegex: Generating Regex from Natural Language Descriptions using Softened Regex Equivalence (Park et al., EMNLP-IJCNLP 2019)
PDF:
https://preview.aclanthology.org/ingestion-script-update/D19-1677.pdf