Challenges in Technical Regulatory Text Variation Detection
Shriya Vaagdevi Chikati, Samuel Larkin, David Minicola, Chi-kiu Lo
Abstract
We present a preliminary study on the feasibility of using current natural language processing techniques to detect variations between the construction codes of different jurisdictions. We formulate the task as a sentence alignment problem and evaluate various sentence representation models for their performance in this task. Our results show that task-specific trained embeddings perform marginally better than other models, but the overall accuracy remains a challenge. We also show that domain-specific fine-tuning hurts the task performance. The results highlight the challenges of developing NLP applications for technical regulatory texts.
- Anthology ID:
- 2025.regnlp-1.2
- Volume:
- Proceedings of the 1st Regulatory NLP Workshop (RegNLP 2025)
- Month:
- January
- Year:
- 2025
- Address:
- Abu Dhabi, UAE
- Editors:
- Tuba Gokhan, Kexin Wang, Iryna Gurevych, Ted Briscoe
- Venues:
- RegNLP | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 5–9
- URL:
- https://preview.aclanthology.org/fix-sig-urls/2025.regnlp-1.2/
- Cite (ACL):
- Shriya Vaagdevi Chikati, Samuel Larkin, David Minicola, and Chi-kiu Lo. 2025. Challenges in Technical Regulatory Text Variation Detection. In Proceedings of the 1st Regulatory NLP Workshop (RegNLP 2025), pages 5–9, Abu Dhabi, UAE. Association for Computational Linguistics.
- Cite (Informal):
- Challenges in Technical Regulatory Text Variation Detection (Chikati et al., RegNLP 2025)
- PDF:
- https://preview.aclanthology.org/fix-sig-urls/2025.regnlp-1.2.pdf