Rules Ruling Neural Networks - Neural vs. Rule-Based Grammar Checking for a Low Resource Language
Linda Wiechetek, Flammie Pirinen, Mika Hämäläinen, Chiara Argese
Abstract
We investigate both rule-based and machine learning methods for the task of compound error correction and evaluate their efficiency for North Sámi, a low resource language. The lack of error-free data needed for a neural approach is a challenge to the development of these tools, which is not shared by bigger languages. In order to compensate for that, we used a rule-based grammar checker to remove erroneous sentences and insert compound errors by splitting correct compounds. We describe how we set up the error detection rules, and how we train a bi-RNN based neural network. The precision of the rule-based model tested on a corpus with real errors (81.0%) is slightly better than the neural model (79.4%). The rule-based model is also more flexible with regard to fixing specific errors requested by the user community. However, the neural model has a better recall (98%). The results suggest that an approach that combines the advantages of both models would be desirable in the future. Our tools and data sets are open-source and freely available on GitHub and Zenodo.
- Anthology ID:
- 2021.ranlp-1.171
- Volume:
- Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)
- Month:
- September
- Year:
- 2021
- Address:
- Held Online
- Venue:
- RANLP
- Publisher:
- INCOMA Ltd.
- Pages:
- 1526–1535
- URL:
- https://aclanthology.org/2021.ranlp-1.171
- Cite (ACL):
- Linda Wiechetek, Flammie Pirinen, Mika Hämäläinen, and Chiara Argese. 2021. Rules Ruling Neural Networks - Neural vs. Rule-Based Grammar Checking for a Low Resource Language. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021), pages 1526–1535, Held Online. INCOMA Ltd.
- Cite (Informal):
- Rules Ruling Neural Networks - Neural vs. Rule-Based Grammar Checking for a Low Resource Language (Wiechetek et al., RANLP 2021)
- PDF:
- https://preview.aclanthology.org/nodalida-main-page/2021.ranlp-1.171.pdf