- Anthology ID: 2025.clicit-1.104
- Volume: Proceedings of the Eleventh Italian Conference on Computational Linguistics (CLiC-it 2025)
- Month: September
- Year: 2025
- Address: Cagliari, Italy
- Editors: Cristina Bosco, Elisabetta Jezek, Marco Polignano, Manuela Sanguinetti
- Venue: CLiC-it
- SIG:
- Publisher: CEUR Workshop Proceedings
- Note:
- Pages: 1102–1111
- Language:
- URL: https://preview.aclanthology.org/info-author-pages/2025.clicit-1.104/
- DOI:
- Cite (ACL): Fabio Tamburini. 2025. Curated Data Does Not Mean Representative Data When Training Large Language Models: An Experiment Using Representative Data for Italian. In Proceedings of the Eleventh Italian Conference on Computational Linguistics (CLiC-it 2025), pages 1102–1111, Cagliari, Italy. CEUR Workshop Proceedings.
- Cite (Informal): Curated Data Does Not Mean Representative Data When Training Large Language Models: An Experiment Using Representative Data for Italian (Tamburini, CLiC-it 2025)
- PDF: https://preview.aclanthology.org/info-author-pages/2025.clicit-1.104.pdf