Myths about Writing Systems in Speech & Language Technology

Kyle Gorman, Richard Sproat


Abstract
Natural language processing is largely focused on written text processing. However, many computational linguists tacitly endorse myths about the nature of writing. We highlight two of these myths—the conflation of language and writing, and the notion that Chinese, Japanese, and Korean writing is ideographic—and suggest how the community can dispel them.
Anthology ID:
2023.cawl-1.1
Volume:
Proceedings of the Workshop on Computation and Written Language (CAWL 2023)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Kyle Gorman, Richard Sproat, Brian Roark
Venue:
CAWL
Publisher:
Association for Computational Linguistics
Pages:
1–5
URL:
https://aclanthology.org/2023.cawl-1.1
DOI:
10.18653/v1/2023.cawl-1.1
Cite (ACL):
Kyle Gorman and Richard Sproat. 2023. Myths about Writing Systems in Speech & Language Technology. In Proceedings of the Workshop on Computation and Written Language (CAWL 2023), pages 1–5, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
Myths about Writing Systems in Speech & Language Technology (Gorman & Sproat, CAWL 2023)
PDF:
https://preview.aclanthology.org/naacl24-info/2023.cawl-1.1.pdf