SubmissionNumber#=%=#2
FinalPaperTitle#=%=#How Well Do Tweets Represent Sub-Dialects of Egyptian Arabic?
ShortPaperTitle#=%=#
NumberOfPages#=%=#15
CopyrightSigned#=%=#
JobTitle#==#
Organization#==#
Abstract#==#How well does naturally-occurring digital text, such as tweets, represent sub-dialects of Egyptian Arabic (EA)? This paper focuses on two EA sub-dialects: Cairene Egyptian Arabic (CEA) and Sa'idi Egyptian Arabic (SEA). We use morphological markers from ground-truth dialect surveys as a distance measure across four geo-referenced datasets. Results show that CEA markers are prevalent as expected in CEA geo-referenced tweets, while SEA markers are limited across SEA geo-referenced tweets. SEA tweets instead show a prevalence of CEA markers and higher usage of Modern Standard Arabic. We conclude that corpora intended to represent sub-dialects of EA do not accurately represent sub-dialects outside of the Cairene variety. This finding calls into question the validity of relying on tweets alone to represent dialectal differences.
Author{1}{Firstname}#=%=#Mai
Author{1}{Lastname}#=%=#Mohamed Eida
Author{1}{Username}#=%=#maimm2
Author{1}{Email}#=%=#maimm2@illinois.edu
Author{1}{Affiliation}#=%=#University of Illinois Urbana-Champaign
Author{2}{Firstname}#=%=#Mayar Mohamadein
Author{2}{Lastname}#=%=#Nassar
Author{2}{Username}#=%=#mayarnassar
Author{2}{Email}#=%=#mayarmnassar@gmail.com
Author{2}{Affiliation}#=%=#Ain Shams University
Author{3}{Firstname}#=%=#Jonathan
Author{3}{Lastname}#=%=#Dunn
Author{3}{Username}#=%=#jonathandunn
Author{3}{Email}#=%=#jedunn@illinois.edu
Author{3}{Affiliation}#=%=#University of Illinois Urbana-Champaign
==========
èéáğö