msa word 0.0036812299999999997
arabic words 0.00338256
several word 0.003057983
word forms 0.0029952119999999997
word unigram 0.002903147
word lists 0.0028915269999999996
word unigrams 0.0028626909999999997
word normalizations 0.002862526
word uni 0.002859923
word concatenations 0.002859923
word segmentor 0.002859923
morphological features 0.00282464
msa words 0.0028014900000000002
arabic language 0.00262217
arabic dialect 0.00258703
dialectal words 0.002423779
dialectal arabic 0.002377339
arabic corpus 0.0022694860000000002
msa data 0.00225727
standard arabic 0.002233057
other features 0.00220597
arz words 0.002123257
possible arabic 0.002105521
multiple words 0.002096593
arabic online 0.002063699
arabic dialects 0.00205921
unique words 0.002054948
arabic tweets 0.0020252
top words 0.002004458
foreign words 0.0019949900000000003
arabic roots 0.001975844
linguistic features 0.001974241
arabic side 0.001952353
syntactic features 0.001946216
such features 0.00193707
arabic speakers 0.0019348540000000002
morphological rules 0.0018851760000000001
language model 0.001881603
dialectal data 0.0018795590000000003
training data 0.001849668
new features 0.0017802819999999998
msa training 0.0017663779999999999
morphological patterns 0.0017411870000000002
morphological variations 0.001740063
morphological constructs 0.001737019
morphological differences 0.0017260130000000002
morphological gen 0.001721356
morphological pat 0.001720172
distinguishing features 0.001719691
slex feature 0.001715584
words 0.0017145
morphological generator 0.001697099
meta features 0.0016832309999999999
morphological phenomenon 0.001678587
srule features 0.001676767
arabic 0.00166806
large msa 0.001662078
standalone feature 0.0016608869999999998
arz data 0.0015790370000000001
ing data 0.0015311740000000002
commentary data 0.001488046
language models 0.001482527
language identification 0.001476476
baseline system 0.0014663010000000001
training set 0.001464973
segmented data 0.001444628
msa tweets 0.0014441299999999998
dialect identification 0.001441336
identification system 0.0014247700000000001
morphological 0.00141309
features 0.00141155
test set 0.001405246
domain language 0.001404512
feature 0.0013945
msa cor 0.0013762479999999998
fact msa 0.001353432
dialect detection 0.0013490009999999998
same training 0.0012947229999999998
different languages 0.001275914
language processing 0.001267686
different dialects 0.001243954
unigram model 0.0012364
natural language 0.001228497
language identi 0.001223609
dialectal sentences 0.001216499
proper dialect 0.001214871
dialectal egyptian 0.0012143940000000002
trained model 0.00120914
gram model 0.001206911
particular dialect 0.001195389
effective dialect 0.001190159
dialect identi 0.001188469
other dialects 0.00118557
egyptian training 0.001184503
large corpus 0.001176514
training dataset 0.001170552
fication system 0.001167303
evaluation set 0.001123052
training sets 0.001087357
standard dataset 0.001056161
