James Scharf


2021

pdf
A Mixed-Methods Analysis of Western and Hong Kong–based Reporting on the 2019–2020 Protests
Arya D. McCarthy | James Scharf | Giovanna Maria Dora Dore
Proceedings of the 5th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature

We apply statistical techniques from natural language processing to Western and Hong Kong–based English language newspaper articles that discuss the 2019–2020 Hong Kong protests of the Anti-Extradition Law Amendment Bill Movement. Topic modeling detects central themes of the reporting and shows the differing agendas toward one country, two systems. Embedding-based usage shift (at the word level) and sentiment analysis (at the document level) both support that Hong Kong–based reporting is more negative and more emotionally charged. A two-way test shows that while July 1, 2019 is a turning point for media portrayal, the differences between western- and Hong Kong–based reporting did not magnify when the protests began; rather, they already existed. Taken together, these findings clarify how the portrayal of activism in Hong Kong evolved throughout the Movement.

pdf
Characterizing News Portrayal of Civil Unrest in Hong Kong, 1998–2020
James Scharf | Arya D. McCarthy | Giovanna Maria Dora Dore
Proceedings of the 4th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE 2021)

We apply statistical techniques from natural language processing to a collection of Western and Hong Kong–based English-language newspaper articles spanning the years 1998–2020, studying the difference and evolution of its portrayal. We observe that both content and attitudes differ between Western and Hong Kong–based sources. ANOVA on keyword frequencies reveals that Hong Kong–based papers discuss protests and democracy less often. Topic modeling detects salient aspects of protests and shows that Hong Kong–based papers made fewer references to police violence during the Anti–Extradition Law Amendment Bill Movement. Diachronic shifts in word embedding neighborhoods reveal a shift in the characterization of salient keywords once the Movement emerged. Together, these raise questions about the existence of anodyne reporting from Hong Kong–based media. Likewise, they illustrate the importance of sample selection for protest event analysis.