2022
The Lexometer: A Shiny Application for Exploratory Analysis and Visualization of Corpus Data
Oufan Hai | Matthew Sundberg | Katherine Trice | Rebecca Friedman | Scott Grimm
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Performing even simple data science tasks with corpus data often requires significant expertise in data science and in programming languages such as R and Python. With the aim of making quantitative research more accessible to researchers in the language sciences, we present the Lexometer, a Shiny application that integrates numerous data analysis and visualization functions into an easy-to-use graphical user interface. The Lexometer's functions include filtering large databases to generate subsets of the data and variables of interest, providing a range of graphing techniques for both single- and multiple-variable analysis, and presenting the data in a table format that can be filtered further and that offers methods for cleaning the data. The Lexometer aims to be useful to language researchers with differing levels of programming expertise and to help broaden the inclusion of corpus-based empirical evidence in the language sciences.