Imagine sitting on the edge of your seat, waiting to hear whether your entire career has been judged worthy of recognition by your peers, and hoping to be accepted into a club that anoints you an immortal legend. This is what life is like for a Major League Baseball player after their career ends. They hope that over the course of their career they did enough to be inducted into the Baseball Hall of Fame in Cooperstown, NY. Over the past several weeks I have analyzed some key baseball statistics, specifically those for shortstops, to better understand what makes a shortstop Hall of Fame worthy compared to his peers. A common belief within baseball has long been that longevity and sustained success matter more than short periods of greatness when evaluating a player's chances of reaching the Hall of Fame. I put this to the test to see whether it is in fact true.
The first part of this project involved finding a reasonable data set to test my hypothesis. After several hours of searching I found a resource known as the Baseball Databank. This source has thousands of records of baseball statistics dating back to the late 1800s, including ballpark data, individual statistics, team statistics, where each player was born, colleges attended, All-Star appearances, and, most importantly, whether each individual has been inducted into the Hall of Fame. The data set did not come in one piece, so before I could begin the analysis portion of the project, a significant amount of cleaning was needed to combine and reshape the data.
The first part of manipulating this data was deciding which factors were most important for analysis. I determined that the common counting statistics, such as home runs, hits, runs batted in, walks, and games played, would all be important, but there was one major problem: all of the data I had was for individual seasons rather than careers. While individual season data would be important for analyzing what the ideal season looks like for a Hall of Fame player, the career totals of the average Hall of Fame player would be equally, if not more, important in the analysis. Using Excel's SUMIF function I was able to add up each shortstop's season totals to get his career statistics. Another issue with the data set was that it had the common counting statistics but lacked important rate statistics for a hitter, such as batting average, on-base percentage, and slugging percentage. These all play a large role in the evaluation of an individual's career, and without them proper analysis simply would not be possible. Luckily, all of the statistics needed to calculate them were present in the data set; I just needed to figure out how to apply the proper formulas to get the results. After a bit of research I found the following formulas:
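As an aside before the formulas, the career-total step that I did with SUMIF can also be sketched in a few lines of Python. This is only an illustration of the idea, not what I actually ran: the column names (`playerID`, `G`, `H`, `HR`, `RBI`, `BB`) are my assumptions about how a season-level batting table might be labeled, and the player IDs and numbers are made up.

```python
from collections import defaultdict

# Hypothetical season-level rows; the IDs and numbers are invented for illustration.
seasons = [
    {"playerID": "ss_a", "yearID": 1990, "G": 150, "H": 160, "HR": 12, "RBI": 70, "BB": 55},
    {"playerID": "ss_a", "yearID": 1991, "G": 140, "H": 150, "HR": 10, "RBI": 65, "BB": 50},
    {"playerID": "ss_b", "yearID": 1990, "G": 100, "H": 90, "HR": 3, "RBI": 30, "BB": 20},
]

# Counting statistics to total (assumed column names).
COUNTING_STATS = ("G", "H", "HR", "RBI", "BB")


def career_totals(rows, stats=COUNTING_STATS):
    """Sum each counting stat per player, like one SUMIF per player and column."""
    # Each new player starts with a fresh dict of zeros for every stat.
    totals = defaultdict(lambda: dict.fromkeys(stats, 0))
    for row in rows:
        for stat in stats:
            totals[row["playerID"]][stat] += row[stat]
    return dict(totals)


careers = career_totals(seasons)
print(careers["ss_a"])
```

Each player ends up with one row of career totals, which is the same shape the SUMIF approach produces in a spreadsheet.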
