DREW SP 2013 Kyle K
Tuesday, May 21, 2013
Senior Project Day 12
Just finished the report. Need to add a conclusion but the hard part is done. I will also add a title page and a glossary for certain terms that may be unfamiliar to certain readers.
Monday, May 20, 2013
Senior Project Day 11
Worked on the weekend and today on the actual report. It has been time consuming, yet very rewarding, as I have finished up the report for Barry Bonds and am already working on the final two players as I speak. I am doing my best to make this informal report accessible to more readers, meaning that the report focuses more on the statistics that people will recognize and be able to follow, as well as the thought process behind the reasoning and calculations.
Friday, May 17, 2013
Senior Project Day 10
Have finally decided on the approach I will take in finalizing the numbers. Will look at the averages amongst the greats, add and subtract the standard deviation, plot all of those data points on a graph, and see the range I get. Then, I will take Barry Bonds' and Sammy Sosa's actual career numbers and plot them on the same graph, seeing if their numbers fall within those ranges in their so-called "clean years." If so, then I believe it would be safe to track and project what their numbers would have been in their later years had they not taken steroids. Some stats will be a bit more noticeable than others. Obviously home runs is the biggest stat for sluggers like those two, and more emphasis will be put on the more well-known stats in the report, so as to keep it from getting any more complicated for readers who don't consider themselves baseball-savvy and might not be familiar with stats such as On-Base Percentage and Slugging Percentage.
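The approach above was done in Excel, but the same idea can be sketched in a few lines of Python. This is only a rough illustration: every number below is made up (not real stats), the names `band_for` and `within_band` are hypothetical, and the sample standard deviation is an assumption about which version of the standard deviation is meant.

```python
from statistics import mean, stdev

# Made-up home-run totals by age for a reference group (NOT real stats).
reference_hr_by_age = {
    27: [40, 35, 38, 44, 37],
    28: [42, 33, 36, 45, 39],
    29: [39, 30, 35, 40, 36],
}


def band_for(values):
    """Return (low, average, high), where low/high are average -/+ one sample SD."""
    avg = mean(values)
    sd = stdev(values)  # sample standard deviation (an assumption)
    return avg - sd, avg, avg + sd


def within_band(player_by_age, reference_by_age):
    """For each age present in both inputs, report whether the player is inside the band."""
    result = {}
    for age, value in player_by_age.items():
        if age in reference_by_age:
            low, _, high = band_for(reference_by_age[age])
            result[age] = low <= value <= high
    return result


# Made-up seasons for the player being checked (NOT real stats).
player_hr_by_age = {27: 37, 28: 41, 29: 49}

print(within_band(player_hr_by_age, reference_hr_by_age))
# {27: True, 28: True, 29: False}
```

Seasons that come back `True` sit inside the average plus-or-minus one standard deviation range; a `False` marks a season that falls outside it.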
Thursday, May 16, 2013
Senior Project Day 9
Finished up with the 5 All-Time Great players I used as a sample for the other players, and I have to say that the process has been quite surprising. Everyone peaks and declines differently. Of course, when I started this project, I didn't expect to find exact numbers for these players, since there's no real way for us to know what the final stat line would have or could have been. Yet the more I calculate, the less I want to settle for a range and the more I want to find an exact number. I know that's impossible, but it's been on my mind these past few days. After I finished up the calculations by hand, I hit a dead end. The percent declines and standard deviations didn't match up smoothly across the 5 players. Some statistics did show very similar declines: for Home Runs, Willie Mays, Stan Musial, and Mickey Mantle experienced drops of 48%, 50%, and 50%, while the On Base Percentage declines for Mantle, Musial, Hank Aaron, and Ted Williams were 13%, 12%, 14%, and 16%.
What I'm thinking of doing is averaging out the percentages and finding their own standard deviation in order to set up a range of percentage declines I can use for the steroid users. At least then, I can see where my calculations land in comparison to their actual values. Microsoft Excel has been an enormous help to me. Once I hit that dead end earlier today, I decided to see if Excel could identify anything I had missed. It did not disappoint. I took the home run averages of Mantle, Musial, Mays, Aaron, and Williams and plotted them on a graph. Then, I found the standard deviation between their actual numbers and the average. I set up data points that made additional lines representing the sum of the average and standard deviation as well as the difference, giving me 3 lines in total. Finally, I decided to plot Barry Bonds' home run totals on the graph as well, to see what his numbers looked like in comparison to the 5 legends:
As you can see, it was within the range of the averages, up until the years he supposedly started taking steroids. With this newfound information, I'm hoping to use this same analysis for the other statistics as well, and compare them to the actual numbers of Bonds and Sosa.
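For the range of percentage declines described earlier in this entry, here is a minimal Python sketch using the decline percentages quoted above. The function names `decline_range` and `project` are hypothetical, the peak average of 40 home runs is a made-up value for illustration only, and the sample standard deviation is an assumption.

```python
from statistics import mean, stdev

# Percent declines quoted in this entry.
hr_declines = [48, 50, 50]        # Mays, Musial, Mantle (Home Runs)
obp_declines = [13, 12, 14, 16]   # Mantle, Musial, Aaron, Williams (On Base Percentage)


def decline_range(declines):
    """Return (low, average, high) percent decline using one sample SD."""
    avg = mean(declines)
    sd = stdev(declines)  # sample standard deviation (an assumption)
    return avg - sd, avg, avg + sd


def project(peak_average, declines):
    """Apply the high/average/low decline to a peak-years average.

    A larger decline gives a smaller projected value, so the result is
    ordered (lowest projection, middle projection, highest projection).
    """
    low, avg, high = decline_range(declines)
    return tuple(peak_average * (1 - pct / 100) for pct in (high, avg, low))


for label, declines in (("HR", hr_declines), ("OBP", obp_declines)):
    low, avg, high = decline_range(declines)
    print(f"{label}: {low:.1f}% to {high:.1f}% (average {avg:.1f}%)")

# Made-up peak average of 40 home runs per season, for illustration only.
print(project(40, hr_declines))
```

The projection is only as good as the assumption that a player's decline would have looked like the declines of the sample players.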
On an interesting side note, while I was struggling to make a decision as to how to proceed, I decided to look on Google for any tips or pointers on forecasting or projecting a baseball player's statistics. I stumbled upon a very interesting article on BaseballProspectus, detailing the difficulties of forecasting a player's stats and what good basic steps a novice like myself could take. Yet, there was something about this article that stuck out to me. More specifically, it was the author:
Wednesday, May 15, 2013
Senior Project Day 8
Finished creating an Excel spreadsheet for each player today. The task was long and monotonous, but well worth the time. Currently trying to experiment with Excel's functions. So far, I've noticed similar numbers amongst the players' declines. Two players had nearly the same % decline in Runs, Hits, Home Runs, Runs Batted In, Batting Average, Slugging Percentage, and On Base Percentage. Got super excited to see that as I did them back-to-back, but was a bit disappointed when the third player didn't follow that same trend, although Home Runs, Batting Average, and Slugging Percentage were still close to the first two. I'm learning a bit about regression analysis from the online statbook, trying to make sense of the numbers Excel spewed out for me. So far, I've been calculating averages of "declining years," comparing them to "peak/prime years," and measuring the percent decline between the two. In addition, I've been calculating the Standard Deviation of the declining years' averages and finding out how far, percentage-wise, the sum or difference of the average and the standard deviation is from the average. My hope was that the standard deviation of the statistics would be similar across the board for all the players, but no such luck. Since the steroid users' stats increased rather than decreased during their so-called "declining years," I'm hoping that the decline of clean players can serve as a good model for their decline.
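The peak-versus-decline calculation described above can be sketched in Python as well. This is only an illustration with made-up season totals (not real stats); the function names are hypothetical, and reading "how far percentage-wise" as the standard deviation divided by the average is an assumption.

```python
from statistics import mean, stdev


def percent_decline(peak_values, decline_values):
    """Percent drop from the peak-years average to the declining-years average."""
    peak_avg = mean(peak_values)
    decline_avg = mean(decline_values)
    return (peak_avg - decline_avg) / peak_avg * 100


def sd_as_percent_of_average(values):
    """How far average +/- one sample SD sits from the average, as a percent."""
    return stdev(values) / mean(values) * 100


# Made-up home-run totals (NOT real stats).
peak_hr = [40, 44, 42, 38]      # "peak/prime years"
decline_hr = [24, 20, 18, 22]   # "declining years"

print(f"Decline: {percent_decline(peak_hr, decline_hr):.1f}%")
print(f"SD as % of declining average: {sd_as_percent_of_average(decline_hr):.1f}%")
```

A negative result from `percent_decline` would mean the "declining years" average was actually higher than the peak average.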
Image: Hank Aaron's Stats Over the Years
Image: The folder containing all of the Excel Spreadsheets of every player.
Image: Regression analysis on Excel...still trying to understand everything.
Image: Mickey Mantle's Stat Analysis Page 1
Image: Mantle's Stat Analysis Page 2
Image: Mantle's Stat Analysis Page 3
Image: Mantle's Stat Analysis Page 4
Image: Mantle's Stat Analysis Page 5
Image: Stan Musial's Stat Analysis Page 1
Image: Musial's Stat Analysis Page 2
Image: Musial's Stat Analysis Page 4
Image: Musial Stat Analysis Page 5
Image: Musial's Stat Analysis Page 6
Tuesday, May 14, 2013
Senior Project Day 7
Excel is probably my new best friend right now. While I prefer and continue to write a lot of the calculations by hand, Excel has helped me decide where to start my own calculations of a player's decline. By just looking at a data table, it's hard to determine where exactly the decline of a player begins, or where their prime years are. Some players, like Ted Williams, make it even harder because their playing career was interrupted by military service, during what would have been much of their supposed "prime." Thus, it leaves me with data that has weird trends. I also have to consider injuries, and I've left some years of certain players out of my calculations because they did not play full seasons in those years. So far, I've been comparing declining years versus prime years of the greats, making sure I picked players from long before the steroid era. I realized, however, a flaw in my thought process, as I was comparing declining years to the rest of the career when I actually wanted decline vs. peak. Been trying to play around with Excel and learn about regression at the same time, though it has been a bit confusing since I've never taken a stats class before. Looking forward to talking to Ms. Ferrara this week in order to discuss how to process and analyze the data I've gathered and make good projections/forecasts.
Monday, May 13, 2013
Senior Project Day 5/Weekend/Day 6
Senior Project carried over into the weekend for me. I spent it reading the book The Signal and the Noise by Nate Silver, which details how predictions are made and why some succeed and others don't. He had a specific chapter on baseball and its obsession with statistics, and even mentioned that he developed a baseball projection system known as PECOTA, which is used on a site that I frequently visit, BaseballProspectus. Needless to say, I was very surprised, and tried to use it myself. Unfortunately, only subscribers of BaseballProspectus have access to PECOTA, but it is relatively cheap to purchase a month-long subscription, so I plan to buy one and see how it works.

