by: Matthew de Marte – May 13th, 2018
Supervised by: Nathaniel Karst
Athletes are always looking for an edge – how to get better, how to raise their level of play – and are looking for different tools and resources to help them to accomplish this goal. In recent years, baseball has been going through a boom of data-driven player development, with players and coaches looking for the best technology to increase on-field performance as much as possible. Technology allows for coaches to stop guessing, and instead to leverage data in order to deliver answers. As just one example, swing sensors have become extremely popular in recent years in both baseball and softball to help create better hitters.
The Blast Motion sensor, when attached to the end of the bat, gives the hitter different metrics pertaining to the swing. It is the official bat sensor of Major League Baseball. As a baseball player looking for a data-driven way to objectively look at my swing and try to improve it, I purchased a Blast sensor last May. While using the sensor, two important questions came up: (i) which metrics matter the most in creating the best possible swing, and (ii) do any of these metrics correlate with on-field success? If these questions can be answered, users of the Blast Motion sensor will be better prepared to use it to create better swings and better hitters.
There appear to have been no empirical studies of these questions, so I decided to try to answer them myself. I had an entire college baseball team full of hitters to use as my sample. My goal was to study the metrics on the Blast Motion sensor, see which correlate best with on-field success, and therefore identify which are the most important to focus on and gear training towards.
Data Collection and Exploration
The sample of this study is the Babson College Varsity baseball team. Each member of the team took 50 swings using the Blast Motion Sensor on their bat over the course of a single batting practice session sometime between November 2017 and February 2018. Each player was given the opportunity to warm up before swinging to ensure they were loose and were swinging as hard as they could, just as happens in real games. The swings were taken against an underhand toss that I threw. Having players hit against front toss rather than swing off a tee more closely simulates an in-game scenario. The 50 swings each player recorded were taken in 5 rounds of 10 swings. Players took a short break in between each round of 10 swings.
Gaining a better understanding of each metric provided by the Blast sensor will help me comprehend and analyze the conclusions I gather later on in the study. The definitions and calculations for each metric are detailed in the table below.
|Metric||Units||Definition|
|Bat Speed (BS)||MPH||The speed of the bat at contact.|
|Attack Angle (AA)||degrees||The angle of the bat at contact, where a completely flat bat that is parallel to the ground is 0 degrees. A bat coming from a down-to-up angle is positive and a bat coming from an up-to-down angle is negative.|
|Time to Contact (TTC)||seconds||The time it takes for a hitter to make contact with the ball from the start of their swing.|
|Peak Bat Speed (PBS)||MPH||The fastest speed observed at any point in the swing.|
|Vertical Bat Angle (VBA)||degrees||The angle of the bat at contact, with 0 degrees being a perfectly flat and horizontal barrel. A barrel that is below the hands at contact results in a negative angle (9). The ideal vertical bat angle range is -25 to -35 degrees (3). Figure 1 below displays Mike Trout making contact and is a good visual of where vertical bat angle is measured.|
|Power (P)||kW||A measurement incorporating both bat mass and bat velocity. The average power generated during the swing is found from the effective mass of the bat, bat speed at impact, and the average acceleration during the downswing (10). Players who can swing a heavier bat and accelerate it faster produce more power.|
|Blast Factor (BF)||0-100||A metric created by the Blast team on a scale of 0-100, where 0 is the worst possible score and 100 is the best. The 100 possible points are made up of two equally weighted components: power and swing efficiency (1). The power component comes from the power metric. The efficiency component is more complicated, and we will discuss it in more depth below.|
|Body Rotation (BR)||0%-100%||Body rotation is expressed as the ratio of body rotation to total rotation during the time a player’s “wrists unhinge” (8). The ideal number for body rotation is 45% and the ideal range is 40%-50%.|
|On Plane Percentage (OPP)||0%-100%||This metric measures how “on plane” a player is. The red line in Figure 2 below represents the pitch plane. The green dots represent different points of Miguel Cabrera’s swing. The two dots that are roughly on the line represent the “on plane” portion of Cabrera’s swing. How well a player does this is represented in OPP. Blast calculates OPP by defining how long the sweet spot of a player’s barrel is on plane. The percentage is calculated by how well a player’s bat speeds up during this period (11). A typical range for a good swing is 55%-65% (4).|
|Peak Hand Speed (PHS)||MPH||The top speed of a hitter’s hands during the swing.|
Regression models for Blast Factor:
Now that I understand the meaning of every variable, I can move on to better understanding how the Blast factor is calculated. I have seen some conflicting formulas for it, so I will run my own analysis in R to see if the data can explain the calculation of swing efficiency. The goal of these models is to figure out which variables go into the equation that produces the swing efficiency score. I will run a linear regression model with the other nine swing metrics as the predictors and blast factor as the target, using the 1,000 swings captured on the Blast sensor in this study as my training dataset. The goal of this model is to understand which variables lead to a better blast factor. The full output can be found in the appendix.
We find that a multiple linear regression model can capture 72.18% of the variance in the blast factor (R^2 = 0.7218). Every variable had a p-value less than 0.05 and was statistically significant in the model. To verify that all variables were making meaningful contributions to the model (beyond statistical significance), I used backward variable selection in R to see if any variables should be taken out. Once again, every variable was kept in the model.
To gain more of an understanding of blast factor, I next look at which metric influences it the most. To do this we will run nine different models. In each model, we will remove exactly one variable and compare the resulting R^2 to the 72.18% benchmark from the full model. The difference between these two quantities gives us a measure of importance for each variable.
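The drop-one-variable comparison described above can be sketched in a few lines. This is only an illustration in Python, not the R code from the appendix: the data are synthetic, the column names (`opp`, `bs`, `noise`) and the helper functions are made up for the example, and the regression is solved with plain normal equations so that nothing beyond the standard library is needed.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [a[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def r_squared(columns, y):
    """Fit ordinary least squares with an intercept and return R^2."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(rows[i][a] * y[i] for i in range(n)) for a in range(k)]
    beta = solve(xtx, xty)
    pred = [sum(bj * xj for bj, xj in zip(beta, r)) for r in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

def drop_one_importance(data, y):
    """Return the full-model R^2 and the R^2 lost when each predictor is removed."""
    full = r_squared(list(data.values()), y)
    drops = {}
    for name in data:
        reduced = [col for other, col in data.items() if other != name]
        drops[name] = full - r_squared(reduced, y)
    return full, drops

# Synthetic stand-in for the swing data (NOT the study's measurements).
random.seed(0)
n = 200
swings = {
    "opp": [random.uniform(20, 80) for _ in range(n)],   # hypothetical on plane %
    "bs": [random.uniform(55, 85) for _ in range(n)],    # hypothetical bat speed
    "noise": [random.uniform(0, 1) for _ in range(n)],   # unrelated column
}
blast_factor = [0.5 * o + 0.2 * b + random.gauss(0, 3)
                for o, b in zip(swings["opp"], swings["bs"])]

full_r2, drops = drop_one_importance(swings, blast_factor)
for name, drop in sorted(drops.items(), key=lambda kv: -kv[1]):
    print(f"{name}: R^2 drops by {100 * drop:.2f} points")
```

Because each reduced model is nested inside the full model, the drop in R^2 can never be negative (up to rounding), which is what makes it usable as a rough importance score.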
The table below shows, in percentage points, the difference between the full model’s R^2 of 72.18% and the R^2 of the model with the corresponding variable removed.
|Difference in R2||5.84||2.80||0.92||0.52||13.52||0.64||1.26||9.12||1.05|
By far the most important variable in computing blast factor is on plane percentage (OPP). When OPP was taken out of the model, R^2 dropped 13.52 percentage points from its value in the full model. The next closest variable in importance was body rotation (BR) at 9.12 points. Power (P) and peak hand speed (PHS) are the least important when calculating blast factor. This tells us that OPP and BR are the most important variables in understanding blast factor.
Below is a table with each metric and the metric coefficient in the full model.
Before interpreting any of these numbers we have to remember a few things. First, these coefficients describe each metric’s relationship to blast factor and blast factor alone. Second, while the coefficient values vary in size, the main thing to look at is the direction and strength of their impact on blast factor. Each interpretation includes a scatter plot showing the relationship between the two variables, and the coefficient is interpreted in the context of that plot. The most notable variables and their relationship to blast factor are documented below, and the rest (along with the code used to produce these plots) can be found in the Appendices.
Bat speed: The coefficient of bat speed indicates that as bat speed increases, so does blast factor. This agrees with intuition and baseball common sense, because players with fast swings generally have better swings. The following plot depicts the relationship between bat speed and blast factor. As bat speed increases from 55 MPH to 70 MPH, blast factor increases steadily. Past 70 MPH, blast factor stays fairly constant, although it decreases past 80 MPH. The overall trend of this plot is: swing faster for a better blast factor. Of course, one variable in a model does not explain all the variability in the result, but bat speed is one of the more important variables in the model, so its relationship to blast factor must be examined fully.
Peak hand speed: The coefficient of peak hand speed indicates that as peak hand speed increases, blast factor decreases. That is counterintuitive. Having fast hands is a good thing according to conventional wisdom in baseball, so a metric describing the overall quality of the swing, such as blast factor, should not decrease when hand speed increases. Examining the plot below, the relationship between the two variables is interesting. As peak hand speed initially increases, blast factor increases rapidly until about 23 MPH, then stays fairly constant to about 26.5 MPH. From there, further increases in peak hand speed appear to bring diminishing blast factor returns.
On plane %: The more on plane a hitter is, the higher their blast factor. OPP produced the largest change in the model’s R^2 when it was removed, meaning it has the strongest relationship with blast factor. This makes sense, as being on plane should correlate with a better swing. According to Blast, 55% and better is a good OPP rating (4). Another thing to consider is that, according to a spokesperson at Blast, Jose Altuve, the reigning AL MVP, has the highest on plane % of anyone the company has tracked. It stands to reason that one of the best hitters in baseball would also have the best on plane percentage. There is evidence of a clear linear relationship between OPP and blast factor: as OPP increases, so does blast factor. Although the returns on blast factor diminish slightly as OPP surpasses 75%, there are not enough swings in this region for us to fully resolve the trend.
Power: The coefficient of power says that as power increases, blast factor decreases. This is a little perplexing, because you would think more power in a swing is a good thing, and power is half of the blast factor rating (1). One potential explanation is that as a player’s swing becomes more powerful it may become more erratic and lose efficiency. Beyond that, I do not have a ready explanation for the sign of this coefficient. In the plot, blast factor increases as power increases towards 4 kW, stays fairly constant until power reaches 6 kW, and then decreases slightly as power increases further. The plot shows little evidence of the relationship the coefficient indicates, though the narrow range of the power axis may be hiding it. Removing power from the model barely changed R^2, which is strange for a metric that is supposed to make up half of blast factor. Maybe Blast uses a different power measure in the blast factor formula than the power metric it reports. Otherwise, the coefficient, the change in R^2, and the plot all suggest there is not much of a relationship between the two variables, and quite possibly an inverse one if any.
Before jumping into the main study, one last way to understand blast factor is to build a regression tree in which every metric is used to try to predict a swing’s blast factor. The regression tree gives another way of looking at each metric’s relationship to blast factor and shows which variables help predict it. The regression tree:
In this regression tree model, the only variables used to predict blast factor were body rotation, time to contact, on plane %, attack angle, and power. Swings fell into 14 blast factor leaves. Swings with body rotation below 44% and a time to contact of 0.17 seconds or greater struggled: the 27 swings in this leaf produced an average blast factor of 66. The last two leaves of the tree have average blast factors of 91 and 95, respectively. For a swing to fall into one of these leaves, it must have an on plane % greater than 44% and a body rotation greater than 42%. What distinguished the swings with a 95 average blast factor was an on plane % greater than 54%. Swings in the leaf with an average blast factor of 91 had an on plane % between 45% and 53% and an attack angle greater than 3.5 degrees. There were 156 swings in the leaf with an average blast factor of 91 and 151 swings in the leaf with an average of 95, which together account for over 30% of the swings in the dataset. To have a good blast factor, it seems players should concentrate on getting their body rotation above 42% and their on plane % above 44%. The following table displays the error rates for this model.
|MAPE||MAPE Benchmark||RMSE||RMSE Benchmark|
Each error measure is reported next to a benchmark. The benchmark represents how well the response can be predicted without using any real data from the dataset. For example, our MAPE benchmark is 7.234%, meaning that predictions made with no use of the dataset are off by 7.234% on average. The MAPE and RMSE of the model must be lower than their benchmarks for the model to be considered valid and useful.
Both MAPE and RMSE are below their benchmarks, meaning this tree model is a good model and its results can be taken seriously.
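As a rough illustration of the comparison above, the sketch below computes MAPE and RMSE for a set of predictions and for a benchmark. It assumes the benchmark is a naive predictor that always guesses the mean of the observed values; the text does not spell out how the benchmark was built, so that choice, the function names, and the numbers are all assumptions for the example.

```python
from math import sqrt

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error, in the units of the response."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mean_benchmark(actual):
    """Errors of a naive predictor that always guesses the mean (an assumed benchmark)."""
    mean = sum(actual) / len(actual)
    naive = [mean] * len(actual)
    return mape(actual, naive), rmse(actual, naive)

# Made-up blast factors and model predictions, for illustration only.
observed = [80, 90, 100, 70]
predicted = [82, 88, 97, 73]

model_mape, model_rmse = mape(observed, predicted), rmse(observed, predicted)
bench_mape, bench_rmse = mean_benchmark(observed)
print(f"MAPE {model_mape:.2f}% vs benchmark {bench_mape:.2f}%")
print(f"RMSE {model_rmse:.2f} vs benchmark {bench_rmse:.2f}")
```

A model is only adding information if both of its errors come in under the naive numbers, which is the check applied to every model in this study.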
The linear regression model, plots, and regression tree all indicate that on plane % is probably the best indicator of blast factor and that it can explain a lot of the variability in a player’s swing efficiency.
Predicting On-field Performance
The actual goal of this study is to see which metrics Blast provides, if any, correlate well with on-field success: if a player’s swing performs well in some metrics, does that make the player a better hitter? If I am able to identify metrics that make a better hitter, then the Blast Motion sensor can be better utilized to create and identify good hitters. To measure how good a hitter is, the response variable I chose is wOBA (weighted on-base average). The following snippet from Fangraphs shows the definition and formula used to calculate wOBA.
I use wOBA over other notable offensive metrics like batting average, on-base %, slugging %, on-base plus slugging, and RBIs for a few reasons. Batting average does not account for extra-base hits being worth more than singles. On-base %, while extremely valuable in player evaluation, has the same limitation. On-base plus slugging, the sum of a player’s on-base % and slugging %, is great, but it does not give each outcome a specific weight like wOBA does. Luck and randomness account for a lot of the variation in RBIs, making them a poor choice to represent offensive output. Luck and randomness occur in all stats, but a player’s RBI total depends more on things the player cannot control than other metrics do. Hitters in lineups that get on base often see more RBI chances and generally drive in more runs, while the RBI totals of quality hitters in bad lineups generally suffer from a lack of base runners.
wOBA is superior because it provides specific weights for each outcome and it is easy to understand. The linear weight for an event is calculated by taking every play of that type in a given season, summing their run values, and dividing by the number of times the event occurred (7). The run value of a play is based on the Run Expectancy (RE) Matrix created by Tom Tango: it is the run expectancy of the end state of the play, minus the run expectancy at the beginning of the play, plus any runs scored.
There is one potential way to improve the response variable, but because my sample consists of college baseball players I do not have the technology or resources to calculate it. In MLB there is a metric called xwOBA, or expected weighted on-base average. It is similar to wOBA; the only difference is that it is calculated from what a player is expected to get on their batted balls based on the exit velocity and launch angle of each one, two metrics measured by Statcast (6). Because luck, the opposing team’s defensive ability, and wind can all create a difference between xwOBA and wOBA, xwOBA would be the better choice if available, since it is computed solely from what the hitter does. At the collegiate level, without the funding necessary for such a system, it is impossible to get, which is alright because wOBA is a sufficient response variable for this study.
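To make the two calculations above concrete, here is a small sketch of the general wOBA formula and of the run value of a single play. The linear weights change every season; the ones below are approximately the values FanGraphs published for the 2013 MLB season and are used purely as placeholders, not as the weights applied in this study. The run expectancy numbers and the stat line are likewise illustrative, and the function names are invented for the example.

```python
# Placeholder linear weights (roughly the 2013 MLB values from FanGraphs).
# Real weights are recalculated each season; these are an assumption.
WEIGHTS = {"ubb": 0.690, "hbp": 0.722, "1b": 0.888, "2b": 1.271, "3b": 1.616, "hr": 2.101}

def woba(ab, bb, ibb, hbp, sf, singles, doubles, triples, hr):
    """Weighted on-base average from a counting-stat line."""
    unintentional_walks = bb - ibb
    numerator = (WEIGHTS["ubb"] * unintentional_walks
                 + WEIGHTS["hbp"] * hbp
                 + WEIGHTS["1b"] * singles
                 + WEIGHTS["2b"] * doubles
                 + WEIGHTS["3b"] * triples
                 + WEIGHTS["hr"] * hr)
    denominator = ab + bb - ibb + sf + hbp
    return numerator / denominator

def run_value(re_start, re_end, runs_scored):
    """Run value of one play: RE after the play, minus RE before it, plus runs scored."""
    return re_end - re_start + runs_scored

# Hypothetical hitter: 100 AB, 12 BB (2 intentional), 2 HBP, 1 SF,
# 20 singles, 6 doubles, 1 triple, 3 home runs.
example_woba = woba(ab=100, bb=12, ibb=2, hbp=2, sf=1,
                    singles=20, doubles=6, triples=1, hr=3)
print(f"wOBA = {example_woba:.3f}")

# Illustrative RE values: a leadoff single moves the state from bases empty,
# no outs (about 0.48 expected runs) to a runner on first, no outs (about 0.85).
single_value = run_value(re_start=0.48, re_end=0.85, runs_scored=0)
print(f"run value of the single = {single_value:.2f}")
```

Averaging `run_value` over every single hit in a season is what produces the linear weight for a single, and the same goes for each other event.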
Now that we have an understanding of the independent and response variables, we can build the linear regression models. In total I will build four models: a linear regression and a regression tree for each of two samples. While there are 20 members of the Babson College baseball team who are hitters, only some of them actually get to play in games. Because of this, one sample consists of players who got at least 40 plate appearances during the 2017 season (the Spring models) and the other uses players’ performance during fall intrasquad scrimmages (the Fall models). Doing this gives me a larger sample to compare against and lets me see whether the same metrics are significant in both. If a metric is significant in both models, there is a better chance it indicates a good hitter, and instruction with the Blast should be tailored towards it.
To create the models I will use backward selection in R so that it chooses the variables in each model for me. Code and output can be found in the appendix.
For the Spring model, the following variables were deemed insignificant by backward selection: on plane %, peak hand speed, and body rotation. The R^2 for this model was 48.27%.
For the Fall model, the following variables were deemed insignificant by backward selection: attack angle and vertical bat angle. The R^2 output for this model was 57.48%.
To begin evaluating the two models the following table shows the coefficients for each variable in the models with an interpretation of each coefficient.
|Metric||Fall coefficient||Spring Coefficient|
|Time to Contact||5.3589336||-1.3025226|
|Peak Bat Speed||-0.0039033||0.0042570|
|Vertical Bat Angle||N/A||0.0023798|
|Peak Hand Speed||-0.0048995||N/A|
|On Plane %||-0.2789730||N/A|
Bat Speed: The coefficients were pretty consistent across both models. Each model said that for each mile per hour faster a player swings, their wOBA decreases slightly. This may seem confusing, since bat speed is good, but past a certain point the returns may diminish. Consider the graphic in Figure 3:
According to this graphic, the MLB average bat speed is 69.6 MPH. The average bat speed in this sample is 70.98 MPH. Maybe swinging as hard as you can, or simply swinging harder, leads to a slight decrease in production. This graphic was taken from the 2016 MLB Futures Game, in which top prospects from across baseball play against each other in a scrimmage game. The top speed in that game was just 77.4 MPH. Three players in the sample I collected had average bat speeds above this peak, and nine of the 20 players recorded at least one swing faster than 77.4 MPH. Maybe players swing slower in games, because I do not think Division III college baseball players swing harder than professional players, who are older and stronger.
Attack Angle: Attack angle was significant in one model and not in the other. I don’t really know why that is. The spring sample has 9 hitters and the fall sample has 19, so the difference has to come from the 10 additional hitters and their production. As attack angle increases, wOBA increases slightly in the Spring model.
Time to Contact: This is the most perplexing metric on the list. In the spring, as time to contact decreased, wOBA increased, which makes sense. In the Fall model it was the exact opposite, and the coefficient was quite large: hitters in the fall sample who were slower to the ball had higher wOBAs. It would make sense that hitters who had success during the actual Spring season were faster to the ball. The difference could potentially be due to time to contact’s relationship with peak bat speed: in each model, the coefficients of time to contact and peak bat speed have opposite signs.
Blast Factor: This metric was extremely consistent across the spring and the fall. As blast factor increased wOBA increased in both instances which is as expected.
Power: The power coefficient was positive in both models, meaning that as a player’s power increased, their wOBA did too.
Peak Bat Speed: Peak bat speed is included in both models. In the fall model, production decreases slightly as peak bat speed increases, which is consistent with the results for bat speed. In the spring model, production increases slightly as peak bat speed increases. Overall, the net of the two suggests that an increased peak bat speed doesn’t lead to an increase in a player’s wOBA.
Vertical bat angle: As vertical bat angle increased, production increased. This metric was only significant for players in the spring sample. The coefficient makes sense because as vertical bat angle increases towards the desired angle of -25 degrees, production should increase with it.
Peak Hand Speed: Peak hand speed was only significant in the model measuring players’ production in the fall. As peak hand speed increased, production slightly decreased. Going into this study I did not think peak hand speed was very important in determining a hitter’s success and the quality of their swing.
Body Rotation: This is another metric that was only significant in the fall model. As body rotation increased, so generally did a player’s production.
On plane %: This metric was also only significant for the sample consisting of players’ production during the fall. As on plane % decreased, production increased. This is interesting because it was the most important metric for predicting blast factor, which is one of the best predictors of a productive player. This could be evidence that being on plane does not necessarily indicate a productive hitter and vice versa.
I have to now calculate the MAPE and RMSE for the two models. The following table has the MAPE, RMSE, and benchmarks for these measurements for the Spring and Fall models. All measurements are rounded to the fifth decimal place.
|Season||MAPE||MAPE Benchmark||RMSE||RMSE Benchmark|
The next step in this study is to build regression trees to see if each player’s success can be predicted in a form that is easy to visualize. The following picture is the regression tree from the Spring data:
The variables that appear in the spring regression tree are time to contact, attack angle, vertical bat angle, peak bat speed, and blast factor. In the lowest leaf, the average wOBA was 0.170 and the leaf contained 44 swings. Swings in this leaf had a time to contact of 0.16 seconds or greater and an attack angle less than 7.5 degrees. The most productive leaves had average wOBAs of 0.420 and 0.390. The leaf with an average wOBA of 0.390 is separated by a single metric: a time to contact lower than 0.14 seconds. There were 165 swings, or 37% of the data, in this leaf. The leaf next to it, with an average wOBA of 0.420, contained swings with a time to contact of 0.14 or 0.15 seconds and a vertical bat angle greater than -20 degrees. Only 20 swings, or 4.4% of the dataset, ended up in this leaf. In this model, the players who had the most success simply had a time to contact of 0.15 seconds or lower. They were the quickest to make contact.
The following picture is the regression tree from the Fall data:
For the fall regression tree, the variables included are time to contact, on plane %, peak hand speed, body rotation, and power. The least productive leaf in this model contains just 34 swings and has an average wOBA of 0.150. Swings in this leaf had a time to contact greater than or equal to 0.15 seconds, on plane % greater than or equal to 36%, body rotation less than 40%, and peak hand speed less than 26 MPH. The most productive leaf had an average wOBA of 0.630. It contained 79 swings, or about 8% of the data, with a time to contact less than 0.14 seconds, on plane % less than 58%, and peak hand speed greater than or equal to 22 MPH. The largest leaf had an average wOBA of 0.280 and contained 323 swings, or roughly 34% of the data. Its swings had a time to contact greater than or equal to 0.14 seconds, peak hand speed less than 26 MPH, body rotation greater than or equal to 40%, and power below 4 kW.
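Each branch in these trees is a threshold on one metric, chosen so that the swings on either side are as similar as possible in the response. The sketch below shows how one such split can be found by minimizing the summed squared error of the two groups. It assumes a CART-style criterion, which may differ from what the R code in the appendix does, and the time-to-contact and wOBA values are invented for the example.

```python
def mean(values):
    return sum(values) / len(values)

def sse(values):
    """Sum of squared deviations from the group mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def best_split(x, y):
    """Return (threshold, mean of y where x < threshold, mean of y where x >= threshold)."""
    pairs = sorted(zip(x, y))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # cannot split between identical predictor values
        left = [v for _, v in pairs[:i]]
        right = [v for _, v in pairs[i:]]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (cost, threshold, mean(left), mean(right))
    if best is None:
        raise ValueError("all predictor values are identical; no split possible")
    return best[1:]

# Invented swings: time to contact in seconds and the hitter's wOBA.
time_to_contact = [0.12, 0.13, 0.13, 0.14, 0.15, 0.16, 0.17, 0.18]
woba_values = [0.41, 0.39, 0.40, 0.38, 0.22, 0.20, 0.18, 0.17]

threshold, fast_mean, slow_mean = best_split(time_to_contact, woba_values)
print(f"split at {threshold:.3f} s: mean wOBA {fast_mean:.3f} below, {slow_mean:.3f} at or above")
```

A full tree repeats this search inside each resulting group, over every metric, until the groups are too small or too uniform to split further; the leaf averages quoted above are the group means at the end of that process.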
The following table shows the MAPE, MAPE benchmark, RMSE, and RMSE benchmark for the regression trees from the Fall and Spring data. Numbers are rounded to five decimal places.
|Season||MAPE||MAPE Benchmark||RMSE||RMSE Benchmark|
Discussion and Conclusion
First, before evaluating the actual results of the models, we need to evaluate the MAPE and RMSE of each regression model and regression tree. MAPE and RMSE are error measures that evaluate how good a model is. They are compared to the MAPE benchmark and RMSE benchmark. If the MAPE and RMSE are less than their benchmarks, then the model is good! If not, the model is not really useful. Fortunately, the MAPE and RMSE of both regression models and both regression trees are all significantly lower than their benchmarks. This means the models do a good job of predicting the response variable! Now that I know the models are useful, I can officially draw conclusions.
In the Spring linear regression model, the following metrics were deemed significant in determining a player’s wOBA: bat speed, attack angle, time to contact, blast factor, power, peak bat speed, and vertical bat angle. Time to contact and peak bat speed both had p-values above .05 in this model, but backward selection in R indicated that these variables added to the validity of the model, so I kept them. Of all these variables, bat speed, blast factor, and vertical bat angle had the lowest p-values, so these metrics are the most significant in predicting a player’s success in the Spring linear regression model.
The metrics included in the Spring regression tree are: time to contact, attack angle, vertical bat angle, bat speed, blast factor, and peak bat speed. Time to contact was the metric in this tree that best predicted a player’s success. Players who had success during the Spring season were quick to the ball, and players who struggled were slow to the ball.
The last conclusion to draw from the spring linear regression model concerns the R^2, which is 48.27%. This means 48.27% of the variance in a player’s wOBA in the spring can be explained by the player’s performance in the significant swing metrics. This may seem a little low, but I think it shows great correlation. First, a swing is not *everything* in hitting a baseball. There are other variables I knew could not be accounted for in this model, such as approach and vision, among others. Second, there are only nine hitters in this sample. Nine! That is significantly fewer than in the fall sample, and in a small sample there is more variance and each swing matters more. I think “swing metrics” being able to explain about half of a player’s production in this model is great, and it is evidence that the Blast Motion sensor can be a useful tool to help improve a player’s swing and production.
In the fall model, the following metrics were deemed significant in determining a player’s wOBA: bat speed, time to contact, blast factor, power, peak bat speed, peak hand speed, body rotation, and on plane %. Peak bat speed has a p-value above .05, but backward selection in R kept the variable, indicating that it contributed to the overall validity of the model. Every other variable had a p-value under .05. The most significant variables in this model were bat speed, time to contact, power, and on plane %, because they had the lowest p-values.
The regression tree for the fall included the following metrics: time to contact, on plane %, peak hand speed, body rotation, and power. Unlike the spring regression tree, the fall tree showed no clear trend separating successful players from unsuccessful ones.
The last conclusion to draw is from the R^2 value. The Fall model has a 19-player sample, more than twice as large as the spring sample. The one drawback is that players in this sample did not accumulate as many plate appearances as those in the spring sample did. The R^2 of the fall linear regression model was 57.44%, nearly 10 percentage points higher than that of the spring model. This means a player’s performance in the swing metrics included in this model can explain 57.44% of the variance in their wOBA! This is a better result than the spring model in terms of indicating a relationship between performance on the Blast and on-field performance. It is further evidence that the Blast Motion sensor is a useful tool in the evaluation and development of a hitter. The sensor is not able to explain everything that leads to a player’s performance, but it can certainly explain a large portion of it.
Now that both models have been evaluated, I want to look at the variables deemed significant in both and decide which are probably the best to gear instruction towards. The following variables were significant in both models: bat speed, time to contact, blast factor, power, and peak bat speed. Of these, there is one I immediately want to eliminate, and that is peak bat speed. Bat speed and peak bat speed measure pretty much the same thing, so there is no reason to try to improve both: if you increase bat speed you increase peak bat speed, and vice versa. I will therefore consider only bat speed in this analysis. The remaining four metrics appear to be the most significant contributors to a good swing and on-field success according to this study. Each is evaluated below according to the results of both linear regression models. The following table shows the coefficient of each metric in the spring and fall models:
|Metric|Fall coefficient|Spring coefficient|
|---|---|---|
|Time to Contact|5.3589336|-1.3025226|
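One quick way to read a coefficient table like this is to check whether the two models agree on the direction of each metric. The sketch below does that in Python. Only the Time to Contact values come from the table; the dictionary layout and the helper `same_sign` are assumptions made for illustration, and the other metrics would be added in the same way.

```python
# Coefficients copied from the table above as (fall, spring).
coefficients = {
    "Time to Contact": (5.3589336, -1.3025226),
}

def same_sign(a, b):
    """True when both coefficients are nonzero and point in the same direction."""
    return a * b > 0

for metric, (fall, spring) in coefficients.items():
    verdict = "agree" if same_sign(fall, spring) else "disagree"
    print(f"{metric}: fall={fall:+.4f}, spring={spring:+.4f} -> models {verdict}")
```

A positive coefficient means wOBA rises as the metric rises, so opposite signs mean the two models point in opposite directions for that metric.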
Bat Speed: In the models, production increases for players who swing slightly slower. One reason I think this happened is that the player with the best production in the spring model had the 5th lowest average bat speed of any player in the complete sample. This player also had the 3rd lowest average bat speed of the players in the spring model. The player with the lowest average bat speed happened to be fairly productive in both the spring and fall models. I don’t believe this means swinging slower is better, but it may mean that training specifically to increase bat speed isn’t the best way to increase a player’s production. I think what this tells us is that players are capable of being successful at different bat speeds. You do not necessarily have to swing harder to be a productive hitter. Whether training to swing harder increases production is an entirely different question that I cannot answer based on this study alone.
Time to Contact: Like bat speed, time to contact is a perplexing metric to interpret. In the spring model, the quicker a player was to make contact, the better their production was. In the spring regression tree, there was evidence that time to contact was the most important metric for predicting success. In the fall model, it was the exact opposite. I do not think being slower to contact necessarily indicates a more successful hitter, as the fall model suggests. I think what this says is that players do not necessarily have to be fast to contact to have success as a hitter. Players should still try to be as quick to the ball as possible, because a quicker swing lets a player wait and recognize a pitch for longer. Even a few hundredths of a second allows them to decide later whether to swing at a pitch and to make better adjustments within their swing. Still, there are other variables that go into being a productive hitter, and a player can excel in those instead. If a player uses their time to contact properly and keeps their timing right, they can be successful with a slower time to contact.
Power: A metric that both models agree upon! In both models, players with a higher power output tended to perform better by wOBA. This matches what you might expect, because players who produce more power tend to be better hitters. It also gives clear guidance on how to use the Blast Motion sensor: if a player increases their power output, they have a better chance of increasing their production, so those using the sensor should train to improve in this metric.
Blast Factor: Another metric that the models agree upon! Blast says the better the blast factor, the better the hitter, and according to the models, wOBA increases slightly as blast factor increases. This means players and coaches using the Blast Motion sensor can work on raising blast factor to create better hitters. The problem, as explained above, is that blast factor is extremely complex. While the analysis performed on blast factor still leaves it somewhat unclear exactly what goes into it, it is known that blast factor is half power index and half efficiency index. The models run to explain blast factor indicated that on plane % is the best indicator of a strong blast factor. To increase blast factor, it is beneficial to tailor instruction to improving a player’s power output and on plane %.
Based on this study, those are the four metrics I would suggest focusing on for players using the Blast Motion sensor to increase their output on the field. Power and blast factor have the strongest evidence that excelling in them leads to a productive hitter. This is not to say that the other metrics are worthless or not worth analyzing and attempting to improve, but over the sample I analyzed, these two metrics explained players’ on-field production the best. A player should also understand that while increasing bat speed and decreasing time to contact can make them a better hitter, according to this study it is not necessary to perform well in these metrics to be a successful hitter.
After drawing conclusions, the last thing that must be done is an evaluation of the study design and of what could be improved. There are certain aspects of this study that could have been done better. The first is obvious to me, but would have been impossible to accomplish with the resources I had: obtaining swing metrics from swings players actually took in games. This is done at the MLB Futures Game every year, as noted earlier in a graphic, but unfortunately wearable technology is not allowed by the NCAA. The second is getting more swings from each player. With this sample, it was difficult just to ensure everyone participated and swung with the Blast Motion sensor. Ideally, I would have had every player take 1,000 swings using the sensor, because I believe the swing results would have been more consistent. In a single 50-swing sample, results can be inconsistent: a player could have been fatigued, could have gone a while without hitting, could have been taking lazy swings, and a whole bunch of other factors could have affected their performance. Over a large sample of swings, these external variables would have evened themselves out. The last thing I could have done better is getting a larger sample of players. Of course, I could not get other players involved due to the difficulties of being in college and not having the luxury of seeing baseball players who are not a part of the Babson team. Ideally, I would have randomly selected thousands of different players from across the country to participate, so that the sample could approximate the entire population of baseball players. Unfortunately, that is not realistic at the moment, and I had to work with the sample available to me at Babson College.
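The point about 50 versus 1,000 swings can be put in rough numbers with the standard error of a mean, which shrinks with the square root of the number of swings. The Python sketch below is an illustration only: the 5 mph swing-to-swing spread is a hypothetical value, not something measured in this study, and the formula assumes swings are independent (fatigue or lazy swings within a session would break that assumption).

```python
import math

def standard_error(swing_sd, n_swings):
    """Standard error of a player's average metric over n independent swings."""
    return swing_sd / math.sqrt(n_swings)

# Hypothetical swing-to-swing standard deviation of bat speed, in mph.
ASSUMED_SD_MPH = 5.0

for n in (50, 1000):
    se = standard_error(ASSUMED_SD_MPH, n)
    print(f"{n:>5} swings: standard error of the average ~ {se:.2f} mph")

# The ratio does not depend on the assumed spread: it equals sqrt(1000 / 50).
ratio = standard_error(ASSUMED_SD_MPH, 50) / standard_error(ASSUMED_SD_MPH, 1000)
print(f"A 50-swing average is about {ratio:.1f}x noisier than a 1,000-swing average")
```

Whatever the true spread is, a 50-swing average is sqrt(20), or roughly 4.5 times, noisier than a 1,000-swing average under these assumptions.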
Overall, what I learned from this study is that the Blast Motion sensor is a useful tool for predicting a hitter’s performance and evaluating their swing, and that it can be utilized by coaches. According to the models this study produced, about half of the variance in a player’s production by wOBA can be explained by their performance in the swing metrics Blast provides. Although the sample size constraints mean this cannot be generalized to the entire population of hitters, it does give evidence that the Blast sensor is a good indicator of a player’s performance. The most important metrics to concentrate on, according to this study, are bat speed, blast factor, time to contact, and power. While it is potentially beneficial to train to increase bat speed, a player with slow bat speed is not necessarily a bad hitter. The same goes for time to contact: a player who is not quick to the ball can be successful as well if they utilize their timing properly. Players should work to increase their power output and blast factor. Thank you for taking the time to read this study. I hope coaches and players alike can use it and the Blast Motion sensor to better themselves as players and instructors. If anyone would like to access the code and spreadsheets used for this study, you can find them on GitHub HERE. There are also some additional visualizations in the appendix that are not included here. If anyone would like access to those, or to the full appendix, feel free to reach out to me at firstname.lastname@example.org. Lastly, if anyone would like to have any further dialogue about this study, feel free to reach out to the email provided!
- “What Is Blast Factor?” Blast Motion, blastmotion.com/training-center/baseball/metrics/blast-factor/what-is-blast-factor/.
- “What Should Body Rotation Be?” Blast Motion, blastmotion.com/training-center/baseball/metrics-2/body-rotation/what-should-body-rotation-be/.
- “What Should Vertical Bat Angle Be?” Blast Motion, blastmotion.com/training-center/baseball/metrics-2/vertical-bat-angle/vertical-bat-angle-2/.
- “What Should On Plane Be?” Blast Motion, blastmotion.com/training-center/baseball/metrics-2/on-plane/what-should-on-plane-be/.
- “What Should Attack Angle Be?” Blast Motion, blastmotion.com/training-center/baseball/metrics-2/attack-angle/what-should-attack-angle-be/.
- “What Is a Expected Weighted On-Base Average (XwOBA)? | Glossary.” Major League Baseball, m.mlb.com/glossary/statcast/expected-woba.
- “Linear Weights.” FanGraphs Sabermetrics Library, http://www.fangraphs.com/library/principles/linear-weights/.
- “What Is Body Rotation?” Blast Motion, blastmotion.com/training-center/baseball/metrics-2/body-rotation/what-is-body-rotation/.
- “What Is Vertical Bat Angle?” Blast Motion, blastmotion.com/training-center/baseball/metrics-2/vertical-bat-angle/vertical-bat-angle/.
- “What Is Power?” Blast Motion, blastmotion.com/training-center/baseball/metrics-2/power/what-is-power/.
- “What Is On Plane?” Blast Motion, blastmotion.com/training-center/baseball/metrics-2/on-plane/what-is-on-plane/.