by: Matthew de Marte – July 2nd, 2018
Over the course of a long season, it can be difficult to rank all 30 Major League teams. Division standings and team records give us a good sense of how teams are performing, but records alone do not account for all the variables that make a good team. Run differential is a more accurate measure for evaluating a team’s overall performance and predicting its future. Pythagorean Win % already does a great job of accounting for a team’s run differential and predicting its success.
For SOTG’s power rankings, I want to create a formula that goes deeper than Pythagorean Win % in evaluating a team’s performance. For those who do not know, here is the formula for calculating Pythagorean Win %:
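For reference, the commonly cited form of the formula is written out below, where RS is runs scored and RA is runs allowed. The exponent of 2 is the classic version; some variants use a slightly different exponent, so treat the exponent here as an assumption.

```latex
\[
\text{Pythagorean Win \%} = \frac{RS^{2}}{RS^{2} + RA^{2}}
\]
```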
To create the desired Power Rankings, I created a metric called weighted run differential. Pythagorean W-L is great, but I believe there are some variables that can be added to the equation: home and road performance, and the strength of the opponent. Whether you play your opponent on the road or at home can dictate how difficult a matchup is. The strength of your opponent also has to be accounted for in this ranking system. Under these rules, a blowout win against the Boston Red Sox is worth more than a blowout win against the Miami Marlins. This also means a blowout loss to the Miami Marlins hurts a team’s weighted run differential more than a blowout loss to the Boston Red Sox would. A team that consistently has large margins of victory against playoff-caliber teams will have that reflected in its weighted run differential, which is not reflected in current rankings systems. The goal of this rankings system is to create power rankings that are effective in telling what has happened, but that also predict future success better than simple W-L records and Pythagorean W-L do. Proving the latter part of that statement will be difficult, so in future Power Rankings the predictive power of weighted run differential will be compared to current ranking systems.
This ranking system will be known as weighted Run Differential W-L record, or wRD for short. Now that we understand the basic principles behind wRD, here is how I went about calculating the wRD power rankings.
- First, I will take the run differential of each team on the road and at home.
- Each team needs to be assigned a weight for its home and away performance. To do this, the mean and standard deviation of run differential for teams at home and on the road must be calculated. After doing this, I calculate each team’s weight for home and away using this formula: ((Home or Away RD / (SD +/- Mean)) * 0.34) + 1. The resulting number is a team’s wRD weight. A few notes on the formula: whether the mean is added or subtracted depends on whether the team’s run differential is above or below the mean. If the value is above the mean, the mean is added to the SD and the 0.34 constant is positive. If the value is below the mean, the mean is subtracted from the SD and the 0.34 constant is negative. After following this formula, each team should have its own home and away weight. The weights are limited to stay between 0 and 2 to ensure the differences in weights between teams are not too drastic and no team has a negative weight.
- After calculating each team’s weight, I have to go back to the scores of each game and plug in the wRD weight of each team’s opponent. If Team A is playing Team B at home, then the road weight of Team B is used in calculating Team A’s wRD for that game. To calculate the wRD for a specific game, the formula is: (Runs Scored - Runs Allowed) * or / Opponent’s wRD weight. If a team wins, the game run differential is multiplied by the wRD weight. If a team loses, the game run differential is divided by the wRD weight. This calculation is repeated for every game of the 2018 season.
- After doing this, I calculate each team’s Home and Away wRD.
- Next, a runs per win value must be calculated. To do this, I first gather the total runs scored by home and away teams during the 2018 season. These totals are then divided by the number of wins that home and away teams, respectively, have accumulated during the 2018 season. These values tell us how many runs a win at home and on the road is worth during the 2018 season.
- After the runs per win value is calculated, we can finally translate everything into a wRD W-L record for each team. To calculate either a team’s home or road wRD wins, the following formula is used: (wRD / RunsPerWin). The value produced by this equation will be called WAbA, and it is then used in the following formula: (# of home/road games / 2) + WAbA. Repeat this process for both a team’s home and road splits. Then, add up the two values and the constant value to get the team’s wRD win total.
- After getting the final wRD win total, divide that number by the total number of games played to get wRD win %. Lastly, round the wRD win total to the nearest integer to get a team’s wRD record.
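The steps above can be sketched in Python roughly as follows. This is only an illustration and not the code behind the rankings: the function names are hypothetical, the use of population standard deviation is an assumption, and the sign handling in `wrd_weight` is one reading of the formula notes (teams above the league mean land above 1, teams below it land below 1). The numbers at the bottom are made up.

```python
from statistics import mean, pstdev

WEIGHT_SCALE = 0.34  # the constant from the weight formula


def wrd_weight(rd, league_rds):
    """Home (or away) weight for one team, given every team's RD for that split.

    Assumption: teams above the league mean get a weight above 1 and teams
    below it get a weight below 1. Population SD is also an assumption.
    """
    mu = mean(league_rds)
    sd = pstdev(league_rds)
    if rd >= mu:
        raw = 1 + WEIGHT_SCALE * abs(rd / (sd + mu))
    else:
        raw = 1 - WEIGHT_SCALE * abs(rd / (sd - mu))
    return max(0.0, min(2.0, raw))  # keep weights between 0 and 2


def game_wrd(runs_scored, runs_allowed, opp_weight):
    """Weighted run differential for one game (assumes opp_weight > 0)."""
    diff = runs_scored - runs_allowed
    # Wins are multiplied by the opponent's weight, losses are divided by it.
    return diff * opp_weight if diff > 0 else diff / opp_weight


def split_wrd_wins(split_wrd, split_games, runs_per_win):
    """wRD wins for a home or road split: (games / 2) + WAbA."""
    waba = split_wrd / runs_per_win
    return split_games / 2 + waba


def wrd_record(home_wins, road_wins, games_played):
    """Return (wins, losses, win %) from the two split win totals."""
    total = home_wins + road_wins
    wins = round(total)
    return wins, games_played - wins, total / games_played


# Made-up numbers, for illustration only.
print(game_wrd(7, 2, 1.5))  # 7.5: a five-run win over a strong opponent
print(game_wrd(2, 7, 0.5))  # -10.0: a five-run loss to a weak opponent
```

Because losses are divided by the opponent's weight, a weight of exactly 0 would make the per-game value undefined; the sketch simply assumes that case does not occur.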
The process to get to a team’s wRD W-L record can be confusing and involves a lot of steps, so if you are confused about anything, please feel free to reach out to me at email@example.com so I can try to clear up any confusion. The databases I am using for this are updated every day, so if anyone is interested in them please reach out; otherwise I will post them on GitHub at the end of the season.
Now let’s take a look at the power rankings! Note that wRD W-L record, wHome RD, and wAway RD are rounded to the nearest integer:
| Rank | Team | wRD W-L | wRD W% | wHome RD | wAway RD | W-L | Win % | Luck |
According to wRD Win %, the Houston Astros are the best team in baseball right now! I am very satisfied with how these rankings came out because for every team except two, their luck is within +/- 5 wins. According to wRD W-L, the Astros are the unluckiest team in baseball and have won six fewer games than expected, while the Mariners have been the luckiest team in baseball and have won 10 more games than expected!
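As a small illustration of how the Luck column can be read, the sketch below takes luck to be actual wins minus rounded wRD wins, so a negative number means a team has won fewer games than expected. That sign convention and the function name are assumptions, and the inputs are made-up numbers rather than values from the table.

```python
def luck(actual_wins, wrd_win_total):
    """Actual wins minus wRD wins (rounded); negative means unlucky."""
    return actual_wins - round(wrd_win_total)


# Made-up numbers for illustration, not taken from the table.
print(luck(50, 56.2))  # -6: six fewer wins than expected
print(luck(54, 44.1))  # 10: ten more wins than expected
```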
The goal of this rankings system is to see if this algorithm can predict a team’s future success better than W-L record or Pythagorean W-L does. It is going to take time to see how wRD Win % does as a predictive tool, so I will save that analysis for a future power rankings. To evaluate how wRD Win % is doing so far, let’s compare it to Win % and Pythagorean Win %.
The following plot compares wRD Win % to actual Win %:
Correlation = 0.9327055
Teams above the blue line have been “lucky” and teams below the blue line have been “unlucky”.
The following plot compares wRD Win % to Pythagorean Win %:
Correlation = 0.9827361
Teams above the blue line have a Pythagorean Win % higher than their wRD Win %, and vice versa. wRD Win % has a stronger correlation to Pythagorean Win % than it does to regular Win %, which could be a good sign of its potential predictive power. One last comparison is how Pythagorean Win % correlates to Win %:
Correlation = 0.9277185.
wRD Win % has a stronger correlation to Win % than Pythagorean Win % does (0.9327 vs. 0.9277). The difference is rather small, but wRD Win % appears to be at least as valuable a tool as Pythagorean Win % in its predictive nature. It will be fun to continue to look at how wRD Win % plays out through the rest of the season!
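For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of a correlation calculation. It assumes the values above are Pearson correlation coefficients, which is the usual default but is not spelled out here, and the function name and sample inputs are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Sum of products of deviations (the numerator of the coefficient).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the sums of squared deviations (the denominator).
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Made-up win percentages for three teams, for illustration only.
wrd_pct = [0.610, 0.500, 0.420]
win_pct = [0.580, 0.520, 0.400]
print(pearson(wrd_pct, win_pct))
```

Passing each team's wRD Win % and actual Win % as the two lists would give the first comparison; swapping in Pythagorean Win % gives the other two.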
This is the first edition of the SOTG MLB Power Rankings. I plan on releasing it again on August 1st and September 1st, and every week from there until the end of the season. I hope you all enjoyed our power rankings! Remember to reach out if you have any questions or would like to discuss the rankings!