Exit Velocity Over Expected: A Dive Into Hitting The Ball Hard
To me, and probably to you too, watching hitters mash is one of the most captivating experiences in sports. There’s nothing like a scorcher off Giancarlo Stanton’s bat. Or an absolute rocket from Mike Trout. A piss missile from Yordan Alvarez. A moonshot from Kyle Schwarber. It goes on and on.
The analysis of the aforementioned and all batted ball events has improved vastly. Launch angle measures the vertical angle at which the ball leaves the bat. Exit velocity captures how hard a hitter is hitting the ball. Pairing the two in discussion has produced a metric of its own: barrel rate. There are so many more metrics that have contributed to our understanding of hitter performance. Homing in here, exit velocity has gained standing in how we look at hitters and pitchers. The relationship between hitting the ball hard and overall success for hitters is real, and the same goes for pitchers and limiting hard contact.
Exit velocity already gives us great insight into that, but there was more to be done with it. What if a hitter is routinely hitting barrels, but all the pitches being thrown to him are middle-middle meatballs? Is that as impressive as someone who’s hitting put-away sliders with power? As remarkable as someone who’s consistently barreling up 100 mph fastballs? I’d argue no, and that’s what led to the creation of the metric this article is about: exit velocity over expected, looking to find how power truly plays in baseball.
Before I get into the metric, the process, and the results, I’d be remiss if I didn’t shout out a friend who greatly helped with this piece, Sean Sullivan. In fact, he already built a model for exit velocity over expected, which you should absolutely read about here. Phenomenal work.
Exit velocity over expected (or EVOE) was made to evaluate pitchers and hitters. Who prevents hard contact the best relative to their stuff? Which hitters are really demonstrating not only their power but their pitch recognition ability? By factoring in pitch movement, location, and a few game-state elements, we can adjust exit velocity based on expectation to really depict who the best power hitters and preventive pitchers are in the game.
I don’t want to get into too much of the nerdy, lengthy process behind the development, so I’ll try to keep this section short, but it’s owed to the reader for transparency.
All work was done in R using RStudio. Data, ranging from 2016 until the present, was acquired from Baseball Savant thanks to Bill Petti’s baseballr package. For anyone looking to get into baseball analysis, I can’t recommend it enough.
Now for some of the data cleaning steps. This project focused on exit velocity, so I only wanted to work with batted balls. This meant removing every pitch that did not end in a batted ball. Additionally, I removed most bunts (some were labeled as groundballs) from my data since those don’t really capture a player’s true ability to hit the ball hard. I also removed any rows that were missing my selected inputs for the model, leaving me with about 727,000 rows of data to work with.
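For anyone who wants to see those cleaning steps spelled out, here is a minimal sketch in plain Python rather than the R I actually used. The column names follow Statcast conventions, but the exact feature list, the `type == "X"` batted-ball flag, and the text-based bunt check are all assumptions for illustration, and `make_row` with its sample values is entirely made up.

```python
# Columns the model needs; names follow Statcast conventions, but the exact
# feature list here is an assumption, not the confirmed model inputs.
REQUIRED = ("launch_speed", "plate_x", "plate_z", "release_speed",
            "pfx_x", "pfx_z", "balls", "strikes")


def clean_batted_balls(rows):
    """Keep batted balls with complete inputs, dropping anything described as a bunt."""
    kept = []
    for row in rows:
        if row.get("type") != "X":  # assumed flag for a ball put in play
            continue
        if "bunt" in (row.get("des") or "").lower():  # catches bunts tagged as groundballs
            continue
        if any(row.get(col) is None for col in REQUIRED):  # missing a model input
            continue
        kept.append(row)
    return kept


def make_row(**overrides):
    """Build a made-up pitch record for the demo below."""
    row = {"type": "X", "des": "Batter singles on a line drive.",
           "launch_speed": 101.3, "plate_x": 0.1, "plate_z": 2.4,
           "release_speed": 94.8, "pfx_x": -0.6, "pfx_z": 1.2,
           "balls": 1, "strikes": 1}
    row.update(overrides)
    return row


sample = [
    make_row(),                                     # kept
    make_row(type="S", launch_speed=None),          # swinging strike, not a batted ball
    make_row(des="Batter grounds out on a bunt."),  # bunt
    make_row(pfx_x=None),                           # missing pitch movement
]
cleaned = clean_batted_balls(sample)
print(len(cleaned))
```

Only the first made-up row survives all three filters. A text check like this would also miss any bunt whose description never says "bunt", which is one way "most bunts" rather than all of them get removed.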
I’ll quickly run through my selected features for the model and the reasoning. The goal here was to fully isolate batter ability from any predictions, allowing the skill of the batter to be revealed in the model’s results. I was able to accomplish that by focusing on pitch specs and a few game-state factors as the features for the model. Namely, here are the main inputs: where a pitch crossed the plate (location), pitch velocity, pitch movement, and the count. The first three all speak to the “stuff” of a pitch, and the last speaks more to the situation at hand for a batter and pitcher. Here’s each feature’s importance, where you can see a clear focus on location in plate_x and plate_z:
Now that I had my model’s inputs chosen and cleaned, I went ahead with my XGBoost model (will happily discuss the specifics of it with anyone who wants) to find the predicted exit velocity of each play in my dataset, ranging all the way back to 2016. And from there, all I had to do was subtract the predicted value from the actual exit velocity to get my new metric, exit velocity over expected!
A small but important note regarding the model’s output: as more data comes in, the model will update, and I’m looking to refresh it every month. This means values won’t stay static, but they won’t sway by that much. Just worth keeping in mind in case your favorite hitter’s EVOE drops a bit.
Well, what does it actually tell us? At its most basic, it tells us how hard a player hit the ball compared to what we expected given the pitch’s attributes. A positive EVOE means the hitter hit the ball harder than expected, and a negative EVOE means he hit it softer than expected. Players with higher EVOEs likely have better bat paths, hands, and overall swing decisions. Plus the power. Don’t forget the power.
Many learn best with their eyes though, and this is best shown through an example.
Kendall Graveman threw this sinker at 97.2 mph with solid horizontal movement in an extremely tough location to hit. Yet, Aaron Judge hit this ball from Graveman at 111.1 mph, 44.08 mph above expected. This ball had the second-highest EVOE out of every batted ball event I tracked. Plays like this speak to a player’s ability to hit tough pitches hard, thanks to either his bat path, pitch recognition, or sometimes, luck. On the other hand, players sometimes miss easy-to-hit pitches. Take this one for example:
Jose Altuve only hit this ball 73.4 mph according to Statcast. The predicted exit velocity on it though was astronomical, coming in at around 100 mph. It’s pretty easy to see why: a low, middle pitch with average velocity and not too much movement. Altuve missing a solid pitch to hit resulted in an EVOE of -26.66 mph for this play. It’s only one event but this is a great example of the other side of hitting, missing meatballs.
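The arithmetic behind both plays is simple enough to check by hand. Below is a small Python sketch (not my actual R code) that applies the actual-minus-predicted definition and backs out the prediction each reported EVOE implies. The function names are made up for illustration, and the roughly 67 mph prediction for the Judge ball is inferred from the numbers above rather than pulled from the model.

```python
def evoe(actual_ev, predicted_ev):
    """Exit velocity over expected: actual minus predicted, in mph."""
    return actual_ev - predicted_ev


def implied_prediction(actual_ev, reported_evoe):
    """Back out the model's prediction from a reported EVOE."""
    return actual_ev - reported_evoe


# Actual exit velocities and EVOE values as quoted for the two plays.
judge_pred = implied_prediction(111.1, 44.08)
altuve_pred = implied_prediction(73.4, -26.66)

print(round(judge_pred, 2), round(altuve_pred, 2))
print(round(evoe(111.1, judge_pred), 2), round(evoe(73.4, altuve_pred), 2))
```

The Altuve prediction comes out just above 100 mph, which lines up with the "around 100 mph" figure quoted for that pitch.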
By analyzing EVOE over the course of a season or a player’s career, we get a much more insightful look at who separates themselves from the pack with their bat skills and who lags behind, failing to perform as we may expect.
Let’s start with the positives though: here are the top leaders in EVOE over their career, minimum of 200 batted ball events.
This list tracks. Every single one of these guys hits nukes. Julio Rodriguez slotting in at 10, after an electric start to his career, is what you love to see. Aaron Judge and Giancarlo Stanton back to back, 1 and 2, just makes sense. It’s not all pretty, though. With a top ten comes a bottom ten:
Excuse the headshots getting fried on both tables, but the dweller-board of EVOE makes a lot of sense. Speedsters that never really figured out how to hit for power, or to hit at all, to be honest. Oh, Billy Hamilton, what could’ve been.
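For the curious, both leaderboards boil down to grouping batted balls by hitter, applying the 200 batted ball minimum, and sorting. Here is a rough Python sketch of that idea. It assumes career EVOE is a simple average of per-event values, which is my shorthand here and not something spelled out above, and the hitters and numbers are invented.

```python
from collections import defaultdict


def evoe_leaderboard(events, min_bbe=200):
    """Average EVOE per batter, keeping batters with at least min_bbe batted balls.

    events is an iterable of (batter, evoe_mph) pairs. Returns a list of
    (batter, mean_evoe, batted_balls) tuples sorted from highest to lowest mean.
    """
    totals = defaultdict(lambda: [0.0, 0])  # batter -> [sum of EVOE, count]
    for batter, value in events:
        totals[batter][0] += value
        totals[batter][1] += 1
    board = [(batter, total / count, count)
             for batter, (total, count) in totals.items()
             if count >= min_bbe]
    return sorted(board, key=lambda entry: entry[1], reverse=True)


# Made-up events; the threshold is lowered to 2 so the tiny sample works.
events = [("Hitter A", 6.0), ("Hitter A", 2.0),
          ("Hitter B", -3.0), ("Hitter B", -5.0),
          ("Hitter C", 20.0)]  # only one batted ball, so filtered out
board = evoe_leaderboard(events, min_bbe=2)
for batter, mean_evoe, count in board:
    print(batter, mean_evoe, count)
```

The top ten is the head of that sorted list and the bottom ten is the tail. Without the minimum, a hitter with one lucky rocket (Hitter C here) would top the board.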
Of course, though, what good’s a stat if it isn’t stable? Good thing EVOE certainly is:
A year-to-year correlation (r) of 0.67 is super encouraging, showing the stat has validity year to year and is something a hitter can largely control. And it makes sense: power hitters usually continue to hit for power. You’d be hard-pressed to find many guys who suddenly lose their strength during their careers. It rarely disappears.
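If you want to run the same kind of stability check yourself, the idea is to pair each hitter’s EVOE in one season with his EVOE in the next and compute the Pearson correlation. A minimal Python sketch follows, assuming one average EVOE per hitter per season. The five pairs are invented and will not reproduce the real 0.67.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Made-up season averages (mph): the same five hitters in back-to-back years.
year_one = [4.1, 2.0, -0.5, -3.2, 1.1]
year_two = [3.0, 2.6, -1.4, -2.0, -0.3]
r = pearson_r(year_one, year_two)
print(round(r, 2), round(r ** 2, 2))
```

Squaring r gives the share of variance one season explains in the next under a simple linear fit. For the real data, 0.67 squared is about 0.45.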
There’s a lot more to be analyzed: how does every hitter line up by EVOE? How do hitters do against certain pitches? How do pitchers do by this metric? Where is it? Well, instead of bombarding you with every single intriguing stat I found, I’ll leave some of that to you with this ShinyApp. It contains leaderboards, player cards, and strike zone plots that are all easy to access.
EVOE has a lot of usefulness aside from the stuff we’ve already discussed. Testing its relationship to 90th-percentile exit velocity from my friend Jeremy Siegel (@JerSiegs) is next on the to-do list. Power should match up with power, and to see that would be validating. Comparing it to launch angle is important as well, to see if that sweet spot for power can be narrowed down. Some of the work falls on the average fan, though, as so many niche cases can be uncovered through the app by analyzing performance against different pitches and zones.
The metric itself is by no means perfect. We’re only using it on batted balls, meaning we’re failing to capture so many whiffs and takes. Pulling whiff analysis into this metric would be amazing, as that would give a deeper dive into a player’s decision-making. Is he routinely late on a ball? Early? Under it? Over it? Even further, I’m sure someone out there has been working on factoring in biodata regarding a player’s swing decisions. That analysis is needed.
But for now, this is a concrete step above exit velocity by itself to give you a look into how players are performing relative to their expectations based on what they face and how they are pitched. I can’t recommend enough that you play around with the app to find your favorite play and discover something new. Baseball’s an endless world, on the field and through the numbers, forever growing and I hope this piece contributed.