The 2022 NBA Draft is right around the corner. For the second year in a row, I’ve built my own draft model to take a more statistical approach to the draft. In this write-up, I’m going to go over the data behind it, how it works, where it differs most from consensus and why, historical results, and future outlook.
Data Collection and Preparation
Like last year, I’d like to give a shoutout to Will Schreefer (was @refersadness on Twitter, but he may have deactivated) as pretty much all the data I used is courtesy of him, and Jesse Fischer (@jessefischer33) whose models and excellent work at tothemean influenced my approach a lot.
The features I chose for my model were mainly a mix of efficiency stats (ts%, 3p%, ft%, etc.), traditional stats (like ppg), attempt rate stats (ftr, 3par), slightly more advanced stats (ast%, trb%, stl%, blk%, etc.), all-in-one metrics (bpm), and measurements (wingspan, height, weight), along with age and class.
The data I used excluded players from 2019-21 since they’re still solidifying their roles within the league. Like last year, I used a multiclass neural network that outputs probabilities of a player being a star, starter, rotation player, or bust, but I went about preparing the data a little bit differently. First, the dataset had a pretty large class imbalance (way more busts than anything else), so I removed some of the busts to create a more balanced dataset. Once I did that, I separated players by position, normalized the numeric data within each position group using a min-max function, and then merged all the players back into one larger dataset. In essence, this contextualizes things a little bit better for the model. For example, a guard averaging 3 assists per game may be considered a mediocre or below-average passer, but a big doing the same thing would be considered above average. It also puts every feature on the same scale, so no feature carries extra weight just because of its units.
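Here is a minimal sketch of those two preparation steps (dropping some busts, then min-max scaling within each position) in plain Python. It's an illustration under assumptions rather than the actual pipeline: the field names (`position`, `role`, `apg`), the toy players, and the number of busts kept are all made up for the example.

```python
import random
from collections import defaultdict


def undersample(players, majority="bust", keep=2, seed=0):
    """Randomly keep at most `keep` players from the majority class."""
    rng = random.Random(seed)
    majority_rows = [p for p in players if p["role"] == majority]
    other_rows = [p for p in players if p["role"] != majority]
    kept = rng.sample(majority_rows, min(keep, len(majority_rows)))
    return other_rows + kept


def minmax_by_position(players, features):
    """Scale each feature to [0, 1] separately within each position group."""
    groups = defaultdict(list)
    for p in players:
        groups[p["position"]].append(p)

    scaled = []
    for rows in groups.values():
        # Min and max are computed per position, not across the whole dataset.
        bounds = {
            f: (min(r[f] for r in rows), max(r[f] for r in rows))
            for f in features
        }
        for r in rows:
            new_row = dict(r)  # copy so the input rows are left untouched
            for f, (lo, hi) in bounds.items():
                new_row[f] = (r[f] - lo) / (hi - lo) if hi > lo else 0.0
            scaled.append(new_row)
    return scaled  # all positions merged back into one list


# Hypothetical toy data: apg = assists per game.
players = [
    {"name": "Guard A", "position": "guard", "apg": 3.0, "role": "rotation"},
    {"name": "Guard B", "position": "guard", "apg": 4.0, "role": "bust"},
    {"name": "Guard C", "position": "guard", "apg": 5.0, "role": "bust"},
    {"name": "Guard D", "position": "guard", "apg": 7.0, "role": "star"},
    {"name": "Big A", "position": "big", "apg": 1.0, "role": "rotation"},
    {"name": "Big B", "position": "big", "apg": 2.0, "role": "bust"},
    {"name": "Big C", "position": "big", "apg": 2.5, "role": "bust"},
    {"name": "Big D", "position": "big", "apg": 3.0, "role": "starter"},
]

balanced = undersample(players, keep=2)
scaled = minmax_by_position(balanced, ["apg"])
by_name = {p["name"]: p for p in scaled}
print(by_name["Guard A"]["apg"], by_name["Big D"]["apg"])
```

Guard A and Big D both average 3 assists per game, but after scaling within position the guard sits at the bottom of his group and the big sits at the top of his, which is the contextualization described above.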
Results and How to Evaluate
I want to remind everyone reading this that I adjusted the model after releasing my scouting reports, so those reports originally showed different role probabilities than what you’re about to see. They should all be updated now though, so if you would like to revisit any of them, be my guest. Anyways, now that the background is taken care of, here are the model results for 2022:
As a reminder, the model gives us role probabilities (star, starter, rotation, bust), which are a great way to interpret results, BUT I know everybody loves rankings, so I’m going to go over a few ways you can rank these guys. For those who love high-floor players, go by who is least likely to be a bust (keep your eyes on that 3rd green column). For those who love high-ceiling players, sort by who is most likely to be a starter or star (2nd green column). Finally, for a combination of these approaches, you can use my own scoring method (listed as Score in the pic above) inspired by CJ Marchesani.
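The floor and ceiling rankings are just two different sorts over the same probabilities. A quick sketch in plain Python, using three made-up prospects (the names and numbers are hypothetical, not model output):

```python
# Hypothetical role probabilities for three made-up prospects.
prospects = [
    {"name": "Prospect A", "star": 0.10, "starter": 0.30, "rotation": 0.30, "bust": 0.30},
    {"name": "Prospect B", "star": 0.02, "starter": 0.18, "rotation": 0.60, "bust": 0.20},
    {"name": "Prospect C", "star": 0.20, "starter": 0.25, "rotation": 0.10, "bust": 0.45},
]

# High floor: lowest bust probability first.
by_floor = sorted(prospects, key=lambda p: p["bust"])

# High ceiling: highest combined starter + star probability first.
by_ceiling = sorted(
    prospects, key=lambda p: p["starter"] + p["star"], reverse=True
)

floor_order = [p["name"] for p in by_floor]
ceiling_order = [p["name"] for p in by_ceiling]
print(floor_order)
print(ceiling_order)
```

In this toy example, Prospect B tops the floor ranking but lands last on ceiling, while Prospect C is the reverse, which is why the two lists can look so different for the same draft class.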
Using this method, I took each prospect's probabilities and multiplied rotation probability by 53, starter probability by 77, and star probability by 96. This essentially creates an all-in-one metric to rank prospects by. It’s important to note that it’s an imperfect way to evaluate the model results, but if you have to rank the players, it gets the job done. Just remember that players who may seem to be separated by a lot are probably closer than they appear.
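As a rough sketch, the weighting could look like the snippet below. Two assumptions to flag: it assumes the three weighted probabilities are summed into one number, and it uses made-up probabilities on a 0-1 scale. The Score values listed in this article may be on a different scale, and any rescaling is not reproduced here.

```python
# Weights from the write-up; bust probability contributes nothing.
WEIGHTS = {"rotation": 53, "starter": 77, "star": 96}


def score(probs):
    """Weighted sum of role probabilities (summing is an assumption)."""
    return sum(weight * probs[role] for role, weight in WEIGHTS.items())


# Hypothetical prospect, probabilities expressed as fractions of 1.
example = {"star": 0.10, "starter": 0.30, "rotation": 0.30, "bust": 0.30}
example_score = score(example)
print(round(example_score, 2))
```

Because the weights rise with the quality of the outcome, moving probability from rotation to starter or star always raises the number, while any probability sitting on bust adds nothing.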
Historical Results
For context on how well the model has performed in past years, here are the results for 2019-21. You might want to put an asterisk next to 2020 because COVID definitely affected both the college season and how teams were able to develop the players they selected (no summer league, among other things).
Key Surprises
These six players ranked noticeably higher or lower in the model than consensus would suggest.
Aminu Mohammed
Model Score: 7.04
Model Rank: 5
Consensus Rank: 70
Mohammed was a 5-star recruit out of high school, but it was a bit of a surprise that he stayed in the draft. He had a pretty rough shooting year, but the model is likely betting on his size, length, pedigree, defensive activity, and volume scoring as a guard.
Christian Braun
Model Score: 6.48
Model Rank: 10
Consensus Rank: 32
Coming off a national championship at Kansas, Braun has seen his stock rise all the way to first-round consideration over this past draft cycle. His shooting, passing, and solid impact metrics headline his statistical profile.
Kennedy Chandler
Model Score: 6.54
Model Rank: 9
Consensus Rank: 24
While Chandler is undersized, his length, defense, playmaking, pedigree, age, and performance in impact metrics all likely boost his status in the eyes of the model.
Ochai Agbaji
Model Score: 3.07
Model Rank: 73
Consensus Rank: 17
Agbaji has been on the draft radar for a couple of years now, and his age, mediocre passing, and subpar defensive playmaking (low stl% and blk%) likely make the model hesitant.
Keegan Murray
Model Score: 5.31
Model Rank: 28
Consensus Rank: 5
Make no mistake, the model still thinks Keegan Murray will be a good NBA player. However, his older age, modest high school pedigree, and mediocre freshman year, among other things, make it harder for the model to see him as a high-ceiling player.
Tari Eason
Model Score: 4.77
Model Rank: 41
Consensus Rank: 16
Eason is a bit of a wild card in the eyes of the model, with similar probabilities for bust, rotation, and starter. His production at LSU is undeniable, but his role as a sixth man, foul trouble, and mediocre freshman year make it harder for the model to project him.
Future Outlook
As always, I’ll look to improve my model come next year. One of the biggest things for me is going to be improving model interpretability. Right now the best I can really do is see which features correlate most strongly with my Score metric across out-of-sample data. I might also look to switch up my feature selection, as it can probably be improved. Finally, I may have gone a little overboard in removing so many “busts,” since my results may be just a little inflated compared to the reality of succeeding in the league (the average out-of-sample player had the following role probabilities: bust 63.1%, rotation 21.5%, starter 11.9%, star 3.5%).
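For the interpretability point, the correlation check can be sketched in a few lines of plain Python. This assumes a Pearson correlation between each feature and Score, ranked by absolute value. The feature names and numbers below are hypothetical, not real out-of-sample data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant column has no defined correlation
    return cov / (spread_x * spread_y)


# Hypothetical out-of-sample rows: two features plus the model's Score.
rows = [
    {"ast_pct": 10.0, "age": 22.0, "score": 2.0},
    {"ast_pct": 20.0, "age": 19.0, "score": 4.0},
    {"ast_pct": 30.0, "age": 21.0, "score": 6.0},
    {"ast_pct": 40.0, "age": 20.0, "score": 8.0},
]

scores = [r["score"] for r in rows]
correlations = {
    feature: pearson([r[feature] for r in rows], scores)
    for feature in ("ast_pct", "age")
}

# Strongest relationship first, regardless of sign.
ranked = sorted(correlations.items(), key=lambda kv: abs(kv[1]), reverse=True)
for feature, r in ranked:
    print(feature, round(r, 2))
```

Keep in mind that this only captures linear, one-feature-at-a-time relationships, so it can miss interactions the neural network has picked up on.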
Other than that, there are still things I’m sure I can marginally improve. I’m still relatively new to this whole thing, so if you see that I’m doing anything wrong, please don’t hesitate to tell me. I’m always looking to learn and improve. Hopefully, you all enjoyed this breakdown and my draft coverage so far this year. I have one more article coming out the day before the draft, so keep your eyes peeled. As always, thanks for reading :)