Predicting the best players of the 2018 draft

We are two weeks away from the beginning of the season and the rookies are already putting up good performances (Deandre Ayton's 24/9/1 line against the Kings, for example). In this article, I will try to predict the best players of the 2018 draft, based on previous seasons.

To evaluate a player’s performance, I considered two existing statistics:

  • Win Shares (WS) : through a fairly complex formula, it estimates the number of wins contributed by a player.

  • Player Efficiency Rating (PER) : it measures a player’s per-minute productivity by adding up all the positive contributions and subtracting the negative ones.

To keep things simple, I focused only on players selected with picks 1 through 20.

The Data

To feed my machine learning models, I selected all the players drafted with picks 1 through 20 since 2012 (the draft year of Anthony Davis and Bradley Beal). I used Basketball Reference to collect the NBA stats and Sports Reference for the NCAA ones. I selected:

  • Age : age of the player

College stats (only the season where the player got drafted) :

  • G : games played

  • MP : minutes played per game

  • FG % : field goal percentage

  • FT % : free throw percentage

  • 3P % : three-point percentage

  • TRB : number of total rebounds per game

  • AST : number of assists per game

  • BLK : number of blocks per game

  • STL : number of steals per game

  • TOV : number of turnovers per game

  • PF : number of personal fouls per game

NBA stats :

  • WS

  • PER

(For my models there is no big difference between a 1st pick and a 20th pick, so I did not include the pick number as a feature.)
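To make the layout of one row concrete, here is a minimal sketch of a record holding the inputs and the two targets listed above. The class name `PlayerRow`, the field names and the example values are all assumptions made for illustration; they are not taken from the actual data set.

```python
from dataclasses import dataclass, astuple


@dataclass
class PlayerRow:
    """One drafted player: age, college stats, then the two NBA targets."""
    age: int
    g: int            # college games played
    mp: float         # minutes per game
    fg_pct: float     # field goal percentage
    ft_pct: float     # free throw percentage
    three_pct: float  # three-point percentage
    trb: float        # total rebounds per game
    ast: float        # assists per game
    blk: float        # blocks per game
    stl: float        # steals per game
    tov: float        # turnovers per game
    pf: float         # personal fouls per game
    ws: float         # target: Win Shares
    per: float        # target: Player Efficiency Rating

    def features(self):
        # The 12 model inputs: every field except the two NBA targets.
        return list(astuple(self))[:-2]

    def targets(self):
        return {"WS": self.ws, "PER": self.per}


# Made-up values for a hypothetical player, only to show the shape of a row.
example = PlayerRow(19, 35, 31.2, 0.55, 0.72, 0.34,
                    9.8, 1.5, 2.1, 0.9, 2.0, 2.6, 3.1, 14.8)
print(len(example.features()))
print(example.targets())
```

Keeping WS and PER apart from the inputs reflects the fact that each one is predicted separately from the same set of college stats.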

Glimpse of the first lines of the data

My Models

I used 6 different models:

  • Linear Regression

  • Support Vector Regression (SVR)

  • kNN Regression

  • Multi-Layer Perceptron Regression

  • Ridge Regression

  • Lasso Regression
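As an illustration, the six models above could be set up with scikit-learn as in the sketch below. The choice of library, the hyperparameters and the synthetic data are all assumptions: the article does not say which implementation or settings were used, and the random numbers only stand in for the real player table.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.svm import SVR
from sklearn.neighbors import KNeighborsRegressor
from sklearn.neural_network import MLPRegressor

# Hyperparameters are placeholders, not the values used in the article.
models = {
    "Linear": LinearRegression(),
    "SVR": SVR(),
    "kNN": KNeighborsRegressor(n_neighbors=5),
    "MLP": MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
    "Ridge": Ridge(alpha=1.0),
    "Lasso": Lasso(alpha=0.1),
}

# Synthetic stand-in data: 100 players x 12 features, one numeric target.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 12))
y = X @ rng.normal(size=12) + rng.normal(scale=0.5, size=100)

predictions = {}
for name, model in models.items():
    model.fit(X, y)                           # all six share fit/predict
    predictions[name] = model.predict(X[:3])  # predictions for three rows
    print(name, predictions[name].round(2))
```

Because every estimator exposes the same `fit` and `predict` methods, comparing the six models comes down to a single loop.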

I split my data set of 100 players (drafted from 2012 to 2017) into a training set of 75 players and a test set of 25 players, chosen at random.
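A random 75/25 split like this one can be sketched in a few lines. The helper name `split_players`, the seed and the placeholder identifiers are assumptions for illustration; the article does not describe how the random draw was made.

```python
import random


def split_players(players, n_test=25, seed=0):
    """Shuffle a copy of the list and hold out n_test players for testing."""
    shuffled = list(players)
    random.Random(seed).shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]


players = [f"player_{i}" for i in range(100)]  # placeholder identifiers
train, test = split_players(players)
print(len(train), len(test))
```

Fixing the seed makes the split reproducible, so every model is evaluated on the same 25 held-out players.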

The best model will be the one with the lowest Mean Squared Error (MSE) and approximately normally distributed residuals.
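The selection rule can be sketched as follows, using tiny made-up numbers and two hypothetical models (`model_a`, `model_b`). The summary of the residuals (mean, standard deviation, skewness) is only a rough symmetry check that I am assuming here; it is not a formal normality test, and a histogram or Q-Q plot would be the natural visual complement.

```python
from statistics import mean, pstdev


def mse(y_true, y_pred):
    """Mean of the squared differences between true and predicted values."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def residual_summary(y_true, y_pred):
    """Mean, spread and skewness of the residuals (true minus predicted)."""
    res = [t - p for t, p in zip(y_true, y_pred)]
    m, s = mean(res), pstdev(res)
    skew = mean(((r - m) / s) ** 3 for r in res) if s > 0 else 0.0
    return {"mean": m, "std": s, "skew": skew}


# Made-up values, not results from the article.
y_true = [2.0, 4.0, 1.0, 3.0, 5.0]
preds = {
    "model_a": [2.5, 3.5, 1.5, 2.5, 5.5],
    "model_b": [3.0, 3.0, 3.0, 3.0, 3.0],
}

scores = {name: mse(y_true, p) for name, p in preds.items()}
best = min(scores, key=scores.get)  # lowest MSE wins
print(scores, best)
print(residual_summary(y_true, preds[best]))
```

Residuals centered near zero with little skew are consistent with a normal distribution, but they do not prove it.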

WS Prediction

Here are the MSE results for all our models:

The best model seems to be Ridge Regression. Let’s take a closer look at its residuals.

We can accept that the residuals of the Ridge model are approximately normally distributed.

Using Ridge Regression, we can now predict the WS of the players drafted in 2018.

The 5 best rookies of the 2018-2019 season based on WS should be Jaren Jackson Jr. (Grizzlies), Mohamed Bamba (Magic), Wendell Carter Jr. (Bulls), Deandre Ayton (Suns) and Mikal Bridges (Suns).
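Going from predicted values to a top 5 is a simple sort. The sketch below uses made-up predictions for hypothetical players ("Player A" to "Player F"), not the numbers behind the ranking above, and the helper name `top_n` is an assumption.

```python
def top_n(predicted, n=5):
    """Return the n (player, value) pairs with the highest predicted value."""
    return sorted(predicted.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Made-up predictions for hypothetical players, not the article's numbers.
predicted_ws = {
    "Player A": 2.1, "Player B": 3.4, "Player C": 0.8,
    "Player D": 2.9, "Player E": 1.7, "Player F": 3.0,
}

for name, ws in top_n(predicted_ws):
    print(f"{name}: {ws:.1f}")
```

The same helper works for the PER ranking by passing the PER predictions instead.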

PER Prediction

Here are the MSE results for all our models:

The best model seems to be Lasso Regression. Let’s take a closer look at its residuals.

We can accept that the residuals of the Lasso model are approximately normally distributed.

Using this model, we can predict the PER of the players drafted in 2018.

The 5 best rookies of the 2018-2019 season based on PER should be Mohamed Bamba (Magic), Jaren Jackson Jr. (Grizzlies), Wendell Carter Jr. (Bulls), Deandre Ayton (Suns) and Zhaire Smith (76ers).

You are probably wondering why there is no Luka Doncic. I couldn’t take him into account because he played in Europe, so it makes no sense to compare his stats with those of NCAA players. However, judging by what he has already achieved, he could well be among the top 5 rookies of the year.

Who will be the nominees for the 2019 Rookie of the Year award? We will see at the end of the season!