Test Sims: A Devious Deviation?
by JZ (GM of the Brooklyn Nets)
Abstract: A method to simulate multiple seasons in FBB2001 is re-derived and applied to scouted grades from 6.0 using an early 5.0 league file. Statistics are presented in per game and per36 formats. Standard deviations within individual seasons and across a run of 100 ensembles are also shown. Results imply large standard deviations for player performance within a season, and smaller, yet still significant, deviations for player performance across multiple ensemble runs with the same boundary conditions. The code is released in the attached files.
Introduction
In my most recent devious deed, I’ve acquired an early (2008) 5.0 file and created an apparatus (with the help of eric and others in shout! ty!) to run a bunch of sims and then output the results into an excel file. Much ado has been made about the effectiveness of this approach, and while I understand the concern, I hope some of my findings can help alleviate the worry that test sims are an OP tool for some people to analyze draft results.
Literature Review
For people who don’t have a background in statistics, here’s a picture of what the standard deviation represents (taken from wikipedia):
In this diagram the x axis (horizontal) shows the values, measured in standard deviations from the mean, and the y axis (vertical) shows how likely each value is. This shape is commonly known as a “Bell Curve,” and many distributions in nature are approximately normal. The curve shows you the probability of receiving any one value, where 0 is the mean. How wide and how tall the curve is can vary, and that is where standard deviation comes in; in this chart it’s shown as σ. A sample can be expected to fall within one standard deviation of the mean (the region between -1σ and 1σ) 68.2% of the time (34.1% x 2). That means that out of the 100 simulations I did, ~2/3rds of them will fall within one standard deviation of the mean.
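To make that concrete, here is a minimal sketch of the 68% rule applied to an ensemble of 100 sims. This is my illustration, not part of the attached scripts, and the 15 ± 3 points-per-game numbers are made up for the example:

import numpy as np

# Draw 100 hypothetical simmed values for one stat, e.g. points per game.
rng = np.random.default_rng(seed=0)
sims = rng.normal(loc=15.0, scale=3.0, size=100)

mean, std = sims.mean(), sims.std()
within_one_std = np.abs(sims - mean) <= std

print(f"mean = {mean:.1f}, std = {std:.1f}")
print(f"{within_one_std.mean():.0%} of sims fall within 1 std")  # roughly 68%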
Methodology
Simulation Methods
At this point, I’ve just started looking into these sims, as inputting player ratings (and giving them updates) takes considerable time with FBB2001. *hattip to heavyreign* *double hat tip to eric for all the work he does* *triple hat tip to the commishes who did this all manually*. I spent all day yesterday writing the scripts to do this, and left it running overnight to simulate the same season 100 times (scripts are attached). I put 24 players in the sim season: some were players whose scouts I obtained from the 3002 6.0 Draft, and some were players from the 3003 6.0 Draft. Additionally, the rookies that were already in the base file are included for comparison. Altogether, 40 rookies received enough minutes to be included (cutoff of 15 minutes per game).
Depth Charts
One more disclaimer: depth charts drastically change the type of stats we see. I don’t think this is news to anyone, but some rookies who could have had good showings did not because of DC position. Because the sim has to be left running overnight (or at least for a few hours, and it completely ties up your computer), it takes significant time to sufficiently test different setups, whether that is uppies, depth chart settings, or any other variation you can imagine.
Statistics Reported
There are two different averages and standard deviations reported in the results. The first approach involves simulating 100 seasons and finding the average statistic of each game for each player. For example, over the 100 simulations Al Horford's average stats for Game 1 would be calculated (i.e. AvgMin 26, AvgMinSTD 4, Avgpts 15, AvgptsSTD 3, Avgtov 2, AvgtovSTD .5, etc.), and I would have these average stats for all 82 games on the schedule. I then calculated the mean/std of those per-game averages to find the final statistics reported below. This averaging and std calculation shows the aggregated stats of the expected value of the player for a simulated season. The std shows you the variation in a player’s performance during a single season (i.e. the typical stat line you would expect from that player in any given game). These are the same type of stats you could calculate from an individual season for a player, but they approximate the “expected” value due to the large number of samples (100). Use these for guessing what type of season a player is expected to have.
The second approach involves simulating 100 seasons, taking each player's season statistics from all 100 seasons, and finding the mean/std directly from that list (of ensemble length) of season statistics. This averaging scheme lets the standard deviation highlight deviations from the expected season-long statistics of a player. It shows you the variation between season-long statistics, which tells you how useful one individual simulation of a season is for a given player build. Use the standard deviation metrics from this approach to estimate how representative your individual test simulation is of the expected statistics from the previous method.
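For clarity, here is a minimal sketch of one plausible reading of the two schemes (my illustration, not the attached script). It assumes one player's minutes have already been collected into a 100 x 82 array of sims by games; the 22.5 ± 5.0 numbers are placeholders:

import numpy as np

# Hypothetical input: one player's minutes, shape (n_sims, n_games) = (100, 82).
# In the real pipeline this would come from the combined .mdb output.
rng = np.random.default_rng(seed=0)
minutes = rng.normal(loc=22.5, scale=5.0, size=(100, 82))

# Approach 1: per game slot, average across the 100 sims (AvgMin) and take the
# std across the 100 sims (AvgMinSTD), then aggregate those over the 82 games.
avg_min = minutes.mean(axis=0)        # shape (82,): AvgMin for each game
avg_min_std = minutes.std(axis=0)     # shape (82,): AvgMinSTD for each game
expected_line = avg_min.mean()        # expected per game stat line
in_season_std = avg_min_std.mean()    # typical game-to-game spread

# Approach 2: per sim, average over the 82-game season, then take the mean/std
# across the 100 season averages -- the "sim-to-sim" (ensemble) deviation.
season_avgs = minutes.mean(axis=1)    # shape (100,)
ensemble_mean = season_avgs.mean()
ensemble_std = season_avgs.std()

print(f"within-season: {expected_line:.1f} +/- {in_season_std:.1f} min")
print(f"sim-to-sim:    {ensemble_mean:.1f} +/- {ensemble_std:.1f} min")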
Code Release
The AHK code to run the FBB sim iteratively, as well as a python script to combine the output mdbs, is included in the attachments. Small tweaks may need to be made to run on your setup; feel free to ask for help at any time.
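If you want a rough idea of the combining step before opening the attachments, here is a minimal sketch of the same idea (not the attached script itself). It assumes mdbtools is installed, the outputs are named sim_*.mdb, and the stats live in a table called PlayerGameStats; all three are placeholders to adjust for your setup:

import glob
import io
import subprocess

import pandas as pd

frames = []
for i, path in enumerate(sorted(glob.glob("sim_*.mdb"))):
    # mdb-export (from mdbtools) dumps an Access table to stdout as CSV.
    csv_text = subprocess.run(
        ["mdb-export", path, "PlayerGameStats"],
        capture_output=True, text=True, check=True,
    ).stdout
    df = pd.read_csv(io.StringIO(csv_text))
    df["sim"] = i  # tag every row with which simulation it came from
    frames.append(df)

combined = pd.concat(frames, ignore_index=True)
combined.to_csv("combined_sims.csv", index=False)  # or .to_excel(...) for excel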
Results
Expected per game
First I’ll show the per game statistics for these rookies, in alphabetical order. The color codes for all of these tables are generally Green=Good and Red=Bad, so higher minutes, points, and the like are marked Green, while more fouls or turnovers are marked Red.
Figure 1) Mean per game statistics for any given season
Expected per36
I’ve also calculated the per36 statistics for the players, shown here:
Figure 2) Mean per36 statistics for any given season
Expected Deviations
The following are the per game statistics, with the STD of each statistic shown. These are the standard deviations within an individual average season; e.g. Al Horford, one of the best rookie builds we’ve seen so far in 6.0, is expected to play an average of 23.5 minutes per game (in a 5.0 file), with a standard deviation of 6.5 minutes.
Figure 3) Statistics from Figure 1 plotted with the standard deviation of those statistics within the season
Sim-to-Sim deviations
The following is what I will call the “sim-to-sim” deviation. Again, we show the expected standard deviations of the per game statistics, but this time the STD column refers to the expected difference from one sim to another.
Figure 4) Ensemble standard deviation. The expected deviation from the mean value for any given season statistic.
Discussion
As you can see, the 6.0 rookie players don’t do particularly well when compared to the 5.0 rookies. This shows us that early 5.0 had much better players than we currently do in 6.0. Further reinforcing that fact, the raw statistics shown in Figure 1 and Figure 2 are not representative of the statistics we saw for players from the 6.0 3002 Rookie Draft (Michael Brooks, Kevin McHale, etc. perform much worse here than they did in 6.0). However, the better players still tend to look better than the worse ones. "Well wait a minute," you might be thinking, "even if the test sim doesn't produce realistic statistics, you can still get a large advantage by looking at the rookies' stats relative to each other." Well, have I got an answer for you...
Devious deviations
The answer lies in the deviations. The first thing you notice is a ton of deviation (shown in Figure 3) among some very basic statistics within the season. To call out an example, Al Horford averaged 22.7 mpg, with a STD of 5.4 minutes. That means in some games he will play around 28 minutes, and in others around 17 minutes. This is a range of ~±24%, for a total value swing of nearly 48% depending on which game he is playing in, and that is just for the most normal 68% of games; the other 32% of games will fall outside that range.
Figure 4 shows the ensemble standard deviation. This is the money shot of the article, because the big question to answer is "How representative is this particular simulation I've done of the expected simulation of a player?". While this varies for different prospects, we'll look at Mike Conley's minutes per game for discussion purposes. You can find his average minutes per season in Figure 1 (22.5), and the standard deviation of those minutes in Figure 4 (1.6). This means that in ~68% of the seasons simmed, Mike Conley played within 1.6 minutes of his 22.5 minute average. Doubling the standard deviation shows that in 95% of all seasons simmed, Mike Conley played within 3.2 minutes of his average, a range of approximately ±15%. This all happened with no changes to the depth chart and no players going through TC. While this range is not as large as the in-season range, a 30% swing in minutes for Mike Conley is quite large, especially with no changes in the boundary conditions (same season, same league, same DC, same EVERYTHING).
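For anyone who wants to reproduce the arithmetic behind those two ranges, here is a quick sketch using the numbers quoted above (Horford's in-season 22.7 ± 5.4 mpg and Conley's sim-to-sim 22.5 ± 1.6 mpg):

def swing(mean, std, n_std=1):
    """Return the +/- n_std range around the mean and the total swing in %."""
    lo, hi = mean - n_std * std, mean + n_std * std
    total_pct = 100 * 2 * n_std * std / mean
    return lo, hi, total_pct

# In-season (~68% of games, 1 std): about 17.3 to 28.1 minutes, a ~48% swing.
print(swing(22.7, 5.4, n_std=1))

# Sim-to-sim (~95% of seasons, 2 std): about 19.3 to 25.7 minutes, a ~28% swing.
print(swing(22.5, 1.6, n_std=2))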
Listening to some of the latest podcasts and articles, and talking in discord, we’ve heard wild variations in some of these simulation statistics, with Kevin Durant being a notable example. Ian can do test sims where he looks fantastic, and Dirt will find him averaging 6 minutes a game. These examples are likely not outliers, but rather the result of Training Camp, Depth Chart variations, or league and team composition, as we don't see variations that large in the test without changing those boundary conditions.
The most value, I think, comes not from finding out their stats, but from how they compare to each other, and it is this strategy that people have started to sound off concerns about, especially when using a file from 5.0. I hope that my results and discussion here help alleviate some of that concern, showing a nearly 30% uncertainty for Mike Conley's minutes per game across 95% of the simulated seasons. However, you can see some patterns between these players' stats here and their rookie stats in 6.0. Additionally, with so many depth chart variations needed to properly compare different rookies in the same situation, it would take a significant amount of time to make a confident assessment that one rookie is clearly superior to another, and it would be unrealistic to properly test this within the time limits of the draft.
Limitations
While limitations of this study were discussed in-text, a list of notable limitations is provided here for clarity:
1) Only 100 simulations have been run
2) Only one league composition was used
3) Only one team composition was used
4) Only one DC was used
5) Injuries were off
Future Work
More detailed analysis into the range of standard deviations for individual statistics is planned. While we discussed an example (Conley's minutes) of deviation across simulations, rigorous analysis was not applied to find expected deviation values based on player stats, ratings, positions, depth chart changes, or training camps. Future work should account for, rather than merely acknowledge, the limitations listed above: more simulations, more league compositions, more team compositions, and more depth charts. In addition, future work would not only account for these limitations but study the differences between simulations when these boundary conditions are modified. Finally, a statistical look at TCs and a replication of eric’s work in the mechanics thread will be explored.
Acknowledgements
To Dirt and Ian for inspiring me to do these simulations. To Odin for giving me access to a TMBSL league file. To Eric for his help with AHK and FBB. And finally to everyone in shout for their help with FBB and other related materials.