Friday, March 11, 2016

Discussion: Scoring

Going into our fourth competition, we now have enough objective data to completely redo the scoring system and make it as objective as possible. Scoring for the first LOTR competition was done entirely by subjective judgement, based on our combined estimate of move difficulty. After each competition, we reviewed the results and adjusted the scoring for the next one, but the system was still largely subjective. Now I want to present a formulaic approach that generates scores for moves based on their objective difficulty.

The Formula
I don't think that the precise scoring system matters that much; what matters is separating competitors as accurately and fairly as possible. This we have already achieved at the top level in past competitions - I doubt any competitors felt they would have, or should have, done better under a different scoring system. Going forward, scoring for moves that have been done (based on logged scores in the LOTR database) should be formula based. The formula I am proposing is:

((1/x)^0.25 - 0.75) * 250 + 37.5

With x being the chance of success for a given move, expressed as a fraction (so a 100% success rate is x = 1, which gives 100 points). What this means, or rather looks like, is this:
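As a rough sketch, the formula can be written as a small Python function. The function name `move_points` and the sample rates are my own choices for illustration, and the sketch assumes x is passed as a fraction (0.5 for 50%) rather than as a whole-number percentage.

```python
def move_points(success_rate: float) -> float:
    """Points for a move, given its success rate as a fraction in (0, 1]."""
    if not 0.0 < success_rate <= 1.0:
        raise ValueError("success_rate must be a fraction in (0, 1]")
    # The proposed formula: ((1/x)^0.25 - 0.75) * 250 + 37.5
    return ((1.0 / success_rate) ** 0.25 - 0.75) * 250 + 37.5


# Sample rates chosen for illustration only.
for rate in (1.0, 0.5, 0.25, 0.10):
    print(f"{rate:.0%} success -> {round(move_points(rate))} points")
```

Because the success rate sits under a fourth root, the points grow slowly for common moves and much faster as the success rate approaches zero.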

The Statistics
I based my statistics on the unique competitors who have done the move 1:3-5 - there are 65 of them. That group defines the "100%" chance of success, because competitors who haven't done that move are not included in the statistics. I designed the curve to climb steeply towards the top end of difficulty (moves done by fewer than 25% of the competitors), as this rewards competitors for pushing their limits as much as possible. Next, we can look at the success rate for each of the moves.

Description   Completed   Success Rate   Points
1:2-3 4f      -           -              75
1:3-4 4f      -           -              85
1:2-4 4f      -           -              90
1:3-5 4f      65          100%           100
1:4-5 4f      63          97%            102
1:2-5 4f      57          88%            108
1:4-6 4f      39          60%            134
1:3-6 4f      37          57%            138
1:2-3 2f      35          54%            142
1:5-6 4f      32          49%            148
1:2-4 2f      28          43%            159
1:3-4 2f      27          42%            161
1:3-5 2f      22          34%            178
1:4-5 2f      20          31%            186
1:5-7 4f      19          29%            190
1:4-7 4f      18          28%            195
1:2-6 4f      17          26%            200
1:2-5 2f      16          25%            205
1:4-6 2f      10          15%            249
1:5-8 4f      8           12%            272
1:3-6 2f      8           12%            272
1:3-7 4f      4           6.2%           352
1:5-8.5 4f    3           4.6%           389
1:6-7 4f      2           3.1%           447
1:5-6 2f      2           3.1%           447
1:6-8 4f      1           1.5%           560
1:2-6 2f      1           1.5%           560
1:4-7 2f      1           1.5%           560
1:4-8 4f      1           1.5%           560

This chart also lists points for the three moves easier than 1:3-5 (1:2-3, 1:2-4, 1:3-4); the points for those were decided arbitrarily. The "Completed" column shows how many of the 65 competitors did each move; for example, the next hardest move after 1:3-5 is 1:4-5, which was done by 63 of the 65 competitors, so that move has a 97% success rate. In the middle of the difficulty spectrum is 1:5-6 (4 fingered), which 32 competitors did, or 49%. At the top end there are four moves that have been logged by only one competitor: 1:6-8 4f, 1:2-6 2f, 1:4-7 2f, and 1:4-8 4f. Each of these has a 1.5% success rate.
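To make the arithmetic behind the chart concrete, here is a minimal sketch that turns a "Completed" count into a success rate and a point value. The helper name `score_row` and the small `logged` dictionary are illustrative assumptions; the counts themselves are copied from the chart.

```python
BASELINE = 65  # competitors who have logged 1:3-5 4f, per the post


def move_points(success_rate):
    """Proposed formula; success_rate is a fraction in (0, 1]."""
    return ((1.0 / success_rate) ** 0.25 - 0.75) * 250 + 37.5


def score_row(completed, baseline=BASELINE):
    """Return (success rate, rounded points) for a move done by `completed` people."""
    rate = completed / baseline
    return rate, round(move_points(rate))


# A few rows from the chart: move -> number of competitors who completed it.
logged = {"1:4-5 4f": 63, "1:5-6 4f": 32, "1:6-8 4f": 1}
for move, completed in logged.items():
    rate, points = score_row(completed)
    print(f"{move}: {completed}/{BASELINE} = {rate:.1%}, {points} points")
```

The three sample rows come out at 102, 148, and 560 points, matching the chart.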

Undone Moves
We must also generate scores for moves that have never been done (in the LOTR database - clearly, many of them have been done before) and for stacks. LOTR 2016 will be the first year that we score mono moves, so we don't yet have statistics on those moves either. First, I'll cover my approach to assigning scores to the 4 and 2 finger moves that have not been done; these are:

Description   Estimated Success Rate   Points
1:5-9 4f      0.77%                    694
1:6-9 4f      0.77%                    694
1:5-7 2f      1.4%                     579
1:5-8 2f      0.77%                    694
1:3-7 2f      0.38%                    854
1:5-8.5 2f    0.25%                    968
1:6-7 2f      0.19%                    1,051
1:4-8 2f      0.09%                    1,279
1:6-8 2f      0.09%                    1,279
1:5-9 2f      0.05%                    1,549
1:6-9 2f      0.05%                    1,549

The two undone 4 finger moves (1:5-9 and 1:6-9) were assigned half the probability of the hardest 4 finger move, so these two moves have an estimated success rate of 0.77%. For the undone 2 finger moves, I estimated the chance of success by looking at the ratios between the 4 finger moves and applying them to the 2 finger moves. Again, this doesn't have to be perfect; it simply needs to be directionally accurate and fair.
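The halving rule for the undone 4 finger moves can be checked with a short sketch. It assumes the hardest logged 4 finger move has a success rate of exactly 1/65 (one competitor out of the 65), and the name `undone_4f_rate` is hypothetical. The ratio method for the 2 finger moves is not spelled out in enough detail to reproduce, so the sketch only plugs one listed 2 finger estimate into the formula.

```python
BASELINE = 65  # competitors who have logged 1:3-5 4f, per the post


def move_points(success_rate):
    """Proposed formula; success_rate is a fraction in (0, 1]."""
    return ((1.0 / success_rate) ** 0.25 - 0.75) * 250 + 37.5


def undone_4f_rate(baseline=BASELINE):
    """Half the success rate of the hardest logged 4 finger move (1 of 65)."""
    hardest_logged_rate = 1 / baseline
    return hardest_logged_rate / 2


rate = undone_4f_rate()
print(f"1:5-9 4f and 1:6-9 4f: {rate:.2%} -> {round(move_points(rate))} points")

# One estimated 2 finger rate from the table, fed straight into the formula.
print(f"1:5-8.5 2f: 0.25% -> {round(move_points(0.0025))} points")
```

The percentages in the table are rounded for display, so feeding some of the other listed rates back into the formula can land a few points away from the listed values.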

Monos
For the mono moves, I simply doubled the points of the 2 finger move. This probably makes the mono moves a little under-valued, but it's where we started with 2 fingering and that worked out fine. The mono points are:

Description   Points
1:2-3 1f      284
1:2-4 1f      317
1:3-4 1f      323
1:3-5 1f      356
1:4-5 1f      371
1:2-5 1f      410
1:4-6 1f      498
1:3-6 1f      544
1:5-6 1f      894
1:2-6 1f      1,120
1:4-7 1f      1,120
1:5-7 1f      1,158
1:5-8 1f      1,388
1:3-7 1f      1,708
1:5-8.5 1f    1,936
1:6-7 1f      2,103
1:4-8 1f      2,557
1:6-8 1f      2,557
1:5-9 1f      3,098
1:6-9 1f      3,098
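
The doubling rule can be sketched in a few lines. One assumption on my part: the doubling appears to be applied to the unrounded 2 finger score, since 1:2-4 1f is listed as 317 rather than twice the rounded 159. The function name `mono_points` is hypothetical, and the counts are the 2 finger "Completed" values from the statistics chart.

```python
def move_points(success_rate):
    """Proposed formula; success_rate is a fraction in (0, 1]."""
    return ((1.0 / success_rate) ** 0.25 - 0.75) * 250 + 37.5


def mono_points(two_finger_completed, baseline=65):
    """Double the unrounded 2 finger score, then round (assumed order of operations)."""
    two_finger_score = move_points(two_finger_completed / baseline)
    return round(2 * two_finger_score)


# 2 finger completion counts: 1:2-3 2f (35), 1:2-4 2f (28), 1:5-6 2f (2).
for move, completed in [("1:2-3 1f", 35), ("1:2-4 1f", 28), ("1:5-6 1f", 2)]:
    print(move, mono_points(completed))
```

With this order of operations the three sample moves come out at 284, 317, and 894, the same as the mono table.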

Stacks
Finally, I wanted to review the stack scoring by looking at the odds of doing the stack compared to doing the base moves in the stack individually. Another way of saying that is: the points for a stack should be determined by the success rate among people who can do the base moves of that stack. So for each stack, I counted how many people did all the base moves and then how many of those people did the stack.

Fingers   Stack       Base Moves      Individual Moves   Completed Stack   Success Rate   Stack Score
4         1:2-5-6-9   1:2-5 & 1:4-5   56                 33                59%            135
4         1:2-6-9     1:2-6 & 1:5-8   8                  4                 50%            147
4         1:3-5-7-9   1:3-5 x3        65                 37                57%            138
4         1:3-5-9     1:3-5 & 1:3-7   4                  2                 50%            147
4         1:3-6-8     1:3-6 & 1:4-6   34                 29                85%            110
4         1:3-6-9     1:3-6 & 1:4-7   18                 10                56%            140
4         1:3-7-9     1:3-7 & 1:5-7   4                  3                 75%            119
4         1:4-5-8     1:4-5 & 1:2-5   56                 34                61%            133
4         1:4-5-9     1:4-5 & 1:2-6   17                 7                 41%            162
4         1:4-6-9     1:4-6 & 1:3-6   34                 18                53%            143
4         1:4-7-9     1:4-7 & 1:4-6   18                 15                83%            112
4         1:5-6-9     1:5-6 & 1:2-5   32                 25                78%            116
4         1:5-7-9     1:5-7 & 1:3-5   19                 17                89%            107
2         1:2-5-6-9   1:2-5 & 1:4-5   15                 4                 27%            297
2         1:3-5-7-9   1:3-5 x3        22                 6                 27%            294
2         1:3-6-8     1:3-6 & 1:4-6   7                  3                 43%            238
2         1:4-5-8     1:4-5 & 1:2-5   15                 4                 27%            297
2         1:4-6-9     1:4-6 & 1:3-6   7                  2                 29%            288
2         1:5-6-9     1:5-6 & 1:2-5   2                  1                 50%            221
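
The counting step and the resulting stack score can be sketched as follows. The layout of `example_logs` (competitor name mapped to the set of moves they have logged) is a hypothetical stand-in for the LOTR database, and the function names are mine. The 2 finger stack scores in the table are higher than the plain formula gives, which suggests an extra weighting that isn't described here, so the sketch only checks 4 finger rows.

```python
def move_points(success_rate):
    """Proposed formula; success_rate is a fraction in (0, 1]."""
    return ((1.0 / success_rate) ** 0.25 - 0.75) * 250 + 37.5


def stack_counts(logs, base_moves, stack):
    """Count competitors who did every base move, and how many of those did the stack."""
    eligible = [done for done in logs.values() if set(base_moves) <= done]
    return len(eligible), sum(stack in done for done in eligible)


def stack_points(base_completed, stack_completed):
    """Score a stack from its success rate among competitors who did the base moves."""
    return round(move_points(stack_completed / base_completed))


# Hypothetical log data, only to show the counting step.
example_logs = {
    "A": {"1:3-6", "1:4-6", "1:3-6-8"},
    "B": {"1:3-6", "1:4-6"},
    "C": {"1:3-6"},
    "D": {"1:3-6", "1:4-6", "1:3-6-8"},
}
print(stack_counts(example_logs, ["1:3-6", "1:4-6"], "1:3-6-8"))

# Counts taken from the 4 finger rows of the stack table.
for stack, base, done in [("1:2-5-6-9", 56, 33), ("1:3-6-8", 34, 29), ("1:5-7-9", 19, 17)]:
    print(stack, stack_points(base, done))
```

Scoring on the conditional success rate means a stack is valued by how much harder it is than its base moves, not by how hard those base moves are in the first place.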

Proxy System Modifications
Everyone agreed with my proposed changes to the proxy system. 1:2-6-9 will be the cutoff for the "hard" proxy moves. This means that top competitors only have three 4 fingered stacks to do (1:2-6-9, 1:3-7-9, and 1:3-5-9), which significantly reduces the total number of moves.

Please email me with any questions about scoring, or leave comments below.