New Ranking System

Everything dealing with the video game developed by Cyanide!
dode74
Posts: 7034
Joined: 11 December 2008, 11:18
Location: Nr. Reading, UK

Re: New Ranking System

Postby dode74 » 12 November 2016, 23:29

When you play on the leaderboard and want to maximize your chance of qualifying, you don't ask "who can I catch?" but rather "am I more likely to improve or worsen my current rating by playing an additional game?"
Then you're thinking about it wrong. Your rating relative to other players is what matters, not your rating relative to your own previous rating. If someone else is at a similar rating they likely have a similar amount to gain or lose as you unless there is a disparity in games played.
Now imagine a system in which win rate dominates. I sit on a 25-3-2 record. My current win rate is 88.33%. If I lose the next game, it drops to 85.5%. After that loss, I can get back to roughly an 88.33% win rate only with a record of 32-3-3, i.e. with 7 more wins. With a loss costing me 7 wins, on a 25-3-2 record I'm likely to play one more game only if I think I have a better than 88% (7/(7+1)) chance of winning it. Hence the stalling on "lucky" records and the rerolls after losing a game.
Given that's what you've been achieving so far there's no real reason to think you'll not achieve that.
But you're talking about an extreme, win%-only rating which is not what I am looking at at all.
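For anyone wanting to re-derive the figures in the quoted 25-3-2 example, here is a minimal sketch. It assumes records are written wins-draws-losses and that a draw counts as half a win; neither is stated in the thread, but together they match the 88.33 and 85.5 figures. The function names are made up for illustration.

```python
def win_rate(wins, draws, losses):
    """Win percentage, counting a draw as half a win (an assumption)."""
    games = wins + draws + losses
    return 100.0 * (wins + 0.5 * draws) / games


def wins_to_recover(wins, draws, losses):
    """Consecutive wins needed after one extra loss to reach the old rate again."""
    target = win_rate(wins, draws, losses)
    losses += 1  # the hypothetical lost game
    extra = 0
    while win_rate(wins + extra, draws, losses) < target:
        extra += 1
    return extra


print(round(win_rate(25, 3, 2), 2))   # 88.33
print(round(win_rate(25, 3, 3), 2))   # 85.48
print(round(win_rate(32, 3, 3), 2))   # 88.16
print(wins_to_recover(25, 3, 2))      # 8
```

Under that assumption 7 straight wins get back to 88% to the nearest whole percent, while strictly matching 88.33% takes an eighth win, so the cost of one loss is about 7 to 8 wins.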
I also want to point out that a player with a long-term win rate of 80% has a 16% chance of obtaining a 25-0-3 record or better (89%+ win rate) and a 2% chance of obtaining a 90-0-12 record or better (88%+ win rate). Thus, a higher win rate with x games isn't necessarily "playing at a higher standard" than a lower win rate with 2x or 3x games. Is there no way to have a rating system that takes into consideration the higher variance generated by a lower number of games played?
He is demonstrably playing more consistently, though.
More games played gives more ranking points. That takes that variance into account. An 80% player with 40 games will be ahead of an 80% player with 30 games.

Veggente
Posts: 21
Joined: 10 July 2016, 23:31

Re: New Ranking System

Postby Veggente » 12 November 2016, 23:55

He is demonstrably playing more consistently, though.
The maths you quoted don't support this claim.

dode74
Posts: 7034
Joined: 11 December 2008, 11:18
Location: Nr. Reading, UK

Re: New Ranking System

Postby dode74 » 12 November 2016, 23:57

I'm not sure what you're talking about. 90-0-12 will beat 25-0-3 under the new system because the former team has demonstrated consistency over more games played.

I think you are under the false impression that it's win% only under the new system. It's not.

Veggente
Posts: 21
Joined: 10 July 2016, 23:31

Re: New Ranking System

Postby Veggente » 13 November 2016, 00:45

I'm not sure what you're talking about. 90-0-12 will beat 25-0-3 under the new system because the former team has demonstrated consistency over more games played.

I think you are under the false impression that it's win% only under the new system. It's not.
I had to remove the draws to more easily calculate the probabilities. We started with a 91-8-13 (84.8%) being behind a 25-3-2 (88.3%). The former STILL demonstrated more consistency than the latter.

Demonstration:

Take a player with a long-term win rate of 84.8% (91-8-13). He has a 31% chance of getting 27 or more wins over 30 games (and thus of beating the guy with the 25-3-2 record, 27-0-3 to simplify, 88.3% win rate). He has a 67.1% chance of doing that at least once in 3 full attempts (90 games), and surely a higher chance if he rerolls after the first loss in the first 10 games. Thus, the 84.8% player with 112 games is more likely than not to be a better player than the 88.3% player with 30 games.

In short, the math indicates that you are discounting the added difficulty of achieving a high win rate over a lot of games compared to achieving it over far fewer games (with rerolling after a loss as a viable option).
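The probabilities in the demonstration can be checked with a plain binomial tail. This is only a sketch under the same simplifications as the post: draws are ignored, every game is treated as an independent win-or-loss trial with a fixed win probability, and `tail_prob` is an illustrative name, not anything from the ranking system.

```python
from math import comb


def tail_prob(n, k, p):
    """P(at least k wins in n independent games), each won with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# An 84.8% player getting 27+ wins in one 30-game run
p_run = tail_prob(30, 27, 0.848)

# Chance that at least one of three independent 30-game runs gets there
p_any_of_three = 1 - (1 - p_run) ** 3

print(f"one run: {p_run:.3f}, any of three runs: {p_any_of_three:.3f}")

# The earlier 80%-player figures: 25+ wins in 28 games, 90+ wins in 102 games
print(f"{tail_prob(28, 25, 0.8):.3f}, {tail_prob(102, 90, 0.8):.3f}")
```

The single-run figure comes out at about 31% and the three-run figure at about 67%, in line with the numbers above. The three-run step treats each run as a fresh, independent attempt.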

JimmyFantastic
Posts: 495
Joined: 28 February 2012, 21:12

Re: New Ranking System

Postby JimmyFantastic » 13 November 2016, 03:00

The previous (i.e. current) version overvalued playing games, and you are now claiming (on the basis of nothing more than a single datapoint) that the new version is too far the other way.
Well, those of us who play Bloodbowl didn't need two months of data to know that you overvalued playing games the first time :D
The new one does seem like it could have swung too far the other way but I'd like to see the new graph.

Scram Lyche
Posts: 143
Joined: 22 July 2015, 10:04

Re: New Ranking System

Postby Scram Lyche » 13 November 2016, 06:42

Be cautious that such heady concepts do not blow Scram Lyche's mind in the process, though.
Hey Mike, you seem to think you're a funny guy, well I got a couple of good jokes for you..

Open Ladder Season 1 ps4

1: W54 D18 L28
2:
3: W27 D8 L0
4: W29 D5 L11
5:
6: W22 D1 L0
7:
8:
9: W34 D6 L26
10:
11: W21 D4 L1

You can waffle on for paragraph after paragraph about this and that and tennis matches, I just have to list up a few numbers to show that what you are trying to defend is batshit crazy.

dode74
Posts: 7034
Joined: 11 December 2008, 11:18
Location: Nr. Reading, UK

Re: New Ranking System

Postby dode74 » 13 November 2016, 08:36

I had to remove the draws to more easily calculate the probabilities. We started with a 91-8-13 (84.8%) being behind a 25-3-2 (88.3%). The former STILL demonstrated more consistency than the latter.
Depends on what you are calling "consistency". And 25-3-2 won't do it under the system we settled on. 26-3-1 would, though.
Take a player with a long-term win rate of 84.8% (91-8-13). He has a 31% chance of getting 27 or more wins over 30 games (and thus of beating the guy with the 25-3-2 record, 27-0-3 to simplify, 88.3% win rate). He has a 67.1% chance of doing that at least once in 3 full attempts (90 games), and surely a higher chance if he rerolls after the first loss in the first 10 games. Thus, the 84.8% player with 112 games is more likely than not to be a better player than the 88.3% player with 30 games.
Your methodology relies on accepting that the current win rate is the real win rate for the player, whereas it's actually just a sample of his ability. When we treat it like that, the 95% CI range (i.e. a z-score of 1.96) for the 112-game player is 78.15-91.48 and the range for the 30-game player is 76.8-99.8. The 30-game player is more likely than not to be a better player than the 112-game player under that model.
That sort of modelling is what's behind TrueSkill, as you may be aware. TrueSkill uses -3σ to give a conservative estimate of ability.
I'm not saying that's necessarily the right model, either, but then neither is yours.
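For reference, ranges like the ones above come out of a normal-approximation (Wald) interval on the win rate. The sketch below assumes that is the formula in play and that a draw counts as half a win; the thread states neither, but the result lands within a few hundredths of a point of the quoted ranges. `wald_interval` is an illustrative name.

```python
from math import sqrt


def wald_interval(wins, draws, losses, z=1.96):
    """Normal-approximation interval for the win rate, in percent.

    Assumes a draw counts as half a win and treats the resulting rate
    as a binomial proportion over the games played.
    """
    games = wins + draws + losses
    rate = (wins + 0.5 * draws) / games
    half_width = z * sqrt(rate * (1 - rate) / games)
    return 100 * (rate - half_width), 100 * (rate + half_width)


for record in [(91, 8, 13), (25, 3, 2)]:
    low, high = wald_interval(*record)
    print(record, f"{low:.1f}-{high:.1f}")
```

The half-width shrinks with the square root of games played, which is why the 30-game range is so much wider than the 112-game one. Passing a different `z` gives other confidence levels.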
In short, the math indicates that you are discounting the added difficulty of achieving a high win rate over a lot of games compared to achieving it over far fewer games (with rerolling after a loss as a viable option).
I don't think it does at all. Besides, starting a new team is an option regardless. If someone does manage a 30-0-0 run in 6 weeks I'd fully expect them to sit on it until that record is threatened by someone at, say, 60-0-3. That's because the incentive to play another game or not comes not from how high or low a ranking score you can get but from whether it's good enough to qualify, and that line shifts depending on other players.
Furthermore, there needs to be some sort of incentive for people playing fewer games to allow them to compete with each other. This was the main complaint with the previous system, and the data suggests there may need to be some bias towards lower games played.
Well, those of us who play Bloodbowl didn't need two months of data to know that you overvalued playing games the first time :D
The new one does seem like it could have swung too far the other way but I'd like to see the new graph.
It's funny, because I saw the previous post where you actually complimented the new system on being far better before deleting it. It even said "well done" in it. More interested in maintaining the Jimmy persona than being honest with people, I guess.

Veggente
Posts: 21
Joined: 10 July 2016, 23:31

Re: New Ranking System

Postby Veggente » 13 November 2016, 11:56

Your methodology relies on accepting that the current win rate is the real win rate for the player, whereas it's actually just a sample of his ability. When we treat it like that, the 95% CI range (i.e. a z-score of 1.96) for the 112-game player is 78.15-91.48 and the range for the 30-game player is 76.8-99.8. The 30-game player is more likely than not to be a better player than the 112-game player under that model.
That sort of modelling is what's behind TrueSkill, as you may be aware. TrueSkill uses -3σ to give a conservative estimate of ability.
I'm not saying that's necessarily the right model, either, but then neither is yours.
Sorry, but you just did what you accuse me of doing. You assume that a win rate coming from 112 games is just as good as a win rate coming from 30 games at representing the central tendency of the probability distribution of win rates for a player. This is wrong.

The win rate of the player with 30 games is more likely to be far on the right tail of the probability distribution than the win rate of the player with 112 games.

dode74
Posts: 7034
Joined: 11 December 2008, 11:18
Location: Nr. Reading, UK

Re: New Ranking System

Postby dode74 » 13 November 2016, 12:13

Sorry, but you just did what you accuse me of doing. You assume that a win rate coming from 112 games is just as good as a win rate coming from 30 games at representing the central tendency of the probability distribution of win rates for a player. This is wrong.
I've not assumed the centroid is the representation of the player at all, simply pointed out what the distribution is based on the sample size. The numbers given show the range in which the real value sits to 95CI. The central tendency is based on the data itself and the magnitude of the uncertainty is based on how much data we have for each player.
The win rate of the player with 30 games is more likely to be far on the right tail of the probability distribution than the win rate of the player with 112 games.
How have you come to that conclusion? Your assumption is that they continue the win streak at the same rate, but that assumption is flawed in that it doesn't account for its own uncertainty (which is what TrueSkill does, conservatively). It's 95CI to be within that distribution for each player. If you look at the 90CI ranges then the lower (left) bound of the range is higher for the 30-game player than for the 112-game player.

Either way, I think the model you have described is flawed for the reasons stated (which you've not countered). Furthermore, there needs to be some sort of incentive for people playing fewer games to allow them to at least try to compete with the higher games-played players, and if that means a devaluation of games played in order to incentivise play at the lower end of the scale then so be it.

dode74
Posts: 7034
Joined: 11 December 2008, 11:18
Location: Nr. Reading, UK

Re: New Ranking System

Postby dode74 » 13 November 2016, 12:36

All that said, you've given me an interesting idea for something I can look at for S3...