In our statistical wrap-up of the Blood & Thunder World Cup 2014, we lay the blame for some significant misrankings in the tournament (in particular, Germany’s failure to be placed in the top 16) on the use of Group Qualification into a Single Elimination tournament.
We realise that this may need some more backing as a statement for those who aren’t quite as into statistics as us.
To begin with, let’s take the issue with Group Selection, and, in particular, the kind of Group Selection that the World Cup used. To simplify the maths a bit, we will consider the case of a tournament with 32 entrants rather than 30, but the numbers carry over closely for the case of 30.
Let us assume that of the 32 teams attending, we can divide them into a Top 16 and a Bottom 16, such that if we had perfect knowledge (and thus didn’t need a tournament in the first place), all the teams in the Top 16 would beat each of the teams in the Bottom 16.
Our Group round consists of 8 Groups, each of 4 teams, such that the top 2 teams in each Group go through to the second stage, and the bottom 2 are eliminated. Ideally, therefore, we want 2 Top 16 and 2 Bottom 16 teams in each Group, such that the second stage of the tournament consists entirely of Top 16 teams.
If we are randomly selecting to fill Group A, then the first place in the Group has a (16/32) chance of being a Top 16 team (there are 16 of them, out of the 32 teams who could be picked for that position). The second place then has a (15/31) chance of being a Top 16 team (as we’ve removed one team from the eligible Top 16 teams and from the total number of teams remaining). Place 3 then has a (16/30) chance of being a Bottom 16 team (as we still have 16 of the Bottom 16 remaining, and only 30 teams in total), and Place 4 a (15/29) chance of being Bottom 16.
However, the order in which we pick from Top and Bottom 16 doesn’t matter, as long as there are 2 Top and 2 Bottom 16 teams in the Group (we might just as easily have chosen 1 Top, 1 Bottom, 1 Top, 1 Bottom, for example). There are 6 different orders of Top and Bottom selection which result in a total of 2 Top and 2 Bottom teams (TTBB, TBTB, TBBT, BTTB, BTBT, BBTT), and each has the same probability, so we need to multiply the probability above by 6.
The probability of a perfect Group A, then, is (6*16*15*16*15)/(32*31*30*29) = 0.40 (to two significant figures).
That is, only about 40% of the possible random selections for Group A produce the ideal combination, with exactly 2 Top 16 and 2 Bottom 16 teams, so that the right 2 teams pass through to the second stage. Almost 25% of selections send through one Top 16 and one Bottom 16 team; the same proportion pass 2 Top 16 teams but relegate a third unfairly; and about 10% either put through 2 Bottom 16 teams or relegate 2 Top 16 teams unfairly.
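These figures can be checked with a short calculation. The sketch below assumes a purely random draw of 4 teams from 16 Top and 16 Bottom teams, and compares the sequential-pick calculation with the equivalent hypergeometric count. The function name `group_distribution` is ours, for illustration only.

```python
from itertools import permutations
from math import comb

def group_distribution(top=16, bottom=16, size=4):
    """Probability of drawing k Top 16 teams into one randomly filled Group."""
    total = comb(top + bottom, size)
    return {k: comb(top, k) * comb(bottom, size - k) / total
            for k in range(size + 1)}

# The distinct orderings of two Top (T) and two Bottom (B) picks.
orders = sorted(set(permutations("TTBB")))

# Sequential-pick calculation: one ordering, times the number of orderings.
sequential = len(orders) * (16 * 15 * 16 * 15) / (32 * 31 * 30 * 29)

dist = group_distribution()
for k, p in dist.items():
    print(f"{k} Top 16 teams in the Group: {p:.3f}")
print(f"Sequential calculation for 2 Top + 2 Bottom: {sequential:.3f}")
```

Both routes give the same value for the 2 Top, 2 Bottom case, and the other entries give the roughly 25%, 25% and 10% splits quoted above.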
If we know something about the relative performance of the teams entering the tournament, we can try to improve these odds by non-randomly selecting the 1st position in each Group to be a Top 8 team (assuming that we know which the Top 8 are, by comparison with their previous performance, for example).
In that case, things do improve, as we just need to have precisely one more Top 16 team in the Group, and 2 Bottom 16s. Calculating the probability of a perfect Group A in this situation gives us a 47% chance: better than the completely unseeded case, but still not even a 50-50 chance!
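A minimal sketch of where the 47% comes from. It assumes that each Group is given one Top 8 seed and that the other three places are drawn at random from the remaining 24 teams (the 8 Top 16 teams ranked 9 to 16, plus all 16 Bottom 16 teams). That is our reading of the setup; the variable names are illustrative.

```python
from math import comb

# Pool left once every Group has its Top 8 seed (an assumption about the draw):
# 8 remaining Top 16 teams and 16 Bottom 16 teams, filling 3 open places.
remaining_top, remaining_bottom, open_places = 8, 16, 3

# A perfect Group needs exactly 1 more Top 16 team and 2 Bottom 16 teams.
p_seeded = (comb(remaining_top, 1) * comb(remaining_bottom, 2)
            / comb(remaining_top + remaining_bottom, open_places))
print(f"P(perfect seeded Group A) = {p_seeded:.3f}")
```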
All this is just for the first Group. The probabilities compound across the selections for later Groups, so a draw for the full 8 Groups, even with a Top 8 seed assigned to each one, is very likely to contain at least one Group which will relegate at least one Top 16 team, and at least one Group which will allow through at least one Bottom 16 team.
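To put a number on that compounding, the sketch below counts the draws in which every one of the 8 Groups is perfect, under the same random-draw assumptions as before. It works by counting ways of dealing teams into labelled Groups; the function names are ours.

```python
from math import factorial

def ways(n_items, group_size, n_groups):
    """Ways to deal n_items into n_groups labelled groups of group_size each."""
    assert n_items == group_size * n_groups
    return factorial(n_items) // factorial(group_size) ** n_groups

def p_all_perfect_unseeded():
    # Good draws: 16 Top teams dealt 2 per Group, 16 Bottom teams dealt 2 per Group.
    good = ways(16, 2, 8) * ways(16, 2, 8)
    return good / ways(32, 4, 8)

def p_all_perfect_seeded():
    # One Top 8 seed is fixed in each Group; 24 teams fill the other 3 places.
    # Good draws: one of the 8 remaining Top 16 teams per Group (8! ways),
    # and the 16 Bottom 16 teams dealt 2 per Group.
    good = factorial(8) * ways(16, 2, 8)
    return good / ways(24, 3, 8)

print(f"All 8 Groups perfect, unseeded:   {p_all_perfect_unseeded():.4f}")
print(f"All 8 Groups perfect, Top 8 seed: {p_all_perfect_seeded():.4f}")
```

Under these assumptions, fewer than 1 draw in 100 gives eight perfect Groups in either case.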
If we must use Group Qualifier stages, and there are good reasons to use them, mostly based on their ease of understanding by spectators and teams, then we have the following choices:
1) We accept the above, and allow that very good teams will be eliminated from the tournament before the second stage.
2) We seed selection for the Groups even more heavily. This requires us to actually be able to estimate seeds for at least the Top 16 teams attending, but Roller Derby is geographically siloed to the extent that ranking teams across regions is extremely unreliable. (This is the cause of the continual surprise of US spectators when (WFTDA)-underranked European teams travel across and beat expectations. For South America and the other non-European regions, the situation is even worse.) Even so, with the information available from other tournaments, we could estimate seeds based on past performance (as is done in Tennis, for example) with some accuracy. This is clearly not what was done in the World Cup, as can be seen from the implicit seeding of Canada at #2, despite the clear signal from earlier in the year that England were the higher ranked team. (We assume that Blood & Thunder only took into account the rankings from the previous World Cup, three years ago, when considering seeding!)
3) We accept the above, but allow “relegated teams” a second chance at reentering the tournament.
Point 3 brings us to the issue of the other failing of the World Cup tournament schedule, which compounds the problem above: the use of a single-elimination tournament to rank teams after the group qualification seeding.
Single-elimination tournaments (or knockout tournaments) are popular because they are simple to understand: if you lose a game in the tournament, you’re out. Winners at Round X go through to Round X+1. They also involve a relatively small number of games for a given number of competitors (approximately N-1 for N teams), which makes them nice for tournaments with many teams.
The problem with single-elimination tournaments lies precisely in their simplicity. Because every team only gets one chance to fail, the #1 seed will knock out every team they play on the way to the final. This means that care is needed to avoid the #1 seed playing the #2 seed until that final game (otherwise the #3 or lower seed will end up in that final instead).
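A toy illustration of that point, assuming perfect knowledge (the better-ranked team always wins) and a bracket written as a list in which neighbouring entries meet in the first round. The helper `run_bracket` is hypothetical, not part of any real scheduling tool.

```python
def run_bracket(bracket):
    """Play a single-elimination bracket where the lower rank number always wins.

    `bracket` lists ranks in bracket order; its length must be a power of two.
    Returns (winner, losing_finalist).
    """
    teams = list(bracket)
    while len(teams) > 2:
        # Neighbouring entries play each other; the better rank advances.
        teams = [min(a, b) for a, b in zip(teams[0::2], teams[1::2])]
    return min(teams), max(teams)

# #1 and #2 in opposite halves: the final is #1 v #2.
print(run_bracket([1, 8, 4, 5, 2, 7, 3, 6]))   # (1, 2)
# #1 meets #2 in the first round: the other finalist is only #5.
print(run_bracket([1, 2, 3, 4, 5, 6, 7, 8]))   # (1, 5)
```

In the second bracket the #2 team leaves after one game, and the team that finishes "second" is the best of the half of the draw that never met #1.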
While the #2 position is somewhat robust, in that only the #1 seed can knock the “rightful” #2 out early, the situation gets progressively worse for the lower placements.
In order to provide some kind of accuracy in placement for teams below #1 rank, then, single-elimination tournaments are usually very strongly seeded, so that high-seeds play low-seeds in every bracket. This minimises the risk of a significant misranking in the higher ranks of the tournament (but of course, as seeding is never perfect, it does not remove it for the lowest places, which is why most single-elimination tournaments only try to rank the top 4 or so).
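One common way to lay out such a bracket is to mirror the seed list repeatedly, so that each first-round pair of seeds sums to N+1. The sketch below is an illustration of that idea rather than the method any particular tournament uses, and the function name `seeded_bracket` is ours.

```python
def seeded_bracket(n_teams):
    """Return seeds in bracket order for a power-of-two field.

    Neighbouring entries meet in the first round, so seed 1 plays seed n_teams,
    seed 2 plays seed n_teams - 1, and seeds 1 and 2 can only meet in the final.
    """
    if n_teams < 1 or n_teams & (n_teams - 1):
        raise ValueError("n_teams must be a power of two")
    order = [1]
    while len(order) < n_teams:
        size = 2 * len(order)
        # Pair every seed already placed with its mirror image in the larger field.
        order = [s for seed in order for s in (seed, size + 1 - seed)]
    return order

print(seeded_bracket(16))
```

The layout is only as good as the seeds fed into it: with poorly estimated seeds, the "high v low" pairings are high v low in name only.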
As we’ve just seen, however, Group qualification stages such as the World Cup uses are also bad at guaranteeing globally good seeds, and so are a poor choice to seed into a single-elimination tournament. The biases of the two processes compound, resulting in a higher probability of misranking than either would produce alone. (Again, good seeding at the Group stage helps both the Group and the Elimination stages to perform well, but without it, the entire tournament just compounds errors in ranking all the way through.)
Let’s say we want to keep Group Qualification, because we understand it. If we don’t have good seeds, then our problem is choosing a good tournament to pass our Group results into to minimise the effects of unfair Groups.
The obvious candidate is a double-elimination tournament.
Double-elimination tournaments are tournaments where a competitor has to lose twice (hence double) before being eliminated. The first loss drops the competitor down into a parallel bracket (called the “elimination” bracket) which is run like a single-elimination tournament – losses kick you out of the tournament entirely, and wins continue to the next stage. The winner of the elimination bracket is then allowed to play the loser of the “top” bracket final for second place.
Seeding into a double-elimination tournament from our Groups is relatively easy – we start off the top 2s in the “top” bracket and the bottom 2s in the “elimination” bracket, and then proceed as normal. Teams penalised by being in a bad Group get a second chance via the elimination bracket, rather than being unceremoniously removed to a “Consolation” playoff or kicked out entirely.
In fact, as most Roller Derby tournaments already include Consolation games, in order to guarantee all competitors a certain amount of track time, double-elimination tournaments don’t even add many more games to the total structure (we still have games for “losers”, but we allow them to feed back into the main tournament at the end).
[We can also go to higher-order elimination tournaments – triple, quadruple, etc – but the number of additional games needed for them does increase significantly, and the complexity of the schedule tends to suffer. Triple-elimination is used in some Curling contests, but we are not aware of any real-world deployment of quadruple-elimination or further (probably because at that point, there are other tournament types which offer better performance in fewer games).]
On the other hand, we could also ditch the Group Qualifiers, and try to pick a better way of seeding our single-elimination tournament.
The main problem with Group Qualifiers is the way they divide up (partition) the teams into rigid groups that play off internally. If we relax the partitioning requirement, but add a requirement that we use previous games to improve our matches in future, we arrive at so-called “Swiss-system Tournaments”.
Swiss tournaments were designed for Chess (and their variants, the McMahon and Danish systems, for Go and Bridge), but are also used for Badminton, amongst other sports. The scheme is as follows: in the first round, teams are paired against each other (the pairing mechanism can be chosen to suit the tournament – we’d suggest matching geographically distant teams to each other for interest). The winners get 1 point, the losers get 0 points.
In each subsequent round, we pair off teams with the same number of points (so 1 pt teams play 1 pt teams), with the winners gaining a point each time.
Swiss tournaments take the same number of total games as the Group rounds above, if we stop after each team has played the same number of games as they would have in a Group round (3 in this case). The advantages are twofold: as the field is not narrowed artificially, each team has played a wider range of opponents (and thus the risk of initially poor selection is reduced); and as we pair by past performance, each team should have played teams increasingly close to their own ability after the first round.
If we wish to seed into a single-elimination tournament, then the top half of the score table is always identifiable after an odd number of rounds. We also have the advantage that, as the teams have all played a wider range of opponents, estimation of relative strengths in the table is easier.
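The Swiss scheme can be sketched in a few lines. This is a deliberately simplified model, and the simplifications are ours: the better-ranked team always wins, teams are re-paired each round by sorting on points and pairing neighbours, and rematches are not prevented (real Swiss pairing rules normally avoid them). The function name `swiss_rounds` is illustrative.

```python
import random
from collections import Counter

def swiss_rounds(n_teams=32, n_rounds=3, seed=0):
    """Simulate Swiss rounds where the lower rank number always wins.

    Returns (points per team, total games played).
    """
    rng = random.Random(seed)
    points = {rank: 0 for rank in range(1, n_teams + 1)}
    games = 0
    for _ in range(n_rounds):
        order = list(points)
        rng.shuffle(order)  # random order within each points level
        # Stable sort: teams on equal points keep their shuffled order.
        order.sort(key=lambda team: points[team], reverse=True)
        for a, b in zip(order[0::2], order[1::2]):
            points[min(a, b)] += 1  # winner gains a point, loser gains nothing
            games += 1
    return points, games

points, games = swiss_rounds()
table = Counter(points.values())
top_half = [team for team, pts in points.items() if pts >= 2]
print(sorted(table.items(), reverse=True), games, len(top_half))
# [(3, 4), (2, 12), (1, 12), (0, 4)] 48 16
```

After three rounds the table splits 4/12/12/4, so the 16 teams on 2 or more points form a clean top half, and the 48 games match the Group stage total.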
The disadvantage of Swiss-style qualifiers is that the schedule for each round is hard to predict before most of the previous round has completed. To keep players, and spectators, in the loop, you need a relatively effective backoffice to update the schedules and post the upcoming games as early as possible.
Another alternative to Group Qualifiers is to simply seed into a double, or even triple, elimination tournament (or a Swiss tournament with enough rounds) using properly estimated seeds based on previous performance. As previously mentioned, this is the process adopted by, for example, Tennis. There is a small increase in the number of total games competed in, but the trade-off is far better accuracy in ranking competitors.
– Schedule length calculations for 32-team tournaments –
WC-style (8 Groups of 4 seeding top halves into 16-player single elimination, with single Consolation games for bottom 16) = 8*(4*3/2) + 15 + 8 = 71
[Note that the WC also included 4 Expo bouts, and a third/fourth place playoff]
Groups into double-elimination, no Consolation: 8*(4*3/2) + 63 = 111
Top 3 in each Group into double-elimination, no Consolation: 8*(4*3/2) + 48 = 96
Swiss-style into double-elimination for top half of table (no Consolation): 16*3 + 31 = 79
Complete double-elimination tournament with no qualifier: 63
Complete triple-elimination tournament with no qualifier: 95
Complete Swiss-style tournament with no qualifier: 16*log2(32) = 80
Complete Swiss-style tournament with triple-elimination (Swiss, but after 3 losses, you’re out): 3*16 + 14 + 11 = 73 [4 teams get only 3 games, 6 get 4 and the rest all get 5]
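For anyone who wants to rerun the arithmetic, here is a small sketch covering the entries whose elimination stage has 16 or 32 teams. The game-count formulas for double-elimination (2N-1) and triple-elimination (3N-1) are assumptions on our part, chosen because they reproduce the 63 and 95 quoted for 32 teams; all function names are illustrative.

```python
from math import log2

def round_robin_games(group_size):
    return group_size * (group_size - 1) // 2

def single_elim_games(n):
    return n - 1

def double_elim_games(n):
    return 2 * n - 1  # assumed: includes a possible deciding rematch

def triple_elim_games(n):
    return 3 * n - 1  # assumed, by analogy with the double-elimination count

def swiss_games(n, rounds):
    return (n // 2) * rounds

def swiss_with_elimination(n, max_losses, rounds):
    """Idealised Swiss where a team drops out after max_losses losses.

    Assumes every losses level splits evenly into winners and losers each
    round. Returns (total games, {games played: number of teams}).
    """
    alive = {0: n}  # losses so far -> teams still playing
    total, played = 0, {}
    for rnd in range(1, rounds + 1):
        total += sum(alive.values()) // 2
        nxt = {}
        for losses, count in alive.items():
            nxt[losses] = nxt.get(losses, 0) + count // 2          # winners
            nxt[losses + 1] = nxt.get(losses + 1, 0) + count // 2  # losers
        eliminated = nxt.pop(max_losses, 0)
        if eliminated:
            played[rnd] = eliminated
        alive = nxt
    # Teams never eliminated have played every round.
    played[rounds] = played.get(rounds, 0) + sum(alive.values())
    return total, played

group_stage = 8 * round_robin_games(4)
print("WC-style:", group_stage + single_elim_games(16) + 8)
print("Groups into double-elimination:", group_stage + double_elim_games(32))
print("Swiss into double-elimination:", swiss_games(32, 3) + double_elim_games(16))
print("Complete double-elimination:", double_elim_games(32))
print("Complete triple-elimination:", triple_elim_games(32))
print("Complete Swiss:", swiss_games(32, int(log2(32))))
print("Swiss with triple-elimination:", swiss_with_elimination(32, 3, 5))
```

The last line also reports how many teams play 3, 4 and 5 games, which is a useful check on track time per team.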