British Championships Premier Tier – Some Statistics

Today (May 31st) saw the final bouts in the first ever round of Roller Derby’s British Championships Premier Tier.

The British Championships is a new, national Roller Derby tournament, modelled after other national sporting tournaments, and the Premier Tier is the highest division in it. With teams ranked by the UK Roller Derby Association (UKRDA) rankings (themselves provided by the ubiquitous derby ranking site, FlatTrackStats), the Premier Tier pits the best 6 teams in the UK against each other (minus London Rollergirls, the best team in the UK, as they are so good that they don’t spend enough time playing in the UK to get a local ranking).

After some exceptionally close and surprising games, the final rankings were:

1 Glasgow Roller Derby GRD
2 Auld Reekie Roller Girls ARRG
3 Middlesbrough Milk Rollers MMR
4 Tiger Bay Brawlers TBB
5 Central City RollerGirls CCR
6 Rainy City Roller Girls RCRG

with Glasgow winning all 5 of their bouts, and Rainy City unfortunately losing all of theirs.

We’re here for statistics, however, so what does our least squares ranking approach say about the teams from their performance in all 15 games?

Well, running the toolkit at https://github.com/aoanla/ranking-chain-inference gives the following ranking tables:

Rank   Score difference      Score ratio      FTS-esque
1.     GRD      0.0          GRD   1.0        GRD   1.0
2.     ARRG   -99.58         ARRG  0.496      ARRG  0.735
3.     MMR   -124.16         MMR   0.430      MMR   0.684
4.     TBB   -139.0          TBB   0.372      TBB   0.650
5.     CCR   -171.083        CCR   0.322      CCR   0.594
6.     RCRG  -178.16         RCRG  0.300      RCRG  0.578

No matter what ranking approach we choose, we agree with the results of the tournament itself (unsurprisingly, as this is an all-plays-all tournament, which should be very good at ranking participants).

However, there’s more we can do with least squares fits than just ranking. One of the results of any least squares fit to data is a set of “residuals” – the differences between the real scores that happened and the “perfect” scores that our predicted strengths would result in. We can use these residuals as a measure of how good our predictions are, and how well we fit the data – but we can also use them to see which results are the most surprising out of all the games played.

As our prediction is holistic – all of the games contributing equally to the error in our prediction – we can test which game is the most `surprising’ by seeing how the residuals change if we remove 1 game from the list we consider. The game which reduces the residuals the most must be the game which is furthest from the `predicted’ result (and we can measure how surprising each game is by the relative amount by which they change the residuals).
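As a rough illustration of that leave-one-out idea, here is a minimal sketch in Python. It is not the code from the toolkit linked above: the fitting routine (a simple coordinate-descent least squares on score differences), the function names and the four-team toy results are all assumptions made up for this example. The toy results are perfectly consistent apart from one deliberately odd game (C v D).

```python
def fit_strengths(teams, games, sweeps=200):
    """Least-squares fit of s[a] - s[b] to each game's margin (score_a - score_b).

    Uses plain coordinate descent: each team's strength is repeatedly set to
    the value that best explains its own games, with the others held fixed.
    """
    s = {t: 0.0 for t in teams}
    for _ in range(sweeps):
        for t in teams:
            targets = []
            for a, b, margin in games:
                if a == t:
                    targets.append(s[b] + margin)
                elif b == t:
                    targets.append(s[a] - margin)
            if targets:
                s[t] = sum(targets) / len(targets)
    top = max(s.values())
    return {t: v - top for t, v in s.items()}  # pin the best team to 0.0


def residual_ss(s, games):
    """Sum of squared differences between real and predicted margins."""
    return sum((margin - (s[a] - s[b])) ** 2 for a, b, margin in games)


def surprise_by_game(teams, games):
    """Drop in the residuals when each game in turn is left out of the fit."""
    base = residual_ss(fit_strengths(teams, games), games)
    drops = []
    for i in range(len(games)):
        rest = games[:i] + games[i + 1:]
        drops.append(base - residual_ss(fit_strengths(teams, rest), rest))
    return drops


# Hypothetical results (not real bouts): margins agree with strengths
# A=0, B=-50, C=-100, D=-150, except the last game, which "should" be 50.
TEAMS = ["A", "B", "C", "D"]
GAMES = [("A", "B", 50), ("A", "C", 100), ("A", "D", 150),
         ("B", "C", 50), ("B", "D", 100), ("C", "D", 200)]

surprise = surprise_by_game(TEAMS, GAMES)
most_surprising = GAMES[surprise.index(max(surprise))]
print(most_surprising)
```

The game whose removal shrinks the residuals the most is reported as the most surprising one, which is the same test described above, just on invented data.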

For British Champs Premier Division, if we run over all 15 games, we find that there are precisely two bouts which have a really strong effect on the accuracy of our predictions, both reducing the residuals by a factor of two if they were removed. Those games are: TBB 243  v CCR 114  and GRD 280 v TBB 54.

What’s interesting about this result is that the two are divergent in their effect – the former is one where Tiger Bay Brawlers did much better than they did on average in the tournament (the prediction from all games is that they should have been quite closely matched with Central City), and the latter, one where they did much worse than they might have expected (the prediction would be that they would lose to Glasgow by closer to a 2:5 ratio than the 1:5 they achieved).

So, it appears that the most significant thing about the Premier Tier this time around was that Tiger Bay Brawlers performed in a very inconsistent manner. With no more visible statistics from British Champs on team composition, it is hard to theorise further, but an obvious possibility is that the composition of TBB’s roster was more variable than for the other teams, presenting them with much weaker or stronger formations for each bout.

Another interesting thing we can do with statistics is to combine results from multiple tournaments in order to make predictions. In this case, the author noted that Gent Go-Go Roller Girls’s annual SKOD tournament presented an interesting opportunity – British Champs Premier Tier’s MMR and CCR were both attending, but so was British Champs National Tier (Northern Division)’s Newcastle Roller Girls. The author is of the opinion that NRG are a likely candidate for promotion to Premier Tier in the next year (especially as rankings other than UKRDA’s already place them in the same league as the Premier Tier teams), so seeing them pitted against two current Premier Tier teams was a useful opportunity to gather data.

Running the least squares ranking against the combined set of all of the results from SKOD2015 and British Champs Premier Tier 2015 produces the following ranking (UK teams in Bold):

Rank   Score difference      Score ratio       FTS-esque
1.     GRD       0.0         GRD    1.0        GRD    1.0
2.     ARRG   -108.8         ARRG   0.469      ARRG   0.715
3.     NRG    -119.6         MMR    0.435      MMR    0.688
4.     MMR    -122.8         NRG    0.432      NRG    0.686
5.     GGGRG  -128.9         GGGRG  0.406      GGGRG  0.667
6.     TBB    -139.0         TBB    0.372      TBB    0.650
7.     DRD    -150.9         DRD    0.346      DRD    0.618
8.     CCR    -163.1         CCR    0.336      CCR    0.607
9.     RCRG   -178.1         RCRG   0.300      RCRG   0.578
10.    PR     -184.3         PR     0.278      PR     0.556
11.    NDG    -209.6         NDG    0.241      NDG    0.519
12.    DRRG   -215.2         DRRG   0.240      DRRG   0.514
13.    NHH    -253.1         NHH    0.193      NHH    0.465
14.    CRD    -360.0         CRD    0.097      CRD    0.3403

As can be seen, the results suggest very strongly that Newcastle Roller Girls would, if promoted to the Premier Tier, perform very strongly – coming either third or fourth against a very closely matched Middlesbrough Milk Rollers. The author very much hopes that, in 2016, we get a chance to see if this prediction comes to pass…


Avengers: Age of Ultron, some rambling

This essay will contain potentially unmarked spoilers for Avengers: Age of Ultron.

There seems to be something of an emerging tradition for me and the Avengers movies. I really don’t watch movies in the cinema any more, as a rule – the prices are too high, and the experience really isn’t an improvement on being at home with all the things I want nearby. However, I saw the first Avengers movie when I was in New York as part of the “holiday experience”, and because it was a luxury thing I saw it in 3D. Somehow, even though I am not in New York now, I managed to see the sequel in a cinema locally, and also in 3D.

So, firstly, 3D is not an unvarnished success for A:AoU, just as it is not for any movie. I think it helps for the action scenes a bit, giving them more ‘pop’, but for every action scene there’s a still scene with shallow depth of field where the 3D makes the out of focus bits glaringly distracting (rather than effectively focussing as in 2D).

That out of the way, I’ll start the discussion proper with my conclusion: whilst not perfect, Avengers: Age of Ultron is a superior film to its predecessor, although not long enough.

The first Avengers movie felt like the experiment it was sometimes, trying to pull together all the heroes of the last couple of Marvel movies (including some which were originally not quite intended for that purpose) into one big thing, and make them all work together, thematically as well as socially. Because of that, while it has a large amount of character writing for an action movie, it also feels like it’s running around a lot of the time trying to introduce things.

Age of Ultron, meanwhile, can confidently open on an action scene in medias res, as the already-introduced-Avengers demonstrate their awesome by effortlessly slicing through the defences of a HYDRA base, while engaging in nice character-defining banter. (A Whedon trademark is how the topics of the banter are then echoed through the rest of the film in conversation.)

Within the very apotheosis of their success, however, as in all good 2nd movies in a sequence, is set the seed of their potential destruction, as Wanda Maximoff/Scarlet Witch (never called such in the film, but she does wear red and does get called a witch by people) sets up Tony Stark to destroy himself with a vision of his deepest fear.
Unfortunately, Tony is not the sanest of people to start with, and his trauma from the first Avengers film means that what he most deeply fears is a massive alien invasion killing everyone while he survives, impotent and alone. This is obviously not going to go well for anyone, especially as he’s just taken control of Loki’s mind-controlling Scepter from the first movie.

One issue that AoU does have, thanks to being cut heavily from the original 3.5 hours, we assume, is that it tends to be sketchy on the details of some elements. So, Tony finds a “program” inside Loki’s Scepter, which some viewers have interpreted as being Actually The Mind Gem As A Program (spoilers: the Mind Stone is in the Scepter). Given that this program becomes the basis of Ultron, and Ultron inherits a lot of Tony Stark in his twisted personality, I think the implication was supposed to be that the program is the Mind Gem echoing Tony’s desires back at him – he wants to make a “suit of armour around the world”, and Wanda’s nightmare vision is still echoing in the back of his skull, so the Mind Gem gives him what he wants. This is unfortunate for all concerned, except in that it actually gives us an antagonist for the film.

Character development via duologue is at the heart of several strands of the movie, from the Tony v Steve moral argument (which will blow up in Captain America: Civil War) through the Paragon v Nihilist opposition of Vision and Ultron. It also serves in both of Black Widow’s character development arcs – contextualised by the Hawkeye Family, and against Bruce Banner/Hulk.

The latter is introduced via the entirely-novel-to-MCU concept that apparently all it takes to calm down the Big Green Guy is for Nat to say some trigger phrases while stroking his hand and forearm. This is convenient for plot pacing purposes, of course, as otherwise Hulk would need to calm down some other more long-winded way, while introducing some fridge logic concerning the way in which Hulk was conditioned to respond to those phrases in the first place.

More importantly to the film, it also allows us to introduce the concept of a tenderness between Nat and Hulk’s alterego, Banner. (It’s also the first horribly incongruous part of the film, albeit well acted by both participants, separated by green-screen. As with most of the other incongruous parts, it’s an attempt to attach more traditionally feminine aspects to Black Widow’s character which sits on the edge of being insultingly obvious.)

The emerging romance between the two is handled with some delicacy, as two damaged individuals (one of whom made seduction part of her armory in her spying days, so needs to be careful when using it for someone she actually cares for, and one who tends to destroy everyone close to him if he loses control) circle each other.
While there are incongruities earlier on (why is Nat apparently tending bar at her own party?) the big stumbling block in this romance is the point where, in response to Bruce’s “I’m a monster and can’t have kids” routine, Nat uses her forced sterilisation as part of the Russian Black Widow project as her example of how she’s a monster too. To his credit, Whedon is obviously trying to tie that line in on multiple levels – Nat is responding to the fact that Banner can’t have children (because he goes green when he gets aroused too) by noting that she can’t either; but she’s also using it to show how she’s a damaged individual. Unfortunately, it isn’t really made clear that the sterilisation itself is not what makes Widow monstrous, just that it is one part of the training/indoctrination process that did so. (It is not clear that this is entirely unintentional, as certainly there are aspects of society that might consider having children/pregnancy to be the highest point of a woman’s life, and that removing that capability lessens a woman (in a way that might not lessen a man). And Nat is shown to be contrasted with her male counterpart Clint, who does actually have a secret family, which are all extremely conventional and settled.)
[It also rings a little false that, in a universe where Helen Cho can construct an artificial tissue that can apparently fuse with magical metal to make a full pseudo-living body, and where bioscience seems generally incredibly advanced, someone couldn’t make Nat a new womb, or even just provide surrogacy for her from eggs derived from cell samples. At least in Bruce’s case, he doesn’t actually know if his sperm carries Hulkness with it into his descendants either, so his inability to safely perform is not the only issue. (In a longer analysis, one might note the asymmetry even here in the Hulk/Widow reproductive problem – Banner can’t even get it up without potentially killing people, while Romanoff is concerned more by being infertile ground – an explicitly reproductive issue, contrasted with Banner’s sexual one.)]
But false notes aside, the romance between the two of them drives an important development towards the end of the film, based on both characters making an explicit choice to reject or embrace aspects of their personality for the greater good. Being a Marvel universe romance, you might imagine that these choices are not to the benefit of their personal happiness. Nat chooses Banner, but immediately betrays him because she needs the Hulk to save people (and thus also lets Bruce become another person she’s manipulated for another cause). Bruce (and Hulk) decide they can’t trust anyone to stop them rampaging, and exile themselves from happiness and the contact of those fragile, deceptive, others. It’s a tragedy for all involved.

Character arcs which explicitly exist to plumb together later films are much more deeply integrated into Age of Ultron than the first Avengers film. Over Phase 2, in fact, it’s striking how much more confident the films have become in believing that they are part of a long project which will actually last long enough for slow plot threads to accrue. In this, the Marvel Cinematic Universe is starting to resemble a series of very long, slow-release episodes of a TV series, rather than a movie franchise. We even have a few characters, such as Andy Serkis’ Ulysses Klaue, introduced apparently purely to give them an origin story for a later film (Black Panther). This does place some additional stress on the running time of the film, and you do sometimes wish there’d been more emotional and plot development left in from the original 3 hr+ cut, rather than the material which made it.

The other aspect of the film, going back up a few paragraphs, is the focus on Renner’s Hawkeye, a sort of apology to the actor and the character for having him spend half of the first Avengers film in mind-controlled thrall to Loki. The film tries to cast him as the “ordinary guy” of the superhero team – we learn that he has a Secret Family, which is almost an archetype of the Classic American Nuclear family (they’re just missing an excitable dog who can also look soulful when its master is sad), located in the archetypical Midwest American setting. All this is as counter to the anomalous, and, we are supposed to believe, ungrounded lives of the rest of the Avengers (although, ironically, we assume that Tony’s move to committed matrimony with Pepper Potts makes him the next most grounded individual in the team, which is hardly a recommendation of the state).

So, as the Everyman (who also is a master archer and skilled combat pilot…), Hawkeye’s role is expanded just in time for him to leave the team at the end. It’s not really convincing that his new informed role as Heart of the Avengers actually has any consequences – he doesn’t get enough time to serve as an audience surrogate, bar one knowing comment late on in the film, and it’s certainly not clear that he provides any particular moral argument that other Avengers might not.

(This also compounds the confusing treatment of Johansson’s Black Widow, however, as she’s also being written into a more “traditional” social role in the film, but one with less power and responsibility. She even manages to be the only member of the Avengers to get captured by Ultron, mainly so he can gloat at her (and she can tell the rest of the team where she is). Whedon has talked about having long arguments with Marvel/Disney execs about various parts of the film structure, and, in this light, perhaps the minimisation and weakening of Black Widow’s character are part of this struggle.)

Of course, you could argue that the entire film is really based on a mostly hidden moral argument between The Vision’s Post-existentialist Life affirming purity and Ultron’s Nihilistic rejection of Humanity.
While Hawkeye is the Everyman, Ultron and Vision sit at ultimate poles in their rejection of what he represents. Spader’s Ultron, like HAL before him, is an AI with a maddeningly impossible objective which, perhaps, drives him mad. Tasked with creating “peace in our time”, an end to conflicts or threats to humanity, he seems to perceive the worst aspects of everything around him – first fixating on the Avengers themselves as the most powerful potential threat to peace, and then expanding his sphere, inconsistently and irrationally, ever wider. His final doomsday plan could be seen as the ultimate cynical response to his original directive – the only way to save humanity from itself is to destroy it.
The Vision, at the other pole, is created with no overriding objective. Almost the first thing he does on his creation is to stare into the city skyline, apparently mesmerised by the beauty of the world, almost his next is to declare his allegiance with “life”, rather than any petty factional divisions between individuals. In fact, he is the only heroic character who admits to sympathy for Ultron, noting that he is in great pain, driving his actions (which, nevertheless, must be fatally prevented, for the greater good).
In this messianic role, played note perfect by Paul Bettany, he actually resembles another Marvel hero, Adam Warlock, more than the original comics’ Vision. Warlock, in fact, was intended to be an explicit expy of Jesus, pure, on the side of life, eventually sacrificing himself to save the world. (He also had one of the Infinity Gems/Stones embedded in his head, which gave him some of his powers – another aspect which the MCU version of Vision picks up.)
While The Vision does not sacrifice himself in Age of Ultron, we suspect that some manner of this resolution might unfold in the next Avengers film (given that the Infinity War covers Thanos’ collection of all of the Infinity Stones, and one of them is in Vision’s head…)
The Avengers themselves are contrasted with these two poles throughout the film – Ultron explicitly calls them out as being “killers”, and the Maximoffs’ alignment with him at first is driven by their similar characterisation of Stark. Meanwhile, in their climactic battle, the Avengers are explicitly shown to act first to preserve life, evacuating the area and even sacrificing themselves to prevent civilian casualties.

The Vision is also established as being the Paragon via his ability to lift Thor’s Hammer, something nicely established as beyond any of the other Avengers, save Thor himself, earlier on in the film. As well as establishing a more mystical side to the MCU (explicitly, via Steve Rogers and Tony Stark’s rationalist reductionist attempts to constrain or argue away the Hammer’s definition of ‘Worthy’), the former scene also develops Steve Rogers’ character as well, as the fallen Paragon – whilst coming closest to lifting the Hammer, he is now only able to shift it slightly in position. Steve’s character arc seems to be one of a fall from grace due to trauma – his ‘nightmare vision’ is simply of the love he lost and the victory party that he missed via his decades frozen in ice. Perhaps 1940s Steve could have lifted the Hammer, but now he has too much damage, and too little hope, to really achieve that purity of purpose. (Again, perhaps a plot thread to be picked up in Civil War?)

Overall, though, all of the heroes remain predominantly heroic, although the more compromised retire at the end. Old Heroes seem to stop because their purity can’t stand up to life in the MCU, and they need replaced by newer, purer stock (with cheaper film contracts) in time for Phase 3.

But before that starts, with a battle between Authoritarianism and Liberty in Civil War, there’s just a small matter of a man and his Ant…


Group Qualifiers and Single Elimination Tournaments: A Poor System for a Roller Derby World Cup

In our statistical wrap-up of the Blood & Thunder World Cup 2014, we laid the blame for some significant misrankings in the tournament (in particular, Germany’s failure to be placed in the top 16) on the use of Group Qualification into a Single Elimination tournament.

We realise that this may need some more backing as a statement for those who aren’t quite as into statistics as us.

To begin with, let’s take the issue with Group Selection, and, in particular, the kind of Group Selection that the World Cup used. To simplify the maths a bit, we will consider the case of a tournament with 32 entrants rather than 30, but the numbers carry over closely for the case of 30.

Let us assume that of the 32 teams attending, we can divide them into a Top 16 and a Bottom 16, such that if we had perfect knowledge (and thus didn’t need a tournament in the first place), all the teams in the Top 16 would beat each of the teams in the Bottom 16.

Our Group round consists of 8 Groups, each of 4 teams, such that the top 2 teams in each group go through to the second stage, and the bottom 2 are eliminated. Ideally, therefore, we want 2 Top 16 and 2 Bottom 16 teams in each Group, such that the second stage of the tournament consists entirely of Top 16 teams.

If we are randomly selecting to fill Group A, then the first place in the Group has a (16/32) chance of being a Top 16 team (there are 16 of them, out of the 32 teams who could be picked for that position). The second place then has a (15/31) chance of being a Top 16 team (as we’ve removed one team from the eligible Top 16 teams and from the total number of teams remaining). Place 3 then has a (16/30) chance of being a Bottom 16 team (as we still have 16 of the Bottom 16 remaining, and only 30 teams in total), and Place 4 a (15/29) chance of being Bottom 16.

However, the ordering in which we pick from Top and Bottom 16 doesn’t matter, as long as there are 2 Top and 2 Bottom 16 in the Group (we might have easily chosen 1 Top, 1 Bottom, 1 Top, 1 Bottom, for example). There are 6 different orders of Top, Bottom team selection which result in a total of 2 Top and 2 Bottom teams (TTBB,TBTB,TBBT,BTTB,BTBT,BBTT), so we need to multiply the probability above by 6.

The probability of a perfect Group A, then, is (6*16*15*16*15)/(32*31*30*29) = 0.40 (to two significant figures).
That is, only about 40% of the possible random selections for Group A produce a combination of teams which will result in exactly the 2 Top 16 teams in the Group passing through to the second stage (almost 25% result in a Top 16 and a Bottom 16 team getting through, the same proportion pass 2 Top 16s, but relegate a third unfairly, and 10% either put through 2 Bottom 16 teams or relegate 2 Top 16s unfairly).
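For anyone who wants to check those percentages, the calculation is a standard "drawing without replacement" count, and a short Python sketch reproduces it. The function name and its parameters are just illustrative choices, not anything from the World Cup organisers.

```python
from math import comb


def group_composition(top_in_group, pool_top=16, pool_bottom=16, group_size=4):
    """Chance that a randomly drawn Group holds exactly `top_in_group` Top 16 teams."""
    ways = comb(pool_top, top_in_group) * comb(pool_bottom, group_size - top_in_group)
    return ways / comb(pool_top + pool_bottom, group_size)


# Probability of each possible number of Top 16 teams in Group A (0 to 4).
distribution = [group_composition(k) for k in range(5)]
for k, p in enumerate(distribution):
    print(k, "Top 16 teams:", round(p, 3))
```

The entry for 2 Top 16 teams is the 0.40 above; the entries for 1 and 3 are each just under 0.25, and the entries for 0 and 4 add up to about 0.10.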

If we know something about the relative performance of the teams entering the tournament, we can try to improve this probability by non-randomly selecting the 1st position in each Group to be a Top 8 team (assuming that we know which the Top 8 are, by comparison with their previous performance, for example).
In that case, things do improve, as we just need to have precisely one more Top 16 team in the Group, and 2 Bottom 16s. Calculating the probability of a perfect Group A in this situation gives us a 47% chance: better than the completely unseeded case, but still not even a 50-50 chance!
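The 47% figure can be reproduced with the same kind of counting, under one assumption about how the draw works: the 8 seeds are set aside (one per Group), so the other three places in Group A are drawn from the remaining 24 teams, of which 8 are Top 16 and 16 are Bottom 16. A minimal Python sketch, with illustrative names:

```python
from math import comb


def seeded_perfect_group(seeds=8, top=16, total=32, group_size=4):
    """Chance of 1 further Top team and 2 Bottom teams alongside a fixed seed."""
    pool_top = top - seeds       # Top 16 teams that are not seeds
    pool_bottom = total - top    # all of the Bottom 16
    free_places = group_size - 1
    ways = comb(pool_top, 1) * comb(pool_bottom, free_places - 1)
    return ways / comb(pool_top + pool_bottom, free_places)


print(round(seeded_perfect_group(), 3))
```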

All this is just for the first Group: the probabilities compound across the selections for later Groups, resulting in selection for the full 8 Groups, even with a Top 8 seed assigned to each one, being very likely to contain at least one Group which will relegate at least one Top 16 team, and at least one Group which will allow through at least one Bottom 16 team.
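To put a rough number on "very likely", we can count how many ways of filling all 8 Groups are perfect, out of all the ways of filling them. The Python sketch below does this for the unseeded draw and for the seeded draw, again assuming that the 8 seeds are fixed one per Group and the other 24 teams are drawn at random. The helper names are illustrative.

```python
from math import factorial


def multinomial(n, parts):
    """Number of ways to split n distinct teams into labelled groups of the given sizes."""
    out = factorial(n)
    for p in parts:
        out //= factorial(p)
    return out


def all_groups_perfect(top_free, bottom, groups, top_each, bottom_each):
    """Chance that every Group gets exactly top_each Top and bottom_each Bottom teams."""
    good = (multinomial(top_free, [top_each] * groups)
            * multinomial(bottom, [bottom_each] * groups))
    total = multinomial(top_free + bottom, [top_each + bottom_each] * groups)
    return good / total


unseeded = all_groups_perfect(16, 16, 8, 2, 2)  # 4 random places per Group
seeded = all_groups_perfect(8, 16, 8, 1, 2)     # 3 random places per Group, plus a seed
print(unseeded, seeded)
```

Both values come out below 1%, so under these assumptions a draw in which every single Group is fair is the rare exception, with or without the Top 8 seeds.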

If we must use Group Qualifier stages, and there are good reasons to use them mostly based on their ease of understanding by spectators and teams, then we have the following choices:

1) We accept the above, and allow that very good teams will be eliminated from the tournament before the second stage.

2) We even more heavily seed selection for the groups. This requires us to actually be able to estimate seeds for at least the Top 16 teams attending, but Roller Derby is geographically siloed to the extent that ranking teams across regions is extremely unreliable. (This is the cause of the continual surprise of US spectators when (WFTDA)-underranked European teams travel across and beat expectations. For South America and the other non-European regions, the situation is even worse.) Even with the information available from other tournaments, however, we could estimate seeds based on past performance (as is done in Tennis, for example) with some accuracy. This is clearly not what was done in the World Cup, as can be seen by the implicit seeding of Canada at #2, despite the clear signal from earlier in the year that England were the higher ranked team. (We assume that Blood & Thunder only took into account the rankings from the previous World Cup, three years ago, when considering seeding!)

3) We accept the above, but allow “relegated teams” a second chance at reentering the tournament.

Point 3 brings us to the issue of the other failing of the World Cup tournament schedule, which compounds the problem above: the use of a single-elimination tournament to rank teams after the group qualification seeding.

Single-elimination tournaments (or knockout tournaments) are popular because they are simple to understand: if you lose a game in the tournament, you’re out. Winners at Round X go through to Round X+1. They also involve a relatively small number of games for a given number of competitors (approximately N-1 for N teams), which makes them nice for tournaments with many teams.

The problem with single-elimination tournaments is precisely their simplicity. Because every team which plays only gets one chance to fail, the #1 seed will knock out every team they play going through to the final. This means that care is needed to avoid the #1 seed playing the #2 seed until that final game (otherwise the #3 or lower seed will end up in that final instead).
While the #2 position is somewhat robust, in that only the #1 seed can knock the “rightful” #2 out early, the situation gets progressively worse for the lower placements.

In order to provide some kind of accuracy in placement for teams below #1 rank, then, single-elimination tournaments are usually very strongly seeded, so that high-seeds play low-seeds in every bracket. This minimises the risk of a significant misranking in the higher ranks of the tournament (but of course, as seeding is never perfect, it does not remove it for the lowest places, which is why most single-elimination tournaments only try to rank the top 4 or so).

As we’ve just seen, however, Group qualification stages such as the World Cup uses are also bad at guaranteeing globally good seeds, and so are a poor choice to seed into a single-elimination tournament. The biases of the two processes compound their errors, resulting in a greater probability of misranking than either would produce alone. (Again, good seeding at the Group stage helps both the Group and the Elimination stages to perform well, but without it, the entire tournament just compounds errors in ranking all the way through.)


Alternatives:

Let’s say we want to keep Group Qualification, because we understand it. If we don’t have good seeds, then our problem is choosing a good tournament to pass our Group results into to minimise the effects of unfair Groups.
The obvious candidate is a double-elimination tournament.

Double-elimination tournaments are tournaments where a competitor has to lose twice (hence double) before being eliminated. The first loss drops the competitor down into a parallel bracket (called the “elimination” bracket) which is run like a single-elimination tournament – losses kick you out of the tournament entirely, and wins continue to the next stage. The winner of the elimination bracket is then allowed to play the loser of the “top” bracket final for second place.
Seeding into a double-elimination tournament from our Groups is relatively easy – we start off the top 2s in the “top” bracket and the bottom 2s in the “elimination” bracket, and then proceed as normal. Teams penalised by being in a bad Group get a second chance via the elimination bracket, rather than being unceremoniously removed to a “Consolation” playoff or kicked out entirely.
In fact, as most Roller Derby tournaments already include Consolation games, in order to guarantee all competitors a certain amount of track time, double-elimination tournaments don’t even add many more games to the total structure (we still have games for “losers”, but we allow them to feed back into the main tournament at the end).

[We can also go to higher-order elimination tournaments – triple, quadruple, etc – but the number of additional games needed for them does increase significantly, and the complexity of the schedule tends to suffer. Triple-elimination is used in some Curling contests, but we are not aware of any real-world deployment of quadruple-elimination or further (probably because at that point, there are other tournament types which offer better performance in fewer games).]

On the other hand, we could also ditch the Group Qualifiers, and try to pick a better way of seeding our single-elimination tournament.
The main problem with Group Qualifiers is the way they divide up (partition) the teams into rigid groups that play off internally. If we relax the partitioning requirement, but add a requirement that we use previous games to improve our matches in future, we arrive at so-called “Swiss-system Tournaments”.

Swiss tournaments were designed for Chess (and their variants, the McMahon and Danish systems, for Go and Bridge), but are also used for Badminton, amongst other sports. The scheme is as follows: in the first round, teams are paired against each other (the pairing mechanism can be chosen to suit the tournament – we’d suggest matching geographically distant teams to each other for interest). The winners get 1 point, the losers get 0 points.
In each subsequent round, we pair off teams with the same number of points (so 1 pt teams play 1 pt teams), with the winners gaining a point each time.
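The scheme just described can be sketched in a few lines of Python. This is a deliberately idealised toy: the stronger team always wins, round 1 pairings are random rather than geographic, and nothing stops two teams meeting twice (a real Swiss pairing would avoid rematches). The function name, the `strength` mapping and the seed are all assumptions for illustration.

```python
import random


def swiss_rounds(teams, strength, rounds=3, seed=0):
    """Play a simple Swiss qualifier and return each team's points."""
    rng = random.Random(seed)
    points = {t: 0 for t in teams}
    order = list(teams)
    rng.shuffle(order)  # arbitrary first-round pairings
    for _ in range(rounds):
        # Sort by points so neighbours in the list share a score, then
        # pair them off two at a time (1 pt teams play 1 pt teams, etc).
        order = sorted(order, key=lambda t: -points[t])
        for a, b in zip(order[0::2], order[1::2]):
            winner = a if strength[a] > strength[b] else b
            points[winner] += 1
    return points


teams = list(range(32))
strength = {t: t for t in teams}  # hypothetical: higher number = stronger team
points = swiss_rounds(teams, strength)
top_half = [t for t in teams if points[t] >= 2]
print(len(top_half))
```

With 32 teams the score groups stay even in size for the first three rounds, so every pairing is between teams on the same points, and after round 3 exactly 16 teams have 2 or more points.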

Swiss tournaments take the same number of total games as the Group rounds above, if we stop after each team has played the same number of games as they would have in a Group round (3 in this case). The advantage is that: as the field is not narrowed artificially, each team has played a wider range of opponents (and thus the risk of initially poor selection is reduced), and as we narrow by past performance, each team should have played teams increasingly close to their own ability after the first round.
If we wish to seed into a single-elimination tournament, then the top half of the score table is always identifiable after an odd number of rounds. We also have the advantage that, as the teams have all played a wider range of opponents, estimation of relative strengths in the table is easier.

The disadvantage of Swiss-style qualifiers is that the schedule for each round is hard to predict before most of the previous round has completed. To keep players, and spectators, in the loop, you need a relatively effective back office to update the schedules and post the upcoming games as early as possible.

Another alternative to Group Qualifiers is to simply seed into a double, or even triple, elimination tournament (or a Swiss tournament with enough rounds) using properly estimated seeds based on previous performance. As previously mentioned, this is the process adopted by, for example, Tennis. There is a small increase in the number of total games competed in, but the trade-off is far better accuracy in ranking competitors.

Appendix:
-Schedule length calculations for 32 team tournaments –

WC-style (8 Groups of 4 seeding top halves into 16-player single elimination, with single Consolation games for bottom 16) = 8*(4*3/2) + 15 + 8 = 71
[Note that the WC also included 4 Expo bouts, and a third/fourth place playoff]

Groups into double-elimination, no Consolation: 8*(4*3/2) + 63 = 111
Top 3 in each Group into double-elimination, no Consolation: 8*(4*3/2) + 48 = 96

Swiss-style into double-elimination for top half of table (no Consolation): 16*3 + 31 = 79
Complete double-elimination tournament with no qualifier: 63
Complete triple-elimination tournament with no qualifier: 95
Complete Swiss-style tournament with no qualifier: 16*log2(32) = 80
Complete Swiss-style tournament with triple-elimination (Swiss, but after 3 losses, you’re out): 3*16 + 14 + 11 = 73 [4 teams get only 3 games, 6 get 4 and the rest all get 5]
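Most of the arithmetic in this appendix can be reproduced with a short Python sketch. The function names are our own, and the formula for multiple elimination assumes the longest possible schedule (every team except the winner loses k times, and the winner loses k-1 times), which is what gives the 63 and 95 quoted above. The Swiss-with-knockout count assumes every score group is even and splits exactly in half each round.

```python
from math import comb, log2

def round_robin(n):
    """Games in an all-plays-all group of n teams."""
    return comb(n, 2)

def single_elim(n):
    """Every game eliminates one team; all but the winner are eliminated."""
    return n - 1

def multi_elim(n, k):
    """Longest k-elimination schedule: k*(n-1) losses plus k-1 for the winner."""
    return k * n - 1

def swiss_with_knockout(n, rounds, max_losses):
    """Count games in a Swiss tournament where max_losses losses eliminate a team."""
    counts = {0: n}          # losses so far -> number of teams still in
    games = 0
    for _ in range(rounds):
        nxt = {}
        for losses, teams in counts.items():
            half = teams // 2
            games += half
            nxt[losses] = nxt.get(losses, 0) + half              # winners
            if losses + 1 < max_losses:                          # losers who survive
                nxt[losses + 1] = nxt.get(losses + 1, 0) + half
        counts = nxt
    return games

groups = 8 * round_robin(4)  # 8 Groups of 4
schedule = {
    "WC-style": groups + single_elim(16) + 8,
    "Groups into double-elimination": groups + multi_elim(32, 2),
    "Swiss into double-elimination (top half)": 16 * 3 + multi_elim(16, 2),
    "Complete double-elimination": multi_elim(32, 2),
    "Complete triple-elimination": multi_elim(32, 3),
    "Complete Swiss": 16 * int(log2(32)),
    "Swiss with triple-elimination": swiss_with_knockout(32, 5, 3),
}
for name, games in schedule.items():
    print(f"{name}: {games}")
```

For the last entry, the round-by-round count is 16 + 16 + 16 + 14 + 11 = 73 games, which agrees with the per-team tally of 4 teams with 3 games, 6 with 4 and 22 with 5.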


Ranking the Blood & Thunder World Cup 2014: the Meat of the Matter

One of the big problems with single-elimination tournaments, like that in the World Cup, is that they are exceptionally poor at producing absolute rankings for teams that don’t get through to the final round or so. In their preliminary ranking, Blood & Thunder attempted to use the score-differences from the Top 16 to rank the losers of those match ups (which included Team Scotland). This is problematic, as it doesn’t take into account the large skill differential at the top end of the tournament (even England, breaking records against USA, still scored half that of their opponents) – the same team placed against USA or Canada would record a radically different score differential, and this effect cannot be simply disentangled from the relative performance of the “second 8”.

Driven by this, we have performed some statistical analysis on the World Cup scores as a whole, in order to attempt to provide a firmer basis for ranking the 30 teams in terms of their actual strength, rather than their performance in a tournament.

Our approach is based upon the idea that the ratio of the scores in a bout is a better model for the relative skill of two teams than the difference. This is intuitively
The other advantage of choosing a relative skill measure based on ratios is that we can infer the relative skill of two teams who have not played each other by comparison via a common opponent of the two. If Team A’s relative skill against Team C is X and Team B’s against Team C is Y, then the relative skill of Team A against Team B should be X/Y.
We can also build longer chains of inference on the same model, involving more intermediate opponents, but clearly the error included also increases as the distance between the compared Teams increases. In order to reduce this, we can average over all of the possible chains of a given length to produce a composite relative rating for that pair, assuming that the errors will partially cancel.
(For example, if Teams A and B have not played each other, but have both played Teams C, D and E, then we average the ratios from A-C-B, A-D-B and A-E-B to produce the final “length 1” ratio comparison.)
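As a rough illustration of the “length 1” inference described above, here is a minimal Python sketch. It is not the toolkit's own code: the function names and the scores are invented for the example, the average used is the geometric mean, and non-zero scores are assumed (a shutout would divide by zero).

```python
from math import prod

def direct_ratios(bouts):
    """Map (team, opponent) -> score ratio, from (a, b, score_a, score_b) tuples."""
    ratios = {}
    for a, b, score_a, score_b in bouts:
        ratios[(a, b)] = score_a / score_b
        ratios[(b, a)] = score_b / score_a
    return ratios

def infer_via_common(ratios, a, b):
    """Estimate the a:b ratio from every opponent that both teams have played."""
    teams = {t for pair in ratios for t in pair}
    chains = [
        ratios[(a, c)] / ratios[(b, c)]          # X / Y for the chain a-c-b
        for c in teams
        if c not in (a, b) and (a, c) in ratios and (b, c) in ratios
    ]
    if not chains:
        return None                              # no common opponent at this length
    return prod(chains) ** (1 / len(chains))     # geometric mean over the chains

# Hypothetical scores: A and B never meet, but both play C and D.
bouts = [
    ("A", "C", 200, 100),
    ("B", "C", 150, 100),
    ("A", "D", 180, 120),
    ("B", "D", 120, 120),
]
ratios = direct_ratios(bouts)
estimate = infer_via_common(ratios, "A", "B")
print(round(estimate, 3))
```

Here the chain through C suggests A is about 1.33 times as strong as B, the chain through D suggests 1.5 times, and the combined estimate lies between the two.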

Examining the structure of the chains of inference, we can always build a “length N” chain of inference up from combining the results of “length N-1” chains and shorter. Relying on the precalculated results for the previous chains improves the efficiency of our calculation by orders of magnitude and reduces the margin of error in our implementation.

We need to build chains of inference up to 4 teams long in order to estimate ratios for all of the possible pairings of teams in the World Cup, due to the fact that the teams in the Consolation rounds had the most limited competition in the Cup.

In order to provide a check on the accuracy of the inferences as chains become long, we also calculate the “self-ranking” of each team, when it is available for a given chain length. This is the strength that the inference assigns to the team if it were playing itself – clearly, for perfect inferences, this should always be 1. The deviation of the self-rankings from 1 is a measure of how much error the inference chains have accumulated so far. We used the self-rankings to select the best performing chain combination process (taking the geometric mean rather than the arithmetic mean as our average produces much better stability, as well as being theoretically justified), and, for the World Cup data, only the self-rankings for Sweden deviate significantly from the expected value (having a value of about 1.3 at rank 3). (We suspect that this is because Sweden is also the only team to have achieved a perfect shutout in the tournament, against Japan, and thus it encounters an inevitable error from the uncertainty this produces in the relative rankings between the two teams.)
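One general reason the geometric mean suits ratios is that it preserves reciprocity: the combined A:B estimate is exactly the inverse of the combined B:A estimate, which is not true of the arithmetic mean. The short Python sketch below illustrates that property with two invented chain values; it is not the toolkit's self-ranking calculation, only a demonstration of why the choice of average matters.

```python
from math import prod

def arithmetic_mean(values):
    return sum(values) / len(values)

def geometric_mean(values):
    return prod(values) ** (1 / len(values))

# Two hypothetical chains that disagree: one says A is twice as strong as B,
# the other says A is half as strong.
chains_ab = [2.0, 0.5]
chains_ba = [1 / x for x in chains_ab]   # the same chains read in the other direction

# For a consistent estimate, (A:B) * (B:A) should be 1.
arithmetic_check = arithmetic_mean(chains_ab) * arithmetic_mean(chains_ba)
geometric_check = geometric_mean(chains_ab) * geometric_mean(chains_ba)

print(arithmetic_check, geometric_check)
```

With the arithmetic mean, both directions come out at 1.25, so each team appears stronger than the other; with the geometric mean the two estimates cancel to 1.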

Given a matrix of all of the ratios of skill between two teams, we can sort the list of teams via the full matrix (choosing the shortest chains of inference for each ratio we need).

We choose to use a topological sort for our data: a sorting approach that builds an ordered list from a tree of dependencies – in our case, the requirement that winners of a bout are ranked above losers. As topological sorts are not dependent on having a complete ordering for all items, we can perform topological sorts even at earlier rankings in the data, which returns an ordered set of “equivalence classes”, lists of teams that we can say are all superior to teams in the classes preceding them, and inferior to teams in the classes after them, but which we cannot separate using data available at this inference level.
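A layered topological sort of this kind can be sketched as follows. This is a minimal Python illustration with invented teams and results, not the toolkit's implementation; in particular, when only a cycle is left it simply returns everything remaining as one unordered class, which is a simplification.

```python
def equivalence_classes(teams, results):
    """Group teams into ordered classes, weakest class first.

    results: iterable of (winner, loser) pairs.
    """
    beaten = {team: set() for team in teams}
    for winner, loser in results:
        beaten[winner].add(loser)

    remaining = set(teams)
    classes = []
    while remaining:
        # A team can be placed once every team it beat has been placed below it.
        layer = sorted(t for t in remaining if not (beaten[t] & remaining))
        if not layer:
            # Only cyclic dependencies are left (A > B > C > A):
            # return the remainder as a single unordered class.
            classes.append(sorted(remaining))
            break
        classes.append(layer)
        remaining -= set(layer)
    return classes

# Hypothetical results: E beat C and D, who both beat B, who beat A.
results = [("E", "D"), ("E", "C"), ("D", "B"), ("C", "B"), ("B", "A")]
classes = equivalence_classes("ABCDE", results)
for group in classes:
    print(" ,".join(group))
```

C and D end up in the same class because nothing in the results separates them, which is the situation the equivalence classes are meant to capture.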

At Rank 0 (using only data directly from bouts in the World Cup), the Topological equivalence classes are:
*******RANK 0********
Japan ,Switzerland ,PuertoRico ,SouthAfrica
Portugal ,Mexico ,Italy ,Wales ,Netherlands,Spain ,Chile
Denmark ,Norway ,Germany ,Greece ,Brazil ,WestIndies
Belgium,France ,Ireland ,NewZealand ,Colombia
Sweden ,Argentina ,Scotland
Finland
Canada
England ,Australia
USA

At each higher rank, we sort only within the topological classes provided at the rank before, to prevent higher inference levels destroying orderings of better provenance.

As our inference chains extend, we cannot guarantee that cyclic dependencies will not form in our topological graph. These occur when a valid ordering can’t be constructed between teams, assuming that all of their priors are accurate (i.e., we derive Team A > Team B > Team C, but also have Team C > Team A). We find that these only occur rarely, and for inference chains of at least rank 2 (our first such example is West Indies > Norway > Brazil > Greece > West Indies at that rank). Our procedure is to return the equivalence class unordered to the pool, to allow a higher order of inference to attempt to break the cycle. (In the case of the example above, this does not happen at any inference level, leading us to believe that the teams should be equally ranked).

The progressive rankings produced from the original data set at increasing levels of inference are:
*************** RANK1***************
Japan ,Switzerland ,PuertoRico
SouthAfrica
Portugal ,Mexico ,Spain
Netherlands,Italy ,Chile
Wales
WestIndies ,Norway ,Greece ,Brazil
Germany ,Denmark
Ireland ,France ,Colombia
Belgium
NewZealand
Sweden ,Scotland ,Argentina
Finland
Canada
Australia
England
USA
*************** RANK2***************
Japan ,PuertoRico
Switzerland
SouthAfrica
Portugal ,Mexico ,Spain
Italy ,Chile
Netherlands
Wales
WestIndies ,Norway ,Greece ,Brazil
Denmark
Germany
Ireland ,France ,Colombia
Belgium
NewZealand
Argentina
Scotland
Sweden
Finland
Canada
Australia
England
USA
*************** RANK3***************
Japan
PuertoRico
Switzerland
SouthAfrica
Mexico
Portugal
Spain
Italy ,Chile
Netherlands
Wales
WestIndies ,Norway ,Greece ,Brazil
Denmark
Germany
Colombia
Ireland
France
Belgium
NewZealand
Argentina
Scotland
Sweden
Finland
Canada
Australia
England
USA
 


As with all rankings, this ranking is somewhat vulnerable to random chance on the day. In particular, blowouts are problematic for all rankings, as they contain almost no information (other than that one team was exceptionally better than the other), as the performance of the lower ranked team in scoring at all is essentially dominated by noise. At the opposite end of the scale, knife-edge games give the impression that two teams are very close in skill, but do not unambiguously provide a measure of which is the better (just one jam different might have changed the winner). Knife-edge games are problematic for our topological sorter, as they of course imply partitions that might not have existed if the other team had won, while blowouts present problems for our inference engine, as they return less information to the chains they participate in than other games.

In order to guard against the accumulation of error from the sample games we have, we added “fuzz factors” to the scores on each game, and recalculated the predicted rankings with the slightly perturbed scores. (That is, we construct a new set of scores, where all the games went very slightly differently, and then see what ranking would have resulted in that case.) We repeated ranking calculations over 10,000 sets of perturbed scores for the inference and topological sorter. We then combined the sets of rankings together to gain an impression of how the ranking for a given team would vary.
This has two effects: random perturbations in scores tend to break the cyclic dependencies in the original data, allowing “drawn” teams to be separated statistically; and our higher-order, and more vulnerable, inferences are allowed to vary across their uncertainty ranges, allowing us to measure the actual certainty in their predictions.
Calculating the mean and variance for these rankings allows us to calculate a final ranking synthesised over all of the simulations.
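The perturb-and-rerank loop can be sketched in Python as below. Several things here are assumptions made for illustration: the scores are invented, the noise model (multiplicative Gaussian noise of about 5% on each score) is our guess at what a “fuzz factor” might look like, and `rank_teams` is a deliberately simple stand-in (ordering by mean log score ratio) for the full inference and topological sort.

```python
import random
from math import log
from statistics import mean, pvariance

def rank_teams(bouts):
    """Stand-in ranking: order teams by mean log score ratio, best first."""
    logs = {}
    for a, b, score_a, score_b in bouts:
        logs.setdefault(a, []).append(log(score_a / score_b))
        logs.setdefault(b, []).append(log(score_b / score_a))
    return sorted(logs, key=lambda team: -mean(logs[team]))

def perturb(bouts, rng, sigma):
    """Return a copy of the bouts with every score nudged by Gaussian noise."""
    return [
        (a, b,
         max(1.0, score_a * rng.gauss(1, sigma)),   # keep scores positive
         max(1.0, score_b * rng.gauss(1, sigma)))
        for a, b, score_a, score_b in bouts
    ]

def simulate(bouts, runs=1000, sigma=0.05, seed=1):
    """Rank many perturbed copies; return team -> (mean place, variance of place)."""
    rng = random.Random(seed)
    places = {}
    for _ in range(runs):
        ranking = rank_teams(perturb(bouts, rng, sigma))
        for place, team in enumerate(ranking, start=1):
            places.setdefault(team, []).append(place)
    return {team: (mean(p), pvariance(p)) for team, p in places.items()}

# Hypothetical scores: A dominates, while B v C is a knife-edge game.
bouts = [("A", "B", 250, 100), ("B", "C", 150, 148), ("A", "C", 240, 100)]
stats = simulate(bouts)
for team, (avg, var) in sorted(stats.items(), key=lambda kv: kv[1][0]):
    print(f"{team}: mean place {avg:.2f}, variance {var:.3f}")
```

In this toy data set, A is placed first in every run, while B and C swap places from run to run, so their variance is non-zero and their mean places sit between 2 and 3.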

Below is a “heatmap” from the completed run of simulations, showing the distribution of ranking for the various teams. The teams are provided in order of final ranking. The horizontal bar to the side of each team shows how likely the team was to be ranked in a given place, over all of the 10,000 possible rankings calculated. A completely white box shows that 100% of the rankings placed the team in that position, a completely black box shows that 0% of the rankings placed the team in that position.
We have also coloured each team name based on the estimated confidence in the team’s final ranking – light blue is high confidence in that ranking, and dark blue is low confidence.

Heatmap for 10k iterations

The first thing that is clear from the ranking is that the results of the semifinal and finals are in no serious doubt. While an England v Australia bout would be an exciting affair, we have high confidence that England would win the matchup.

It is striking that Team Germany, who were eliminated from the tournament at the Group stage, are consistently ranked at 14th place by our analysis, with high confidence. Germany were hit by a particularly hard Group, facing both eventual-2nd-place England and eventual-9th-place Ireland, and there was no room for a third team in the Top 16 after the latter had qualified. This is a big problem with Group-selection processes into single-elimination tournaments, and we would have preferred to have seen a more “global” playoff scheme for the single-elimination phase (for example, three rounds of Swiss-selection, which pairs off teams first randomly, and then against teams who have won or lost as many times as they have) to avoid the “local minimum” problem.

There is some statistical spread in the middle part of the ordering, which is caused by the same effect as above. Teams on the border of the Top 16 selection can pop in and out of the Top 16 depending on small perturbations of their performance, especially with poorly seeded Groups.

However, the spread is not terrible for any particular team, so we are happy to pronounce our ranking for the Blood & Thunder World Cup 2014 as:

1. USA
2. England
3. Australia
4. Canada
5. Finland
6. Sweden
7. Scotland
8. Argentina
9. NewZealand
10. Belgium
11. Ireland
12. France
13. Colombia
14. Germany
15. Denmark
16. Greece
17. Brazil
18. Norway
19. WestIndies
20. Wales
21. Chile
22. Netherlands
23. Italy
24. Spain
25. Portugal
26. Mexico
27. SouthAfrica
28. Switzerland
29. PuertoRico
30. Japan

As an additional check, you can see that our international ranking is compatible with the results of Super Brawl of Roller Derby, Road to Dallas and relatively compatible with the European Championships (as the latter was a single elimination tournament, precise compatibility is less of an issue). (The Blood & Thunder World Cup Tournament ranking is not, as the effects of the tournament are convolved with the power rankings themselves.)

[The source code for the inference tool is available from: https://code.google.com/p/ranking-chain-inference/ ]


Skate Names, Or: Why we need to stop worrying and love Derby.

Note: this article was first written more than a year ago, back when there was a steady tide of articles about “Why I am not skating under a derby name”. The uncritical acceptance of the arguments presented by most, even those who liked skate names, annoyed me, so I wrote the first version of this response. Due to various issues, it failed to get published the first time around, and the article grew to a size where having it on the Scottish Roller Derby Blog seemed to not fit with policy. I present it here as an artifact of a time when derby was a little less self-confident than it is now, perhaps.

Over the past couple of years, the “Derby Names” debate has become a growing topic of conversation in the Roller Derby community. It might just be my selective reading, but the majority of the published articles on the issue take (or are spun as taking) an anti- stance; a proportion not borne out by the comments resulting, most of which are pro derby names. It is especially unfortunate that most of the true anti-derby-names articles repeat arguments that are seriously flawed and, in some cases, based on statements that are simply untrue. This includes the coverage of the issue in the celebrated “Derby, Baby” film, where the comments from the anti- side are never directly challenged by quotes from the pro- side (although both sides are allowed time).

This article, then, aims to partially redress the balance of the debate. I will try to show why none of the provided arguments against Derby Names are without problems, and why many of them should be dismissed out of hand.

At this point, I should make a distinction: some of the arguments made against derby names serve as reasonable personal reasons not to take a derby name; they do not serve as reasons for the sport as a whole to drop them. I will flag arguments which only work on the personal level when I come to them (and I have no issue with people deciding, for themselves, to skate under their real names).

With that extensive introduction out of the way, let’s get started.

Argument 1: “No Other Sport Has Athletes Perform Under Aliases”

This argument is the most often repeated justification for why Derby doesn’t get prime-time viewership, especially in the USA. It is also, of course, completely untrue. Most of the commentators are presumably American, and are thinking that all other sports are like the trifecta of American Sports – Baseball, American Football and Basketball – and, in fact, I suspect this argument could be more truthfully stated as “No Other American Sport Has Athletes Perform Under Aliases”. In this form, the statement is true.

However, there are many more sports in the world than the USA’s favourites (none of which has much traction outside the US). The World’s Favourite Sport, “Association Rules Football”, or “Football” as everyone but Americans and Australians call it, has a couple of nations who famously play under affectionate nicknames, the most well-known of which is, of course, Brazil.

Football does not have a widespread culture of aliases of course, but there is a far closer match to the position of Roller Derby in existing sports: the Japanese sport of Sumo.

Sumo, which is still wildly popular in Japan, requires all wrestlers to compete under a pseudonym, or shikona, which is regulated by the sport’s governing body. As in Roller Derby, shikona often incorporate puns or references to the personality the wrestler wishes to express; for example, the American Henry Armstrong Miller wrestled as Sentoryu, which means “fighting dragon” but also sounds like the pronunciation of his home town, St Louis, in Japanese (as in Derby, not all puns are as good as others). Shikona are taken very seriously, and any suggestion that they brought the sport, or the individual wrestlers, into some kind of disrepute would be considered shocking and offensive.

Of course, by even addressing this argument seriously, I’ve already missed the most fundamental objection to this position: why does it even matter if a sport does something that no other sport does? One of the glorious things about humanity is that we express ourselves through diversity. Complaining about that is simple parochial neophobia, and should be opposed on principle.

Argument 2: “Derby Names Make Derby Look Less Serious”

This second argument is a valid personal reason for not taking a derby name, and as promised I flag it as such. On the other hand, the general form of the statement makes some implicit assumptions that I find corrosive to the spirit of Roller Derby itself.

We can paraphrase Argument 2 as the “and when I became a man, I put away childish things” position. As a starting point of our deconstruction of the problems with this position, we can’t do much better than C. S. Lewis’s own extension of the Biblical quotation: “When I became a man I put away childish things, including the fear of childishness and the desire to be very grown up”.

Worrying excessively about other people’s perceptions is a phenomenon of adolescence, both in individuals and in communities; becoming aware of others’ awareness of us invokes anxieties about one’s projected self-image. As with individuals, the healthiest groups are those who get through adolescence by realising that other people’s perceptions are really not as significant as they may feel.

The revived roller derby had as one of its core principles that it was a sport “for the skaters, by the skaters”. This quickly became a wider message about an inclusive sport, which welcomed women of “all shapes and sizes”, and which challenged traditional images of sporting women. Having confidence in yourself, regardless of if this fits into traditional norms, is thus an essential aspect of roller derby as a sport and a culture. It’s one of the primary reasons stated by new skaters for their interest in joining up, and it’s one of the most positive aspects of the culture for everyone involved.

It is therefore somewhat ironic that the reaction of some to increased media attention has been to worry about how roller derby can best fit in and be “like all the other grown-up sports”. No, derby names are not the most serious of roller derby’s paraphernalia. The question is: why should we care that they’re not? Being a well-balanced adult is about appreciating your ability to have fun, and, indeed, to express yourself, as much as it is about responsibility and social conventions. That means feeling able to skate under your own name if you feel it best represents you; it also means being able to skate under a pseudonym, without people condemning you for making roller derby look “silly”. Skate names are one of the first things that new skaters seem to think about; certainly when interviewing new leagues, it was clearly a question that all of the skaters had already been considering. Most relish the opportunity to express themselves in choosing a name which means something to them, or which helps them to adopt the right attitude on track.

All long standing fans of roller derby are well aware of the skill and professionalism that exists at the top levels of play (and, indeed, which is aspired to at all levels). We are aware of this despite the trappings which the author would have discarded. I am not convinced that this is a demonstration of tremendous perspicacity on our part; what people forget about the few negative newspaper articles about derby is that some people are paid to have dismissive opinions about things. The less reliable newspapers and the wider media routinely publish all kinds of nonsense about elements of non-mainstream culture, but no-one of any consequence pays any attention to them, and no-one sane actually tries to pander to their delusions. You can’t, as Abraham Lincoln noted, please all of the people all of the time; all you can do is not displease any of them by not risking anything any of the time. It would be a much diminished world if we never took the chance on risk.

Argument 3: “You’re not taking sufficient Pride in your Sport if you perform under a Pseudonym”

Clearly, this argument is similar to the above, although distinct (a comedian might not want to appear serious, but still takes pride in their work, for example). As such, it shares with the above argument a certain validity as a personal reason for not taking a skate name.

As a general statement, however, it falls very very short of being reasonable.

Argument 3 appears to be making a moral argument that should stand apart from sports, and apply to any particular human endeavour on the world stage. If it shows insufficient pride (or, as expressed mainly by Americans, “patriotism”) to play a sport under a pseudonym, should the same not apply to acting under a pseudonym, or performing music?

Clearly, the British Actors’ Union “Equity” would have issue with this position. As Equity requires all actors registered with it to have unique professional names, it is often the case that a performer will either modify their existing name (“Richard Grant” to “Richard E Grant”, where the E is a fictitious initial) or take on an entirely different one for performance (“David McDonald” to “David Tennant”) in the interests of uniqueness. I imagine that David Tennant didn’t particularly feel a lack of pride in his numerous Best Actor awards just from their lack of “David McDonald” blazoned upon them.

Similarly so in the world of music. There are countless examples of popular, and successful, and sometimes even patriotic, musicians who perform under names which are certainly not those with which they were born. Gordon Sumner, for example, will be much better known to the world as “Sting”. As with Tennant, above, I suspect that Sting doesn’t mind that his Grammy Awards come with his performance, not his birth, name.

And, to belabour the point, so much more so now in the various Internet-mediated industries is the pseudonym considered normal. Markus Persson is still respected as an enormously successful games developer after Minecraft took off so unexpectedly, even though most of the world knows him as “notch”.

In fact, even the sporting world has its own counterexamples. Pelé, or “Edison Arantes do Nascimento” as his birth certificate would have him, is commonly regarded as possibly the most successful Footballer of all time. He is a national hero in Brazil, and, in fact, was once declared a Brazilian National Treasure partly to prevent him from playing for other nations. In the football culture of Brazil, at least, adopting an alias to play sport doesn’t demonstrate a lack of patriotism at all.

To suggest that a skater in high end roller derby cannot partake of the same cultural constructs as a performer at the top of their own career seems to be deliberately obtuse. Indeed, the special pleading that this seems to me to require feels slightly insulting to roller derby, implying that our skaters are less able to cope with a multifaceted identity than the rest of humanity. While there certainly is a double standard in Western society concerning Women’s and Men’s empowerment in identity choice, it doesn’t seem to me to affect the adoption of pseudonyms per se, although of course it does influence the acceptability of pseudonym presentation. While this is a significant issue, it’s also worthy of an article in itself, and outside of the scope of this one.

Indeed, even the now oft-cited example of international skaters at the World Cup “choosing to skate under their real names” is actually highly problematical for the argument: this phenomenon only actually occurred within Team USA (and Team Brazil, who didn’t have any skate names for most of their skaters anyway), with by far the majority of competitors perfectly happy to compete in a World Cup under their established derby names. The WFTDA 2012 survey backs this up: fully 89% of female skaters (and 93% of fans) did not agree that they’d enjoy roller derby more if competitors skated under their real names.  If people really worried about not being seen to be taking pride in their performance, wouldn’t this fraction be far higher?

Argument 4: “Offensive Derby Names Scare Off Broadcasters”

As with Argument 1, the first problem with this argument is that we take its implicit assumptions at face value. We are to assume that getting on National Sports channels, be it Sky or NBC Sports, should be a key aim of Roller Derby in the short to medium term, and that this is critical to the success of the sport.

This position is unsupported by available evidence. As the Derby News Network made clear in their first ever editorial [http://www.derbynewsnetwork.com/2012/09/pay_view_enemy_success ] it is not clear at all that the causal arrow points from Television to International Success. In fact, historically, the situation has been reversed: the inherently conservative sports broadcasting industry will ignore sports until they become so wildly popular that they have to pay attention to them.

DNN also make the beginning of a strong argument against the importance of Television in mass availability of bout footage. Before WFTDA.tv made all of the regionals pay-per-view-only (a decision that, like DNN, I regard as a retrograde step), footage of high-end Derby tournaments was as geographically available as it could be: by definition, anyone with a sufficiently fast Internet connection, anywhere in the world, could view it. DNN still provides international streaming of live bouts from across the world on precisely the same basis. The Blood and Thunder Roller Derby World Cup was watched by individuals across 50 nations. No Television company will ever give you that kind of distribution. (Despite my issues with their charging structure, I do much prefer WFTDA.tv (or DNN, or RDUK.tv) managing broadcasting themselves; it is much more in line with the “Derby Spirit”.)

There is an argument that the mainstream sports channels access a greater cultural breadth in their subscribers; there are many sections of society who are much happier with a television than a computer. While this is currently true, it is changing rapidly; recent polls show that an increasing number of people are cancelling their cable subscriptions in the USA to switch to internet streaming services, and the BBC’s iPlayer is in continual growth compared to its broadcasting arm. It’s a short hop across the internet from iPlayer to YouTube, or DNN, or rduk.tv.

In any case, the 2012 WFTDA survey clearly shows that the majority of new fans and skaters come from word of mouth or watching “Whip It”, and the rate of growth doesn’t seem to be slowing at present. The only thing that pursuit of the big media can bring to this is a veneer of “mainstream legitimacy”, and I’m not sure why Roller Derby needs a bunch of old wealthy white men to tell it that it’s doing good.

Now that the spectre of the Broadcasters has been cut down to size, we should address the other side of the question: Do offensive derby names scare them off [and what should we do about it]?

It seems unquestionable that some derby names currently in circulation would be problematic for broadcasters (especially in the USA, where television is a little more socially conservative) to announce on air. The bracketed clause I inserted into the above question suggests, however, that there’s more than one approach to this issue, rather than the false dichotomy that the original phrasing implies.

Yes, getting rid of Derby Names entirely would, clearly, solve the problem of offensive derby names, in the same way that banning cars would entirely solve the problem of people dying in car accidents.

This seems like an extreme solution to the issue, where simply introducing additional checks in the existing Derby Name submission system would serve. (I am assuming here that, at some point, the poor overworked maintainers of the Master Roster would actually get some proper help, as they clearly can’t cope with the rate of submissions at the moment anyway.)

Despite the offensive Derby Names issue, there are other aspects that might equally well dissuade the average broadcaster from showing Derby. It is certainly not clear that Derby Names, in themselves, are dissuading all broadcasters.

Recently, all of the major UK broadcasters (the BBC, ITV and Sky) have produced features on roller derby. None of them seemed particularly put-off by derby names (although, of course, the converse danger of being caricatured by the names is still present), and all gave roller derby a pretty positive take. In fact, Sky have subsequently produced more spots on individual bouts, broadcasting them on the Sky Sports channel (rather than as frivolous filler pieces, for example). This is not the action of a channel which is not taking a sport seriously because its players tend to skate under pseudonyms, nor one that is scared off by the potential for offence in those names. In other countries, coverage has been even better: after coming third in the Track Queens: Battle Royal tournament, Stockholm Roller Derby achieved a full page article in the sports page of one of Sweden’s daily newspapers. The headline even referred to Swede Hurt as, well, Swede Hurt.

On the basis of history, it is more likely that the mere fact of Roller Derby being a predominantly Women’s sport is a significant impediment to screening by large broadcasters. Back in 1989, the then Head of BBC Sport, John Bromley, partly justified poor coverage of women’s sports with the statement “There is no audience for Women’s Hockey”. Similarly, it is only relatively recently that Women’s Football has experienced anything resembling consistent coverage on UK television (of course, still a tiny fraction of the coverage given to the Men’s sport).

When women’s sports have been broadcast by the mainstream media, there has been a tendency to both patronise and sexualise them; while this situation is improving with time, it is still the case that, for example, Men’s and Women’s tennis are treated somewhat differently in the media. Roller Derby is precisely the kind of sport that is most problematic for big media, as it, like Women’s Rugby, directly challenges the gender norms that conservative media would like to enforce – it’s okay to see women looking pretty while doing gymnastics, or even wearing short skirts while hitting a ball in tennis (although it’s apparently not fair to let them play as long as men do, and making noises from effort causes comment), but women actively hitting each other is transgressing into traditionally male domains. This is historically far more of a problem for conservative outlets than some offensive derby names are, and something which requires society to catch up, rather than roller derby to shift itself to conform to an outdated set of cultural values.


A Contract for Roller Derby Videography

I’ve been moving from photographing roller derby (which now so many people do that it’s difficult to believe that you’re adding value to the discourse) to videoing bouts over the last 6 months to a year. (Although the first whole bouts I ever videoed were for Team Scotland before the World Cup.)

As part of this, if I’m going to video a whole bout, which is usually arranged in advance with the host league, I tend to give both attending leagues copies of a “bout DVD” – a post-produced version of the footage with additional toggleable overlays (Period/Jam indicator, scoreline, jammer names and status), credits etc. In return, I’ve been assuming that those leagues might allow me to put my own footage up on the internet for others to see.

As it happens, at least in the UK, policy on video footage of bouts seems to be highly embryonic and variable amongst leagues. (Compare to the USA, where the Top 10 leagues in each of the old Regions have huge amounts of their footage online, and extensive policies regarding video footage.) I’ve encountered positions all the way from “please put everything online as soon as possible – we want to be able to show people it” to “please don’t put footage online at all, for at least period X” (where X has been anything from a month to a year). And of course, there are the leagues who don’t have a policy on video at all, and need to consider what that policy is before responding (which is entirely reasonable, I should say). Of course, strictly, the footage is my own and I could do whatever I wanted with it, but in the interests of fairness and derby community spirit, I try to stick to the wishes of the leagues I’ve videoed at any particular point.

However, over time, as I’ve collected more footage, and interacted with more leagues, and as some leagues’ policies have evolved, it has become more difficult to manage all the conflicting positions that exist.

As a result, and also in an attempt to provide a basis for some kind of harmonisation of policy, I am reluctantly establishing a formal contract regarding video work I perform at bouts. I really dislike having to take this kind of step, as I feel it in some way tarnishes the derby community ethic, but recent events have made other approaches untenable.

So, effective from June 2nd (because it’s unfair to force Dundee and Swansea to respond to this kind of thing at less than 24 hours’ notice):

The Formal Contract of Video Work (version 1)

Definitions

League A is the host League, the League who has hired and organised the venue for the Bout.

The Bout is one or more consecutively held contests of the sport of Roller Derby (as defined under WFTDA or other commonly accepted rule-sets), held in the venue organised by League A.

Leagues B are the other League or Leagues involved in the contests comprising the Bout.

The Videographer is the Person responsible for operating video recording machinery, engaged to record the Bout for League A.

The Reproduction is a processed account of the Bout, arranged from footage recorded by the Videographer.

Section A

I, the Videographer, being requested to video a Bout hosted by League A, and featuring other Leagues B, will endeavour, to the best of my ability (limited by equipment failure, and acts outside of my control) to capture a complete and representative record of the events of the Bout. League A accepts that this will require the Videographer to establish a position allowing a good view of the playing area and the scoreboard, and that the Videographer may require equipment, for example a tripod, to enable quality footage to be captured. League A additionally accepts that this may require the Videographer to enter the hall containing the playing area before the Audience arrives, in order to set up. The Videographer accepts that there will be limitations on the positioning of his equipment for safety and other reasons, and will endeavour to minimise disruption to both the Bout and the Audience by the placement of his equipment. League A also accepts that this record may not include footage of Time-Outs or other periods when Play of the Bout is suspended – the Videographer will, minimally, endeavour to cover all Jams in all Periods of the Bout.

Section B

After the Bout is completed, the Videographer will endeavour to produce a processed, high-quality Reproduction of the Bout’s events (with consideration of the issues raised in Section D), including additional enhancements at his discretion, in a format acceptable to League A and Leagues B (but limited to reasonable formats, and defaulting to DVD-Video). Copies of this Reproduction will be made available to all participating Leagues (A and B), in reasonable numbers (1 to 3 per league).

Section C

At a reasonable Delay after the date of the Bout, the full bout footage (sans footage covered under Section D) will be made available on the Public Internet, on a video hosting platform of the Videographer’s choice. (At present, this is YouTube.) The Delay will be the longest of all Delays requested by the involved Leagues A and B; each League may request a Delay of at most 8 calendar weeks, or lasting until the day after the next public bout featuring their represented team, whichever is shorter. Longer delays may be requested in exceptional circumstances, but should be robustly argued for.

Other than the public release, no other copies of the footage will be made available to any groups or individuals other than Leagues A and B, except with the permission of Leagues A and B.

In the event that the Bout is a “Closed Bout”, not open to the public, Section C is waived, and bout footage will not be made public except with the agreement of all Leagues A and B.
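As a rough illustration of how the Delay rule in Section C works out in practice, here is a minimal sketch in Python. It is not part of the contract: the function names and the shape of the inputs are my own assumptions, and it ignores both the “exceptional circumstances” clause and Closed Bouts.

```python
from datetime import date, timedelta

# Section C cap: a League may not request more than 8 calendar weeks.
MAX_DELAY = timedelta(weeks=8)


def capped_delay(bout_date, requested, next_public_bout=None):
    """Return the Delay one League is actually entitled to.

    `requested` is the Delay the League asked for (a timedelta).
    `next_public_bout` is the date of that League's next public bout,
    if known (hypothetical input, not something the contract defines).
    """
    cap = MAX_DELAY
    if next_public_bout is not None:
        # "The day after the next public bout", measured from the Bout date.
        until_next = (next_public_bout + timedelta(days=1)) - bout_date
        # Guard against a nonsensical negative span.
        cap = min(cap, max(until_next, timedelta(0)))
    return min(requested, cap)


def release_date(bout_date, requests):
    """Return the public release date for the footage.

    `requests` is a list of (requested_delay, next_public_bout) pairs,
    one per League (A and B). The overall Delay is the longest of the
    individual capped Delays; no requests at all means no Delay.
    """
    delays = [capped_delay(bout_date, req, nxt) for req, nxt in requests]
    return bout_date + max(delays, default=timedelta(0))
```

For example, with made-up dates: if the host asks for two weeks, and the visitors ask for twelve weeks but have another public bout four weeks after this one, the visitors’ request is cut down to the day after that bout, and that then becomes the release date as the longer of the two.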

Section D

In the event of injury occurring to a skater during the bout, or other exceptional events, the Videographer will obey the current standard of behaviour undertaken in the Derby Community: he will cease recording at the point that the injury (or other circumstance) becomes evident (e.g. when the jam is signalled to an early halt due to injury on the track, or other reasonable indications), and not resume recording footage (except for incidental footage not including the Bout itself – for example, the Announcers entertaining the crowd) until the Bout itself resumes formally.

The Reproductions provided to the Leagues A and B will include all footage recorded by the Videographer, except in the exceptional case that the recorded footage involving an injury is particularly traumatic (in which case, the Videographer will make the footage available to the home League of the injured skater and defer to their position on the footage’s inclusion), or exceptional cases where non-Bout-related footage includes matter which should be elided for legal or other reasons. In all cases, the Leagues A and B will be provided with reasoning for the elision of all removed footage containing material involving the Bout itself. The Public copy of the footage will, by default, have jams containing a traumatic event such as an injury requiring stoppage of the bout excised from the record. (The home League of the injured skater reserves the right to make a different decision on the disposition of the footage, if they so wish.)

Any comments on this draft contract are gratefully received: this is, of course, a work in progress.
