Algorithmic bias doesn’t start with the algorithms; it starts with the bias. That bias comes in two basic forms, one more active and one more passive: one about what is present and one about what is absent. Both forms matter, and often they come together. If we examine a data set, we might see clear differences between groups but be slower to spot – if we spot them at all – skews caused by how those groups came to be represented in the data set in the first place. If we survey bus passengers, we may find out important things about the needs of women travelling with small children (and their pushchairs and paraphernalia), but we may overlook those who have been discouraged from travelling that way at all. That’s a very simple example, and many are more subtle – but the essential point is that bias of absence is pervasive.
This post systematically identifies and addresses those biases in the context of transport. It draws heavily on the approach of Caroline Criado Perez’s book, Invisible Women: Exposing the Data Bias in a World Designed for Men, illustrating the general point with pointers to a vast range of data and analysis. It should be compelling reading for anybody involved with transport planning, but it’s included here for two other reasons as well.
The first is that it provides a clear explanation of why it is essential to be intensely careful about even apparently objective and neutral data – the seductive objectivity of computerised algorithmic decision making is too often anything but – and of why those problems won’t be solved by better code if the deeper causes discussed here are not addressed.
The second is prompted by a tweet about the post by Peter Hendy. He is the former Transport Commissioner for London and is currently the chairman of Network Rail, and he comments:
This is brilliant! It’s required reading at Network Rail already.
That’s good, of course – a senior leader in the industry acknowledging the problem, if not quite promising to do anything about it. But it’s also quite alarming: part of the power of this post is that in an important sense there is nothing new about it – it’s a brilliant survey of the landscape, but there isn’t much new about the landscape itself. So Hendy’s tweet leaves us wondering when it becomes acceptable to know something – and when it becomes essential. Or, in the oddly appropriately gendered line of Upton Sinclair:
It is difficult to get a man to understand something, when his salary depends upon his not understanding it!