The problem of the AI black box has been around for as long as AI itself: if we can’t trace how a decision has been made, how can we be confident that it has been made fairly and appropriately? There are arguments – for example by Ed Felten – that the apparent problem is not real, that such decisions are intrinsically no more or less explicable than decisions reached any other way. But that doesn’t seem to be an altogether satisfactory approach in a world where AI systems can mirror and even amplify the biases endemic in the data they draw on.
This interview describes a very different approach to the problem: building a tool which retrofits interpretability to a model which may not have been designed to be fully transparent. At one level this looks really promising: ‘is factor x significant in determining the output of the model?’ is a useful question to be able to answer. But of course real-world problems throw up more complicated questions than that, and there must be a risk of infinite recursion: our model of how the first model reaches a conclusion itself becomes so complicated that we need a model to explain its conclusions…
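To make the ‘is factor x significant?’ question concrete, here is a minimal sketch of one generic post-hoc technique, permutation importance, applied to an arbitrary trained model treated purely as a black box. The classifier, synthetic data, and feature indices are illustrative assumptions for this sketch, not the tool discussed in the interview.

```python
# Minimal sketch of post-hoc feature probing, assuming we can only call the
# trained model's predict() function (a "black box"). The model and data
# below are illustrative placeholders, not the interviewee's tool.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Synthetic data: only the first two features actually drive the label.
X = rng.normal(size=(2000, 5))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

black_box = GradientBoostingClassifier().fit(X_train, y_train)
baseline = accuracy_score(y_test, black_box.predict(X_test))

# Permutation importance: shuffle one feature at a time and measure how much
# the model's accuracy drops. A large drop suggests that feature is
# significant in determining the output; a negligible drop suggests it is not.
for j in range(X_test.shape[1]):
    X_perm = X_test.copy()
    X_perm[:, j] = rng.permutation(X_perm[:, j])
    drop = baseline - accuracy_score(y_test, black_box.predict(X_perm))
    print(f"feature {j}: accuracy drop {drop:+.3f}")
```

The point of the sketch is only that such a probe needs no access to the model’s internals, which is what makes retrofitted interpretability plausible in the first place – and also why its answers stay at the level of ‘which inputs matter’ rather than ‘how the conclusion was reached’.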
But whether or not that is a real risk, there are some useful insights here into identifying materiality in assessing a model’s accuracy and utility.