Crime maps: how useful?
New online crime maps for England and Wales have just been published. They seem to show numbers of crimes for single streets. If you're in England or Wales, you'll probably have seen all the fuss about them in the media. But what do they actually tell us about the risks of crime?
The new maps certainly aren't perfect. One issue is that the data aren't actually for single streets; they are for crimes "on or near" the street in question, and you aren't told how near. Where I live, all the shoplifting incidents from the local shops seem to have been allocated to a street that is not actually in the local shopping area, and whose main feature is a retirement home. Some of the other problems are described in a good article on the Guardian's website.
My own main concerns are all to do with comparisons. First, the information is counts of crimes, and no rates are given. So there's no real way to tell whether 8 crimes, or 5 burglaries, "on or near" a particular street, is a lot or a little. Five burglaries in a month in a street with 15 houses is pretty bad. The same number of burglaries in a long road of 200 houses wouldn't be marvellous, but it wouldn't be as bad. The standard statistical way to compare these numbers would be to use rates - crimes per house, or (more likely) per 1,000 residents, or something of the sort. But the new maps don't give the data in that form, they don't give the denominators (numbers of houses or inhabitants) that you would need to calculate the rates yourself, and you can't even work out the denominators yourself because the maps don't tell you exactly what area each crime count covers.
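To make the point about rates concrete, here is a minimal sketch in Python that compares the two hypothetical streets per 100 houses. The house counts are the illustrative figures from the paragraph above, not data from the maps, and the function name is invented for this example.

```python
def rate_per_100_houses(crimes, houses):
    """Crimes per 100 houses; needs a denominator the maps do not supply."""
    if houses <= 0:
        raise ValueError("houses must be positive")
    return 100.0 * crimes / houses


# Same count of burglaries, very different rates.
short_street = rate_per_100_houses(5, 15)    # 5 burglaries, 15 houses
long_road = rate_per_100_houses(5, 200)      # 5 burglaries, 200 houses

print(f"Short street: {short_street:.1f} burglaries per 100 houses")
print(f"Long road:    {long_road:.1f} burglaries per 100 houses")
```

The short street comes out at roughly 33 burglaries per 100 houses and the long road at 2.5, which is the comparison the raw counts hide.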
Another useful set of comparisons would be to see how the crime counts for an area change over time. You can't do that with the new maps yet, because they have data for only one month (December 2010). Worryingly, the Guardian article mentioned above suggests that, when the January data appear on the maps, the December data will disappear from the website. I hope not. But even if it's all still there, the comparisons over time need a bit of thought.
We've previously discussed (here and here) using the Poisson distribution to model crime data. Chance events that follow that kind of distribution are much more irregular than people usually expect. You get clusters of events, and then long periods with few or no events, without any change in the underlying rate.
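To get a feel for that irregularity, the sketch below simulates two years of monthly counts for a street whose underlying rate never changes. The rate of 3 crimes a month and the seed are assumptions chosen purely for illustration, and the sampler is Knuth's simple multiplication method rather than anything taken from the crime data.

```python
import math
import random


def poisson_sample(rate, rng):
    """Draw one Poisson(rate) count using Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_months(rate, months, rng):
    """Monthly crime counts for a street with a constant underlying rate."""
    return [poisson_sample(rate, rng) for _ in range(months)]


rng = random.Random(2010)                 # arbitrary seed, for repeatability
counts = simulate_months(3.0, 24, rng)    # assumed rate: 3 crimes a month
print(counts)
print("lowest month:", min(counts), "highest month:", max(counts))
```

Running this with different seeds typically shows some months with no crimes at all and others with six or more, even though nothing about the street has changed.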
There's a street near me that had 3 crimes in December. Suppose I look again the next month, and there are 6 crimes. Twice as many. Should I worry?
Well, if all I know are the figures, probably not. The point is that a figure of 3 crimes in a month is consistent with an underlying, long-term average crime rate of anything between about 1 crime a month and about 8.8 crimes a month (assuming that the crimes follow a Poisson distribution). That is, the 3 crimes in December might have arisen because the average rate is about 1 per month, and December was a bad month, or because the average rate is over 8 per month, and December was a good month, or perhaps because the average is 3 per month and December was an average month - or anything in between. If there are 6 crimes in January, it's quite feasible that the underlying rate hasn't changed, and hence that I shouldn't worry.
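A range like this can be computed as a confidence interval for a Poisson mean. The sketch below uses the standard "exact" 95% interval, found by bisection on the Poisson tail probabilities; that choice of method and the function names are my assumptions, since the calculation behind the figures above isn't spelled out. For a count of 3 it gives roughly 0.6 to 8.8, so the upper limit matches and the lower limit is a little below the "about 1" quoted above.

```python
import math


def poisson_cdf(k, mu):
    """P(X <= k) for X ~ Poisson(mu)."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)
    total = term
    for i in range(1, k + 1):
        term *= mu / i
        total += term
    return total


def solve_increasing(f, target, lo, hi, iterations=100):
    """Bisection for f(mu) = target, where f increases with mu."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_poisson_interval(k, confidence=0.95):
    """Exact interval for the underlying mean, given one observed count k."""
    alpha = 1 - confidence
    search_max = 10.0 * (k + 5)   # generous search bound, fine for small counts
    if k == 0:
        lower = 0.0
    else:
        # Lower limit: the mean at which P(X >= k) equals alpha / 2.
        lower = solve_increasing(
            lambda mu: 1 - poisson_cdf(k - 1, mu), alpha / 2, 0.0, search_max)
    # Upper limit: the mean at which P(X <= k) equals alpha / 2.
    upper = solve_increasing(
        lambda mu: 1 - poisson_cdf(k, mu), 1 - alpha / 2, 0.0, search_max)
    return lower, upper


lower, upper = exact_poisson_interval(3)
print(f"Observed 3 crimes: underlying monthly rate between "
      f"{lower:.1f} and {upper:.1f}")
```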
In fact, allowing for the fact that the January crime count is also subject to random variation, there wouldn't be clear-cut statistical evidence of an increase in the underlying crime rate in that street unless the January count was 12 or bigger. A change from 3 crimes one month to 12 the next month does sound absolutely huge, but that's how big it has to be for you to be confident that it goes beyond the inherent variability of the Poisson distribution.
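One standard way to compare two Poisson counts is a conditional test: if the underlying rate is the same in both months, then, given the total, the second month's count behaves like a Binomial(total, 1/2) variable. The sketch below applies that test at the two-sided 5% level. This is my assumption about a reasonable method, not necessarily the calculation used above, though it does reproduce the threshold of 12; the function names are invented for the example.

```python
from math import comb


def two_sided_p(first, second):
    """Conditional binomial test that two Poisson counts share one rate."""
    total = first + second
    larger = max(first, second)
    # Chance that one month takes at least `larger` of the total by luck alone.
    tail = sum(comb(total, i) for i in range(larger, total + 1)) / 2 ** total
    return min(1.0, 2 * tail)


def smallest_significant_count(first, alpha=0.05):
    """Smallest later count that is a significant increase over `first`."""
    second = first + 1
    while two_sided_p(first, second) >= alpha:
        second += 1
    return second


print("p-value for 3 then 6 crimes:", round(two_sided_p(3, 6), 3))
print("January count needed after 3 in December:",
      smallest_significant_count(3))
```

Going from 3 crimes to 6 gives a p-value of about 0.5, nowhere near conventional significance, while the smallest January count that clears the 5% bar is 12.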
Given all these flaws and problems, would it be better if the maps (and the underlying data) weren't published at all, as the Guardian article seems to suggest? I don't think so. The sheer volume of traffic, enough to crash the website on its first day, demonstrates just how interested people are in this kind of information. On the whole, I'd rather that some statistical information were made available, even if it is flawed and not always easy to interpret, than none. (Though, unhelpfully, it appears that some police forces have used the appearance of the new site to stop publishing more detailed crime information that was previously available.)
Let's hope some of the problems with the new website can be ironed out. There is also a hope that others will use the underlying crime data, downloadable from www.police.uk/data, to produce different maps, displays or reports. That has certainly happened in the USA - see for example maps for San Francisco here (my favourite), here or here, all of which get their data from the San Francisco Police Department. The underlying data available in the US are, however, considerably more detailed than the current English and Welsh version, but maybe that will change too. Given concerns about privacy here, though, I don't think we'll ever get anything quite as detailed as in the US.
In short, then, I welcome this, but, in the new crime mapping world, be careful out there...