Another Meaningless Graphic: Another Meaningless ‘Fact’

Have you ever seen one of these? A classic example of an attempt to bamboozle you with utterly meaningless data.

This is from a website that, amongst many other things, promises to “outpace disruption”. Does anyone know what that means? Anyhow, here is the result of outpacing disruption.

A meaningless bar chart

This was all there was. There was no information giving context. Still, positive numbers must mean it is a wonderful investment. You can hardly fail to make a bundle.

Are you ready to part with your money yet? No? How about if you knew this dazzling fact: what if I were to tell you that this product increases checkout speed (e-commerce) by 24%? Impressed yet?

Or perhaps, after you read the first posts on The Problem With Data you were asking things like, a 24% increase over what? How many? What period? Which currency? What language? How measured? Credit/Debit card? PayPal? Amazon Pay? Stored customer details? First-time transactions? Repeat transactions? Fibre broadband or 5meg FTTC, TCP to the residence? And on and on.
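To see why the "over what?" question matters, here is a minimal sketch in Python. Every number in it is invented for illustration; none of them come from the website in question. It also has to assume what "checkout speed" means (checkouts per unit of time), because the claim never says.

```python
def seconds_saved(baseline_seconds: float, speed_increase_pct: float) -> float:
    """Time saved per checkout if 'speed' rises by the given percentage.

    ASSUMPTION: 'checkout speed' means checkouts per unit of time, so a
    24% speed increase divides the time taken by 1.24. The claim itself
    gives no definition, which is rather the point.
    """
    new_seconds = baseline_seconds / (1 + speed_increase_pct / 100)
    return baseline_seconds - new_seconds


# HYPOTHETICAL baselines: a repeat purchase with stored card details
# versus a first-time order where everything is typed in by hand.
for label, baseline in [("repeat customer", 10.0), ("first-time customer", 180.0)]:
    saved = seconds_saved(baseline, 24.0)
    print(f"{label}: {baseline:.0f}s baseline, {saved:.1f}s saved")
```

With these made-up baselines the same 24% is worth about two seconds in one case and about 35 seconds in the other. Without the baseline, the percentage tells you nothing about which of those you are buying.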

 

 

Type 3 data in action. The Guardian is at it again.

The purpose of this blog is to get behind the data stories we encounter. Understandably, most commercial data is sensitive and remains unpublished. This means I have to rely on publicly available mangling of the data to illustrate the points.

The article of 11th October 2018 carries the snappy title, “Profits slide at big six energy firms as 1.4m customers switch” (The 3 types of data are explained here)

I will stick to the problems with data and not make this a critique of the article for its weaknesses alone. That would just be churlish. Read the following and imagine being presented with a document like this and having to judge its worth as something to base your decision-making on.

This article exemplifies Type 3 data so very well! It appears that the journalist has started with an idea and then worked backwards to mangle what Type 1 data they have to fit the idea they want to transmit to the reader. To be clear: this post is not written as an opinion piece about the Guardian, but as a critique of an article purporting to use Type 1 data to support the ‘Sliding Profits’ hypothesis.

Before we go any further the Golden Rule of data has been broken. You simply mustn’t decide the answer, and then try to manipulate, mangle and torture the data to fit your conclusion. You must be led by the data, not the other way round. It is fine to start with a hypothesis and then test the data to see if that is true. It is a major credibility red flag when the conclusion is actually the initially assumed answer.

Red Flag

If the article is apparently a business article it is rather worrying when the journalist obviously doesn’t know the difference between profit margins and profit¹. These are two distinctly different ideas yet they are used interchangeably in the piece. Red flag number two (if the first wasn’t enough). Paragraph five manages to combine the margins of two companies with the profits of another and then – completely randomly – plugs in (excuse the pun) an apparently random reference to a merger and the Competition Commission.
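The difference between the two is easy to show in a few lines of Python. The figures below are made up for the purpose and are not taken from the Guardian article or from Ofgem. The only thing assumed is the usual definition of margin as profit divided by revenue.

```python
def profit_margin_pct(profit: float, revenue: float) -> float:
    """Profit margin: profit expressed as a percentage of revenue."""
    return 100.0 * profit / revenue


# Two HYPOTHETICAL suppliers (figures in £m, invented for illustration).
suppliers = {
    "Supplier A": {"revenue": 10_000.0, "profit": 400.0},
    "Supplier B": {"revenue": 1_000.0, "profit": 100.0},
}

for name, figures in suppliers.items():
    margin = profit_margin_pct(figures["profit"], figures["revenue"])
    print(f"{name}: profit £{figures['profit']:.0f}m, margin {margin:.1f}%")
```

In this invented case Supplier A makes four times the profit of Supplier B on less than half the margin. A paragraph that quotes margins for some companies and profits for others can make either one look like the winner, so the two cannot be compared in the same breath.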

Terms like the ‘Big Six’ are used but nowhere does the author bother to say who the Big Six are. Whilst it is a moderately common term it cannot be assumed that everyone knows who they are. This is sloppy reportage and another Red Flag for the reader. Sloppy here, sloppy elsewhere. Who knows? This is back to the Type 3 issue of how it is presented to you. In this case, so far, very poorly.

The energy market regulator, Ofgem, is cited as the source for the first graphic. The Y (vertical) axis is numbered with no qualification, and the date and document that this is taken from aren’t mentioned. Type 1 data being mangled by the Type 3 data. Overall – poor sourcing and not worth the bother. You can dismiss graphics like this as you can reasonably assume it is a form of visual semiotic designed to elicit a feeling and not communicate any reliable Type 1 data to you. (Note the profits and profit margins even being conflated in the graphic title!)

Poor graphic designed to mislead – taken from the Guardian article.

 

The final critique is the one that speaks to the concept of Type 3 data. The language used in the article is such a blatant attempt to skew the article away from reportage about how the entry of challengers into the marketplace is affecting the profits, and profit margins, of the established players. I think the subsidiary point is about the fact that consumers aren’t switching suppliers as much as is expected. I had to read the article several times to distil those as the most likely objectives of the piece.

Finally, if you re-read the article and just look at the tone and, more specifically, the adjectives used you’ll be surprised. What I can’t work out is the author’s agenda. To just report such a muddle of data is one thing, but most popular press has an agenda of some kind.

NB: I really hope the Guardian doesn’t just keep gifting such poorly written articles. I think I may look at the coconut oil debate next!


Let’s break down today’s bad data usage – Yes Guardian newspaper, I mean you!

Overnight there was a report published in The Guardian newspaper entitled “Met police’s use of force jumps 79% in one year”. I see the hysteria on Twitter – being whipped up and added to by the usual suspects who revel in the dog-whistle approach to political discourse – about the use of force by the Metropolitan Police being used disproportionately against black people.

“The Metropolitan police’s use of force has risen sharply in the last year, with black people far more likely to be subjected to such tactics than anyone else, the Guardian can reveal.”

Firstly: this is not an attempt to take sides. The police may be guilty of the accusation. Without correct and fair analysis of data it is impossible to tell. See a previous post about how to approach stories like this.

Secondly: the purpose of this article is to interrogate the findings of the Guardian’s reporting of this story. If this undermines the story then so be it. Do not conflate that with an endorsement of the police in London, for I do not know enough to comment about them. This is about the use of data.

The main thrust of the article is the “79% in a year” claim. It is what has been seized upon and retweeted with vigour. Nowhere does it appear that the people getting all worked up over this selective quote have actually looked into the data.

“On 39% of occasions in which force was used by Met officers in the first five months of the financial year, it was used on black people, who constitute approximately 13% of London’s population.”

The first thing that struck me about this piece was the language and imagery used. Whilst the language is not the data, the way it is used certainly serves to alert you to the fact that they may be glossing over details in the pursuit of shock value. The Guardian is (was?) a credible broadsheet with a left-of-centre bias. Nevertheless, now they are giving away their content for free, they seem to be leaning towards the ‘clickbaity’ style of reportage, and that is a pity. Look at the graphic they have used. It is fairly emotive stuff. A white man pointing a weapon at you.

taser front view

In the first paragraphs the article uses words and phrases like, “jumps, risen sharply, most likely, on average, approximately, raised alarm, receiving end, stark figures, police culture” and the like.

These are written efforts to engage the enraged-response (metaphorically speaking) part of our brains rather than the rational-analysis part: the System 1 reaction and not System 2, as Daniel Kahneman calls them in his book Thinking, Fast and Slow.

Arguably, the alleged disproportionate use of force by police officers against black people is so serious an allegation that it warrants slowing down, taking a deep breath and analysing correctly?

Let’s break down the critical analysis a bit by asking some questions, and making some observations.

  • The only reference to the data used is “Guardian analysis of official figures”. This alone should set the loudest alarms ringing in your head and put you into sceptical analysis mode. What figures, analysed by whom, what is their expertise, what were the controls used, compared to what, is there (perish the thought that a journalist is anything other than scrupulously impartial) an agenda on the part of the presenter of these figures? [I think I may have found the data being used. See the bottom of the article for links. It certainly isn’t acknowledged in the article. This might lead a cynic to wonder if it may be being taken out of context and the journalists don’t want this easily checked for fear of undermining their credibility.]

 

  • Many people think of the word black being interchangeable (perhaps incorrectly) with people of colour. It turns out that the Guardian even mentions that ‘Asians’ and ‘Other’ are not part of this classification.

 

  • Are the figures generated from the ethnicity initially recorded by the officer using the police codes for radio use, or from the self-defined ethnicity codes used by the subjects (16+1 versus 9), even if they differ from the officer’s assessment? It doesn’t say.

 

  • In the last paragraph of the piece the most convoluted attempt at figures is used;  we witness the groups ‘Asian’ and ‘Other’ being rolled together to make a ‘52%’ claim sound more shocking. They need to decide how they portray things and stick to it. Previously the paper excluded ‘Asian’ and ‘Other’ from the ‘Black’ category and instead let them sit outside along with ‘White’ in order to use the five month and 79% figure on which the outrage is based.

 

  • There is no indication if these figures are split between reactive (responding to calls from the public), or proactive (the officers see something that they decide to investigate further). Proactive interventions are carefully considered by officers; they rarely steam in like you see in the movies. They weigh up things like back-up availability, whether they are single-crewed (and far more vulnerable), priorities like previous calls, outstanding paperwork (yes, really, there is a lot), their caseload and so on. Proactive policing is where a racist would shine as they would be able to target black people if that was their aim. From there they would need to engage and at least claim a veneer of credibility for their choice to use force. That wouldn’t last long as everyone would need to be in on it. These days that is very difficult.

 

  • Deborah Coles from the charity Inquest is reported as saying: “This also provides yet more evidence about the overpolicing and criminalisation of people from black and minority communities. It begs important questions about structural racism and how this is embedded in policing practices.” – From other remarks in the article, when the Metropolitan Police were approached and asked for their view, it sounds like after losing some 20k officers the police are rarely proactive, mostly reactive. If only they had the time and resources to ‘overpolice’ anywhere.

 

  • What if, and I am trying to steer away from political and social commentary here for it is not my intention, the police respond to more incidents in places where there is a greater proportion of black people? If X amount of interactions involve use of force then it stands to reason that the use of force against blacks is more likely. There is no doubt there is historical antipathy towards the police amongst much of the black community, especially in London. Previous generations of the Met (and other forces) have not been known for their even-handed approach towards the black community. Young men (for it is predominantly males) in groups often feel that their masculinity is being challenged if an authority figure like a police officer lawfully requires them to do something. What if this leads to more physical resistance which in turn leads to force having to be used? What if the white people, the Asians and the Others are more compliant when dealing with the police? What if, what if, what if? The fact is that these figures do not seem to be presented in a holistic manner. By that I mean controlling for variables such as age, gender, location, time, weather, changed police priorities, changed dynamics of interaction due to cuts in resources and so on.

 

  • The phrase ‘use of force’ is misused by the journalists and politicians. The police use a very specific definition, and it is not what the ordinary person thinks it may be. A voluntary handcuffing is a use of force. You know, the kind where the officer says something like, “for my own protection I am going to handcuff you,” and the subject complies. Perhaps a single-crewed female arresting a large male and having to drive him to custody herself. Merely drawing Captor (CS) spray needs to be recorded as a use of force. No one was sprayed, situation calmed down. Same as the drawing of a baton. Force is also shooting someone dead. There is a wide definition of force. Force, in police recording terms, does not necessarily mean taking the suspect to the ground in a violent bundle.

 

  • The whole method of recording has changed, a fact the paper skips neatly over. Too complicated to explain I imagine. The simple fact is that comparing these new figures generated and recorded one way with the past where they were not recorded in the same way, if at all, is simply invalid. It is far too soon to tell.

 

  • The politician, David Lammy MP, famous for trying to whip up stories like this to create indignation – I say this merely because he is a public figure who regularly tortures data or chooses to use tortured data – betrays a lack of understanding when he talks about the criminal justice system and the police. The police in London are merely one small part of this national system. Saying there is systemic racism at each stage of the system in a piece targeted at the police in London does smack of trying to score wider points and is not, in my opinion, worthy of inclusion. It weakens any point trying to be made. It is good to have Lammy on board for a bit more clickbait type appeal though. He has a large Twitter following and retweeted the article almost immediately. Surely not because he is mentioned in it.

 

  • As Matt Twist of the Met Police said, the figures should not be compared with population demographics. He said: “The collation of these figures is still in its early stages, and as this is new data, there are no previous benchmarks to compare it to. Therefore any conclusions drawn from them must be carefully looked at against this context, and should only be compared with those individuals who have had contact with officers, rather than the entire demographic of London.” You may think he is a police stooge but it does not make his statement incorrect.

 

  • The paper even says it is comparing FY 2017/18 to FY 2018/19. This means April 6th 2017 to April 5th 2018, and similarly for 18/19. This is important because the new recording system was introduced from April 2017. The data being quoted is April to August 2017. It is being compared with April to August 2018. What happened to the seven months in between? Does this show a steady rise instead of a jump? Has anything else changed in this time? For example: the new system may not have started well and items may have been overlooked, or officer engagement was not what it should be, resulting in pressure from Borough Commanders down to record more accurately, leading to an apparent jump in incidents when it is actually a rise in adherence. Just ignoring a seven month gap is concerning. Why? An oversight or intentional?
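Two of the questions in the list above can be made concrete with a short Python sketch. The first function compares a group's share of incidents with its share of a chosen base, which is the calculation behind the 39% versus 13% quote. The second shows how a change in recording adherence alone could produce an apparent jump. Apart from the 39% and 13% quoted in the article, every number here is hypothetical, including the 30% "share of police contacts" and both recording rates. I do not have the real ones.

```python
def disparity_ratio(share_of_incidents: float, share_of_base: float) -> float:
    """How over-represented a group is relative to a chosen base.

    A ratio of 1.0 means the group appears in incidents exactly in
    proportion to its share of the base.
    """
    return share_of_incidents / share_of_base


def apparent_change_pct(true_incidents: float,
                        recorded_rate_before: float,
                        recorded_rate_after: float) -> float:
    """Percentage change in RECORDED incidents when the true number is
    constant and only the fraction that gets recorded changes."""
    before = true_incidents * recorded_rate_before
    after = true_incidents * recorded_rate_after
    return 100.0 * (after - before) / before


# 39% of uses of force vs 13% of London's population (both from the article).
print(f"vs population: {disparity_ratio(0.39, 0.13):.1f}x")

# Same 39%, but against a HYPOTHETICAL 30% share of police contacts.
print(f"vs contacts:   {disparity_ratio(0.39, 0.30):.1f}x")

# HYPOTHETICAL: identical true incidents, recording improves from 50% to 90%.
print(f"apparent rise: {apparent_change_pct(10_000, 0.50, 0.90):.0f}%")
```

With the population as the base the ratio is about 3. With the made-up contact share it falls to about 1.3. Neither is "the" answer, which is the point Matt Twist was making: the choice of denominator drives the conclusion. Likewise, a recording rate going from a half to nine-tenths would show up as an 80% rise with no change on the street at all.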

 

Images are worth a thousand words. Misleading images are still far more impactful than poor descriptions. I reproduce this because it is a howler of a poor and misleading graphic. The article is using the Financial Year for measurement, hence the mention of 2019 whilst we are in 2018. Laying out the same images of London side-by-side implies that a comparison is about to be demonstrated. However, the left hand image uses Westminster and mentions five other boroughs, none of which are referenced in the right hand image. This makes the image of questionable value, other than adding to the devaluation of credibility. The source attribution should say that this is where the data was sourced, not that it was presented in this way. It rather implies that this graphic is from the Met Police. It isn’t.

I think this was rather twisted to produce the graphics: Met Police Use of Force information.

howler of a bad graphic

Interestingly, the Met Police data gives this caveat, albeit buried on the third tab of their use of force stats page and not linkable by a URL. It explains areas where the data may be misinterpreted. The journalists don’t bother to tell us if they have taken this into account or not. We’ll never know.

Coversheet

So you can see that this article – and many stories on many topics – is riddled with inconsistencies. To me, I just dismiss it because it hasn’t got some of the basics right. It may be speaking a degree of truth, but that truth is devalued by the poor presentation.

Data is fine. Data is useful. Data is just digital exhaust. Data without context is just numbers and means nothing.

As I said previously, “Data only helps you take a problem apart and understand its pieces. It is not suited to put them back together again. The tool needed for that is the brain.”

Try to dissect stories quoting numbers. Be they in the press or someone making a commercial claim in order to influence your actions.

 

Here are some likely data sources for the story, and for you to use when reading these types of stories that use numbers to give credibility to their assertions:

UK Government crime statistics

Metropolitan Police data

Office for National Statistics

UK Data Service video

 

PS: Anecdotally: I have known many types of officer. From a 6’3″ tall Senegalese immigrant, who started as a PCSO in London and is now policing rural Oxfordshire, to short white Reading natives, born and bred, who police there. They vary in their attitudes and actions because they are people. The huge majority want to make their communities better places. I have no doubt that amongst them there are a few racist thugs, albeit a tiny and ever-decreasing number. A bit like regular people I suppose.

 

PPS: There may be typos. I try very hard to proofread. I am a terrible typist though.