Red Flags & Sacred Cows

Here follows a cautionary tale. I name the culprit not because I have an axe to grind, or because it is particularly unusual, but because it suits the example being made.

To repeat other posts on here: when someone starts quoting facts and figures at you and citing studies, it is entirely reasonable – and very sensible – to ask some probing questions. The figures are usually being used to sell you something: an idea, credibility, services that the provider of the figures can also come and deliver (at a price, naturally), or just support for their existing position on a topic.

This is made much more challenging when very emotive topics are being commented on. Race, Gender, Diversity and Inclusion are today’s Sacred Cows. These topics seem to make many people uncomfortable, even as they try to appear just fine with them. They often deal with this by saying nothing, thereby keeping their heads below the parapet. An unintended consequence is that, for lack of enquiry, statements about the Sacred Cow go unchallenged.

Twenty years ago there were few, if any, consultancies offering to help companies address the issues that can arise from various forms of discrimination. Many seem to think that, because they position themselves as experts in the field, they are beyond reasonable criticism and examination. Please can someone help me understand why that elevates them beyond reasonable scrutiny?

A big problem with Sacred Cow topics is that any criticism of anything to do with them – in this case, the use or misuse of data – is treated as tantamount to undermining their very raison d’être. It isn’t at all; it is all about the data. Data doesn’t care about any of these issues. Conflating the two looks like a tactic to draw the eye away from the data and shame you into dropping the questions.

Where you should have a problem is when data is used to misrepresent issues. Whether intentional or not, the mishandling of data can make problems appear very different from what they actually are. A simple example is in the analysis of raw data: if certain variables are not measured during collection and then controlled for during the analysis, the results can mislead. Another is when data collected in one specific area produces results that are then remarked upon and treated as a general finding, with no qualifications added to them.
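To show how an unmeasured variable can flip a conclusion, here is a minimal sketch in Python. The counts are entirely made up for illustration, and the group and department names are hypothetical. Within each department group A does better, yet when the departments are pooled group B appears to do better, simply because the two groups applied to the departments in very different proportions.

```python
# Made-up counts for illustration only: (successes, applicants)
# keyed by (group, department). None of this is real data.
records = {
    ("A", "dept_x"): (81, 87),
    ("A", "dept_y"): (192, 263),
    ("B", "dept_x"): (234, 270),
    ("B", "dept_y"): (55, 80),
}


def department_rate(group, dept):
    """Success rate for one group within one department."""
    successes, applicants = records[(group, dept)]
    return successes / applicants


def overall_rate(group):
    """Success rate for a group with the departments pooled together."""
    successes = sum(s for (g, _dept), (s, _n) in records.items() if g == group)
    applicants = sum(n for (g, _dept), (_s, n) in records.items() if g == group)
    return successes / applicants


# Controlled for department: compare like with like.
for dept in ("dept_x", "dept_y"):
    print(dept, round(department_rate("A", dept), 3), round(department_rate("B", dept), 3))

# Not controlled for department: the ordering reverses.
print("pooled", round(overall_rate("A"), 3), round(overall_rate("B"), 3))
```

If the department had never been recorded, only the pooled figures would exist, and nobody could tell that the headline comparison points the wrong way.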

Back to the Red Flags though. The fact that it is a sensitive topic should not prevent you from asking about the provenance of the data. If someone clasps their hand to their mouth and asks how you could possibly question a respected pillar of the industry (sometimes an author, etc.), then remind them about speaking truth to power.

Recently, I saw a post on LinkedIn from one of the founders of Pearn Kandola LLP, which read:

“A third (32%) of people who have witnessed racism at work take no action, and a shocking two-fifths (39%) of those said that this was because they feared the consequences of doing so*. If our workplaces are to become genuine places of safety, it’s vital that the government acts quickly to curb the use of NDAs to hide instances of harassment, whether it be racist, sexist or otherwise. #RacismAtWork #UnconsciousBias

*According to our own research at Pearn Kandola LLP”

All well and good on the face of it. There is nothing wrong with citing your own research, provided you can back it up. I was interested to learn more, so I asked whether the research was published, what the sample size was, and where and when the data was collected. There has been no reply. Judging by the comments, many have accepted the claim without criticism or interrogation, a worrying indication of a lack of critical thinking.

Another thing that should raise a little red flag in your mind when data is being reported is the use of words like ‘shocking’. I can only imagine this is to try to increase click-through. It detracts from the data and sounds more like a Daily Express ‘weather armageddon’ type headline.
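Why ask about the sample size? A little arithmetic shows it. The sketch below takes the two percentages from the quoted post and nothing else. The numbers of witnesses surveyed are hypothetical, since the real figure is not known, and the margin of error uses the textbook normal approximation for a proportion, which assumes a simple random sample.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion p observed in n people.

    Normal approximation; assumes a simple random sample.
    """
    return z * math.sqrt(p * (1 - p) / n)


no_action = 0.32  # share of witnesses who took no action (from the quoted post)
feared = 0.39     # share of that no-action group citing fear (from the quoted post)

# The 39% applies to the 32%, so as a share of all witnesses it is much smaller.
share_of_all = no_action * feared
print(f"Share of all witnesses: {share_of_all:.1%}")

# Hypothetical survey sizes: the real sample size has not been published.
for witnesses in (100, 1000, 10000):
    subgroup = round(witnesses * no_action)
    moe = margin_of_error(feared, subgroup)
    print(f"{witnesses} witnesses -> {subgroup} took no action -> 39% +/- {moe:.1%}")
```

With 100 witnesses the 39% rests on only 32 people, and the uncertainty is roughly 17 percentage points either way. The same headline number means something quite different at each sample size, which is exactly why the question deserves an answer.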

If the data is robust they ought to be delighted to publish it and open it up to examination. After all, if it is robust enough to underpin public claims, there is no reason why it should not be open to examination by a third party.

To question data means that you are thinking. Whatever the topic, there should be no Sacred Cows, especially not the data.

AI, ML & DL – A Bluffer’s Guide

AI, ML and DL are our attempts to get machines to think and learn in the way that we can. Get that right and you’ll take the power of the human, multiplied a million-fold, and have a breathtakingly capable machine. Probably our new robot overlords, but we’ll cover that later.

Whilst I do not have any issue with these developments, and do believe they are both attainable and useful, we are not there yet. To date we have incredibly fast calculators that are essentially linear and binary. These are our modern computers. There are boffins in labs developing non-linear and non-binary counting machines but they are not here yet. This means that we are left with the brute force approach to problem solving. Run the right algorithm (at least to start, it is provided by a human) and you can get the giant calculator to supply an answer, often the correct one. If not, it can learn from its mistakes, rewrite the algorithm and try again. (By the way: that is ML/DL in a nutshell.)

Here is a definition of ML: Machine learning is the study of algorithms and mathematical models that computer systems use to progressively improve their performance on a specific task.

That’s it. It is a computer learning to improve and tweak its algorithm, based on trial and error. Just like we learn things. No difference.

Here is a definition for AI: Artificial intelligence, sometimes called machine intelligence, is intelligence demonstrated by machines, in contrast to the natural intelligence displayed by humans and other animals.

However, AI is where things can really come unstuck. The aim is to get machines to think as we do, in a non-linear way. Human beings deal exceptionally well with ambiguity, and we have an ability to match things up, like apparently different words and images. Have you ever been transported back in time, in an instant, by a song clip or a smell? That is human; no one taught you to do that.

A computer could conceivably do that, but only if it had previously been instructed to do so. It can do it so very fast that you would be forgiven for thinking it was natural. It is not though; it is programmed to do it. Sure, it might have learnt to improve its own algorithm (Machine Learning again) based on observations of human behaviour. It is still just mimicking what it sees as the appropriate behaviour. There has never been that spontaneous connection that you experienced, the one that transported you to another time and place, even fleetingly.

A recent high-profile example of AI and ML going a little bit awry and showing bias is in this article: “Amazon Reportedly Killed an AI Recruitment System Because It Couldn’t Stop the Tool from Discriminating Against Women”. It is well worth watching the video and understanding the unconscious bias exhibited by the builders of the algorithms. There are efforts to remove the human biases that the machines learn from and perpetuate.

But what is Deep Learning, I hear you cry? It can simply be differentiated from Machine Learning as the point at which the need for a human being to categorise all the different data inputs is eliminated. Now the machine (still only the really fast calculator) does the categorising itself. Think self-driving cars, drones and many more much duller things. Presently, we humans still need to be involved in the categorisation. There is even a Data Labelling factory in China that uses humans to ‘teach’ machines what it is that they are seeing.

Equitable, Just, Neutral and Fair are components of moral behaviour that reside in the interpretation of present societal norms, and not everyone agrees with them. Different cultures can have quite different views on a correct moral choice. Remember this when someone is trying to argue about the infallibility of computers. They can only be programmed with lagging data and they will always reflect us and our biases. For better or worse.
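To make the point about lagging data concrete, here is a deliberately tiny sketch in Python. It is a toy, not a description of how any real recruitment system works. The historical decisions are made up and deliberately skewed, and the group labels, the `train` function and the 0.5 threshold are all assumptions for illustration. The ‘model’ does nothing cleverer than learn how often each group was hired in the past, which is enough to see the problem.

```python
from collections import defaultdict

# Made-up historical decisions as (group, hired) pairs.
# The skew is deliberate: past humans hired group "m" far more often.
history = (
    [("m", True)] * 80 + [("m", False)] * 20
    + [("f", True)] * 30 + [("f", False)] * 70
)


def train(past_decisions):
    """'Learn' one score per group: the fraction hired in the past."""
    hired = defaultdict(int)
    seen = defaultdict(int)
    for group, was_hired in past_decisions:
        seen[group] += 1
        if was_hired:
            hired[group] += 1
    return {group: hired[group] / seen[group] for group in seen}


model = train(history)


def recommend(group, threshold=0.5):
    """Recommend a candidate if their group's learned score clears the threshold."""
    return model[group] >= threshold


for group in ("m", "f"):
    print(group, model[group], recommend(group))
```

Nobody wrote a rule saying ‘prefer one group’. The preference came in with the history and the machine faithfully handed it back, only faster.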

Data Ethics For Business

We exist in an increasingly data driven world. More and more, we are encouraged or directed to ‘listen to the data’ above all else. After all, the data doesn’t lie. Does it?

Data Ethics in business is the name of the practice used to ensure that the data being used to make high-value commercial decisions is of the highest quality possible. However, there is a catch. Human beings are the catch. We have gut instinct, prejudices, experience, belief systems, conditioning, ego, expectation, deceit, vested interests, etc. These behavioural biases all stand to cloud the data story, and usually do.

A high-value commercial decision does not necessarily have immediate financial consequences, although in commercial terms a sub-optimal outcome is invariably linked with financial loss. In the first instance, the immediate effects of a high-value decision can be on organisational morale or reputation.

When a high-value decision is to be made there are invariably advocates and detractors. Both camps like to believe that they are acting in the service of a cause greater than themselves. Occasionally, some of the actors cloud the story because their self-interest is what really matters to them, and they try hard to mask that with the veneer of the greater good. Hence the term ‘Data Story’: behind the bare numbers and pretty graphics there is an entire story.

The concept of conducting a pre-mortem examination of the entire data story to model what can go wrong is becoming more important for senior decision makers. It is getting increasingly difficult to use the traditional internally appointed devil’s advocate as, due to the inherent complexity of understanding a data story, this function needs to be performed by subject matter experts. Although the responsibility for decision-making always falls on the Senior Management, they want to do it with a full breakdown of the many facets of the data story.

In order to achieve this, individuals with a unique blend of talents, experience and inquisitiveness must be used: people with absolute objectivity and discretion, who don’t rely on inductive reasoning; people who are robust enough to operate independently, diplomatically and discreetly, and who have executive backing to interrogate all the data sources, ask the difficult questions and highlight any gaps, inconsistencies or irregularities. From this they can provide a report for the Executive Sponsor(s) with questions to ask and inquiries to make, so that a well-informed decision can be made.

After all, when there is a lot at stake, no one wants to be remembered as the person who screwed up and tried to blame the data, do they?