Being a “Data Something or Other” seems to be the new thing. These are the neophytes who can often be really smart but uninhibited by experience. I have been away for five years in academia and, unsurprisingly, little has changed. This article is a discussion piece, not about the management of data in an organisation, a topic far beyond my expertise. Instead, it is about the interpretation of data.
Having worked in the IT and consulting space since the early 90s, I can reflect on some of the trends that were – in the breathy and slightly conspiratorial tones used by the advocates, as if they were sharing the code to Fort Knox with you and you alone – a ‘new paradigm’. Some of the many terms I recall are: Dotcom Boom, Knowledge Management, CRM, eBusiness, eCRM, ERP, Web 2.0, Network Computing, Distributed Computing, Big Data and the like. The challenges in business remain the same: the ever-delicate balance of growing profit whilst increasing operational efficiency.
It is a truism to say that everything changes but everything stays the same. Moore’s Law ensures that the power of computers and what we can do with them remains incredible, frightening to some and ever changing. Take data: this is a gnarly topic indeed. Data this, data that, chop it, model it, torture it, misuse it and then base critical choices on it. Critical choices that can make you or cost you money.
The key thing to remember is this:
Data science is not the arbiter of truth. We need to translate it in a much broader societal context, and when we do so, we start to understand that data only helps you take a problem apart and understand its pieces.
It is not suited to putting them back together again. The tool needed for that is the brain.
I think of data as the digital exhaust (Hristova et al) of something – a company, a market, a society. It was always there but only relatively recently have we managed to combine capture and analysis with such powerful computing abilities. We are like kids with a new toy, and many people seem to be seduced by the clever tech. Yes, it is really cool. It isn’t the new reality though. It is a new way of putting a lens to the existing reality.
With a knowledge of what has happened, we want to know what will happen. Predicting the future IS indeed the Holy Grail, I get it! Whilst future prediction works well for closed systems, it isn’t so great when you put people into the mix. Understanding the people element is crucial. Tricia Wang talks eloquently about the human insights that are missing from Big Data. In fact, there are many excellent TED talks on data.
What troubles me is that they all tend to be siloed, or from a single perspective. Ethnographers look at everything through the lens of their ethnographic training. In commerce the same holds true for economists or psychologists. With rare exceptions, academia is the same. Practitioners in a single discipline view everything through the very particular lens of their area of expertise. I have seen experts ‘pooh-pooh’ any sort of inter-disciplinary approach, mostly because it would mean stepping out of their narrow, but very knowledge-rich, comfort zones.
I was incredibly fortunate to have two professors as supervisors, for my undergrad and postgrad work, who didn’t start their academic lives in the fields they ended up specialising in. That meant that they brought an incredible richness and diversity to their advice. Consequently, I looked at the tasks I had through a much wider variety of lenses, as I do to this day.
Data needs contextualising. That richness, indeed ‘thickness’, that Wang refers to is crucial. Understanding how and why people behave the way they do should not be determined by a psychologist or ethnographer alone, as each will tend to see every issue as a psychological or ethnographic one. It is natural; it is what they are good at. Having a historical, sociological and political perspective can only serve to make your analysis richer and thicker still.
‘The numbers’ are presented as the irrefutable everything, but how they are arrived at, not so much. How the data is collected is absolutely crucial. The process of research that generates the data is so often skipped over, because it is a very hard thing to do well. Without sound research methodology, all the rest of the output falls apart as it simply isn’t reliable.
Qualitative, quantitative, question style, face-to-face, postal, Internet, the recruiting pool, the profile of the respondents (age, gender, ethnicity, sexuality, religion, income etc.), the time period over which the research was conducted, the sample size, significance, p-values, Spearman’s rho, dummy variables, independent and dependent variables, the dropouts, researcher bias/influence, the question design – or was it merely observation by the researcher – consent, original intent of the research, transparency, analysis methods, software used, hypotheses, the null hypothesis, incentives, the original data set, and repeatability, and and and…
That is just the start of the researcher’s lot, but it ought to give you clues as to just how much data can be misinterpreted, misconstrued, misreported and misleadingly presented in the hands of non-experts.
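To make just one of those terms concrete: statistics such as Spearman’s rho are easy to quote and easy to misuse. Here is a minimal, purely illustrative sketch of how it is computed – the data is invented, and ties between ranks are ignored for simplicity:

```python
def ranks(values):
    """1-based ranks of values (assumes no tied values)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank_pos, idx in enumerate(order, start=1):
        r[idx] = rank_pos
    return r

def spearman_rho(x, y):
    """Spearman's rank correlation via the rank-difference formula
    (valid when there are no ties)."""
    n = len(x)
    rx, ry = ranks(x), ranks(y)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - (6 * d2) / (n * (n ** 2 - 1))

# Hypothetical survey data: hours of study vs. test score.
hours = [2, 4, 6, 8, 10]
scores = [55, 60, 70, 65, 90]
print(round(spearman_rho(hours, scores), 2))  # prints 0.9
```

The arithmetic is trivial; the point is that a single headline number like this silently encodes every methodological choice in the list above – who was sampled, how, and whether the correlation means anything at all.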
If you are relying on the use of data to back up your experience, judgement and a willingness to take risks (as I am sure Apple did when it launched its first iPhone) then you may want to consider that data is more nuanced than many imagine.
If you want to get the most from your data and make the best decisions, you need a lot of different people with different skills, or you need fewer people with a wider range of skills. However, there are not that many arch-generalists out there with specialist knowledge and experience. When you are deciding what to do with the latest bit of insight you have been presented with, you’d be well advised to seek them out.
With thanks to Rob Briner. The politician’s fallacy sums it up perfectly.