Big data analytics can help to predict how costs in an organisation are likely to change. It’s not hard to see how. Suppose the analysis predicts that a certain number of customers will shift, say, from one sales channel to another. If we know that number and the probability of the shift happening, we should be able to calculate the impact upon cost.
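The arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name, the customer numbers, the probability and the costs per transaction are all made-up assumptions, not figures from any real organisation.

```python
def expected_cost_change(customers, shift_probability, cost_from, cost_to):
    """Expected change in annual cost if customers move between channels.

    customers          -- customers who might shift (assumed figure)
    shift_probability  -- predicted probability that each one shifts
    cost_from, cost_to -- cost to serve one customer in the old and new channel
    """
    expected_shifters = customers * shift_probability
    return expected_shifters * (cost_to - cost_from)


# Illustrative numbers only: 10,000 customers, a 15% chance of each moving
# from a channel costing 4.00 per customer to one costing 0.50.
change = expected_cost_change(10_000, 0.15, cost_from=4.00, cost_to=0.50)
print(f"Expected change in cost: {change:,.2f}")
```

A negative result is a saving. The calculation is only as good as the two cost figures fed into it, which is where the trouble usually starts.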

But for many there is a problem. Businesses often have enough data to make the prediction, but they lack the cost data needed to demonstrate the gains or losses to be expected. The information you need could be dotted all over the place, with costs hidden deep within all-encompassing cost codes.

This makes decision-making difficult. Do you push ahead and make decisions based upon an incomplete set of cost data? Or do you hang back until data quality improves?

The answer is to push ahead wherever possible, but only after putting in some temporary but robust fixes for the data that’s missing.

At a recent data science event in the UK, the hosts presented results from work underway within four of the country’s major social housing providers. They showed, for example, that a resident who is in employment will, on average, demand fewer repairs to their property, and therefore cost less to serve, than someone who isn’t. This might be because of less wear and tear on the property.

Only if it’s what I want to hear

For those deciding whether to invest in helping people to find work, this is a useful result. Analytics using past data can predict how many more people might find a job. It’s now possible to compare the cost of the investment with reductions in the cost to serve.

A second result, however, showed that greater levels of digital inclusion (i.e. more residents online) push repairs costs up. Residents with an active email account (a useful marker for digital inclusion) will call, on average, for more repairs each year than those without. Reporting through email is probably easier.

For those providing digital inclusion services, this is less than helpful. If greater inclusion seems to push costs up, the value of those services may be called into question. Best to park the result in a drawer somewhere.

Although one result is perhaps more popular than the other, both pieces of analysis are flawed and for the same reason. Both rely upon the data that was available at the time rather than data which, had it been available, would have yielded a more robust result.

For example, in the latter case the count of repairs for people who were online was not a count of repairs requested by email. It was simply a count of repairs requested by people who have an email address on record. It could be that residents who phone regularly for repairs are more likely to be asked for an email address, as part of the effort to keep customer records up to date, than those who rarely call. The data would then make it appear that those with an email address call for more repairs, whether or not being online has any effect.
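A small worked example shows how this kind of recording bias could produce the result on its own. Everything here is an assumption made for illustration: two equal groups of residents, everyone equally likely to be online, and a 20% chance of being asked for an email address on each repair call.

```python
P_ASK = 0.2  # assumed chance of being asked for an email address on any one call

# Two hypothetical groups; being online plays no part in how often they call.
groups = [
    {"residents": 1000, "repairs_per_year": 1},
    {"residents": 1000, "repairs_per_year": 4},
]

with_email = {"residents": 0.0, "repairs": 0.0}
without_email = {"residents": 0.0, "repairs": 0.0}

for g in groups:
    # Chance that an email address ends up on record after this many calls.
    p_recorded = 1 - (1 - P_ASK) ** g["repairs_per_year"]
    recorded = g["residents"] * p_recorded
    missing = g["residents"] - recorded

    with_email["residents"] += recorded
    with_email["repairs"] += recorded * g["repairs_per_year"]
    without_email["residents"] += missing
    without_email["repairs"] += missing * g["repairs_per_year"]

avg_with = with_email["repairs"] / with_email["residents"]
avg_without = without_email["repairs"] / without_email["residents"]

print(f"Average repairs, email on record: {avg_with:.2f}")
print(f"Average repairs, no email on record: {avg_without:.2f}")
```

Under these assumptions, residents with an email address on record show a higher average number of repairs, even though email plays no part in how often anyone calls. The gap comes entirely from how the address was collected.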

Both analyses were also limited to repairs data. The effects on other parts of the business were not considered because the relevant data wasn’t available. For example, people who are employed may lose entitlement to benefits and therefore be at greater risk of falling into arrears.

It helps to understand activity

In order to push ahead and make more robust decisions, these shortcomings in the data first need to be addressed. But that doesn’t mean waiting for systems and processes to catch up. By the time that happens the world will have moved on.

Working together, we can build a simple activity catalogue that spans the whole business and populate it with estimates of activity times and frequencies. It has to be put together with care to ensure the integrity of the data captured (the author can provide some pointers on how to get this right). Once built, it will provide very useful information for a whole range of purposes, for example:

  • Names and locations for all the key activities across the business;
  • Names for the main drivers for each activity (making it clear for example which would be affected by customers shifting channels);
  • The approximate cost of those activities today;
  • The number of times those activities happen today.

Armed with this data, it becomes easier to include analysis that might otherwise not have been possible, e.g. the impact of greater digital inclusion upon call volumes into the Contact Centre, or the greater ease with which customers can book appointments.
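As a rough sketch of what such a catalogue might look like in practice, the Python below holds a handful of activities with their location, driver, estimated time and frequency, and uses them to cost a channel shift. The activity names, times, volumes and hourly rate are hypothetical; a real catalogue would be populated from estimates gathered across the business.

```python
from dataclasses import dataclass


@dataclass
class Activity:
    name: str
    location: str             # where in the business the activity happens
    driver: str               # what causes the activity to happen
    minutes_per_event: float  # estimated, not measured
    events_per_year: int      # estimated frequency
    cost_per_hour: float      # assumed staff cost

    @property
    def unit_cost(self) -> float:
        return self.minutes_per_event / 60 * self.cost_per_hour

    @property
    def annual_cost(self) -> float:
        return self.unit_cost * self.events_per_year


# Hypothetical entries for illustration only.
catalogue = [
    Activity("Log repair request by phone", "Contact Centre",
             "phone repair requests", 6.0, 40_000, 20.0),
    Activity("Process online repair request", "Contact Centre",
             "online repair requests", 1.5, 10_000, 20.0),
    Activity("Book repair appointment", "Repairs",
             "repair requests", 4.0, 50_000, 20.0),
]

total_cost = sum(a.annual_cost for a in catalogue)

# What if 5,000 requests a year moved from phone to online?
by_driver = {a.driver: a for a in catalogue}
phone = by_driver["phone repair requests"]
online = by_driver["online repair requests"]
shifted = 5_000
saving = shifted * (phone.unit_cost - online.unit_cost)

print(f"Estimated annual cost of catalogued activities: {total_cost:,.0f}")
print(f"Estimated saving from shifting {shifted:,} requests online: {saving:,.0f}")
```

Because each activity carries its driver, it is clear which lines move when customers change channel and which (here, booking the appointment) do not.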

There is of course a compromise. Decisions will be based partly upon estimates of costs for activities rather than actual postings of costs through account codes and cost centres.

But it seems to be far better to make decisions now that are based upon information from all relevant parts of the business than to either wait for systems to eventually provide the data needed or to make a decision when only part of the story is known.

Paul Clarke
Develin Consulting