Big data: size doesn't matter, it's what you do with it

"Big data is not actually about the data. The revolution is not that there’s more data available. The revolution is that we know what to do with it now. That’s really the amazing thing." It's a quote from Gary King, political scientist and head of the Institute for Quantitative Social Science. In full honesty, we tend to overuse it in presentations and conversations regarding the buzzwordization of the field of data science. However, it is the best representation of how we think about data at The Reference.
Big data – the revolution in research, computing, storage and networking that enables humanity to turn data into value

As I explained in one of my other articles, data science is the field of turning data into value, one of the biggest aspects of Big Data.

The ever-growing capacity of storage, networking and processing has enabled companies such as Facebook and Google to become what they are today. Furthermore, we have seen the rise of several new services that are all driven by an underlying stream of data: it has become the resource for many tools and services at the frontiers of innovation. One such service uses your explicit preferences to recommend movies and TV shows. How does it earn money? It (anonymously) sells your preferences to advertisers. ZipRecruiter leverages its data to improve its job matching algorithm, and Realo uses freely available data to determine housing prices.

Monetizing data

These pioneers have shown the world how to monetize data. Their advantage is that their organizational structure is lean and mean; they can pivot fast and can rely on an influx of talent and venture capital. However, nothing should stop incumbent companies from monetizing their data and rethinking their business. Some examples: self-driving cars (which are just data-processing machines on wheels) are becoming the main focus for General Motors, which, in partnership with Lyft, is shaping a business model best described as mobility-as-a-service. In layman's terms: "the Spotify of transportation". Or what about Philips' lighting-as-a-service? Maybe Thyssenkrupp's predictive maintenance?

Start small

A common misunderstanding is that you need to centralize all your data to get started. Another myth is that you need a lot of data: clickstream data, logs of service center conversations, every email opened and read, real-time prices of your organization and its competitors, third-party traffic and weather data, etc. In our jobs as data consultants, one piece of advice keeps returning: start small. The 'right' data is often better than 'a lot of' data. Simply connecting data (even spreadsheets) from two departments often yields new insights that contradict the assumptions of the status quo. While simple demographics like age, location and gender seem trivial, using them for predictive modeling can already produce good results in churn prediction, personalization and recommendation. If you have boring transaction data, why not run a clustering exercise to come up with some interesting segments or personae?
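To make the clustering suggestion concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the transaction log is invented, the two features (number of orders and average order value) are just one plausible way to summarize a customer, and the small k-means routine stands in for whatever clustering tool you would actually use.

```python
from collections import defaultdict
from math import dist

# Hypothetical transaction log: (customer_id, order_amount). Invented for illustration.
TRANSACTIONS = [
    ("c1", 12.0), ("c1", 15.0), ("c1", 9.0), ("c1", 14.0),
    ("c2", 250.0),
    ("c3", 11.0), ("c3", 13.0), ("c3", 10.0),
    ("c4", 300.0), ("c4", 280.0),
    ("c5", 8.0), ("c5", 12.0), ("c5", 9.0), ("c5", 11.0), ("c5", 10.0),
    ("c6", 270.0),
]


def customer_features(transactions):
    """Summarize each customer as (number of orders, average order value)."""
    amounts = defaultdict(list)
    for customer, amount in transactions:
        amounts[customer].append(amount)
    return {c: (len(a), sum(a) / len(a)) for c, a in amounts.items()}


def scale(features):
    """Min-max scale each feature to [0, 1] so neither one dominates the distance."""
    columns = list(zip(*features.values()))
    lows = [min(col) for col in columns]
    spans = [(max(col) - min(col)) or 1.0 for col in columns]
    return {
        c: tuple((v - lo) / span for v, lo, span in zip(values, lows, spans))
        for c, values in features.items()
    }


def kmeans(points, k, iterations=20):
    """Plain k-means; the first k customers (sorted by id) seed the centroids."""
    names = sorted(points)
    centroids = [points[n] for n in names[:k]]
    assignment = {}
    for _ in range(iterations):
        # Assign every customer to the nearest centroid.
        assignment = {
            n: min(range(k), key=lambda i: dist(points[n], centroids[i]))
            for n in names
        }
        # Move each centroid to the mean of its members.
        for i in range(k):
            members = [points[n] for n in names if assignment[n] == i]
            if members:
                centroids[i] = tuple(sum(col) / len(members) for col in zip(*members))
    return assignment


features = customer_features(TRANSACTIONS)
segments = kmeans(scale(features), k=2)
for customer in sorted(segments):
    print(customer, features[customer], "segment", segments[customer])
```

On this toy data the two segments separate frequent small-basket buyers from occasional big spenders. With real data you would likely reach for an existing library and try several values of k, but the principle stays the same: a handful of simple features per customer is enough to start a conversation about segments.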

Get started

You are at the steering wheel. You know your company, its stakeholders, its customers and its competitors best. However, The Reference can be your catalyst. At the bottom of this blog post you can download a selection from our data ideation cards. By sitting around the table with a couple of colleagues, drawing cards and discussing how they can apply to your company, you will be surprised by the ideas you can come up with.



Don't miss out

The Reference has its office in the heart of Manhattan.
“I want to wake up in that city that never sleeps, and find I'm king of the hill, top of the list, head of the heap” – Frank Sinatra