The Big and Small Of Big Data

June 29, 2015


It’s not often that weekend reads stay with you for longer than a few days. But Viktor Mayer-Schönberger and Kenneth Cukier’s book on Big Data was no ordinary read. It transformed my take on the subject, so I felt it was worth delving deeper. Here is a snapshot of how Big Data is changing business models and creating new ones!

Data: A New Class of Assets

Technology companies have been turning heads all over the globe for quite some time now. The innovation epicenter is not just Silicon Valley but also the tech hubs developing in other parts of the world. But what is really driving such a tremendous response to these tech giants? Not long ago, we saw Facebook make history with its $104 billion IPO (May 2012). Before the IPO, Facebook was just a fun site, albeit one with a significant $6.3 billion worth of assets. However, the market valued it at about $100 billion higher than its net worth. The remarkable difference can be attributed to the intangible value that data holds. With over 10 percent of the people on this planet on board, Facebook is sitting on a gold mine. It has monetizable data points (likes, comments, shares, etc.) that different businesses can use to understand their customers better and, in turn, serve them better. A new class of intangible assets is emerging. We see this unrecorded value of data reflected in the dizzying market capitalization figures.

Diminishing utility doesn’t hold for data

A unique aspect of this asset is that its value doesn’t diminish with use. The same set of data can be used for multiple applications. For instance, Google uses the text strings that users type into its search engine to also develop a spell checker for Google Docs. Another example is the data collected by Google Street View cars, which is used for its self-driving cars in addition to Google Maps. The data is shared across projects without compromising the quality of the results. The possible uses of a given data set are numerous; it is nearly impossible for the primary user of the data to exhaust all of these alternative uses. This is not true of any other kind of resource or asset fueling a business, or the economy in general.

The Three Segments

As with most tech phenomena, different businesses are dealing with Big Data in varied capacities. Some are in a strategic position to take advantage of the collection of data, for instance, telecom players or retail giants like Walmart. Let’s call them data holders or collectors. Others are at the helm of innovation with Big Data, which means they possess the ‘Big Data mindset’. They are the data innovators. And there is another category that acts as a bridge between these two classes. They understand what the data innovators create or suggest and make it work for the rest of the world interested in Big Data. They are the data specialists. Consultancies, analytics providers and technology vendors all fall into this third category.


Today, most of us are still making sense of predictive analytics for our business context, figuring out how Big Data will work for us, or asking why we need data science when business intelligence serves most of our needs. In this climate, innovators and specialists have become demigods. But this paradigm will reverse as the adoption of Big Data becomes more common. That is when the power will shift to those who control access to data, such as telcos or retail giants. In addition to using the data internally, they will be in a position to lease it out to other businesses. A case in point: Google’s acquisition of ITA Software in 2011 hints at the impact data holders will have going forward. ITA Software controls access to a significant set of data in its industry and hence is of interest to Google.

Certain companies like Amazon and Google, given their scale and technical expertise, are everything at once - holders, specialists and innovators.

A Renaissance

Big Data adoption calls for a change of mindset. Its sheer enormity makes the ‘what’ more important than the ‘how’ or the ‘why’, which is the prime tenet of predictive analytics. Causality is not the focus of Big Data analytics. This is in complete contradiction to our innate scientific curiosity. The explosion of Big Data analytics may take attention away from the underlying reasons behind a mechanism. When correlations suffice, will we put in the effort to understand causality? When market basket analysis on Big Data can tell us beer and diapers sell together, do we need to know why? Probably not.
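To make the ‘what without the why’ idea concrete, here is a minimal sketch of the kind of pairwise co-occurrence measure (lift) that market basket analysis relies on. The baskets below are made-up toy data and the function name `pair_lift` is my own; real analyses run on far larger transaction logs and usually use dedicated association-rule tooling.

```python
from collections import Counter
from itertools import combinations


def pair_lift(baskets):
    """Return {(item_a, item_b): lift} for every pair seen together at least once.

    lift = P(a and b) / (P(a) * P(b)); a value above 1 means the two items
    appear together more often than independence would predict.
    """
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # de-duplicate and fix the pair ordering
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))
    return {
        (a, b): (count / n) / ((item_counts[a] / n) * (item_counts[b] / n))
        for (a, b), count in pair_counts.items()
    }


# Hypothetical toy transactions, purely for illustration.
baskets = [
    ["beer", "diapers", "chips"],
    ["beer", "diapers"],
    ["milk", "bread"],
    ["beer", "diapers", "milk"],
    ["bread", "chips"],
    ["milk", "bread", "diapers"],
]

lifts = pair_lift(baskets)
for pair, lift in sorted(lifts.items(), key=lambda kv: kv[1], reverse=True):
    print(pair, round(lift, 2))
```

Notice that the computation flags which items travel together without saying anything about why they do, which is exactly the trade-off described above.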

As innocuous as it seems, the proliferation of this mindset can stunt intellectual inquisitiveness. To make the most of Big Data, we have to be wary of overemphasizing its significance. As Viktor and Kenn put it in their book (a bible for anyone interested in Big Data), had Henry Ford used Big Data analysis to understand customer needs, he would have probably ended up with ‘faster horses’. Intuition and innovation are key ingredients of most breakthroughs and can’t be overruled by Big Data. Big Data should be an enabler in the process.

Big Data Governance

On a more pragmatic note, individuals’ privacy is going to be another major concern arising from widespread Big Data application. When data becomes priceless, businesses will go the extra mile to procure it. Even today, prying eyes watch every move we make. Facebook knows what we like, Google knows what we browse, and Twitter knows what is on our mind. To top it all, our telecom service providers know where we are and who we are connecting with. Collectively, it is an incredible amount of information and can be more than what our closest friends or family know about us. In recent times, Facebook and Google have been questioned about their misuse of data. All this points to the fact that violations of privacy with the advent of Big Data are not unthinkable.

A possible solution to this privacy concern is anonymization of data, i.e. removing personal attributes that can be used to trace data back to the individuals it comes from. But so far, it has not been foolproof. AOL and Netflix are among the names that have burnt their hands experimenting with anonymization. Also, with the possibility of secondary usage of data, privacy concerns will be heightened. This calls for a more robust resolution.
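To illustrate why stripping personal attributes may not be enough, here is a small sketch of a linkage-style re-identification. Everything in it is hypothetical: the records, the field names (`zip`, `birth_year`, `rating`) and the two helper functions are invented for illustration and do not describe any specific real-world incident.

```python
def anonymize(records, direct_identifiers=("name",)):
    """Drop the obviously personal fields and keep everything else."""
    return [
        {key: value for key, value in record.items() if key not in direct_identifiers}
        for record in records
    ]


def reidentify(anonymized, public, keys=("zip", "birth_year")):
    """Link 'anonymous' rows to names using attributes both data sets share.

    A row counts as re-identified only when exactly one person in the public
    data has the same combination of values for `keys`.
    """
    index = {}
    for person in public:
        index.setdefault(tuple(person[k] for k in keys), []).append(person["name"])
    matches = []
    for row in anonymized:
        candidates = index.get(tuple(row[k] for k in keys), [])
        if len(candidates) == 1:
            matches.append((candidates[0], row))
    return matches


# Hypothetical data held by a business.
private_records = [
    {"name": "Person A", "zip": "10001", "birth_year": 1980, "rating": "liked film X"},
    {"name": "Person B", "zip": "10002", "birth_year": 1975, "rating": "liked film Y"},
]

# Hypothetical data that anyone could look up.
public_directory = [
    {"name": "Person A", "zip": "10001", "birth_year": 1980},
    {"name": "Person B", "zip": "10002", "birth_year": 1975},
    {"name": "Person C", "zip": "10002", "birth_year": 1975},
]

released = anonymize(private_records)
matches = reidentify(released, public_directory)
print(matches)
```

In this toy case the released rows contain no names, yet one of them can still be tied to a single person because the remaining attributes are unique in the public data. The more secondary data sets exist, the more such links become possible.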

A better option may be to demand more accountability from the users of data. This would deter misuse, as the user would be answerable for any harm that the use of data can or does cause to the individual. It would also address the privacy issue in the secondary usage of data.

In a nutshell, some kind of governance mechanism has to be developed to get the most out of the technical revolution that we will undergo with Big Data. The good news is that such issues are surfacing and eliciting action as the adoption of Big Data increases. Most certainly, the right measures will fall into place as relying on Big Data becomes second nature.

After all, Big Data is just a tool which is going to be just as good as its users, its masters.