The term big data is still being defined. It is unclear which data sources will eventually be included under the umbrella and which will not quite make it. Much as people dislike the term, as the negative coverage it has received makes clear, I do not think it is going anywhere. In fact, the most likely scenario appears to be that an increasing variety of data gets included under big data, with more specific sub-categories appearing under this umbrella over time.
Here, we discuss the standing of ‘boring’ data and how it should help to bring other sexier datasets alive.
Boring data, in this example, is data that has been around for decades and really just reflects some everyday action. But it is now being quantified using more modern methods and made available for more real-time analysis, oftentimes together with social media.
When traditionally collected and analyzed, such boring datasets are normally not considered as fitting in with the sexy (and trendy) world of big data. People usually equate big data with Twitter and Facebook, sources that did not exist before the turn of the millennium. Social media is the classic example of frequent updates creating difficult-to-tame streams of information.
But what happens when previously existing boring data begins to be collected using modern approaches, and new analytical techniques are used to extract more insight from it? And what if that very same data is used to add more value to things like social media? Should we categorize such data as big data?
Let’s take car parking information as an example. This is really boring stuff. Cataloguing when cars enter and leave parking garages is really mind-numbing. It’s not like a tweet, where an individual can offer a relevant insight, some emotion, or an opinion. And it’s not like a Facebook like, which automatically implies support for something like a product or company. But this boring data, this data that has been around for decades, can add significant value to the larger picture.
Imagine that this car park information is from a local shopping mall. Knowing when cars arrive and leave, combined with other data, would add a considerable amount to the overall understanding of shopping behavior. Aggregated, it would be valuable to anyone wanting to understand the dynamics of that shopping mall. Wouldn’t this data also add to an economist’s analysis of the local economy? And it would be available much closer to real-time than what economists currently have.
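To make the idea concrete, here is a minimal sketch of the kind of summary a garage log could yield: arrivals per hour and the average length of stay. The log format, the plates, and the function names are all my own assumptions for illustration, not any real garage system's output.

```python
from collections import Counter
from datetime import datetime

# Hypothetical garage log: (plate, entry time, exit time). All values are made up.
EVENTS = [
    ("ABC123", "2024-03-02 10:05", "2024-03-02 11:35"),
    ("XYZ789", "2024-03-02 10:40", "2024-03-02 12:10"),
    ("JKL456", "2024-03-02 13:15", "2024-03-02 13:45"),
]

TIME_FORMAT = "%Y-%m-%d %H:%M"


def parse(timestamp):
    """Turn a log timestamp string into a datetime."""
    return datetime.strptime(timestamp, TIME_FORMAT)


def arrivals_per_hour(events):
    """Count how many cars entered during each hour of the day."""
    return Counter(parse(entry).hour for _plate, entry, _exit in events)


def average_stay_minutes(events):
    """Average time between entry and exit, in minutes."""
    stays = [
        (parse(exit_time) - parse(entry)).total_seconds() / 60
        for _plate, entry, exit_time in events
    ]
    return sum(stays) / len(stays)


if __name__ == "__main__":
    print(dict(arrivals_per_hour(EVENTS)))
    print(average_stay_minutes(EVENTS))
```

Even these two numbers, tracked day after day, start to describe when a mall is busy and how long people linger.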
Taking it one step further, I assume that license plates will be cross-referenced to owners and the records stored over time. Whoever holds that data will know how often you go to the mall, how long you stay, and at what time. Layering your social media check-ins and comments on top will let them understand your habits and behavior well enough to make better predictions.
One shopping mall worth of parking data does not seem like a lot of data, but the tweets occurring within that mall on the same day would also be relatively small. Once you pull together data from many malls or other shopping venues on a state, national, or global basis, you begin to see that such data is no longer small data, but starts to add up.
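As a rough sketch of what "adding up" might look like, the snippet below rolls per-mall car counts up to a regional week-over-week change. The regions, mall names, and counts are invented for illustration, and the percentage change is just one simple measure I picked, not an established indicator.

```python
from collections import defaultdict

# Hypothetical daily car counts: (region, mall, cars this Saturday, cars last Saturday).
DAILY_COUNTS = [
    ("North", "Mall A", 1200, 1000),
    ("North", "Mall B", 800, 1000),
    ("South", "Mall C", 1500, 1200),
]


def regional_change(rows):
    """Percent change in total cars per region versus the previous week."""
    totals = defaultdict(lambda: [0, 0])  # region -> [current, previous]
    for region, _mall, current, previous in rows:
        totals[region][0] += current
        totals[region][1] += previous
    return {
        region: (current - previous) / previous * 100
        for region, (current, previous) in totals.items()
    }


if __name__ == "__main__":
    for region, change in regional_change(DAILY_COUNTS).items():
        print(f"{region}: {change:+.1f}%")
```

Note how the two North malls cancel each other out: a single garage can mislead, which is exactly why the aggregate is the interesting number.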
The ‘boring’ tag can also be dropped. The data only seems boring because it has been around for decades and, on the surface, does not appear to add up to much. But once we place it in context with other datasets, we can see how such data can be used.
From a finance or financial market perspective, we can use such data to:
• Improve revenue forecasts for companies near such a parking garage,
• Gain a better understanding of foot traffic and its timing at particular locations,
• Aggregate such data to measure regional activity in very close to real-time (at least much quicker than through traditional methods),
• Gain a perspective on a company’s prospects, perhaps even before its management does,
• Better gauge things like proper interest rates and/or risk associated with certain loans (like commercial mortgages),
• Better understand consumer behavior,
• Greatly improve predictive modelling.
This post is not really about the uses of parking data. Parking is just one example of data that many see as boring but that will actually help to revolutionize the interpretation of the data everyone is excited about. Through this kind of contextual understanding, investors will be able to understand economics and finance far better than was previously thought possible.
It is somewhat ironic that it could very well be the boring data that unlocks much of the value hidden in social media.
In the end, I think many of these older boring datasets, as they come online en masse and in a real-time fashion, will fall under the big data umbrella term, which frankly may just further infuriate its current detractors.