Yellow Elephants and Pink Unicorns Don’t Tell The Real Big Data Story
Big data and Hadoop (Yellow Elephants) are so synonymous that you can easily overlook the vast landscape of architecture that goes into delivering on big data value. Data scientists (Pink Unicorns) are also raised to god status as the only real role that can harness the power of big data, which makes insights from big data seem as far away as a manned journey to Mars. However, this week, as I participated in the DGIQ conference in San Diego and colleagues and friends attended the Hadoop Summit in Belgium, it became apparent that organizations are waking up to the fact that there is more to big data than a "cool" playground for the privileged few.
The perspective that the insight supply chain is the driver and catalyst of actions from big data is starting to take hold. Capital One, for example, illustrated that if insights from analytics and data from Hadoop are going to influence operational decisions and actions, you need the same degree of governance that you established in traditional systems. In a conversation, Amit Satoor of SAP Global Marketing described a performance apparel company that is linking big data to operational and transactional systems at the edge of customer engagement, and noted that it had to be easy for application developers to implement.
Hadoop distribution, NoSQL, and analytic vendors need to step up their value proposition to be about more than where the data sits and how sophisticated you can get with the analytics. In the end, if you can't govern quality, security, and privacy at the scale of end-user and customer engagement scenarios at the edge, those efforts to migrate data to Hadoop and the investment in analytic tools cost more than dollars; they cost you your business.
Unfortunately, the data management capabilities in vendor tools that make data meaningful, easy to analyze and use, trusted, and secure are still in their infancy for big data. Without rapid improvement, application developers will be stymied in their ability to create evolutionary, innovative, or disruptive capabilities that make their organizations competitive.
It's time to demand more than rapid development of features that fill gaps in our big data management and governance needs piecemeal, and to stop being satisfied with early releases that take time and significant skill to deploy. For big data to take hold, it needs to fit the objectives, skills, and agility requirements of more than the 1%. It is only then that we will have a true data (INFORMATION and INSIGHT) democracy.
Get acquainted with the patterns of data governance for big data: Do Data Quality The Big Data Way.