Most Hadoop-related inquiries from Forrester clients come to me. These have moved well beyond the “What exactly is Hadoop?” phase to the stage where the dominant query is “Which vendors offer robust Hadoop solutions?”

What I tell Forrester clients is that, yes, Hadoop is real, but that it’s still quite immature. On the “real” side, Hadoop has already been adopted by many companies for extremely scalable analytics in the cloud. On the “immature” side, Hadoop is not ready for broader deployment in enterprise data analytics environments until the following things happen:

  • More enterprise data warehousing (EDW) vendors adopt Hadoop. Of the vendors in my recent Forrester Wave™ for EDW platforms, only IBM and EMC Greenplum have incorporated Hadoop into the core of their solution portfolios. Other leading EDW vendors interface with Hadoop only partially and only at arm’s length. We strongly believe that Hadoop is the nucleus of the next-generation cloud EDW, but that promise is still three to five years from fruition. It’s likely that most EDW vendors will embrace Hadoop more fully in the coming year, with strategic acquisitions the likely route.
  • Early implementers converge on a core Hadoop stack. The companies I’ve interviewed as case studies indicate that the only common element in Hadoop deployments is the use of MapReduce as the modeling abstraction layer. We can’t say Hadoop is ready-to-serve soup until we all agree to swirl some common ingredients into the bubbling broth of every deployment. And the industry should clarify the reference framework within which new Hadoop specs are developed.
  • The Apache community submits Hadoop to a formal standardization process. In the middle of the past decade, the service-oriented architecture (SOA) world didn’t begin to mature until industry groups such as OASIS and WS-I stabilized a core group of specs such as WSDL, SOAP, and the like. The “Big Data” world badly needs a similar push to standardize the core of Hadoop so that vendors and users can have assured cross-platform interoperability. As it is, silos reign in the Hadoop world, with most users rolling their own bespoke deployments and most vendors forking the Apache code base to suit their needs.

Catch me at the Hadoop Summit in Santa Clara later this month. Hadoop isn’t soup yet, but I look forward to dipping my spoon into some tasty concoctions simmering on the industry’s front burner.