Big Data: Does It Make Sense To Hope For An Integrated Development Environment, Or Am I Just Whistling In The Wind?
Is big data just more marketecture? Or does the term refer to a set of approaches that are converging toward a common architecture that might evolve into a well-defined data analytics market segment?
That’s a huge question, and I won’t waste your time waving my hands with grandiose speculation. Let me get a bit more specific: When, if ever, will data scientists and others be able to lay their hands on truly integrated tools that speed development of the full range of big data applications on the full range of big data platforms?
Perhaps that question is also a bit overbroad. Here’s even greater specificity: When will one-stop-shop data analytic tool vendors emerge to field integrated development environments (IDEs) for all or most of the following advanced analytics capabilities at the heart of big data?
Of course, that’s not enough. No big data application would be complete without the panoply of data architecture, data integration, data governance, master data management, metadata management, business rules management, business process management, online analytical processing, dashboarding, advanced visualization, and other key infrastructure components. Development and deployment of all of these must also be supported within the nirvana-grade big data IDE I’m envisioning.
And I’d be remiss if I didn’t mention that the über-IDE should work with whatever big data platform (enterprise data warehouse, Hadoop, NoSQL, etc.) you have now or are likely to adopt. And it should support collaboration, model governance, and automation features that facilitate the work of teams of data scientists, not just individual big data developers.
I think I’ve essentially answered the question in the title of this blog post. It doesn’t make much sense to hope for this big data IDE to emerge any time soon. The only vendors whose current product portfolios span most of this functional range are SAS Institute, IBM, and Oracle, and I haven’t seen any of them push to consolidate what they have into unified big data tools.
It would be great if the big data industry could leverage the Eclipse framework to catalyze evolution toward such an IDE, but as far as I’m aware, nobody has proposed it.
I’ll just whistle a hopeful tune till that happens.