Oh No, Not Another 2.0 — Database 2.0? Data Warehousing In The Cloud!
Boris Evelson’s latest post on free BI got me thinking about another type of freedom.
Boris commented on the newly announced beta of a gratis, lightweight, Panorama-powered BI/OLAP-engine add-on to Google’s hosted apps. You know, whenever anybody mentions BI/OLAP, I think of analytical databases, hence data warehousing (DW). And when my thoughts turn to DW, I often wonder when these dimensional data stores will be let loose from their earthly tethers and begin to float free in the SaaS cloud. This is no blue-sky speculation, but rather an inevitability in a world shifting to subscription-based SaaS for on-demand delivery of all infrastructure and application services. Where database services are concerned, this trend even has a name in popular circulation: Database 2.0 (aka "cloud databases").
Let it be known that Google is one of the pioneers in Database 2.0, though they haven’t tooted their horn or done anything particularly special in this regard (smaller SaaS solution providers such as Trackvia, DabbleDB, and Zoho have more full-featured Database 2.0 offerings than Google, albeit not particularly BI/OLAP/DW-focused). A year or two ago, Google went open beta (still in that phase, actually) with a hosted database service called GoogleBase. Now, from what I’ve seen, GoogleBase is not a general-purpose transactional or analytical database. And it’s certainly not a DW or data mart in the clouds. Instead, GoogleBase seems to be an online repository — or rather, depository — into which external parties submit structured data for Google to crawl and index deeply for access from Google’s big whompin’ search engine.
Even more noteworthy is Microsoft’s recent foray into the Database 2.0 space, a move that some might consider a "validation" of this approach in the eyes of enterprise I&KM professionals. Microsoft has just rolled out a beta of its hosted SQL Server Data Services. The vendor has started to host services that have heretofore been available only from SaaS partners. This is, of course, a key piece of the Redmond, WA-based vendor’s begrudging effort to push more solutions into a Microsoft-hosted SaaS cloud. However, from what I can see so far, Microsoft is simply hosting a subset of the functionality of its general-purpose RDBMS platform for OLTP and OLAP. It has not specifically optimized SQL Server Data Services for OLAP, as any truly scalable BI/OLAP/DW platform would need to be.
Back to Google for a sec. What I fully expect from them in the coming year or two — and from every SaaS cloud everywhere before long — are feature-complete, hosted, subscription-based DW services for high-performance, high-volume, complex analytics. Naturally, this cloud should be called DW 2.0. It should leverage the full virtualized, distributed, scalable, grid-enabled computing fabric that the Googles of this world can bring to bear on the very largest structured data sets, most resource-intensive query-processing tasks, and richest visualizations imaginable. Per Boris’s suggestion, it could even serve as a supremely scalable BI, data mining, or predictive analytics "sandbox" for developers and power users who have no other speedy, cost-effective alternatives for procuring the necessary horsepower for various projects and production requirements.
I second Boris’s challenge: Google should consider integrating the Panorama OLAP-engine add-on (remember, it’s just a beta) with a more analytics-enabling future version of GoogleBase (which is also still a beta). In so doing, Google — if it eventually decides to go into full production with all this — would be able to offer full-featured DW and BI services on a hosted platform that is as infinitely scalable as the concatenated string of Os inside the ever-extensible company name that displays within its multipage search-result screens. I also share Boris’s concern that whatever hosted OLAP/BI/DW services Google eventually offers may lack enterprise-grade metadata management, data cleansing, data-source connectivity, security, and other key features.
I also expect Microsoft to evolve SQL Server Data Services in the DW 2.0 direction, an effort that no doubt would intensify if Mr. Ballmer succeeds in grabbing Yahoo. I’d like to see Microsoft cross-synthesize SQL Server Data Services with any hardware-partner-powered OLAP-acceleration approaches it may or may not be developing under its DW appliance initiative. At the very least, I’d like to see Microsoft provision some seriously scalable DW horsepower in its data center, perhaps through a partnership with Teradata.
Clearly, DW 2.0 services will need to be an order of magnitude more powerful than what we’ve come to expect under the first generation of SaaS-based BI/DW offerings on the market. Whether dedicated to a single customer’s requirements or divvied up on a shared-tenant basis, DW 2.0 could be the biggest, baddest, most virtual DW "appliance" of them all. And it would be another key step in the progressive virtualization of the entire SOA stack (apps, middleware, hardware, and data services) across the Enterprise 2.0 or Web 2.0 fabric.
Oh yes, yet another 2.0 — or two — for you. Wouldn’t it be interesting if Google and/or Microsoft acquired a DW appliance vendor? I would not be at all surprised if announcements such as these precipitated from the cloud of pregnant possibilities.
And is it too far-fetched to imagine that Microsoft might turn around and acquire Teradata if the Yahoo takeover falls through? My crystal ball’s still a bit cloudy on the matter.
But, hey, I’m free to speculate.