By James Kobielus

This has been the season for splashy vendor announcements in the high-end data warehousing (DW) market, and they’ve become progressively more disruptive and game-changing with every passing month. Though to the casual observer this may seem like a vendor-push game of competitive one-upmanship, it is in fact being driven by ever more challenging requirements coming from Information and Knowledge Management (I&KM) professionals. I&KM pros in many verticals are implementing ever more scalable DW platforms to manage inexorable growth in BI and advanced analytics workloads.

Extreme scalability is the new uber-theme in today’s DW arena. Forrester customers will see an in-depth discussion of best practices in this regard in a forthcoming report that I’m authoring, for publication late in this quarter. My ongoing research has shown that many enterprises, service providers, and e-commerce companies are pushing the DW scalability envelope. DW deployments are pushing into the hundreds or thousands of terabytes of usable data, which is being persisted across tens, hundreds, or sometimes a thousand or more nodes. In addition, I&KM professionals are routinely pushing the needle on other DW scalability and performance metrics, including user/query concurrency, high-volume data-load speeds, and mixed-query workload management.

DW vendors are racing as fast as they can to support these extreme-analytics requirements, both through a more total focus on appliance-based packaging and through aggressive scale-out of those offerings.

In mid-summer, Microsoft acquired DATAllegro, leaving no doubt that the Redmond, Washington-based vendor planned to reconstitute its DW stack–with SQL Server 2008 at its heart–as a shared-nothing massively parallel grid of finely optimized appliance nodes. That, of course, is a work in progress, under Microsoft’s ongoing “Project Madison,” which will result in a commercially available DW appliance platform in the next 1-2 years. But Microsoft’s progress on this road map is worth watching, because no vendor is better positioned to deliver cost-effective massively parallel DW solutions into the vast mid-market–when the day comes.

Late last month, Oracle raised the competitive stakes when it and HP jointly announced the immediate general availability of a massively parallel, high-end DW appliance: the HP Oracle Database Machine with Exadata Storage Server. As I noted in a prior blog post, this new uber-appliance now enables Oracle to sell a pre-optimized petabyte-scale DW solution to its customers who may, heretofore, have turned to, say, Teradata, for a solution to address their most high-volume, high-performance analytics requirements. No, this new HP/Oracle product does not come cheap. But, then again, what I&KM pro expects an investment in a high-end enterprise DW (EDW) platform to come in at less than seven figures?

Now, this week, Teradata–still the acknowledged scalability king in the EDW market–has made the most disruptive announcement of all. At its annual Teradata Partners conference, the vendor announced immediate general availability of a new platform: the Teradata Extreme Data Appliance 1550.

What’s most noteworthy about this product (which is configured with the same core Teradata DBMS and tooling as its other DW platforms) is how “extreme” it is on two key dimensions. On the scalability dimension, the Teradata 1550 can scale out to 50 petabytes of usable data across 1,024 nodes, which represents–hold on to your hats–five times the capacity of the vendor’s heretofore high-end 5550 platform. But it’s on another dimension, extreme affordability, that the new 1550 platform truly takes the DW industry to a whole new level. Teradata is offering the 1550 for as low as $16,500 per usable terabyte–which is less than one-tenth the list price of the 5550 and in the same general ballpark as some low-end, startup DW appliance vendors, such as Greenplum, Dataupia, and (the now temporarily off the market) DATAllegro.
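To put those numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted above, and it rests on a few assumptions of mine rather than anything Teradata has published: decimal units (1 petabyte = 1,000 terabytes), the "as low as" rate applied uniformly as a best-case floor, and capacity spread evenly across nodes. The function and variable names are purely illustrative.

```python
# Figures quoted in the post.
PRICE_PER_TB_1550 = 16_500     # USD per usable terabyte ("as low as")
MAX_CAPACITY_PB_1550 = 50      # petabytes of usable data
MAX_NODES_1550 = 1_024
CAPACITY_RATIO_VS_5550 = 5     # 1550 holds five times the 5550's capacity
PRICE_RATIO_VS_5550 = 10       # 1550 is "less than one-tenth" the 5550 list price

# Assumption (not from the announcement): decimal storage units.
TB_PER_PB = 1_000


def floor_cost_usd(usable_tb):
    """Lower-bound system cost, assuming the best-case per-terabyte rate."""
    return usable_tb * PRICE_PER_TB_1550


# Values implied by the quoted ratios.
implied_5550_capacity_pb = MAX_CAPACITY_PB_1550 / CAPACITY_RATIO_VS_5550
implied_5550_price_floor = PRICE_PER_TB_1550 * PRICE_RATIO_VS_5550

# Assumption: usable capacity is spread evenly across all nodes.
avg_tb_per_node = MAX_CAPACITY_PB_1550 * TB_PER_PB / MAX_NODES_1550

print(f"1 PB floor cost:          ${floor_cost_usd(1 * TB_PER_PB):,}")
print(f"50 PB floor cost:         ${floor_cost_usd(50 * TB_PER_PB):,}")
print(f"Implied 5550 capacity:    {implied_5550_capacity_pb:.0f} PB")
print(f"Implied 5550 list price:  more than ${implied_5550_price_floor:,} per TB")
print(f"Average usable TB/node:   {avg_tb_per_node:.1f}")
```

Under those assumptions, a one-petabyte 1550 would run at least $16.5 million, and a fully built-out 50-petabyte system at least $825 million. "Affordable" here is strictly a per-terabyte statement, not a comment on the size of the check.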

Yes, those who search for caveats and fine print on the Teradata 1550 announcement can certainly find them. Teradata has stated that the lowest-capacity configuration of a 1550 is approximately the same price as a comparably configured 5550, with the extremely low per-usable-terabyte pricing coming into play as you scale up your investment in the 1550. Also, the 1550 lacks some mixed-query workload management features found in the 5550 (in other words, the 1550 and mid-market 2550 have the same constrained subset of the full Teradata workload management functionality). And the 1550 may not perform some very large table scans as efficiently as the 5550, due to the approach that the vendor used to scale the 1550 to such great heights.

But it’s an extremely impressive announcement, no matter how you spin it, and it radically shakes up the DW market. Through this and other recent announcements–most notably the mid-market-targeted 2550 platform–Teradata has shown that it can innovate and compete head-to-head with its fiercest competitors. In the year since it was spun off from NCR, Teradata has aggressively defended its scalability crown while making a credible claim for the equally important distinction of affordability king.

Or, rather, Teradata can now claim to be extremely affordable, on a per-usable-terabyte basis, if your requirement is for a petabyte-size DW platform. In other words, it’s affordable if you’re a giant telecommunications carrier doing high-volume call-detail-record processing or, like the eBays of the world, a Web 2.0 company doing real-time clickstream analysis in the cloud.

So, yes, there is some serious fine print to read on this latest Teradata announcement. It’s not truly mid-market-affordable. But I have no doubt that the new generation of DW-in-the-cloud providers is already considering the 1550 as its core platform for mid-market-focused, subscription-based analytics services, just as those providers are considering the new HP Oracle uber-appliance for those same requirements, as well as petabyte-scale offerings from Greenplum and other pure plays.

What do you think? Are you taking a second look at Teradata now that it has rolled out a full family of DW appliance offerings at all major price points, from very high-end to very low-end? If you’re pushing the petabyte envelope, does the announcement of the low-cost-per-usable-terabyte 1550 make you any less inclined to evaluate the comparatively pricey HP Oracle Database Machine with Exadata Storage Server? Is Teradata in danger of cannibalizing sales of its 5550 and 2550 platforms with this new extreme scalability/affordability offering, which, keep in mind, has the same core technology inside?