No No No: The Curiously Absent Architecture of Postmodern Analytic Databases
Some say architecture is destiny. But it seems as if the world of analytic databases is moving on many fronts in all directions with no particular destination. That, in fact, is probably a good working definition of “postmodern.”
That’s the only reasonable conclusion you can draw from the sprawling mess of a movement known as “NoSQL.” It defines itself by what it is not, which would be easy to do if all it were rebelling against were something well-defined, such as the need to use Structured Query Language (SQL) to access, query, update, and manipulate data, or the need to store and manage that data in third-normal-form relational databases.
But no. The NoSQL movement seems to fancy itself a catchall for all things “next-generation database,” though even there it is not clear what exactly it is rebelling against. Check out this “NoSQL” scoping definition on the movement’s principal website: “Next Generation Databases [are] non-relational, distributed, open-source … horizontally scalable … modern web-scale databases … schema-free, replication support, easy API, eventual consistency, and more.”
Whew! If that non-definition didn’t leave you gasping for breath, the unruly menagerie of “NoSQL” database approaches listed on the website will completely rob you of your oxygen supply. Apparently, this movement includes an unholy host of old and new database approaches, including wide column store/column families, document store, key value/tuple store, eventually consistent key value store, graph databases, object databases, grid database solutions, and XML databases.
And just when you thought they’d closed the lid on this overstuffed steamer trunk, the NoSQL folks, for good measure, jam in not just one miscellaneous segment (“other NoSQL related databases”) but also the super-duper-plus-plus-miscellaneous category of “unresolved and uncategorized” databases.
In my book, none of that amounts to an architecture. And it’s not much of a “marketecture” either. I’m inclined to think of it as more of an “anarchi-tecture”: an alluring void where an architecture should be.
I like to think of NoSQL as the Black Rebels Motorcycle Club of the database world. You know: the gang of which Marlon Brando’s character was the leader in the 1953 film “The Wild One.” That’s the flick where Brando’s Johnny Strabler responded to a girl’s question, “What’re you rebelling against, Johnny?”, with the immortal answer: “Whaddya got?”
If NoSQL is truly a rebellion against the database status quo, then its gang needs a more up-to-date roadmap of the landscape across which they’re cruising. It’s odd that they position their approaches as an alternative only to traditional, row-based, third-normal-form, structured relational databases. What they’re missing is any mention of dimensional, columnar, in-memory, and inverted indexing databases (what NoSQL calls “wide column store/column families” is not the same as established columnar databases from Sybase, Vertica, and others). In recent years, these have become the primary alternatives to relational databases in the enterprise arena, and in fact the deployment roles for relational continue to narrow. See my recent blog post for a more detailed discussion of these alternatives.
But excuse me for picking on the NoSQL community. This issue of unfocused rebellion has bedeviled other segments of the analytics industry in recent years. When Rob Karel and I blogged last year on “Whatever Happened To EII?,” we were describing a movement that one might reasonably call “NoDW,” and which had by that time withered away from lack of any coherent unifying architecture.
Nowadays, most vendors in that space have elevator pitches that focus more on what they don’t do (provide platforms for building and optimizing DWs) than on what they do. As a result, they cruise collectively under no single banner name: not enterprise information integration (EII), not virtual DW, not data federation, not data virtualization, not data abstraction, not data mashup, not information as a service, not Web data services, not anything specific.
In a best practices report in late 2008, I attempted to capture the sprawl of use cases into which these “NoDW” technologies have been deployed. In the real world, they often supplement and extend DWs, rather than replace them outright. The best you can do is point them at any requirement for which two specific DW architectures, centralized or hub-and-spoke, are not optimal.
Feeling dizzy? If you’re a vendor in these markets, trying to define a clear marketecture and differentiation against a field of phantasmic architectural abstractions, I feel for you. If you’re a user trying to harness this hissing field of felines into a high-performance corporate resource, you’d better hold onto your trusty Clydesdales.
Until they mature, no “NoSQL” database is fit to function as your company’s primary analytic workhorse.