It’s The Dawning Of The Age Of BI DBMS
Over the years we’ve learned how to address the key business intelligence (BI) challenges of the past 20 years, such as stability, robustness, and rich functionality. Agility and flexibility challenges now represent BI’s next big opportunity. BI pros now realize that earlier-generation BI technologies and architecture, while still useful for more stable BI applications, fall short in the ever-faster race of changing business requirements. Forrester recommends embracing Agile BI methodology, best practices, and technologies (which we’ve covered in previous research) to tackle agility and flexibility opportunities. Alternative database management system (DBMS) engines architected specifically for Agile BI will emerge as one of the compelling Agile BI technologies BI pros should closely evaluate and consider for specific use cases.
Why? Because fitting BI into a row-oriented RDBMS is often like putting a square peg into a round hole. In order to tune such a RDBMS for BI usage, specifically data warehousing, BI pros usually:
- Denormalize data models to optimize reporting and analysis.
- Build indexes to optimize queries.
- Build aggregate tables to optimize summary queries.
- Build OLAP cubes to further optimize analytic queries.
Unfortunately, there’s one basic problem with these approaches: It’s impossible to build denormalized data models, indexes, and aggregates for every possible query that users will execute during the lifetime of a database. So BI pros must pick their battles and optimize the RDBMS based on current or near-term expected usage. But in today’s fast-paced business environment, “near-term” may mean days or even hours, requiring BI and data warehouse (DW) pros to spend a significant amount of their time doing nothing more than constantly optimizing and reoptimizing these databases.
Additionally, these row-oriented RDBMSes offer minimal support for:
- Unstructured content.
- Diverse data structures that often result in ragged, sparse, and unbalanced product hierarchies — a nightmare for data architects to model and users to query.
To address these challenges, both mainstream BI vendors and startups introduced databases designed specifically for BI reporting and analysis use cases. These databases are very specialized and not meant to support a wide variety of use cases — unlike their bigger, older, more versatile, jack-of-all-trades RDBMS cousins. Additionally, not all BI-specialized DBMSes are created equal, and inconsistent or conflicting vendor marketing messages only add to customer confusion. We recommend considering four types of BI-specific DBMS: columnar RDBMS and in-memory, inverted, and associative index DBMS. The applicability of and use cases for each specialized DBMS vary greatly depending on multiple factors, including but not limited to database size, data structure complexity, speed of requirements change, and a requirement to organize and report on unstructured content.
Read our detailed research on why, where and how we recommend using:
- Columnar RDBMS (like HP Vertica, ParAccel, and Sybase IQ)
- In-memory index DBMS (like Microsoft PowerPivot, QlikView, and Tibco Spotfire)
- Inverted index DBMS (like Attivio and Endeca)
- Associative DBMS (like Saffron Technology and Splunk)
We’ve also produced a detailed Excel-based model that attempts to compare implementation and ongoing support efforts of these four BI-specific DBMSes baselined against row-oriented RDBMS-based BI projects.
As always, comments are more than welcome.