By Mike Gilpin

It’s been almost two years since I last wrote about this topic, and in that time the trend toward canonical information models has continued to accelerate. I have not had an opportunity to run another survey myself, but I have seen the following:

  • Anecdotally among many clients doing SOA, more than half are also creating and managing one or more canonical information models for their SOA and/or information management strategies. These are all focused on “data in motion,” not “data at rest.”
  • Surveys from other sources have shown 50-60% of those doing SOA are creating a canonical information model (increased from the 39% rate our 2007 survey found). Last week I saw data shared informally by a major vendor of SOA suites, from a survey of hundreds of their customers (all of whom are doing SOA), showing more than 60% are creating a canonical model.

So what’s behind this growing trend? The forces we identified in our original research piece are all still in operation, but to give a quick view, stories I’ve heard typically go like this:

  • We have thousands of XML Schemas (XSDs) about the place, rapidly proliferating out of control – with each representing the information model as the local team sees it for their application interchanges or service interfaces.
  • From the point of view of an individual developer or small team, it’s not a big deal, but the lack of a canonical/common model is a huge obstacle to any integration or interoperability we require across multiple applications.
  • Our issues with schema governance are exacerbated by the rapid evolution of the industry schema standards with which we must comply.
  • The lack of interoperability is especially painful when:
    • We’re integrating with one of our ecosystems of B2B partners.
    • We’re integrating/automating a cross-functional business process, like order-to-cash, or order-to-provision.
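To make the interoperability point concrete, here is a minimal Python sketch of why a shared model helps. It is an illustration under assumptions, not anything taken from a client: the application names, field names, and function names are all hypothetical. Each application maps only to and from the canonical field names, so any pair of applications can exchange a record without a dedicated pairwise mapping.

```python
# Hypothetical local field names for three applications; illustrative only.
TO_CANONICAL = {
    "billing": {"cust_no": "customer_id", "amt": "order_total"},
    "crm": {"CustomerID": "customer_id", "OrderValue": "order_total"},
    "provisioning": {"custRef": "customer_id", "total": "order_total"},
}


def to_canonical(app, record):
    """Rename an application's local fields to the canonical field names."""
    mapping = TO_CANONICAL[app]
    return {mapping[field]: value for field, value in record.items()}


def from_canonical(app, record):
    """Rename canonical fields back to an application's local field names."""
    reverse = {canon: local for local, canon in TO_CANONICAL[app].items()}
    return {reverse[field]: value for field, value in record.items()}


def translate(source_app, target_app, record):
    """Translate between two applications by way of the canonical model."""
    return from_canonical(target_app, to_canonical(source_app, record))


def point_to_point_mappings(n_apps):
    """Directed mappings needed when every app maps to every other app."""
    return n_apps * (n_apps - 1)


def canonical_mappings(n_apps):
    """Directed mappings needed when each app maps to and from one model."""
    return 2 * n_apps
```

In this sketch, ten applications would need 90 directed point-to-point mappings but only 20 mappings through the canonical model, which is the arithmetic behind the "huge obstacle" described above.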

When these folks try to establish a canonical model, results vary:

  • If the work happens in a context where industry standards like SID, ACORD, or FpML can provide a starting point, the effort tends to succeed. This is true even when that enterprise has not previously adopted those standards.
  • Where such industry standards don’t exist, it’s often much harder to get enough agreement among the interested parties to get the effort off the ground.
  • Over the past two years I’ve also seen one other interesting dimension to the problem of establishing a model: the need for a federated approach. In very large organizations with multiple business domains, it sometimes turns out that a single canonical model is not achievable. Instead, multiple domain models are necessary, interlinked with one another and with an enterprise-level canonical model. These domains may reflect different external ecosystems, such as securities trading participants, as opposed to the customers of a wholesale bank or of an international banking exchange operation.
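One way to picture the federated approach is sketched below in Python. This is only an assumed structure for illustration: the `DomainModel` class, the domain names, and the entity names are hypothetical. Each domain keeps its own entities, links some of them to enterprise-level concepts, and leaves the rest local; cross-domain lookups only work for entities that share an enterprise-level link.

```python
from dataclasses import dataclass, field


@dataclass
class DomainModel:
    """A domain-level canonical model (hypothetical structure)."""
    name: str
    # Domain entity name -> enterprise-level entity it is linked to.
    links: dict = field(default_factory=dict)
    # Entities that stay local to the domain and are never shared.
    local_entities: set = field(default_factory=set)


def shared_concepts(a, b):
    """Enterprise-level entities that both domain models link to."""
    return set(a.links.values()) & set(b.links.values())


def cross_domain_name(source, target, entity):
    """Find the target domain's name for a source entity via the enterprise model.

    Returns None when the entity is local to the source domain, or when the
    target domain has no entity linked to the same enterprise concept.
    """
    enterprise_entity = source.links.get(entity)
    if enterprise_entity is None:
        return None
    for target_entity, linked in target.links.items():
        if linked == enterprise_entity:
            return target_entity
    return None


# Two illustrative domains; the entity names are assumptions.
securities = DomainModel(
    name="securities_trading",
    links={"Counterparty": "Party", "Trade": "Transaction"},
    local_entities={"SettlementInstruction"},
)
wholesale = DomainModel(
    name="wholesale_banking",
    links={"Customer": "Party", "Payment": "Transaction"},
    local_entities={"CreditFacility"},
)
```

Here `cross_domain_name(securities, wholesale, "Counterparty")` resolves to `"Customer"` because both link to the enterprise-level `Party`, while a local entity such as `SettlementInstruction` has no cross-domain equivalent. The design choice is that only the linked subset needs enterprise-wide agreement.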

Fortunately, since I last wrote, the state of the art has moved on: more tools have come on the market, and the tools I mentioned in the earlier piece have evolved. Those mentioned in the 2007 document included:

  • Enterprise architecture tools. Casewise, IDS Scheer, MEGA International, Proforma, and Telelogic led the EA tools market (IDS Scheer has since been acquired by Software AG, and Telelogic by IBM). But some folks rely for their information modeling on vendors like Embarcadero, which made the transition from data modeling tools to EA tools more recently.
  • Tools embedded in ESB, Information-as-a-Service, or BPM suites. Major vendors of ESB, IaaS, and BPM suites often include information modeling tools as part of their solution. For example, TIBCO ActiveMatrix, Composite Software Composite Studio, Red Hat MetaMatrix Enterprise, and IBM Information Server (which includes semantic technology from the acquisition of Unicorn Systems) can be good options if you’re using those suites for multiple other parts of your SOA or IaaS strategy.
  • Independent specialist vendor tools. For the most advanced modelers, especially when semantic technology is required, tools from specialists like Contivo, Metatomix, or Revelytix are a good solution. Other independent tools include Progress Software’s DataXtend Semantic Integrator.

Since then I’ve also heard of others, like TopQuadrant’s TopBraid Suite. Oh, and my colleague Dave West has written a great report on the ways that semantic technology is being used by application developers nowadays. Dave and I are doing more new research in this area, about both canonical modeling and semantic technology. So please help us with our research:

Are you creating a canonical model? What tools and techniques are you using to drive your success (whether based on XSDs, or semantic technology, or both)? What issues have you encountered along the way? Can we interview you for our research?

Please comment here if possible, or tweet with the hashtag #ForrCanon, or email me (if you must retain confidentiality) at mgilpin@forrester.com.