Exposing The Data Mesh Blind Side
This post won’t be popular and will certainly be controversial. But there is an elephant in the room that needs attention. Data mesh has a blind side.
Working in and leading data management and governance teams is like living in a perpetual group therapy session. Technology complaints arise, but the underlying friction is social in nature. How do we work together? What are our roles and responsibilities? What do our data consumers need? Why does everything take so long, and why is it so hard? Data mesh addresses this through its sociotechnical principles: domain orientation, data as a product, self-service, and federated computational data governance. It puts the soft side of data and business outcomes first. The Thoughtworks definition is:
Data mesh is a sociotechnical approach to share, access, and manage analytical data in complex and large-scale environments — within or across organizations.
The technical aspect is more architecture than tool or platform, with an almost religious mantra: “Data mesh is not about technology.” And that, my friends, is the blind side. Here’s why.
The number one question I get from data architecture and engineering teams is, “How do I implement data mesh?” This question has less to do with the practice and more to do with taking the principles and creating a data product that is composed into an insight solution. Arguing that data mesh is not technology misses the point that without technical implementation considerations, it is just another ivory tower data governance effort of talk and committees (albeit federated).
While technology is finally catching up with what we want to do with data, particularly analytics and AI, technology only facilitates data and translates data mesh soft artifacts into deployable products. The distributed, in-motion, experience- and outcome-based digital solution only works when data capabilities exploit the decoupled nature of compute, storage, and state. That is certainly part of a cloud strategy, but it carries into all nodes and edges of the digital ecosystem and into the metadata architecture that executes on context and controls. That is a highly sophisticated and complex paradigm. It means that the “data as a product” principle of data mesh requires the same first-class status as the social principles, because it is where technology lives.
It’s true that you don’t buy data mesh. Yet data vendors bring in data mesh messaging and value propositions and have the founder of data mesh, Zhamak Dehghani of Thoughtworks, present at their summits and webinars, which creates confusion between solution and architecture. In reality, the right lens for these demonstrations is how to use any technology to satisfy data mesh principles and business opportunities. There is not a single modern environment that does not have a data fabric foundation. But there are environments where traditional analytic architecture patterns and data fabric capabilities are not appropriate for operational scenarios and create the same limitations as traditional data warehouse blueprints. And this is where data mesh really starts to make sense.
Domain orientation, self-service, and federated computational data governance are measured on outcomes, service-level agreements, and user experience. Data as a product instantiates these principles as tangible, consumable, interoperable, and portable components. It is the composition of these products that creates the solution. The product is not the solution. And thus, to ignore the elephant in the room (technology) is to incur higher cost, longer development times, and more work that sits on the shelf and adds to the technical debt behind our digital environments.
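To make the distinction between product and solution concrete, here is a minimal Python sketch. It is not a reference implementation of data mesh, and every name in it (ServiceLevel, DataProduct, Solution, the field names, and the example values) is a hypothetical chosen for illustration. It only shows the idea that each product carries its own service-level terms, and that the solution is judged on what the composition delivers.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ServiceLevel:
    """Hypothetical SLA terms; a real agreement would cover far more."""
    max_staleness_hours: int
    min_completeness: float  # fraction between 0.0 and 1.0


@dataclass(frozen=True)
class DataProduct:
    """A domain-owned, independently consumable component (illustrative)."""
    name: str
    domain: str
    output_fields: frozenset
    sla: ServiceLevel


@dataclass
class Solution:
    """An insight solution composed from several data products."""
    name: str
    required_fields: frozenset
    products: list = field(default_factory=list)

    def missing_fields(self):
        """Fields the solution needs that no composed product supplies."""
        supplied = frozenset().union(*(p.output_fields for p in self.products))
        return self.required_fields - supplied

    def effective_staleness_hours(self):
        """The composition is only as fresh as its stalest product."""
        return max((p.sla.max_staleness_hours for p in self.products), default=0)


# Assumed example data: two products owned by two different domains.
orders = DataProduct(
    "orders", "sales",
    frozenset({"order_id", "customer_id", "amount"}),
    ServiceLevel(max_staleness_hours=1, min_completeness=0.99),
)
customers = DataProduct(
    "customers", "crm",
    frozenset({"customer_id", "segment"}),
    ServiceLevel(max_staleness_hours=24, min_completeness=0.95),
)
churn = Solution(
    "churn_insight",
    frozenset({"customer_id", "amount", "segment", "support_tickets"}),
    [orders, customers],
)

print(sorted(churn.missing_fields()))     # ['support_tickets']
print(churn.effective_staleness_hours())  # 24
```

Neither product is the churn solution on its own. The gap (no product supplies support_tickets) and the 24-hour freshness ceiling only become visible at the composition level, which is where the technical implementation questions show up.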
The community is waking up to this. Early presentations on the definition of data mesh concentrated on the domain-oriented and data governance principles. Adding data as a product (what we produce) and self-service (how we work) is what moves data mesh from academic to pragmatic and lets organizations realize a return on data.