Data Mesh In 2023 And Beyond
I spoke with the chief architect at a food service wholesaler and distributor about data mesh the other day. He is preparing a data strategy that helps the organization modernize its approach to data and data science. His challenge is the decentralized nature of managing and governing data in a holding company of several other companies and divisions, so of course he is interested in data mesh in 2023. But he is left floundering. Technology companies, system integrators, service providers, and industry analysts all have conflicting opinions and recommendations on what to do and what to buy.
The conversation in the market ranges from “Data fabric is going to devour data mesh” or “Data mesh is going to go away” to “Data mesh is about the four principles of domain ownership, federated computational governance, self-service, and data as a product.”
What’s a leader like him supposed to do? Let’s look into the crystal ball for 2023.
- Data mesh is not going away anytime soon. Data mesh forces organizations to think of data from the use case and business value down to the data, not the other way around. As firms mature and accelerate digital and AI investments, they will focus more on business-value-driven data product creation. Domain ownership defines the context of the data product domain. Self-service will continue to grow. And new data and AI regulations expand, federate, and decentralize governance.
- Federated computational governance is the 2023 weak link. Decentralization brings a maverick quality to building data products. While that might work for a “Top Gun” pilot, breaking things due to lack of risk assessment and management frameworks for data and data science is a recipe to crash and burn. The challenge is to execute data governance well within decentralized teams (pods) while flying in formation with global and business areas’ policies and controls. Federation needs coordination supported by peer review.
- Domain ownership shape-shifts. Organizations will realize about six months to a year into their data mesh journey that ownership is tricky with decentralization and a matrix of business and technology roles. Instead of single-point ownership, ownership is granular and distributed. Domains will change based on multiple owners for the scope and data model, the governance and standards, context and utilization, and the sources. Ownership of data products is shared across creators and stakeholders of the product itself, pipelines, APIs, data sets, lifecycle, and performance.
- Data mesh pivots toward real-time use cases. Business intelligence and analytics is the most common starting point for data mesh. Technical, stewardship, and analyst resources already exist there, and they will benefit from an approach that helps the organization become an insight-driven business. These roles already have a baseline of data literacy in the four principles (ownership, governance, self-service, data products). Insight on a screen (analysis, dashboards, reports, visualizations) is too slow for today’s decision-making timeline, however. Intelligence needs to be embedded into digital experiences, process automation, and partner experiences to have a significant, tangible, and measurable impact on outcomes. Thus, projects and initiatives focused on operational, real-time, and edge use cases will shift data mesh efforts deeper into business and application development.
- Data fabric technology finally gets a contextual distributed orchestration layer. Early data mesh papers leaned hard into microservice architecture (service mesh) and event-driven architecture (event mesh). Both architectures run a flurry of services, messages, and streams across a distributed ecosystem, especially when AI is introduced. As such, pipelines today contain hundreds if not thousands of queries and schemas (topic areas). And the data layer and application layer, while decoupled, are natively coordinated and orchestrated through pub/sub (publish-subscribe). Technically, data mesh is the orchestration layer. As a product, data mesh is what ensures that the technology serves and tunes data and insights for the consumer, unique point-in-time value, and the best outcome.
- Data mesh as technology is only going to get more confusing. A new concept that is buzzworthy and of interest to customers gets product marketing teams at data management software vendors all excited. Data mesh in 2023 is that bright shiny object. Except that millions of dollars have already been spent on naming data technology as data fabric and selling data fabric as a solution. In fact, Forrester has conducted evaluations on data fabric for nearly two decades. At the same time, data mesh believers voice strong opinions that it is not technology. They see defining data mesh as technology as interfering with the principles that define a practice to monetize data, which is why chief architects are confused. Based on conversations with research and development teams and startups, I do see a future in which data mesh is a technology you buy. But for 2023 and the next several years, we will continue to see conflicts in definition, message, technology, and value.
What’s a chief architect, or any data leader, supposed to do? Remember that each of the four principles of data mesh is, on its own, an area where you have already made investments and built competency and experience. What data mesh does is provide a framework to unify the principles and practices as a standard operating model to make decentralization easier and more effective. Keep in mind, however, that as business stakeholders and operations teams expand design thinking practices for digital, AI, and edge, their requirements have downstream impacts on what data and intelligence capabilities you build and how you build them. In the end, ignore the marketing and messaging right now. Take steps in data mesh that address partners, practices, and platforms equally to design data for the customer experience, value, and outcomes expected.