Data Mesh Architecture: Decentralised Data Ownership at Scale
Abstract
This paper presents implementation strategies for data mesh principles in large enterprises, addressing domain ownership and federated computational governance. Our research demonstrates how organisations can transition from centralised data platforms to distributed, domain-oriented data ownership models while maintaining data quality, security, and analytical capabilities.
Introduction
Modern enterprises are increasingly recognising data as a strategic asset, yet they often struggle with monolithic data architectures that centralise data ownership and create bottlenecks. Data Mesh has emerged as a paradigm shift in data architecture and governance, addressing these challenges by decentralising data ownership and treating data as a product. Coined by Zhamak Dehghani in 2019, Data Mesh is defined as “an analytical data architecture and operating model where data is treated as a product and owned by the teams that most intimately know and consume the data” (Thoughtworks, 2023). In essence, a Data Mesh organises data by business domains rather than technology silos, enabling domain teams to manage, serve, and govern their data autonomously at scale. This paper explores implementation strategies for Data Mesh principles in large enterprises, with a focus on domain ownership and federated computational governance, and discusses how this approach can be applied in business and government contexts. It draws on industry examples and case studies (e.g., mining, public sector) to illustrate best practices and governance considerations for a successful Data Mesh.
Data Mesh Principles in Large Enterprises
A Data Mesh rests on four core principles: domain-oriented decentralised data ownership, data as a product, self-serve data infrastructure, and federated computational governance (Dehghani, 2020). These principles collectively reimagine both the technical architecture and organisational model for enterprise data. In a traditional centralised data lake or warehouse approach, a single data team owns pipeline development, schemas, and access control for the entire company. This often leads to scalability issues: as data sources proliferate and use cases diversify, a centralised team becomes a bottleneck, slowing time to insight and stifling innovation. By contrast, under a Data Mesh, each business domain (e.g., marketing, finance, operations) takes ownership of its data pipelines and datasets, producing “data products” that can be shared and used across the organisation (Caserta et al., 2023).
Domain ownership means those who know the data best, typically domain-aligned teams, are responsible for its quality, accuracy, and availability. They design data products with the consumer in mind, much as software teams build APIs for reuse. Each data product is a curated, trustworthy dataset (or data service) with clear metadata, documentation, and lineage. Critically, these domain teams operate within a self-serve data platform that provides common infrastructure: for example, standardised data storage, processing engines, discovery tools, and pipelines. This platform, often built using modern cloud-based technologies, acts as an “invisible backbone” enabling domains to publish and consume data products easily without needing to build everything from scratch.
The data lakehouse paradigm, which combines the flexibility and scale of data lakes with data warehouse-like management and governance, can serve as a technological foundation for a self-service platform (IBM, 2023). For instance, a data lakehouse environment might use cloud storage with table formats that support ACID transactions and fine-grained access control, ensuring that domain teams can reliably share data. Such an architecture was applied in a government setting at Tourism Tasmania, where a data lakehouse integrated disparate data sources (including duplicate datasets) into one unified platform. This system, built on enterprise-grade infrastructure, runs autonomously and securely, demonstrating how a robust central platform can empower different departments to share data accurately and safely, a stepping stone toward a full Data Mesh approach in a public-sector organisation.
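To make the platform layer concrete, the sketch below shows how a domain team might publish a curated dataset as an ACID table on a Spark-based lakehouse. It is a minimal illustration only: it assumes a Spark cluster with the Delta Lake extension available, and the bucket path, database, table, and column names are all hypothetical.

```python
from pyspark.sql import SparkSession

# Minimal sketch (assumed environment: Spark with Delta Lake installed).
spark = (
    SparkSession.builder
    .appName("visitor-stats-product")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

# Hypothetical raw source owned by the domain team.
raw = spark.read.json("s3://tourism-raw/visitor-events/")

# Curate the data product: deduplicate and expose only documented columns.
curated = (raw.dropDuplicates(["visit_id"])
              .select("visit_id", "region", "visit_date"))

# Publish as a Delta table so consumers see ACID snapshots, never partial writes.
spark.sql("CREATE DATABASE IF NOT EXISTS marketing")
curated.write.format("delta").mode("overwrite").saveAsTable("marketing.visitor_stats")
```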
Domain-Oriented Data Ownership and Data as a Product
Placing data ownership in the hands of domain teams is a fundamental tenet of Data Mesh. In practice, this involves reorganising how data responsibilities are assigned. Instead of a single data engineering department handling all data pipelines, each domain (such as sales, supply chain, or customer experience) has a cross-functional data team embedded within it or aligned to it. These teams are responsible for publishing “data products” that serve both their own domain’s needs and wider enterprise analytics. Treating data as a product implies several best practices: data products should be discoverable, addressable, trustworthy, self-describing, interoperable, and secure, essentially, fit for consumption by others. Just as product teams consider user experience, data product teams must consider the “data consumer experience,” ensuring documentation, metadata, and quality metrics are available so that other teams (or even external partners) can easily find and use the data.
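As one way to picture what “self-describing” and “addressable” mean in practice, the sketch below models a data product descriptor as a small Python structure. The field names are illustrative assumptions chosen to mirror the properties listed above, not a published standard.

```python
from dataclasses import dataclass, field

# Illustrative descriptor for a data product; field names are assumptions.
@dataclass
class DataProductDescriptor:
    name: str          # discoverable: human-readable name in the catalogue
    address: str       # addressable: stable URI where consumers read the data
    owner: str         # accountable domain team or data product owner
    description: str   # self-describing: what the data means and how to use it
    schema_ref: str    # interoperable: link to the published schema
    quality_metrics: dict = field(default_factory=dict)  # trustworthy: e.g. freshness
    classification: str = "internal"  # secure: drives platform access policies

visitor_stats = DataProductDescriptor(
    name="visitor_stats",
    address="mesh://marketing/visitor_stats",
    owner="marketing-data-team",
    description="Daily visitor counts by region, deduplicated at visit level.",
    schema_ref="catalogue/schemas/visitor_stats/v2.json",
    quality_metrics={"freshness_hours": 24, "completeness_pct": 99.5},
)
```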
Domain-oriented data ownership can significantly improve agility and scalability of data initiatives. Business units no longer wait in a long queue for a central team to deliver reports or new data pipelines; they can iterate on their data products quickly in response to changing needs. This was evidenced in a large mining company that shifted to a Data Mesh approach: it had hundreds of siloed operational databases and suffered months-long development cycles for analytics. After empowering domains to own and publish their data, the company was able to develop new analytics use cases seven times faster than before, with dramatically reduced data engineering effort and improved data reusability (Caserta et al., 2023). Similarly, other large organisations have reported faster time-to-market for analytics and increased innovation when domain experts (who best understand the data’s context) take ownership. By engaging directly with their data, domain teams also raise their “data IQ” over time, becoming more savvy about data management and analytics (Caserta et al., 2023). This cultural shift, getting business units “skin in the game” with data, can drive a more data-driven decision culture across the enterprise.
However, adopting domain ownership is not without challenges. Many organisations discover that simply declaring domains in charge does not automatically resolve issues like duplicate data, inconsistent definitions, or a lack of documentation. A high degree of coordination is still needed. For example, domains must agree on certain standards (common data definitions, master data identifiers, event timelines, etc.) so that their data products can interoperate. It’s common to find gaps, such as missing taxonomies or undefined data lineage, when first attempting domain-centric sharing (Caserta et al., 2023). Therefore, moving to this model often requires substantial data governance support and change management (more on that below). Moreover, not every domain will immediately have the skills to manage data pipelines or apply software engineering practices to data. Upskilling and possibly staffing with data engineers or analytics engineers in each domain team is necessary. Thoughtworks practitioners advise organisations to “start with the operating model” on day one, rather than treating Data Mesh as purely a technology project (Beyer, 2024). In other words, the success of domain ownership hinges on restructuring teams, roles, and incentives, aligning them with the product mindset and providing the necessary training and resources.
Federated Computational Governance: Balancing Autonomy with Standards
Decentralising data ownership does not mean a free-for-all. On the contrary, effective governance becomes even more crucial in a distributed model. Data Mesh employs a federated computational governance approach, which can be described as a “hub-and-spoke” model for data management (Caserta et al., 2023). In this model, a small central data governance team (the hub) is responsible for defining global policies and standards, such as security controls, privacy compliance, data quality criteria, and interoperability formats, while the domain teams (the spokes) are responsible for implementing and adhering to these standards within their own data products. The term “computational” implies that these governance rules are not just written in policy documents, but are embedded in the platform and automated via code. For example, standard definitions for metadata and data classification can be enforced through the data platform, so any new data product automatically includes required tags or documentation before it is published (Caserta et al., 2023). Similarly, access controls and privacy rules (like GDPR compliance) can be centrally defined but applied programmatically whenever a domain team exposes a dataset, ensuring consistency across the mesh.
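A hedged sketch of what such computational enforcement could look like is below: a publish gate that rejects any data product missing the globally required metadata. The required-field list and the descriptor shape are illustrative assumptions, not a reference implementation.

```python
# Governance as code: a publish gate every domain's pipeline would call.
# Required fields and descriptor shape are assumptions for illustration.
REQUIRED_FIELDS = ("owner", "description", "classification", "schema_ref")

def enforce_publish_policy(descriptor: dict) -> None:
    missing = [f for f in REQUIRED_FIELDS if not descriptor.get(f)]
    if missing:
        raise ValueError(f"cannot publish: missing required metadata {missing}")
    # Example of a central privacy rule applied uniformly across all domains.
    if descriptor.get("classification") == "pii" and not descriptor.get("retention_days"):
        raise ValueError("PII data products must declare a retention period")

enforce_publish_policy({
    "owner": "marketing-data-team",
    "description": "Daily visitor counts by region.",
    "classification": "internal",
    "schema_ref": "catalogue/schemas/visitor_stats/v2.json",
})  # passes; omitting any required field would raise before publication
```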
This federated approach allows for “shared responsibility”: certain aspects of governance remain centralised for consistency, while others are delegated to domains for flexibility. Global standards might cover areas such as authentication/authorisation (to ensure the “right people have access to the right data”), regulatory compliance, and enterprise-wide data definitions (Tartow & Mott, 2022). Domain-level governance, on the other hand, lets each domain decide how to meet those standards in practice. For instance, every domain must measure data quality, but the specific metrics and thresholds may differ between a finance domain and a marketing domain based on their context. As Tartow and Mott (2022) note, some governance elements like data security and compliance must be enforced mesh-wide, whereas things like data quality definitions or internal transformations are largely left to domain discretion. The “computational” element comes from automating these controls: using infrastructure-as-code, CI/CD pipelines, and data pipeline frameworks to bake governance checks into the data product lifecycle. This might involve automated schema validation, lineage capture, or anomaly detection scripts that every domain’s pipelines include by default. Automation is key to governance at scale; manual processes would not keep up with the speed and autonomy of dozens of domain teams deploying data changes daily.
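For instance, an automated schema check of the kind just described might run in every domain’s CI pipeline. The sketch below flags backward-incompatible changes; representing a schema as a simple column-to-type mapping is a deliberate simplification.

```python
# CI-style schema compatibility check: new versions may add columns, but
# dropping or retyping columns that consumers rely on fails the build.
def check_backward_compatible(old_schema: dict, new_schema: dict) -> list[str]:
    violations = []
    for column, dtype in old_schema.items():
        if column not in new_schema:
            violations.append(f"column '{column}' was removed")
        elif new_schema[column] != dtype:
            violations.append(
                f"column '{column}' changed type {dtype} -> {new_schema[column]}")
    return violations

old = {"visit_id": "string", "region": "string", "visit_date": "date"}
new = {"visit_id": "string", "region": "string"}  # visit_date dropped
assert check_backward_compatible(old, new) == ["column 'visit_date' was removed"]
```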
A practical example of federated governance in action can be seen in the Greater Manchester Combined Authority (GMCA) in the UK. GMCA implemented a Data Mesh to integrate health and social care data across multiple local organisations (Say, 2022). They established a central data catalogue and standards hub that pushes out a common data vocabulary (so everyone uses the same language for key concepts), while each participating agency maintains its own data in situ. The data never all flows into a single warehouse; instead, domains (local authorities, healthcare providers, police, etc.) expose data through the mesh with role-based access controls. The central “hub” enforces that, for example, all data shared must conform to standards like clinical terminologies (SNOMED) or interoperability formats (FHIR) in healthcare (Say, 2022). Meanwhile, each domain keeps ownership of its source data and controls who can query it. This hub-and-spoke model allowed GMCA to minimise data movement (a key privacy and efficiency concern in government) while still sharing “one version of the truth” across agencies. It illustrates how federated governance, supported by technology, can enable compliance and consistency (critical in a government setting) without reverting to total centralisation.
Implementation Strategies for Data Mesh at Scale
Transitioning to a Data Mesh in a large enterprise is a journey that involves technology, process, and cultural transformation. Experts stress that leadership and organisational buy-in are paramount. A common anti-pattern has been trying to implement Data Mesh in a bottom-up fashion, for example, one enthusiastic domain team building a few data products on their own, without broader executive support (Beyer, 2024). This often hits a wall when attempting to scale beyond the initial team, as other departments may resist changing their ways of working or sharing their data. Securing top-down buy-in (C-level and business unit leaders) creates a mandate for cross-departmental collaboration and resource allocation to support the mesh initiative (Beyer, 2024). In practice, many companies set up a Data Mesh program or transformation office to coordinate efforts, including representatives from business domains and central IT/data teams. This ensures that operating model changes (like new roles, governance bodies, and funding models for domain data work) are designed intentionally rather than ad hoc.
Starting with the operating model means clearly defining domains, roles, and responsibilities early. Organisations should identify the key domains that will produce data products and appoint data product owners or domain data leads for each. These individuals interface between the domain and the central governance bodies. It is also crucial to define how funding works: domain teams will need budget and people to handle their data products, and some companies have shifted to a model where business units fund their data capabilities, with central IT providing the platform as a service. Training and upskilling are another critical piece: domain teams might need education on data engineering best practices, product thinking, and how to use the self-serve platform tools. Establishing communities of practice or “data guilds” across domain teams can help share knowledge and maintain alignment on standards.
From a technology standpoint, implementing Data Mesh does not require one specific tool or vendor, but rather a set of capabilities. In many cases, an existing data lake or cloud data warehouse is repurposed into part of a self-serve platform. For example, an organisation may already have a data lakehouse (on AWS, Azure, or another cloud), which can be extended with data catalogues, governance tooling, and APIs to allow domain teams to publish and find data. The platform team, typically a central data engineering group, should focus on providing easy-to-use infrastructure: things like template pipelines, unified authentication, data lineage tracking, and a portal where data products are catalogued. This is analogous to an internal developer platform, but for data producers and consumers. A sound strategy is to pilot with a couple of domains first, building a few high-value data products end-to-end on the new platform and governance framework. This pilot can demonstrate quick wins (e.g., significantly faster delivery of an analytics project) and flush out any technical or process kinks on a small scale before broader rollout.
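To illustrate the template-pipeline idea, here is a minimal sketch in which the platform supplies the cross-cutting step (lineage capture) and a domain team contributes only its transform logic. All names are hypothetical, and the print statement stands in for a call to a real lineage service.

```python
from typing import Callable

def record_lineage(product: str, source: str) -> None:
    # Stand-in for the platform's lineage-tracking API (hypothetical).
    print(f"lineage: {source} -> {product}")

def run_pipeline(product: str, source: str,
                 extract: Callable[[str], list[dict]],
                 transform: Callable[[list[dict]], list[dict]]) -> list[dict]:
    # Platform template: extract, transform, then capture lineage by default.
    rows = transform(extract(source))
    record_lineage(product, source)
    return rows

# A domain team's entire contribution: one deduplicating transform.
def dedup(rows: list[dict]) -> list[dict]:
    return list({r["visit_id"]: r for r in rows}.values())

result = run_pipeline(
    "marketing.visitor_stats", "s3://tourism-raw/visits",  # hypothetical names
    extract=lambda src: [{"visit_id": "v1"}, {"visit_id": "v1"}],  # stub reader
    transform=dedup,
)
print(result)  # -> [{'visit_id': 'v1'}]
```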
Crucially, there is no one-size-fits-all adoption path. Every enterprise has a unique context, legacy constraints, regulatory requirements, skill distributions, and culture, which will influence its approach. A recent study of large Swiss organisations found that all of them were adopting Data Mesh principles to some degree, but “in different ways and to a different extent”. Notably, many companies kept certain governance functions centralised (or even increased centralisation there) while decentralising data production responsibilities, largely due to the constraints of regulation in their industries (Winter & Hackl, 2025). This suggests that while the vision of Data Mesh is universal, the implementation must be tailored. Highly regulated sectors (finance, government, healthcare) might implement stricter central governance controls and adopt a gradual decentralisation of ownership. By contrast, a tech company or digital retailer might push more aggressively to full domain autonomy. Contextual factors, such as the degree of existing data maturity, regulatory environment, and organisational structure, will dictate the right balance between central and local control. Leaders should evaluate these factors and possibly use maturity models or frameworks to guide their adoption strategy. Reference models from industry case studies (banking vs. manufacturing, etc.) can provide benchmarks, but ultimately each enterprise’s journey will be unique.
Case Studies and Industry Applications
Although Data Mesh is a relatively new concept, several industries have begun to report success (and lessons learned) from its adoption. In addition to the mining company example mentioned earlier, the financial services sector provides compelling use cases. Dutch bank ABN AMRO, for instance, embraced a data mesh approach on its Azure cloud platform, reorganising data teams around domains like Payments and Mortgages to drive better business decisions (Microsoft, 2022). By treating these domain datasets as products and enabling easier data sharing across the bank, ABN AMRO aimed to increase the speed of analytics development while maintaining strict compliance and security controls. Another financial example is Fannie Mae, a U.S. mortgage finance company, which built a data mesh architecture leveraging cloud data sharing technology to enable self-service analytics (Fannie Mae, 2021). They focused on creating domain-specific data pipelines with clear ownership, all running on a scalable lakehouse platform, to break down long-standing data silos in the organisation. Early indicators from such projects suggest improvements in productivity for analysts and data scientists, as well as more reuse of trusted data across teams.
The public sector stands to benefit significantly from Data Mesh as well, given the notorious silos between government departments. The Greater Manchester case showed a regional authority using Data Mesh principles to integrate data for social care. Similarly, consider a tourism department in a government (such as Tourism Tasmania): traditionally, data about visitors, accommodations, events, and marketing might reside in separate systems managed by different units or even external agencies. By implementing a centralised data lakehouse as a foundation, Tourism Tasmania was able to bring together disparate data sources into one platform, eliminating many duplicate datasets and creating a single source of truth for analytics. Building on that foundation, applying Data Mesh principles would mean empowering each functional domain (e.g., marketing, research, regional offices) to curate and publish their portion of the data (visitor stats, campaign data, operator information) as products in the platform.
Governance in a government scenario is critical: a federated model ensures that sensitive data (such as personally identifiable information about visitors or citizens) is protected via central policies, while still allowing domains (the subject-matter experts in tourism data) to manage their datasets. The Tourism Tasmania case highlights how automation and accuracy can be achieved in a data platform: the system was designed to run autonomously, with data pipelines updating on schedule and validation checks to ensure data quality, reflecting the “computational governance” ideal. Government IT environments often require enterprise-grade security and compliance (e.g., auditing, disaster recovery, strict access control), which were built into the platform from the start. This aligns with Data Mesh’s emphasis that the self-serve infrastructure must handle cross-cutting concerns (security, compliance, etc.) so that domain teams can focus on their data content rather than these technical complexities.
Another industry example is e-commerce/retail, which often has many customer touchpoints and product lines (domains). Retail giant Adidas has publicly described its Data Mesh journey, in which domain teams such as e-commerce, retail stores, and supply chain were each made responsible for their own analytics data under a central governance umbrella (Adidas, 2021). Adidas reported that this shift helped it deliver data insights more quickly to various business functions and fostered a culture of data ownership among business analysts. The life sciences industry also provides an instructive case: McKinsey (Caserta et al., 2023) describes a pharmaceutical company that attempted a Data Mesh implementation. It had the technology in place but struggled with organisational buy-in, as business units initially clashed over which data products and use cases to prioritise or centralise. This led to a mid-stream pause and taught the lesson that, beyond technology, alignment and change management are vital. Eventually, by focusing on cross-domain collaboration and clearly communicating the benefits (e.g., faster research data access for scientists), the company was able to resume and make progress.
These case studies underscore that Data Mesh can be applied in various contexts, from government social services to global banks, but success hinges on tailoring the approach to the organisation’s needs and ensuring robust governance. Many early adopters report positive outcomes such as faster development cycles, increased data reuse, and greater business engagement with data. At the same time, they caution that one must invest in the “soft” aspects (culture, communication, training) and not view Data Mesh as merely deploying a new tech stack.
Challenges and Best Practices in Governance and Implementation
Implementing Data Mesh is not without challenges. Cultural resistance is often a major hurdle: business units may be wary of taking on data responsibilities (“not my job”), and central IT may feel threatened by loss of control. To mitigate this, strong executive sponsorship and clear communication of the vision are needed. Leaders should articulate how Data Mesh will empower domains and benefit the company, while also making it clear that accountability for data quality and value is being distributed as a strategic decision. Many organisations establish a cross-functional governance council or steering committee to oversee the mesh rollout, including domain representatives and central data officers. This body can arbitrate any conflicts (for example, disagreements on standard definitions or priority of shared investments) and keep the program aligned with business goals.
Data governance itself must evolve in a mesh. A best practice is to codify as much as possible: if you require every data product to have an owner, a description, and certain quality metrics, bake those checks into an automated pipeline or a catalogue entry form that will not allow publishing without them. Self-service data catalogue tools are useful here, as they provide a user-friendly way for domain teams to register and document datasets, while allowing enterprise metadata managers to oversee taxonomy and lineage. Additionally, monitoring and observability of data pipelines across the mesh are crucial: the platform team should provide common monitoring dashboards so that issues (data delays, quality drops) in any domain are visible and can trigger alerts. This keeps the mesh healthy and trustworthy.
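As one concrete example of such mesh-wide observability, the sketch below checks every catalogued data product against a declared freshness SLA and reports the stale ones. The catalogue record shape and the SLA field are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone

# Platform-level freshness monitor; the catalogue structure is assumed.
def stale_products(catalogue: list[dict], now: datetime | None = None) -> list[str]:
    now = now or datetime.now(timezone.utc)
    stale = []
    for product in catalogue:
        max_age = timedelta(hours=product["freshness_sla_hours"])
        if now - product["last_updated"] > max_age:
            stale.append(product["name"])
    return stale

catalogue = [
    {"name": "marketing.visitor_stats",
     "last_updated": datetime.now(timezone.utc) - timedelta(hours=30),
     "freshness_sla_hours": 24},  # 30h old against a 24h SLA: stale
]
print(stale_products(catalogue))  # -> ['marketing.visitor_stats']
```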
Another challenge is avoiding duplication and data silos in a different form. If not coordinated, domain teams might inadvertently create overlapping data products or use different sources for the same data, leading to consistency problems. To prevent this, the federated governance group should establish a process for data product lifecycle management: evaluating new data product proposals, encouraging reuse of existing data where possible, and deprecating or consolidating duplicate datasets. In effect, governance in a Data Mesh has to play a balancing role, allowing flexibility while nudging teams towards collaboration. The Greater Manchester example, for instance, shows how a central catalogue and standard definitions helped disparate agencies speak a common language (Say, 2022). Likewise, an enterprise could mandate the use of central reference data (such as a master customer ID) so that each domain’s customer-related data joins correctly with data from other domains.
Security and compliance are non-negotiable areas in enterprise data governance. Federated governance means domain teams must be educated and accountable for following regulations (such as privacy laws) in their domain, but the organisation should provide clear guidelines and automated tools. One best practice is implementing a unified identity and access management system at the platform level, e.g., using single sign-on and role-based access control that spans all data tools, so that access policies set by the central team are uniformly applied. Domains then simply tag data with the appropriate sensitivity level, and the platform handles the rest (masking sensitive fields, restricting access to authorised roles, etc.). This reduces the chance of human error in compliance.
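The sketch below illustrates that tag-driven pattern: the central team defines which classifications are masked for which roles, and the platform applies the policy mechanically at read time. The tag names, roles, and hashing-as-masking scheme are illustrative assumptions, not a specific product’s behaviour.

```python
import hashlib

# Central policy: which roles see masked values for each classification tag.
MASKED_FOR = {"pii": {"analyst", "external"}}

def mask_value(value: str) -> str:
    # Hashing as a stand-in masking scheme; real platforms may tokenise instead.
    return hashlib.sha256(value.encode()).hexdigest()[:12]

def apply_policy(row: dict, column_tags: dict, role: str) -> dict:
    out = {}
    for col, val in row.items():
        roles_masked = MASKED_FOR.get(column_tags.get(col, ""), set())
        out[col] = mask_value(str(val)) if role in roles_masked else val
    return out

# The domain's only job is tagging; the platform enforces the policy.
row = {"visit_id": "v-1001", "email": "jane@example.com"}
tags = {"email": "pii"}
print(apply_policy(row, tags, role="analyst"))  # email comes out hashed
```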
Based on early adopters’ experiences, a few key best practices have emerged:
- Secure executive buy-in and a top-down mandate: This ensures every domain and department understands the importance and urgency of the change (Beyer, 2024).
- Invest in domain data literacy: Provide training, hire or develop data product owners and engineers in each domain, and foster a community to share knowledge.
- Don’t neglect the central data platform: It should be reliable and developer-friendly. If domain teams struggle with tooling, they will revert to old habits. A strong central platform team is an enabler, not a blocker, in a Data Mesh.
- Implement incremental governance: Start with a light but essential set of global policies (security, metadata, key definitions) and add more as the mesh matures. Overly strict rules at the outset can stall progress, whereas too few rules lead to chaos.
- Monitor and measure success: Track metrics such as number of active data products, reuse rates of data across domains, time to deploy new data pipelines, and user satisfaction. This helps demonstrate value to stakeholders and course-correct if needed.
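To make the last point concrete, the sketch below derives two such health metrics from hypothetical catalogue records: the count of active data products and the average number of distinct consuming domains per product, a simple proxy for reuse. The record shape is an assumption.

```python
# Simple mesh health metrics computed from catalogue records (shape assumed).
def mesh_metrics(products: list[dict]) -> dict:
    active = [p for p in products if p.get("status") == "active"]
    reuse = [len(set(p.get("consumer_domains", []))) for p in active]
    return {
        "active_products": len(active),
        "avg_consumers_per_product": sum(reuse) / len(reuse) if reuse else 0.0,
    }

print(mesh_metrics([
    {"status": "active", "consumer_domains": ["finance", "marketing"]},
    {"status": "active", "consumer_domains": ["finance"]},
    {"status": "deprecated", "consumer_domains": []},
]))  # -> {'active_products': 2, 'avg_consumers_per_product': 1.5}
```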
Finally, it is important to emphasise that Data Mesh is not a silver bullet. As Winter and Hackl (2025) found, every large company may interpret and apply Data Mesh differently. Success depends on adapting the principles to one’s context. For some, a partial mesh (with a hybrid centralised-decentralised model) might be optimal. For others, a full organisational restructure is justified. The common thread is that Data Mesh, at its heart, is about making data more accessible, more aligned to business needs, and scaling data management through decentralisation, while avoiding the perils of anarchy through strong, federated governance.
Conclusion
Data Mesh architecture offers a promising framework for enterprises seeking to balance agility and control in their data strategy. By decentralising data ownership to domain teams and treating data as a product, organisations can unlock innovation and speed at scale, enabling business units to access and leverage data more directly. At the same time, federated computational governance ensures that this decentralisation does not compromise on enterprise-wide standards, quality, or security. The implementation of Data Mesh is as much an organisational transformation as a technical one: it requires leadership support, cultural change, and a robust self-service infrastructure. The experiences of early adopters across industries, from banking and e-commerce to government agencies, demonstrate both the potential benefits (faster delivery, greater data value realisation) and the need for careful planning (governance frameworks, operating model redesign). For senior leaders and academics alike, Data Mesh provides a rich arena for exploring how large organisations can become truly data-driven. By embracing domain-oriented thinking and modern governance best practices, enterprises can scale their data capabilities in a sustainable, federated manner. As this paradigm continues to evolve, future research and case studies will no doubt refine how we best implement Data Mesh, but its core vision of decentralised ownership at scale is set to play a key role in the data architectures of tomorrow.
References
Beyer, K. (2024). 10 recommendations for a successful enterprise data mesh implementation. Thoughtworks.
Caserta, J., Dubois, J.-B., Roggendorf, M., Roth, M., & Srinidhi, N. (2023). Demystifying data mesh. McKinsey Digital.
Dehghani, Z. (2020). Data Mesh principles and logical architecture. MartinFowler.com.
IBM. (2023). Data lakehouse vs. data fabric vs. data mesh. IBM Think Blog.
Say, M. (2022). Greater Manchester harnesses Data Mesh for social care. UKAuthority.
Tartow, C., & Mott, A. (2022). Data Mesh: Federated computational governance. Starburst.
Winter, R., & Hackl, T. (2025). Exploring Data Mesh adoption in large organizations. Issues in Informing Science and Information Technology, 22, 011-024.