Explaining Data Mesh So That Even a 5-Year-Old Will Understand
Decentralization is the word that perfectly defines this new digital era. It is defined as,
decentralization
/ˌdiːsɛntrəlʌɪˈzeɪʃn,diːˌsɛntrəlʌɪˈzeɪʃn/
noun
the transfer of control of an activity or organization to several local offices or authorities rather than one single one.
What does it have to do with data management? Well, everything.
Until now, large enterprises have managed data by storing it in one centralized location, ingeniously known as a ‘data lake’. Basically, it is a lake of data into which you send all the valuable information collected through your department-specific processes. However, to retrieve any part of it to study and improve your process, you need to go through a central data team, which is practically the owner of the data.
So, it is like saving your hard-earned money in a bank account, where withdrawing it requires tedious steps.
The problem with this setup was that teams could not derive value from the data they were collecting. The centralized control of the data became a major roadblock.
Enter Data Mesh.
What is Data Mesh exactly?
Data Mesh is a strategic approach to data management in which you decentralize data so that each team handles its own data pipeline. These domain-specific pipelines are linked through a universal interoperability layer that enforces shared policies and syntax standards.
To stretch our analogy, it is like a digital wallet: each of you has direct access to the money saved in your own bank account.
(The ‘5-year-old will understand’ claim probably ends here.)
Four core principles of Data Mesh:
Domain-driven Data Ownership
A particular domain or business function is responsible for managing the data it produces. This involves ingesting, transforming, and provisioning data for end users.
Data-as-a-Product
Because domains own their data, they shape it into something end users can readily consume to drive business value, which is what makes it ‘data-as-a-product’.
Self-Service Data Infrastructure
Although domains own their data pipelines, a platform engineering team provides the underlying infrastructure they need.
Federated Computational Governance
In the Data Mesh approach, data governance becomes an inherent part of domain workflows, so that common standards are applied consistently across every domain.
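For readers who think better in code, here is a minimal sketch of how the four principles could fit together. It is a toy model, not a real data mesh product: the names `DataProduct`, `MeshPlatform`, `has_owner`, and `has_schema` are all hypothetical and exist only for this illustration, and an in-memory dictionary stands in for real infrastructure.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class DataProduct:
    """A dataset published by the domain that owns it (hypothetical shape)."""
    name: str
    owner_domain: str                 # principle 1: the producing domain owns the data
    schema: dict                      # principle 2: a documented contract for consumers
    rows: list = field(default_factory=list)


# Principle 4: each global policy is a function that returns an error message,
# or None when the product complies.
Policy = Callable[[DataProduct], Optional[str]]


def has_owner(product: DataProduct) -> Optional[str]:
    return None if product.owner_domain else "missing owner_domain"


def has_schema(product: DataProduct) -> Optional[str]:
    return None if product.schema else "missing schema"


class MeshPlatform:
    """Principle 3: shared self-service infrastructure run by a platform team."""

    def __init__(self, policies):
        self.policies = list(policies)
        self.catalog = {}

    def publish(self, product: DataProduct) -> list:
        """Register the product if it passes every policy; return any violations."""
        violations = [msg for policy in self.policies if (msg := policy(product))]
        if not violations:
            self.catalog[product.name] = product
        return violations

    def read(self, name: str) -> list:
        """Consumers read straight from the catalog, with no central team in between."""
        return self.catalog[name].rows


platform = MeshPlatform(policies=[has_owner, has_schema])
orders = DataProduct(
    name="orders",
    owner_domain="sales",
    schema={"order_id": "int", "total": "float"},
    rows=[{"order_id": 1, "total": 9.5}],
)
accepted = platform.publish(orders)                          # no violations
rejected = platform.publish(DataProduct("clicks", "", {}))   # fails both policies
```

The point of the design is that the sales team publishes `orders` on its own, while the same policy checks run for every domain. That is roughly what "federated" and "computational" mean here: the rules are agreed on centrally but enforced automatically wherever the data is published.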
Benefits of Data Mesh
Improved data quality: When teams take ownership of the data they use, data quality improves and decisions become more reliable.
Increased agility: With the decentralization of data, dependence on centralized data teams decreases, making the organization more agile and faster at decision-making.
Enhanced collaboration: It makes sharing data across teams easier, facilitating cross-functional collaboration and driving innovation.
Increased transparency: It outlines the expectations for data producers and consumers, helping increase transparency across the enterprise.
Improved data literacy: It encourages teams to upskill so they can make better use of data.
Before you rush to implement it at your organization, it is crucial to understand a few basic things.
Data Mesh is primarily suited to large organizations with multiple spread-out teams that work as sub-entities. If yours is a smaller organization, data mesh might just be added overhead.