Data gravity is a term coined by software engineer Dave McCrory to describe the effect huge datasets can have on the applications and services surrounding them. Just as a planet’s gravity pulls smaller objects towards it, the growing lakes of data every business is amassing pull on the applications that use these massive datasets.
Speaking to Silicon UK, Dave McCrory, Digital Realty’s VP of Growth and Global Head of Insights and Analytics, explained: “Data gravity is the single greatest challenge facing all companies, today and into the foreseeable future. In a world where digital transformation is happening at an unprecedented pace, data has become the lifeblood of modern economies, providing businesses with valuable insights to drive growth, innovation, and competitive advantage. However, data is very much a double-edged sword. Its sheer volume and complexity can weigh businesses down, making it difficult for them to extract valuable insights from it – the phenomenon we know today as data gravity. This is only going to get worse as the digital economy continues to grow and thrive, and as society’s reliance on digital services and devices deepens.”
McCrory continued, “Data gravity is an obstacle, but it’s not insurmountable. To solve it, we must start by looking at our underlying digital infrastructure and ensuring it’s fit for purpose. Data centres are the central nervous system of our economy, providing the foundations on which the digital economy is built and continues to grow. As data gravity becomes more prominent, so does data centres’ role. Data centres provide businesses with the flexible infrastructure and resources to store and manage vast amounts of data effectively, from access to hybrid cloud solutions to high-speed networks and scalable storage solutions. With this, data centres can help businesses navigate the challenges posed by data gravity while enabling them to better leverage their data to take advantage of emerging technologies like artificial intelligence and machine learning.”
Think about the last time you performed a search on a dataset. The larger the dataset, the longer this will take. You are experiencing the effect of data gravity – slowing your ability to work efficiently. Data gravity can also increase latency, which is critical for some applications, in communications and autonomous vehicles for example. Any lag can have a cascading effect on the end user. To reduce these effects, edge computing moves processing closer to the data’s users, easing the burden of managing large datasets centrally. IDC predicts the Global Datasphere will grow from 33 Zettabytes (ZB) in 2018 to 175 ZB by 2025.
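As a rough check on the IDC figures above, the implied compound annual growth rate can be worked out in a few lines. This is a minimal sketch: the function name `cagr` is illustrative, and the only inputs are the two forecast points quoted in the text.

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate, returned as a fraction (0.25 means 25%)."""
    return (end / start) ** (1 / years) - 1


# IDC figures quoted above: 33 ZB in 2018, 175 ZB forecast for 2025.
rate = cagr(33, 175, 2025 - 2018)
print(f"Implied annual growth: {rate:.1%}")
```

On these figures, the Datasphere would need to grow by roughly 27% every year across the seven-year period.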
Seagate’s Chief Executive Officer Dave Mosley says: “We are at the beginning of an era where both data creation and data utilisation are forecasted to grow rapidly over the next decade. While some industries are more prepared for digital transformation than others, all businesses need to be ready to act on a solid digital strategy to be successful in the data age.”
And the quantities of data that businesses will collect are set to explode once again as IoT (Internet of Things) and AI add to the vast data lakes already in existence, potentially expanding their gravitational pull on associated applications and services. The cost of ever-expanding data storage capacity must also be considered.
“Consider this: you’re trying to distribute your AI and Machine Learning applications across multiple locations or cloud platforms or even to edge computing environments,” Peter Greiff, Data Architect Leader EMEA at DataStax, explained to Silicon UK. “In a traditional set-up, these applications rely heavily on your centralised Data Warehouse (DWH or gravity centre) to function around data. Moving these applications away from the DWH, where the bulk of their required data resides, presents complexity.
“For example, if you attempt to deploy your real-time fraud detection system in a different location or cloud than your DWH, the complexity of reconfiguring the systems to access and process all that necessary data effectively could skyrocket. This is a clear manifestation of data gravity. The system is being ‘pulled’ towards the large, relevant datasets in the DWH it was originally designed to interact with, rather than using data where the applications are.”
All businesses need to be agile. The data they use to make strategic decisions must be manageable and accessible. There is also the issue of security, which data gravity can impact. “At the moment, a common concern amongst customers and businesses is data security and privacy,” explains Ian Cowley, Head of Data Engineering at Ensono. “Data gravity impacts data security and privacy as soon as a company begins to scale. At a larger scale, there is a need for greater control over how and where data is being used. Data gravity only becomes a problem if the usual data security and privacy best practices aren’t already being adhered to – but heavier data can amplify potential security issues if they aren’t.
“There are ways to navigate these concerns, such as building your systems on the cloud where possible. In addition, cloud services are much easier to unpick than on-premises solutions, so if and when it does come to unpicking a complex web of dependencies from a heavy dataset, being able to do so on the cloud can speed up the process.”
Cowley concludes: “Conway’s Law states that organisations will subconsciously design systems that mirror how they communicate and interact internally. This is a truism that we see borne out time and again with data in organisations. Larger organisations and their legacy systems are far more likely to struggle with data gravity compared to the agile, product or domain-focused approach of VC-funded smaller firms.”
Data gravity is a fact of business life that no enterprise can escape. However, data gravity’s effects on your company can be managed with a detailed strategic approach to data resources and the IT architecture in place.
The hybrid cloud has become an essential business resource. In the context of data gravity, hybrid cloud services have become the norm as enterprises embrace these services. However, the data architecture must be flexible enough to manage the quantities of data at rest and in motion.
As businesses see the cloud as an infinite data storage service, the potential effects of data gravity must be considered. Ensono Digital’s Ian Cowley outlined how data gravity and data migration must be managed in tandem: “Data gravity is a key issue to manage when a company is going through a transformation or migration programme. There is, unfortunately, no silver bullet. Big bang transformations are rarely successful, and often require huge upfront investment if an organisation – for example – is looking for a rapid migration of its systems from mainframe to the cloud. It can often be more effective to approach it progressively, moving across applications bit by bit. This can lead to situations where data might be in two places at once for short periods, but longer term it’s a more cost-effective way to manage the complexity.”
The Data Gravity Index published by Digital Realty states: “Data Gravity forces a new architecture, one that inverts traffic flow and brings users, networks and clouds to privately hosted enterprise data. With this new architecture, Data Gravity barriers are removed, and new capabilities are unlocked.”
“The real impact of data gravity lies in their decision-making capabilities,” concludes DataStax’s Peter Greiff. “If they cannot migrate the data easily, they may not be able to take full advantage of superior analytics tools or AI services offered by the new provider. Consequently, their ability to draw rich insights from the data could be compromised, leading to a weaker decision-making base. As a result, data gravity can significantly affect not only IT decisions but also the strategic decision-making capacity of a company.”
All data analysts agree that data gravity can’t be removed from any business, but its effects can be managed. A holistic approach is needed to consider an enterprise’s entire data estate to identify where the results of data gravity are being felt. Then a data gravity strategy can be created with data and IT infrastructure specialists. All businesses are now data businesses. How each enterprise manages its data is now of commercial strategic importance.
Anoop is an industry thought leader and public speaker in the Data & AI space with 20+ years of experience in conceptualizing, designing & delivering complex analytics programs across Europe, Nordics, and USA. He is based in Oslo, Norway, and heads the data & analytics practice in Europe, Nordics and Africa at LTIMINDTREE. He is an expert in business outcomes using data and AI and helps large enterprises to scale AI and monetise data.
“The growing importance of data in recent years has led to the concept of ‘data gravity’ – the idea that, as datasets become ever larger across every aspect of an organisation, they begin to slow down processes and access to insights, open the organisation up to new security concerns, result in poor customer experience and limit the company’s ability to scale digital businesses to new regions or users. A good example is the interconnectivity of data – as more data is accumulated, it becomes increasingly interconnected. This makes it difficult to move components without impacting other parts of the system. Data gravity is an increasingly difficult challenge for organisations, and managing it effectively requires careful planning and risk management.”
“A good example of how data gravity can limit a company’s options and influences is in the process of mergers and acquisitions. The organisations may need to combine their data into a single system, but if one firm has more data than the other, the weight, or gravity, of that data may make it difficult to merge the systems. If one firm has a detailed, complex database system it may be costly to migrate all that data to the system used by the other company, which may influence the decision-making process of the newly formed organisation – they may opt to select the system with the larger dataset to reduce disruption.”
“The pandemic expedited the rate of digital transformation as organisations sought to remain agile in the face of rapid economic and societal changes. However, a key example of data gravity could be if an organisation struggles to adopt new technologies or respond to changing business needs promptly and efficiently due to data location constraints. The underlying reasons could be high data volumes with complex interdependencies, heavy reliance on a specific vendor, or the high cost and complexity of data transfer. Therefore, before beginning any widespread digital transformation, it is wise to evaluate your data management strategy, including data location, increasing your flexibility and chances of enjoying a successful change.”
“Data gravity is not all bad – in fact, there are numerous ways in which an organisation can benefit from large data sets via interconnectivity. Enterprises in every sector can benefit by investing in data analytics, allowing them to extract valuable insights from their accumulated data. One of our leading European retail clients, for example, is leveraging the power of their interconnected data to have a clear view of consumer behaviour, helping them to make decisions related to inventory and boosting customer satisfaction. For a leading Asian bank client with millions of customers, we have enabled hyper-personalised customer engagement and campaigns using several machine learning algorithms on underlying data.”
“The concentration of a large amount of data in one location can increase the risk of security breaches and data loss. Any breach could lead to the disclosure of customer information, resulting in heavy financial penalties for the organisation and a long-standing impact on reputation. As data becomes more concentrated, security leaders need to have a clear understanding of its structure to ensure they can determine the required access controls, data encryption, monitoring and detection tools, and incident response plans.”
“One of the biggest reasons behind the increase in data gravity in recent years has been the IoT. The massive growth of connected devices and sensors has generated huge data stores which need to be managed and processed. This is only set to increase in the next decade as devices evolve and the workforce continues to operate on a hybrid basis. We will also see an increase in the use of generative AI across multiple different sectors as organisations develop innovative ways of harnessing the power of this exciting technology. This, in turn, will necessitate the storage and manipulation of large volumes of data required to run these applications.”
“Cloud providers can play a crucial role in helping organisations to manage data gravity by offering flexible and scalable distributed cloud architecture approaches. The distribution of cloud resources across geographic locations allows organisations to manage data closer to where it is being used, thereby reducing latency, and improving performance. Cloud providers also offer data tiering based on the frequency of required access, allowing organisations to increase cost efficiencies as well as comply with regulatory requirements.
“For example, one of our banking customers relies on cloud-based distributed architecture to store and process its vast amounts of data spread across multiple countries and LOBs, allowing it to offer digitalised banking services with personalised recommendations to millions of users. Data from different locations can then be accessed via a ‘data marketplace’ allowing Amazonification of data assets with easy search. A global consumer goods client is using a data marketplace solution, allowing its diverse teams, spread across brands and regions, to seamlessly discover and consume data products through a single platform, allowing collaboration, automation and re-use of existing capabilities.”
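The access-frequency tiering mentioned in the quote above can be illustrated with a short sketch. It is not any cloud provider’s API: the tier names, the 30-day and 180-day thresholds and the function `choose_tier` are all assumptions made for illustration, and real services define their own tiers and rules.

```python
from datetime import date

# Hypothetical thresholds, in days since a dataset was last read.
HOT_MAX_DAYS = 30
WARM_MAX_DAYS = 180


def choose_tier(last_accessed: date, today: date) -> str:
    """Pick a storage tier from how recently the data was last accessed."""
    age_days = (today - last_accessed).days
    if age_days <= HOT_MAX_DAYS:
        return "hot"   # frequently used: keep on fast, more expensive storage
    if age_days <= WARM_MAX_DAYS:
        return "warm"  # occasionally used: cheaper, slightly slower storage
    return "cold"      # rarely used: archive storage


today = date(2024, 6, 30)
for name, last_read in [
    ("fraud_scores", date(2024, 6, 20)),
    ("q1_reports", date(2024, 3, 1)),
    ("logs_2021", date(2021, 12, 31)),
]:
    print(name, "->", choose_tier(last_read, today))
```

Keeping only recently used data on the fastest storage is one way an organisation might trade access speed against cost as its datasets grow.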
“Any industry that generates and relies heavily on large amounts of data is susceptible to the effects of data gravity. It is particularly the case in regulated sectors such as financial services, in which banks and insurance organisations generate large amounts of transactional data which need to be analysed to manage risk and detect fraud. Healthcare organisations must process and store sensitive patient data. Both sectors are subject to stringent regulatory requirements, so it is vital that organisations counteract the potential impact of data gravity by ensuring stringent security measures are implemented. With the emergence of Industry 4.0, the manufacturing industry also generates and relies heavily on data from sensors and production systems, including machine data, quality data, and supply chain data.”
“It is often wrongly assumed that data gravity can be managed by keeping it all in one location. For many organisations that is simply not possible. You should work with your cloud provider to adopt a hybrid cloud strategy allowing data to move between locations. Furthermore, data gravity is not just a storage issue. It also impacts processing and analytics – factors of critical importance for organisations of all sizes.”
“The future of data gravity is likely to be shaped by several trends and developments, which will have significant impacts on organisations and industries. These include emerging technologies such as 5G, which will enable faster data transfer and processing, while edge computing could enable organisations to process data closer to where it is generated. Overall, the future of data gravity is likely to be characterised by a complex and rapidly evolving data landscape, in which organisations will need to be agile and adaptable to succeed. Those that can effectively manage and secure their data will be well-positioned to thrive in this environment, while those that fail to do so may struggle to keep up with their competitors.”