Making Sense of Big Data

Why Collect Data?

Data is your business’s most precious commodity. Finding value and actionable insight in Big Data is a vital component of all successful enterprises.

Microsoft President Brad Smith recently stated, “We will begin this next decade with 25 times as much digital data on this planet as we had in the year 2010.” Big Data is now a fact of life for all businesses.

According to research from Qubole, 44% of businesses now have data lakes in excess of 100TB. Information on this scale provides a number of opportunities: Customer insights are a crucial product when large datasets are used to support a wide range of services.

How businesses utilise the information they have to inform innovation, support customer service and expand their market reach is a key driver across all sectors and industries.

“In our borderless world, data are the coins of the realm,” says McKinsey. “Competing effectively means both collecting large amounts of data and developing capabilities for storing, processing, and translating the data into actionable business insights. A critical goal for most companies is data diversity – achieved, in part, through partnerships – which will enable you to pursue ever-finer microsegmentation and create more value in more ecosystems.”

Using data to offer highly focused personalised services is a crucial use of Big Data, as Michael Glenn, Market Intelligence Analyst at Exasol, explained to Silicon: “Finance is the standout sector in how they use data analytics for activities. Banks have reached incredible levels of granular personalisation.”

Glenn continued: “They can explore customer demographics, credit card statements, transactions and point of sale data, online and mobile transfers and payments and credit bureau data. This enables banks to discover similarities that lead them to define tens of thousands of micro-segmentations in their customer base and to build ‘next product to purchase’ models that increase sales and customer retention.”

Using Big Data to reveal the small data about customers and their behaviour is a core component of Big Data analytics. Once the large datasets that exist today are appropriately managed and integrated, robust analytical engines can show your business value that has until now been hidden from view.

Seeing patterns in Big Data

Once data is collected, analysing these large datasets becomes the challenge businesses face. Indeed, the Qubole data shows that not enough dedicated tools are available to find meaning in the information present. Analytical systems are needed, but so is the talent to use these tools effectively. Big Data skills are clearly lacking, with 83% of respondents to the Qubole survey stating they find it difficult to recruit into these roles.

Big Data and the cloud have become close bedfellows. As enterprises have migrated to the cloud and created hybrid cloud deployments, the data that is contained in these spaces has rapidly expanded. Managing these environments efficiently then becomes a business priority.

“Big Data technologies can be difficult to deploy and manage in a traditional, on-premise environment,” says Jessica Goepfert, Program Vice President, Customer Insights and Analysis at IDC. “Add to that the exponential growth of data and the complexity and cost of scaling these solutions, and one can envision the organisational challenges and headaches. However, the cloud can help mitigate some of these hurdles. Cloud’s promise of agility, scale, and flexibility combined with the incredible insights powered by BDA (Big Data Analytics) delivers a one-two punch of business benefits, which are helping to accelerate BDA adoption.”

How businesses approach the analysis of the data they collect will inevitably lead them to AI – especially Machine Learning. A level of automation is necessary simply because human operatives would not be able to see the hidden value Big Data can deliver to a business. However, enterprises are struggling to connect AI with their datasets to reveal the value they know is contained in their data. Combining Big Data with existing business pipelines and supply chains is also a pressure point.

Says Caroline Carruthers, Chief Executive of Carruthers and Jackson: “There are obvious technical challenges presented by managing a huge volume of data. However, apart from those, one of the big challenges for businesses is finding the right questions to ask of data to understand where they need to change their practices. It’s also crucially important for businesses to get the right talent through the door with the ability to provide insights from the data.”

Using data to drive innovation is now a core goal of all businesses as they strive to transform. MicroStrategy found that 46% of respondents to its state of global analytics report say they have been able to identify and create new product and revenue streams, and 45% of organisations are now using data and analytics to develop new business models.

Taking action

The masses of information all businesses now collect present a clear security risk. The large fines handed out to British Airways (£183m) and Marriott International (£99m) by the ICO (Information Commissioner’s Office) illustrate the importance of collecting and managing data with active security measures in place.

GDPR certainly proved to be a force for change, particularly regarding the personal data of consumers. Businesses, though, still need to understand how their data landscapes are continually evolving. Their reaction to potential security breaches needs to be flexible. And in a world where masses of data are moving to and from the cloud, protecting information at rest and in motion is critical. Big Data needs big security.

In its report to support Big Data LDN, Cloudera revealed: “Today, the responsibility for data is spread thinly across several C-level roles. Fewer than 10% of CDOs are responsible for their organisations’ data, followed by 7% of CIOs, 5% of CMOs and just 4% of CFOs.

“The only role which saw an increase in data responsibility was the new title of CPO (Chief Privacy Officer), with 5% now reporting to the CEO. In contrast, CDOs, who held the most responsibility for data in 2017, saw their data ownership halve in 2018 to just 27%. With UK organisations putting a range of roles in charge of data – each approach will inevitably be different and have differing business impacts.”

By 2025, IDC predicts that nearly 60% of the 175 zettabytes of existing data will be created and managed by enterprises versus consumers (compared to just 30% created and managed by enterprises in 2015).

MicroStrategy concluded: “With the nexus of data, AI, cloud and other technologies, along with the consumerization effect on enterprise technology, analytics is now poised for a paradigm shift. The new design point for BI (Business Intelligence) won’t just be for data scientists, analysts, and developers. It will be for every person, every process, and every device, delivering insights in the context of the work being done and able to be shared and consumed to bring about a more Intelligent Enterprise.”

Exasol’s Michael Glenn also explained to Silicon: “The future of Big Data is going to get much bigger, but people will stop talking about Big Data because it will just be a facet of life. In the long term, companies will modernise their databases. While prescriptive analytics, data democratisation and the rise of natural language business intelligence will give all employees the capability to be truly data-driven. Decisions that aren’t ‘data-first’ will be a rare event anywhere within organisations.

“As for the more immediate short-term, database modernisation will be commonplace as companies will realise that high-yield data strategies cannot move beyond the experimental phase without a high-performance database that is enterprise-wide. Also, dynamic pricing and customer micro-segmentation will become table stakes. Finally, those operating with a supply chain that doesn’t use analytics for supply/demand optimisation will fall behind as their margins will shrink in comparison to more mature companies.”

The question for all businesses is not why to collect data, but can you afford not to? Big Data needs big solutions to collect, store and manage these vast datasets that will only continue to grow. Businesses, though, have a great opportunity – an unprecedented one: to use this information to make their businesses more agile, competitive and profitable. Data is the new currency. Your business needs to spend this currency wisely.

Silicon in Focus

Eran Brown, EMEA CTO, Infinidat.

Eran Brown is EMEA CTO at Infinidat. He has extensive experience in the storage sector, combined with detailed knowledge of data infrastructure. As EMEA CTO, Eran engages across organisational functions to get customers what they need to grow their business. Prior to this role he was a Senior Product Manager at Infinidat, during which he led the launch of the company’s NAS and iSCSI capabilities, as well as networking, compression and security features. Before coming to Infinidat, Eran worked as a Pre Sales Engineer for NetApp for eight years, where he covered multiple verticals and territories.

Can you give an example of an industry or sector that is making great use of Big Data today?

I would say online B2C businesses are leading the way. Some online companies have simply allowed transactions to happen without changing the user experience of interacting with their company, so have missed out on a critical opportunity to redefine their value and perception. Take online gaming, for example.

Gaming is an industry that collects a lot of data to understand who its audience is, what they like and dislike, and how best to attract them to a game. Most gaming companies use in-game ads and microtransactions, and those can and should be tailored to the individual player. Are they willing to pay for a game? Without customisation based on data collection, the gaming companies are shooting in the dark!

What are the challenges of collecting and using Big Data across a business?

There are multiple challenges. Let’s start with culture. If you can instil data collection and analysis into the business culture, data will be collected as early as possible (long-term data usually enables better analysis) to allow A/B testing. This is valuable as you can then measure any customisation and map it to the right audience.
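To make the A/B testing point concrete, here is a minimal sketch of how a customisation might be measured once the data has been collected. It is an illustration only, not something described in the interview: it assumes each variant's visitors and conversions have already been counted, and it uses a standard two-sided two-proportion z-test. The function name and the figures in the usage note are hypothetical.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare the conversion rates of variants A and B.

    conv_a / conv_b: number of conversions seen for each variant.
    n_a / n_b: number of users shown each variant.
    Returns (rate_a, rate_b, z, p_value) for a two-sided test
    that uses a pooled standard error.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the assumption that A and B perform the same.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        # No variation at all in the data, so nothing can be distinguished.
        return rate_a, rate_b, 0.0, 1.0
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_a, rate_b, z, p_value
```

Calling `two_proportion_z_test(200, 10_000, 260, 10_000)` with these made-up counts compares a 2.0% conversion rate against 2.6%. A small p-value suggests the customisation, rather than chance, explains the difference. Running the same test per audience segment is one way to map a customisation to the right audience.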

Next, we need to decide on the underlying infrastructure. Small companies often start in the cloud but later, as they want to grow and/or get external funding, they realise they need to improve EBITDA (earnings before interest, tax, depreciation and amortisation) and other financial parameters, and look to move from OpEx to CapEx. If the instrumentation required to change from one backend to another is not put in place from the beginning, it’s expensive to add it afterwards. The same applies to transitioning between different cloud providers.

Can you plan growth in advance? We see customers investing a lot in node-based Hadoop / Elastic / Splunk clusters only to realise 18-23 months later that they are over-utilised on one resource and under-utilised on another.

The most common example is Hadoop nodes that hold a lot of data but barely use their CPUs. Another is that compute-intensive clusters are often configured with too much storage as they needed to add more nodes to gain compute power, and now they have hundreds of kWh, cooling and rack space wasted.

Designing your compute to be disaggregated from your data tiers, so that each can grow separately, is critical, and most customers only realise that when they have already bought dozens of unnecessary nodes in their cluster. I once worked with a customer to collapse a seven-rack cluster into 1.5 racks just by removing this inefficiency.
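The arithmetic behind this inefficiency can be sketched in a few lines. The example below is a simplified illustration, not a sizing tool. It assumes every coupled node bundles a fixed number of cores and a fixed amount of storage, and that a disaggregated design buys compute nodes and storage shelves independently. All names and figures are hypothetical.

```python
import math


def coupled_cluster(cores_needed, tb_needed, cores_per_node, tb_per_node):
    """Size a cluster in which each node bundles fixed compute and storage.

    The node count is driven by whichever resource runs out first,
    so the other resource is over-provisioned.
    """
    nodes = max(
        math.ceil(cores_needed / cores_per_node),
        math.ceil(tb_needed / tb_per_node),
    )
    return {
        "nodes": nodes,
        "idle_cores": nodes * cores_per_node - cores_needed,
        "idle_tb": nodes * tb_per_node - tb_needed,
    }


def disaggregated_cluster(cores_needed, tb_needed, cores_per_node, tb_per_shelf):
    """Size compute nodes and storage shelves independently of each other."""
    compute_nodes = math.ceil(cores_needed / cores_per_node)
    storage_shelves = math.ceil(tb_needed / tb_per_shelf)
    return {
        "compute_nodes": compute_nodes,
        "storage_shelves": storage_shelves,
        "idle_cores": compute_nodes * cores_per_node - cores_needed,
        "idle_tb": storage_shelves * tb_per_shelf - tb_needed,
    }
```

With a hypothetical storage-heavy workload of 320 cores and 4,000 TB on 32-core, 40 TB nodes, `coupled_cluster(320, 4000, 32, 40)` needs 100 nodes to hold the data and leaves 2,880 cores idle. `disaggregated_cluster(320, 4000, 32, 40)` covers the same workload with 10 compute nodes plus 100 storage shelves and leaves no cores idle.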

How are AI and Machine Learning helping businesses make the best use of the Big Data they have?

We see a lot more data crunching in the field than true AI/ML, but it’s definitely growing and will keep growing, as the technical skills mature and the business units realise the value.

In a world where the user experience (UX) is key to success, finding usage patterns that enable customising of the UX is key to increasing adoption and improving user interactions. Other use cases we see are, of course, AI for operations (AIOps), where operations are optimised through finding the patterns that lead to issues in the infrastructure.

We use it ourselves to look at over five exabytes of global customer capacities to proactively push recommendations and alerts to the IT team whenever we identify anomalies.

My favourite example is the ability to detect in near real-time an increase in latency in the storage fabric – outside of the storage array. These incidents usually result in application / database people blaming the storage, as they don’t have the ability to distinguish between fabric and storage latency, and this will lead to storage people wasting a lot of time ‘proving their innocence’.
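One simple way to picture this kind of detection is sketched below. It is not Infinidat's method, which the interview does not describe. The sketch assumes two hypothetical measurements per sample: the latency seen by the host and the latency reported by the storage array. The difference between them is treated as the fabric's share, and a sample is flagged when that share rises well above a baseline.

```python
from statistics import mean, stdev


def fabric_latency_alerts(samples, baseline_size=5, threshold=3.0):
    """Flag samples whose fabric latency is unusually high.

    samples: time-ordered list of (host_ms, array_ms) pairs, where
             host_ms is the end-to-end latency seen by the application
             host and array_ms is the latency reported by the array.
    The first `baseline_size` samples define normal behaviour.
    Returns the indices of later samples whose fabric share
    (host_ms - array_ms) exceeds mean + threshold * stdev.
    """
    if baseline_size < 2 or len(samples) <= baseline_size:
        raise ValueError("need at least two baseline samples plus data to check")
    fabric = [host_ms - array_ms for host_ms, array_ms in samples]
    baseline = fabric[:baseline_size]
    # Guard against a perfectly flat baseline (stdev of zero).
    limit = mean(baseline) + threshold * max(stdev(baseline), 1e-9)
    return [i for i in range(baseline_size, len(fabric)) if fabric[i] > limit]
```

In this sketch, a sample such as `(3.0, 1.0)` after a steady baseline is flagged, because the extra two milliseconds appear outside the array. A sample such as `(3.0, 2.8)` is not flagged, because the array itself accounts for the slowdown. That separation is what spares the storage team from having to prove their innocence.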

How are CTOs and CIOs approaching their use of Big Data today?

CTOs and CIOs are already aware of the power and advantages that Big Data analytics can provide the business units but have a hard time sizing the infrastructure.

If the project is a failure, how can I reuse the infrastructure? If the project is successful, will the business units ask to analyse more data (needing higher system capacity) or want more analysis of the same data (which requires more computing power)? Predictability is hard, and many CIOs start the project based only on what is known when the project is initiated.

We’ll see wider adoption and, where on-premise solutions are designed correctly, they will prove more cost-effective than cloud providers. This will be even truer when CIOs can provide their business units with instant growth when they need it, without breaking the cost model.