Top 6 Data Science Pain Points in 2021

Data science decision-making in 2021 has shifted between platforms and infrastructure, code and no-code, and automation and human augmentation, as developers race to build data-intensive applications with the fastest architecture and the most robust data enrichment.

In unpacking the COVID-19 pandemic¹, data scientists, database architects and machine learning engineers have confirmed that the data layer is the critical missing piece for serving the next generation of startups on a global scale.

The data-driven economy will continue to augment every organization. Just as the 2010s treated every company as a technology company, the 2020s will be defined as the decade of data-powered startups, according to a recent Harvard Business Review survey.

Source: www.humainpodcast.com

Businesses built on analytics outperform those without a data strategy, and more companies are leveraging data to earn their place in the digital economy².

Despite the adoption of data by enterprises, challenges abound.

To begin with, there is too much data around, and companies struggle to harness it. This explosion of data brings concerns of its own, including how to organize the data and how to apply outcomes to business use cases.

Other issues compound these challenges: data scientists' poor understanding of business context, a lack of C-suite support for data science, and data arriving from many different sources.

Organizations also lack education on how to execute data analytics in their operations. Security is another pain point for the data science industry: the current data deluge comes with unclear governance policies around data privacy, and companies increasingly face security threats because of these data volumes. Let us explore some of the roadblocks facing data science:

1. Data Integration

There is too much data out there, and confusion reigns over how companies can turn it into value. Companies face hurdles getting information into a centralized system for reference and decision-making. From data collection and cleaning through to learning and execution, challenges remain in using data in a unified way.

An internal data system is critical for organizations that rely on data to drive their business and become profitable. Integrating data in a single repository is beneficial and less costly, but most companies have their data scattered across systems, which means they miss important insights. At the same time, the lack of a data strategy³ compounds the problem, as enterprises fail to take advantage of the data they already hold.
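As a rough illustration of what a single repository means in practice, the sketch below merges records from two scattered sources into one view per customer. The source names, field names and sample records are hypothetical, and the rule for resolving conflicts (the first non-empty value wins) is only one of several reasonable choices.

```python
from collections import defaultdict


def integrate(sources):
    """Merge records from several sources into one view keyed by customer_id.

    `sources` maps a source name to a list of record dicts. Sources are
    processed in order; later sources only fill fields that are still missing.
    """
    unified = defaultdict(dict)
    for records in sources.values():
        for record in records:
            key = record["customer_id"]
            for field, value in record.items():
                if field == "customer_id":
                    continue
                # Skip empty values and keep the first non-empty value seen.
                if value not in (None, "") and field not in unified[key]:
                    unified[key][field] = value
    return dict(unified)


# Hypothetical sample data standing in for two separate internal systems.
crm = [
    {"customer_id": 1, "name": "Ada", "email": ""},
    {"customer_id": 2, "name": "Grace", "email": "grace@example.com"},
]
billing = [
    {"customer_id": 1, "email": "ada@example.com", "plan": "pro"},
]

unified = integrate({"crm": crm, "billing": billing})
for customer_id, profile in sorted(unified.items()):
    print(customer_id, profile)
```

In a real deployment this role is usually played by a data warehouse or similar system; the point here is only that a shared key and an explicit merge rule are what turn scattered records into a single reference.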

2. Explainable Models and Bias

The data science industry faces a bias problem in AI models. Technology should be cohesive, yet bias in ML models is on the rise. A well-known example is Amazon, whose AI-powered recruitment tool discriminated against female candidates in favor of male candidates.

The transition to explainable ML and AI models is slow, and most enterprises have not taken action to address these challenges. Ethical concerns around artificial intelligence and machine learning persist, as does a lack of responsibility in technology systems. The drive to make machine learning explainable continues to evolve, but the industry has a long way to go.
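One simple way to surface this kind of bias is to compare how often a model selects candidates from each group. The sketch below computes per-group selection rates and the gap between them, often called the demographic parity difference. The decisions are invented for illustration, and this is only one of many fairness measures, not a complete audit.

```python
from collections import Counter


def selection_rates(decisions):
    """Return the share of positive decisions for each group.

    `decisions` is an iterable of (group, selected) pairs, where `selected`
    is True when the model recommended the candidate.
    """
    totals = Counter()
    selected = Counter()
    for group, was_selected in decisions:
        totals[group] += 1
        if was_selected:
            selected[group] += 1
    return {group: selected[group] / totals[group] for group in totals}


def parity_gap(rates):
    """Difference between the highest and lowest group selection rates."""
    return max(rates.values()) - min(rates.values())


# Invented model decisions, for illustration only.
decisions = [
    ("female", True), ("female", False), ("female", False), ("female", False),
    ("male", True), ("male", True), ("male", False), ("male", False),
]

rates = selection_rates(decisions)
gap = parity_gap(rates)
print(rates, gap)
```

A gap close to zero means the groups are selected at similar rates. A large gap does not prove discrimination on its own, but it is a signal that the model and its training data deserve a closer look.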

3. Business Problem Evaluation

A common failing among data scientists is a poor understanding of the business problem: they focus on building models and conducting data analysis without understanding the business context. Instead, the data scientist must first understand the business problem and then develop solutions based on data. Spending time on datasets alone does not translate to understanding the business problem.

A checklist helps data scientists understand the key problems, communicate clearly and engage with executive management. Building workflows and implementing them without proper communication often leads to inaccurate results that do not match the business problem.

Recent Gartner research found that data scientists execute models without enough communication with management, which leads to work that contradicts business goals. Both sides should understand the business problem after which data scientists can derive insights from the data.

4. Data Scientists vs. Technical Experts

Data scientists and technical experts, including data engineers, work on similar projects in teams, and differences in their working models sometimes create friction. For instance, a data scientist may approach a problem from a different perspective than a machine learning engineer. Companies struggle to reconcile these differences, which, if not handled well, can hamper a team's productivity.

Data-driven organizations understand the importance of both data engineers and data scientists and work to address these problems by improving communication. Developing collaborative systems is a good step toward creating synergy between the two roles. With these adjustments, data scientists and data engineers can understand the projects they are working on and collaborate across the board.

Source: www.humainpodcast.com

5. Communication with the C-Suite

Data scientists have a deeper technical understanding than those outside the field, and problems arise when solutions are implemented. The executive team understands the importance of data science teams in the organization, but poor communication of data science's technical metrics to the C-suite leads to misunderstandings about business goals.

Executive boards find it difficult to implement recommendations from data scientists because they do not understand data science and its connection to their business. Data scientists can solve this problem by communicating effectively, for example through trend reports and data visualization. These tools speak to non-technical members of the organization and help them make sense of data science projects.

6. Talent Gap

The talent problem ranks among the top three constraints facing the data science industry, with companies unable to find the right candidates. According to ZipRecruiter, hiring for data science roles increased in 2020, and the job site estimates a rise of more than 70% in hiring in 2021. As demand for data scientists increases, companies without the budget to hire a data science team find it difficult to navigate this territory.

A data science team supports an organization in scaling products and bringing them to market through data-driven decisions. Small enterprises settle for less qualified data analysts, whose skills are limited compared with those of a data scientist. Still, the future looks bright for small businesses without the budget to hire data scientists, as more graduates from data science boot camps enter the market. Reliance on university graduates to fill data science roles created this gap, and in 2021 companies are shifting to hands-on graduates from boot camps.

Better Solutions for the Data Science Industry

The first step towards solving a problem is admitting there is one. Data scientists should accept that businesses struggle to understand the analytics and technical requirements of the profession, and this is where data literacy should start. Data scientists should engage with executive management and iron out differences so that they operate in line with business metrics and performance requirements.

Companies should also redefine their data science policies, such as how roles are distributed within data science teams and how teams communicate, to deliver results for the business. Oftentimes, companies hire too few data scientists, who are then overloaded with data collection, data cleaning and data analysis. Educational institutions should also participate in this revolution by teaching students in-demand skills, such as the management of big data, to prepare them for the real world.

Works Cited

¹ COVID Pandemic, ² Digital Economy, ³ Data Strategy, Technology Systems, Data Scientist, Data driven organizations, Data Science Roles, Scaling Products


KAMRAN AHMED

I am an Enterprise Data Management, Data Governance, Data Modeling Experienced Professional | As a Team Leader, I ensure the highest data quality, security, and compliance standards.

2y

I am writing almost a year after the article was published. One of the problems is changing or fixing legacy concepts and practices. Let me explain. The process is running, and the owner knows about the problem and its effect on process quality. Because they know a workaround, they do not want to fix the underlying problem. It could be because they have a workaround, because they never think of a solution, or because they think that keeping old practices keeps them relevant.

Jon Loyens

Chief Product Officer at data.world

3y

I think so many of these problems will actually resolve if we start focusing more on how we get stakeholders to work together. The data engineers, scientists, analysts and domain experts in the business need to actually iterate and build empathy for each other's work in meaningful ways. Right now, too much of this is like software development in the late 90s/early 00s, where people throw ideas over the fence and expect some magic to come from engineering 6 months down the road. Agile's big unlock wasn't delivering software faster (or, as some detractors position it, avoiding deadlines or commitments); it was getting all the stakeholders to work more closely together. We need the same thing to happen with data and analytics work. On the integration front, I see a lot of improvement happening: cloud data warehouses and easily reasoned-about transformation pipelines built on tools like dbt are starting to make the technical aspects of this less and less of a thing. I hope this makes it easier for the technical teams to raise their heads and start paying more attention to how they interact with other teams and deliver data products.

Haibo Zhang

AML Model Validation and Analytics | FinTech and RegTech Advisor | Entrepreneur | Executive

3y

Excellent points! On the one hand, it is critical for data scientists to understand the business problem. On the other hand, it is vital for management, i.e. the consumer of any model outcomes, to spend a little more effort and patience getting data scientists to speak more about the Bias, Limitations, and Assumptions of the data and the model (I call it BLA). Often the model BLA is overlooked, and management then makes inaccurate business decisions. Like you wrote in the article, it is an effort from both sides: "Both sides should understand the business problem after which data scientists can derive insights from the data."
