A Crash Course in DataOps

This article contains excerpts from the DataOps Flipbook white paper, which provides a more comprehensive introduction to DataOps. The white paper was published by the IBM World-Wide Community of Information Architects, an affiliate of the IBM Academy of Technology (AOT).

DataOps Defined

DataOps is the orchestration of people, processes, and technology to accelerate the delivery of high-quality data to data users. DataOps promises to streamline the process of building, changing, and managing data pipelines.

Objectives

Its primary goal is to maximize the business value of data and improve the client’s experience of data delivery. It does this by speeding up the distribution of data for reporting and analytic output while simultaneously reducing data defects and lowering costs.

DataOps applies the rigor of software engineering to the development and execution of data pipelines, which govern the flow of data from source to consumption. By delivering data “faster, better, and cheaper,” data teams increase the business value of data and customer satisfaction.
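To make that rigor concrete, below is a minimal sketch of a single pipeline step treated like tested software. The function and column names (clean_orders, order_id, amount) are hypothetical; the point is that the transform ships with a unit test that can run in CI before the step is ever deployed.

```python
# A hypothetical pandas transform plus the test that gates its deployment.
import pandas as pd

def clean_orders(raw: pd.DataFrame) -> pd.DataFrame:
    """Drop rows with missing order IDs and normalize amounts to floats."""
    cleaned = raw.dropna(subset=["order_id"]).copy()
    cleaned["amount"] = cleaned["amount"].astype(float)
    return cleaned

def test_clean_orders():
    """Run in CI: the pipeline step is promoted only if this passes."""
    raw = pd.DataFrame({"order_id": [1, None, 3],
                        "amount": ["10.50", "2.00", "7.25"]})
    result = clean_orders(raw)
    assert result["order_id"].notna().all()
    assert result["amount"].dtype == float

if __name__ == "__main__":
    test_clean_orders()
    print("clean_orders passed its checks")
```

The same version-control, code-review, and automated-test habits that govern application code then apply to every change in the pipeline.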

Usage

DataOps is used to build analytic solutions, including reports, dashboards, self-service analytics, and machine learning models. It emphasizes effective collaboration across the teams that handle different pieces of a data pipeline while maintaining an overarching view.

DataOps Dimensions

People – “DataOps Managers” are the people and roles that lead the delivery, management, and support of high-quality, mission-ready data at scale; they consist of Data Engineers, Information Architects, and DataOps Engineers. “DataOps Consumers” are the people who ultimately turn data into business value; they consist of Data Scientists and Data Analysts.

Process – The overall DataOps function is a fusion of two existing processes (a minimal sketch of how they meet in a single pipeline follows the list):

1) A “DevOps”-based flow that defines the evolution, maintenance, orchestration, and monitoring of the underlying development procedures.

2) A “data management”-based flow that defines the processing to be carried out on the data assets managed by the DataOps function.
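As a minimal sketch of how the two flows meet, assuming plain Python and hypothetical step names (ingest, validate, publish): the data management flow is the processing each step performs on the data, while the DevOps flow appears as the orchestration, logging, and failure handling wrapped around those steps.

```python
# Hypothetical three-step pipeline: data management steps under
# DevOps-style orchestration and monitoring.
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("dataops")

def ingest():
    log.info("ingest: pulling source extracts")
    # The second record is deliberately bad so the demo shows a failure.
    return [{"id": 1, "value": 42}, {"id": 2, "value": None}]

def validate(records):
    log.info("validate: enforcing data quality rules")
    bad = [r for r in records if r["value"] is None]
    if bad:
        raise ValueError(f"{len(bad)} record(s) failed quality checks")
    return records

def publish(records):
    log.info("publish: delivered %d business-ready records", len(records))

def run_pipeline():
    """Orchestrate the steps; surface failures instead of hiding them."""
    try:
        publish(validate(ingest()))
    except ValueError as err:
        # Monitoring hook: a real deployment would alert the data team here.
        log.error("pipeline halted: %s", err)

if __name__ == "__main__":
    run_pipeline()
```

Run as written, the pipeline halts at the validate step and logs the failure, which is the monitoring half of the DevOps flow doing its job.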

Whether your organization is just starting to develop a basic DataOps practice or sustaining a more mature one, it is important to baseline your team’s ability to deliver business-ready data quickly and to make an improvement plan that aligns with creating business value. The figure below depicts a DataOps framework that contains the steps, components, and roles in data science and business intelligence use cases.

Figure: Data Management Ecosystem

Technology – In the depicted DataOps framework, each step has a focused set of deliverables that requires the right people, processes, and technologies to satisfy client requirements with few errors, high speed, and efficient collaboration. A supporting toolchain with the right product features enables DataOps pipelines to produce business value.

Figure: DataOps Steps

Implementing DataOps

There are several DataOps tools and platforms available. Some address specific aspects of data engineering, while others concentrate on use cases such as data science with only a surface-level focus on data management. The ideal toolchain will support DataOps end to end and embody its key principles.

A future paper will explore in more depth how to implement a DataOps discipline using a supporting toolchain.
