hands-on lab

Ingesting and Transforming Data Using Azure Data Factory

Difficulty: Intermediate
Duration: Up to 45 minutes
Students: 118
Rating: 5/5
Get guided in a real environment: Practice with a step-by-step scenario in a real, provisioned environment.
Learn and validate: Use validations to check your solutions every step of the way.
See results: Track your knowledge and monitor your progress.

Description

Azure Data Factory is a cloud-based data integration service that allows you to create, schedule, and manage data pipelines for ingesting, preparing, transforming, and publishing data. Data cleansing and preparation are essential steps in the data processing workflow, ensuring that data is accurate, reliable, and ready for analysis.

Organizations often deal with large volumes of data from various sources, and that data can be messy, inconsistent, and full of errors. Data cleansing involves identifying and correcting inaccuracies, inconsistencies, and missing values in datasets. By standardizing data fields, removing duplicates, handling missing data, and splitting datasets, you can improve data quality and make sure your data is ready for analysis and reporting.

In this hands-on lab, you will learn how to standardize and cleanse data fields in Azure Data Factory.

Learning objectives

Upon completion of this intermediate-level lab, you will be able to:

  • Standardize and cleanse data fields in Azure Data Factory.
  • Identify and remove duplicate records in datasets.
  • Handle missing data by filling in default values or removing incomplete records.
  • Split data into multiple streams based on specified criteria.
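
The four objectives above can be pictured on a handful of rows before building them in a data flow. The following is a minimal sketch in plain Python, not the lab's solution and not a Data Factory API: the sample records, the field names (id, email, country, amount), the default value, and the 100.0 threshold are all hypothetical and chosen only for illustration.

```python
# Hypothetical sample records; field names and values are illustrative only.
RAW_ROWS = [
    {"id": "1", "email": " Ada@Example.com ", "country": "uk", "amount": "120.50"},
    {"id": "2", "email": "bob@example.com", "country": "US", "amount": ""},
    {"id": "1", "email": "ada@example.com", "country": "UK", "amount": "120.50"},
    {"id": "3", "email": "", "country": "us", "amount": "75"},
]


def standardize(row):
    """Trim whitespace and normalize letter case so equal values compare equal."""
    return {
        "id": row["id"].strip(),
        "email": row["email"].strip().lower(),
        "country": row["country"].strip().upper(),
        "amount": row["amount"].strip(),
    }


def fill_or_drop(row, default_amount="0"):
    """Drop rows with no email (assumed required); fill a missing amount."""
    if not row["email"]:
        return None
    if not row["amount"]:
        return {**row, "amount": default_amount}
    return row


def deduplicate(rows, key="id"):
    """Keep the first row seen for each key value."""
    seen = set()
    unique = []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            unique.append(row)
    return unique


def split_by_amount(rows, threshold=100.0):
    """Route rows into two streams based on an assumed amount threshold."""
    high = [r for r in rows if float(r["amount"]) >= threshold]
    low = [r for r in rows if float(r["amount"]) < threshold]
    return high, low


def cleanse(rows):
    """Standardize, handle missing data, deduplicate, then split."""
    standardized = [standardize(r) for r in rows]
    complete = [r for r in map(fill_or_drop, standardized) if r is not None]
    return split_by_amount(deduplicate(complete))


high, low = cleanse(RAW_ROWS)
print("high:", high)
print("low:", low)
```

Standardizing runs first here so that the two spellings of the same email and country collapse into one record when duplicates are removed. The order of the steps is a design choice for this sketch and may differ from the order used in the lab.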

Intended audience

  • Candidates for Microsoft Certified: Azure Data Engineer Associate
  • Cloud Architects
  • Data Engineers
  • DevOps Engineers
  • Machine Learning Engineers
  • Software Engineers

Prerequisites

Familiarity with the following will be beneficial but is not required:

  • Basic understanding of data processing concepts
  • Introductory knowledge of Azure services
  • Basic knowledge of data storage solutions

Lab steps

Logging in to the Microsoft Azure Portal
Creating Azure Data Factory Datasets
Creating an Azure Data Factory Data Flow
Configuring and Running a Data Factory Pipeline