DP-203: Data Engineering on Microsoft Azure

In this course, you will learn how to implement and manage data engineering workloads on Microsoft Azure, using Azure services such as Azure Synapse Analytics, Azure Data Lake Storage Gen2, Azure Stream Analytics, Azure Databricks, and others. The course focuses on common data engineering tasks such as orchestrating data transfer and transformation pipelines, working with data files in a data lake, creating and loading relational data warehouses, capturing and aggregating streams of real-time data, and tracking data assets and lineage.

«Magi Naumova is a brilliant instructor! The layout of the presentations and demos, along with her profound understanding of all of the technologies, made for a great engaging learning experience.» Course delegate

About the instructor - Magi Naumova

Margarita Naumova is a well-known SQL expert. Magi holds the highest SQL Server technical certification, Microsoft Certified Master, making her one of the best SQL Server experts worldwide. Magi is also a Microsoft Data Platform MVP (Most Valuable Professional). She has more than 20 years of consulting and training experience in SQL Server and BI technologies and is a trusted advisor to many large companies in the SQL Server platform area.

She currently works as Managing Partner and Chief SQL Architect of Inspir-it AS, her own newly established consulting company in Norway. Margarita is a regular speaker at some of the largest IT events in Europe, such as SQLBits and SQL Saturday.

Read more about Magi at the Microsoft MVP website.

This course is originally a 4-day official course from Microsoft. Glasspaper Learning has added one extra day of content to provide more time for examples, exercises, demonstrations, and discussions.

Audience

The primary audience for this course is data professionals, data architects, and business intelligence professionals who want to learn about data engineering and building analytical solutions using data platform technologies that exist on Microsoft Azure.

The secondary audience for this course includes data analysts and data scientists who work with analytical solutions built on Microsoft Azure.

Prerequisites

To attend this class, you should already have fundamental knowledge of data workloads and technologies, equivalent to the Microsoft Azure Data Fundamentals certification.

Specifically completing:

Learning objectives

After completing this course, students will be able to:

  • Identify common data engineering tasks, Describe common data engineering concepts, Identify Azure services for data engineering
  • Describe the key features and benefits of Azure Data Lake Storage Gen2, Enable Azure Data Lake Storage Gen2 in an Azure Storage account, Compare Azure Data Lake Storage Gen2 and Azure Blob storage, Describe where Azure Data Lake Storage Gen2 fits in the stages of analytical processing, Describe how Azure Data Lake Storage Gen2 is used in common analytical workloads
  • Identify the business problems that Azure Synapse Analytics addresses, Describe core capabilities of Azure Synapse Analytics, Determine when to use Azure Synapse Analytics.
  • Use Azure Synapse serverless SQL pool to query files in a data lake, Use Azure Synapse serverless SQL pools to transform data in a data lake, Create a lake database in Azure Synapse Analytics, Secure data and manage users in Azure Synapse serverless SQL pools
  • Analyze data with Apache Spark in Azure Synapse Analytics, Transform data with Spark in Azure Synapse Analytics, Use Delta Lake in Azure Synapse Analytics
  • Analyze data in a relational data warehouse, Load data into a relational data warehouse, Manage and monitor data warehouse activities in Azure Synapse Analytics, Analyze and optimize data warehouse storage in Azure Synapse Analytics, Secure a data warehouse in Azure Synapse Analytics
  • Build a data pipeline in Azure Synapse Analytics, Use Spark Notebooks in an Azure Synapse Pipeline
  • Plan hybrid transactional and analytical processing using Azure Synapse Analytics, Implement Azure Synapse Link with Azure Cosmos DB, Implement Azure Synapse Link for SQL
  • Process real-time data streams and integrate the data they contain into applications and analytical solutions with Azure Stream Analytics, Ingest streaming data using Azure Stream Analytics and Azure Synapse Analytics, Visualize real-time data with Azure Stream Analytics and Power BI
  • Evaluate whether Microsoft Purview is the right choice for your data discovery and governance needs, Discover trusted data using Microsoft Purview, Catalog data artifacts by using Microsoft Purview, Manage Power BI assets by using Microsoft Purview, Integrate Microsoft Purview and Azure Synapse Analytics
  • Provision an Azure Databricks workspace, Identify core workloads and personas for Azure Databricks, Describe key concepts of an Azure Databricks solution, Use Apache Spark in Azure Databricks, Use Delta Lake in Azure Databricks, Use SQL Warehouses in Azure Databricks, Run Azure Databricks Notebooks with Azure Data Factory

Course outline

Module 1: Get started with data engineering on Azure

Module 2: Analyze data with Azure Synapse Analytics serverless SQL pools

Module 3: Perform data engineering with Azure Synapse Apache Spark Pools

Module 4: Work with data warehouses using Azure Synapse Analytics

Module 5: Transfer and transform data with Azure Synapse Analytics pipelines

Module 6: Work with hybrid transactional and analytical processing (HTAP) solutions using Azure Synapse Analytics

Module 7: Implement a data streaming solution with Azure Stream Analytics

Module 8: Govern data across an enterprise

Module 9: Data engineering with Azure Databricks

Certification

This training will help you prepare for exam DP-203: Data Engineering on Microsoft Azure.

By passing this exam you will earn the Microsoft Certified: Azure Data Engineer Associate certification.