MS 20775: Performing Data Engineering on Microsoft HDInsight

Important notice!

This course will be retired by Microsoft on 30 June 2019.

Please see the two new courses that replace this training:






The primary audience for this course is data engineers, data architects, data scientists, and data developers who plan to implement big data engineering workflows on HDInsight.


In addition to their professional experience, students who attend this course should have:

  • Programming experience using R, and familiarity with common R packages.
  • Knowledge of common statistical methods and data analysis best practices.
  • Basic knowledge of the Microsoft Windows operating system and its core functionality.
  • Working knowledge of relational databases.

Course goals:

After completing this course, students will be able to:

  • Deploy HDInsight Clusters.
  • Authorize Users to Access Resources.
  • Load Data into HDInsight.
  • Troubleshoot HDInsight.
  • Implement Batch Solutions.
  • Design Batch ETL Solutions for Big Data with Spark.
  • Analyze Data with Spark SQL.
  • Analyze Data with Hive and Phoenix.
  • Describe Stream Analytics.
  • Implement Spark Streaming Using the DStream API.
  • Develop Big Data Real-Time Processing Solutions with Apache Storm.
  • Build Solutions that use Kafka and HBase.



This course maps directly to exam 70-775, which is one of the exams leading to the MCSE Data Management and Analytics certification. 
