hands-on lab

Transforming Data With Apache Spark and Amazon EMR

Difficulty: Beginner

Duration: Up to 1 hour and 30 minutes

Students: 608

Rating: 5/5


On average, students complete this lab in 35 minutes.

Get guided in a real environment: Practice with a step-by-step scenario in a real, provisioned environment.

Learn and validate: Use validations to check your solutions every step of the way.

See results: Track your knowledge and monitor your progress.

Description

Amazon EMR (formerly known as Amazon Elastic MapReduce) is a big data platform that supports many popular open-source data processing frameworks, including Apache Spark. Amazon EMR simplifies the configuration, provisioning, and scaling of clusters for data analysis and processing workloads.

Learning Amazon EMR is useful for anyone who wants to understand how big data processing is performed in practice.

In this hands-on lab, you will tour an Amazon EMR cluster, place data and a script in a location accessible to Amazon EMR, submit a workload to an Amazon EMR cluster, and examine the results.
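As a rough illustration of the "submit a workload" step, the sketch below builds an Amazon EMR step definition that runs spark-submit against a script stored in Amazon S3, and then hands it to a cluster with boto3. It is a minimal sketch under stated assumptions, not the lab's actual procedure: the bucket name, object paths, region, and the helper names build_spark_step and submit_step are all hypothetical, and the lab itself may use the console rather than code.

```python
def build_spark_step(script_uri, input_uri, output_uri, name="Transform data"):
    """Return an EMR step definition that runs spark-submit on the cluster.

    The script and data URIs are expected to point at a location the
    cluster can read, such as an S3 bucket.
    """
    return {
        "Name": name,
        # Keep the cluster running if the step fails, so results can be inspected.
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {
            # command-runner.jar lets a step run a command such as spark-submit.
            "Jar": "command-runner.jar",
            "Args": [
                "spark-submit",
                "--deploy-mode", "cluster",
                script_uri,
                input_uri,   # passed to the script as its first argument
                output_uri,  # passed to the script as its second argument
            ],
        },
    }


def submit_step(cluster_id, step, region="us-west-2"):
    """Submit a step to a running cluster and return the new step ID.

    Requires boto3 and AWS credentials; the default region is an assumption.
    """
    import boto3  # imported here so the builder above works without boto3

    emr = boto3.client("emr", region_name=region)
    response = emr.add_job_flow_steps(JobFlowId=cluster_id, Steps=[step])
    return response["StepIds"][0]


# Hypothetical bucket and paths, for illustration only.
step = build_spark_step(
    script_uri="s3://example-lab-bucket/scripts/transform.py",
    input_uri="s3://example-lab-bucket/input/",
    output_uri="s3://example-lab-bucket/output/",
)
```

With a real cluster ID, calling submit_step(cluster_id, step) would queue the work; the output location can then be examined once the step completes.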

Note that an Amazon EMR cluster takes approximately ten minutes to create and become usable, so make sure you have enough time available before starting the lab.